geost.read_table

geost.read_table(file: str | Path, as_collection: bool = True, crs: str | int | CRS = None, vertical_datum: str | int | CRS = None, has_inclined: bool = False, coordinate_names: tuple[str, str] = None, include_in_header: str | Iterable[str] | None = None, column_mapper: dict = None, **kwargs) → Collection | DataFrame

Read tabular information from a file (parquet, csv or Excel) for any given survey data into a geost.Collection or pandas.DataFrame.

Parameters:
  • file (str | Path) – Path to the file to be read. The file extension (.parquet, .csv or .xlsx) determines which pandas read function is used. Keyword arguments accepted by that pandas read function can be passed via the kwargs argument.

  • as_collection (bool, optional) – If True, the table will be read as a Collection. If False, a pd.DataFrame is returned. The default is True.

  • crs (str | int | CRS, optional) – EPSG of the data’s horizontal reference. Takes anything that can be interpreted by pyproj.crs.CRS.from_user_input(). The default is None, which means no CRS will be assigned to the resulting Collection. Only used if as_collection=True.

  • vertical_datum (str | int | CRS, optional) – Vertical datum for the collection. The default is None. Only used if as_collection=True.

  • has_inclined (bool, optional) – Indicates whether the collection has inclined data. The default is False.

  • coordinate_names (tuple[str, str], optional) – Tuple specifying the names of the columns to be used as coordinates for the geometry column. The default is None, which means that it automatically tries to find the names of the x and y columns (see POSSIBLE_COLUMN_NAMING in column_names). If not found, no geometry column will be created.

  • include_in_header (str | Iterable[str] | None, optional) – Columns to additionally include in the header. The default is None, which means that only the default columns 'nr', 'surface', 'x' and 'y' or their aliases are included.

  • column_mapper (dict, optional) – Mapping from column names in the input file to GeoST positional column names. Use this when your file uses non-standard names which cannot be recognized automatically as positional columns (e.g. {'ID': 'nr', 'X_RD': 'x', 'Y_RD': 'y', 'maaiveld': 'surface', 'van': 'top', 'tot': 'bottom'}). See geost.validation.column_names.POSITIONAL_COLUMN_NAMES for the accepted column names for each positional column type. If no valid survey-id column (e.g. "nr") is found after mapping, a KeyError is raised. Missing x/y or depth columns trigger warnings and may limit functionality.

  • **kwargs – Optional keyword arguments for pandas.read_parquet, pandas.read_csv or pandas.read_excel, depending on the file extension.
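
The extension-based choice of reader described for the file parameter can be pictured roughly as below. This is a minimal sketch, not GeoST's actual implementation: the helper `pick_reader_name` and the `READERS` table are hypothetical names, only the three documented extensions are covered, and the case-insensitive match is an assumption.

```python
from pathlib import Path

# Hypothetical lookup table: file extension -> name of the pandas reader
# that the parameter description associates with that extension.
READERS = {
    ".parquet": "read_parquet",
    ".csv": "read_csv",
    ".xlsx": "read_excel",
}


def pick_reader_name(file):
    """Return the name of the pandas reader matching the file extension."""
    # Assumption: extensions are compared case-insensitively.
    suffix = Path(file).suffix.lower()
    try:
        return READERS[suffix]
    except KeyError:
        raise ValueError(f"Unsupported file extension: {suffix!r}") from None


print(pick_reader_name("boreholes.parquet"))  # read_parquet
print(pick_reader_name(Path("data") / "cpt_data.CSV"))  # read_csv
```

In this picture, any extra keyword arguments would simply be forwarded to whichever pandas reader was selected.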

Returns:

Instance of Collection when as_collection=True or pd.DataFrame otherwise.

Return type:

Collection or pd.DataFrame

Examples

>>> import geost
>>> file = "path_to_your_data.parquet"
>>> collection_kwargs = { # Options to pass to a `geost.Collection`
...     "crs": 32631,
...     "vertical_datum": 'Ostend height',
...     "include_in_header": ["nr", "x", "y", "surface", "end"]
... }
>>> collection = geost.read_table(
...     file, column_mapper={'maaiveld': 'surface'}, **collection_kwargs
... )
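
The effect of column_mapper and the survey-id check can be illustrated without reading a file. The following is a sketch under stated assumptions, not GeoST's own code: `apply_column_mapper` is a hypothetical helper, and it only accepts the literal name "nr" as survey id, whereas GeoST also recognizes aliases.

```python
def apply_column_mapper(columns, column_mapper=None):
    """Rename columns via the mapper, then require a survey-id column."""
    column_mapper = column_mapper or {}
    # Columns without an entry in the mapper keep their original name.
    renamed = [column_mapper.get(name, name) for name in columns]
    # Assumption: only the literal name "nr" counts as a survey id here.
    if "nr" not in renamed:
        raise KeyError("No survey-id column ('nr') found after mapping")
    return renamed


columns = ["ID", "X_RD", "Y_RD", "maaiveld", "van", "tot", "lith"]
mapper = {
    "ID": "nr",
    "X_RD": "x",
    "Y_RD": "y",
    "maaiveld": "surface",
    "van": "top",
    "tot": "bottom",
}
print(apply_column_mapper(columns, mapper))
# ['nr', 'x', 'y', 'surface', 'top', 'bottom', 'lith']
```

Calling the helper on the same columns without a mapper raises a KeyError, mirroring the documented behaviour when no survey-id column is found after mapping.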