GeoST Accessors
This page describes in more detail the header and data accessors available for
Geopandas GeoDataFrames and Pandas DataFrames, as introduced on the previous page. The .gsthd
accessor is available for header GeoDataFrames and the .gstda accessor
is available for data DataFrames. For these accessors to work, the GeoDataFrame or DataFrame must meet some criteria,
which are explained below. Additionally, we provide some details about how they
work with several data sources, for a better understanding of their use and functionality.
We will use the GeoST borehole and CPT sample data for this explanation.
import geost
# Load sample data
cores = geost.data.boreholes_usp()
cpts = geost.data.cpts_usp()
# Separate header and data tables
cores_header, cores_data = cores.header, cores.data
cpts_header, cpts_data = cpts.header, cpts.data
Header .gsthd accessor
The .gsthd accessor only works on Geopandas GeoDataFrames, because it relies on the geometry
that describes the location of each survey. Typically available types
of subsurface data comprise point-like data such as boreholes, CPTs and well logs, and line-like
data such as seismics, GPR and EM. This means that the location of a survey is represented
by a point in some cases and by a line in others. By design, GeoST aims to keep its
methods (e.g. select_within_bbox, select_with_points) as consistent as possible,
but under the hood some methods may need to work slightly differently depending on the type of
geometry.
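To make the difference concrete, the sketch below asks the same bounding-box question for a point and for a line, using plain Python tuples instead of real geometries. It is only an illustration: the function names are hypothetical, and the rule used for lines (keep the line if any vertex lies inside the box) is an assumption chosen for simplicity, not necessarily what GeoST does.

```python
# Illustrative only: plain tuples stand in for geometries; names are hypothetical.

def point_in_bbox(point, xmin, ymin, xmax, ymax):
    """Return True if an (x, y) point lies inside or on the edge of the box."""
    x, y = point
    return xmin <= x <= xmax and ymin <= y <= ymax


def line_touches_bbox(line, xmin, ymin, xmax, ymax):
    """Return True if any vertex of the line lies inside the box.

    Assumption: a vertex test keeps the sketch short. A real implementation
    would more likely test whether the line intersects the box.
    """
    return any(point_in_bbox(vertex, xmin, ymin, xmax, ymax) for vertex in line)


borehole = (139_600, 455_060)  # point-like survey
seismic_line = [(139_000, 455_100), (139_700, 455_100), (141_000, 455_100)]  # line-like survey

bbox = (139_500, 455_000, 140_000, 455_500)
print(point_in_bbox(borehole, *bbox))
print(line_touches_bbox(seismic_line, *bbox))
```

The caller asks the same question in both cases, but the answer has to be computed differently for each geometry type.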
The .gsthd accessor handles this automatically: it selects the correct method
based on the geometry type in the “geometry” column of a GeoDataFrame. It does so by determining
the geometry type of the first row in the GeoDataFrame and selecting the required backend:
for instance a PointHeader in case of
point data and a LineHeader in case of line
data. To illustrate this, we will show step by step how this works for the loaded cores_header.
Let’s first check the geometry type in the header table and see what happens when we simply
access the .gsthd attribute.
print(cores_header.geom_type.head()) # Check geometry type and show the first rows
cores_header.gsthd
0 Point
1 Point
2 Point
3 Point
4 Point
dtype: object
<geost.accessors.accessor.Header at 0x7fc15e62d2b0>
As you can see, the geometry type is “Point” and accessing the .gsthd attribute returns a
geost.accessors.accessor.Header instance. This is a generic class in which the correct backend
(i.e. PointHeader or LineHeader) is chosen based on the geometry type. Since the geometry type
is “Point”, we expect the chosen backend to be a PointHeader. We can verify this by inspecting
the backend.
cores_header.gsthd._backend
<geost.accessors.header.PointHeader at 0x7fc15e62d010>
We see that the backend is indeed a PointHeader instance. This ensures that each method
called through the accessor is dispatched to the PointHeader instance.
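The dispatch pattern described above can be sketched in a few lines of plain Python. This is a simplified stand-in, not GeoST's actual implementation: the class names DemoHeader, DemoPointBackend and DemoLineBackend are hypothetical, and a list of dictionaries with a "geom_type" key replaces the GeoDataFrame.

```python
# Simplified stand-in for the accessor dispatch; all class names are hypothetical.

class DemoPointBackend:
    def __init__(self, rows):
        self.rows = rows

    def describe(self):
        return f"{len(self.rows)} point locations"


class DemoLineBackend:
    def __init__(self, rows):
        self.rows = rows

    def describe(self):
        return f"{len(self.rows)} line locations"


class DemoHeader:
    """Generic front end that picks a backend from the first row's geometry type."""

    _backends = {"Point": DemoPointBackend, "LineString": DemoLineBackend}

    def __init__(self, rows):
        if not rows:
            raise ValueError("cannot determine a geometry type for an empty table")
        geom_type = rows[0]["geom_type"]  # only the first row is inspected
        try:
            backend_cls = self._backends[geom_type]
        except KeyError:
            raise TypeError(f"no backend for geometry type {geom_type!r}") from None
        self._backend = backend_cls(rows)

    def __getattr__(self, name):
        # Called only for attributes not found on DemoHeader itself,
        # so every backend method is forwarded automatically.
        if name == "_backend":  # avoid infinite recursion if __init__ failed
            raise AttributeError(name)
        return getattr(self._backend, name)


header = DemoHeader(
    [{"nr": "B1", "geom_type": "Point"}, {"nr": "B2", "geom_type": "Point"}]
)
print(type(header._backend).__name__)  # DemoPointBackend
print(header.describe())  # 2 point locations
```

Forwarding through `__getattr__` keeps the front-end class small: adding a method to a backend makes it available on the front end without further changes.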
As mentioned before, every GeoST method available in Collection instances is also available
in either the .gsthd or .gstda accessor. The code snippet below shows how methods are called
normally on Collection instances and how this can be done using an accessor.
# Selection with the Collection instance
collection_select = cores.select_within_bbox(139_500, 455_000, 140_000, 455_500)
# Selection with the header accessor
header_select = cores_header.gsthd.select_within_bbox(
139_500, 455_000, 140_000, 455_500
)
# Show the selection results
print(collection_select)
header_select.head()
BoreholeCollection:
# header = 14
| | nr | x | y | surface | end | geometry |
|---|---|---|---|---|---|---|
| 1 | B31H0611 | 139600.0 | 455060.0 | 1.20 | -23.00 | POINT (139600 455060) |
| 2 | B31H0718 | 139950.0 | 455200.0 | 1.30 | -271.20 | POINT (139950 455200) |
| 3 | B31H0803 | 139675.0 | 455087.0 | 2.16 | -4.84 | POINT (139675 455087) |
| 4 | B31H0806 | 139684.0 | 455384.0 | 1.00 | -49.50 | POINT (139684 455384) |
| 5 | B31H0807 | 139684.0 | 455405.0 | 1.00 | -49.50 | POINT (139684 455405) |
Data .gstda accessor
As described in the Data structures section, GeoST mainly distinguishes
between “layered” and “discrete” data. Therefore, some methods that operate on the data table may need
to work differently for the two types of data.
Similar to the header accessor, the .gstda accessor automatically selects the correct
method. The choice is based on the presence of the columns “top” and “bottom” (i.e. layered data) or the presence of the column “depth” (i.e. discrete data). Note that the .gstda accessor only works if either “top” and “bottom” or “depth” are present.
As with the header accessor, the .gstda accessor refers to a generic geost.accessors.accessor.Data
instance in which the correct backend is chosen. If we check the backends for “cores_data” (layered
data) and “cpts_data” (discrete data), we see that this indeed results in different backends.
print(cores_data.gstda._backend)
print(cpts_data.gstda._backend)
<geost.accessors.data.LayeredData object at 0x7fc19f37bb10>
<geost.accessors.data.DiscreteData object at 0x7fc15e69c2d0>
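The column-based rule described above can be expressed as a small helper. This is a hedged sketch rather than GeoST code: the function name pick_data_backend is hypothetical, and both the exception type and the choice to prefer "layered" when all three columns are present are assumptions.

```python
# Hypothetical helper mirroring the column-based rule described in the text.

def pick_data_backend(columns):
    """Return "layered" or "discrete" based on the available column names."""
    columns = set(columns)
    if {"top", "bottom"} <= columns:  # both columns must be present
        return "layered"
    if "depth" in columns:
        return "discrete"
    # Assumption: the exact error raised by GeoST may differ.
    raise AttributeError(
        'expected columns "top" and "bottom" (layered) or "depth" (discrete)'
    )


print(pick_data_backend(["nr", "top", "bottom", "lith"]))  # layered
print(pick_data_backend(["nr", "penetration_length", "depth"]))  # discrete
```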
As shown before, using methods with the .gstda accessor only differs slightly from using
Collection instances.
# Selection with the Collection instance
cores_selected = cores.slice_depth_interval(0.5, 1.5) # Between 0.5m and 1.5m depth
# Selection with the data accessor
cores_data_selected = cores_data.gstda.slice_depth_interval(0.5, 1.5)
cpts_data_selected = cpts_data.gstda.slice_depth_interval(0.5, 1.5)
print(cores_selected)
cores_data_selected.head(), cpts_data_selected.head()
BoreholeCollection:
# header = 67
( nr x y surface end top bottom lith zm zmk \
1 B31H0541 139585.0 456000.0 1.2 -9.9 0.50 0.60 K NaN None
2 B31H0541 139585.0 456000.0 1.2 -9.9 0.60 0.95 V NaN None
3 B31H0541 139585.0 456000.0 1.2 -9.9 0.95 1.50 Z NaN ZMFO
9 B31H0611 139600.0 455060.0 1.2 -23.0 0.50 1.10 K NaN None
10 B31H0611 139600.0 455060.0 1.2 -23.0 1.10 1.50 V NaN None
... cons color lutum_pct plants shells kleibrokjes strat_1975 \
1 ... None BR NaN 0 0 0 None
2 ... None BR NaN 0 0 0 None
3 ... None GR NaN 0 0 0 None
9 ... None GR NaN 0 0 0 WE
10 ... None BR NaN 1 0 0 WE
strat_2003 strat_inter desc
1 EC NaN [KLEI#***#****#*] grysbruin.
2 NI NaN [VEEN#***#****#*] donkerbruin.
3 EC NaN [ZAND#***#****#*] FYN TOT matig fyn# iets slib...
9 EC NaN [KLEI#***#****#1] vet# grys# roestig.
10 NI NaN [VEEN#***#****#*] donkerbruin# bosveen MET vee...
[5 rows x 32 columns],
nr x y vertical_datum surface \
3 CPT000000009626 140950.998794 455358.997741 NAP 2.0
4 CPT000000009626 140950.998794 455358.997741 NAP 2.0
5 CPT000000009626 140950.998794 455358.997741 NAP 2.0
6 CPT000000009626 140950.998794 455358.997741 NAP 2.0
7 CPT000000009626 140950.998794 455358.997741 NAP 2.0
cone_penetration_test_fk cone_penetration_test_result_pk \
3 9579 11690885
4 9579 11690886
5 9579 11690887
6 9579 11690888
7 9579 11690889
penetration_length depth elapsed_time ... magnetic_inclination \
3 0.5 0.5 NaN ... None
4 0.6 0.6 NaN ... None
5 0.7 0.7 NaN ... None
6 0.8 0.8 NaN ... None
7 0.9 0.9 NaN ... None
magnetic_declination local_friction pore_ratio temperature pore_pressure_u1 \
3 None NaN None None None
4 None NaN None None None
5 None NaN None None None
6 None NaN None None None
7 None NaN None None None
pore_pressure_u2 pore_pressure_u3 friction_ratio end
3 NaN None NaN -22.0
4 NaN None NaN -22.0
5 NaN None NaN -22.0
6 NaN None NaN -22.0
7 NaN None NaN -22.0
[5 rows x 33 columns])
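To clarify why slicing a depth interval needs a different implementation for each data type, here is a minimal sketch using plain dictionaries instead of DataFrame rows. The helper names are hypothetical. Two behaviours are assumptions: clipping layer boundaries to the interval (suggested by the tops and bottoms of exactly 0.50 and 1.50 in the output above) and treating both interval bounds as inclusive for discrete data.

```python
# Hypothetical helpers; rows are plain dictionaries instead of DataFrame rows.

def slice_layered(layers, upper, lower):
    """Keep layers overlapping [upper, lower] and clip them to that interval."""
    sliced = []
    for layer in layers:
        if layer["top"] < lower and layer["bottom"] > upper:
            clipped = dict(layer)  # copy so the input is left untouched
            clipped["top"] = max(layer["top"], upper)
            clipped["bottom"] = min(layer["bottom"], lower)
            sliced.append(clipped)
    return sliced


def slice_discrete(measurements, upper, lower):
    """Keep measurements whose depth falls inside [upper, lower]."""
    return [m for m in measurements if upper <= m["depth"] <= lower]


layers = [
    {"nr": "a", "top": 0.0, "bottom": 0.6, "lith": "K"},
    {"nr": "a", "top": 0.6, "bottom": 2.0, "lith": "V"},
    {"nr": "a", "top": 2.0, "bottom": 3.0, "lith": "Z"},
]
measurements = [{"nr": "c", "depth": d} for d in (0.25, 0.5, 1.0, 1.5, 1.75)]

print(slice_layered(layers, 0.5, 1.5))
print(slice_discrete(measurements, 0.5, 1.5))
```

Layered data describes intervals, so a layer can partly overlap the requested range, whereas a discrete measurement is either inside the range or not.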
Use with generic Geopandas/Pandas
The .gsthd and .gstda accessors also work on any GeoDataFrame or DataFrame instance,
as long as it has the columns required by the specific methods. Data that has been
loaded or created without GeoST can therefore also use the accessors.
We will demonstrate this for the header by creating a simple GeoDataFrame with two point geometries and using a GeoST selection method.
import geopandas as gpd
gdf = gpd.GeoDataFrame(geometry=gpd.points_from_xy([1, 10], [1, 20]), crs=28992)
print(gdf)
print("\nSelection result:")
print(gdf.gsthd.select_within_bbox(0, 0, 2, 2))
geometry
0 POINT (1 1)
1 POINT (10 20)
Selection result:
geometry
0 POINT (1 1)
Note that the above GeoDataFrame only contains a “geometry” column and therefore not all
methods of the .gsthd accessor will work. See the Data structures section for the columns required for all methods to work.
For the data table, we will also create a simple example DataFrame to show that it works.
import pandas as pd
df = pd.DataFrame(
{"nr": ["a", "a"], "top": [0, 1], "bottom": [1, 2], "lith": ["clay", "sand"]}
)
print(df)
print("\nSelection result:")
print(df.gstda.slice_by_values("lith", "clay"))
nr top bottom lith
0 a 0 1 clay
1 a 1 2 sand
Selection result:
nr top bottom lith
0 a 0 1 clay
Note again that for the .gstda accessor to work, either the columns “top” and “bottom” (i.e. layered data) or the column “depth” (i.e. discrete data) must be present. Otherwise an error is
raised. As with the header accessor, several columns need to be present for all methods to work.
See the Data structures section for all the required columns.
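For readers curious why an accessor can appear on any DataFrame, the sketch below uses the Pandas extension mechanism pd.api.extensions.register_dataframe_accessor to register a small accessor with a column check. It is not GeoST's implementation: the accessor name demo_da, the class DemoDataAccessor and the method rows_equal_to are made up for this example, and the validation rule simply restates the column requirement from the text.

```python
import pandas as pd


# "demo_da" is a made-up accessor name used only for this sketch.
@pd.api.extensions.register_dataframe_accessor("demo_da")
class DemoDataAccessor:
    def __init__(self, pandas_obj):
        self._validate(pandas_obj)
        self._obj = pandas_obj

    @staticmethod
    def _validate(obj):
        # Restates the requirement from the text: layered or discrete columns.
        layered = {"top", "bottom"} <= set(obj.columns)
        discrete = "depth" in obj.columns
        if not (layered or discrete):
            raise AttributeError(
                'DataFrame needs "top" and "bottom" columns or a "depth" column'
            )

    def rows_equal_to(self, column, value):
        """Return the rows where `column` equals `value` (hypothetical method)."""
        return self._obj[self._obj[column] == value]


df = pd.DataFrame(
    {"nr": ["a", "a"], "top": [0, 1], "bottom": [1, 2], "lith": ["clay", "sand"]}
)
print(df.demo_da.rows_equal_to("lith", "clay"))
```

Because the accessor is registered on the DataFrame class itself, it is available on every DataFrame, and the validation step decides whether a particular table is usable.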