GeoST Accessors#

This page describes the header and data accessors for GeoPandas GeoDataFrames and pandas DataFrames that were introduced on the previous page. The .gsthd accessor is available for header GeoDataFrames and the .gstda accessor for data DataFrames. For these accessors to work, the GeoDataFrame or DataFrame must meet a few criteria, which are explained below. We also show how the accessors behave with several data sources to give a better understanding of their use and functionality.

We will use the GeoST borehole and CPT sample data for this explanation.

import geost

# Load sample data
cores = geost.data.boreholes_usp()
cpts = geost.data.cpts_usp()

# Separate header and data tables
cores_header, cores_data = cores.header, cores.data
cpts_header, cpts_data = cpts.header, cpts.data

Header .gsthd accessor#

The .gsthd accessor only works on GeoPandas GeoDataFrames. The reason lies in the different types of subsurface data that exist: point-like data such as boreholes, CPTs and well logs, and line-like data such as seismics, GPR and EM. The location of a survey is therefore represented by a point in some cases and by a line in others. GeoST aims to keep its methods (e.g. select_within_bbox, select_with_points) as consistent as possible, but under the hood some methods need to work slightly differently depending on the type of geometry.

The .gsthd accessor resolves this automatically: it picks the correct method based on the geometry type in the “geometry” column of a GeoDataFrame. It does so by determining the geometry type of the first row in the GeoDataFrame and selecting the matching backend, for instance a PointHeader for point data and a LineHeader for line data. To illustrate, we will show step by step how this works for the loaded cores_header.

Let’s first check the geometry type in the header table and see what happens when we simply use the .gsthd attribute.

print(cores_header.geom_type.head())  # Check geometry type and show the first rows
cores_header.gsthd
0    Point
1    Point
2    Point
3    Point
4    Point
dtype: object
<geost.accessors.accessor.Header at 0x7fc15e62d2b0>

As you can see, the geometry type is “Point” and calling the .gsthd attribute prints a geost.accessors.accessor.Header instance. This is a generic class in which the correct backend (i.e. PointHeader or LineHeader) is chosen based on the geometry type. Since the geometry type is “Point”, we expect the chosen backend to be a PointHeader. We can verify this by inspecting the resulting backend.

cores_header.gsthd._backend
<geost.accessors.header.PointHeader at 0x7fc15e62d010>

We see that the backend is indeed a PointHeader instance. This ensures that each method called through the accessor is dispatched to the PointHeader instance.
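The dispatch described above can be sketched with the accessor registration that pandas provides. This is an illustrative sketch and not GeoST's actual source: the accessor name `demohd`, the classes `DemoPointBackend` and `DemoLineBackend`, and the plain-string `geom_type` column (standing in for the `geom_type` attribute of a GeoDataFrame) are all assumptions made for the example.

```python
import pandas as pd


class DemoPointBackend:
    """Hypothetical backend for point-like surveys."""

    def __init__(self, df):
        self._df = df

    def count(self):
        return f"{len(self._df)} point survey(s)"


class DemoLineBackend:
    """Hypothetical backend for line-like surveys."""

    def __init__(self, df):
        self._df = df

    def count(self):
        return f"{len(self._df)} line survey(s)"


# Map a geometry type name to the backend class that handles it.
_BACKENDS = {"Point": DemoPointBackend, "LineString": DemoLineBackend}


@pd.api.extensions.register_dataframe_accessor("demohd")
class DemoHeader:
    """Generic accessor that picks a backend from the first row's geometry type."""

    def __init__(self, df):
        geom_type = df["geom_type"].iloc[0]  # only the first row is inspected
        if geom_type not in _BACKENDS:
            raise ValueError(f"unsupported geometry type: {geom_type!r}")
        self._backend = _BACKENDS[geom_type](df)

    def __getattr__(self, name):
        if name == "_backend":  # backend not set yet: avoid infinite recursion
            raise AttributeError(name)
        # Forward every other unknown attribute or method to the chosen backend.
        return getattr(self._backend, name)


points = pd.DataFrame({"nr": ["A", "B"], "geom_type": ["Point", "Point"]})
lines = pd.DataFrame({"nr": ["L1"], "geom_type": ["LineString"]})

print(type(points.demohd._backend).__name__)  # DemoPointBackend
print(points.demohd.count())  # 2 point survey(s)
print(lines.demohd.count())  # 1 line survey(s)
```

The same method name, `count`, is called on both tables; the accessor decides which backend answers, which is the behaviour the .gsthd accessor offers for GeoST methods.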

As mentioned before, every GeoST method available in Collection instances is also available in either the .gsthd or .gstda accessor. The code snippet below shows how methods are called normally on Collection instances and how this can be done using an accessor.

# Selection with the Collection instance
collection_select = cores.select_within_bbox(139_500, 455_000, 140_000, 455_500)

# Selection with the header accessor
header_select = cores_header.gsthd.select_within_bbox(
    139_500, 455_000, 140_000, 455_500
)

# Show the selection results
print(collection_select)
header_select.head()
BoreholeCollection:
# header = 14
nr x y surface end geometry
1 B31H0611 139600.0 455060.0 1.20 -23.00 POINT (139600 455060)
2 B31H0718 139950.0 455200.0 1.30 -271.20 POINT (139950 455200)
3 B31H0803 139675.0 455087.0 2.16 -4.84 POINT (139675 455087)
4 B31H0806 139684.0 455384.0 1.00 -49.50 POINT (139684 455384)
5 B31H0807 139684.0 455405.0 1.00 -49.50 POINT (139684 455405)
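For point data, a bounding-box selection boils down to a coordinate filter. The sketch below reproduces that idea with plain pandas on the `x` and `y` columns. It is not GeoST's implementation: the helper name `within_bbox` is hypothetical, and both the argument order (xmin, ymin, xmax, ymax) and the inclusive box edges are assumptions. The first three rows reuse coordinates from the output above; the fourth row is invented so that it falls outside the box.

```python
import pandas as pd


def within_bbox(df, xmin, ymin, xmax, ymax):
    """Return rows whose (x, y) lies inside the box, edges included (assumed)."""
    inside = df["x"].between(xmin, xmax) & df["y"].between(ymin, ymax)
    return df[inside]


header = pd.DataFrame(
    {
        "nr": ["B31H0611", "B31H0718", "B31H0803", "OUTSIDE"],
        "x": [139600.0, 139950.0, 139675.0, 141000.0],
        "y": [455060.0, 455200.0, 455087.0, 455100.0],
    }
)

selected = within_bbox(header, 139_500, 455_000, 140_000, 455_500)
print(selected["nr"].tolist())  # ['B31H0611', 'B31H0718', 'B31H0803']
```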

Data .gstda accessor#

As described in the Data structures section, GeoST mainly distinguishes between “layered” and “discrete” data. Some methods that operate on the data table therefore need to work differently for the two types of data. As with the header accessor, the .gstda accessor automatically figures out the correct method. The choice is based on the presence of the columns “top” and “bottom” (i.e. layered data) or the column “depth” (i.e. discrete data). Note that the .gstda accessor only works if one of these column sets is present.

Similar to the header, the .gstda accessor refers to a generic geost.accessors.accessor.Data instance where the correct backend is chosen. If we check the backends for the “cores_data” (layered data) and “cpts_data” (discrete data), we see that this indeed results in different backends.

print(cores_data.gstda._backend)
print(cpts_data.gstda._backend)
<geost.accessors.data.LayeredData object at 0x7fc19f37bb10>
<geost.accessors.data.DiscreteData object at 0x7fc15e69c2d0>
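The column-based rule can be written down as a small classification function. The sketch below is an assumption-laden illustration rather than GeoST code: `detect_data_kind` is a hypothetical helper, the error type is chosen for the example, and checking “top”/“bottom” before “depth” is an arbitrary choice for tables that happen to contain all three columns.

```python
import pandas as pd


def detect_data_kind(df: pd.DataFrame) -> str:
    """Classify a data table by its depth columns (hypothetical helper)."""
    if {"top", "bottom"}.issubset(df.columns):
        return "layered"
    if "depth" in df.columns:
        return "discrete"
    raise ValueError("expected 'top' and 'bottom' columns, or a 'depth' column")


layered = pd.DataFrame({"nr": ["a"], "top": [0.0], "bottom": [1.0]})
discrete = pd.DataFrame({"nr": ["b"], "depth": [0.5]})

print(detect_data_kind(layered))  # layered
print(detect_data_kind(discrete))  # discrete
```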

As with the header accessor, calling methods through the .gstda accessor differs only slightly from calling them on Collection instances.

# Selection with the Collection instance
cores_selected = cores.slice_depth_interval(0.5, 1.5)  # Between 0.5m and 1.5m depth

# Selection with the data accessor
cores_data_selected = cores_data.gstda.slice_depth_interval(0.5, 1.5)
cpts_data_selected = cpts_data.gstda.slice_depth_interval(0.5, 1.5)

print(cores_selected)
cores_data_selected.head(), cpts_data_selected.head()
BoreholeCollection:
# header = 67
(          nr         x         y  surface   end   top  bottom lith  zm   zmk  \
 1   B31H0541  139585.0  456000.0      1.2  -9.9  0.50    0.60    K NaN  None   
 2   B31H0541  139585.0  456000.0      1.2  -9.9  0.60    0.95    V NaN  None   
 3   B31H0541  139585.0  456000.0      1.2  -9.9  0.95    1.50    Z NaN  ZMFO   
 9   B31H0611  139600.0  455060.0      1.2 -23.0  0.50    1.10    K NaN  None   
 10  B31H0611  139600.0  455060.0      1.2 -23.0  1.10    1.50    V NaN  None   
 
     ...  cons color lutum_pct plants shells  kleibrokjes strat_1975  \
 1   ...  None    BR       NaN      0      0            0       None   
 2   ...  None    BR       NaN      0      0            0       None   
 3   ...  None    GR       NaN      0      0            0       None   
 9   ...  None    GR       NaN      0      0            0         WE   
 10  ...  None    BR       NaN      1      0            0         WE   
 
    strat_2003 strat_inter                                               desc  
 1          EC         NaN                       [KLEI#***#****#*] grysbruin.  
 2          NI         NaN                     [VEEN#***#****#*] donkerbruin.  
 3          EC         NaN  [ZAND#***#****#*] FYN TOT matig fyn# iets slib...  
 9          EC         NaN              [KLEI#***#****#1] vet# grys# roestig.  
 10         NI         NaN  [VEEN#***#****#*] donkerbruin# bosveen MET vee...  
 
 [5 rows x 32 columns],
                 nr              x              y vertical_datum  surface  \
 3  CPT000000009626  140950.998794  455358.997741            NAP      2.0   
 4  CPT000000009626  140950.998794  455358.997741            NAP      2.0   
 5  CPT000000009626  140950.998794  455358.997741            NAP      2.0   
 6  CPT000000009626  140950.998794  455358.997741            NAP      2.0   
 7  CPT000000009626  140950.998794  455358.997741            NAP      2.0   
 
    cone_penetration_test_fk  cone_penetration_test_result_pk  \
 3                      9579                         11690885   
 4                      9579                         11690886   
 5                      9579                         11690887   
 6                      9579                         11690888   
 7                      9579                         11690889   
 
    penetration_length  depth  elapsed_time  ...  magnetic_inclination  \
 3                 0.5    0.5           NaN  ...                  None   
 4                 0.6    0.6           NaN  ...                  None   
 5                 0.7    0.7           NaN  ...                  None   
 6                 0.8    0.8           NaN  ...                  None   
 7                 0.9    0.9           NaN  ...                  None   
 
   magnetic_declination local_friction pore_ratio temperature pore_pressure_u1  \
 3                 None            NaN       None        None             None   
 4                 None            NaN       None        None             None   
 5                 None            NaN       None        None             None   
 6                 None            NaN       None        None             None   
 7                 None            NaN       None        None             None   
 
   pore_pressure_u2 pore_pressure_u3  friction_ratio   end  
 3              NaN             None             NaN -22.0  
 4              NaN             None             NaN -22.0  
 5              NaN             None             NaN -22.0  
 6              NaN             None             NaN -22.0  
 7              NaN             None             NaN -22.0  
 
 [5 rows x 33 columns])
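The two results hint at why layered and discrete tables need separate backends: a layer is an interval that can straddle the requested bounds, whereas a discrete measurement sits at a single depth. The sketch below illustrates that difference in plain pandas. It is a conceptual illustration only, not GeoST's code: the helper names are hypothetical, and both the clipping of layer boundaries and the inclusive bounds are assumptions.

```python
import pandas as pd


def slice_layered(df, upper, lower):
    """Keep layers overlapping [upper, lower] and clip them to that interval."""
    overlaps = (df["bottom"] > upper) & (df["top"] < lower)
    out = df[overlaps].copy()
    out["top"] = out["top"].clip(lower=upper)  # raise tops above the interval
    out["bottom"] = out["bottom"].clip(upper=lower)  # cut bottoms below it
    return out


def slice_discrete(df, upper, lower):
    """Keep measurements whose depth lies within [upper, lower]."""
    return df[df["depth"].between(upper, lower)]


layers = pd.DataFrame(
    {"nr": ["a"] * 3, "top": [0.0, 0.6, 1.2], "bottom": [0.6, 1.2, 2.0]}
)
points = pd.DataFrame({"nr": ["b"] * 4, "depth": [0.25, 0.5, 1.0, 1.75]})

print(slice_layered(layers, 0.5, 1.5))
print(slice_discrete(points, 0.5, 1.5))
```

In this sketch all three layers survive because each overlaps the interval, but the first and last are trimmed to 0.5 and 1.5; the discrete table simply loses the rows outside the interval.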

Use with generic Geopandas/Pandas#

The .gsthd and .gstda accessors also work on any GeoDataFrame or DataFrame instance, as long as it has the columns required by the method being called. Data that has been loaded or created without GeoST can therefore use the accessors as well.

We will demonstrate this for the header by creating a simple GeoDataFrame with two point geometries and using a GeoST selection method.

import geopandas as gpd

gdf = gpd.GeoDataFrame(geometry=gpd.points_from_xy([1, 10], [1, 20]), crs=28992)
print(gdf)
print("\nSelection result:")
print(gdf.gsthd.select_within_bbox(0, 0, 2, 2))
        geometry
0    POINT (1 1)
1  POINT (10 20)

Selection result:
      geometry
0  POINT (1 1)

Note that the above GeoDataFrame contains only a “geometry” column, so not all methods of the .gsthd accessor will work. See the Data structures section for the columns required for all methods to work.

For the data table, we will likewise create a simple example DataFrame to show that it works.

import pandas as pd

df = pd.DataFrame(
    {"nr": ["a", "a"], "top": [0, 1], "bottom": [1, 2], "lith": ["clay", "sand"]}
)
print(df)
print("\nSelection result:")
print(df.gstda.slice_by_values("lith", "clay"))
  nr  top  bottom  lith
0  a    0       1  clay
1  a    1       2  sand

Selection result:
  nr  top  bottom  lith
0  a    0       1  clay

Note again that for the .gstda accessor to work, either the columns “top” and “bottom” (i.e. layered data) or the column “depth” (i.e. discrete data) must be present. Otherwise an error will be raised. As with the header accessor, several more columns need to be present for all methods to work. See the Data structures section for all the required columns.