Retrieving and analysing BRO soil cores#
In this example, we will download soil cores (BRO BHR-P objects) and calculate the total thickness of sand, clay, silt and peat layers for each or the boreholes.
First we import GeoST and retrieve soil cores from the area of Landgoed Oostbroek (located between Utrecht Science Park and Zeist). GeoST directly connects to the BRO REST-API service. This allows the user to download BRO objects from a spatial query and apply GeoST functionality to the loaded objects. In this case we will use a bounding box:
import geost
bhrp_cores = geost.bro_api_read(
"BHR-P", bbox=(141_000, 455_200, 142_500, 456_000)
) # xmin, ymin, xmax, ymax
✅ Invalid surveys were dropped from the DataFrame because geost.config.validation.DROP_INVALID=True
📖 See the user guide section on validation for advanced handling of validation issues: https://deltares-research.github.io/geost/user_guide/validation.html
NOTE: Header has been reset to align with data because AUTO_ALIGN is enabled in the GeoST configuration.
/home/runner/work/geost/geost/geost/validation/validate.py:77: ValidationWarning:
============================================================
⚠️ VALIDATION ISSUE (1/1)
============================================================
Column : 'surface'
Message : Column 'surface' must not contain NaN values, but some values are NaN.
# surveys : 2
# rows : 8
============================================================
warnings.warn(
/home/runner/work/geost/geost/geost/base.py:473: AlignmentWarning: Header covers more/other objects than present in the data table. consider running the method 'reset_header' to update the header.
warnings.warn(
Note that we received a validation warning because two of the boreholes (BHR000000228773 and BHR000000190070) do not have elevation data (column ‘surface’ has NaN values) as they are located in a water body. Objects that do not validate are dropped by default in GeoST. See more on validation and validation settings in the User guide. Let’s take a look now at what we have at this point:
print(bhrp_cores)
bhrp_cores.header.head()
Collection
header (rows, columns) : (85, 6)
data (rows, columns) : (424, 8)
crs: Amersfoort / RD New
vertical datum: NAP height
| nr | x | y | surface | end | geometry | |
|---|---|---|---|---|---|---|
| 0 | BHR000000003923 | 142068.999144 | 455571.998265 | 1.98 | 0.48 | POINT (142068.999 455571.998) |
| 1 | BHR000000004193 | 142171.999332 | 455443.998729 | 1.85 | 0.35 | POINT (142171.999 455443.999) |
| 2 | BHR000000008779 | 142051.999377 | 455657.997234 | 2.33 | 0.83 | POINT (142051.999 455657.997) |
| 3 | BHR000000009051 | 142243.999389 | 455488.998061 | 2.26 | 0.76 | POINT (142243.999 455488.998) |
| 4 | BHR000000021785 | 141966.998934 | 455592.997750 | 2.33 | 0.83 | POINT (141966.999 455592.998) |
BRO objects come in the Rijksdriehoek coordinate reference system by default (EPSG: 28992), but the example project requires us to work in UTM 31N coordinates (EPSG: 32631). Hence we convert the CRS and check the result:
bhrp_cores.to_crs(32631) # This is an in-place operation
print(bhrp_cores.crs)
bhrp_cores.header.head()
EPSG:32631
| nr | x | y | surface | end | geometry | |
|---|---|---|---|---|---|---|
| 0 | BHR000000003923 | 650624.299675 | 5.773135e+06 | 1.98 | 0.48 | POINT (650624.3 5773135.144) |
| 1 | BHR000000004193 | 650731.447988 | 5.773011e+06 | 1.85 | 0.35 | POINT (650731.448 5773010.603) |
| 2 | BHR000000008779 | 650604.482935 | 5.773221e+06 | 2.33 | 0.83 | POINT (650604.483 5773220.535) |
| 3 | BHR000000009051 | 650801.927695 | 5.773058e+06 | 2.26 | 0.76 | POINT (650801.928 5773057.944) |
| 4 | BHR000000021785 | 650521.667780 | 5.773153e+06 | 2.33 | 0.83 | POINT (650521.668 5773152.779) |
Furthermore, we’d only like to work with boreholes that have a minimum length of 1.5 m:
bhrp_cores = bhrp_cores.select_by_length(
1.5
) # Not an in-place operation, so assign to new or overwrite existing variable!
print(bhrp_cores)
Collection
header (rows, columns) : (83, 6)
data (rows, columns) : (414, 8)
crs: WGS 84 / UTM zone 31N
vertical datum: NAP height
It looks like we ditched only two boreholes that were too short.
Next, we need to get the main lithology (Zand, Leem, Klei or Veen = Sand, Silt, Clay or Peat) of each described layer in our collection of boreholes. We can use the “standard_name” column in the data, which is a lithological description including an indication of admixture:
bhrp_cores.data["standardSoilName"].unique()
<ArrowStringArray>
[ 'sterkSiltigeKlei', 'matigSiltigeKlei', 'matigZandigeKlei',
'kleiigZand', 'uiterstSiltigeKlei', 'sterkZandigeKlei',
'zwakSiltigZand', 'sterkZandigeLeem', 'zwakZandigeLeem',
'mineraalarmVeen', 'matigSiltigZand', 'zwakKleiigVeen',
'sterkSiltigZand']
Length: 13, dtype: str
Since we are only interested in the main lithology we need to simplify the names. There is no method for this provided by GeoST, so we will have to create our own. Luckily, GeoST is built on Pandas, which means that you can easily apply your own logic if built-in GeoST methods don’t get you where you want. In this case, we will use a string method of a Pandas Series to use a regex (i.e. regular expression) which extracts the main lithology. This will reduce e.g. “matigSiltigeKlei” to “Klei”.
bhrp_cores.data["main_lithology"] = bhrp_cores.data["standardSoilName"].str.extract(
r"(Klei|Zand|Leem|Veen)$",
expand=False, # Find one of these words at the end of the standardSoilName
)
# Check and show that it worked:
bhrp_cores.data["main_lithology"].unique()
<ArrowStringArray>
['Klei', 'Zand', 'Leem', 'Veen']
Length: 4, dtype: str
Now we can proceed to compute the total thickness of our main lithologies by applying the built-in method “get_cumulative_thickness” on our just created “main_lithology” column.
# By including the "include_in_header=True" argument, the result will be added in-place
# to the header table of the BoreholeCollection. If False, the function will only return
# a Pandas dataframe with the results.
for lithoclass in bhrp_cores.data["main_lithology"].unique():
bhrp_cores.get_cumulative_thickness(
"main_lithology", lithoclass, include_in_header=True
)
# Check if the new columns are present and correct in the header table
bhrp_cores.header.head()
| nr | x | y | surface | end | geometry | Klei_thickness | Zand_thickness | Leem_thickness | Veen_thickness | |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | BHR000000003923 | 650624.299675 | 5.773135e+06 | 1.98 | 0.48 | POINT (650624.3 5773135.144) | 1.50 | NaN | NaN | NaN |
| 1 | BHR000000004193 | 650731.447988 | 5.773011e+06 | 1.85 | 0.35 | POINT (650731.448 5773010.603) | 1.20 | 0.30 | NaN | NaN |
| 2 | BHR000000008779 | 650604.482935 | 5.773221e+06 | 2.33 | 0.83 | POINT (650604.483 5773220.535) | 1.50 | NaN | NaN | NaN |
| 3 | BHR000000009051 | 650801.927695 | 5.773058e+06 | 2.26 | 0.76 | POINT (650801.928 5773057.944) | 1.50 | NaN | NaN | NaN |
| 4 | BHR000000021785 | 650521.667780 | 5.773153e+06 | 2.33 | 0.83 | POINT (650521.668 5773152.779) | 0.85 | 0.65 | NaN | NaN |
For the purpose of this example we will quickly plot the results on a map below. Hover your mouse over the points to display header information, including the layer thicknesses that we just added. From this point you would normally export the data or continue with further data analyses.
bhrp_cores.header.explore(style_kwds=dict(color="red", weight=6))