{ "cells": [ { "cell_type": "markdown", "id": "bf2bcb02", "metadata": {}, "source": [ "# GeoST Accessors\n", "This page goes into more detail about the header and data accessors available for\n", "Geopandas GeoDataFrames and Pandas DataFrames, as shown on the [previous page](./data_structures.ipynb#geost-accessors). For header GeoDataFrames, the [`.gsthd`](../api_reference/header_accessors.rst)\n", "accessor is available, and for data DataFrames, the [`.gstda`](../api_reference/data_accessors.rst)\n", "accessor is available. For these accessors to work, the GeoDataFrame or DataFrame must meet some criteria,\n", "which are explained below. Additionally, we provide some details about how they\n", "work with several data sources for a better understanding of their use and functionality.\n", "\n", "We will use the GeoST borehole and CPT sample data for this explanation.\n" ] }, { "cell_type": "code", "execution_count": null, "id": "4c531e7d", "metadata": {}, "outputs": [], "source": [ "import geost\n", "\n", "# Load sample data\n", "cores = geost.data.boreholes_usp()\n", "cpts = geost.data.cpts_usp()\n", "\n", "# Separate header and data tables\n", "cores_header, cores_data = cores.header, cores.data\n", "cpts_header, cpts_data = cpts.header, cpts.data" ] }, { "cell_type": "markdown", "id": "901dc456", "metadata": {}, "source": [ "## Header `.gsthd` accessor\n", "The [`.gsthd`](../api_reference/header_accessors.rst) accessor **only works on Geopandas GeoDataFrames**,\n", "which is related to the different types of subsurface data that exist. Typically available types\n", "of subsurface data comprise point-like data such as boreholes, CPTs and well logs, and line-like\n", "data such as seismics, GPR and EM. This means that the location of a survey is in\n", "some cases represented by a point and in others by a line. In its design, GeoST aims to keep\n", "methods as consistent as possible (e.g. 
`select_within_bbox`, `select_with_points`),\n", "but under the hood, some methods may need to work slightly differently depending on the type of\n", "geometry.\n", "\n", "The `.gsthd` accessor automatically handles this and figures out the correct method\n", "to use based on the geometry type in the \"geometry\" column of a GeoDataFrame. This is done by determining\n", "the geometry type of the first row in the GeoDataFrame and selecting the required backend:\n", "for instance, a [`PointHeader`](../api_reference/header_accessors.rst#pointheader) in case of\n", "point data and a [`LineHeader`](../api_reference/data_accessors.rst#lineheader) in case of line\n", "data. To illustrate this, we will show step by step how this works for the loaded `cores_header`.\n", "\n", "Let's first check the geometry type in the header table and see what happens when we simply\n", "access the `.gsthd` attribute." ] }, { "cell_type": "code", "execution_count": null, "id": "fc49650d", "metadata": {}, "outputs": [], "source": [ "print(cores_header.geom_type.head())  # Check geometry type and show the first rows\n", "cores_header.gsthd" ] }, { "cell_type": "markdown", "id": "80d7b7bc", "metadata": {}, "source": [ "As you can see, the geometry type is \"Point\" and accessing the `.gsthd` attribute returns a\n", "`geost.accessors.accessor.Header` instance. This is a generic class where the correct backend\n", "(i.e. `PointHeader` or `LineHeader`) is chosen based on the geometry type. Since the geometry type\n", "is \"Point\", we expect the chosen backend to be a `PointHeader`. We can verify this by inspecting\n", "the resulting backend." ] }, { "cell_type": "code", "execution_count": null, "id": "86c33b92", "metadata": {}, "outputs": [], "source": [ "cores_header.gsthd._backend" ] }, { "cell_type": "markdown", "id": "85d9a982", "metadata": {}, "source": [ "We see that the backend indeed is a `PointHeader` instance. 
This ensures that each\n", "method called through the accessor is dispatched to the `PointHeader` instance.\n", "\n", "As mentioned before, every GeoST method available in `Collection` instances is also available\n", "in either the `.gsthd` or `.gstda` accessor. The code snippet below shows how methods are\n", "normally called on `Collection` instances and how the same can be done using an accessor." ] }, { "cell_type": "code", "execution_count": null, "id": "bec5f629", "metadata": {}, "outputs": [], "source": [ "# Selection with the Collection instance\n", "collection_select = cores.select_within_bbox(139_500, 455_000, 140_000, 455_500)\n", "\n", "# Selection with the header accessor\n", "header_select = cores_header.gsthd.select_within_bbox(\n", "    139_500, 455_000, 140_000, 455_500\n", ")\n", "\n", "# Show the selection results\n", "print(collection_select)\n", "header_select.head()" ] }, { "cell_type": "markdown", "id": "0e1c46d0", "metadata": {}, "source": [ "## Data `.gstda` accessor\n", "As described in the [Data structures](./data_structures.ipynb#data-table) section, GeoST mainly distinguishes\n", "between \"layered\" and \"discrete\" data. Therefore, some methods that operate on the data table may need\n", "to work differently for each type of data. This is handled by the [`.gstda`](../api_reference/data_accessors.rst) accessor.\n", "Similar to the header accessor, the `.gstda` accessor automatically figures out the correct method\n", "to use. The choice is determined by the presence of the columns **\"top\"** and **\"bottom\"** (i.e. layered data) or the presence of the column **\"depth\"** (i.e. discrete data). Note that the `.gstda` accessor only works if one of these column sets is present.\n", "\n", "Similar to the header, the `.gstda` accessor refers to a generic `geost.accessors.accessor.Data`\n", "instance where the correct backend is chosen. 
If we check the backends for `cores_data` (layered\n", "data) and `cpts_data` (discrete data), we see that this indeed results in different backends." ] }, { "cell_type": "code", "execution_count": null, "id": "640d6283", "metadata": {}, "outputs": [], "source": [ "print(cores_data.gstda._backend)\n", "print(cpts_data.gstda._backend)" ] }, { "cell_type": "markdown", "id": "e51700ef", "metadata": {}, "source": [ "As shown before for the header, calling methods through the `.gstda` accessor differs only slightly from using\n", "`Collection` instances." ] }, { "cell_type": "code", "execution_count": null, "id": "a9ea1333", "metadata": {}, "outputs": [], "source": [ "# Selection with the Collection instance\n", "cores_selected = cores.slice_depth_interval(0.5, 1.5)  # Between 0.5 m and 1.5 m depth\n", "\n", "# Selection with the data accessor\n", "cores_data_selected = cores_data.gstda.slice_depth_interval(0.5, 1.5)\n", "cpts_data_selected = cpts_data.gstda.slice_depth_interval(0.5, 1.5)\n", "\n", "print(cores_selected)\n", "cores_data_selected.head(), cpts_data_selected.head()" ] }, { "cell_type": "markdown", "id": "6a7a8efd", "metadata": {}, "source": [ "## Use with generic Geopandas/Pandas\n", "The `.gsthd` and `.gstda` accessors also work on any GeoDataFrame or DataFrame instance,\n", "as long as it has the columns required by the specific methods being used. Therefore,\n", "data that has been loaded or created without GeoST can also use the accessors.\n", "\n", "We will demonstrate this for the header by creating a simple GeoDataFrame with two point\n", "geometries and using a GeoST selection method." 
] }, { "cell_type": "code", "execution_count": null, "id": "64e89481", "metadata": {}, "outputs": [], "source": [ "import geopandas as gpd\n", "\n", "# Two points in EPSG:28992; only the first one lies within the bounding box below\n", "gdf = gpd.GeoDataFrame(geometry=gpd.points_from_xy([1, 10], [1, 20]), crs=28992)\n", "print(gdf)\n", "print(\"\\nSelection result:\")\n", "print(gdf.gsthd.select_within_bbox(0, 0, 2, 2))" ] }, { "cell_type": "markdown", "id": "314df9e7", "metadata": {}, "source": [ "Note that the above GeoDataFrame only contains a \"geometry\" column and, therefore, not all\n", "methods of the `.gsthd` accessor will work. See the [Data structures](./data_structures.ipynb#header-table) section for the columns required for all methods to work.\n", "\n", "For the data table, we will also create a simple example DataFrame to show that it works." ] }, { "cell_type": "code", "execution_count": null, "id": "17fe2dfd", "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "\n", "# Layered data: the \"top\" and \"bottom\" columns are present\n", "df = pd.DataFrame(\n", "    {\"nr\": [\"a\", \"a\"], \"top\": [0, 1], \"bottom\": [1, 2], \"lith\": [\"clay\", \"sand\"]}\n", ")\n", "print(df)\n", "print(\"\\nSelection result:\")\n", "print(df.gstda.slice_by_values(\"lith\", \"clay\"))" ] }, { "cell_type": "markdown", "id": "289c4e60", "metadata": {}, "source": [ "Note again that for the `.gstda` accessor to work, either the columns **\"top\"** and **\"bottom\"** (i.e. layered data) or the column **\"depth\"** (i.e. discrete data) must be present. Otherwise, an error will\n", "be raised. Similar to the header accessor, several columns need to be present for all methods to work.\n", "See the [Data structures](./data_structures.ipynb#data-table) section for all the required columns." 
] } ], "metadata": { "kernelspec": { "display_name": "default", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.13.5" } }, "nbformat": 4, "nbformat_minor": 5 }