{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Data structures\n",
    "GeoST uses standardized internal data structures and data validation to ensure that the\n",
    "functionality that GeoST offers can always reliably be applied to parsed data. This user\n",
    "guide sections dives deeper into GeoST data structures. For a more basic overview of the\n",
    "concepts, see the [Introduction to GeoST](../getting_started/introduction.ipynb#concept).\n",
    "\n",
    "## Point and line data\n",
    "To describe point data (e.g. boreholes, well logs, cpts) and line data (e.g. seismics, \n",
    "GPR, EM) you need a minimal amount of information on the identification and position of each\n",
    "point/line (`Header`). For each point/line there are measurements or descriptions available \n",
    "of the subsurface (`Data`). The following header and data objects are used to\n",
    "describe point and line data:\n",
    "\n",
    "**Header objects:**\n",
    "* *[`PointHeader`](../api_reference/point_header.rst)* describes metadata and spatial information of point surveys.\n",
    "* *[`LineHeader`](../api_reference/line_header.rst)* describes metadata and spatial information of line surveys.\n",
    "\n",
    "**Data objects:**\n",
    "* *[`LayeredData`](../api_reference/layered_data.rst)* describes subsurface data in layers defined by tops and bottoms.\n",
    "* *[`DiscreteData`](../api_reference/discrete_data.rst)* describes subsurface data discretized by depth.\n",
    "* *LineData* NOTE: Not yet implemented\n",
    "\n",
    "These basic objects are used to build `Collections`. E.g. a [`BoreholeCollection`](../api_reference/borehole_collection.rst)\n",
    "is built from the combination of a [`PointHeader`](../api_reference/point_header.rst)\n",
    "object and a [`LayeredData`](../api_reference/layered_data.rst) objects. The below \n",
    "figure gives a complete overview of the object hierarchy in GeoST for point and line data.\n",
    "\n",
    "<p align=\"left\">\n",
    "    <img src=\"../_static/object_hierarchy.png\" alt=\"GeoST object hierarchy\" title=\"GeoST object hierarchy\" width=\"1000\" />\n",
    "</p>\n",
    "\n",
    "### Collection objects\n",
    "Collection objects are composed of an instance of Header and Data. The collection provides\n",
    "additional logic to maintain alignment between header and data. A collection object\n",
    "inherits all methods that are provided both through the child header object and the child data\n",
    "object. For example: you can access spatial selection methods (= header operations) as well as\n",
    "data slicing methods (= data operations) directly from the collection. It is recommended to\n",
    "work with collections by default, unless you specifically need only header or data functionality.\n",
    "\n",
    "GeoST currently offers the following collection classes:\n",
    "\n",
    "* *[`BoreholeCollection`](../api_reference/borehole_collection.rst)*: A collection of borehole data, composed of [`PointHeader`](../api_reference/point_header.rst) and [`LayeredData`](../api_reference/layered_data.rst).\n",
    "* *[`CptCollection`](../api_reference/cpt_collection.rst)*: A collection of cone penetration test data, composed of [`PointHeader`](../api_reference/point_header.rst) and [`DiscreteData`](../api_reference/discrete_data.rst).\n",
    "\n",
    "By default, read functions for point/line data return a collection (see: [Reading data](./reading_data.ipynb)). \n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "BoreholeCollection:\n",
      "# header = 67\n"
     ]
    }
   ],
   "source": [
    "import geost\n",
    "\n",
    "# Load the Utrecht Science Park example borehole data\n",
    "boreholes_collection = geost.data.boreholes_usp()\n",
    "\n",
    "# boreholes_collection is an instance of BoreholeCollection and contains 67 boreholes\n",
    "print(boreholes_collection)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Header objects\n",
    "GeoST header objects are built on top of a [`Geopandas.Geodataframe`](https://geopandas.org/en/stable/docs/reference/api/geopandas.GeoDataFrame.html)\n",
    "to hold data, including point geometries ([`PointHeader`](../api_reference/point_header.rst))\n",
    "or linestring geometries ([`LineHeader`](../api_reference/line_header.rst)). The Geodataframe \n",
    "is the attribute named `gdf` within a header object. Each entry (row in the Geodataframe) \n",
    "corresponds to one point or line survey, e.g. one borehole or one seismic line. GeoST \n",
    "header objects offer built-in methods for changing horizontal and vertical reference \n",
    "systems, selecting data based on spatial conditions, export of table data and export of \n",
    "geometries for viewing header data in GIS.\n",
    "\n",
    "A [`PointHeader`](../api_reference/point_header.rst) requires a bare minimum of data\n",
    "columns to describe the data and to ensure that all built-in methods can be applied:\n",
    "\n",
    "| Column name | Validation criteria | Description |\n",
    "| ----------- | ------------------- | ----------- |\n",
    "| nr | Must be interpretable as string | Identification name/number/code of the point survey |\n",
    "| x | Must be of numeric type (int or float) | X-coordinate |\n",
    "| y | Must be of numeric type (int or float) | Y-coordinate |\n",
    "| surface | Must be of numeric type (int or float) and higher than end depth | Surface elevation of the point survey in m |\n",
    "| end | Must be of numeric type (int or float) and lower than surface elevation | End depth of the point survey in m |\n",
    "| geometry | Must be of type `shapely.geometry.Point` | Point geometry of the survey location |\n",
    "\n",
    "The header is not limited to just these columns. Any number of columns can be added to give\n",
    "additional information on surveys. Some built-in analysis methods may even add information\n",
    "to the header. For instance, the method [`PointHeader.get_area_labels`](../api_reference/generated/geost.base.PointHeader.get_area_labels.rst)\n",
    "has an argument `include_in_header` which, if set to true, adds a column with results\n",
    "to the header Geodataframe.\n",
    "\n",
    "If you're only interested in survey locations and/or metadata, it is adviced to directly\n",
    "work with the header object to avoid additional overhead caused by a parent collection \n",
    "object (overhead is caused by checks of the header against data after every operation to \n",
    "ensure header/data alignment). Read functions for point and line data (see: [Reading data](./reading_data.ipynb))\n",
    "return a corresponding collection object by default, but you can assign only the header to \n",
    "a variable in order to continue with just the header data. See the example below."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "EPSG:28992\n",
      "EPSG:5709\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>nr</th>\n",
       "      <th>x</th>\n",
       "      <th>y</th>\n",
       "      <th>surface</th>\n",
       "      <th>end</th>\n",
       "      <th>geometry</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>B31H0541</td>\n",
       "      <td>139585</td>\n",
       "      <td>456000</td>\n",
       "      <td>1.20</td>\n",
       "      <td>-9.90</td>\n",
       "      <td>POINT (139585 456000)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>B31H0611</td>\n",
       "      <td>139600</td>\n",
       "      <td>455060</td>\n",
       "      <td>1.20</td>\n",
       "      <td>-23.00</td>\n",
       "      <td>POINT (139600 455060)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>B31H0718</td>\n",
       "      <td>139950</td>\n",
       "      <td>455200</td>\n",
       "      <td>1.30</td>\n",
       "      <td>-271.20</td>\n",
       "      <td>POINT (139950 455200)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>B31H0803</td>\n",
       "      <td>139675</td>\n",
       "      <td>455087</td>\n",
       "      <td>2.16</td>\n",
       "      <td>-4.84</td>\n",
       "      <td>POINT (139675 455087)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>B31H0806</td>\n",
       "      <td>139684</td>\n",
       "      <td>455384</td>\n",
       "      <td>1.00</td>\n",
       "      <td>-49.50</td>\n",
       "      <td>POINT (139684 455384)</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "         nr       x       y  surface     end               geometry\n",
       "0  B31H0541  139585  456000     1.20   -9.90  POINT (139585 456000)\n",
       "1  B31H0611  139600  455060     1.20  -23.00  POINT (139600 455060)\n",
       "2  B31H0718  139950  455200     1.30 -271.20  POINT (139950 455200)\n",
       "3  B31H0803  139675  455087     2.16   -4.84  POINT (139675 455087)\n",
       "4  B31H0806  139684  455384     1.00  -49.50  POINT (139684 455384)"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Load the Utrecht Science Park example borehole data and only assign the header data.\n",
    "boreholes_header = geost.data.boreholes_usp().header\n",
    "\n",
    "# Print horizontal and vertical reference system properties and the first few rows of\n",
    "# the boreholes header data.\n",
    "print(boreholes_header.horizontal_reference)\n",
    "print(boreholes_header.vertical_reference)\n",
    "boreholes_header.gdf.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Data objects\n",
    "GeoST data objects are built on top of a [`Pandas.DataFrames`](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html) \n",
    "to store data. The actual dataframe is named `df` within a data object. Each entry (row)\n",
    "in the data represents a single layer (in case of [`LayeredData`](../api_reference/layered_data.rst))\n",
    "bounded by a top and a bottom or a single measurement (in case of [`DiscreteData`](../api_reference/discrete_data.rst))\n",
    "at a certain depth. One point or line survey (i.e. one row in the header) can be associated\n",
    "with multiple rows of data. E.g. a single borehole with 10 described layers is represented\n",
    "by one row in the header Geodataframe and ten rows in the data DataFrame. GeoST \n",
    "data objects offer built-in methods for conditional selections, slicing, basic\n",
    "analysis and data export.\n",
    "\n",
    "An instance of [`LayeredData`](../api_reference/layered_data.rst) requires a bare minimum of data\n",
    "columns to describe the data and to ensure that all built-in methods can be applied:\n",
    "\n",
    "| Column name | Validation criteria | Description |\n",
    "| ----------- | ------------------- | ----------- |\n",
    "| nr | Must be interpretable as string | Identification name/number/code of the point survey |\n",
    "| x | Must be of numeric type (int or float) | X-coordinate |\n",
    "| y | Must be of numeric type (int or float) | Y-coordinate |\n",
    "| x_bot | Must be of numeric type (int or float) | X-coordinate of layer bottom (only required if survey does not point straight down) |\n",
    "| y_bot | Must be of numeric type (int or float) | X-coordinate of layer bottom (only required if survey does not point straight down) |\n",
    "| surface | Must be of numeric type (int or float) and higher than end depth | Surface elevation of the point survey in m |\n",
    "| end | Must be of numeric type (int or float) and lower than surface elevation | End depth of the point survey in m |\n",
    "| top | Must be of numeric type (int or float); starts at 0; is increasing | Elevation of layer top. The first layer always starts at 0 and increases downwards |\n",
    "| bottom | Must be of numeric type (int or float); is larger than top; is increasing | Elevation of layer bottom |\n",
    "\n",
    "An instance of [`DiscreteData`](../api_reference/discrete_data.rst) requires a bare minimum of data\n",
    "columns to describe the data and to ensure that all built-in methods can be applied:\n",
    "\n",
    "| Column name | Validation criteria | Description |\n",
    "| ----------- | ------------------- | ----------- |\n",
    "| nr | Must be interpretable as string | Identification name/number/code of the point survey |\n",
    "| x | Must be of numeric type (int or float) | X-coordinate |\n",
    "| y | Must be of numeric type (int or float) | Y-coordinate |\n",
    "| surface | Must be of numeric type (int or float) and higher than end depth | Surface elevation of the point survey in m |\n",
    "| end | Must be of numeric type (int or float) and lower than surface elevation | End depth of the point survey in m |\n",
    "| depth | Must be of numeric type (int or float); is increasing | Depth where the measurement was taken |\n",
    "\n",
    "All other columns contain the actual data with measurements for each layer or at each depth.\n",
    "\n",
    "If you're only interested in the measurements and don't need to work with geometries or\n",
    "any other additional header data, it is adviced to directly work with the data object to \n",
    "avoid additional overhead caused by a parent collection object (overhead is caused by \n",
    "checks of the header against data after every operation to ensure header/data alignment). \n",
    "Read functions for point and line data (see: [Reading data](./reading_data.ipynb))\n",
    "return a corresponding collection object by default, but you can assign only the data object to \n",
    "a variable in order to continue with just the data. See the example below. Some\n",
    "read functions, such as [`read_borehole_table`](../api_reference/generated/geost.read_borehole_table.rst)\n",
    "provide the argument `as_collection` which defaults to True, but can be set to False to\n",
    "only return the [`LayeredData`](../api_reference/layered_data.rst) object in this example."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>nr</th>\n",
       "      <th>x</th>\n",
       "      <th>y</th>\n",
       "      <th>surface</th>\n",
       "      <th>end</th>\n",
       "      <th>top</th>\n",
       "      <th>bottom</th>\n",
       "      <th>lith</th>\n",
       "      <th>zm</th>\n",
       "      <th>zmk</th>\n",
       "      <th>...</th>\n",
       "      <th>cons</th>\n",
       "      <th>color</th>\n",
       "      <th>lutum_pct</th>\n",
       "      <th>plants</th>\n",
       "      <th>shells</th>\n",
       "      <th>kleibrokjes</th>\n",
       "      <th>strat_1975</th>\n",
       "      <th>strat_2003</th>\n",
       "      <th>strat_inter</th>\n",
       "      <th>desc</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>B31H0541</td>\n",
       "      <td>139585</td>\n",
       "      <td>456000</td>\n",
       "      <td>1.2</td>\n",
       "      <td>-9.9</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.20</td>\n",
       "      <td>K</td>\n",
       "      <td>NaN</td>\n",
       "      <td>None</td>\n",
       "      <td>...</td>\n",
       "      <td>None</td>\n",
       "      <td>ON</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>None</td>\n",
       "      <td>EC</td>\n",
       "      <td>NaN</td>\n",
       "      <td>[TEELAARDE#***#****#*] ..........................</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>B31H0541</td>\n",
       "      <td>139585</td>\n",
       "      <td>456000</td>\n",
       "      <td>1.2</td>\n",
       "      <td>-9.9</td>\n",
       "      <td>0.20</td>\n",
       "      <td>0.60</td>\n",
       "      <td>K</td>\n",
       "      <td>NaN</td>\n",
       "      <td>None</td>\n",
       "      <td>...</td>\n",
       "      <td>None</td>\n",
       "      <td>BR</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>None</td>\n",
       "      <td>EC</td>\n",
       "      <td>NaN</td>\n",
       "      <td>[KLEI#***#****#*] grysbruin.</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>B31H0541</td>\n",
       "      <td>139585</td>\n",
       "      <td>456000</td>\n",
       "      <td>1.2</td>\n",
       "      <td>-9.9</td>\n",
       "      <td>0.60</td>\n",
       "      <td>0.95</td>\n",
       "      <td>V</td>\n",
       "      <td>NaN</td>\n",
       "      <td>None</td>\n",
       "      <td>...</td>\n",
       "      <td>None</td>\n",
       "      <td>BR</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>None</td>\n",
       "      <td>NI</td>\n",
       "      <td>NaN</td>\n",
       "      <td>[VEEN#***#****#*] donkerbruin.</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>B31H0541</td>\n",
       "      <td>139585</td>\n",
       "      <td>456000</td>\n",
       "      <td>1.2</td>\n",
       "      <td>-9.9</td>\n",
       "      <td>0.95</td>\n",
       "      <td>2.80</td>\n",
       "      <td>Z</td>\n",
       "      <td>NaN</td>\n",
       "      <td>ZMFO</td>\n",
       "      <td>...</td>\n",
       "      <td>None</td>\n",
       "      <td>GR</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>None</td>\n",
       "      <td>EC</td>\n",
       "      <td>NaN</td>\n",
       "      <td>[ZAND#***#****#*] FYN TOT matig fyn# iets slib...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>B31H0541</td>\n",
       "      <td>139585</td>\n",
       "      <td>456000</td>\n",
       "      <td>1.2</td>\n",
       "      <td>-9.9</td>\n",
       "      <td>2.80</td>\n",
       "      <td>4.20</td>\n",
       "      <td>Z</td>\n",
       "      <td>NaN</td>\n",
       "      <td>ZFC</td>\n",
       "      <td>...</td>\n",
       "      <td>None</td>\n",
       "      <td>BR</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>None</td>\n",
       "      <td>BXWI</td>\n",
       "      <td>NaN</td>\n",
       "      <td>[ZAND#***#****#*] fyn# grysbruin.</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>5 rows × 32 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "         nr       x       y  surface  end   top  bottom lith  zm   zmk  ...  \\\n",
       "0  B31H0541  139585  456000      1.2 -9.9  0.00    0.20    K NaN  None  ...   \n",
       "1  B31H0541  139585  456000      1.2 -9.9  0.20    0.60    K NaN  None  ...   \n",
       "2  B31H0541  139585  456000      1.2 -9.9  0.60    0.95    V NaN  None  ...   \n",
       "3  B31H0541  139585  456000      1.2 -9.9  0.95    2.80    Z NaN  ZMFO  ...   \n",
       "4  B31H0541  139585  456000      1.2 -9.9  2.80    4.20    Z NaN   ZFC  ...   \n",
       "\n",
       "   cons color lutum_pct plants shells  kleibrokjes strat_1975 strat_2003  \\\n",
       "0  None    ON       NaN      0      0            0       None         EC   \n",
       "1  None    BR       NaN      0      0            0       None         EC   \n",
       "2  None    BR       NaN      0      0            0       None         NI   \n",
       "3  None    GR       NaN      0      0            0       None         EC   \n",
       "4  None    BR       NaN      0      0            0       None       BXWI   \n",
       "\n",
       "  strat_inter                                               desc  \n",
       "0         NaN  [TEELAARDE#***#****#*] ..........................  \n",
       "1         NaN                       [KLEI#***#****#*] grysbruin.  \n",
       "2         NaN                     [VEEN#***#****#*] donkerbruin.  \n",
       "3         NaN  [ZAND#***#****#*] FYN TOT matig fyn# iets slib...  \n",
       "4         NaN                  [ZAND#***#****#*] fyn# grysbruin.  \n",
       "\n",
       "[5 rows x 32 columns]"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Load the Utrecht Science Park example borehole data and only assign the data.\n",
    "boreholes_data = geost.data.boreholes_usp().data\n",
    "\n",
    "# Print the first few rows of boreholes data.\n",
    "boreholes_data.df.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>nr</th>\n",
       "      <th>x</th>\n",
       "      <th>y</th>\n",
       "      <th>vertical_datum</th>\n",
       "      <th>surface</th>\n",
       "      <th>cone_penetration_test_fk</th>\n",
       "      <th>cone_penetration_test_result_pk</th>\n",
       "      <th>penetration_length</th>\n",
       "      <th>depth</th>\n",
       "      <th>elapsed_time</th>\n",
       "      <th>...</th>\n",
       "      <th>magnetic_inclination</th>\n",
       "      <th>magnetic_declination</th>\n",
       "      <th>local_friction</th>\n",
       "      <th>pore_ratio</th>\n",
       "      <th>temperature</th>\n",
       "      <th>pore_pressure_u1</th>\n",
       "      <th>pore_pressure_u2</th>\n",
       "      <th>pore_pressure_u3</th>\n",
       "      <th>friction_ratio</th>\n",
       "      <th>end</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>CPT000000009626</td>\n",
       "      <td>140950.998794</td>\n",
       "      <td>455358.997741</td>\n",
       "      <td>NAP</td>\n",
       "      <td>2.0</td>\n",
       "      <td>9579</td>\n",
       "      <td>11690882</td>\n",
       "      <td>0.2</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>...</td>\n",
       "      <td>None</td>\n",
       "      <td>None</td>\n",
       "      <td>NaN</td>\n",
       "      <td>None</td>\n",
       "      <td>None</td>\n",
       "      <td>None</td>\n",
       "      <td>NaN</td>\n",
       "      <td>None</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>CPT000000009626</td>\n",
       "      <td>140950.998794</td>\n",
       "      <td>455358.997741</td>\n",
       "      <td>NAP</td>\n",
       "      <td>2.0</td>\n",
       "      <td>9579</td>\n",
       "      <td>11690883</td>\n",
       "      <td>0.3</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>...</td>\n",
       "      <td>None</td>\n",
       "      <td>None</td>\n",
       "      <td>NaN</td>\n",
       "      <td>None</td>\n",
       "      <td>None</td>\n",
       "      <td>None</td>\n",
       "      <td>NaN</td>\n",
       "      <td>None</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>CPT000000009626</td>\n",
       "      <td>140950.998794</td>\n",
       "      <td>455358.997741</td>\n",
       "      <td>NAP</td>\n",
       "      <td>2.0</td>\n",
       "      <td>9579</td>\n",
       "      <td>11690884</td>\n",
       "      <td>0.4</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>...</td>\n",
       "      <td>None</td>\n",
       "      <td>None</td>\n",
       "      <td>NaN</td>\n",
       "      <td>None</td>\n",
       "      <td>None</td>\n",
       "      <td>None</td>\n",
       "      <td>NaN</td>\n",
       "      <td>None</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>CPT000000009626</td>\n",
       "      <td>140950.998794</td>\n",
       "      <td>455358.997741</td>\n",
       "      <td>NAP</td>\n",
       "      <td>2.0</td>\n",
       "      <td>9579</td>\n",
       "      <td>11690885</td>\n",
       "      <td>0.5</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>...</td>\n",
       "      <td>None</td>\n",
       "      <td>None</td>\n",
       "      <td>NaN</td>\n",
       "      <td>None</td>\n",
       "      <td>None</td>\n",
       "      <td>None</td>\n",
       "      <td>NaN</td>\n",
       "      <td>None</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>CPT000000009626</td>\n",
       "      <td>140950.998794</td>\n",
       "      <td>455358.997741</td>\n",
       "      <td>NAP</td>\n",
       "      <td>2.0</td>\n",
       "      <td>9579</td>\n",
       "      <td>11690886</td>\n",
       "      <td>0.6</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>...</td>\n",
       "      <td>None</td>\n",
       "      <td>None</td>\n",
       "      <td>NaN</td>\n",
       "      <td>None</td>\n",
       "      <td>None</td>\n",
       "      <td>None</td>\n",
       "      <td>NaN</td>\n",
       "      <td>None</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>5 rows × 33 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "                nr              x              y vertical_datum  surface  \\\n",
       "0  CPT000000009626  140950.998794  455358.997741            NAP      2.0   \n",
       "1  CPT000000009626  140950.998794  455358.997741            NAP      2.0   \n",
       "2  CPT000000009626  140950.998794  455358.997741            NAP      2.0   \n",
       "3  CPT000000009626  140950.998794  455358.997741            NAP      2.0   \n",
       "4  CPT000000009626  140950.998794  455358.997741            NAP      2.0   \n",
       "\n",
       "   cone_penetration_test_fk  cone_penetration_test_result_pk  \\\n",
       "0                      9579                         11690882   \n",
       "1                      9579                         11690883   \n",
       "2                      9579                         11690884   \n",
       "3                      9579                         11690885   \n",
       "4                      9579                         11690886   \n",
       "\n",
       "   penetration_length  depth  elapsed_time  ...  magnetic_inclination  \\\n",
       "0                 0.2    NaN           NaN  ...                  None   \n",
       "1                 0.3    NaN           NaN  ...                  None   \n",
       "2                 0.4    NaN           NaN  ...                  None   \n",
       "3                 0.5    NaN           NaN  ...                  None   \n",
       "4                 0.6    NaN           NaN  ...                  None   \n",
       "\n",
       "  magnetic_declination local_friction pore_ratio temperature pore_pressure_u1  \\\n",
       "0                 None            NaN       None        None             None   \n",
       "1                 None            NaN       None        None             None   \n",
       "2                 None            NaN       None        None             None   \n",
       "3                 None            NaN       None        None             None   \n",
       "4                 None            NaN       None        None             None   \n",
       "\n",
       "  pore_pressure_u2 pore_pressure_u3  friction_ratio  end  \n",
       "0              NaN             None             NaN    0  \n",
       "1              NaN             None             NaN    0  \n",
       "2              NaN             None             NaN    0  \n",
       "3              NaN             None             NaN    0  \n",
       "4              NaN             None             NaN    0  \n",
       "\n",
       "[5 rows x 33 columns]"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Load the Utrecht Science Park example CPT data and only assign the data.\n",
    "cpt_data = geost.data.cpts_usp().data\n",
    "\n",
    "# Print the first few rows of CPT data.\n",
    "cpt_data.df.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Model data\n",
    "GeoST supports working with model data and offers methods to combine these data with\n",
    "point and line data. Model data does not follow the same header/data approach as point\n",
    "and line data. Instead there are generic model classes, of which some have an\n",
    "implementation that adds specific functionality for that model. An example of this is\n",
    "the [`VoxelModel`](../api_reference/voxelmodel.rst) as a generic model class and [`GeoTOP`](../api_reference/bro_geotop.rst)\n",
    "being a specific implementation of a voxel model. GeoST currently support the following \n",
    "generic models and implementations:\n",
    "\n",
    "**Generic models and implementations**\n",
    "* *[`VoxelModel`](../api_reference/voxelmodel.rst)*: Class for voxel models, with data \n",
    "stored in the `ds` attribute, an [`Xarray.Dataset`](https://docs.xarray.dev/en/stable/generated/xarray.Dataset.html).\n",
    "    * Implementations: [`GeoTOP`](../api_reference/bro_geotop.rst)\n",
    "* *`LayerModel`*: Class for layer models, not yet implemented\n",
    "    * Implementations: None\n",
    "\n",
    "<p align=\"left\">\n",
    "    <img src=\"../_static/object_hierarchy_models.png\" alt=\"GeoST vmodel object hierarchy\" title=\"GeoST model object hierarchy\" width=\"1000\" />\n",
    "</p>\n",
    "\n",
    "### Voxel models\n",
    "The [`VoxelModel`](../api_reference/voxelmodel.rst) class stores data in the `ds` \n",
    "attribute, which is an [`Xarray.Dataset`](https://docs.xarray.dev/en/stable/generated/xarray.Dataset.html).\n",
    "A custom voxel model can be instantiated from a NetCDF file. For this, see the documentation of the \n",
    "[`VoxelModel.from_netcdf`](../api_reference/generated/geost.models.VoxelModel.from_netcdf.rst) class constructor.\n",
    "An instance of [`VoxelModel`](../api_reference/voxelmodel.rst) offers basic methods for \n",
    "selecting, slicing and exporting models.\n",
    "\n",
    "For more guidance on using a Voxel model within GeoST, see the [BRO GeoTOP](../user_guide/bro_geotop.ipynb)\n",
    "section in the user guide.\n",
    "\n",
    "\n"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "default",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.7"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}