{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Filter point observations to pre-defined site networks" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To run this notebook interactively in a Jupyter notebook-like browser interface, click the \"Launch Binder\" button below. Note that Binder may take several minutes to launch.\n", "\n", "[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/hydroframe/subsettools-binder/HEAD?labpath=hf_hydrodata/point/example_site_networks.ipynb)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This notebook shows how to use the `get_point_data` and `get_point_metadata` functions to filter sites by a pre-defined site network.\n", "\n", "For USGS stream gages, the currently supported site networks are:\n", "\n", " - [GAGES-II](https://pubs.usgs.gov/publication/70046617) ('gagesii')\n", " - [GAGES-II reference gages](https://pubs.usgs.gov/publication/70046617) ('gagesii_reference')\n", " - [HCDN-2009](https://water.usgs.gov/osw/hcdn-2009/) ('hcdn2009')\n", " - [CAMELS](https://ral.ucar.edu/solutions/products/camels) ('camels')\n", " - [NWM](https://essd.copernicus.org/articles/13/3263/2021/) ('nwm')\n", "\n", "For USGS groundwater wells, the currently supported site network is:\n", "\n", " - [Climate Response Network](https://water.usgs.gov/ogw/networks.html) ('climate_response_network')\n", "\n", "Please see the full [point module](https://hf-hydrodata.readthedocs.io) documentation for information on what data is available, our data collection process, and new features we are working on. Our [Metadata Description](https://hf-hydrodata.readthedocs.io/en/latest/available_metadata.html#point-observations-metadata) page lists the fields returned by `get_point_metadata`."
] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "# Import packages\n", "from hf_hydrodata import register_api_pin, get_point_data, get_point_metadata\n", "import pandas as pd" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# You need to register on https://hydrogen.princeton.edu/pin\n", "# and run the following with your registered information\n", "# before you can use the hydrodata utilities\n", "register_api_pin(\"your_email\", \"your_pin\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Both `get_point_data` and `get_point_metadata` require the parameters `dataset`, `variable`, `temporal_resolution`, and `aggregation` (plus `depth_level` when requesting soil moisture data). Please see [the documentation](https://hf-hydrodata.readthedocs.io/en/latest/available_data.html) for information about which point observation datasets are available and the parameters used to query them.\n", "\n", "The [hf_hydrodata API Reference](https://hf-hydrodata.readthedocs.io/en/latest/hf_hydrodata.point.html) describes the optional filtering parameters, such as a geographic region or a date range. These filters are cumulative: if `state` and `site_ids` are both supplied, for example, only sites in `site_ids` that are *also* in `state` are returned." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Example: Query stream gage data for GAGES-II sites in Colorado" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this example, we query the stream gages that are part of the GAGES-II network within the state of Colorado (`state='CO'`). We focus on data within Water Year 2003, so we set `date_start='2002-10-01'` and `date_end='2003-09-30'`. 
Setting `site_networks='gagesii'` restricts the results to stream gages that are part of the GAGES-II network." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"<thead>\n",
"<tr><th></th><th>date</th><th>06614800</th><th>06620000</th><th>06659580</th><th>06696980</th><th>06700000</th><th>06701500</th><th>06701620</th><th>06701900</th><th>06707500</th><th>...</th><th>09371000</th><th>09371010</th><th>09371492</th><th>09371520</th><th>09372000</th><th>393109104464500</th><th>394308105413800</th><th>394839104570300</th><th>401733105392404</th><th>402114105350101</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>0</th><td>2002-10-01</td><td>0.019810</td><td>0.97635</td><td>NaN</td><td>0.116879</td><td>NaN</td><td>8.4051</td><td>NaN</td><td>9.5371</td><td>11.3766</td><td>...</td><td>0.000000</td><td>14.1500</td><td>0.023489</td><td>0.46978</td><td>0.274793</td><td>0.048110</td><td>0.61977</td><td>1.26784</td><td>0.071316</td><td>0.33677</td></tr>\n",
"<tr><th>1</th><td>2002-10-02</td><td>0.021508</td><td>1.01031</td><td>NaN</td><td>0.148009</td><td>NaN</td><td>8.3485</td><td>NaN</td><td>9.8767</td><td>12.1690</td><td>...</td><td>0.066505</td><td>15.6782</td><td>0.040469</td><td>0.53204</td><td>0.285830</td><td>0.281868</td><td>0.81504</td><td>2.81019</td><td>0.071316</td><td>0.39903</td></tr>\n",
"<tr><th>2</th><td>2002-10-03</td><td>0.022357</td><td>1.23388</td><td>NaN</td><td>0.164140</td><td>NaN</td><td>7.1882</td><td>NaN</td><td>8.5466</td><td>10.2729</td><td>...</td><td>0.155650</td><td>19.1874</td><td>0.091975</td><td>1.10936</td><td>0.472610</td><td>0.249889</td><td>0.88862</td><td>1.23954</td><td>0.069618</td><td>0.46695</td></tr>\n",
"<tr><th>3</th><td>2002-10-04</td><td>0.025753</td><td>1.81969</td><td>NaN</td><td>0.146877</td><td>NaN</td><td>5.3204</td><td>NaN</td><td>5.9996</td><td>8.2919</td><td>...</td><td>0.350920</td><td>19.6119</td><td>0.043582</td><td>0.50657</td><td>0.894280</td><td>0.219325</td><td>0.70184</td><td>0.68769</td><td>0.068203</td><td>0.43299</td></tr>\n",
"<tr><th>4</th><td>2002-10-05</td><td>0.024621</td><td>1.98100</td><td>NaN</td><td>0.143198</td><td>NaN</td><td>4.4997</td><td>NaN</td><td>5.0374</td><td>6.5090</td><td>...</td><td>0.060279</td><td>22.7249</td><td>0.026885</td><td>0.47261</td><td>0.585810</td><td>0.191591</td><td>0.64807</td><td>0.47827</td><td>0.066788</td><td>0.43865</td></tr>\n",
"</tbody>\n",
"</table>\n",
"<p>5 rows × 290 columns</p>\n",
"</div>
" ], "text/plain": [ " date 06614800 06620000 06659580 06696980 06700000 06701500 \\\n", "0 2002-10-01 0.019810 0.97635 NaN 0.116879 NaN 8.4051 \n", "1 2002-10-02 0.021508 1.01031 NaN 0.148009 NaN 8.3485 \n", "2 2002-10-03 0.022357 1.23388 NaN 0.164140 NaN 7.1882 \n", "3 2002-10-04 0.025753 1.81969 NaN 0.146877 NaN 5.3204 \n", "4 2002-10-05 0.024621 1.98100 NaN 0.143198 NaN 4.4997 \n", "\n", " 06701620 06701900 06707500 ... 09371000 09371010 09371492 09371520 \\\n", "0 NaN 9.5371 11.3766 ... 0.000000 14.1500 0.023489 0.46978 \n", "1 NaN 9.8767 12.1690 ... 0.066505 15.6782 0.040469 0.53204 \n", "2 NaN 8.5466 10.2729 ... 0.155650 19.1874 0.091975 1.10936 \n", "3 NaN 5.9996 8.2919 ... 0.350920 19.6119 0.043582 0.50657 \n", "4 NaN 5.0374 6.5090 ... 0.060279 22.7249 0.026885 0.47261 \n", "\n", " 09372000 393109104464500 394308105413800 394839104570300 \\\n", "0 0.274793 0.048110 0.61977 1.26784 \n", "1 0.285830 0.281868 0.81504 2.81019 \n", "2 0.472610 0.249889 0.88862 1.23954 \n", "3 0.894280 0.219325 0.70184 0.68769 \n", "4 0.585810 0.191591 0.64807 0.47827 \n", "\n", " 401733105392404 402114105350101 \n", "0 0.071316 0.33677 \n", "1 0.071316 0.39903 \n", "2 0.069618 0.46695 \n", "3 0.068203 0.43299 \n", "4 0.066788 0.43865 \n", "\n", "[5 rows x 290 columns]" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Get point observations data\n", "data_df = get_point_data(dataset=\"usgs_nwis\", variable=\"streamflow\", temporal_resolution=\"daily\", aggregation=\"mean\",\n", " date_start=\"2002-10-01\", date_end=\"2003-09-30\", \n", " state=\"CO\", site_networks=\"gagesii\")\n", "\n", "# View the first five records\n", "data_df.head(5)" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"<thead>\n",
"<tr><th></th><th>site_id</th><th>site_name</th><th>site_type</th><th>agency</th><th>state</th><th>latitude</th><th>longitude</th><th>first_date_data_available</th><th>last_date_data_available</th><th>record_count</th><th>...</th><th>doi</th><th>huc8</th><th>conus1_x</th><th>conus1_y</th><th>conus2_x</th><th>conus2_y</th><th>gagesii_drainage_area</th><th>gagesii_class</th><th>gagesii_site_elevation</th><th>usgs_drainage_area</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>0</th><td>06614800</td><td>MICHIGAN RIVER NEAR CAMERON PASS, CO</td><td>stream gauge</td><td>USGS</td><td>CO</td><td>40.496094</td><td>-105.865012</td><td>1973-10-01</td><td>2023-12-01</td><td>18322</td><td>...</td><td>None</td><td>10180001</td><td>1054</td><td>818</td><td>1481</td><td>1764</td><td>4.02840</td><td>Ref</td><td>3188.0</td><td>1.54</td></tr>\n",
"<tr><th>1</th><td>06620000</td><td>NORTH PLATTE RIVER NEAR NORTHGATE, CO</td><td>stream gauge</td><td>USGS</td><td>CO</td><td>40.936639</td><td>-106.339194</td><td>1904-06-01</td><td>2023-12-01</td><td>39782</td><td>...</td><td>None</td><td>10180001</td><td>1020</td><td>870</td><td>1448</td><td>1817</td><td>3702.63700</td><td>Non-ref</td><td>2388.0</td><td>1431.00</td></tr>\n",
"<tr><th>2</th><td>06659580</td><td>SAND CREEK AT COLORADO-WYOMING STATE LINE</td><td>stream gauge</td><td>USGS</td><td>CO</td><td>40.993650</td><td>-105.759703</td><td>1968-10-01</td><td>2020-09-01</td><td>10075</td><td>...</td><td>None</td><td>10180010</td><td>nan</td><td>nan</td><td>1496</td><td>1814</td><td>79.11089</td><td>Non-ref</td><td>2323.0</td><td>29.20</td></tr>\n",
"<tr><th>3</th><td>06696980</td><td>TARRYALL CREEK AT UPPER STATION NEAR COMO, CO</td><td>stream gauge</td><td>USGS</td><td>CO</td><td>39.339433</td><td>-105.911681</td><td>1978-06-01</td><td>2023-10-13</td><td>5420</td><td>...</td><td>None</td><td>10190001</td><td>1036</td><td>690</td><td>1466</td><td>1639</td><td>61.90650</td><td>Ref</td><td>3040.0</td><td>23.90</td></tr>\n",
"<tr><th>4</th><td>06700000</td><td>SOUTH PLATTE RIVER ABOVE CHEESMAN LAKE, CO.</td><td>stream gauge</td><td>USGS</td><td>CO</td><td>39.162769</td><td>-105.310273</td><td>1924-10-01</td><td>2023-09-30</td><td>9523</td><td>...</td><td>None</td><td>10190002</td><td>nan</td><td>nan</td><td>1515</td><td>1617</td><td>4213.53800</td><td>Non-ref</td><td>2092.0</td><td>1627.00</td></tr>\n",
"</tbody>\n",
"</table>\n",
"<p>5 rows × 23 columns</p>\n",
"</div>
" ], "text/plain": [ " site_id site_name site_type \\\n", "0 06614800 MICHIGAN RIVER NEAR CAMERON PASS, CO stream gauge \n", "1 06620000 NORTH PLATTE RIVER NEAR NORTHGATE, CO stream gauge \n", "2 06659580 SAND CREEK AT COLORADO-WYOMING STATE LINE stream gauge \n", "3 06696980 TARRYALL CREEK AT UPPER STATION NEAR COMO, CO stream gauge \n", "4 06700000 SOUTH PLATTE RIVER ABOVE CHEESMAN LAKE, CO. stream gauge \n", "\n", " agency state latitude longitude first_date_data_available \\\n", "0 USGS CO 40.496094 -105.865012 1973-10-01 \n", "1 USGS CO 40.936639 -106.339194 1904-06-01 \n", "2 USGS CO 40.993650 -105.759703 1968-10-01 \n", "3 USGS CO 39.339433 -105.911681 1978-06-01 \n", "4 USGS CO 39.162769 -105.310273 1924-10-01 \n", "\n", " last_date_data_available record_count ... doi huc8 conus1_x \\\n", "0 2023-12-01 18322 ... None 10180001 1054 \n", "1 2023-12-01 39782 ... None 10180001 1020 \n", "2 2020-09-01 10075 ... None 10180010 nan \n", "3 2023-10-13 5420 ... None 10190001 1036 \n", "4 2023-09-30 9523 ... 
None 10190002 nan \n", "\n", " conus1_y conus2_x conus2_y gagesii_drainage_area gagesii_class \\\n", "0 818 1481 1764 4.02840 Ref \n", "1 870 1448 1817 3702.63700 Non-ref \n", "2 nan 1496 1814 79.11089 Non-ref \n", "3 690 1466 1639 61.90650 Ref \n", "4 nan 1515 1617 4213.53800 Non-ref \n", "\n", " gagesii_site_elevation usgs_drainage_area \n", "0 3188.0 1.54 \n", "1 2388.0 1431.00 \n", "2 2323.0 29.20 \n", "3 3040.0 23.90 \n", "4 2092.0 1627.00 \n", "\n", "[5 rows x 23 columns]" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Get site-level attributes for these sites\n", "metadata_df = get_point_metadata(dataset=\"usgs_nwis\", variable=\"streamflow\", temporal_resolution=\"daily\", aggregation=\"mean\",\n", " date_start=\"2002-10-01\", date_end=\"2003-09-30\", \n", " state=\"CO\", site_networks=\"gagesii\")\n", "\n", "# View the first five records\n", "metadata_df.head(5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This gives us the data for the 289 Colorado GAGES-II sites that have data within the specified date range." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.5" } }, "nbformat": 4, "nbformat_minor": 4 }