{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Access and cite point observation data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To run this notebook interactively in a Jupyter-like browser interface, click the \"Launch Binder\" button below. Note that Binder may take several minutes to launch.\n", "\n", "[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/hydroframe/subsettools-binder/HEAD?labpath=hf_hydrodata/point/example_get_data.ipynb)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This notebook walks through examples of accessing and citing point observation data and site-level metadata using hf_hydrodata's `get_point_data` and `get_point_metadata` functions. Please see the full [point module](https://hf-hydrodata.readthedocs.io) documentation for information on what data is available, our data collection process, and new features we are working on. Our [Metadata Description](https://hf-hydrodata.readthedocs.io/en/latest/available_metadata.html#point-observations-metadata) page lists the fields returned by `get_point_metadata`." 
] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "# Import packages\n", "import pandas as pd\n", "from hf_hydrodata import register_api_pin, get_point_data, get_point_metadata, get_citations" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Register at https://hydrogen.princeton.edu/pin,\n", "# then run the following with your registered email and PIN\n", "# before using the hf_hydrodata functions\n", "register_api_pin(\"your_email\", \"your_pin\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Define input parameters" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Both `get_point_data` and `get_point_metadata` require the parameters `dataset`, `variable`, `temporal_resolution`, and `aggregation` (plus `depth_level` when requesting soil moisture data). Please see [the documentation](https://hf-hydrodata.readthedocs.io/en/latest/available_data.html) for information about which point observation datasets are available and the parameters used to query them.\n", "\n", "The [hf_hydrodata API Reference](https://hf-hydrodata.readthedocs.io/en/latest/hf_hydrodata.point.html) describes the optional filtering parameters, such as a geographic region or date range. Filters are cumulative: if `state` and `site_ids` are both supplied, for example, only sites within `site_ids` that are *also* in `state` are returned." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Example 1: Specify a date range and geographic bounding box" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this example, a specific start and end date are provided, along with a geographic domain. Start and end dates, if provided, must be in 'YYYY-MM-DD' format. 
If a start date is not provided, data is returned from the earliest date available. Likewise, if an end date is not provided, data is returned through the most recent date available.\n", "\n", "Geographic domain specifications, if provided, can take the form of latitude and/or longitude bounds, a two-letter state postal code (`state`='NJ'), a specific list of site IDs (see Example 2 below), or a shapefile (see example notebook \"[How To Filter Sites by USGS HUC Boundaries](https://hf-hydrodata.readthedocs.io/en/latest/point_data/examples/example_shapefile.html)\"). If no geographic restriction is included, sites from the entire continental United States will be returned. In many cases, this exceeds a user's single-request limit of 1GB. Please add geography and/or date filters as needed to keep requests within this limit." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
<p>Rendered HTML table omitted (5 rows × 32 columns); the same records appear in the text/plain output.</p>
" ], "text/plain": [ " date 01011000 01013500 01015800 01017000 01017550 01018000 \\\n", "0 2002-01-01 9.7069 13.8104 12.9048 21.3099 0.013301 NaN \n", "1 2002-01-02 9.5371 13.4142 12.0558 20.0364 0.012169 NaN \n", "2 2002-01-03 9.3390 13.0746 11.5181 19.0742 0.011886 NaN \n", "3 2002-01-04 9.1692 12.6501 11.0936 26.4322 0.011320 NaN \n", "4 2002-01-05 8.9994 12.2822 10.6691 25.1870 0.010754 NaN \n", "\n", " 01019000 01027200 01029200 ... 01046500 01129200 01010000 01010070 \\\n", "0 3.0847 1.98666 2.43663 ... 46.129 23.9984 11.9143 1.48292 \n", "1 3.0564 1.91874 2.39135 ... 46.695 23.8286 11.6879 1.41500 \n", "2 3.0281 1.88195 2.36305 ... 46.978 23.8286 11.5181 1.35840 \n", "3 3.0564 1.83667 2.34890 ... 51.506 23.6305 11.2917 1.31312 \n", "4 3.0281 1.79139 2.32060 ... 37.639 23.6022 11.0936 1.27633 \n", "\n", " 01010500 01014000 01018500 01021000 04264331 04294300 \n", "0 24.0550 61.411 9.1126 21.9042 6084.5 0.2547 \n", "1 23.4890 59.713 9.0277 21.9042 6056.2 0.2547 \n", "2 23.0645 58.581 8.9145 21.9042 6084.5 0.2547 \n", "3 22.6400 57.449 8.8579 21.9042 6056.2 0.2547 \n", "4 22.2155 56.317 8.7447 21.9042 5546.8 0.2830 \n", "\n", "[5 rows x 32 columns]" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Let's explore daily streamflow data with optional filters for a date range and bounding box. \n", "\n", "# Request point observations data\n", "data_df = get_point_data(dataset=\"usgs_nwis\", variable=\"streamflow\", temporal_resolution=\"daily\", aggregation=\"mean\",\n", " date_start=\"2002-01-01\", date_end=\"2002-01-05\", latitude_range=(45, 50), longitude_range=(-75, -50))\n", "\n", "# View first five records\n", "data_df.head(5)" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
<p>Rendered HTML table omitted (5 rows × 23 columns); the same records appear in the text/plain output.</p>
" ], "text/plain": [ " site_id site_name site_type agency state \\\n", "0 01011000 Allagash River near Allagash, Maine stream gauge USGS ME \n", "1 01013500 Fish River near Fort Kent, Maine stream gauge USGS ME \n", "2 01015800 Aroostook River near Masardis, Maine stream gauge USGS ME \n", "3 01017000 Aroostook River at Washburn, Maine stream gauge USGS ME \n", "4 01017550 Williams Brook at Phair, Maine stream gauge USGS ME \n", "\n", " latitude longitude first_date_data_available last_date_data_available \\\n", "0 47.069722 -69.079444 1910-07-01 2023-11-30 \n", "1 47.237500 -68.582778 1903-07-29 2023-12-01 \n", "2 46.523056 -68.371667 1957-09-14 2023-12-01 \n", "3 46.777222 -68.157222 1930-08-01 2023-12-01 \n", "4 46.628056 -67.953056 1999-11-01 2023-12-01 \n", "\n", " record_count ... doi huc8 conus1_x conus1_y conus2_x conus2_y \\\n", "0 34028 ... None 01010002 nan nan 4210 2783 \n", "1 36507 ... None 01010003 nan nan 4237 2810 \n", "2 24185 ... None 01010004 nan nan 4276 2747 \n", "3 34091 ... None 01010004 nan nan 4281 2773 \n", "4 8797 ... 
None 01010005 nan nan 4300 2762 \n", "\n", " gagesii_drainage_area gagesii_class gagesii_site_elevation \\\n", "0 3186.8440 Non-ref 187.0 \n", "1 2252.6960 Ref 157.0 \n", "2 2313.7550 Non-ref 166.0 \n", "3 4278.9070 Non-ref 131.0 \n", "4 10.0323 Ref 176.0 \n", "\n", " usgs_drainage_area \n", "0 1478.00 \n", "1 873.00 \n", "2 892.00 \n", "3 1654.00 \n", "4 3.82 \n", "\n", "[5 rows x 23 columns]" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Request site-level metadata for these sites (using the same filters)\n", "metadata_df = get_point_metadata(dataset=\"usgs_nwis\", variable=\"streamflow\", temporal_resolution=\"daily\", aggregation=\"mean\",\n", " date_start=\"2002-01-01\", date_end=\"2002-01-05\", latitude_range=(45, 50), longitude_range=(-75, -50))\n", "\n", "# View first five records\n", "metadata_df.head(5)" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'Most U.S. Geological Survey (USGS) information resides in Public Domain and may be used without restriction, though they do ask that proper credit be given. An example credit statement would be: \"(Product or data name) courtesy of the U.S. Geological Survey\". Source: https://www.usgs.gov/information-policies-and-instructions/acknowledging-or-crediting-usgs'" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# See how to cite the use of this data\n", "get_citations(dataset=\"usgs_nwis\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Example 2: Specify a site ID or list of site IDs without a time restriction" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Instead of latitude/longitude bounds, data for a specific stream gauge or groundwater well can be requested by site ID, with or without a date bound. Below, daily streamflow data is returned for a single site and then for a selected list of sites. 
There is no time restriction in these examples, so all data available in-house is included." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "First five records: \n", " date 01013500\n", "0 1903-07-29 21.5646\n", "1 1903-07-30 21.5646\n", "2 1903-07-31 21.5646\n", "3 1903-08-01 19.2723\n", "4 1903-08-02 18.1686\n", "\n", " Final five records: \n", " date 01013500\n", "36502 2023-11-27 30.281\n", "36503 2023-11-28 31.413\n", "36504 2023-11-29 30.564\n", "36505 2023-11-30 30.281\n", "36506 2023-12-01 29.715\n" ] } ], "source": [ "# Request point observations data for a single site\n", "data = get_point_data(dataset=\"usgs_nwis\", variable=\"streamflow\", temporal_resolution=\"daily\", aggregation=\"mean\", site_ids=\"01013500\")\n", "\n", "# View first five rows\n", "print(\"First five records: \")\n", "print(data.head(5))\n", "\n", "# View final five rows \n", "print(\"\\n Final five records: \")\n", "print(data.tail(5))" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
<p>Rendered HTML table omitted (1 row × 23 columns); the same record appears in the text/plain output.</p>
" ], "text/plain": [ " site_id site_name site_type agency state \\\n", "0 01013500 Fish River near Fort Kent, Maine stream gauge USGS ME \n", "\n", " latitude longitude first_date_data_available last_date_data_available \\\n", "0 47.2375 -68.582778 1903-07-29 2023-12-01 \n", "\n", " record_count ... doi huc8 conus1_x conus1_y conus2_x conus2_y \\\n", "0 36507 ... None 01010003 nan nan 4237 2810 \n", "\n", " gagesii_drainage_area gagesii_class gagesii_site_elevation \\\n", "0 2252.696 Ref 157.0 \n", "\n", " usgs_drainage_area \n", "0 873.0 \n", "\n", "[1 rows x 23 columns]" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Request the metadata for that site\n", "metadata = get_point_metadata(dataset=\"usgs_nwis\", variable=\"streamflow\", temporal_resolution=\"daily\", aggregation=\"mean\", site_ids=\"01013500\")\n", "metadata.head()" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "First five records: \n", " date 01011000 01013500 01029500\n", "0 1902-10-01 NaN NaN 19.810\n", "1 1902-10-02 NaN NaN 19.810\n", "2 1902-10-03 NaN NaN 19.810\n", "3 1902-10-04 NaN NaN 18.678\n", "4 1902-10-05 NaN NaN 17.546\n", "\n", " Final five records: \n", " date 01011000 01013500 01029500\n", "44252 2023-11-27 NaN 30.281 41.035\n", "44253 2023-11-28 NaN 31.413 NaN\n", "44254 2023-11-29 NaN 30.564 NaN\n", "44255 2023-11-30 NaN 30.281 NaN\n", "44256 2023-12-01 NaN 29.715 NaN\n" ] } ], "source": [ "# Request point observations data for multiple sites\n", "data = get_point_data(dataset=\"usgs_nwis\", variable=\"streamflow\", temporal_resolution=\"daily\", aggregation=\"mean\", \n", " site_ids=[\"01013500\", \"01011000\", \"01029500\"])\n", "\n", "# View first five rows\n", "print(\"First five records: \")\n", "print(data.head(5))\n", "\n", "# View final five rows \n", "print(\"\\n Final five records: \")\n", "print(data.tail(5))" ] }, { "cell_type": "code", 
"execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
<p>Rendered HTML table omitted (3 rows × 23 columns); the same records appear in the text/plain output.</p>
" ], "text/plain": [ " site_id site_name site_type \\\n", "0 01011000 Allagash River near Allagash, Maine stream gauge \n", "1 01013500 Fish River near Fort Kent, Maine stream gauge \n", "2 01029500 East Branch Penobscot River at Grindstone, Maine stream gauge \n", "\n", " agency state latitude longitude first_date_data_available \\\n", "0 USGS ME 47.069722 -69.079444 1910-07-01 \n", "1 USGS ME 47.237500 -68.582778 1903-07-29 \n", "2 USGS ME 45.730278 -68.589444 1902-10-01 \n", "\n", " last_date_data_available record_count ... doi huc8 conus1_x \\\n", "0 2023-11-30 34028 ... None 01010002 nan \n", "1 2023-12-01 36507 ... None 01010003 nan \n", "2 2023-12-01 37315 ... None 01020002 nan \n", "\n", " conus1_y conus2_x conus2_y gagesii_drainage_area gagesii_class \\\n", "0 nan 4210 2783 3186.844 Non-ref \n", "1 nan 4237 2810 2252.696 Ref \n", "2 nan 4293 2656 2816.295 Non-ref \n", "\n", " gagesii_site_elevation usgs_drainage_area \n", "0 187.0 1478.0 \n", "1 157.0 873.0 \n", "2 93.0 837.0 \n", "\n", "[3 rows x 23 columns]" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Request the site-level attributes for those sites\n", "metadata = get_point_metadata(dataset=\"usgs_nwis\", variable=\"streamflow\", temporal_resolution=\"daily\", aggregation=\"mean\", \n", " site_ids=[\"01013500\", \"01011000\", \"01029500\"])\n", "metadata.head()" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'Most U.S. Geological Survey (USGS) information resides in Public Domain and may be used without restriction, though they do ask that proper credit be given. An example credit statement would be: \"(Product or data name) courtesy of the U.S. Geological Survey\". 
Source: https://www.usgs.gov/information-policies-and-instructions/acknowledging-or-crediting-usgs'" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# See how to cite the use of this data\n", "get_citations(dataset=\"usgs_nwis\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Example 3: Add a restriction on the minimum number of observations per site within a requested time range" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `min_num_obs` parameter restricts results to sites that have at least that many observations within the specified time range (if one is provided).\n", "\n", "The example below requests all of 2005 (365 days) with `min_num_obs=365`, so only sites with valid streamflow data for every day of that calendar year are returned." ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
<p>Rendered HTML table omitted (5 rows × 269 columns); the same records appear in the text/plain output.</p>
" ], "text/plain": [ " date 06614800 06620000 06701500 06701900 06707500 06708800 \\\n", "0 2005-01-01 0.013584 3.5375 1.9244 2.19325 5.2921 0.163574 \n", "1 2005-01-02 0.013301 3.3960 1.9244 2.14514 5.2072 0.144896 \n", "2 2005-01-03 0.013301 3.3111 1.9244 2.15080 5.1506 0.128765 \n", "3 2005-01-04 0.013301 3.3960 1.9244 2.15080 5.0091 0.119992 \n", "4 2005-01-05 0.013301 3.3960 1.9244 2.23853 4.1035 0.139236 \n", "\n", " 06709000 06709530 06710150 ... 382628104493700 382629104493000 \\\n", "0 0.52072 0.55751 0.056600 ... 0.0 0.0 \n", "1 0.48110 0.53770 0.052355 ... 0.0 0.0 \n", "2 0.49525 0.50374 0.058015 ... 0.0 0.0 \n", "3 0.48110 0.48110 0.051506 ... 0.0 0.0 \n", "4 0.41601 0.50374 0.046412 ... 0.0 0.0 \n", "\n", " 383619104520401 383637104531301 383944104474201 384037104472001 \\\n", "0 0.002547 0.0 0.0 0.021508 \n", "1 0.002547 0.0 0.0 0.024621 \n", "2 0.002547 0.0 0.0 0.023772 \n", "3 0.002547 0.0 0.0 0.025470 \n", "4 0.002547 0.0 0.0 0.022923 \n", "\n", " 384047104510301 384048104504901 384220104503701 391504106225200 \n", "0 0.0 0.008490 0.0 0.004245 \n", "1 0.0 0.008207 0.0 0.004245 \n", "2 0.0 0.007924 0.0 0.004245 \n", "3 0.0 0.007924 0.0 0.004245 \n", "4 0.0 0.007924 0.0 0.004245 \n", "\n", "[5 rows x 269 columns]" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Request point observations data\n", "data_df = get_point_data(dataset=\"usgs_nwis\", variable=\"streamflow\", temporal_resolution=\"daily\", aggregation=\"mean\",\n", " date_start=\"2005-01-01\", date_end=\"2005-12-31\", \n", " state=\"CO\",\n", " min_num_obs=365)\n", "\n", "# View first five records\n", "data_df.head(5)" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
<p>Rendered HTML table omitted (5 rows × 23 columns); the same records appear in the text/plain output.</p>
" ], "text/plain": [ " site_id site_name site_type \\\n", "0 06614800 MICHIGAN RIVER NEAR CAMERON PASS, CO stream gauge \n", "1 06620000 NORTH PLATTE RIVER NEAR NORTHGATE, CO stream gauge \n", "2 06701500 SOUTH PLATTE RIVER BELOW CHEESMAN LAKE, CO stream gauge \n", "3 06701900 SOUTH PLATTE RIVER BLW BRUSH CRK NEAR TRUMBULL... stream gauge \n", "4 06707500 SOUTH PLATTE RIVER AT SOUTH PLATTE, CO stream gauge \n", "\n", " agency state latitude longitude first_date_data_available \\\n", "0 USGS CO 40.496094 -105.865012 1973-10-01 \n", "1 USGS CO 40.936639 -106.339194 1904-06-01 \n", "2 USGS CO 39.209157 -105.267773 1924-10-01 \n", "3 USGS CO 39.259990 -105.221938 2002-07-19 \n", "4 USGS CO 39.409156 -105.169990 1896-01-01 \n", "\n", " last_date_data_available record_count ... doi huc8 conus1_x \\\n", "0 2023-12-01 18322 ... None 10180001 1054 \n", "1 2023-12-01 39782 ... None 10180001 1020 \n", "2 2007-09-29 29217 ... None 10190002 1091 \n", "3 2023-12-01 7792 ... None 10190002 nan \n", "4 2007-09-29 32959 ... 
None 10190002 nan \n", "\n", " conus1_y conus2_x conus2_y gagesii_drainage_area gagesii_class \\\n", "0 818 1481 1764 4.0284 Ref \n", "1 870 1448 1817 3702.6370 Non-ref \n", "2 671 nan nan 4557.0680 Non-ref \n", "3 nan 1523 1627 5252.5570 Non-ref \n", "4 nan nan nan 6689.0300 Non-ref \n", "\n", " gagesii_site_elevation usgs_drainage_area \n", "0 3188.0 1.54 \n", "1 2388.0 1431.00 \n", "2 2081.0 1752.00 \n", "3 1990.0 2028.00 \n", "4 1901.0 2579.00 \n", "\n", "[5 rows x 23 columns]" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# NOTE: Metadata access does not support the `min_num_obs` filter because it does not inspect the data contents for the sliced date range.\n", "# It only checks that a site's overall period of record overlaps the specified date range.\n", "\n", "# The following is an example workflow for obtaining metadata for only those sites that\n", "# additionally satisfy the `min_num_obs` filter\n", "metadata_df = get_point_metadata(dataset=\"usgs_nwis\", variable=\"streamflow\", temporal_resolution=\"daily\", aggregation=\"mean\",\n", " date_start=\"2005-01-01\", date_end=\"2005-12-31\", \n", " state=\"CO\")\n", "\n", "# The site IDs that passed `min_num_obs` are the columns of data_df, excluding 'date'\n", "site_ids = [col for col in data_df.columns if col != 'date']\n", "filtered_site_list = pd.DataFrame(data=site_ids, columns=['site_id'])\n", "\n", "# Left-merge so that exactly one metadata row is kept per filtered site\n", "filtered_metadata_df = pd.merge(filtered_site_list, metadata_df, on='site_id', how='left')\n", "assert len(filtered_metadata_df) == data_df.shape[1] - 1\n", "\n", "# View first five records\n", "filtered_metadata_df.head()" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'Most U.S. Geological Survey (USGS) information resides in Public Domain and may be used without restriction, though they do ask that proper credit be given. An example credit statement would be: \"(Product or data name) courtesy of the U.S. Geological Survey\". 
Source: https://www.usgs.gov/information-policies-and-instructions/acknowledging-or-crediting-usgs'" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# See how to cite the use of this data\n", "get_citations(dataset=\"usgs_nwis\")" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.5" } }, "nbformat": 4, "nbformat_minor": 4 }