Explore point data availability
This notebook walks through how to explore what data is available using hf_hydrodata’s get_site_variables function. See the full point module documentation for details on what data is available, our data collection process, and new features we are working on. Our Metadata Description page itemizes the fields returned by get_point_metadata.
[1]:
# Import packages
from hf_hydrodata import register_api_pin, get_point_data, get_point_metadata, get_site_variables
[ ]:
# You need to register on https://hydrogen.princeton.edu/pin
# and run the following with your registered information
# before you can use the hydrodata utilities
register_api_pin("your_email", "your_pin")
Example 1: What streamflow sites are available in Colorado that were operational during Water Year 2019?
The get_site_variables function accepts any number of supported parameters.
In this example, we want to know information about streamflow sites, so we will supply variable='streamflow'. We are specifically interested in the state of Colorado, so we will supply state='CO'. Finally, we only want sites that were operational during Water Year 2019. We will set date_start='2018-10-01' and date_end='2019-09-30'.
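A US water year N runs from October 1 of year N-1 through September 30 of year N, which is where the two dates above come from. As a minimal sketch, the hypothetical helper `water_year_bounds` below (an illustration, not part of hf_hydrodata) derives the two date strings for any water year.

```python
from datetime import date
from typing import Tuple


def water_year_bounds(water_year: int) -> Tuple[str, str]:
    """Return (date_start, date_end) as ISO strings for a US water year.

    Hypothetical helper for illustration; not part of hf_hydrodata.
    """
    start = date(water_year - 1, 10, 1)  # October 1 of the previous calendar year
    end = date(water_year, 9, 30)        # September 30 of the water year itself
    return start.isoformat(), end.isoformat()


date_start, date_end = water_year_bounds(2019)
print(date_start, date_end)
```

The returned strings use the same YYYY-MM-DD format as the date_start and date_end values in this example.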
[2]:
# Let's use the above input parameters.
df = get_site_variables(variable="streamflow", state="CO", date_start="2018-10-01", date_end="2019-09-30")
print(f'Number of records: {len(df)}')
df.head(5)
Number of records: 691
[2]:
| | site_id | site_name | site_type | agency | state | variable_name | units | first_date_data_available | last_date_data_available | record_count | latitude | longitude | site_query_url | date_metadata_last_updated | tz_cd | doi |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 06614800 | MICHIGAN RIVER NEAR CAMERON PASS, CO | stream gauge | USGS | CO | Hourly average streamflow | m3/s | 1986-10-01 | 2023-12-02 | 224235 | 40.496094 | -105.865012 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-05-30 | MST | None |
| 1 | 06614800 | MICHIGAN RIVER NEAR CAMERON PASS, CO | stream gauge | USGS | CO | Daily average streamflow | m3/s | 1973-10-01 | 2023-12-01 | 18322 | 40.496094 | -105.865012 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-05-30 | MST | None |
| 2 | 06620000 | NORTH PLATTE RIVER NEAR NORTHGATE, CO | stream gauge | USGS | CO | Hourly average streamflow | m3/s | 1990-10-01 | 2023-12-02 | 177843 | 40.936639 | -106.339194 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-05-30 | MST | None |
| 3 | 06620000 | NORTH PLATTE RIVER NEAR NORTHGATE, CO | stream gauge | USGS | CO | Daily average streamflow | m3/s | 1904-06-01 | 2023-12-01 | 39782 | 40.936639 | -106.339194 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-05-30 | MST | None |
| 4 | 06659580 | SAND CREEK AT COLORADO-WYOMING STATE LINE | stream gauge | USGS | CO | Hourly average streamflow | m3/s | 1996-04-01 | 2020-09-02 | 104382 | 40.993650 | -105.759703 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-05-30 | MST | None |
Above, we can see that 691 records were returned. Note that a site can collect data at multiple temporal resolutions (such as daily and hourly), and the data availability range can differ between them, since a site may have started collecting hourly and daily data at different times. Each row of the DataFrame above corresponds to a unique combination of site_id and variable_name.
The fields first_date_data_available and last_date_data_available refer to the earliest and latest dates we have available for the site. The record_count field reports on the total number of records over that entire time span and does not relate to the specific date_start and date_end parameters supplied.
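Because a site can appear on more than one row, the number of records is not the number of distinct sites. The sketch below counts distinct site IDs using plain Python records; the function name `unique_site_ids` is hypothetical, and the sample values are copied from the five rows shown above. With pandas, records of this shape can be produced with `df.to_dict("records")`.

```python
def unique_site_ids(records):
    """Return site IDs in first-seen order, dropping repeats.

    `records` is a list of dicts that each have a "site_id" key.
    Hypothetical helper for illustration; not part of hf_hydrodata.
    """
    return list(dict.fromkeys(rec["site_id"] for rec in records))


# Values copied from the first five rows of the output above
records = [
    {"site_id": "06614800", "variable_name": "Hourly average streamflow"},
    {"site_id": "06614800", "variable_name": "Daily average streamflow"},
    {"site_id": "06620000", "variable_name": "Hourly average streamflow"},
    {"site_id": "06620000", "variable_name": "Daily average streamflow"},
    {"site_id": "06659580", "variable_name": "Hourly average streamflow"},
]

site_ids = unique_site_ids(records)
print(f"{len(records)} records cover {len(site_ids)} distinct sites")
```

Using `dict.fromkeys` keeps the original row order, which a plain `set` would not guarantee.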
Let’s narrow our search down to only sites that have daily streamflow data by adding temporal_resolution='daily'. Let’s further refine things so that we only get sites with a latitude between 40 and 40.5 degrees.
[3]:
df = get_site_variables(variable="streamflow", state="CO", date_start="2018-10-01", date_end="2019-09-30",
                        temporal_resolution="daily", latitude_range=(40, 40.5))
print(f'Number of records: {len(df)}')
df.tail(5)
Number of records: 47
[3]:
| | site_id | site_name | site_type | agency | state | variable_name | units | first_date_data_available | last_date_data_available | record_count | latitude | longitude | site_query_url | date_metadata_last_updated | tz_cd | doi |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 309 | 09032990 | MEADOW CREEK BLW MEADOW CREEK RES NR TABERNASH... | stream gauge | USGS | CO | Daily average streamflow | m3/s | 2018-06-01 | 2023-10-31 | 1073 | 40.052028 | -105.754167 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-05-30 | MST | None |
| 311 | 09033010 | MEADOW CREEK DIVERSION NEAR TABERNASH, CO | stream gauge | USGS | CO | Daily average streamflow | m3/s | 2018-06-29 | 2023-10-31 | 1045 | 40.050286 | -105.779706 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-05-30 | MST | None |
| 312 | 09040500 | TROUBLESOME CREEK NEAR TROUBLESOME, CO. | stream gauge | USGS | CO | Daily average streamflow | m3/s | 1904-10-01 | 2023-12-01 | 9318 | 40.058664 | -106.305178 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-05-30 | MST | None |
| 314 | 401723105400000 | ANDREWS CREEK-LOCH VALE-RMNP | stream gauge | USGS | CO | Daily average streamflow | m3/s | 1991-09-30 | 2023-12-01 | 11582 | 40.290000 | -105.666667 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-05-30 | MST | None |
| 316 | 401727105400000 | ANDREWS SPRING 1 | stream gauge | USGS | CO | Daily average streamflow | m3/s | 2019-04-10 | 2023-10-09 | 1341 | 40.290818 | -105.667227 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-05-30 | MST | None |
Now we have 47 sites. We can extract this site list and pass it to the site_ids parameter of the get_point_data and get_point_metadata functions to get data for WY2019 for only these sites.
Note that get_point_data and get_point_metadata require the parameters dataset, variable, temporal_resolution and aggregation to be supplied to be able to uniquely identify a single data series. For daily average streamflow, we want dataset='usgs_nwis', variable='streamflow', temporal_resolution='daily', and aggregation='mean'.
[4]:
# Get list of site IDs from the above exploratory DataFrame
co_streamflow_site_ids = list(df['site_id'])
assert len(co_streamflow_site_ids) == len(df)
[5]:
# Request point observations data
data_df = get_point_data(dataset="usgs_nwis", variable="streamflow", temporal_resolution="daily", aggregation="mean",
                         date_start="2018-09-30", date_end="2019-10-01",
                         site_ids=co_streamflow_site_ids)
# View first five records
data_df.head()
[5]:
| | date | 06614800 | 06720990 | 06721000 | 06724970 | 06727410 | 06727500 | 06730160 | 06730200 | 06730500 | ... | 09304200 | 09304500 | 09304800 | 09306222 | 09306255 | 09306290 | 401723105400000 | 401727105400000 | 401733105392404 | 402114105350101 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2018-09-30 | 0.011037 | 1.24520 | 4.6695 | 0.044148 | NaN | 0.006792 | 0.0 | 1.00465 | 0.51506 | ... | 4.3582 | 5.7166 | 7.1882 | 0.024904 | 0.008490 | 6.2543 | 0.019244 | NaN | 0.054053 | 0.185931 |
| 1 | 2018-10-01 | 0.011320 | 1.12634 | 5.0657 | 0.043582 | NaN | NaN | NaN | 0.98484 | 0.72731 | ... | 4.8110 | 5.8864 | 7.4146 | 0.025187 | 0.008490 | 6.5373 | 0.018961 | NaN | 0.050657 | 0.178290 |
| 2 | 2018-10-02 | 0.011320 | 0.45280 | 5.1223 | 0.056600 | NaN | NaN | NaN | 0.93390 | 0.66222 | ... | 5.0091 | 6.0562 | 7.5561 | 0.029149 | 0.009622 | 6.9052 | 0.019244 | NaN | 0.051789 | 0.203194 |
| 3 | 2018-10-03 | 0.011886 | 0.43016 | 5.1223 | 0.048110 | NaN | NaN | NaN | 0.81787 | 0.54902 | ... | 5.5468 | 6.6788 | 8.1787 | 0.035375 | 0.011037 | 7.6976 | 0.023772 | NaN | 0.056317 | 0.232626 |
| 4 | 2018-10-04 | 0.013867 | 0.40752 | 5.0374 | 0.049808 | NaN | NaN | NaN | 0.81787 | 0.52072 | ... | 6.1128 | 7.6976 | 10.4993 | 0.059713 | 0.024904 | 10.6408 | 0.022357 | NaN | 0.059147 | 0.252153 |
5 rows × 48 columns
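Site 401727105400000 is NaN in every row shown above, and its first_date_data_available in the earlier get_site_variables output is 2019-04-10, partway through the water year. This suggests that the date filter returns sites whose record overlaps the requested window rather than sites that cover it completely, though that is an inference from this output and not documented behavior. The sketch below shows the difference between the two checks; the helpers `overlaps` and `covers` are hypothetical and run client-side.

```python
from datetime import date


def overlaps(first, last, start, end):
    """True if [first, last] shares at least one day with [start, end]."""
    return first <= end and last >= start


def covers(first, last, start, end):
    """True if [first, last] contains every day of [start, end]."""
    return first <= start and last >= end


wy_start, wy_end = date(2018, 10, 1), date(2019, 9, 30)

# First/last availability dates copied from the get_site_variables output
sites = {
    "401723105400000": (date(1991, 9, 30), date(2023, 12, 1)),
    "401727105400000": (date(2019, 4, 10), date(2023, 10, 9)),
}

for site_id, (first, last) in sites.items():
    print(site_id,
          "overlaps:", overlaps(first, last, wy_start, wy_end),
          "covers:", covers(first, last, wy_start, wy_end))
```

If an analysis needs a complete water year, a stricter check like `covers` could be applied to the exploratory DataFrame before building the site list.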
[6]:
# Request site-level attributes for these sites
metadata_df = get_point_metadata(dataset="usgs_nwis", variable="streamflow", temporal_resolution="daily", aggregation="mean",
                                 date_start="2018-09-30", date_end="2019-10-01",
                                 site_ids=co_streamflow_site_ids)
# View first five records
metadata_df.head()
[6]:
| | site_id | site_name | site_type | agency | state | latitude | longitude | first_date_data_available | last_date_data_available | record_count | ... | doi | huc8 | conus1_x | conus1_y | conus2_x | conus2_y | gagesii_drainage_area | gagesii_class | gagesii_site_elevation | usgs_drainage_area |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 06614800 | MICHIGAN RIVER NEAR CAMERON PASS, CO | stream gauge | USGS | CO | 40.496094 | -105.865012 | 1973-10-01 | 2023-12-01 | 18322 | ... | None | 10180001 | 1054 | 818 | 1481 | 1764 | 4.0284 | Ref | 3188.0 | 1.54 |
| 1 | 06720990 | BIG DRY CREEK AT MOUTH NEAR FORT LUPTON, CO | stream gauge | USGS | CO | 40.068833 | -104.831986 | 1991-10-01 | 2023-12-01 | 11734 | ... | None | 10190003 | nan | nan | 1561 | 1705 | 272.4498 | Non-ref | 1494.0 | 107.00 |
| 2 | 06721000 | SOUTH PLATTE RIVER AT FORT LUPTON, CO. | stream gauge | USGS | CO | 40.116094 | -104.818583 | 1929-04-29 | 2023-12-01 | 17668 | ... | None | 10190003 | nan | nan | 1563 | 1715 | 13064.8200 | Non-ref | 1485.0 | 5043.00 |
| 3 | 06724970 | LEFT HAND CREEK AT HOVER ROAD NEAR LONGMONT, CO | stream gauge | USGS | CO | 40.134278 | -105.130819 | 2014-03-05 | 2023-12-01 | 3557 | ... | None | 10190005 | nan | nan | 1540 | 1718 | NaN | nan | NaN | 71.60 |
| 4 | 06727410 | FOURMILE CREEK AT LOGAN MILL ROAD NEAR CRISMAN... | stream gauge | USGS | CO | 40.042028 | -105.364917 | 2011-04-01 | 2023-09-30 | 1125 | ... | None | 10190005 | nan | nan | nan | nan | NaN | nan | NaN | 19.20 |
5 rows × 23 columns
This workflow shows how to quickly identify sites of interest based on location and temporal filters. The state, date_start, and date_end parameters can also be passed to get_point_data and get_point_metadata directly. The get_site_variables function offers a quick look-up for variable-level metadata without having to pre-specify the variable of interest (see the additional example below).
Example 2: What data is available within the Raritan watershed in New Jersey (HUC8='02030105')?
Please see the example notebook Filter sites by USGS HUC boundary for a more detailed walk-through on the following procedure. You may also supply an existing shapefile and CRS if you have one for your domain.
[7]:
# Import additional packages
import requests
from zipfile import ZipFile
from io import BytesIO
import shapefile
from shapely.geometry import shape
[8]:
# Download and subset HUC02 boundary file from the USGS
# Send request for data
url = 'https://prd-tnm.s3.amazonaws.com/StagedProducts/Hydrography/WBD/HU2/Shape/WBD_02_HU2_Shape.zip'
url_response = requests.get(url) # note this might take a minute or so to run
# In this example, we will extract only the files with the HUC8 level watersheds
# This code saves these files to the local directory where this notebook is being run
myzipfile = ZipFile(BytesIO(url_response.content))
myzipfile.extractall(members=['Shape/WBDHU8.shp', 'Shape/WBDHU8.shx', 'Shape/WBDHU8.dbf', 'Shape/WBDHU8.prj'])
# Read in shapefile
huc02_shp = shapefile.Reader('Shape/WBDHU8.shp')
# Read in projection file
with open('Shape/WBDHU8.prj') as f:
    usgs_huc_crs = f.readlines()[0]
print(f"CRS: {usgs_huc_crs}")
# We want to use the Raritan watershed, HUC8='02030105'. This is at index 72
print(huc02_shp.shapeRecord(i=72, fields=['states', 'huc8', 'name']).record)
# Extract the shape and record information for this index
raritan_shape = huc02_shp.shapeRecord(i=72).shape
raritan_record = huc02_shp.shapeRecord(i=72).record
# Save as shapefile, to be passed in to hf_hydrodata functions below
with shapefile.Writer('Shape/raritan_watershed') as w:
    w.fields = huc02_shp.fields[1:]
    w.record(raritan_record)
    w.shape(raritan_shape)
CRS: GEOGCS["GCS_North_American_1983",DATUM["D_North_American_1983",SPHEROID["GRS_1980",6378137.0,298.257222101]],PRIMEM["Greenwich",0.0],UNIT["Degree",0.0174532925199433]]
Record #72: ['NJ', '02030105', 'Raritan']
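The hard-coded index 72 depends on the record order in the downloaded file, which could change if the upstream file is updated. As a minimal sketch, the hypothetical helper `find_record_index` below looks a record up by field value instead. It works on any sequence of dict-like records; the sample list is a stand-in, and only the Raritan values come from the output above.

```python
def find_record_index(records, field, value):
    """Return the index of the first record whose `field` equals `value`.

    `records` is any iterable of dict-like records. Raises ValueError when
    nothing matches, so a changed upstream file fails loudly.
    Hypothetical helper for illustration.
    """
    for i, rec in enumerate(records):
        if rec[field] == value:
            return i
    raise ValueError(f"No record with {field}={value!r}")


# Stand-in records; the first entry is a made-up placeholder
sample = [
    {"huc8": "00000001", "name": "placeholder"},
    {"huc8": "02030105", "name": "Raritan"},
]

idx = find_record_index(sample, "huc8", "02030105")
print(idx, sample[idx]["name"])
```

With pyshp, the records could probably be supplied as `(rec.as_dict() for rec in huc02_shp.iterRecords())`; treat that as an assumption to check against the installed pyshp version.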
[9]:
with open('Shape/WBDHU8.prj') as f:
    usgs_huc_crs = f.readlines()[0]
print(f"CRS: {usgs_huc_crs}")
CRS: GEOGCS["GCS_North_American_1983",DATUM["D_North_American_1983",SPHEROID["GRS_1980",6378137.0,298.257222101]],PRIMEM["Greenwich",0.0],UNIT["Degree",0.0174532925199433]]
We will use this shapefile to see all of the sites with available data within that watershed. Again, if you have your own domain shapefile and CRS you can supply that here instead with the polygon and polygon_crs parameters.
[10]:
df = get_site_variables(polygon="Shape/raritan_watershed.shp", polygon_crs=usgs_huc_crs)
print(f"Number of records: {len(df)}")
df.head(5)
Number of records: 1013
[10]:
| | site_id | site_name | site_type | agency | state | variable_name | units | first_date_data_available | last_date_data_available | record_count | latitude | longitude | site_query_url | date_metadata_last_updated | tz_cd | doi |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 01396091 | South Br Raritan River at Rt 46 at Budd Lake NJ | stream gauge | USGS | NJ | Hourly average streamflow | m3/s | 2008-06-01 | 2014-03-30 | 50925 | 40.859444 | -74.760833 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-05-30 | EST | None |
| 1 | 01396190 | South Branch Raritan River at Four Bridges NJ | stream gauge | USGS | NJ | Hourly average streamflow | m3/s | 1999-01-20 | 2012-09-30 | 116127 | 40.806111 | -74.740556 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-05-30 | EST | None |
| 2 | 01396500 | South Branch Raritan River near High Bridge NJ | stream gauge | USGS | NJ | Hourly average streamflow | m3/s | 1981-10-01 | 2023-12-02 | 353711 | 40.677778 | -74.879167 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-05-30 | EST | None |
| 3 | 01396580 | Spruce Run at Glen Gardner NJ | stream gauge | USGS | NJ | Hourly average streamflow | m3/s | 1981-10-01 | 2005-08-31 | 153867 | 40.693333 | -74.939722 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-05-30 | EST | None |
| 4 | 01396582 | Spruce Run at Main Street at Glen Gardner NJ | stream gauge | USGS | NJ | Hourly average streamflow | m3/s | 2005-10-01 | 2023-12-02 | 151737 | 40.691389 | -74.936944 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-05-30 | EST | None |
Let’s see what types of sites are included here.
[11]:
print(f"Site types: {df['site_type'].unique()}")
Site types: ['stream gauge' 'groundwater well']
Let’s look at the groundwater wells to see how many data records each well has.
[12]:
df[df['site_type'] == 'groundwater well']
[12]:
| | site_id | site_name | site_type | agency | state | variable_name | units | first_date_data_available | last_date_data_available | record_count | latitude | longitude | site_query_url | date_metadata_last_updated | tz_cd | doi |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 63 | 402109074301301 | 230291-- Forsgate 1 Obs | groundwater well | USGS | NJ | Hourly average water table depth | m | 2007-10-01 | 2023-08-10 | 138590 | 40.352608 | -74.503209 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-03-08 | EST | None |
| 64 | 402109074301302 | 230292-- Forsgate 2 Obs | groundwater well | USGS | NJ | Hourly average water table depth | m | 2007-10-01 | 2023-08-10 | 138221 | 40.352608 | -74.502931 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-03-08 | EST | None |
| 65 | 402138074435801 | 210365-- Carter Rd Obs | groundwater well | USGS | NJ | Hourly average water table depth | m | 2007-10-01 | 2023-09-29 | 140087 | 40.360662 | -74.732383 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-03-08 | EST | None |
| 66 | 402143074185201 | 230104-- Morrell 1 Obs | groundwater well | USGS | NJ | Hourly average water table depth | m | 2007-10-01 | 2023-12-02 | 140656 | 40.362053 | -74.313203 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-03-08 | EST | None |
| 67 | 402151074525301 | 190251-- Corsalo Rd 1 Obs | groundwater well | USGS | NJ | Hourly average water table depth | m | 2007-10-01 | 2023-12-02 | 141074 | 40.364273 | -74.880999 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-03-08 | EST | None |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 967 | 405325074363901 | 271651-- 1963 | groundwater well | USGS | NJ | Water table depth | m | 1963-07-17 | 1989-09-22 | 4 | 40.890377 | -74.610437 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-03-08 | EST | None |
| 968 | 405330074363801 | 271123-- Kenvil Newcrete 1 Obs | groundwater well | USGS | NJ | Water table depth | m | 1989-02-27 | 1991-01-01 | 20 | 40.891766 | -74.610160 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-03-08 | EST | None |
| 969 | 405330074363802 | 271124-- Kenvil Newcrete 2 Obs | groundwater well | USGS | NJ | Water table depth | m | 1989-03-21 | 1997-10-08 | 20 | 40.891766 | -74.610160 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-03-08 | EST | None |
| 970 | 405330074363803 | 271183-- Kenvil Newcrete 7 Obs | groundwater well | USGS | NJ | Water table depth | m | 1989-10-05 | 1990-09-30 | 13 | 40.891766 | -74.610160 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-03-08 | EST | None |
| 971 | 405355074380801 | 271787-- Stierli Way 1A | groundwater well | USGS | NJ | Water table depth | m | 1989-09-08 | 1989-09-08 | 1 | 40.898710 | -74.635160 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-03-08 | EST | None |
909 rows × 16 columns
It looks like some of these have a lot of records, and some have only a couple of sparse measurements. Let’s see what sites have regular, daily water table depth observations.
[13]:
df[(df['site_type'] == 'groundwater well') & (df['variable_name'] == 'Daily average water table depth')]
[13]:
| | site_id | site_name | site_type | agency | state | variable_name | units | first_date_data_available | last_date_data_available | record_count | latitude | longitude | site_query_url | date_metadata_last_updated | tz_cd | doi |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 78 | 401518074223001 | 250216-- PW 1 | groundwater well | USGS | NJ | Daily average water table depth | m | 1972-10-06 | 1975-07-31 | 697 | 40.255111 | -74.374593 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-03-08 | EST | None |
| 79 | 401819074351601 | 210395-- MW-2 Obs | groundwater well | USGS | NJ | Daily average water table depth | m | 1993-09-18 | 1994-08-28 | 345 | 40.301775 | -74.592100 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-03-08 | EST | None |
| 80 | 402015074275701 | 230228-- Forsgate 3 Obs | groundwater well | USGS | NJ | Daily average water table depth | m | 1972-11-01 | 1975-02-21 | 772 | 40.337608 | -74.465430 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-03-08 | EST | None |
| 81 | 402015074275702 | 230229-- Forsgate 4 Obs | groundwater well | USGS | NJ | Daily average water table depth | m | 1972-11-01 | 2005-07-16 | 678 | 40.337608 | -74.465430 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-03-08 | EST | None |
| 82 | 402023074391901 | 210358-- Princeton 1-Brick Rd Obs | groundwater well | USGS | NJ | Daily average water table depth | m | 1989-06-28 | 1990-10-21 | 460 | 40.339829 | -74.654880 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-03-08 | EST | None |
| 83 | 402032074392501 | 210359-- Princeton 2-Chill Pl Obs | groundwater well | USGS | NJ | Daily average water table depth | m | 1989-11-18 | 1992-02-16 | 784 | 40.342329 | -74.656547 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-03-08 | EST | None |
| 84 | 402058074355901 | 230796-- Test 5 Obs | groundwater well | USGS | NJ | Daily average water table depth | m | 1986-03-14 | 1992-02-20 | 2170 | 40.349552 | -74.599323 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-03-08 | EST | None |
| 85 | 402058074355902 | 230800-- Test 9 Obs | groundwater well | USGS | NJ | Daily average water table depth | m | 1986-03-14 | 1992-02-20 | 2152 | 40.349552 | -74.599323 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-03-08 | EST | None |
| 86 | 402109074301301 | 230291-- Forsgate 1 Obs | groundwater well | USGS | NJ | Daily average water table depth | m | 1973-02-07 | 2023-08-09 | 7278 | 40.352608 | -74.503209 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-03-08 | EST | None |
| 87 | 402109074301302 | 230292-- Forsgate 2 Obs | groundwater well | USGS | NJ | Daily average water table depth | m | 1972-12-06 | 2023-08-09 | 7435 | 40.352608 | -74.502931 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-03-08 | EST | None |
| 88 | 402131074461201 | 210088-- Honey Br 10 Obs | groundwater well | USGS | NJ | Daily average water table depth | m | 1972-10-01 | 1995-03-05 | 4037 | 40.358718 | -74.769329 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-03-08 | EST | None |
| 89 | 402138074435801 | 210365-- Carter Rd Obs | groundwater well | USGS | NJ | Daily average water table depth | m | 1987-02-25 | 2023-09-28 | 13134 | 40.360662 | -74.732383 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-03-08 | EST | None |
| 90 | 402143074185201 | 230104-- Morrell 1 Obs | groundwater well | USGS | NJ | Daily average water table depth | m | 1985-01-24 | 2023-12-02 | 13930 | 40.362053 | -74.313203 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-03-08 | EST | None |
| 91 | 402151074525301 | 190251-- Corsalo Rd 1 Obs | groundwater well | USGS | NJ | Daily average water table depth | m | 1989-06-28 | 2023-12-02 | 12299 | 40.364273 | -74.880999 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-03-08 | EST | None |
| 92 | 402208074145201 | 250272-- Marlboro 1 Obs | groundwater well | USGS | NJ | Daily average water table depth | m | 1973-01-22 | 2023-09-10 | 16842 | 40.368998 | -74.247367 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-03-08 | EST | None |
| 93 | 402208074505801 | 210705-- Gomez Tract Obs | groundwater well | USGS | NJ | Daily average water table depth | m | 2007-12-05 | 2012-09-29 | 1690 | 40.368889 | -74.849444 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-03-08 | EST | None |
| 94 | 402450074181801 | 230182-- Dom | groundwater well | USGS | NJ | Daily average water table depth | m | 1972-12-04 | 1975-08-11 | 981 | 40.413718 | -74.304869 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-03-08 | EST | None |
| 95 | 402458074200401 | 230184-- Runyon A4 | groundwater well | USGS | NJ | Daily average water table depth | m | 1972-12-04 | 1975-03-25 | 529 | 40.416218 | -74.334037 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-03-08 | EST | None |
| 96 | 402510074411601 | 350028-- Seven | groundwater well | USGS | NJ | Daily average water table depth | m | 1986-12-25 | 1989-09-19 | 888 | 40.419550 | -74.687382 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-03-08 | EST | None |
| 97 | 402512074414301 | 350139-- MW109 | groundwater well | USGS | NJ | Daily average water table depth | m | 2003-06-18 | 2023-09-28 | 7183 | 40.420105 | -74.694882 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-03-08 | EST | None |
| 98 | 402525074195401 | 230189-- Runyon R50 | groundwater well | USGS | NJ | Daily average water table depth | m | 1972-12-04 | 1975-07-22 | 851 | 40.423718 | -74.331259 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-03-08 | EST | None |
| 99 | 402536074201801 | 230194-- Runyon 1 Obs | groundwater well | USGS | NJ | Daily average water table depth | m | 1972-11-02 | 1975-08-12 | 751 | 40.426774 | -74.337926 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-03-08 | EST | None |
| 100 | 402553074203301 | 230343-- Sun Biscuit 5 | groundwater well | USGS | NJ | Daily average water table depth | m | 1972-12-04 | 1975-08-12 | 921 | 40.431496 | -74.342093 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-03-08 | EST | None |
| 101 | 402553074271701 | 230070-- Fischer Obs | groundwater well | USGS | NJ | Daily average water table depth | m | 1972-12-06 | 2023-12-02 | 14908 | 40.432050 | -74.454874 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-03-08 | EST | None |
| 102 | 402555074213301 | 230433-- So River 4 | groundwater well | USGS | NJ | Daily average water table depth | m | 1972-11-29 | 1973-03-12 | 104 | 40.432051 | -74.358760 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-03-08 | EST | None |
| 103 | 402558074201301 | 230344-- Sayreville 2 Obs | groundwater well | USGS | NJ | Daily average water table depth | m | 1972-12-04 | 1975-08-12 | 885 | 40.432884 | -74.336537 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-03-08 | EST | None |
| 104 | 402633074220001 | 230439-- South River 2 Obs | groundwater well | USGS | NJ | Daily average water table depth | m | 1973-01-24 | 1975-08-12 | 529 | 40.442606 | -74.366260 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-03-08 | EST | None |
| 105 | 402704074213901 | 231058-- Hess Bros #1 | groundwater well | USGS | NJ | Daily average water table depth | m | 1987-03-18 | 1988-12-27 | 573 | 40.451217 | -74.360427 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-03-08 | EST | None |
| 106 | 402743074221601 | 231056-- Monitoring #3 | groundwater well | USGS | NJ | Daily average water table depth | m | 1987-03-28 | 1987-11-22 | 62 | 40.462050 | -74.370705 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-03-08 | EST | None |
| 107 | 402831074212001 | 231077-- Jcp&L-Say | groundwater well | USGS | NJ | Daily average water table depth | m | 1987-03-26 | 1988-07-11 | 442 | 40.475383 | -74.355149 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-03-08 | EST | None |
| 108 | 403119074290301 | 231165-- Golf 13 Obs | groundwater well | USGS | NJ | Daily average water table depth | m | 1992-05-08 | 2002-01-27 | 2673 | 40.518993 | -74.469597 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-03-08 | EST | None |
| 109 | 403135074274401 | 231330-- MW-12A | groundwater well | USGS | NJ | Daily average water table depth | m | 1998-12-05 | 2002-01-27 | 1131 | 40.526493 | -74.461819 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-03-08 | EST | None |
| 110 | 403135074274402 | 231331-- MW-12B | groundwater well | USGS | NJ | Daily average water table depth | m | 1998-12-05 | 2002-01-27 | 1140 | 40.526493 | -74.461819 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-03-08 | EST | None |
| 111 | 403135074274403 | 231332-- MW-12C | groundwater well | USGS | NJ | Daily average water table depth | m | 1998-12-05 | 2002-01-27 | 1053 | 40.526493 | -74.461819 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-03-08 | EST | None |
| 112 | 403200074420601 | 350138-- MW110 | groundwater well | USGS | NJ | Daily average water table depth | m | 2003-06-18 | 2023-09-28 | 7246 | 40.533250 | -74.701806 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-03-08 | EST | None |
| 113 | 403455074514801 | 190276-- Environmental Ctr 1 Obs | groundwater well | USGS | NJ | Daily average water table depth | m | 1992-05-20 | 2023-12-02 | 11416 | 40.577325 | -74.860443 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-03-08 | EST | None |
| 114 | 403517074452501 | 190270-- Readington 11 Obs | groundwater well | USGS | NJ | Daily average water table depth | m | 1990-04-25 | 2023-12-02 | 12247 | 40.588158 | -74.756551 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-03-08 | EST | None |
| 115 | 404014074585401 | 190495-- NJWSA TW-B | groundwater well | USGS | NJ | Daily average water table depth | m | 2006-10-14 | 2023-09-14 | 6174 | 40.670444 | -74.981778 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-03-08 | EST | None |
| 116 | 404452074493101 | 271302-- Jenkinson Farm 1 Obs | groundwater well | USGS | NJ | Daily average water table depth | m | 1989-10-06 | 1991-01-31 | 483 | 40.747878 | -74.824888 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-03-08 | EST | None |
| 117 | 404705074463801 | 271085-- TW Obs | groundwater well | USGS | NJ | Daily average water table depth | m | 1988-04-02 | 1991-02-07 | 919 | 40.784822 | -74.776831 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-03-08 | EST | None |
| 118 | 404712074454701 | 271303-- Drew University Farm Obs | groundwater well | USGS | NJ | Daily average water table depth | m | 1990-09-14 | 2000-07-26 | 3578 | 40.786766 | -74.762664 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-03-08 | EST | None |
| 119 | 404809074415501 | 271126-- Black River 4 Obs | groundwater well | USGS | NJ | Daily average water table depth | m | 1989-03-22 | 1991-01-31 | 592 | 40.802600 | -74.698217 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-03-08 | EST | None |
| 120 | 404809074415502 | 271164-- Black River 5 Obs | groundwater well | USGS | NJ | Daily average water table depth | m | 1989-10-06 | 1991-01-22 | 416 | 40.802600 | -74.698217 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-03-08 | EST | None |
| 121 | 404934074400501 | 271190-- Black River 10 Obs | groundwater well | USGS | NJ | Daily average water table depth | m | 1992-05-21 | 2023-12-02 | 11462 | 40.817877 | -74.680995 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-03-08 | EST | None |
| 122 | 404954074412201 | 271084-- Preliminary TW 2 Obs | groundwater well | USGS | NJ | Daily average water table depth | m | 1988-04-02 | 1990-03-01 | 697 | 40.831766 | -74.689051 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-03-08 | EST | None |
| 123 | 405005074410101 | 271083-- Test 1 Obs | groundwater well | USGS | NJ | Daily average water table depth | m | 1988-04-02 | 1990-05-23 | 782 | 40.834822 | -74.683217 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-03-08 | EST | None |
| 124 | 405047074392901 | 271597-- Kennedy School MW | groundwater well | USGS | NJ | Daily average water table depth | m | 1990-09-27 | 1990-12-23 | 88 | 40.846488 | -74.657661 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-03-08 | EST | None |
| 125 | 405330074363801 | 271123-- Kenvil Newcrete 1 Obs | groundwater well | USGS | NJ | Daily average water table depth | m | 1989-03-23 | 1991-02-07 | 574 | 40.891766 | -74.610160 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-03-08 | EST | None |
| 126 | 405330074363803 | 271183-- Kenvil Newcrete 7 Obs | groundwater well | USGS | NJ | Daily average water table depth | m | 1989-10-06 | 1990-09-30 | 359 | 40.891766 | -74.610160 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-03-08 | EST | None |
This looks good, but several of these sites only have data through the 1980s or 1990s. Let’s say we only want to include sites that have data from 2000 onward. We can specify date_start='2000-01-01' to make sure we capture only sites that were still operational after that date. Let’s formalize this query and also add variable='water_table_depth' and temporal_resolution='daily' to the get_site_variables query.
[14]:
df = get_site_variables(polygon="Shape/raritan_watershed.shp", polygon_crs=usgs_huc_crs,
                        variable="water_table_depth", temporal_resolution="daily", date_start="2000-01-01")
print(f"Number of records: {len(df)}")
df.head(5)
Number of records: 20
[14]:
| | site_id | site_name | site_type | agency | state | variable_name | units | first_date_data_available | last_date_data_available | record_count | latitude | longitude | site_query_url | date_metadata_last_updated | tz_cd | doi |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 402015074275702 | 230229-- Forsgate 4 Obs | groundwater well | USGS | NJ | Daily average water table depth | m | 1972-11-01 | 2005-07-16 | 678 | 40.337608 | -74.465430 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-03-08 | EST | None |
| 1 | 402109074301301 | 230291-- Forsgate 1 Obs | groundwater well | USGS | NJ | Daily average water table depth | m | 1973-02-07 | 2023-08-09 | 7278 | 40.352608 | -74.503209 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-03-08 | EST | None |
| 2 | 402109074301302 | 230292-- Forsgate 2 Obs | groundwater well | USGS | NJ | Daily average water table depth | m | 1972-12-06 | 2023-08-09 | 7435 | 40.352608 | -74.502931 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-03-08 | EST | None |
| 3 | 402138074435801 | 210365-- Carter Rd Obs | groundwater well | USGS | NJ | Daily average water table depth | m | 1987-02-25 | 2023-09-28 | 13134 | 40.360662 | -74.732383 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-03-08 | EST | None |
| 4 | 402143074185201 | 230104-- Morrell 1 Obs | groundwater well | USGS | NJ | Daily average water table depth | m | 1985-01-24 | 2023-12-02 | 13930 | 40.362053 | -74.313203 | https://waterservices.usgs.gov/nwis/site/?form... | 2023-03-08 | EST | None |
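As a quick sanity check on a result like this one, the effect of the date filter can be approximated locally by comparing last_date_data_available against the cutoff. The sketch below is illustrative only: the small DataFrame is a hypothetical stand-in built from three rows of the tables above, and it assumes the date column can be parsed by pandas.

```python
import pandas as pd

# Hypothetical stand-in for a DataFrame returned by get_site_variables,
# using three rows copied from the tables above.
sites = pd.DataFrame({
    "site_id": ["402015074275702", "402109074301301", "405047074392901"],
    "last_date_data_available": ["2005-07-16", "2023-08-09", "1990-12-23"],
})

# Parse the date strings so that they compare chronologically.
last_date = pd.to_datetime(sites["last_date_data_available"])

# Keep only sites whose record extends to 2000-01-01 or later.
recent = sites[last_date >= pd.Timestamp("2000-01-01")]
print(recent["site_id"].tolist())
```

Here the Kennedy School well, whose record ends in 1990, is dropped, while the two Forsgate wells are kept.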
Great! It looks like we have 20 sites left that fulfill our query. Let’s request the full daily water table depth series for those sites, starting from 2000-01-01.
Again, get_point_data and get_point_metadata require the parameters dataset, variable, temporal_resolution and aggregation so that they can uniquely identify a single data series. For daily average water table depth, we want dataset='usgs_nwis', variable='water_table_depth', temporal_resolution='daily', and aggregation='mean'.
[15]:
# This DataFrame will contain data from 2000-01-01 onwards. If sites do not have data that early, they will have NaN values
# for dates until the site became operational.
data_df = get_point_data(dataset="usgs_nwis", variable="water_table_depth", temporal_resolution="daily", aggregation="mean",
date_start="2000-01-01",
polygon="Shape/raritan_watershed.shp", polygon_crs=usgs_huc_crs)
# View the first five records
data_df.head()
[15]:
| date | 402015074275702 | 402109074301301 | 402109074301302 | 402138074435801 | 402143074185201 | 402151074525301 | 402208074145201 | 402208074505801 | 402512074414301 | ... | 403119074290301 | 403135074274401 | 403135074274402 | 403135074274403 | 403200074420601 | 403455074514801 | 403517074452501 | 404014074585401 | 404712074454701 | 404934074400501 | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2000-01-01 | NaN | NaN | NaN | 3.477768 | 0.777240 | 0.798576 | 37.956744 | NaN | NaN | ... | NaN | 3.761232 | 2.706624 | 2.368296 | NaN | 2.923032 | 5.660136 | NaN | 16.005048 | 3.691128 |
| 1 | 2000-01-02 | NaN | NaN | NaN | 3.465576 | 0.777240 | 0.746760 | 37.911024 | NaN | NaN | ... | NaN | 3.758184 | 2.688336 | 2.343912 | NaN | 2.904744 | 5.690616 | NaN | 15.995904 | 3.691128 |
| 2 | 2000-01-03 | NaN | NaN | NaN | 3.453384 | 0.780288 | 0.658368 | 37.874448 | NaN | NaN | ... | NaN | 3.761232 | 2.688336 | 2.334768 | NaN | 2.904744 | 5.727192 | NaN | 16.002000 | 3.688080 |
| 3 | 2000-01-04 | NaN | NaN | NaN | 3.407664 | 0.765048 | 0.509016 | 37.828728 | NaN | NaN | ... | NaN | 3.742944 | 2.654808 | 2.298192 | NaN | 2.883408 | 5.739384 | NaN | 15.989808 | 3.681984 |
| 4 | 2000-01-05 | NaN | NaN | NaN | 3.380232 | 0.646176 | 0.521208 | 37.853112 | NaN | NaN | ... | NaN | 3.724656 | 2.667000 | 2.289048 | NaN | 2.889504 | 5.760720 | NaN | 16.029432 | 3.678936 |
5 rows × 21 columns
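Because sites without early data show up as NaN, it can help to summarize how much data each site column actually holds. The following is a minimal sketch on a hypothetical miniature of data_df: the first two site columns reuse values from the table above, while the "site_c" column and its NaN pattern are invented for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of data_df: a 'date' column plus one column per site.
data = pd.DataFrame({
    "date": pd.to_datetime(["2000-01-01", "2000-01-02", "2000-01-03"]),
    "402015074275702": [np.nan, np.nan, np.nan],
    "402138074435801": [3.477768, 3.465576, 3.453384],
    "site_c": [np.nan, 1.0, 1.1],  # invented column for illustration
})

values = data.set_index("date")

# Number of non-missing daily values per site.
coverage = values.count()

# First date with an observation per site (None if the column is all NaN).
first_obs = {site: values[site].first_valid_index() for site in values.columns}

print(coverage.to_dict())
print(first_obs["site_c"])
```

A site with a count of zero in the requested window is worth checking against its first_date_data_available and last_date_data_available before using it in an analysis.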
[16]:
# Request site-level attributes for these sites
metadata_df = get_point_metadata(dataset="usgs_nwis", variable="water_table_depth", temporal_resolution="daily", aggregation="mean",
date_start="2000-01-01",
polygon="Shape/raritan_watershed.shp", polygon_crs=usgs_huc_crs)
# View the first five records
metadata_df.head()
[16]:
| site_id | site_name | site_type | agency | state | latitude | longitude | first_date_data_available | last_date_data_available | record_count | ... | conus1_x | conus1_y | conus2_x | conus2_y | usgs_nat_aqfr_cd | usgs_aqfr_cd | usgs_aqfr_type_cd | usgs_well_depth | usgs_hole_depth | usgs_hole_depth_src_cd | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 402015074275702 | 230229-- Forsgate 4 Obs | groundwater well | USGS | NJ | 40.337608 | -74.465430 | 1972-11-01 | 2005-07-16 | 678 | ... | None | None | 4035 | 1964 | S100NATLCP | 211FRNG | C | 330.0 | NaN | nan |
| 1 | 402109074301301 | 230291-- Forsgate 1 Obs | groundwater well | USGS | NJ | 40.352608 | -74.503209 | 1973-02-07 | 2023-08-09 | 7278 | ... | None | None | 4032 | 1964 | S100NATLCP | 211FRNG | C | 203.0 | NaN | nan |
| 2 | 402109074301302 | 230292-- Forsgate 2 Obs | groundwater well | USGS | NJ | 40.352608 | -74.502931 | 1972-12-06 | 2023-08-09 | 7435 | ... | None | None | 4032 | 1964 | S100NATLCP | 211ODBG | C | 104.0 | 130.0 | D |
| 3 | 402138074435801 | 210365-- Carter Rd Obs | groundwater well | USGS | NJ | 40.360662 | -74.732383 | 1987-02-25 | 2023-09-28 | 13134 | ... | None | None | 4013 | 1960 | N300ERLMZC | 227PSSC | U | 99.0 | NaN | Z |
| 4 | 402143074185201 | 230104-- Morrell 1 Obs | groundwater well | USGS | NJ | 40.362053 | -74.313203 | 1985-01-24 | 2023-12-02 | 13930 | ... | None | None | 4046 | 1970 | S100NATLCP | 211EGLS | U | 11.0 | 11.0 | R |
5 rows × 25 columns
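The data table above has 21 columns (date plus 20 sites), so we would expect the metadata table to hold one row per site column. A small consistency check along these lines is sketched below; the two DataFrames are hypothetical miniatures that reuse site IDs and values from the tables above, and the check assumes that site columns in the data are named by site_id.

```python
import pandas as pd

# Hypothetical miniatures of data_df and metadata_df.
data = pd.DataFrame({
    "date": ["2000-01-01", "2000-01-02"],
    "402138074435801": [3.477768, 3.465576],
    "402143074185201": [0.777240, 0.777240],
})
metadata = pd.DataFrame({
    "site_id": ["402138074435801", "402143074185201"],
    "site_name": ["210365-- Carter Rd Obs", "230104-- Morrell 1 Obs"],
})

# Every column except 'date' is assumed to be a site ID.
data_sites = set(data.columns) - {"date"}
metadata_sites = set(metadata["site_id"])

# Sites present in one table but not the other (both should be empty).
missing_metadata = data_sites - metadata_sites
missing_data = metadata_sites - data_sites
print(missing_metadata, missing_data)
```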
Now let’s visualize these points on a map to see what we ended up with. We’ll use the bokeh package to do this. If you do not have bokeh installed in your environment, run pip install bokeh first.
[17]:
from bokeh.plotting import figure, output_notebook, show
from bokeh.models import ColumnDataSource
import numpy as np
output_notebook()
# First let's define a function to convert lon/lat (in degrees) to web mercator coordinates for Bokeh map tiles
def wgs84_to_web_mercator(lon, lat):
    k = 6378137  # WGS84 equatorial radius in meters
    x = lon * (k * np.pi/180.0)
    y = np.log(np.tan((90 + lat) * np.pi/360.0)) * k
    return x, y
# Calculate and add mercator coordinates to metadata DataFrame
metadata_df['x'] = metadata_df.apply(lambda x: wgs84_to_web_mercator(x['longitude'], x['latitude'])[0], axis=1)
metadata_df['y'] = metadata_df.apply(lambda x: wgs84_to_web_mercator(x['longitude'], x['latitude'])[1], axis=1)
# Define data source for point overlays
source = ColumnDataSource(
data=dict(x=list(metadata_df['x']),
y=list(metadata_df['y']),
site_id=list(metadata_df['site_id']),
site_name=list(metadata_df['site_name']))
)
# Range bounds supplied in web mercator coordinates
# Translate min and max bounding values from metadata, padded by a buffer in degrees
buffer = 0.2
x_min, y_min = wgs84_to_web_mercator(metadata_df['longitude'].min() - buffer, metadata_df['latitude'].min() - buffer)
x_max, y_max = wgs84_to_web_mercator(metadata_df['longitude'].max() + buffer, metadata_df['latitude'].max() + buffer)
# Pass each range as (min, max) so that neither axis is flipped
p = figure(x_range=(x_min, x_max), y_range=(y_min, y_max),
           x_axis_type="mercator", y_axis_type="mercator")
p.add_tile('OSM')
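# Overlay red circles onto map for site locations
# (scatter draws circle markers by default; circle() with a size argument is deprecated in recent Bokeh releases)
p.scatter(x="x", y="y", size=15, fill_color="red", fill_alpha=0.8, source=source)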
show(p)
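To build some confidence in the coordinate conversion used for the map, it can be checked against known reference points and a round trip. The sketch below repeats the forward function from the cell above and adds an inverse, web_mercator_to_wgs84, which is a hypothetical helper written for this check and not part of the notebook or of hf_hydrodata.

```python
import numpy as np

K = 6378137  # WGS84 equatorial radius in meters, as in the function above

def wgs84_to_web_mercator(lon, lat):
    """Forward projection, repeated from the mapping cell above."""
    x = lon * (K * np.pi / 180.0)
    y = np.log(np.tan((90 + lat) * np.pi / 360.0)) * K
    return x, y

def web_mercator_to_wgs84(x, y):
    """Inverse projection (hypothetical helper, for checking only)."""
    lon = x / K * 180.0 / np.pi
    lat = (2.0 * np.arctan(np.exp(y / K)) - np.pi / 2.0) * 180.0 / np.pi
    return lon, lat

# Reference points: lon/lat (0, 0) maps to approximately (0, 0), and
# longitude 180 maps to half the equatorial circumference (pi * K).
x0, y0 = wgs84_to_web_mercator(0.0, 0.0)
x180, _ = wgs84_to_web_mercator(180.0, 0.0)

# Round trip for the first site in the metadata table above.
lon, lat = -74.465430, 40.337608
lon_back, lat_back = web_mercator_to_wgs84(*wgs84_to_web_mercator(lon, lat))
print(np.isclose(lon_back, lon), np.isclose(lat_back, lat))
```

If the round trip recovers the original longitude and latitude, the points drawn on the tile map should line up with the site coordinates in the metadata.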