National Water Information System
Streamflow and groundwater data from the USGS National Water Information System (NWIS) database.
Daily streamflow and water table depth data are obtained from the Daily Values Service.
Hourly streamflow and water table depth data are aggregated to the hourly level from Instantaneous Values Service observations, which are frequently collected at 15-minute increments.
The water table depth data accessed with temporal_resolution='instantaneous' come from the USGS Groundwater Levels Service. Note that these data usually do not have regular temporal coverage, and many of the sites available through this method have only a single point-in-time observation.
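As a rough illustration of the hourly aggregation, the sketch below averages 15-minute observations into hourly values with pandas. The function name, the sample readings, and the use of the mean as the aggregation are assumptions made for this example, not a description of the actual processing code.

```python
import pandas as pd


def aggregate_to_hourly(instantaneous: pd.Series) -> pd.Series:
    """Average sub-hourly observations into one value per hour (hypothetical helper)."""
    # "60min" bins are hourly; each bin is labeled with the start of the hour it covers.
    return instantaneous.resample("60min").mean()


# Made-up 15-minute readings spanning two hours.
times = pd.date_range("2024-01-01 00:00", periods=8, freq="15min")
readings = pd.Series([1.0, 2.0, 3.0, 4.0, 10.0, 20.0, 30.0, 40.0], index=times)

hourly = aggregate_to_hourly(readings)
```

Hours with no observations at all come out as NaN in this sketch, because the mean of an empty bin is undefined.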
Dataset Name: usgs_nwis
Data Source: usgs
Data Collection or Processing Notes:
We query data from the USGS weekly, early on Sunday mornings. Each weekly job collects all observations since the date through which we have existing data stored. For sites that are currently in operation, this translates to collecting data for only the previous week (7 days for daily data, 168 hours for hourly data).
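The incremental logic can be sketched as a small date calculation. This is only an illustration: the helper name, the choice to stop at the day before the job runs, and the example dates are all assumptions rather than details of the actual job.

```python
from datetime import date, timedelta
from typing import Optional, Tuple


def next_query_window(last_stored: date, run_date: date) -> Optional[Tuple[date, date]]:
    """Return the (start, end) dates to request, or None if nothing new is expected.

    Hypothetical helper: assumes daily data and that the job requests
    complete days only, ending the day before it runs.
    """
    start = last_stored + timedelta(days=1)  # first day not yet stored
    end = run_date - timedelta(days=1)       # last complete day before the run
    if start > end:
        return None
    return start, end


# A site in continuous operation: data stored through the Saturday before last,
# job running on Sunday 2024-03-10.
window = next_query_window(date(2024, 3, 2), date(2024, 3, 10))
days_requested = (window[1] - window[0]).days + 1
```

For a site that has been offline, the same calculation yields a longer window, since the start date tracks the last stored observation rather than the previous run.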
Because of the sparsity of the temporal_resolution='instantaneous' groundwater measurements, those are not included in this weekly schedule. We plan to query that source for new observations roughly every few months.
Note that raw hourly data is saved in UTC while raw daily data is saved with respect to the local site time zone.
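When comparing hourly and daily values, the time zone difference matters because an hour late in the local evening already falls on the next calendar day in UTC. The sketch below, which uses a hypothetical helper name and an example time zone chosen purely for illustration, converts a UTC-indexed hourly series to site-local time with pandas.

```python
import pandas as pd


def to_site_local_time(hourly_utc: pd.Series, site_tz: str) -> pd.Series:
    """Re-express a UTC-indexed series in the site's time zone (hypothetical helper)."""
    # Attach UTC if the index is naive, then convert; the underlying instants are unchanged.
    if hourly_utc.index.tz is None:
        hourly_utc = hourly_utc.tz_localize("UTC")
    return hourly_utc.tz_convert(site_tz)


# Three made-up hourly values stamped in UTC.
utc_times = pd.date_range("2024-01-15 05:00", periods=3, freq="60min")
hourly = pd.Series([1.0, 2.0, 3.0], index=utc_times)

# "America/Denver" is an example zone only; use the zone of the site in question.
local = to_site_local_time(hourly, "America/Denver")
local_dates = [ts.date().isoformat() for ts in local.index]
```

In this example the first two UTC hours belong to the previous local day, so they would be grouped with a different daily value than their UTC date suggests.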
To maintain the integrity of the data and its traceability back to the original sources, our team conducts very limited data manipulation on the queried data. This includes the following:
Unit translation into SI units
Standardization of NaN/missing values
For example, USGS will sometimes provide strings such as "Ice" or "Dry" to indicate why certain observations are missing. A full list of such fields is available here.
We standardize these values into the numeric numpy.NaN to allow the entirety of the series to be interpreted as numeric.
Consolidating multiple concurrent data series
USGS sometimes provides multiple concurrent observation series for the same variable at the same site. In these cases, we consolidate the multiple series into a single series using the following priorities:
If one of the series has been verified, we prioritize that over provisional data
If both series have identical values, we simply reduce them to a single set of observations
If one of the series has non-missing data and the other series has missing data, we prioritize the non-missing data
If multiple series remain with conflicting values, we take the average of the resulting non-missing values
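The missing-value standardization and the consolidation priorities above can be sketched with pandas as follows. This is one possible reading, not the actual implementation: the function names, the one-column-per-series layout, the per-observation approval flag, and the decision to fall back to provisional values where the verified value is missing are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def standardize_values(raw: pd.Series) -> pd.Series:
    """Turn non-numeric markers such as "Ice" or "Dry" into NaN (hypothetical helper)."""
    return pd.to_numeric(raw, errors="coerce")


def consolidate(values: pd.DataFrame, approved: pd.DataFrame) -> pd.Series:
    """Collapse concurrent series (one column each) into a single series.

    `approved` has the same shape as `values` and is True where an
    observation has been verified (assumed layout, for illustration only).
    """
    numeric = values.apply(standardize_values)
    # A verified observation only counts if it actually carries a value.
    verified = approved.to_numpy(dtype=bool) & numeric.notna().to_numpy()
    any_verified = verified.any(axis=1, keepdims=True)
    # Where a timestamp has verified values, keep only those; otherwise keep everything.
    keep = np.where(any_verified, verified, True)
    # The row-wise mean skips NaN, so it covers the remaining rules: identical values
    # reduce to themselves, a lone value wins, and conflicting values are averaged.
    return numeric.where(keep).mean(axis=1)


times = pd.date_range("2024-01-01", periods=4, freq="D")
values = pd.DataFrame(
    {"a": [1.0, 2.0, "Ice", 4.0], "b": [1.5, 2.0, 3.0, 6.0]}, index=times
)
approved = pd.DataFrame(
    {"a": [True, False, False, False], "b": [False, False, False, False]}, index=times
)

consolidated = consolidate(values, approved)
```

The four example rows exercise the four priorities in order: a verified value beating a provisional one, two identical values, one missing value, and two conflicting provisional values that are averaged.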
Variables
This section describes the variables available in the dataset. Use the dataset, variables, and temporal_resolution values in the Python access functions as described in Working with Gridded Data and Working with Point Observations.
| variable | description | temporal_resolution | units | aggregation | 3D |
|---|---|---|---|---|---|
| water_table_depth | Water table depth | daily, weekly, monthly | m | mean | no |
| streamflow | Streamflow | daily, weekly, monthly | mm | mean | no |