hf_hydrodata.point module
Module to retrieve point observations.
- hf_hydrodata.point.get_point_data(*args, **kwargs)
Collect point observations data into a Pandas DataFrame.
Observations are collected from HydroData for the specified data source, variable, temporal resolution, and aggregation. Optional arguments can be supplied as filters, such as date bounds, geographic bounds, and/or the minimum number of observations required per site. Please see the package documentation for the full set of supported combinations.
- Parameters:
dataset (str, required) -- Source from which requested data originated. Currently supported: 'usgs_nwis', 'snotel', 'scan', 'ameriflux', 'jasechko_2024', 'fan_2013'.
variable (str, required) -- Description of type of data requested. Currently supported: 'streamflow', 'water_table_depth', 'swe', 'precipitation', 'air_temp', 'soil_moisture', 'latent_heat', 'sensible_heat', 'downward_shortwave', 'downward_longwave', 'vapor_pressure_deficit', 'wind_speed'.
temporal_resolution (str, required) -- Collection frequency of data requested. Currently supported: 'daily', 'hourly', 'instantaneous', 'yearly', and 'long_term'. Please see the documentation for allowable combinations with variable.
aggregation (str, required) -- Additional information specifying the aggregation method for the variable to be returned. Options include descriptors such as 'mean' and 'sum'. Please see the documentation for allowable combinations with variable.
depth_level (int, optional) -- Depth level in inches at which the measurement is taken. Necessary for variable = 'soil_moisture'.
date_start (str, optional) -- A date provided as a string in 'YYYY-MM-DD' format. If provided, data will be returned from this date forward (inclusive).
date_end (str, optional) -- A date provided as a string in 'YYYY-MM-DD' format. If provided, data will be returned up through this date (inclusive).
latitude_range (tuple, optional) -- Latitude range bounds for the geographic domain; lesser value is provided first.
longitude_range (tuple, optional) -- Longitude range bounds for the geographic domain; lesser value is provided first.
grid (str, optional) -- Value of either 'conus1' or 'conus2'. Used in combination with parameter grid_bounds to extract site locations for a specific region of conus coordinates.
grid_bounds (list of integers, optional) -- A list of points [left, bottom, right, top] in ij grid coordinates of the grid supplied by the grid parameter.
site_ids (str or list of strings, optional) -- Single site ID string or list of desired (string) site identifiers.
state (str, optional) -- Two-letter postal code state abbreviation (example: state='NJ').
huc_id (str or list of strings, optional) -- Single HUC ID string or list of adjacent HUC ID strings.
polygon (str, optional) -- Path to location of shapefile. Must be readable by PyShp's shapefile.Reader().
polygon_crs (str, optional) -- CRS definition accepted by pyproj.CRS.from_user_input().
site_networks (str or list of strings, optional) -- Name(s) of site networks. Can be a string with a single network name, or a list of strings for multiple networks. Networks are currently supported for stream gages (dataset='usgs_nwis', variable='streamflow') and groundwater wells (dataset='usgs_nwis', variable='water_table_depth'). For streamflow, options include: 'gagesii', 'gagesii_reference', 'hcdn2009', 'camels', and 'nwm'. For water table depth, options include: 'climate_response_network'.
min_num_obs (int, optional) -- Minimum number of observations a site must have. If provided, data will be returned only for sites that have at least this number of non-NaN observation records within the requested date range (if supplied).
- Returns:
data_df (DataFrame) -- DataFrame with columns for each site_id satisfying input filters. Rows represent the date range requested from date_start and/or date_end, or the broadest range of data available for returned sites if no date range is explicitly requested.
If the environment variable HUC_VERSION is set, the function uses the HUC boundaries for that dataset_version when a HUC ID is passed as an option. The versions 2025_06, 2025_01, and 2024_11 are supported; a blank value selects the latest HUC boundaries.
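The parameter rules above can be sketched as a small helper that assembles the keyword arguments before calling get_point_data. This is a minimal illustration, not part of the package: the names set_huc_version, build_point_kwargs, and fetch_point_data are hypothetical, the checks cover only the rules stated in this section, and the 'streamflow' / 'daily' / 'mean' combination is assumed to be supported. The library import is deferred so that the helpers can be tried without network access.

```python
import os
import re
from datetime import datetime

# HUC boundary versions named in the note above; "" selects the latest.
SUPPORTED_HUC_VERSIONS = {"2025_06", "2025_01", "2024_11", ""}


def set_huc_version(version):
    """Hypothetical helper: set HUC_VERSION after checking the value."""
    if version not in SUPPORTED_HUC_VERSIONS:
        raise ValueError(f"unsupported HUC_VERSION: {version!r}")
    os.environ["HUC_VERSION"] = version


def _check_date(value):
    """Raise ValueError unless value is a real date written as 'YYYY-MM-DD'."""
    if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", value):
        raise ValueError(f"expected 'YYYY-MM-DD', got {value!r}")
    datetime.strptime(value, "%Y-%m-%d")  # rejects impossible dates


def build_point_kwargs(dataset, variable, temporal_resolution, aggregation, **filters):
    """Hypothetical helper: assemble keyword arguments for get_point_data."""
    for key in ("date_start", "date_end"):
        if key in filters:
            _check_date(filters[key])
    for key in ("latitude_range", "longitude_range"):
        if key in filters:
            low, high = filters[key]
            if low > high:
                raise ValueError(f"{key}: lesser value must be provided first")
    if variable == "soil_moisture" and "depth_level" not in filters:
        raise ValueError("depth_level is necessary for variable='soil_moisture'")
    return {
        "dataset": dataset,
        "variable": variable,
        "temporal_resolution": temporal_resolution,
        "aggregation": aggregation,
        **filters,
    }


def fetch_point_data(kwargs):
    """Call the library; imported lazily so the helpers above work offline."""
    from hf_hydrodata import point

    return point.get_point_data(**kwargs)


set_huc_version("2025_06")
request = build_point_kwargs(
    "usgs_nwis", "streamflow", "daily", "mean",
    date_start="2002-01-01", date_end="2002-01-05",
    state="NJ", min_num_obs=1,
)
# data_df = fetch_point_data(request)  # needs hf_hydrodata installed and network access
```

Keeping the arguments in a plain dictionary makes it easy to reuse the same filters in a later metadata request.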
- hf_hydrodata.point.get_point_metadata(*args, **kwargs)
Return DataFrame with site metadata for the filtered sites.
- Parameters:
dataset (str, required) -- Source from which requested data originated. Currently supported: 'usgs_nwis', 'snotel', 'scan', 'ameriflux', 'jasechko_2024', 'fan_2013'.
variable (str, required) -- Description of type of data requested. Currently supported: 'streamflow', 'water_table_depth', 'swe', 'precipitation', 'air_temp', 'soil_moisture', 'latent_heat', 'sensible_heat', 'downward_shortwave', 'downward_longwave', 'vapor_pressure_deficit', 'wind_speed'.
temporal_resolution (str, required) -- Collection frequency of data requested. Currently supported: 'daily', 'hourly', 'instantaneous', 'yearly', and 'long_term'. Please see the documentation for allowable combinations with variable.
aggregation (str, required) -- Additional information specifying the aggregation method for the variable to be returned. Options include descriptors such as 'mean' and 'sum'. Please see the documentation for allowable combinations with variable.
depth_level (int, optional) -- Depth level in inches at which the measurement is taken. Necessary for variable = 'soil_moisture'.
date_start (str, optional) -- A date provided as a string in 'YYYY-MM-DD' format. If provided, data will be returned from this date forward (inclusive).
date_end (str, optional) -- A date provided as a string in 'YYYY-MM-DD' format. If provided, data will be returned up through this date (inclusive).
latitude_range (tuple, optional) -- Latitude range bounds for the geographic domain; lesser value is provided first.
longitude_range (tuple, optional) -- Longitude range bounds for the geographic domain; lesser value is provided first.
grid (str, optional) -- Value of either 'conus1' or 'conus2'. Used in combination with parameter grid_bounds to extract site locations for a specific region of conus coordinates.
grid_bounds (list of integers, optional) -- A list of points [left, bottom, right, top] in ij grid coordinates of the grid supplied by the grid parameter.
site_ids (str or list of strings, optional) -- Single site ID string or list of desired (string) site identifiers.
state (str, optional) -- Two-letter postal code state abbreviation (example: state='NJ').
huc_id (str or list of strings, optional) -- Single HUC ID string or list of adjacent HUC ID strings.
polygon (str, optional) -- Path to location of shapefile. Must be readable by PyShp's shapefile.Reader().
polygon_crs (str, optional) -- CRS definition accepted by pyproj.CRS.from_user_input().
site_networks (str or list of strings, optional) -- Name(s) of site networks. Can be a string with a single network name, or a list of strings for multiple networks. Networks are currently supported for stream gages (dataset='usgs_nwis', variable='streamflow') and groundwater wells (dataset='usgs_nwis', variable='water_table_depth'). For streamflow, options include: 'gagesii', 'gagesii_reference', 'hcdn2009', 'camels', and 'nwm'. For water table depth, options include: 'climate_response_network'.
- Returns:
DataFrame of site-level metadata for the sites satisfying the input filters.
- Return type:
DataFrame
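Because get_point_metadata takes the same filters as get_point_data apart from min_num_obs, one set of arguments can drive both calls. The sketch below is illustrative only: metadata_kwargs and fetch_data_and_metadata are hypothetical names, dropping min_num_obs is an assumption based solely on the parameter lists in this section, and the 'water_table_depth' / 'daily' / 'mean' combination is assumed to be supported.

```python
# Filters that get_point_metadata does not list among its parameters above.
DATA_ONLY_KEYS = {"min_num_obs"}


def metadata_kwargs(data_kwargs):
    """Hypothetical helper: copy get_point_data arguments without data-only keys."""
    return {k: v for k, v in data_kwargs.items() if k not in DATA_ONLY_KEYS}


def fetch_data_and_metadata(data_kwargs):
    """Request observations and the matching site metadata with shared filters."""
    # Imported lazily so metadata_kwargs can be used without the package installed.
    from hf_hydrodata import point

    data_df = point.get_point_data(**data_kwargs)
    metadata_df = point.get_point_metadata(**metadata_kwargs(data_kwargs))
    return data_df, metadata_df


data_kwargs = {
    "dataset": "usgs_nwis",
    "variable": "water_table_depth",
    "temporal_resolution": "daily",
    "aggregation": "mean",
    "site_networks": "climate_response_network",
    "date_start": "2010-01-01",
    "date_end": "2010-12-31",
    "min_num_obs": 30,
}
meta_kwargs = metadata_kwargs(data_kwargs)
# data_df, metadata_df = fetch_data_and_metadata(data_kwargs)  # needs network access
```

Note that a min_num_obs filter applied only to the data request may leave the two results covering different sets of sites, so the outputs should be compared before they are combined.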
- hf_hydrodata.point.get_site_variables(*args, **kwargs)
Return DataFrame with available sites, variables, and the period of record.
- Parameters:
dataset (str, optional) -- Source from which requested data originated. Currently supported: 'usgs_nwis', 'snotel', 'scan', 'ameriflux', 'jasechko_2024', and 'fan_2013'.
variable (str, optional) -- Description of type of data requested. Currently supported: 'streamflow', 'water_table_depth', 'swe', 'precipitation', 'air_temp', 'soil_moisture', 'latent_heat', 'sensible_heat', 'downward_shortwave', 'downward_longwave', 'vapor_pressure_deficit', 'wind_speed'.
temporal_resolution (str, optional) -- Collection frequency of data requested. Currently supported: 'daily', 'hourly', 'instantaneous', 'yearly', and 'long_term'. Please see the documentation for allowable combinations with variable.
aggregation (str, optional) -- Additional information specifying the aggregation method for the variable to be returned. Options include descriptors such as 'mean' and 'sum'. Please see the documentation for allowable combinations with variable.
date_start (str, optional) -- A date provided as a string in 'YYYY-MM-DD' format. If provided, data will be returned from this date forward (inclusive).
date_end (str, optional) -- A date provided as a string in 'YYYY-MM-DD' format. If provided, data will be returned up through this date (inclusive).
latitude_range (tuple, optional) -- Latitude range bounds for the geographic domain; lesser value is provided first.
longitude_range (tuple, optional) -- Longitude range bounds for the geographic domain; lesser value is provided first.
grid (str, optional) -- Value of either 'conus1' or 'conus2'. Used in combination with parameter grid_bounds to extract site locations for a specific region of conus coordinates.
grid_bounds (list of integers, optional) -- A list of points [left, bottom, right, top] in ij grid coordinates of the grid supplied by the grid parameter.
site_ids (str or list of strings, optional) -- Single site ID string or list of desired (string) site identifiers.
state (str, optional) -- Two-letter postal code state abbreviation (example: state='NJ').
huc_id (str or list of strings, optional) -- Single HUC ID string or list of adjacent HUC ID strings.
polygon (str, optional) -- Path to location of shapefile. Must be readable by PyShp's shapefile.Reader().
polygon_crs (str, optional) -- CRS definition accepted by pyproj.CRS.from_user_input().
site_networks (str or list of strings, optional) -- Name(s) of site networks. Can be a string with a single network name, or a list of strings for multiple networks. Networks are currently supported for stream gages (dataset='usgs_nwis', variable='streamflow') and groundwater wells (dataset='usgs_nwis', variable='water_table_depth'). For streamflow, options include: 'gagesii', 'gagesii_reference', 'hcdn2009', 'camels', and 'nwm'. For water table depth, options include: 'climate_response_network'.
- Returns:
DataFrame unique by site_id and variable_name containing site- and variable-level metadata.
- Return type:
DataFrame
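Since every argument to get_site_variables is optional, a query can be assembled from whichever filters are known. The following sketch shows one way to combine the grid and grid_bounds parameters; grid_filter, site_variable_kwargs, and fetch_site_variables are hypothetical names, the ordering check (left not greater than right, bottom not greater than top) is an assumption rather than a documented rule, and the bounds used are arbitrary illustrative values.

```python
VALID_GRIDS = {"conus1", "conus2"}


def grid_filter(grid, grid_bounds):
    """Hypothetical helper: validate and package the grid and grid_bounds filters."""
    if grid not in VALID_GRIDS:
        raise ValueError("grid must be 'conus1' or 'conus2'")
    if len(grid_bounds) != 4 or not all(isinstance(v, int) for v in grid_bounds):
        raise ValueError("grid_bounds must be four integers [left, bottom, right, top]")
    left, bottom, right, top = grid_bounds
    # Assumed ordering; the section above does not state it explicitly.
    if left > right or bottom > top:
        raise ValueError("expected left <= right and bottom <= top")
    return {"grid": grid, "grid_bounds": list(grid_bounds)}


def site_variable_kwargs(**filters):
    """Drop filters left as None, since every argument is optional."""
    return {k: v for k, v in filters.items() if v is not None}


def fetch_site_variables(kwargs):
    """Call the library; imported lazily so the helpers above work offline."""
    from hf_hydrodata import point

    return point.get_site_variables(**kwargs)


query = site_variable_kwargs(
    dataset="snotel",
    variable=None,  # left unset to list every variable available at each site
    **grid_filter("conus1", [100, 100, 200, 200]),
)
# sites_df = fetch_site_variables(query)  # needs hf_hydrodata installed and network access
```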