hf_hydrodata.point module

Module to retrieve point observations.

hf_hydrodata.point.get_point_data(*args, **kwargs)

Collect point observations data into a Pandas DataFrame.

Observations are collected from HydroData for the specified data source, variable, temporal resolution, and aggregation. Optional arguments filter the results by date bounds, geographic bounds, and/or a minimum number of observations per site. See the package documentation for the full set of supported combinations.

Parameters:
  • dataset (str, required) -- Source from which requested data originated. Currently supported: 'usgs_nwis', 'snotel', 'scan', 'ameriflux', 'jasechko_2024', 'fan_2013'.

  • variable (str, required) -- Description of type of data requested. Currently supported: 'streamflow', 'water_table_depth', 'swe', 'precipitation', 'air_temp', 'soil_moisture', 'latent_heat', 'sensible_heat', 'downward_shortwave', 'downward_longwave', 'vapor_pressure_deficit', 'wind_speed'.

  • temporal_resolution (str, required) -- Collection frequency of data requested. Currently supported: 'daily', 'hourly', 'instantaneous', 'yearly', and 'long_term'. Please see the documentation for allowable combinations with variable.

  • aggregation (str, required) -- Additional information specifying the aggregation method for the variable to be returned. Options include descriptors such as 'mean' and 'sum'. Please see the documentation for allowable combinations with variable.

  • depth_level (int, optional) -- Depth level in inches at which the measurement is taken. Required when variable='soil_moisture'.

  • date_start (str, optional) -- A date provided as a string in 'YYYY-MM-DD' format. If provided, data will be returned from this date forward (inclusive).

  • date_end (str, optional) -- A date provided as a string in 'YYYY-MM-DD' format. If provided, data will be returned up through this date (inclusive).

  • latitude_range (tuple, optional) -- Latitude range bounds for the geographic domain; lesser value is provided first.

  • longitude_range (tuple, optional) -- Longitude range bounds for the geographic domain; lesser value is provided first.

  • grid (str, optional) -- Value of either 'conus1' or 'conus2'. Used in combination with parameter grid_bounds to extract site locations for a specific region of conus coordinates.

  • grid_bounds (list of integers, optional) -- A list of points [left, bottom, right, top] in ij grid coordinates of the grid supplied by the grid parameter.

  • site_ids (str or list of strings, optional) -- Single site ID string or list of desired (string) site identifiers.

  • state (str, optional) -- Two-letter postal code state abbreviation (example: state='NJ').

  • huc_id (str or list of strings, optional) -- Single HUC ID string or list of adjacent HUC ID strings.

  • polygon (str, optional) -- Path to location of shapefile. Must be readable by PyShp's shapefile.Reader().

  • polygon_crs (str, optional) -- CRS definition accepted by pyproj.CRS.from_user_input().

  • site_networks (str or list of strings, optional) -- Name(s) of site networks, given as a single network name string or a list of network name strings. Networks are currently supported for stream gages (dataset='usgs_nwis', variable='streamflow') and groundwater wells (dataset='usgs_nwis', variable='water_table_depth'). For streamflow, options include: 'gagesii', 'gagesii_reference', 'hcdn2009', 'camels', and 'nwm'. For water table depth, options include: 'climate_response_network'.

  • min_num_obs (int, optional) -- Minimum number of observations a site must have. If provided, data are returned only for sites with at least this many non-NaN observation records within the requested date range (if supplied).

Returns:

  • data_df (DataFrame) -- DataFrame with a column for each site_id satisfying the input filters. Rows span the date range requested with date_start and/or date_end, or the broadest range of data available for the returned sites if no date range is explicitly requested.

If the environment variable HUC_VERSION is set, the function uses the HUC boundaries for that dataset_version when a HUC is passed as an option. The versions 2025_06, 2025_01, and 2024_11 are supported; a blank value uses the latest HUC boundaries.
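As an illustration of how the parameters above fit together, the following is a minimal sketch rather than an official usage pattern. The helpers build_point_request and fetch_point_data are hypothetical names introduced here. The sketch assumes that get_point_data accepts the documented parameters as keyword arguments, that the combination 'usgs_nwis' / 'streamflow' / 'daily' / 'mean' is an allowable one, and that setting HUC_VERSION in the process environment before the call is sufficient for it to take effect.

```python
import os


def build_point_request(dataset, variable, temporal_resolution, aggregation, **filters):
    """Assemble keyword arguments for get_point_data (hypothetical helper)."""
    request = {
        "dataset": dataset,
        "variable": variable,
        "temporal_resolution": temporal_resolution,
        "aggregation": aggregation,
    }
    # Keep only the optional filters that were actually supplied.
    request.update({key: value for key, value in filters.items() if value is not None})
    return request


def fetch_point_data(request, huc_version=None):
    """Call get_point_data; needs hf_hydrodata installed and network access."""
    # Imported lazily so that build_point_request can be used offline.
    from hf_hydrodata import point

    if huc_version is not None:
        # Assumption: the variable is read at call time, so set it beforehand.
        os.environ["HUC_VERSION"] = huc_version
    return point.get_point_data(**request)


# Daily mean streamflow for New Jersey gages with at least 300 records in 2020.
request = build_point_request(
    "usgs_nwis",
    "streamflow",
    "daily",
    "mean",
    date_start="2020-01-01",
    date_end="2020-12-31",
    state="NJ",
    min_num_obs=300,
    huc_id=None,  # unset filters are dropped from the request
)
```

Calling fetch_point_data(request) would then return the DataFrame described under Returns. Passing huc_version="2025_01" together with a huc_id filter would pin the HUC boundaries to that version.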

hf_hydrodata.point.get_point_metadata(*args, **kwargs)

Return DataFrame with site metadata for the filtered sites.

Parameters:
  • dataset (str, required) -- Source from which requested data originated. Currently supported: 'usgs_nwis', 'snotel', 'scan', 'ameriflux', 'jasechko_2024', 'fan_2013'.

  • variable (str, required) -- Description of type of data requested. Currently supported: 'streamflow', 'water_table_depth', 'swe', 'precipitation', 'air_temp', 'soil_moisture', 'latent_heat', 'sensible_heat', 'downward_shortwave', 'downward_longwave', 'vapor_pressure_deficit', 'wind_speed'.

  • temporal_resolution (str, required) -- Collection frequency of data requested. Currently supported: 'daily', 'hourly', 'instantaneous', 'yearly', and 'long_term'. Please see the documentation for allowable combinations with variable.

  • aggregation (str, required) -- Additional information specifying the aggregation method for the variable to be returned. Options include descriptors such as 'mean' and 'sum'. Please see the documentation for allowable combinations with variable.

  • depth_level (int, optional) -- Depth level in inches at which the measurement is taken. Required when variable='soil_moisture'.

  • date_start (str, optional) -- A date provided as a string in 'YYYY-MM-DD' format. If provided, data will be returned from this date forward (inclusive).

  • date_end (str, optional) -- A date provided as a string in 'YYYY-MM-DD' format. If provided, data will be returned up through this date (inclusive).

  • latitude_range (tuple, optional) -- Latitude range bounds for the geographic domain; lesser value is provided first.

  • longitude_range (tuple, optional) -- Longitude range bounds for the geographic domain; lesser value is provided first.

  • grid (str, optional) -- Value of either 'conus1' or 'conus2'. Used in combination with parameter grid_bounds to extract site locations for a specific region of conus coordinates.

  • grid_bounds (list of integers, optional) -- A list of points [left, bottom, right, top] in ij grid coordinates of the grid supplied by the grid parameter.

  • site_ids (str or list of strings, optional) -- Single site ID string or list of desired (string) site identifiers.

  • state (str, optional) -- Two-letter postal code state abbreviation (example: state='NJ').

  • huc_id (str or list of strings, optional) -- Single HUC ID string or list of adjacent HUC ID strings.

  • polygon (str, optional) -- Path to location of shapefile. Must be readable by PyShp's shapefile.Reader().

  • polygon_crs (str, optional) -- CRS definition accepted by pyproj.CRS.from_user_input().

  • site_networks (str or list of strings, optional) -- Name(s) of site networks, given as a single network name string or a list of network name strings. Networks are currently supported for stream gages (dataset='usgs_nwis', variable='streamflow') and groundwater wells (dataset='usgs_nwis', variable='water_table_depth'). For streamflow, options include: 'gagesii', 'gagesii_reference', 'hcdn2009', 'camels', and 'nwm'. For water table depth, options include: 'climate_response_network'.

Returns:

DataFrame of site-level metadata for the sites satisfying the input filters.

Return type:

DataFrame
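The parameter list for get_point_metadata matches that of get_point_data except that min_num_obs is not listed. The sketch below shows one way to reuse a data request for a metadata lookup under that reading. The helpers to_metadata_request and fetch_point_metadata are hypothetical names introduced here, and the sketch assumes the function accepts the documented parameters as keyword arguments.

```python
def to_metadata_request(data_request):
    """Copy a get_point_data request, dropping keys not listed for get_point_metadata."""
    # Assumption from the parameter lists: only min_num_obs is absent.
    not_listed = {"min_num_obs"}
    return {key: value for key, value in data_request.items() if key not in not_listed}


def fetch_point_metadata(request):
    """Call get_point_metadata; needs hf_hydrodata installed and network access."""
    # Imported lazily so that to_metadata_request can be used offline.
    from hf_hydrodata import point

    return point.get_point_metadata(**request)


# A request as it might be passed to get_point_data.
data_request = {
    "dataset": "usgs_nwis",
    "variable": "streamflow",
    "temporal_resolution": "daily",
    "aggregation": "mean",
    "state": "NJ",
    "min_num_obs": 300,
}

# The same filters, minus the key that get_point_metadata does not document.
metadata_request = to_metadata_request(data_request)
```

Because min_num_obs is dropped, the metadata result may cover more sites than the corresponding get_point_data call returns.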

hf_hydrodata.point.get_site_variables(*args, **kwargs)

Return DataFrame with available sites, variables, and the period of record.

Parameters:
  • dataset (str, optional) -- Source from which requested data originated. Currently supported: 'usgs_nwis', 'snotel', 'scan', 'ameriflux', 'jasechko_2024', and 'fan_2013'.

  • variable (str, optional) -- Description of type of data requested. Currently supported: 'streamflow', 'water_table_depth', 'swe', 'precipitation', 'air_temp', 'soil_moisture', 'latent_heat', 'sensible_heat', 'downward_shortwave', 'downward_longwave', 'vapor_pressure_deficit', 'wind_speed'.

  • temporal_resolution (str, optional) -- Collection frequency of data requested. Currently supported: 'daily', 'hourly', 'instantaneous', 'yearly', and 'long_term'. Please see the documentation for allowable combinations with variable.

  • aggregation (str, optional) -- Additional information specifying the aggregation method for the variable to be returned. Options include descriptors such as 'mean' and 'sum'. Please see the documentation for allowable combinations with variable.

  • date_start (str, optional) -- A date provided as a string in 'YYYY-MM-DD' format. If provided, data will be returned from this date forward (inclusive).

  • date_end (str, optional) -- A date provided as a string in 'YYYY-MM-DD' format. If provided, data will be returned up through this date (inclusive).

  • latitude_range (tuple, optional) -- Latitude range bounds for the geographic domain; lesser value is provided first.

  • longitude_range (tuple, optional) -- Longitude range bounds for the geographic domain; lesser value is provided first.

  • grid (str, optional) -- Value of either 'conus1' or 'conus2'. Used in combination with parameter grid_bounds to extract site locations for a specific region of conus coordinates.

  • grid_bounds (list of integers, optional) -- A list of points [left, bottom, right, top] in ij grid coordinates of the grid supplied by the grid parameter.

  • site_ids (str or list of strings, optional) -- Single site ID string or list of desired (string) site identifiers.

  • state (str, optional) -- Two-letter postal code state abbreviation (example: state='NJ').

  • huc_id (str or list of strings, optional) -- Single HUC ID string or list of adjacent HUC ID strings.

  • polygon (str, optional) -- Path to location of shapefile. Must be readable by PyShp's shapefile.Reader().

  • polygon_crs (str, optional) -- CRS definition accepted by pyproj.CRS.from_user_input().

  • site_networks (str or list of strings, optional) -- Name(s) of site networks, given as a single network name string or a list of network name strings. Networks are currently supported for stream gages (dataset='usgs_nwis', variable='streamflow') and groundwater wells (dataset='usgs_nwis', variable='water_table_depth'). For streamflow, options include: 'gagesii', 'gagesii_reference', 'hcdn2009', 'camels', and 'nwm'. For water table depth, options include: 'climate_response_network'.

Returns:

DataFrame unique by site_id and variable_name containing site- and variable-level metadata.

Return type:

DataFrame
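Since every parameter of get_site_variables is optional, it can serve as a discovery step before requesting data. The sketch below builds a latitude/longitude filter with the lesser value first, as the range parameters require. The helpers bounding_box and list_site_variables are hypothetical names introduced here, the coordinates are arbitrary illustration values, and the sketch assumes the function accepts the documented parameters as keyword arguments.

```python
def bounding_box(lat_a, lat_b, lon_a, lon_b):
    """Return latitude/longitude range tuples ordered with the lesser value first."""
    return {
        "latitude_range": (min(lat_a, lat_b), max(lat_a, lat_b)),
        "longitude_range": (min(lon_a, lon_b), max(lon_a, lon_b)),
    }


def list_site_variables(**filters):
    """Call get_site_variables; needs hf_hydrodata installed and network access."""
    # Imported lazily so that bounding_box can be used offline.
    from hf_hydrodata import point

    return point.get_site_variables(**filters)


# Corner coordinates given in arbitrary order; bounding_box sorts each pair.
filters = {"variable": "swe", **bounding_box(41.0, 39.0, -105.0, -107.0)}
```

Calling list_site_variables(**filters) would then return the sites and periods of record for 'swe' within the box, which can guide the choice of dataset and date range for a later data request.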