hf_hydrodata.point module

Module to retrieve point observations.

hf_hydrodata.point.get_point_data(*args, **kwargs)

Collect point observations data into a Pandas DataFrame.

Observations are collected from HydroData for the specified data source, variable, temporal resolution, and aggregation. Optional arguments can be supplied to filter by date bounds, geographic bounds, and the minimum number of observations per site. Please see the package documentation for the full set of supported combinations.

Parameters:
  • dataset (str, required) -- Source from which requested data originated. Currently supported: 'usgs_nwis', 'snotel', 'scan', 'ameriflux'.

  • variable (str, required) -- Description of type of data requested. Currently supported: 'streamflow', 'water_table_depth', 'swe', 'precipitation', 'air_temp', 'soil_moisture', 'latent_heat', 'sensible_heat', 'downward_shortwave', 'downward_longwave', 'vapor_pressure_deficit', 'wind_speed'.

  • temporal_resolution (str, required) -- Collection frequency of data requested. Currently supported: 'daily', 'hourly', and 'instantaneous'. Please see the documentation for allowable combinations with variable.

  • aggregation (str, required) -- Additional information specifying the aggregation method for the variable to be returned. Options include descriptors such as 'mean' and 'sum'. Please see the documentation for allowable combinations with variable.

  • depth_level (int, optional) -- Depth level in inches at which the measurement is taken. Necessary for variable = 'soil_moisture'.

  • date_start (str, optional) -- A date provided as a string in 'YYYY-MM-DD' format. If provided, data will be returned from this date forward (inclusive).

  • date_end (str, optional) -- A date provided as a string in 'YYYY-MM-DD' format. If provided, data will be returned up through this date (inclusive).

  • latitude_range (tuple, optional) -- Latitude range bounds for the geographic domain; lesser value is provided first.

  • longitude_range (tuple, optional) -- Longitude range bounds for the geographic domain; lesser value is provided first.

  • site_ids (str or list of strings, optional) -- Single site ID string or list of desired (string) site identifiers.

  • state (str, optional) -- Two-letter postal code state abbreviation (example: state='NJ').

  • polygon (str, optional) -- Path to location of shapefile. Must be readable by PyShp's shapefile.Reader().

  • polygon_crs (str, optional) -- CRS definition accepted by pyproj.CRS.from_user_input().

  • site_networks (str or list of strings, optional) -- Name(s) of site networks, given as a single network name string or a list of network name strings. Networks are currently supported for stream gages (dataset='usgs_nwis', variable='streamflow') and groundwater wells (dataset='usgs_nwis', variable='water_table_depth'). For streamflow, options include: 'gagesii', 'gagesii_reference', 'hcdn2009', and 'camels'. For water table depth, options include: 'climate_response_network'.

  • min_num_obs (int, optional) -- Minimum number of observations a site must have. If provided, data will be returned only for sites that have at least this number of non-NaN observation records within the requested date range (if supplied).
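The min_num_obs filter described above can be pictured with a small standalone sketch. It does not call hf_hydrodata; the function name, the dict-of-lists input, and the values are all hypothetical stand-ins for per-site observation records, and the library's own implementation may differ.

```python
import math


def sites_meeting_min_obs(observations, min_num_obs):
    """Return the site IDs that have at least min_num_obs non-NaN values.

    `observations` is a hypothetical dict mapping site_id -> list of floats,
    standing in for the per-site records in the requested date range.
    """
    kept = []
    for site_id, values in observations.items():
        # Count only records that are not NaN, as the parameter description states.
        n_valid = sum(1 for v in values if not math.isnan(v))
        if n_valid >= min_num_obs:
            kept.append(site_id)
    return kept


# Made-up values, for illustration only.
example = {
    "site_a": [1.2, float("nan"), 1.4, 1.5],                      # 3 valid records
    "site_b": [float("nan"), float("nan"), 0.3, float("nan")],    # 1 valid record
}
print(sites_meeting_min_obs(example, min_num_obs=3))  # → ['site_a']
```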

Returns:

data_df -- DataFrame with columns for each site_id satisfying the input filters. Rows cover the date range requested via date_start and/or date_end, or the broadest range of data available for the returned sites if no date range is explicitly requested.

Return type:

DataFrame
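As a minimal sketch of how the parameters above might be assembled, the block below builds a keyword-argument dict and checks it against the values listed in this reference. build_point_request and fetch_point_data are hypothetical helpers, not part of hf_hydrodata, and passing the parameters as keyword arguments is an assumption based on the documented names (the signature shown is only *args, **kwargs). The actual call needs the package installed, network access, and whatever credentials the service requires, so it is left commented out.

```python
# Values copied from the parameter descriptions in this reference.
SUPPORTED_DATASETS = {"usgs_nwis", "snotel", "scan", "ameriflux"}
SUPPORTED_RESOLUTIONS = {"daily", "hourly", "instantaneous"}


def build_point_request(dataset, variable, temporal_resolution, aggregation, **filters):
    """Assemble keyword arguments for get_point_data (hypothetical helper)."""
    if dataset not in SUPPORTED_DATASETS:
        raise ValueError(f"unsupported dataset: {dataset!r}")
    if temporal_resolution not in SUPPORTED_RESOLUTIONS:
        raise ValueError(f"unsupported temporal_resolution: {temporal_resolution!r}")
    # depth_level is documented as necessary for soil moisture requests.
    if variable == "soil_moisture" and "depth_level" not in filters:
        raise ValueError("depth_level is needed when variable='soil_moisture'")
    params = {
        "dataset": dataset,
        "variable": variable,
        "temporal_resolution": temporal_resolution,
        "aggregation": aggregation,
    }
    params.update(filters)  # optional filters such as dates, state, min_num_obs
    return params


def fetch_point_data(params):
    """Call the library with the assembled parameters (assumed keyword usage)."""
    from hf_hydrodata.point import get_point_data  # imported lazily on purpose
    return get_point_data(**params)


request = build_point_request(
    "usgs_nwis", "streamflow", "daily", "mean",
    date_start="2022-01-01", date_end="2022-01-31",
    state="NJ", min_num_obs=20,
)
print(request)
# data_df = fetch_point_data(request)  # requires hf_hydrodata and network access
```

Validating before the call is only a convenience here; the allowable combinations of variable, temporal_resolution, and aggregation are defined by the package documentation, not by this sketch.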

hf_hydrodata.point.get_point_metadata(*args, **kwargs)

Return DataFrame with site metadata for the filtered sites.

Parameters:
  • dataset (str, required) -- Source from which requested data originated. Currently supported: 'usgs_nwis', 'snotel', 'scan', 'ameriflux'.

  • variable (str, required) -- Description of type of data requested. Currently supported: 'streamflow', 'water_table_depth', 'swe', 'precipitation', 'air_temp', 'soil_moisture', 'latent_heat', 'sensible_heat', 'downward_shortwave', 'downward_longwave', 'vapor_pressure_deficit', 'wind_speed'.

  • temporal_resolution (str, required) -- Collection frequency of data requested. Currently supported: 'daily', 'hourly', and 'instantaneous'. Please see the documentation for allowable combinations with variable.

  • aggregation (str, required) -- Additional information specifying the aggregation method for the variable to be returned. Options include descriptors such as 'mean' and 'sum'. Please see the documentation for allowable combinations with variable.

  • depth_level (int, optional) -- Depth level in inches at which the measurement is taken. Necessary for variable = 'soil_moisture'.

  • date_start (str, optional) -- A date provided as a string in 'YYYY-MM-DD' format. If provided, data will be returned from this date forward (inclusive).

  • date_end (str, optional) -- A date provided as a string in 'YYYY-MM-DD' format. If provided, data will be returned up through this date (inclusive).

  • latitude_range (tuple, optional) -- Latitude range bounds for the geographic domain; lesser value is provided first.

  • longitude_range (tuple, optional) -- Longitude range bounds for the geographic domain; lesser value is provided first.

  • site_ids (str or list of strings, optional) -- Single site ID string or list of desired (string) site identifiers.

  • state (str, optional) -- Two-letter postal code state abbreviation (example: state='NJ').

  • polygon (str, optional) -- Path to location of shapefile. Must be readable by PyShp's shapefile.Reader().

  • polygon_crs (str, optional) -- CRS definition accepted by pyproj.CRS.from_user_input().

  • site_networks (str or list of strings, optional) -- Name(s) of site networks, given as a single network name string or a list of network name strings. Networks are currently supported for stream gages (dataset='usgs_nwis', variable='streamflow') and groundwater wells (dataset='usgs_nwis', variable='water_table_depth'). For streamflow, options include: 'gagesii', 'gagesii_reference', 'hcdn2009', and 'camels'. For water table depth, options include: 'climate_response_network'.

Returns:

DataFrame of site-level metadata.

Return type:

DataFrame
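get_point_metadata lists the same filters as get_point_data except min_num_obs, so one way to fetch metadata for the same sites is to reuse the data request with that key removed. The sketch below assumes keyword-argument usage; to_metadata_request and fetch_point_metadata are hypothetical helpers, and whether the function tolerates an extra min_num_obs argument is not stated in this reference.

```python
def to_metadata_request(data_request):
    """Copy a get_point_data request, dropping min_num_obs (hypothetical helper).

    min_num_obs is not among the documented get_point_metadata parameters.
    """
    return {key: value for key, value in data_request.items() if key != "min_num_obs"}


def fetch_point_metadata(params):
    """Call the library (needs hf_hydrodata installed and network access)."""
    from hf_hydrodata.point import get_point_metadata  # imported lazily on purpose
    return get_point_metadata(**params)


# Illustrative request; the values are examples taken from this reference.
data_request = {
    "dataset": "usgs_nwis",
    "variable": "streamflow",
    "temporal_resolution": "daily",
    "aggregation": "mean",
    "state": "NJ",
    "min_num_obs": 20,
}
metadata_request = to_metadata_request(data_request)
print(sorted(metadata_request))
# metadata_df = fetch_point_metadata(metadata_request)  # requires network access
```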

hf_hydrodata.point.get_site_variables(*args, **kwargs)

Return DataFrame with available sites, variables, and the period of record.

Parameters:
  • dataset (str, optional) -- Source from which requested data originated. Currently supported: 'usgs_nwis', 'snotel', 'scan', 'ameriflux'.

  • variable (str, required) -- Description of type of data requested. Currently supported: 'streamflow', 'water_table_depth', 'swe', 'precipitation', 'air_temp', 'soil_moisture', 'latent_heat', 'sensible_heat', 'downward_shortwave', 'downward_longwave', 'vapor_pressure_deficit', 'wind_speed'.

  • temporal_resolution (str, optional) -- Collection frequency of data requested. Currently supported: 'daily', 'hourly', and 'instantaneous'. Please see the documentation for allowable combinations with variable.

  • aggregation (str, optional) -- Additional information specifying the aggregation method for the variable to be returned. Options include descriptors such as 'mean' and 'sum'. Please see the documentation for allowable combinations with variable.

  • date_start (str, optional) -- A date provided as a string in 'YYYY-MM-DD' format. If provided, data will be returned from this date forward (inclusive).

  • date_end (str, optional) -- A date provided as a string in 'YYYY-MM-DD' format. If provided, data will be returned up through this date (inclusive).

  • latitude_range (tuple, optional) -- Latitude range bounds for the geographic domain; lesser value is provided first.

  • longitude_range (tuple, optional) -- Longitude range bounds for the geographic domain; lesser value is provided first.

  • site_ids (str or list of strings, optional) -- Single site ID string or list of desired (string) site identifiers.

  • state (str, optional) -- Two-letter postal code state abbreviation (example: state='NJ').

  • polygon (str, optional) -- Path to location of shapefile. Must be readable by PyShp's shapefile.Reader().

  • polygon_crs (str, optional) -- CRS definition accepted by pyproj.CRS.from_user_input().

  • site_networks (str or list of strings, optional) -- Name(s) of site networks, given as a single network name string or a list of network name strings. Networks are currently supported for stream gages (dataset='usgs_nwis', variable='streamflow') and groundwater wells (dataset='usgs_nwis', variable='water_table_depth'). For streamflow, options include: 'gagesii', 'gagesii_reference', 'hcdn2009', and 'camels'. For water table depth, options include: 'climate_response_network'.

Returns:

DataFrame unique by site_id and variable_name containing site- and variable-level metadata.

Return type:

DataFrame
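The range parameters expect the lesser value first, which is easy to get wrong for negative longitudes. The sketch below shows one way to normalize a bounding box before querying site availability. ordered_range and fetch_site_variables are hypothetical helpers, the coordinates are arbitrary illustration values, and keyword-argument usage is assumed.

```python
def ordered_range(a, b):
    """Return (lesser, greater), the order the range parameters expect."""
    return (min(a, b), max(a, b))


def fetch_site_variables(params):
    """Call the library (needs hf_hydrodata installed and network access)."""
    from hf_hydrodata.point import get_site_variables  # imported lazily on purpose
    return get_site_variables(**params)


# Arbitrary bounding box, deliberately given in the "wrong" order.
site_request = {
    "variable": "streamflow",
    "latitude_range": ordered_range(41.4, 38.9),
    "longitude_range": ordered_range(-73.9, -75.6),
}
print(site_request["latitude_range"])   # → (38.9, 41.4)
print(site_request["longitude_range"])  # → (-75.6, -73.9)
# sites_df = fetch_site_variables(site_request)  # requires network access
```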