HT006
Stream temperature at core phenology stream sites in the Andrews Experimental Forest, 2009 - 2014

CREATOR(S): Sherri L. Johnson
PRINCIPAL INVESTIGATOR(S): Sherri L. Johnson
ORIGINATOR(S): Sherri L. Johnson
OTHER RESEARCHER(S): Judith L. Li, Mark D Schulze, Ivan Arismendi, Christina Murphy
METADATA CREATION DATE:
17 Dec 2012
MOST RECENT METADATA REVIEW DATE:
23 Jun 2017
KEYWORDS:
Populations, populations, phenology, water temperature, forests, streams
PURPOSE:
These stream temperature data were collected to support phenological research focused on increasing our understanding of factors that influence instream insect responses across the Andrews Experimental Forest. Species, taxa and communities within and across trophic levels are likely responding differently to thermal conditions.
METHODS:
Experimental Design - HT006:
Description:

Six stream phenology sites were selected to evaluate the variability of springtime aquatic insect emergence and phenology (SA025) across the Andrews Forest. Stream temperature sensors were installed in these six streams, and temperatures were measured year round so that thermal accumulation and degree days could be calculated.
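The thermal accumulation mentioned above is commonly expressed as cumulative degree days. The following is a minimal sketch, not the method used for this dataset: it assumes daily mean temperatures as input and a hypothetical base temperature of 0 deg C (this record does not state a base temperature). The function name is illustrative only.

```python
def cumulative_degree_days(daily_means, base_temp=0.0):
    """Return the running total of degree days above base_temp.

    base_temp is an assumption for illustration; the dataset
    documentation does not specify one.
    """
    total = 0.0
    running = []
    for temp in daily_means:
        # Days at or below the base temperature contribute nothing.
        total += max(temp - base_temp, 0.0)
        running.append(total)
    return running
```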

At low elevation, a gaged stream through old growth forest and a stream through a young forest were selected. At high elevation, a similar pair of gaged streams was selected. Prior information on insect community composition and seasonal emergence (SA022) was used to select these sites.

Two additional sites were selected. One was a cold water stream whose hydrology and thermal regime are greatly affected by a headwater spring. The final site was located near the cold water spring, in a stream that goes dry in late summer.

Field Methods - HT006:
Description: Temperature sensors (Onset Hobo U22-001, accuracy 0.2 C) were programmed to record instantaneous temperatures at 30 min intervals, placed in protective flow-through housings, and secured in the streams. Data were downloaded at least twice per year.
Statistics - HT006:
Description:

The high resolution temperatures (15 and 20 minute intervals) were evaluated and questionable data were flagged. Before hourly averages were calculated, flagged data were removed and missing values were calculated using regression relationships with other temperature data from Andrews Forest for that period.

Script name: WaTeR.py

This script was designed to flag, clean, average by hour, and fill data from air temperature sensors deployed at the HJ Andrews.

Datasets: The script currently creates folders: flagged, cleaned, reference, and filled. These contain:

  • flagged: original data with flags
  • cleaned: data with the flagged records removed
  • reference: cleaned data averaged into hourly timesteps
  • filled: cleaned data which has been filled using other files in the reference folder based on their respective regressions
  • summary: daily values with min, max and mean

Requires: Python, SciPy and NumPy

Note: This program was written for data loggers started during June (daylight saving time). Since ONSET uses the computer clock for time stamps, raw times are in PDT, not PST. Hourly averaging changes times to PST and matches the reference file format (where the hour represents the average of temperatures in the preceding hour). If original logger start dates are not in PDT, there is a line of code (currently 313) which can be turned on.

Settings: The script may be run for a file or a folder, which should be located in the same directory as the script. Input the file or folder name (e.g. INPUTFOLDER="Folder") and comment out the unused line (e.g. #INPUTFILE). Input file names should contain the site name, as this name is retained through processing. Date limits should be specified under the '#Date limits' heading. These form the bookends within which the program will attempt to fill gaps using the available reference files. Reference files are stored together in a folder (e.g. REFERENCE_DIR = "RS data for PC sites") and are labeled with *reformatted* to distinguish them from reference files that have not been modified to match the required input format. Sites are added to the reference folder as they are run, so a batch should be run twice if its sites are not already in the reference folder and you want them used as reference files for other sites in the same batch. To avoid using processed data, site files will not be used if they have *cleaned*, *filled* or *flagged* in their file name. Reference files must be in the correct format and include *reformatted* in their filename or they will not be used. These labels are consistent with the output files from this script as well as from the script that converts reference data downloaded from the Andrews website (convert_reference_data.py).
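The file-name rules described in Settings can be sketched as two small predicates. This is an illustration only, not code taken from WaTeR.py: the function names are hypothetical, and simple substring matching on the file name is an assumption.

```python
# Labels that mark already-processed site files (from the Settings text).
PROCESSED_LABELS = ("cleaned", "filled", "flagged")


def is_usable_site_file(filename):
    """Site files are skipped if their name carries a processed-data label."""
    return not any(label in filename for label in PROCESSED_LABELS)


def is_usable_reference_file(filename):
    """Reference files are used only if their name includes 'reformatted'."""
    return "reformatted" in filename
```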

Description: This script serves to flag, prune, average (by hour), and fill air temperature data as detailed below.

Step 1: Flagging (Original time steps); Output file – (input file name)_flagged_00-0000.csv, where 00-0000 is the month-year of the last data point

Flagging identifies for each line (date/time) entry:

  • nodata – Date/time recorded on the logger but contains no data.
  • Interval does not equal 15 or 20 minutes – The time between samples was not as programmed.
  • extreme – Any temperature exceeding 20 deg C or less than -20 deg C (outside of the sensor range).
  • jump – The change in temperature exceeds 5 deg C in one time interval.
  • air – A forward rolling window; flags when the variation in temperatures within the 24 hr period is greater than TVAR_MAX (specified at the opening of the code, currently 1.5 deg C) and the temperatures are below TMIN (currently 0.2 deg C).
  • air_past – Looks back and flags if a 'Snow' flag is present in the past 24 hours.
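The simpler per-record checks above can be sketched as follows. This is a hedged illustration rather than the code in WaTeR.py: the function name and record layout are hypothetical, the label "interval" is an assumed name for the sampling-interval flag, and the rolling-window air and air_past checks are omitted.

```python
from datetime import datetime, timedelta

ALLOWED_INTERVALS = (timedelta(minutes=15), timedelta(minutes=20))
T_EXTREME = 20.0  # deg C, threshold quoted in the text
T_JUMP = 5.0      # deg C per time step, threshold quoted in the text


def flag_records(records):
    """Flag a time-sorted list of (datetime, temperature or None) pairs.

    Returns one list of flag names per record. The "interval" label is an
    assumption; the other names follow the text.
    """
    flags = []
    prev_time, prev_temp = None, None
    for when, temp in records:
        row = []
        if temp is None:
            row.append("nodata")
        if prev_time is not None and (when - prev_time) not in ALLOWED_INTERVALS:
            row.append("interval")
        if temp is not None:
            if temp > T_EXTREME or temp < -T_EXTREME:
                row.append("extreme")
            # Compare only against the immediately preceding record.
            if prev_temp is not None and abs(temp - prev_temp) > T_JUMP:
                row.append("jump")
        prev_time, prev_temp = when, temp
        flags.append(row)
    return flags
```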

Step 2: Pruning (Original time step); Output file – (input file name)_cleaned_00-0000.csv

Pruning removes lines containing extreme, air_past, air, jump and nodata.
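Pruning amounts to a filter on the flag column. A minimal sketch, assuming each record is a dictionary with a hypothetical 'flags' entry holding that record's flag names (the actual file layout is not described here):

```python
# Flags whose presence removes a record, as listed in the text.
PRUNE_FLAGS = {"extreme", "air_past", "air", "jump", "nodata"}


def prune(rows):
    """Keep only rows that carry none of the pruning flags."""
    return [row for row in rows if not PRUNE_FLAGS.intersection(row["flags"])]
```

Under this reading, a record flagged only for an irregular sampling interval is retained, since that flag is not in the pruning list above.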

Step 3: Averaging (Hourly time step); Output file – (input file name)_00-0000_reformatted.csv

Averaging uses only values remaining after pruning. The number of values used to calculate the average is included as a new column. Notes: The command for saving this output file includes the path for the reference folder. If that folder is changed, the path should also be changed in this section; otherwise a new folder will be made with the files, but they will not be used for filling. Averaging follows the convention used for Andrews weather stations, where the hour represents the average of temperatures in the preceding hour. The output is in PST (while all previous outputs are in PDT, matching the raw input).
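The preceding-hour convention and the PDT to PST shift can be sketched as below. This is an illustration under stated assumptions, not the script's own code: the function name is hypothetical, timestamps are naive datetimes in PDT, and a reading that falls exactly on the hour is assigned to the hour ending at that time.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def hourly_means_pst(records_pdt):
    """Average (datetime in PDT, temperature) pairs into hourly values.

    Returns {hour_label_pst: (mean, count)}, where each label marks the END
    of the averaged hour. Treating on-the-hour readings as belonging to the
    hour that ends at that time is an assumption.
    """
    buckets = defaultdict(list)
    for when_pdt, temp in records_pdt:
        when_pst = when_pdt - timedelta(hours=1)  # PST is one hour behind PDT
        floor = when_pst.replace(minute=0, second=0, microsecond=0)
        label = floor if when_pst == floor else floor + timedelta(hours=1)
        buckets[label].append(temp)
    return {
        label: (sum(values) / len(values), len(values))
        for label, values in sorted(buckets.items())
    }
```

For example, readings stamped 13:15, 13:30, 13:45 and 14:00 PDT all contribute to the value labeled 13:00 PST.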

Step 4: Filling (Hourly time step); Output file – (input file name)_filled_00-0000.csv

The script uses cleaned data and compares the remaining entries to reference files (see Settings). This is done as a linear regression of the cleaned data against each reference file. The output includes the R², which can be found in the text file corresponding to the input file name. Prior to filling, the script creates placeholder hours bounded by the date range specified under “Date limits”, which is the range in which filling is attempted. These placeholder values are set at 1000 degrees. The script aims to fill missing (1000 degree) data by moving sequentially through the reference data in order of fit (R²). The linear regression equation is applied to the reference value for that data point, and the result replaces the 1000 degree placeholder. The reference file used to fill each temperature value is listed in a neighboring column. If all reference files are examined and no data are found to replace the missing value placeholder, the placeholder is retained; thus 1000 degrees should be treated as "no data".
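The filling logic can be sketched in plain Python as follows. This is a simplified illustration, not the script itself (which relies on SciPy and NumPy): the function names and dictionary layout are hypothetical, the regression is ordinary least squares written out by hand, and skipping references with fewer than three shared hours is an assumption.

```python
PLACEHOLDER = 1000.0  # "no data" marker described in the text


def linear_fit(x, y):
    """Least-squares fit y = slope * x + intercept; returns (slope, intercept, r2)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x, (sxy * sxy) / (sxx * syy)


def fill_gaps(site, references):
    """Fill placeholder hours in site ({hour: temp}) from references
    ({name: {hour: temp}}). Returns {hour: (temp, reference name or None)}.
    """
    fits = []
    for name, ref in references.items():
        shared = [h for h, t in site.items() if t != PLACEHOLDER and h in ref]
        x = [ref[h] for h in shared]
        y = [site[h] for h in shared]
        # Skip references with too little overlap or a constant series.
        if len(shared) < 3 or len(set(x)) < 2 or len(set(y)) < 2:
            continue
        slope, intercept, r2 = linear_fit(x, y)
        fits.append((r2, name, slope, intercept))
    fits.sort(reverse=True)  # best fit (highest R2) first

    filled = {}
    for hour, temp in site.items():
        filled[hour] = (temp, None)  # placeholder is retained if nothing fits
        if temp != PLACEHOLDER:
            continue
        for _r2, name, slope, intercept in fits:
            if hour in references[name]:
                filled[hour] = (slope * references[name][hour] + intercept, name)
                break
    return filled
```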

Step 5: Max, min, mean (Daily time step); Output file – (input file name)_daily_00-0000.csv

The script ignores 1000 degree data and calculates daily max, min and mean temperature values from the filled dataset. The number of records (hours) used in the calculation is listed in the column 'count.'
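A minimal sketch of the daily summary, assuming hourly values keyed by datetime and grouping by the calendar date of each hour label (the function name and the grouping rule are assumptions, not taken from the script):

```python
from collections import defaultdict
from datetime import datetime

PLACEHOLDER = 1000.0  # "no data" marker, excluded from the statistics


def daily_summary(hourly):
    """Summarize {datetime: temp} into {date: {min, max, mean, count}}."""
    by_day = defaultdict(list)
    for when, temp in hourly.items():
        if temp == PLACEHOLDER:
            continue  # unfilled hours do not contribute
        by_day[when.date()].append(temp)
    return {
        day: {
            "min": min(values),
            "max": max(values),
            "mean": sum(values) / len(values),
            "count": len(values),
        }
        for day, values in sorted(by_day.items())
    }
```

A day whose hours are all placeholders does not appear in the result at all under this sketch.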

Figures: Data points over time with flagged (and cleaned) data shown in red. Located within the "flagged" folder as .pdfs.

Quality Assurance - HT006:
Description:

After sensors were downloaded, high resolution data were put through a series of programs for quality control and for filling missing values before the hourly averages were generated. A major concern of quality control was to detect when the sensors were out of the stream and recording air temperatures. When data anomalies or gaps were found, data were filled using regression relationships with other sensors from this project and with stream temperature data from stream gages. Regressions were calculated using the best fit with other sensors during periods when the full data were available.

After the averaging and filling, data were also visually evaluated for outliers. If outliers were found, estimates were calculated and inserted.

SITE DESCRIPTION:
Andrews Phenology sites
TAXONOMIC SYSTEM:
None
GEOGRAPHIC EXTENT:
HJ Andrews Experimental Forest streams - phenology emergence sites
ELEVATION_MINIMUM (meters):
457
ELEVATION_MAXIMUM (meters):
1000
MEASUREMENT FREQUENCY:
30 minutes
PROGRESS DESCRIPTION:
Complete
UPDATE FREQUENCY DESCRIPTION:
notPlanned
CURRENTNESS REFERENCE:
Observed