Daily maximum and minimum temperature values for entity 1 are generated from the raw 15-minute HOBO data files. The data are subjected to a series of QC checks: range, rate of change, sufficient number of observations, standard deviation, and multi-day persistence. For the range check, values are compared to the state monthly record. If a value falls outside a 2.8 degree C buffer around the record, the value is removed from further consideration. For the rate of change check, if the current temperature is not missing and is within one hour of the previous valid temperature, the difference between the two temperatures is calculated. If the difference exceeds 10 degrees C, the current temperature is removed from further consideration. At this point, the maximum and minimum of all non-missing temperatures for the day become the daily maximum and minimum temperatures. For the sufficient-observations check, if there are valid observations in fewer than 18 of the 24 hours in the day, the daily maximum and minimum temperatures are flagged as having an insufficient number of hourly observations to generate a valid daily observation. The standard deviation of all observations for the day is then calculated; if it is less than 0.1 degrees C, the daily observations are flagged as having too little variation over the day to be considered valid. Finally, the daily observations are checked for multi-day persistence. The difference between successive daily observations is calculated, and if the difference is 0.4 degrees C or less for more than 10 successive observations, all observations in the sequence (not already flagged by previous checks) are flagged as having too many days with the same value. No more than two missing days are allowed between successive observations before the sequence is reset. The 0.4 degree C threshold was chosen to identify occurrences of drift in automated sensors.
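The observation-level and daily checks described above (everything before the multi-day persistence test) can be sketched in Python as follows. This is a minimal illustration and not the code used to produce entity 1. The function names, the flag labels, and the record_low and record_high arguments (standing in for the state monthly record) are hypothetical. The text does not say whether "within one hour" is inclusive or whether the standard deviation is a sample or population statistic, so the choices below (inclusive, population) are assumptions.

```python
from datetime import datetime, timedelta
from statistics import pstdev

# Thresholds taken from the description above.
RANGE_BUFFER_C = 2.8               # buffer around the state monthly record
MAX_STEP_C = 10.0                  # largest allowed change between observations
MAX_STEP_GAP = timedelta(hours=1)  # rate check applies only within this gap
MIN_HOURS = 18                     # hours of the day that need valid data
MIN_STDEV_C = 0.1                  # minimum within-day standard deviation


def screen_observations(obs, record_low, record_high):
    """Apply the range and rate-of-change checks to one day of data.

    obs is a time-sorted list of (datetime, temperature or None).
    record_low and record_high are hypothetical stand-ins for the state
    monthly record. Returns the observations that pass both checks.
    """
    valid = []
    prev = None  # last observation that passed both checks
    for t, temp in obs:
        if temp is None:
            continue
        # Range check: drop values outside the record plus buffer.
        if temp < record_low - RANGE_BUFFER_C or temp > record_high + RANGE_BUFFER_C:
            continue
        # Rate-of-change check against the previous valid value
        # (assumption: "within one hour" is treated as inclusive).
        if (prev is not None and t - prev[0] <= MAX_STEP_GAP
                and abs(temp - prev[1]) > MAX_STEP_C):
            continue
        valid.append((t, temp))
        prev = (t, temp)
    return valid


def daily_summary(valid):
    """Compute daily max/min and flags from screened observations."""
    if not valid:
        return None
    temps = [temp for _, temp in valid]
    hours_with_data = {t.hour for t, _ in valid}
    flags = []
    if len(hours_with_data) < MIN_HOURS:
        flags.append("insufficient_hours")  # hypothetical flag label
    # Assumption: population standard deviation; the text does not specify.
    if pstdev(temps) < MIN_STDEV_C:
        flags.append("low_variation")       # hypothetical flag label
    return {"tmax": max(temps), "tmin": min(temps), "flags": flags}
```

In this sketch a day is processed by passing its 15-minute records to screen_observations and the result to daily_summary. Note that removed observations are dropped, whereas the daily values are only flagged, which mirrors the distinction made in the text.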
Maximum and minimum temperatures are tested for multi-day persistence independently to account for issues that affect only one end of the temperature range, or where maximum and minimum temperatures are measured independently.
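The multi-day persistence check can be sketched in Python as below. This is an illustrative reading of the description, not the production code. The function name and arguments are hypothetical. Two interpretations are assumed: "more than 10 successive observations" is taken to mean a run of more than 10 observations, and observations already flagged by earlier checks still take part in a run but are not flagged again.

```python
PERSIST_TOL_C = 0.4    # successive values this close count as "the same"
MIN_RUN = 10           # runs with more than this many observations are flagged
MAX_MISSING_DAYS = 2   # a longer gap resets the sequence


def persistence_flags(values, already_flagged=frozenset()):
    """Return the indices of daily values flagged for persistence.

    values holds one entry per consecutive day (float, or None if missing).
    already_flagged holds indices flagged by earlier checks; they are left
    out of the result (assumption: they still count toward the run).
    """
    flagged = set()

    def close(run):
        # Flag a finished run only if it is long enough.
        if len(run) > MIN_RUN:
            flagged.update(i for i in run if i not in already_flagged)

    run = []
    for i, value in enumerate(values):
        if value is None:
            continue
        if run:
            missing_days = i - run[-1] - 1
            same = abs(value - values[run[-1]]) <= PERSIST_TOL_C
            if missing_days <= MAX_MISSING_DAYS and same:
                run.append(i)
                continue
            close(run)
        run = [i]  # start a new sequence at this observation
    close(run)
    return flagged
```

Because maximum and minimum temperatures are tested independently, the function would be called once on the series of daily maxima and once on the series of daily minima.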
The QC of the 15-minute data in entity 2 was a one-time process to create this dataset for publication and is separate from the processing of data in entity 1. The raw 15-minute data are first run through an R program that concatenates the separate files from the HOBO datalogger into a single, uniformly formatted time series, with time gaps filled with blank records. The concatenated data are then run through a Python program (hja_hobo_clean) created for quality assurance and quality control of HOBO temperature sensor data collected at the Andrews Forest. The program flags data based on user-specified parameters. The program can also remove flagged data (clean) and fill in missing data with regressed data from other sites, but these procedures were not used for these data. Visual inspection of the data was also done, but only for sites on the Lower Lookout (LL) and Upper Lookout (UL) transects. Visual inspection consisted of comparing all sensors within a transect to identify suspect data (e.g., a sensor whose readings were clearly inconsistent with others and with its own previous patterns). For all sites, field notes were also used to identify data collected when a sensor may have been on the ground and/or its radiation shield was broken. Lastly, data are run through a second R program that flags data based on results from the visual inspection and field notes.
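The concatenation step, in which separate logger files are merged into one series with blank records for time gaps, can be illustrated with the Python sketch below. The actual step was done with an R program that is not reproduced here; this is only a sketch of the idea. The function name is hypothetical, and it is assumed that timestamps fall exactly on a 15-minute grid and that a later file overrides an earlier one where timestamps overlap.

```python
from datetime import datetime, timedelta

STEP = timedelta(minutes=15)  # logging interval of the HOBO data


def concatenate_with_gaps(chunks):
    """Merge records from several files into one gap-filled series.

    chunks is an iterable of lists of (datetime, temperature), one list
    per logger file. Returns a list of (datetime, temperature or None)
    covering every 15-minute step from the first to the last record.
    """
    merged = {}
    for chunk in chunks:
        for t, temp in chunk:
            # Assumption: later files win on duplicate timestamps.
            merged[t] = temp
    if not merged:
        return []
    t, end = min(merged), max(merged)
    series = []
    while t <= end:
        # Missing steps become blank (None) records.
        series.append((t, merged.get(t)))
        t += STEP
    return series
```

Keeping blank records in place preserves a uniform time step, so later checks can work with the position of a record in the series without comparing timestamps.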