Data to support Leveraging machine learning to automate regression model evaluations for large multi-site water-quality trend studies
Dates
Publication Date
2023-10-04
Time Period
2017
Time Period
2019
Citation
Murphy, J.C., and Chanat, J.G., 2023, Data to support Leveraging machine learning to automate regression model evaluations for large multi-site water-quality trend studies: U.S. Geological Survey data release, https://doi.org/10.5066/P9GNEN8S.
Summary
This data release contains one dataset and one model archive in support of the journal article, "Leveraging machine learning to automate regression model evaluations for large multi-site water-quality trend studies," by Jennifer C. Murphy and Jeffrey G. Chanat. The model archive contains scripts (run in R) to reproduce the four machine learning models (logistic regression, linear and quadratic discriminant analysis, and k-nearest neighbors) trained and tested as part of the journal article. The dataset contains the estimated probabilities for each of these models when applied to a training and test dataset.
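As a rough illustration of what an "estimated probability" from one of these model types looks like, the sketch below implements a bare-bones k-nearest neighbors classifier in Python. It is not taken from the model archive (whose scripts are written in R) and does not reproduce the authors' models. The function name `knn_probability`, the choice of Euclidean distance, the value of k, and the small two-feature dataset are all assumptions made up for this example.

```python
import math

def knn_probability(train_X, train_y, query, k=3):
    """Estimate P(label == 1) for `query` as the share of its k nearest
    training points (Euclidean distance) that carry the label 1."""
    if not 1 <= k <= len(train_X):
        raise ValueError("k must be between 1 and the number of training points")
    # Rank training points by their distance from the query point.
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], query))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / k

# Hypothetical training data: two features per point, binary labels (0 or 1).
train_X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (1.0, 1.0), (0.9, 1.1), (1.1, 0.9)]
train_y = [0, 0, 0, 1, 1, 1]

# Hypothetical test points: one near each cluster and one in between.
test_X = [(0.05, 0.05), (1.0, 0.95), (0.5, 0.6)]
probs = [knn_probability(train_X, train_y, q, k=3) for q in test_X]
```

Each entry of `probs` is a value between 0 and 1. Points deep inside one cluster get an estimate at either extreme, while the in-between point gets an intermediate value because its neighbors carry mixed labels.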
Attached Files
metadata.xml (Original FGDC Metadata), 19.73 KB, application/fgdc+xml
dataRelease-predictions.csv, 2.33 MB, text/csv
model-archive.zip, 27.16 KB, application/zip
readMe.txt, 2.55 KB, text/plain
Related External Resources
Type: ScienceBase Repository
De Cicco, L.A., Sprague, L.A., Murphy, J.C., Riskin, M.L., Falcone, J.A., Stets, E.G., Oelsner, G.P., and Johnson, H.M., 2017, Water-quality and streamflow datasets used in the Weighted Regressions on Time, Discharge, and Season (WRTDS) models to determine trends in the Nation’s rivers and streams, 1972-2012 (ver. 1.1 July 7, 2017): U.S. Geological Survey data release, https://doi.org/10.5066/F7KW5D4H.
Murphy, J.C., Shoda, M.E., and Follette, D.D., 2020, Water-quality trends for rivers and streams in the Delaware River Basin using Weighted Regressions on Time, Discharge, and Season (WRTDS) models, Seasonal Kendall Trend (SKT) tests, and multisource data, Water Year 1978-2018: U.S. Geological Survey data release, https://doi.org/10.5066/P9KMWNJ5.
Murphy, J., and Chanat, J., 2023, Leveraging machine learning to automate regression model evaluations for large multi-site water-quality trend studies: Environmental Modelling & Software, v. 170, p. 105864, https://doi.org/10.1016/j.envsoft.2023.105864.