Groundwater arsenic data and ASCII grids for predicting elevated arsenic in northwestern and central Minnesota using boosted regression tree methods
Dates
Publication Date
2019-01-28
Time Period
1980 to 2016
Citation
Elliott, S.M., and Christenson, C.A., 2019, Groundwater arsenic data and ASCII grids for predicting elevated arsenic in northwestern and central Minnesota using boosted regression tree methods: U.S. Geological Survey data release, https://doi.org/10.5066/F77H1HH8.
Summary
This data release contains: (1) ASCII grids of the predicted probability of elevated arsenic in groundwater for the Northwest and Central Minnesota regions, (2) input arsenic and predictor variable data used in model development and calculation of predictions, and (3) ASCII files used to predict the probability of elevated arsenic across the two study regions. The probability of elevated arsenic was predicted with Boosted Regression Tree (BRT) methods, implemented with the gbm package in R version 3.4.2 (run in RStudio). The response variable was the presence or absence of arsenic >10 µg/L, the U.S. Environmental Protection Agency’s maximum contaminant level for arsenic, in 3,283 wells located throughout both study regions (1,363 in the Northwest region and 1,920 in the Central region). The original database used to develop the BRT model consisted of 127 predictor variables, including well characteristics, land use, soil properties, aquifer properties, depth to water table, and predicted nitrate. After optimization steps, a final database of 33 predictor variables was used to predict the occurrence of elevated arsenic across the two study regions.
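The binary response variable described above (presence or absence of arsenic >10 µg/L) can be illustrated with a minimal Python sketch. This is not the authors' code, which used the gbm package in R. The function names and the sample records are hypothetical and are not taken from the data release; only the 10 µg/L threshold comes from the summary.

```python
# Threshold from the summary: the EPA maximum contaminant level for arsenic.
MCL_UG_PER_L = 10.0


def code_response(arsenic_ug_per_l):
    """Return 1 if the concentration exceeds the MCL (strictly >), else 0."""
    return 1 if arsenic_ug_per_l > MCL_UG_PER_L else 0


def exceedance_by_region(records):
    """Map each region to a tuple (number of wells, wells above the MCL).

    `records` is an iterable of (region, arsenic concentration in µg/L) pairs.
    """
    counts = {}
    for region, concentration in records:
        wells, elevated = counts.get(region, (0, 0))
        counts[region] = (wells + 1, elevated + code_response(concentration))
    return counts


# Hypothetical records for illustration only, not values from the data release.
sample = [
    ("Northwest", 2.1),
    ("Northwest", 14.8),
    ("Central", 10.0),  # exactly at the MCL, so coded as absence
    ("Central", 31.5),
    ("Central", 0.6),
]

summary = exceedance_by_region(sample)
for region, (wells, elevated) in summary.items():
    print(f"{region}: {elevated} of {wells} wells above {MCL_UG_PER_L} µg/L")
```

Because the threshold is a strict inequality, a well measured at exactly 10 µg/L is coded as absence in this sketch.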
Arsenic is a naturally occurring contaminant in geologically diverse aquifers throughout the world, making chronic exposure to elevated arsenic via drinking water a human health concern. In Minnesota, USA, elevated arsenic concentrations are prevalent in drinking water aquifers in certain regions of the state. This data set was used to develop a model to predict the probability of elevated arsenic within two vulnerable regions in Minnesota. The results can help well drillers and homeowners identify important variables that may be controlled during well construction to minimize exposure to arsenic.