Using Knowledge-Guided Machine Learning To Assess Patterns of Areal Change in Waterbodies across the Contiguous United States

Environ Sci Technol. 2024 Mar 19;58(11):5003-5013. doi: 10.1021/acs.est.3c05784. Epub 2024 Mar 6.

Abstract

Lake and reservoir surface areas are an important proxy for freshwater availability. Advancements in machine learning (ML) techniques and increased accessibility of remote sensing data products have enabled the analysis of waterbody surface area dynamics on broad spatial scales. However, interpreting the ML results remains a challenge. While ML provides important tools for identifying patterns, the resultant models do not include mechanisms. Thus, the output of "black-box" ML techniques often lacks ecological meaning. Using ML, we characterized temporal patterns in lake and reservoir surface area change from 1984 to 2016 for 103,930 waterbodies in the contiguous United States. We then employed knowledge-guided machine learning (KGML) to classify all waterbodies into seven ecologically interpretable groups representing distinct patterns of surface area change over time. Many waterbodies were classified as having "no change" (43%), whereas the remaining 57% of waterbodies fell into other groups representing both linear and nonlinear patterns. This analysis demonstrates the potential of KGML not only for identifying ecologically relevant patterns of change across time but also for unraveling complex processes that underpin those changes.
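
The two-step idea described above, unsupervised clustering of surface area time series followed by a knowledge-based step that attaches interpretable labels, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the study: the synthetic data, the normalization, the deterministic centroid initialization, the slope tolerance `tol`, and the three labels (the study reports seven groups, including nonlinear ones that this sketch does not attempt to represent).

```python
from statistics import mean


def slope(series):
    """Least-squares slope of a series against its time index 0..n-1."""
    n = len(series)
    t_bar = (n - 1) / 2
    y_bar = mean(series)
    num = sum((t - t_bar) * (y - y_bar) for t, y in enumerate(series))
    den = sum((t - t_bar) ** 2 for t in range(n))
    return num / den


def normalize(series):
    """Fractional deviation from the series mean (assumes positive areas)."""
    m = mean(series)
    return [(y - m) / m for y in series]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length series."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(rows, k, iters=20):
    """Plain Lloyd's k-means; initial centroids are evenly spaced rows."""
    step = len(rows) // k
    centroids = [list(rows[i * step]) for i in range(k)]
    assign = [0] * len(rows)
    for _ in range(iters):
        # Assignment step: nearest centroid for every series.
        assign = [min(range(k), key=lambda c: sq_dist(r, centroids[c]))
                  for r in rows]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [r for r, a in zip(rows, assign) if a == c]
            if members:
                centroids[c] = [mean(col) for col in zip(*members)]
    return assign, centroids


def label_centroid(centroid, tol=0.005):
    """Hypothetical domain rule: name a cluster by its centroid's trend."""
    s = slope(centroid)
    if abs(s) < tol:
        return "no change"
    return "increasing" if s > 0 else "decreasing"


def classify(series_list, k):
    """Cluster normalized series, then map each cluster to a label."""
    rows = [normalize(s) for s in series_list]
    assign, centroids = kmeans(rows, k)
    names = [label_centroid(c) for c in centroids]
    return [names[a] for a in assign]


# Synthetic annual surface areas (arbitrary units) for six waterbodies.
example = [
    [100, 101, 99, 100, 100, 101, 99, 100, 100, 100],
    [50] * 10,
    [10 + t for t in range(10)],
    [20 + 3 * t for t in range(10)],
    [30 - 2 * t for t in range(10)],
    [80 - 4 * t for t in range(10)],
]

print(classify(example, k=3))
```

The clustering step alone only returns anonymous cluster indices; the labeling rule is where domain knowledge enters, which is the sense in which such a pipeline is "knowledge-guided" in this sketch.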

Keywords: K-means clustering; KGML; domain knowledge; limnology; machine learning; surface area; temporal change.

MeSH terms

  • Lakes*
  • Machine Learning*
  • United States