Integrating temporal cumulative effects into the random forest model for short-term forecasting of cyanobacterial bloom area in Lake Chaohu
Author:
Affiliation:

College of Urban and Environmental Sciences, Northwest University


    Abstract:

    Cyanobacterial blooms have emerged as a global environmental challenge threatening lake ecosystem security and drinking water safety. Timely prediction of bloom outbreaks is critical for implementing preventive measures and reducing disaster risks. To overcome the limitations of conventional mechanism-driven models, including their numerous parameters and computational complexity, this study established a machine learning framework for Lake Chaohu that integrates multi-source monitoring data and remote sensing observations. By combining multi-site meteorological and water quality measurements with satellite-derived time-series data, we investigated the temporal cumulative effects of meteorological and water quality variables on cyanobacterial blooms. Based on the Random Forest (RF) model, two forecasting models were developed to produce 1–7 day forecasts of bloom coverage area: one considering the temporal cumulative effects of variables (cumulative variable model) and the other using only single-day observations (single-day variable model). SHapley Additive exPlanations (SHAP) analysis was then applied to interpret the models' decision-making mechanisms, revealing feature contributions and nonlinear threshold behaviors.
The results showed that: (1) meteorological variables (air temperature, humidity, precipitation, and air pressure) exhibited longer cumulative effect durations (15–30 days) than water quality variables (nitrogen, phosphorus, and dissolved oxygen; 1–10 days); (2) cumulative variable models achieved higher predictive accuracy (R² = 0.7–0.8) than single-day variable models (R² = 0.4–0.6), with the best performance at a 1-day lead (R² = 0.79, RMSE = 35.36 km²); (3) critical thresholds were identified at an average temperature above approximately 23°C, maximum wind speed below approximately 4 m/s, precipitation above approximately 200 mm, nitrogen-phosphorus ratio below approximately 15, pH above 8.5, and dissolved oxygen below approximately 8.9 mg/L. The proposed method enables high-precision short-term forecasting using multi-station monitoring data and holds promise as a transferable decision support framework for eutrophic lake management.
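The idea of feeding cumulative rather than single-day predictors to a forecasting model can be illustrated with a minimal sketch. It is not the authors' implementation: the trailing mean as the accumulation rule, the function names, the variable names (air_temp, total_p), the window lengths, and the toy values are all assumptions made for illustration. The sketch pairs each day's cumulative predictors with the bloom area observed `lead` days later and computes the two scores quoted in the abstract (R² and RMSE).

```python
from math import sqrt
from statistics import mean


def cumulative_mean(series, window):
    """Trailing mean over the last `window` days, current day included.

    Days without a full window of history get None.
    (Assumption: accumulation is a mean; a sum would work the same way.)
    """
    out = []
    for i in range(len(series)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(mean(series[i + 1 - window:i + 1]))
    return out


def build_samples(drivers, windows, bloom_area, lead):
    """Pair cumulative predictors on day t with bloom area on day t + lead.

    drivers: dict mapping variable name -> list of daily values
    windows: dict mapping variable name -> window length in days (hypothetical)
    """
    feats = {n: cumulative_mean(v, windows[n]) for n, v in drivers.items()}
    names = sorted(drivers)
    X, y = [], []
    for t in range(len(bloom_area) - lead):
        row = [feats[n][t] for n in names]
        if any(v is None for v in row):
            continue  # skip days that lack a full window for some variable
        X.append(row)
        y.append(bloom_area[t + lead])
    return names, X, y


def r2_rmse(observed, predicted):
    """Coefficient of determination and root-mean-square error."""
    obs_mean = mean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - obs_mean) ** 2 for o in observed)
    return 1 - ss_res / ss_tot, sqrt(ss_res / len(observed))


# Toy daily series (made-up values, not data from the study).
drivers = {
    "air_temp": [20.0, 22.0, 24.0, 26.0, 28.0, 30.0],
    "total_p": [0.10, 0.12, 0.11, 0.15, 0.14, 0.16],
}
windows = {"air_temp": 3, "total_p": 2}  # longer window for the meteorological driver
bloom_area = [5.0, 8.0, 12.0, 20.0, 30.0, 45.0]

names, X, y = build_samples(drivers, windows, bloom_area, lead=1)
```

The resulting `X` and `y` could then be passed to any regressor, for example a random forest; a single-day variable model corresponds to setting every window to 1, and one sample set would be built per lead time from 1 to 7 days.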

History
  • Received: April 07, 2025
  • Revised: April 29, 2026
  • Adopted: October 29, 2025
  • Online: December 18, 2025
  • Published:
Copyright © Lake Science, Nanjing Institute of Geography and Lake Sciences, Chinese Academy of Sciences:All Rights Reserved