Water management in California is plagued by missing data. Even basic questions often go unanswered:
How much water should naturally be flowing in this stream?
How much water is actually flowing in this stream?
Stream gages cover only about 10% of rivers in the state, and 70% of watersheds have no active gages and no history of gages. With recent advances in data science and machine learning, however, we will soon be able to answer these questions for most rivers in California. In this talk, we will present results from our machine learning pipeline, which converts monthly precipitation and temperature data into natural (unimpaired) streamflow predictions for more than 95% of the rivers in California. These data are currently available at https://rivers.codefornature.org/. We will also present initial results from our efforts to predict both unimpaired and impaired (actual) flows at a daily time step from 2000 to the present. The presentation will conclude with a Q&A session in which the audience can join the speakers in exploring the implications of predicting river flows with machine learning.