David Gagne: Coupling Data Science Techniques and Numerical Weather Prediction Models for High-Impact Weather Prediction


Abstract

Meteorologists have access to more model guidance and observations than ever before, but this additional information does not necessarily lead to better forecasts. New tools are needed to reduce the cognitive load on forecasters and to provide them with accurate, reliable consensus guidance. Techniques from the data science community, such as machine learning and image processing, have the potential to summarize and calibrate numerical weather prediction model output and to generate deterministic and probabilistic forecasts of high-impact weather. In this dissertation, I developed data-science-based approaches to improve the predictions of two high-impact weather domains: hail and solar irradiance. Both hail and solar irradiance produce large economic impacts, have non-Gaussian distributions of occurrence, are poorly observed, and are partially driven by processes too small to be resolved by numerical weather prediction models.

Hail forecasts were produced with convection-allowing model output from the Center for Analysis and Prediction of Storms and National Center for Atmospheric Research ensembles. The machine learning hail forecasts were compared against storm surrogate variables and physics-based diagnostic models of hail size. Initial machine learning hail forecasts reduced size errors but struggled to predict extreme events. When the machine learning model was instead trained to predict hail size distributions, with the distribution parameters estimated jointly, it showed skill and reliability in predicting both severe and significant hail.
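As an illustration of what predicting a hail size distribution can mean in practice, the sketch below summarizes a set of hail sizes by two distribution parameters that a regression model could then be trained to predict together. The abstract does not name the distribution family or the fitting procedure, so the gamma distribution, the method-of-moments fit, the function name, and the sample sizes here are all assumptions made for illustration, not the dissertation's method.

```python
from statistics import fmean, pvariance


def gamma_method_of_moments(sizes_mm):
    """Match a gamma distribution to the sample mean and variance.

    Hypothetical helper: returns (shape, scale) such that
    shape * scale equals the sample mean and
    shape * scale**2 equals the population variance.
    """
    if len(sizes_mm) < 2:
        raise ValueError("need at least two hail sizes")
    mean = fmean(sizes_mm)
    var = pvariance(sizes_mm, mu=mean)
    if mean <= 0 or var <= 0:
        raise ValueError("sizes must be positive and not all identical")
    shape = mean ** 2 / var
    scale = var / mean
    return shape, scale


# Made-up hail sizes (mm) for one storm object, for illustration only.
sizes = [10.0, 20.0, 30.0]
shape, scale = gamma_method_of_moments(sizes)
```

Under this assumption, the pair (shape, scale) would be the joint prediction target for each storm, in place of a single hail size.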

Machine learning model and data configurations for gridded solar irradiance forecasting were evaluated on two numerical modeling systems. The evaluation determined how machine learning model choice, closeness of fit to training data, training data aggregation, and interpolation method affected forecasts of clearness index at Oklahoma Mesonet sites not included in the training data. The choice of machine learning model, interpolation scheme, and loss function had the biggest impacts on performance. Errors tended to be lower at testing sites with sunnier weather and those that were closer to training sites. All of the machine learning methods produced reliable predictions but underestimated the frequency of cloudiness compared to observations.
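For readers unfamiliar with the clearness index, it is commonly defined as the ratio of irradiance observed at the surface to the irradiance arriving at the top of the atmosphere on a horizontal plane. The sketch below computes it under that common definition. The solar constant value, the simple eccentricity correction, the cutoff for a low sun, and the function names are assumptions chosen for illustration; the abstract does not say how the dissertation computed the index.

```python
import math

SOLAR_CONSTANT = 1361.0  # W/m^2, approximate value (assumption)
MIN_TOA = 1.0  # W/m^2, below this the sun is treated as down (assumption)


def toa_horizontal_irradiance(day_of_year, zenith_deg):
    """Approximate top-of-atmosphere irradiance on a horizontal plane."""
    # Simple correction for the varying Earth-Sun distance.
    eccentricity = 1.0 + 0.033 * math.cos(2.0 * math.pi * day_of_year / 365.0)
    cos_zenith = math.cos(math.radians(zenith_deg))
    return max(0.0, SOLAR_CONSTANT * eccentricity * cos_zenith)


def clearness_index(ghi, day_of_year, zenith_deg):
    """Return surface irradiance divided by top-of-atmosphere irradiance.

    Returns None when the sun is at or below the horizon, where the
    ratio is undefined.
    """
    toa = toa_horizontal_irradiance(day_of_year, zenith_deg)
    if toa < MIN_TOA:
        return None
    return ghi / toa


# Made-up observation: 600 W/m^2 on day 172 with the sun 30 degrees from zenith.
kt = clearness_index(600.0, 172, 30.0)
```

Values near 1 indicate clear skies and values near 0 indicate heavy cloud, which is why the index is a convenient target for comparing sites with different sun angles.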

Dissertation

David Gagne (2016). Coupling Data Science Techniques and Numerical Weather Prediction Models for High-Impact Weather Prediction. PhD Thesis, School of Meteorology, University of Oklahoma.

Related publications and presentations

  • Related conference paper
    • Gagne II, David John and McGovern, Amy and Brotzge, Jerald and Coniglio, Michael and Correia, James and Xue, Ming. (2015) Day-Ahead Hourly Hail Prediction Integrating Machine Learning with Storm-Scale Numerical Weather Models. Proceedings of the 2015 Innovative Applications of Artificial Intelligence conference, pages 3954-3960. pdf (10 MB)
  • Two journal papers are in preparation and will be posted here when they are published.
  • Related AMS talks
    • Gagne II, David John and McGovern, Amy and Snook, N. and Sobash, R. A. and Xue, M. (2016) Severe Hail Forecasting Evaluation: Machine Learning and Severe Weather Proxy Variables. Presented at the 14th Conference on Artificial and Computational Intelligence and its Applications to the Environmental Sciences at the annual American Meteorological Society meeting. [Abstract and recorded presentation]
    • Gagne II, David John and Haupt, S. E. and Linden, S. and Wiener, G. (2016) An Evaluation of Statistical Learning Methods for Gridded Solar Irradiance Forecasting. Presented at the 14th Conference on Artificial and Computational Intelligence and its Applications to the Environmental Sciences at the annual American Meteorological Society meeting. [Abstract and recorded presentation]

Code

  • Hagelslag is an open-source Python library for object detection and machine learning that was used for the hail research in this dissertation. An AMS poster presentation about Hagelslag is linked below.
    • Gagne II, David John and McGovern, A. and Snook, N. and Sobash, R. A. and Labriola, J. D. and Williams, J. K. and Haupt, S. E. and Xue, M. (2016). Hagelslag: Scalable Object-Based Severe Weather Analysis and Forecasting. Presented at the Sixth Symposium on Advances in Modeling and Analysis Using Python. [Abstract and poster] [URL to open-source software release] [local pdf of poster]

Created by amcgovern [at] ou.edu.

Last modified June 12, 2017 12:57 PM