Large Scale Representation Learning of Atmospheric Data

The AtmoRep project asks whether a single neural network can be trained to represent and describe all atmospheric dynamics. Its ambition is to demonstrate that large-scale representation learning, whose feasibility in principle and potential were established by large language models such as GPT-3, is also applicable to scientific data and in particular to atmospheric dynamics. The project is enabled by the large amounts of atmospheric observations that have been made in the past, as well as by advances in neural network architectures and self-supervised learning that allow for effective training on petabytes of data. Eventually, we aim to train on all of the ERA5 reanalysis and, furthermore, to fine-tune on observational data such as satellite measurements to move beyond the limits of reanalyses.


  • December 2022: AtmoRep presented an online poster at AGU on Tuesday, Dec. 13th.

  • Publications

  • C. Lessig, Google Research, October 2022; Slides.
  • I. Luise, DWD KI Forum, September 2022.
  • Posters
  • C. Lessig et al., American Geophysical Union Annual Meeting 2022; Poster.
  • C. Lessig et al., ECMWF Workshop on Machine Learning, Nov. 2022; Poster.
  • I. Luise et al., Italian Society for Climate Sciences Annual Conference 2022.