Can we democratise AI for climate change?
Applying Machine Learning (ML) to Earth observation (EO) data improves our ability to predict how to adapt to and mitigate our changing climate. This project aims to create a new public ‘ML Toolbox’, called ‘ML4CC’, comprising an open repository of artificial intelligence tools. The goal of this work is to develop methods that simplify ML production and validation and ultimately improve climate-related decision-making within the UK, as well as to develop a comprehensive point of view on ‘the case for AI’ in tackling climate change.
Goal
MLOps is a set of tools and best practices that combines Machine Learning, DevOps and Data Engineering. The goal of MLOps is to (1) streamline research overhead and (2) deploy and maintain ML systems in production reliably and efficiently. This project aims to develop robust, reproducible MLOps code and sample data that can be adapted to related flooding challenges. This work is an initial prototype study as part of a larger program aiming to democratise AI for climate and disaster-related challenges.
How the sprint worked
ML4CC participants worked in teams of two (one research engineer and one data scientist), supported by a specialist machine learning mentor and a domain specialist, to develop MLOps tools. These tools draw upon the lessons of existing ML pipelines (and data) for flood resilience and response developed at FDL (FDL.ai), supported by Google.
The ML4CC teams tackled opportunities in the following MLOps categories:
DATA PREP AND DISCOVERY
DATA ENHANCEMENT
PIPELINE DEFINITION/MODEL AUGMENTATION
VISUALIZATION METHODS
The project ran for 8 weeks part time (working remotely), kicking off with initial introductions and a ‘Big Think’ meeting where the opportunities were discussed and directions identified.
The development of the MLOps tools then took place over 6 weeks. In the 7th and 8th weeks, the teams were required to package their work for peer review, engage in final code reviews, and complete and annotate a Colab notebook in advance of it being open-sourced on SpaceML.
The Results
ML4Floods: an ecosystem of data, models and code pipelines to tackle flooding with ML
ML4CC Overview
Understanding and using the tools
The figure below presents an overview of the ML4Floods toolkit alongside the users of each component. The toolkit is structured as an end-to-end pipeline with components that 1) ingest, sort and organise satellite data, integrating ground-truth masks, 2) tile, augment and normalise the data, 3) train new models on the data, or run existing models on new data, and display uncertainty maps generated by the models, and 4) query and visualise the results via a web-based mapping application.
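To make step 2 more concrete, the following is a minimal sketch of what tiling, augmenting and normalising a satellite scene can look like. It is not the ML4Floods code: the function names (`normalise`, `tile`, `flip_horizontal`), the `(bands, height, width)` array layout, the per-image statistics and the 256-pixel tile size are all assumptions chosen for illustration, and only NumPy is used.

```python
import numpy as np


def normalise(image, eps=1e-8):
    """Scale each band to zero mean and unit variance.

    Assumption: statistics are computed per image; a real pipeline might
    use fixed, dataset-wide statistics instead.
    """
    mean = image.mean(axis=(1, 2), keepdims=True)
    std = image.std(axis=(1, 2), keepdims=True)
    return (image - mean) / (std + eps)


def tile(image, size):
    """Split a (bands, H, W) array into non-overlapping size x size tiles.

    The bottom and right edges are zero-padded so that H and W become
    multiples of `size`. Returns an array of shape (n_tiles, bands, size, size),
    ordered row by row.
    """
    bands, height, width = image.shape
    pad_h = (-height) % size
    pad_w = (-width) % size
    padded = np.pad(image, ((0, 0), (0, pad_h), (0, pad_w)), mode="constant")
    rows = padded.shape[1] // size
    cols = padded.shape[2] // size
    tiles = padded.reshape(bands, rows, size, cols, size)
    tiles = tiles.transpose(1, 3, 0, 2, 4)  # (rows, cols, bands, size, size)
    return tiles.reshape(rows * cols, bands, size, size)


def flip_horizontal(tiles):
    """A simple augmentation: mirror every tile left to right."""
    return np.flip(tiles, axis=-1)


# Hypothetical 3-band scene of 500 x 700 pixels filled with random values.
rng = np.random.default_rng(0)
scene = rng.random((3, 500, 700))

tiles = tile(normalise(scene), size=256)
augmented = np.concatenate([tiles, flip_horizontal(tiles)], axis=0)
print(tiles.shape)      # (6, 3, 256, 256)
print(augmented.shape)  # (12, 3, 256, 256)
```

Fixed-size tiles are used here because segmentation models are typically trained on inputs of a constant shape; padding rather than cropping keeps every pixel of the original scene at the cost of some empty border.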
Each of the components 1-3 can be accessed via an application programming interface (API) so that technical users can fine-tune their workflows or adopt the components in their own tools. The teams also developed a graphical interface (4) that runs the toolkit through simple point-and-click controls. This last component places the power of ML-enhanced flood segmentation models in the hands of non-specialist users, such as disaster relief coordinators and urban planners. The graphical tool is also useful for machine-learning researchers, allowing them to compare model results side by side in the same interface and so speed up model development.
ML4CC CONSORTIUM PARTNERS