QuakeFlow: A Scalable Machine-learning-based Earthquake Monitoring Workflow with Cloud Computing

Published 30 Aug 2022 in physics.geo-ph | (2208.14564v1)

Abstract: Earthquake monitoring workflows are designed to detect earthquake signals and to determine source characteristics from continuous waveform data. Recent developments in deep learning seismology have been used to improve tasks within earthquake monitoring workflows that allow the fast and accurate detection of up to orders of magnitude more small events than are present in conventional catalogs. To facilitate the application of machine-learning algorithms to large-volume seismic records, we developed a cloud-based earthquake monitoring workflow, QuakeFlow, that applies multiple processing steps to generate earthquake catalogs from raw seismic data. QuakeFlow uses a deep learning model, PhaseNet, for picking P/S phases and a machine learning model, GaMMA, for phase association with approximate earthquake location and magnitude. Each component in QuakeFlow is containerized, allowing straightforward updates to the pipeline with new deep learning/machine learning models, as well as the ability to add new components, such as earthquake relocation algorithms. We built QuakeFlow in Kubernetes to make it auto-scale for large datasets and to make it easy to deploy on cloud platforms, which enables large-scale parallel processing. We used QuakeFlow to process three years of continuous archived data from Puerto Rico, and found more than a factor of ten more events that occurred on much the same structures as previously known seismicity. We applied QuakeFlow to monitoring frequent earthquakes in Hawaii and found over an order of magnitude more events than are in the standard catalog, including many events that illuminate the deep structure of the magmatic system. We also added Kafka and Spark streaming to deliver real-time earthquake monitoring results. QuakeFlow is an effective and efficient approach both for improving real-time earthquake monitoring and for mining archived seismic data sets.

Citations (48)

Summary

  • The paper introduces QuakeFlow, a scalable framework that integrates deep learning and cloud technologies for enhanced seismic monitoring.
  • It employs PhaseNet for precise phase picking and GaMMA for effective phase association, significantly improving event detection.
  • The approach demonstrates a tenfold increase in event detection in tectonic and volcanic regions, indicating its potential for real-time earthquake monitoring.

QuakeFlow: Scalable Machine Learning-Based Earthquake Monitoring

Introduction

The paper "QuakeFlow: A Scalable Machine-learning-based Earthquake Monitoring Workflow with Cloud Computing" (2208.14564) introduces QuakeFlow, a cloud-based earthquake monitoring workflow built to apply machine learning algorithms to large seismic datasets. The workflow targets the data-intensive steps of detecting earthquake signals and characterizing their sources from continuous waveform data. It combines PhaseNet for phase picking and GaMMA for phase association with cloud computing technologies, so that detailed earthquake catalogs can be generated from large volumes of raw seismic records.

Earthquake Monitoring Workflow

QuakeFlow has a modular architecture that chains phase detection, association, location, and characterization. Its key components are PhaseNet, a deep convolutional network that picks P- and S-phase arrival times, and GaMMA, a Bayesian Gaussian mixture model algorithm that associates those picks into events. Each component is containerized, and the framework runs on Kubernetes, which provides auto-scaling and makes the workflow straightforward to deploy on cloud platforms for large-scale parallel processing.
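The modular design described above can be illustrated with a minimal sketch in which each stage is a plain callable, so one stage can be swapped without touching the others. This is not QuakeFlow's code: the `Pipeline` class, `toy_picker`, and `toy_associator` are hypothetical stand-ins for containerized components such as PhaseNet and GaMMA.

```python
from dataclasses import dataclass
from typing import Any, Callable, List

# A stage maps one intermediate product to the next (waveform -> picks -> events).
Stage = Callable[[Any], Any]


@dataclass
class Pipeline:
    """Hypothetical stand-in for a chain of containerized components."""
    stages: List[Stage]

    def run(self, data: Any) -> Any:
        # Feed the output of each stage into the next one.
        for stage in self.stages:
            data = stage(data)
        return data

    def with_stage(self, index: int, stage: Stage) -> "Pipeline":
        # Return a new pipeline with one stage replaced, leaving the rest untouched.
        stages = list(self.stages)
        stages[index] = stage
        return Pipeline(stages)


def toy_picker(waveform: List[float]) -> List[int]:
    # Placeholder picker: sample indices whose amplitude exceeds 0.5.
    return [i for i, x in enumerate(waveform) if abs(x) > 0.5]


def toy_associator(picks: List[int]) -> List[List[int]]:
    # Placeholder associator: group picks that are at most 2 samples apart.
    events: List[List[int]] = []
    for p in picks:
        if events and p - events[-1][-1] <= 2:
            events[-1].append(p)
        else:
            events.append([p])
    return events


pipeline = Pipeline([toy_picker, toy_associator])
events = pipeline.run([0.0, 0.9, 0.8, 0.0, 0.0, 0.0, 0.7])
```

Replacing the picker with a different model then amounts to `pipeline.with_stage(0, other_picker)`, which mirrors the idea of updating one container while the rest of the workflow stays in place.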

Machine Learning Models

PhaseNet: PhaseNet picks seismic phase arrival times and outperforms conventional methods in both precision and the number of phases detected. It is a deep convolutional neural network, trained on extensive labeled datasets, that predicts characteristic functions for the P and S phases, from which arrival times are extracted. This yields roughly an order of magnitude more S-phase picks than those conventional methods.
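Turning a predicted characteristic function into arrival times is essentially a peak-picking step. The sketch below shows one simple way to do it, assuming the model output is a list of per-sample probabilities for a single phase. The function names, the threshold of 0.5, and the minimum separation are illustrative assumptions, not PhaseNet's actual post-processing.

```python
from typing import List


def pick_peaks(prob: List[float], threshold: float = 0.5,
               min_separation: int = 10) -> List[int]:
    """Return sample indices of local maxima at or above the threshold.

    Peaks closer than min_separation samples to a stronger peak are dropped.
    """
    candidates = [
        i for i in range(1, len(prob) - 1)
        if prob[i] >= threshold and prob[i] > prob[i - 1] and prob[i] >= prob[i + 1]
    ]
    picks: List[int] = []
    # Visit the strongest candidates first so weaker neighbours are suppressed.
    for i in sorted(candidates, key=lambda idx: prob[idx], reverse=True):
        if all(abs(i - j) >= min_separation for j in picks):
            picks.append(i)
    return sorted(picks)


def picks_to_times(picks: List[int], sampling_rate_hz: float,
                   start_time_s: float = 0.0) -> List[float]:
    # Convert sample indices to times in seconds relative to the trace start.
    return [start_time_s + i / sampling_rate_hz for i in picks]


# Hypothetical probability trace with two peaks, at samples 3 and 8.
prob = [0.0, 0.1, 0.6, 0.9, 0.6, 0.1, 0.0, 0.2, 0.7, 0.3, 0.0]
picks = pick_peaks(prob, threshold=0.5, min_separation=3)
times = picks_to_times(picks, sampling_rate_hz=100.0)
```

Lowering the threshold trades precision for recall, which is one reason a learned characteristic function with sharp, well-separated peaks can support many more picks than an amplitude-based trigger.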

GaMMA: GaMMA treats phase association as a clustering problem within a probabilistic framework. It uses several types of phase information, including arrival times and amplitudes, to associate picks even in dense earthquake sequences, and it estimates approximate earthquake locations and magnitudes as part of the association.
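The clustering view of association can be sketched with a one-dimensional Gaussian mixture fitted by expectation-maximization. This is a deliberately reduced illustration and not the GaMMA algorithm: it assumes each pick has already been converted to a scalar origin-time estimate, uses a fixed shared standard deviation, and ignores locations and amplitudes. The function names and the input values are hypothetical.

```python
import math
from typing import List, Tuple


def gaussian_pdf(x: float, mean: float, std: float) -> float:
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def associate_1d(times: List[float], init_means: List[float], std: float = 1.0,
                 n_iter: int = 50) -> Tuple[List[float], List[float], List[int]]:
    """Fit a 1-D Gaussian mixture (fixed std) with EM.

    Returns (component means, mixture weights, hard label per input time).
    Requires n_iter >= 1 and at least one input time.
    """
    means = list(init_means)
    k = len(means)
    weights = [1.0 / k] * k
    resp: List[List[float]] = []
    for _ in range(n_iter):
        # E-step: probability that each time belongs to each component.
        resp = []
        for t in times:
            p = [w * gaussian_pdf(t, m, std) for w, m in zip(weights, means)]
            total = sum(p) or 1e-300  # guard against underflow to zero
            resp.append([pi / total for pi in p])
        # M-step: update each component's mean and weight.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj > 0:
                means[j] = sum(r[j] * t for r, t in zip(resp, times)) / nj
            weights[j] = nj / len(times)
    labels = [max(range(k), key=lambda j: r[j]) for r in resp]
    return means, weights, labels


# Hypothetical origin-time estimates (seconds) from picks of two events.
times = [10.0, 10.2, 9.8, 50.1, 49.9, 50.0]
means, weights, labels = associate_1d(times, init_means=[0.0, 60.0])
```

In the method described above, each component would correspond to a candidate earthquake whose location and magnitude predict the arrival times and amplitudes at every station. Here the component mean is only an origin time.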

Cloud Computing and Implementation

QuakeFlow uses cloud-native tools, namely Kubernetes, Kafka, and Spark Streaming, for orchestration and real-time processing. Kafka streams the seismic waveform data and serves as the backbone for real-time monitoring. Spark Streaming processes and transforms the streamed data at scale so that it can be passed to the machine learning models for prediction. Kubernetes auto-scales the containerized components so that computing resources track the workload.
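One transformation a streaming stage typically performs before model inference is cutting the incoming samples into fixed-length, overlapping windows. The sketch below shows that step with the standard library only. The generator stands in for a stream consumer and does not use the Kafka or Spark APIs, and the function name, window length, and step size are illustrative assumptions.

```python
from collections import deque
from itertools import islice
from typing import Deque, Iterable, Iterator, List


def sliding_windows(chunks: Iterable[List[float]], window: int,
                    step: int) -> Iterator[List[float]]:
    """Yield overlapping fixed-length windows from a stream of sample chunks."""
    if window <= 0 or not 0 < step <= window:
        raise ValueError("require window > 0 and 0 < step <= window")
    buffer: Deque[float] = deque()
    for chunk in chunks:
        buffer.extend(chunk)
        # Emit every complete window that the buffered samples allow.
        while len(buffer) >= window:
            yield list(islice(buffer, window))
            # Advance by `step` samples; the remainder overlaps the next window.
            for _ in range(step):
                buffer.popleft()


# Hypothetical stream: three chunks of unequal length arriving in order.
stream = [[0.0, 1.0, 2.0], [3.0, 4.0], [5.0, 6.0, 7.0]]
windows = list(sliding_windows(stream, window=4, step=2))
```

Because the generator only keeps the samples it still needs, memory use stays bounded no matter how long the stream runs, which is the property a real-time consumer depends on.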

Applications and Numerical Results

The study presents two applications of QuakeFlow: monitoring tectonic earthquakes in Puerto Rico and volcanic earthquakes in Hawaii. In both cases QuakeFlow detected more than ten times as many events as the existing catalogs, most of them small-magnitude earthquakes, which improves the temporal and spatial resolution of the seismicity.

  • Puerto Rico: Processing three years of continuous archived data revealed more than ten times as many events, which occurred on much the same structures as previously known seismicity. The denser catalog improves understanding of complex seismic sequences and the underlying fault structures, which matters for hazard quantification and forecasting.
  • Hawaii: The expanded catalog gives a clearer picture of the magmatic system. Many of the new events illuminate its deep structure, connecting deep and shallow seismicity and outlining possible magma transport pathways.

Future Directions and Implications

QuakeFlow is a step toward using machine learning methods in operational earthquake monitoring workflows. Because each component is containerized, models can be updated and new components, such as earthquake relocation algorithms, can be added without rebuilding the pipeline. Practical implications include real-time seismic monitoring and more complete catalogs for studying seismicity in volcanic and tectonically active regions.

Conclusion

QuakeFlow combines machine learning with cloud computing to process large seismic datasets accurately and efficiently. Its modular, scalable design lets it absorb new models as they appear and eases deployment across different seismic networks. The results from Puerto Rico and Hawaii show that it is effective both for real-time monitoring and for mining archived data.
