EQcorrscan tutorial
Welcome to EQcorrscan - this package is designed to compute earthquake detections using a parallelised matched-filter network cross-correlation routine. The inner loop of this package is the cross-correlation of templates of seismic data with day-long seismic data. This inner function is the openCV.match_template function - a well-optimised cross-correlation function written in C++. Cross-correlations are computed in the frequency domain for large datasets, for which a day of seismic data usually qualifies.
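To give a feel for what that inner loop computes, here is a simplified sketch of frequency-domain normalised cross-correlation using NumPy and SciPy. This is an illustration of the principle only, not EQcorrscan's internal routine, and it de-means the template but not each data window:

import numpy as np
from scipy.signal import fftconvolve

def normxcorr(template, data):
    # Correlating with a template is convolving with its time-reverse;
    # fftconvolve does the convolution in the frequency domain.
    template = template - template.mean()
    ccc = fftconvolve(data, template[::-1], mode='valid')
    # Rolling energy of the data from a cumulative sum, for normalisation
    n = len(template)
    csum = np.cumsum(np.concatenate(([0.0], data ** 2)))
    window_energy = csum[n:] - csum[:-n]
    norm = np.sqrt(window_energy * np.sum(template ** 2))
    norm[norm == 0] = np.inf  # guard against flat (zero) windows
    return ccc / norm

# Synthetic check: a 100-sample 'template' cut from random 'data'
rng = np.random.default_rng(42)
data = rng.normal(size=8640)
template = data[1000:1100].copy()
print(normxcorr(template, data).argmax())  # -> 1000, where the template was cut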
Before continuing with this tutorial please check that you have installed all the pre-requisite modules, as not all will be installed by the setup.py file. The list of these is in the Introduction section of this documentation.
As you will see, this package is divided into two main sub-modules, Core and Utils. The Core sub-module contains the main, high-level functions:
- bright_lights: a brightness-based template detection routine;
- template_gen: a series of routines to generate templates for matched-filter detection from continuous or cut data, with pick times defined either manually or from a Seisan s-file;
- match_filter: the main matched-filter routines, split into several smaller functions to allow Python-based parallelisation;
- lag_calc: routines for calculating optimal lag times for events detected by the matched-filter routine; these lags can then be used to define new picks for high-accuracy relocations.
The Utils sub-module contains useful, but small, functions. These functions are rarely CPU intensive, but perform vital operations: reading Seisan s-files, finding peaks in noisy data, converting a Seisan database to hypoDD-formatted files and computing cross-correlations between detections for hypoDD (a double-difference relocation package), calculating magnitudes, clustering detections, stacking detections, making pretty plots, and processing seismic data in the same, repeatable way using ObsPy's functionality.
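For example, these sub-modules are imported directly from their parent packages; the core names follow the list above, and the two utils modules shown are the ones used in the tutorial script below:

# High-level detection routines live in eqcorrscan.core
from eqcorrscan.core import bright_lights, template_gen, match_filter, lag_calc
# Supporting routines live in eqcorrscan.utils, e.g. data processing and
# Seisan s-file reading
from eqcorrscan.utils import pre_processing, Sfile_util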
Matched-filter detection
In this section we will discuss generating a pair of templates from two Seisan s-files before using these templates to scan for similar earthquakes within a day of data. This small example does not truly exploit the parallel operations within this package, so you are encouraged to think about where parallel operations occur (hint: the code can run one template per CPU), and why there are --instance and --splits flags in the other scripts in the GitHub repository (hint: if you have heaps of memory and CPUs you can do some brute-force day parallelisation, as sketched below!).
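A sketch of the idea behind those flags (illustrative only; the variable handling in the repository scripts will differ): give each job an instance number and the total number of splits, and let it work on its own slice of the template list.

import sys

# Hypothetical usage: python run_templates.py <instance> <splits>
# e.g. instance 0 of 4 splits gets templates 0, 4, 8, ...
instance = int(sys.argv[1])
splits = int(sys.argv[2])

all_template_names = ['tutorial_0', 'tutorial_1', 'tutorial_2', 'tutorial_3']
my_template_names = all_template_names[instance::splits]
print('Instance %i of %i splits will run: %s'
      % (instance, splits, my_template_names))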
The following script is included in the top-level directory alongside the full scripts used by the author to generate a 6.5 year-long catalogue of low-frequency earthquakes for the central Southern Alps of New Zealand.
This tutorial script highlights the ability of the matched-filter method to detect near-repeating earthquakes. The dataset is a day of data taken from the New Zealand national database and the Southern Alp Microearthquake Borehole Array (SAMBA) network (Boese et al. 2012). This day was found to contain a swarm of earthquakes, as published by Boese et al. (2014); the s-files provided are for two of these events.
The main processing flow is outlined in the figure below. Note that the main speed-ups in this process are achieved by running multiple templates at once; however, this increases memory usage. If memory is a problem there are flags (mem_issue) in the match_filter.py source that can be turned on - the codes will then write temporary files, which is slower, but can allow for more data crunching at once: your trade-off, your call. A middle ground, sketched next, is to run the templates in smaller groups.
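A minimal sketch of that grouping approach, reusing the variable names from the tutorial script further down (templates, template_names, st and the threshold parameters) and assuming the same match_filter.match_filter call signature:

# Hypothetical sketch: run templates in groups to cap peak memory use
group_size = 2  # larger groups run faster but hold more correlations in memory
detections = []
for start in range(0, len(templates), group_size):
    detections += match_filter.match_filter(
        template_names[start:start + group_size],
        templates[start:start + group_size], st,
        threshold, threshtype, trig_int, plotvar=False, cores=2)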

References
- CM Boese, J Townend, E Smith, T Stern (2012). Microseismicity and stress in the vicinity of the Alpine Fault, central Southern Alps, New Zealand, JGR, doi:10.1029/2011JB008460
- CM Boese, KM Jacobs, EGC Smith, TA Stern, J Townend (2014). Background and delayed-triggered swarms in the central Southern Alps, South Island, New Zealand, G-cubed, doi:10.1002/2013GC005171
#!/usr/bin/env python
r"""Tutorial This script is designed as a tutorial to highlight how to\
call the main functions within the EQcorrscan module. In this tutorial
we will see how to generate a template and run this through the
matched-filter routine.
The template will be generated from a pre-picked earthquake, however there
are other ways to generate templates, for example this package also contains
a simple brightness function that is designed to scan continuous seismic
data and look for impulsive energy originating from a discrete point in a
seismic velocity model.
This package is distributed under the LGPL v3.0; by using this script and the
functions contained within the EQcorrscan package you implicitly accept the
licence. For the full wording of the licence refer to the licence.txt file.
Copyright 2015 Calum Chamberlain
This file is part of EQcorrscan.
EQcorrscan is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
EQcorrscan is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with EQcorrscan. If not, see <http://www.gnu.org/licenses/>.
"""
# First we import the required modules:
import os
import glob
from obspy import read, Stream
from eqcorrscan.core import template_gen, match_filter
# Before calling these module imports for parameter files you should insert
# your own path into sys.path so that we find your parameter files.
from eqcorrscan.utils import pre_processing, Sfile_util
# Set up the default parameters - these used to be stored in parameter files
debug = 3 # High debug level should output lots to keep you informed
threshold = 8.0 # Threshold level as MAD multiplier
threshtype = 'MAD' # Threshold type, in this case Median Absolute Deviation
trig_int = 6.0 # Minimum trigger interval for one template in seconds
# Now we find the s-file we want to use to generate a template from
data_directory = os.path.join('test_data', 'tutorial_data')
sfiles = glob.glob(os.path.join(data_directory, '*L.S*'))
print(sfiles)
templates = []
template_names = []
for i, sfile in enumerate(sfiles):
    # Read in the picks from the S-file. Note: in the full case one of the
    # main functions in template_gen would be used rather than this, but for
    # the tutorial we will read in the data here. Also note that this
    # template generation is inefficient for multiple templates; if using
    # day-long data for multiple templates you would want to only read
    # the seismic data once and cut it multiple times.
    picks = Sfile_util.readpicks(sfile)
    for pick in picks:
        print(pick)
        if 'wavefiles' not in locals():
            wavefiles = glob.glob(os.path.join(data_directory,
                                               '.'.join([pick.station, '*'])))
        else:
            wavefiles += glob.glob(os.path.join(data_directory,
                                                '.'.join([pick.station, '*'])))
    wavefiles = list(set(wavefiles))
    for wavefile in wavefiles:
        print(' '.join(['Reading data from', wavefile]))
        if 'st' not in locals():
            st = read(wavefile)
        else:
            st += read(wavefile)
    st = st.merge(fill_value='interpolate')
    day = st[0].stats.starttime.date
    # Process the data with our required parameters
    for tr in st:
        tr = pre_processing.dayproc(tr, 1.0, 20.0, 3, 100.0,
                                    debug, day)
    # Use the template generation function to cut our templates
    template = template_gen._template_gen(picks, st, length=1.0, swin='all',
                                          prepick=0.1, plot=True)
    # This will generate an obspy.Stream object
    # Append this Stream to the list of templates
    templates += [template]
    template_names.append('_'.join(['tutorial', str(i)]))
    # Save template for later
    template.write(os.path.join(data_directory,
                                '_'.join([template_names[i], 'template.ms'])),
                   format='MSEED')
    # Delete excess information from memory. If you are re-using this script
    # with the same templates you should be able to comment out this loop
    # once you have generated your templates once.
    del template, st
# Extract the stations from the templates
for template in templates:
    if 'stachans' not in locals():
        stachans = [(tr.stats.station, tr.stats.channel) for tr in template]
    else:
        stachans += [(tr.stats.station, tr.stats.channel) for tr in template]
# Make this a unique list
stachans = list(set(stachans))
# Read in the continuous data for these station, channel combinations
for stachan in stachans:
    data_file = ''.join([stachan[0], '.*..*', stachan[1][-1], '.*'])
    data_file = os.path.join(data_directory, data_file)
    print(' '.join(['Reading data from:', data_file]))
    # Generate a new stream object and add to it
    if 'st' not in locals():
        st = read(data_file)
    else:
        st += read(data_file)
# Merge the data to account for miniseed files being written in chunks
# We need continuous day-long data, so data are padded if there are gaps
st = st.merge(fill_value='interpolate')
# Work out what day we are working on, required as we will pad the data to
# be day-long
day = st[0].stats.starttime.date
# Process the data in the same way as the template
for tr in st:
    tr = pre_processing.dayproc(tr, 1.0, 20.0, 3, 100.0,
                                debug, day)
# Compute detections
detections = match_filter.match_filter(template_names, templates, st,
threshold, threshtype, trig_int,
plotvar=True, cores=2, tempdir=False,
debug=debug, plot_format='pdf')
# We now have a list of detections! We can output these to a file to check later
with open('tutorial_detections.csv', 'w') as f:
    for detection in detections:
        line = ', '.join([detection.template_name, str(detection.detect_time),
                          str(detection.detect_val), str(detection.threshold),
                          str(detection.no_chans)])
        f.write(line + '\n')
        print(line)
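If you want to check those detections later, the file can be read back with plain Python; a minimal sketch assuming the comma-separated format written above:

# Re-read the saved detections for a quick summary (assumes the
# 'template_name, detect_time, detect_val, threshold, no_chans' format
# written by the loop above)
with open('tutorial_detections.csv', 'r') as f:
    for line in f:
        if not line.strip():
            continue
        template_name, detect_time, detect_val, thresh, no_chans = \
            [part.strip() for part in line.split(',')]
        print('%s at %s: correlation sum %s over %s channels'
              % (template_name, detect_time, detect_val, no_chans))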