Processing Configuration¶
The processing_config (“config”) and the kernel dataset are the key inputs to the processing pipeline. The processing config is based on the mt_metadata.base.Base
class, which means it is a container with a JSON or dictionary representation.
The purpose of the config is to encapsulate all the parameters required for processing. There are many parameters, and effort has been made to select “reasonable” default values so that users need not worry about all of them if they don’t want to.
The ProcessingConfig is expected to evolve with aurora as new functionalities become available. This is one reason why such a generic data structure was selected. In this tutorial, we will use the synthetic dataset example to show some of the features of the config object.
Hopefully, it will be fairly easy to add other parameters to the config, such as:
- coherence sorting
- polarization sorting
- ARMA prewhitening
- Other tools from the community
There are two main ways one would normally build the object:
- Use the ConfigCreator class
- Edit existing config.json files
But one can also initialize a Processing object directly, which is what is done inside ConfigCreator.
from mt_metadata.transfer_functions.processing.aurora import Processing
p = Processing()
p
{
"processing": {
"channel_nomenclature.ex": "ex",
"channel_nomenclature.ey": "ey",
"channel_nomenclature.hx": "hx",
"channel_nomenclature.hy": "hy",
"channel_nomenclature.hz": "hz",
"decimations": [],
"id": null,
"stations.local.id": null,
"stations.local.mth5_path": null,
"stations.local.remote": false,
"stations.local.runs": [],
"stations.remote": []
}
}
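The printed representation above uses flattened, dot-notation keys. As a minimal, stdlib-only sketch (this mirrors the idea of the JSON/dict representation, not mt_metadata’s actual internals), here is how such dot-keys map onto a nested dictionary:

```python
# Illustrative only: unflatten dot-notation keys like those printed by
# the Processing object into a nested dictionary.
def unflatten(flat: dict) -> dict:
    nested = {}
    for dotted_key, value in flat.items():
        parts = dotted_key.split(".")
        node = nested
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value
    return nested

flat = {
    "channel_nomenclature.ex": "ex",
    "stations.local.id": None,
    "stations.local.remote": False,
}
print(unflatten(flat))
# {'channel_nomenclature': {'ex': 'ex'},
#  'stations': {'local': {'id': None, 'remote': False}}}
```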
Using ConfigCreator¶
from aurora.config import BANDS_DEFAULT_FILE
from aurora.config.config_creator import ConfigCreator
The config creator takes no arguments to initialize. Given a KernelDataset, it generates a processing object populated with station and run information, with default settings applied to the remaining processing parameters.
cc = ConfigCreator()
The ConfigCreator class generates a processing config with default arguments when it is provided with a KernelDataset.
Example of making a KernelDataset from an mth5¶
from aurora.test_utils.synthetic.paths import SyntheticTestPaths
from mtpy.processing import RunSummary, KernelDataset
synthetic_test_paths = SyntheticTestPaths()
MTH5_PATH = synthetic_test_paths.mth5_path
Make an example file to work with
mth5_path = MTH5_PATH.joinpath("test12rr.h5")
if not mth5_path.exists():
    synthetic_test_paths.mkdirs()
    from mth5.data.make_mth5_from_asc import create_test12rr_h5
    create_test12rr_h5(target_folder=MTH5_PATH)
run_summary = RunSummary()
run_summary.from_mth5s([mth5_path,])
run_summary.df
kernel_dataset = KernelDataset()
kernel_dataset.from_run_summary(run_summary, "test1", "test2")
kernel_dataset.df
kernel_dataset.mini_summary
Create Config from KernelDataset¶
config = cc.create_from_kernel_dataset(kernel_dataset)
24:10:14T13:39:26 | INFO | line:108 |aurora.config.config_creator | determine_band_specification_style | Bands not defined; setting to EMTF BANDS_DEFAULT_FILE
config
You can see the entire config by executing the cell below, or you can cut and paste the JSON into a JSON editor.

You can also transform the processing object to a json string
json_string = config.to_json()
json_string
Which can be saved:
with open("config.json", "w") as fid:
fid.write(json_string)
Default Parameters¶
The default config parameters are listed below:
input_channels = ["hx", "hy"],
output_channels = ["hz", "ex", "ey"],
estimator = None,
emtf_band_file = BANDS_DEFAULT_FILE,
What can the user change with the Config?¶
- Channel Nomenclature
- for example ex, ey may be called e1, e2 in the mth5
- This is handled by passing a channel_nomenclature keyword argument.
- Examples of systems that might need this are LEMI and Phoenix
- Windowing Parameters
- Window shape (family)
- Window length
- Sliding window overlap
- Clock-Zero (optional)
- Choice of Stations
- This is currently done via KernelDataset
- Scale Factors for Individual Channels
- allows applying simple, frequency-independent gain corrections to individual channels
- Frequency Bands
- Group the Fourier coefficients into bands to be processed together and averaged for a TF estimate
- currently via an emtf-style band_setup file
- Number of Decimation Levels
- currently via an emtf-style band_setup file
- Regression Estimator Engine
- Currently the only choices are RME (regression M-estimate) and RME_RR (remote-reference regression M-estimate)
- Regression Parameters
- Maximum number of iterations
- Maximum number of redescending iterations
- Minimum number of frequency cycles
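To make the windowing parameters concrete, here is a minimal, stdlib-only sketch of how a window length and overlap determine the start samples of a sliding window. The function name and the example numbers are illustrative, not aurora’s API:

```python
def window_starts(num_samples: int, window_length: int, overlap: int):
    """Return start indices of full windows for a sliding window.

    The hop (advance) between consecutive windows is the window length
    minus the overlap; only windows that fit entirely in the data count.
    """
    advance = window_length - overlap
    return list(range(0, num_samples - window_length + 1, advance))

# e.g. 1000 samples, 128-sample window, 32-sample overlap -> 96-sample hop
starts = window_starts(1000, 128, 32)
print(starts[:4])   # [0, 96, 192, 288]
print(len(starts))  # 10
```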
EMTF Band Setup File¶
The frequency bands will eventually be specifiable in a variety of ways, but currently aurora supports only specification of bands by explicit construction or via EMTF “band setup” files.
BANDS_DEFAULT_FILE
PosixPath('/home/kkappler/software/irismt/aurora/aurora/config/emtf_band_setup/bs_test.cfg')
Here is the content of a typical EMTF band setup file:
25
1 25 30
1 20 24
1 16 19
1 13 15
1 10 12
1 8 9
1 6 7
1 5 5
2 14 17
2 11 13
2 9 10
2 7 8
2 6 6
2 5 5
3 14 17
3 11 13
3 9 10
3 7 8
3 6 6
3 5 5
4 18 22
4 14 17
4 10 13
4 7 9
4 5 6
These legacy files have the following significance: the first line, 25, indicates the number of bands, and 25 lines follow, one line per frequency band.
Each line comprises three numbers:
decimation_level, first_FC_index, last_FC_index
where “FC” stands for Fourier coefficient
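A short, stdlib-only parser for this format (a sketch for illustration, not aurora’s actual reader):

```python
def parse_band_setup(text: str):
    """Parse EMTF band setup text into (decimation_level, lo, hi) tuples.

    The first non-empty line holds the band count; each following line
    holds decimation_level, first_FC_index, last_FC_index.
    """
    lines = [ln.split() for ln in text.strip().splitlines() if ln.strip()]
    num_bands = int(lines[0][0])
    bands = [tuple(int(x) for x in ln) for ln in lines[1:]]
    assert len(bands) == num_bands, "band count must match header"
    return bands

sample = """3
1 25 30
1 20 24
2 14 17
"""
print(parse_band_setup(sample))
# [(1, 25, 30), (1, 20, 24), (2, 14, 17)]
```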
The decimation factor applied at each level was controlled in EMTF by a separate file, called decset.cfg. In the old EMTF codes, this controlled the window length, overlap, the decimation factor, and the corners of the anti-alias filter applied before downsampling.
The decimation factor in EMTF was almost always 4, and the default behaviour of the ConfigCreator is to assume a decimation factor of 4 at each level, but this can be changed manually.
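To see what a decimation factor of 4 implies, the sketch below converts an FC index to a frequency at each decimation level. The base sample rate (1 Hz) and window length (128 samples) are illustrative assumptions for intuition only, not values read from any aurora config:

```python
def fc_frequency(decimation_level: int, fc_index: int,
                 base_sample_rate: float = 1.0,
                 window_length: int = 128,
                 decimation_factor: int = 4) -> float:
    """Frequency (Hz) of Fourier coefficient `fc_index` at a decimation level.

    Level 1 is the undecimated data; each subsequent level divides the
    sample rate by `decimation_factor`.
    """
    sample_rate = base_sample_rate / decimation_factor ** (decimation_level - 1)
    return fc_index * sample_rate / window_length

# Band "1 25 30" spans FC indices 25..30 at level 1 (highest frequencies):
print(fc_frequency(1, 25), fc_frequency(1, 30))
# Band "4 5 6" at level 4 sits at much lower frequency:
print(fc_frequency(4, 5), fc_frequency(4, 6))
```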