Processing Configuration¶
The processing_config (“config”) and the kernel dataset are the key inputs to the processing pipeline. The processing config is based on the mt_metadata.base.Base
class, which means it is a container with a JSON or dictionary representation.
The purpose of the config is to encapsulate all the parameters required for processing. There are many parameters, and effort has been made to select “reasonable” default values so that users need not worry about all of them if they don’t want to.
The ProcessingConfig is expected to evolve with aurora as new functionalities become available. This is one reason why such a generic data structure was selected. In this tutorial, we will use the synthetic dataset example to show some of the features of the config object.
Hopefully, it will be fairly easy to add other parameters to the config, such as:
- coherence sorting
- polarization sorting
- ARMA prewhitening
- Other tools from the community
There are two main ways one would normally build the object:
- Use the ConfigCreator class
- Edit existing config.json files
But one can also initialize a Processing object directly, which is what is done inside ConfigCreator.
from mt_metadata.transfer_functions.processing.aurora import Processing
p = Processing()
p
{
"processing": {
"channel_nomenclature.ex": "ex",
"channel_nomenclature.ey": "ey",
"channel_nomenclature.hx": "hx",
"channel_nomenclature.hy": "hy",
"channel_nomenclature.hz": "hz",
"decimations": [],
"id": null,
"stations.local.id": null,
"stations.local.mth5_path": null,
"stations.local.remote": false,
"stations.local.runs": [],
"stations.remote": []
}
}
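The printed representation above uses flattened, dot-notation keys. As a minimal, stdlib-only sketch (this mirrors the idea of the JSON/dict representation, not mt_metadata’s actual internals), here is how such dot-keys map onto a nested dictionary:

```python
# Illustrative only: unflatten dot-notation keys like those printed by
# the Processing object into a nested dictionary.
def unflatten(flat: dict) -> dict:
    nested = {}
    for dotted_key, value in flat.items():
        parts = dotted_key.split(".")
        node = nested
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value
    return nested

flat = {
    "channel_nomenclature.ex": "ex",
    "stations.local.id": None,
    "stations.local.remote": False,
}
print(unflatten(flat))
# {'channel_nomenclature': {'ex': 'ex'},
#  'stations': {'local': {'id': None, 'remote': False}}}
```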
Using ConfigCreator¶
from aurora.config import BANDS_DEFAULT_FILE
from aurora.config.config_creator import ConfigCreator
The config creator takes no arguments to initialize. Given a KernelDataset, it generates a processing object populated with station and run information, with default settings applied to the remaining processing parameters.
cc = ConfigCreator()
The ConfigCreator class generates a processing config with default arguments when it is provided with a KernelDataset.
Example of making a KernelDataset from an mth5¶
from aurora.test_utils.synthetic.paths import SyntheticTestPaths
from mtpy.processing import RunSummary, KernelDataset
synthetic_test_paths = SyntheticTestPaths()
MTH5_PATH = synthetic_test_paths.mth5_path
Make an example file to work with
mth5_path = MTH5_PATH.joinpath("test12rr.h5")
if not mth5_path.exists():
    synthetic_test_paths.mkdirs()
    from mth5.data.make_mth5_from_asc import create_test12rr_h5
    create_test12rr_h5(target_folder=MTH5_PATH)
run_summary = RunSummary()
run_summary.from_mth5s([mth5_path,])
run_summary.df
kernel_dataset = KernelDataset()
kernel_dataset.from_run_summary(run_summary, "test1", "test2")
kernel_dataset.df
kernel_dataset.mini_summary
Create Config from KernelDataset¶
config = cc.create_from_kernel_dataset(kernel_dataset)
24:10:14T13:39:26 | INFO | line:108 |aurora.config.config_creator | determine_band_specification_style | Bands not defined; setting to EMTF BANDS_DEFAULT_FILE
config
You can see the entire config by executing the cell below, or you can cut and paste the JSON into a JSON editor.

You can also transform the processing object to a json string
json_string = config.to_json()
json_string
Which can be saved:
with open("config.json", "w") as fid:
fid.write(json_string)
Default Parameters¶
The default config parameters are listed below:
input_channels = ["hx", "hy"],
output_channels = ["hz", "ex", "ey"],
estimator = None,
emtf_band_file = BANDS_DEFAULT_FILE,
What can the user change with the Config?¶
- Channel Nomenclature
- for example ex, ey may be called e1, e2 in the mth5
- This is handled by passing a channel_nomenclature keyword argument.
- Examples of systems that might need this are LEMI and Phoenix
- Windowing Parameters
- Window shape (family)
- Window length
- Sliding window overlap
- Clock-Zero (optional)
- Choice of Stations
- This is currently done via KernelDataset
- Scale Factors for Individual Channels
- allows applying simple, frequency-independent gain corrections to individual channels
- Frequency Bands
- Group the Fourier coefficients into bands to be processed together and averaged for a TF estimate
- currently via an emtf-style band_setup file
- Number of Decimation Levels
- currently via an emtf-style band_setup file
- Regression Estimator Engine
- Currently the only choices are RME (regression M-estimate) and RME_RR (remote-reference regression M-estimate)
- Regression Parameters
- Maximum number of iterations
- Maximum number of redescending iterations
- Minimum number of frequency cycles
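To make the windowing parameters concrete, here is a minimal, stdlib-only sketch of how a window length and overlap determine the start samples of a sliding window. The function name and the example numbers are illustrative, not aurora’s API:

```python
def window_starts(num_samples: int, window_length: int, overlap: int):
    """Return start indices of full windows for a sliding window.

    The hop (advance) between consecutive windows is the window length
    minus the overlap; only windows that fit entirely in the data count.
    """
    advance = window_length - overlap
    return list(range(0, num_samples - window_length + 1, advance))

# e.g. 1000 samples, 128-sample window, 32-sample overlap -> 96-sample hop
starts = window_starts(1000, 128, 32)
print(starts[:4])   # [0, 96, 192, 288]
print(len(starts))  # 10
```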
EMTF Band Setup File¶
The frequency bands will eventually be specifiable in a variety of ways, but currently aurora supports only specification of bands by explicit construction or via EMTF “band setup” files.
BANDS_DEFAULT_FILE
PosixPath('/home/kkappler/software/irismt/aurora/aurora/config/emtf_band_setup/bs_test.cfg')
Here is the content of a typical EMTF band setup file:
25
1 25 30
1 20 24
1 16 19
1 13 15
1 10 12
1 8 9
1 6 7
1 5 5
2 14 17
2 11 13
2 9 10
2 7 8
2 6 6
2 5 5
3 14 17
3 11 13
3 9 10
3 7 8
3 6 6
3 5 5
4 18 22
4 14 17
4 10 13
4 7 9
4 5 6
These legacy files have the following significance: the first line, 25, indicates the number of bands, and 25 lines follow, one line per frequency band.
Each line comprises three numbers:
decimation_level, first_FC_index, last_FC_index
where “FC” stands for Fourier coefficient
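A short, stdlib-only parser for this format (a sketch for illustration, not aurora’s actual reader):

```python
def parse_band_setup(text: str):
    """Parse EMTF band setup text into (decimation_level, lo, hi) tuples.

    The first non-empty line holds the band count; each following line
    holds decimation_level, first_FC_index, last_FC_index.
    """
    lines = [ln.split() for ln in text.strip().splitlines() if ln.strip()]
    num_bands = int(lines[0][0])
    bands = [tuple(int(x) for x in ln) for ln in lines[1:]]
    assert len(bands) == num_bands, "band count must match header"
    return bands

sample = """3
1 25 30
1 20 24
2 14 17
"""
print(parse_band_setup(sample))
# [(1, 25, 30), (1, 20, 24), (2, 14, 17)]
```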
The decimation factor applied at each level was controlled in EMTF by a separate file, called decset.cfg. In the old EMTF codes, this controlled the window length, overlap, the decimation factor, and the corners of the anti-alias filter applied before downsampling.
The decimation factor in EMTF was almost always 4, and the default behaviour of the ConfigCreator is to assume a decimation factor of 4 at each level, but this can be changed manually.
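To see what a decimation factor of 4 implies, the sketch below converts an FC index to a frequency at each decimation level. The base sample rate (1 Hz) and window length (128 samples) are illustrative assumptions for intuition only, not values read from any aurora config:

```python
def fc_frequency(decimation_level: int, fc_index: int,
                 base_sample_rate: float = 1.0,
                 window_length: int = 128,
                 decimation_factor: int = 4) -> float:
    """Frequency (Hz) of Fourier coefficient `fc_index` at a decimation level.

    Level 1 is the undecimated data; each subsequent level divides the
    sample rate by `decimation_factor`.
    """
    sample_rate = base_sample_rate / decimation_factor ** (decimation_level - 1)
    return fc_index * sample_rate / window_length

# Band "1 25 30" spans FC indices 25..30 at level 1 (highest frequencies):
print(fc_frequency(1, 25), fc_frequency(1, 30))
# Band "4 5 6" at level 4 sits at much lower frequency:
print(fc_frequency(4, 5), fc_frequency(4, 6))
```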