
Make an MTH5 from LEMI data

LEMI instruments are the long-period instruments provided by IRIS. They sample once per second and write the data to text files, each at most 1 day long. Each line of a file holds one sample of every recorded channel. A typical line looks like

2020 09 30 20 21 00 23813.621   729.816 41802.042  39.76  40.48   131.013  -111.026   164.166     9.715 13.01 2204.5 3404.83911 N 10712.84475 W 12 2 0

where each column is

year month day hour minute second bx by bz temperature_e temperature_h e1 e2 e3 e4 battery elevation latitude latitude_hemisphere longitude longitude_hemisphere n_satellites gps_fix time_difference
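As a rough illustration of that layout, the sketch below splits the sample line on whitespace and pairs each value with its column name. It uses only the standard library and is not the mth5 reader. The helper names `parse_line` and `to_decimal_degrees` are hypothetical, and the conversion assumes latitude and longitude are stored as degrees and decimal minutes (DDMM.MMMMM), which the text above does not state.

```python
# Column names in the order listed above.
COLUMNS = (
    "year month day hour minute second bx by bz temperature_e temperature_h "
    "e1 e2 e3 e4 battery elevation latitude latitude_hemisphere "
    "longitude longitude_hemisphere n_satellites gps_fix time_difference"
).split()

# The sample line from above.
SAMPLE = (
    "2020 09 30 20 21 00 23813.621   729.816 41802.042  39.76  40.48   "
    "131.013  -111.026   164.166     9.715 13.01 2204.5 3404.83911 N "
    "10712.84475 W 12 2 0"
)


def parse_line(line):
    """Split one whitespace-delimited record into a {column: string} dict."""
    values = line.split()
    if len(values) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} fields, got {len(values)}")
    return dict(zip(COLUMNS, values))


def to_decimal_degrees(value, hemisphere):
    """Convert DDMM.MMMMM (an assumed format) plus hemisphere to signed degrees."""
    degrees, minutes = divmod(float(value), 100)
    sign = -1 if hemisphere in ("S", "W") else 1
    return sign * (degrees + minutes / 60)


record = parse_line(SAMPLE)
latitude = to_decimal_degrees(record["latitude"], record["latitude_hemisphere"])
longitude = to_decimal_degrees(record["longitude"], record["longitude_hemisphere"])
print(record["bx"], record["e1"], round(latitude, 4), round(longitude, 4))
```

The length check is there because a truncated line would otherwise silently shift every value into the wrong column.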

A reader is provided with the mth5 module to parse these text files into RunTS objects. The files contain little metadata so the user needs to input some important information.

Note that the output LEMI424 files are calibrated to physical units, therefore no calibration is needed.

This notebook provides an example of how to read in LEMI (.TXT) files into an MTH5.

from pathlib import Path
from mth5.clients import MakeMTH5
%matplotlib inline

LEMI Collection

We will use the LEMICollection to assemble the .txt files into a logical order by schedule action or run. The output LEMI files include all data for each channel.

IMPORTANT: LEMICollection assumes the given file path is for a single station.

Metadata: we need to input the station_id and the survey_id to provide minimal metadata when making an MTH5 file.

The LEMICollection.get_runs() will return a two level ordered dictionary (OrderedDict). The first level is keyed by station ID. These objects are in turn ordered dictionaries by run ID. Therefore you can loop over stations and runs.
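The loop over that two-level structure can be sketched with a stand-in dictionary. The station and run IDs follow this notebook, but the values are placeholder file names chosen for illustration; the actual type of each run entry returned by get_runs() is not shown here.

```python
from collections import OrderedDict

# Stand-in for the structure described above: station ID -> run ID -> run entry.
# The lists of file names are placeholders, not real get_runs() output.
runs = OrderedDict(
    mt5003=OrderedDict(
        sr1_0001=["file_a.TXT"],
        sr1_0002=["file_b.TXT", "file_c.TXT"],
    )
)

visited = []
for station, station_runs in runs.items():
    # Second level: runs for this station, in insertion order.
    for run, entry in station_runs.items():
        visited.append((station, run, len(entry)))
        print(station, run, len(entry))
```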

Note: n_samples is an estimate based on file size not the data. To get an accurate number you should read in the full file.
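To see why a size-based count is only an estimate, here is a minimal sketch that divides the file size by the length of the first line and compares the result with an actual line count. How mth5 computes its estimate is not shown in this notebook, so `estimate_n_samples` and `count_n_samples` are hypothetical helpers, and the demo file is synthetic.

```python
import tempfile
from pathlib import Path

# One record copied from the sample line shown earlier in this notebook.
RECORD = (
    "2020 09 30 20 21 00 23813.621   729.816 41802.042  39.76  40.48   "
    "131.013  -111.026   164.166     9.715 13.01 2204.5 3404.83911 N "
    "10712.84475 W 12 2 0"
)


def estimate_n_samples(path):
    """Estimate the record count as file size divided by the first line's length."""
    path = Path(path)
    with path.open("rb") as fid:
        bytes_per_record = len(fid.readline())
    if bytes_per_record == 0:
        return 0
    return path.stat().st_size // bytes_per_record


def count_n_samples(path):
    """Count records exactly by reading every line."""
    with Path(path).open("rb") as fid:
        return sum(1 for _ in fid)


with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "demo.TXT"
    demo.write_text((RECORD + "\n") * 100)
    estimated = estimate_n_samples(demo)
    counted = count_n_samples(demo)

print(estimated, counted)
```

The two numbers agree here only because every synthetic line has the same length; in a real file, lines of varying width would make the estimate drift from the true count.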

Set the LEMI path to get the data.

lemi_path = Path.home().joinpath("shared", "shortcourses", "mt", "lemi")

Build MTH5

Now that we have a logical collection of files, let’s load them into an MTH5. We will simply loop over the stations, runs, and channels in the ordered dictionary.

There are a few things to keep in mind:

  • The LEMI raw files come with very little metadata, so as a user you will have to input most of it manually.
  • The output files from a LEMI are already calibrated into physical units (nT and mV/km, I think), so there are no filters to apply to calibrate the data.
survey_id = "test"
station_id = "mt5003"
mth5_filename = MakeMTH5.from_lemi424(
    lemi_path,
    survey_id=survey_id,
    station_id=station_id,
    save_path=Path.cwd(),
    mth5_filename=f"from_lemi_{station_id}.h5")
2024-10-17T11:23:25.177782-0700 | INFO | mth5.mth5 | _initialize_file | Initialized MTH5 0.2.0 file /home/jovyan/earthscope-mt-course/notebooks/mth5/from_lemi_mt5003.h5 in mode w
2024-10-17T11:23:37.452231-0700 | WARNING | mth5.timeseries.run_ts | validate_metadata | end time of dataset 2020-10-07T17:05:46+00:00 does not match metadata end 2020-10-07T14:19:46+00:00 updating metatdata value to 2020-10-07T17:05:46+00:00
2024-10-17T11:23:39.543796-0700 | INFO | mth5.mth5 | close_mth5 | Flushing and closing /home/jovyan/earthscope-mt-course/notebooks/mth5/from_lemi_mt5003.h5

MTH5 Structure

Have a look at the MTH5 structure and make sure it looks correct.

from mth5.mth5 import MTH5
m = MTH5()
m = m.open_mth5(mth5_filename)
m
/:
    |- Group: Experiment
        |- Group: Reports
        |- Group: Standards
            --> Dataset: summary
        |- Group: Surveys
            |- Group: test
                |- Group: Filters
                    |- Group: coefficient
                    |- Group: fap
                    |- Group: fir
                    |- Group: time_delay
                    |- Group: zpk
                |- Group: Reports
                |- Group: Standards
                    --> Dataset: summary
                |- Group: Stations
                    |- Group: mt5003
                        |- Group: Fourier_Coefficients
                        |- Group: Transfer_Functions
                        |- Group: sr1_0001
                            --> Dataset: bx
                            --> Dataset: by
                            --> Dataset: bz
                            --> Dataset: e1
                            --> Dataset: e2
                            --> Dataset: temperature_e
                            --> Dataset: temperature_h
                        |- Group: sr1_0002
                            --> Dataset: bx
                            --> Dataset: by
                            --> Dataset: bz
                            --> Dataset: e1
                            --> Dataset: e2
                            --> Dataset: temperature_e
                            --> Dataset: temperature_h
                        |- Group: sr1_0003
                            --> Dataset: bx
                            --> Dataset: by
                            --> Dataset: bz
                            --> Dataset: e1
                            --> Dataset: e2
                            --> Dataset: temperature_e
                            --> Dataset: temperature_h
                        |- Group: sr1_0004
                            --> Dataset: bx
                            --> Dataset: by
                            --> Dataset: bz
                            --> Dataset: e1
                            --> Dataset: e2
                            --> Dataset: temperature_e
                            --> Dataset: temperature_h
                        |- Group: sr1_0005
                            --> Dataset: bx
                            --> Dataset: by
                            --> Dataset: bz
                            --> Dataset: e1
                            --> Dataset: e2
                            --> Dataset: temperature_e
                            --> Dataset: temperature_h
        --> Dataset: channel_summary
        --> Dataset: fc_summary
        --> Dataset: tf_summary

Channel Summary

Have a look at the channel summary and make sure everything looks good.

m.channel_summary.to_dataframe()

Run Summary

Have a look at the various runs in the file.

m.run_summary

Plot a Run

Let’s have a look at one of the runs. Run sr1_0005 was the longest, so let’s look at that one. This can take a bit of time since the run holds about 7 days of data.

Notice the diurnal variations, which are caused by temperature alone. Something caused a spike on 2020-10-06, so let’s grab just the four days starting at 2020-10-01.

run_group = m.get_run(station_id, "sr1_0005", survey=survey_id)
run_ts_object = run_group.to_runts(start="2020-10-01T00:00:00", n_samples=86400*4)
2024-10-17T11:28:22.473163-0700 | WARNING | mth5.timeseries.run_ts | validate_metadata | start time of dataset 2020-10-01T00:00:00+00:00 does not match metadata start 2020-09-30T21:14:00+00:00 updating metatdata value to 2020-10-01T00:00:00+00:00
2024-10-17T11:28:22.474580-0700 | WARNING | mth5.timeseries.run_ts | validate_metadata | end time of dataset 2020-10-04T23:59:59+00:00 does not match metadata end 2020-10-07T17:05:46+00:00 updating metatdata value to 2020-10-04T23:59:59+00:00
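The end time reported in the second warning follows directly from the start time, the sample rate, and n_samples. A small standard-library check, using the values from the call above and the run's metadata start and end times (the variable names are just for illustration):

```python
from datetime import datetime, timedelta, timezone

sample_rate = 1.0  # samples per second
start = datetime(2020, 10, 1, tzinfo=timezone.utc)
n_samples = 86400 * 4  # four days at one sample per second

# The last sample sits (n_samples - 1) sample intervals after the first.
end = start + timedelta(seconds=(n_samples - 1) / sample_rate)
print(end.isoformat())  # 2020-10-04T23:59:59+00:00

# Same arithmetic in reverse for the full run, using the metadata times.
run_start = datetime(2020, 9, 30, 21, 14, 0, tzinfo=timezone.utc)
run_end = datetime(2020, 10, 7, 17, 5, 46, tzinfo=timezone.utc)
full_run_samples = int((run_end - run_start).total_seconds() * sample_rate) + 1
full_run_days = (run_end - run_start).total_seconds() / 86400
print(full_run_samples, round(full_run_days, 2))
```

The full run works out to a little under 7 days, which is why reading only a four-day window is noticeably faster.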
print(run_group.metadata)
run:
	channels_recorded_auxiliary = ['temperature_e', 'temperature_h']
	channels_recorded_electric = ['e1', 'e2']
	channels_recorded_magnetic = ['bx', 'by', 'bz']
	data_logger.firmware.author = None
	data_logger.firmware.name = None
	data_logger.firmware.version = None
	data_logger.id = None
	data_logger.manufacturer = LEMI
	data_logger.model = LEMI424
	data_logger.power_source.voltage.end = 12.6
	data_logger.power_source.voltage.start = 13.01
	data_logger.timing_system.drift = 0.0
	data_logger.timing_system.type = GPS
	data_logger.timing_system.uncertainty = 0.0
	data_logger.type = None
	data_type = BBMT
	hdf5_reference = <HDF5 object reference>
	id = sr1_0005
	mth5_type = Run
	sample_rate = 1.0
	time_period.end = 2020-10-07T17:05:46+00:00
	time_period.start = 2020-09-30T21:14:00+00:00
run_plot = run_ts_object.plot()
<Figure size 640x480 with 7 Axes>

Close the MTH5

This is important: close the file when you are done using it. Otherwise bad things can happen if you try to open it with another program or Python interpreter.

m.close_mth5()
2024-10-17T11:28:45.501270-0700 | INFO | mth5.mth5 | close_mth5 | Flushing and closing /home/jovyan/earthscope-mt-course/notebooks/mth5/from_lemi_mt5003.h5