Make an MTH5 from LEMI data¶
LEMI instruments are the long-period instruments provided by IRIS. These instruments sample once per second (1 Hz) and output the data as text files that are at most 1 day long. Each line of a text file holds one sample of every recorded channel. A typical line looks like
2020 09 30 20 21 00 23813.621 729.816 41802.042 39.76 40.48 131.013 -111.026 164.166 9.715 13.01 2204.5 3404.83911 N 10712.84475 W 12 2 0
where each column is
year month day hour minute second bx by bz temperature_e temperature_h e1 e2 e3 e4 battery elevation latitude latitude_hemisphere longitude longitude_hemisphere n_satellites gps_fix time_difference
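As a minimal sketch of how the column list maps onto a line, the sample line above can be split on whitespace and paired with the column names. This is not the mth5 reader; `parse_lemi_line`, the `timestamp` field, and the choice of which columns are cast to integers are assumptions made for illustration.

```python
from datetime import datetime

# Column names exactly as listed above, in file order.
COLUMNS = (
    "year month day hour minute second bx by bz temperature_e temperature_h "
    "e1 e2 e3 e4 battery elevation latitude latitude_hemisphere longitude "
    "longitude_hemisphere n_satellites gps_fix time_difference"
).split()

# Assumed typing: hemispheres are text, date/time and GPS counters are integers,
# everything else is read as a float.
TEXT_COLUMNS = {"latitude_hemisphere", "longitude_hemisphere"}
INT_COLUMNS = {"year", "month", "day", "hour", "minute", "second", "n_satellites", "gps_fix"}


def parse_lemi_line(line):
    """Split one whitespace-delimited LEMI line into a dict keyed by column name."""
    values = line.split()
    if len(values) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} columns, got {len(values)}")
    record = {}
    for name, value in zip(COLUMNS, values):
        if name in TEXT_COLUMNS:
            record[name] = value
        elif name in INT_COLUMNS:
            record[name] = int(value)
        else:
            record[name] = float(value)
    # Combine the six date/time columns into one timestamp (hypothetical extra field).
    record["timestamp"] = datetime(
        record["year"], record["month"], record["day"],
        record["hour"], record["minute"], record["second"],
    )
    return record


line = (
    "2020 09 30 20 21 00 23813.621 729.816 41802.042 39.76 40.48 131.013 "
    "-111.026 164.166 9.715 13.01 2204.5 3404.83911 N 10712.84475 W 12 2 0"
)
record = parse_lemi_line(line)
print(record["timestamp"], record["bx"], record["battery"])
```

The length check catches truncated lines early, which is useful because the files carry no header to validate against.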
A reader is provided with the mth5 module to parse these text files into RunTS objects. The files contain little metadata, so the user needs to input some important information.
Note that the output LEMI424 files are calibrated to physical units, so no calibration is needed.
This notebook provides an example of how to read LEMI (.TXT) files into an MTH5.
from pathlib import Path
from mth5.clients import MakeMTH5
%matplotlib inline
LEMI Collection¶
We will use the LEMICollection to assemble the .txt files into a logical order by schedule action or run. The output LEMI files include all data for each channel.
IMPORTANT: LEMICollection assumes the given file path is for a single station.
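One way to picture how files for a single station might be assembled into runs is to sort them by start time and start a new run wherever there is a gap in the recording. The sketch below assumes that rule; the actual LEMICollection logic may differ. `group_into_runs`, the file names, and the time spans are hypothetical, and the run names simply follow the `sr1_0001` pattern seen later in this notebook.

```python
from datetime import datetime, timedelta

SAMPLE_INTERVAL = timedelta(seconds=1)  # one-second sampling, as described above


def group_into_runs(files):
    """Group (name, start, end) tuples into lists of time-contiguous files."""
    runs = []
    for entry in sorted(files, key=lambda item: item[1]):
        previous_end = runs[-1][-1][2] if runs else None
        # Continue the current run only if this file starts within one sample
        # interval of the end of the previous file.
        if previous_end is not None and entry[1] - previous_end <= SAMPLE_INTERVAL:
            runs[-1].append(entry)
        else:
            runs.append([entry])
    return runs


# Hypothetical file names and time spans, for illustration only.
files = [
    ("a.TXT", datetime(2020, 9, 30, 20, 21, 0), datetime(2020, 9, 30, 23, 59, 59)),
    ("b.TXT", datetime(2020, 10, 1, 0, 0, 0), datetime(2020, 10, 1, 23, 59, 59)),
    ("c.TXT", datetime(2020, 10, 3, 10, 0, 0), datetime(2020, 10, 3, 23, 59, 59)),
]

runs = group_into_runs(files)
run_names = {
    f"sr1_{index:04d}": [name for name, _, _ in run]
    for index, run in enumerate(runs, start=1)
}
print(run_names)
```

Here the first two files are contiguous and form one run, while the third starts after a gap and becomes a second run.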
Metadata: we need to input the station_id and the survey_id to provide minimal metadata when making an MTH5 file.
LEMICollection.get_runs() returns a two-level ordered dictionary (OrderedDict). The first level is keyed by station ID. Each value is in turn an ordered dictionary keyed by run ID, so you can loop over stations and runs.
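The nested loop over such a dictionary can be sketched with a stand-in structure. The station and run IDs below are taken from this notebook, but the plain lists of file names are placeholders for whatever each run entry really holds.

```python
from collections import OrderedDict

# Stand-in for the structure described above: station ID -> run ID -> run contents.
# The file names are hypothetical placeholders.
runs = OrderedDict(
    mt5003=OrderedDict(
        sr1_0001=["file_a.TXT"],
        sr1_0002=["file_b.TXT", "file_c.TXT"],
    )
)

visited = []
for station_id, station_runs in runs.items():
    for run_id, run_files in station_runs.items():
        visited.append((station_id, run_id, len(run_files)))
        print(f"{station_id} {run_id}: {len(run_files)} file(s)")
```

Because both levels are ordered, the runs are visited in the order they were assembled.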
Note: n_samples is an estimate based on file size, not the data. To get an accurate number, you should read in the full file.
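To see why a size-based count is only an estimate, the sketch below divides a file's size by a nominal number of bytes per line and compares the result with an actual line count. It assumes every line has the same width as the sample line shown earlier; the exact method LEMICollection uses is not shown here and may differ. The temporary file is a stand-in for a real LEMI file.

```python
import os
import tempfile

line = (
    "2020 09 30 20 21 00 23813.621 729.816 41802.042 39.76 40.48 131.013 "
    "-111.026 164.166 9.715 13.01 2204.5 3404.83911 N 10712.84475 W 12 2 0\n"
)

# Write a small stand-in file: 99 full-width lines plus one shorter line.
with tempfile.NamedTemporaryFile("w", suffix=".TXT", delete=False, newline="") as f:
    f.write(line * 99)
    f.write(line.replace("729.816", "0.0"))
    path = f.name

bytes_per_line = len(line.encode("ascii"))
# Fast estimate: uses only the file size, never reads the data.
estimated = os.path.getsize(path) // bytes_per_line

# Accurate count: reads every line.
with open(path) as f:
    actual = sum(1 for _ in f)

os.remove(path)
print(estimated, actual)
```

A single narrower line is enough to make the estimate fall one sample short of the true count.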
Set the LEMI path to get the data.¶
lemi_path = Path().home().joinpath("shared", "shortcourses", "mt", "lemi")
Build MTH5¶
Now that we have a logical collection of files, let's load them into an MTH5. The MakeMTH5.from_lemi424 call below takes care of looping over the stations, runs, and channels in the ordered dictionary.
There are a few things to keep in mind:
- The LEMI raw files come with very little metadata, so as a user you will have to manually input most of it.
- The output files from a LEMI are already calibrated to physical units (nT for the magnetic channels and, as far as the author knows, mV/km for the electric channels), so there are no filters to apply to calibrate the data.
survey_id = "test"
station_id = "mt5003"
mth5_filename = MakeMTH5.from_lemi424(
    lemi_path,
    survey_id=survey_id,
    station_id=station_id,
    save_path=Path().cwd(),
    mth5_filename=f"from_lemi_{station_id}.h5",
)
2024-10-17T11:23:25.177782-0700 | INFO | mth5.mth5 | _initialize_file | Initialized MTH5 0.2.0 file /home/jovyan/earthscope-mt-course/notebooks/mth5/from_lemi_mt5003.h5 in mode w
2024-10-17T11:23:37.452231-0700 | WARNING | mth5.timeseries.run_ts | validate_metadata | end time of dataset 2020-10-07T17:05:46+00:00 does not match metadata end 2020-10-07T14:19:46+00:00 updating metatdata value to 2020-10-07T17:05:46+00:00
2024-10-17T11:23:39.543796-0700 | INFO | mth5.mth5 | close_mth5 | Flushing and closing /home/jovyan/earthscope-mt-course/notebooks/mth5/from_lemi_mt5003.h5
MTH5 Structure¶
Have a look at the MTH5 structure and make sure it looks correct.
from mth5.mth5 import MTH5
m = MTH5()
m = m.open_mth5(mth5_filename)
m
/:
====================
|- Group: Experiment
--------------------
|- Group: Reports
-----------------
|- Group: Standards
-------------------
--> Dataset: summary
......................
|- Group: Surveys
-----------------
|- Group: test
--------------
|- Group: Filters
-----------------
|- Group: coefficient
---------------------
|- Group: fap
-------------
|- Group: fir
-------------
|- Group: time_delay
--------------------
|- Group: zpk
-------------
|- Group: Reports
-----------------
|- Group: Standards
-------------------
--> Dataset: summary
......................
|- Group: Stations
------------------
|- Group: mt5003
----------------
|- Group: Fourier_Coefficients
------------------------------
|- Group: Transfer_Functions
----------------------------
|- Group: sr1_0001
------------------
--> Dataset: bx
.................
--> Dataset: by
.................
--> Dataset: bz
.................
--> Dataset: e1
.................
--> Dataset: e2
.................
--> Dataset: temperature_e
............................
--> Dataset: temperature_h
............................
|- Group: sr1_0002
------------------
--> Dataset: bx
.................
--> Dataset: by
.................
--> Dataset: bz
.................
--> Dataset: e1
.................
--> Dataset: e2
.................
--> Dataset: temperature_e
............................
--> Dataset: temperature_h
............................
|- Group: sr1_0003
------------------
--> Dataset: bx
.................
--> Dataset: by
.................
--> Dataset: bz
.................
--> Dataset: e1
.................
--> Dataset: e2
.................
--> Dataset: temperature_e
............................
--> Dataset: temperature_h
............................
|- Group: sr1_0004
------------------
--> Dataset: bx
.................
--> Dataset: by
.................
--> Dataset: bz
.................
--> Dataset: e1
.................
--> Dataset: e2
.................
--> Dataset: temperature_e
............................
--> Dataset: temperature_h
............................
|- Group: sr1_0005
------------------
--> Dataset: bx
.................
--> Dataset: by
.................
--> Dataset: bz
.................
--> Dataset: e1
.................
--> Dataset: e2
.................
--> Dataset: temperature_e
............................
--> Dataset: temperature_h
............................
--> Dataset: channel_summary
..............................
--> Dataset: fc_summary
.........................
--> Dataset: tf_summary
.........................
Channel Summary¶
Have a look at the channel summary and make sure everything looks good.
m.channel_summary.to_dataframe()
Run Summary¶
Have a look at the various runs in the file.
m.run_summary
Plot a Run¶
Let’s have a look at one of the runs. Run sr1_0005 was the longest, so let’s look at that one. This can take a bit of time since it holds about 7 days of data.
Notice the diurnal variations caused by just temperature. It looks like something caused a spike on 2020-10-06, so let’s grab just the first few days.
run_group = m.get_run(station_id, "sr1_0005", survey=survey_id)
run_ts_object = run_group.to_runts(start="2020-10-01T00:00:00", n_samples=86400*4)
2024-10-17T11:28:22.473163-0700 | WARNING | mth5.timeseries.run_ts | validate_metadata | start time of dataset 2020-10-01T00:00:00+00:00 does not match metadata start 2020-09-30T21:14:00+00:00 updating metatdata value to 2020-10-01T00:00:00+00:00
2024-10-17T11:28:22.474580-0700 | WARNING | mth5.timeseries.run_ts | validate_metadata | end time of dataset 2020-10-04T23:59:59+00:00 does not match metadata end 2020-10-07T17:05:46+00:00 updating metatdata value to 2020-10-04T23:59:59+00:00
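The end time in the warning above follows directly from the requested window. As a quick check, assuming one sample per second, the last of `n_samples` samples sits `n_samples - 1` sample intervals after the start:

```python
from datetime import datetime, timedelta

sample_rate = 1.0  # samples per second for LEMI data
start = datetime.fromisoformat("2020-10-01T00:00:00")
n_samples = 86400 * 4  # four days of one-second samples

# The last sample sits (n_samples - 1) sample intervals after the first one.
end = start + timedelta(seconds=(n_samples - 1) / sample_rate)
print(n_samples, end.isoformat())
```

This gives 345600 samples ending at 2020-10-04T23:59:59, which matches the dataset end time reported in the log.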
print(run_group.metadata)
run:
channels_recorded_auxiliary = ['temperature_e', 'temperature_h']
channels_recorded_electric = ['e1', 'e2']
channels_recorded_magnetic = ['bx', 'by', 'bz']
data_logger.firmware.author = None
data_logger.firmware.name = None
data_logger.firmware.version = None
data_logger.id = None
data_logger.manufacturer = LEMI
data_logger.model = LEMI424
data_logger.power_source.voltage.end = 12.6
data_logger.power_source.voltage.start = 13.01
data_logger.timing_system.drift = 0.0
data_logger.timing_system.type = GPS
data_logger.timing_system.uncertainty = 0.0
data_logger.type = None
data_type = BBMT
hdf5_reference = <HDF5 object reference>
id = sr1_0005
mth5_type = Run
sample_rate = 1.0
time_period.end = 2020-10-07T17:05:46+00:00
time_period.start = 2020-09-30T21:14:00+00:00
run_plot = run_ts_object.plot()

Close the MTH5¶
This is important: close the file when you are done using it. Otherwise the file can stay locked, or even become corrupted, if you try to open it with another program or Python interpreter.
m.close_mth5()
2024-10-17T11:28:45.501270-0700 | INFO | mth5.mth5 | close_mth5 | Flushing and closing /home/jovyan/earthscope-mt-course/notebooks/mth5/from_lemi_mt5003.h5
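One common way to make sure the file is closed even when something fails midway is a try/finally block. The sketch below shows the pattern with a hypothetical stand-in class, `FakeMTH5`, that mimics only the `open_mth5` and `close_mth5` calls used in this notebook; it is not the real MTH5 object.

```python
class FakeMTH5:
    """Hypothetical stand-in that only mimics the open/close calls used above."""

    def __init__(self):
        self.is_open = False

    def open_mth5(self, filename):
        self.is_open = True
        return self

    def close_mth5(self):
        self.is_open = False


m = FakeMTH5().open_mth5("from_lemi_mt5003.h5")
try:
    # Work with the file here; this exception simulates a failure midway.
    raise RuntimeError("something went wrong while reading")
except RuntimeError as error:
    print(f"caught: {error}")
finally:
    # Runs whether or not an error occurred, so the file is never left open.
    m.close_mth5()

print(m.is_open)
```

With the real object, the same structure would wrap the channel summary, run summary, and plotting steps, with `m.close_mth5()` in the `finally` clause.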