
Data Querying

Now that we have an MTCollection stored in an MTH5 file, we can work with the collection of transfer functions. We can query for transfer functions with the same survey name, set a bounding box to get all transfer functions in a given area, and so on. These queries are made against the Pandas DataFrame provided by MTCollection.

MTCollection provides a few dataframes to use.

  • master_dataframe which never changes and contains all the transfer functions in the file.
  • working_dataframe which is the dataframe that contains only the stations you have queried for from the master_dataframe. By default it is initially equal to the master_dataframe.
  • dataframe which is an alias for working_dataframe.
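The queries described above are ordinary pandas selections. The following is a minimal sketch on a toy table standing in for master_dataframe; the station names, survey names, and coordinates are made up, and the column names (station, survey, latitude, longitude) are assumed to match the summary table.

```python
import pandas as pd

# Toy stand-in for mc.master_dataframe; every value here is made up.
master = pd.DataFrame(
    {
        "station": ["YNP01", "YNP02", "WYI17", "MTC20"],
        "survey": ["YNP", "YNP", "Transportable_Array", "Transportable_Array"],
        "latitude": [44.6, 44.9, 44.4, 45.6],
        "longitude": [-110.5, -110.9, -109.8, -111.8],
    }
)

# Select every row whose survey name matches exactly.
same_survey = master[master["survey"] == "YNP"]

# Select rows whose station name contains a substring.
name_match = master[master["station"].str.contains("WYI")]

print(same_survey["station"].tolist())
print(name_match["station"].tolist())
```

Both selections return new DataFrames and leave the original table untouched, which is the same relationship the working_dataframe has to the master_dataframe.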

MTCollection vs MTData

MTCollection is meant to be the archive and database where the transfer functions are stored. This object is stored on disk as an MTH5 file.

MTData is meant to be the working object for analyzing, plotting, and creating input files for modeling. If you want to store the manipulated transfer functions, you can put them back into the MTCollection. This object is stored in RAM.

from pathlib import Path
from mtpy import MTCollection
%matplotlib inline

Open MTCollection

In the previous example we created an MTH5 file from existing Yellowstone data. Let’s open that file here so we can query it.

mc = MTCollection()
mc.open_collection(Path.cwd().joinpath("yellowstone_mt_collection.h5"))

Make sure that everything is there as expected.

mc.dataframe

MTCollection Working Data Frame

The MTH5 includes a summary of all the transfer functions in the file. This summary is a property, so each time it is called it provides current information. MTCollection uses the tf_summary of MTH5 and calls it MTCollection.master_dataframe. If you have a file with many transfer functions that cover a wide area, you may not always want to use all the stations for plotting. In that case you can set MTCollection.working_dataframe to a subset of the master_dataframe. For convenience, MTCollection also has a property simply called dataframe, which returns the working_dataframe if one has been set and the master_dataframe otherwise. We will see examples of this later.

  • MTCollection.master_dataframe is a property that calls MTH5.tf_summary. Because it is a property it is updated in real time, providing a summary of all transfer functions in the collection or MTH5.
  • MTCollection.working_dataframe is an attribute that is a subset of the master_dataframe. A user can set the working_dataframe by querying the master_dataframe. Once the working_dataframe is set this will be used by the methods of MTCollection.
  • MTCollection.dataframe is a property that returns the working_dataframe if it has been set; otherwise the master_dataframe is returned. This is a convenience property for the user.
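The fallback behaviour of the three views can be sketched in plain Python. This is an illustration only, not the library's implementation: CollectionSketch is a hypothetical class, and lists of dictionaries stand in for the DataFrames.

```python
class CollectionSketch:
    """Hypothetical stand-in that mirrors the three views described above."""

    def __init__(self, rows):
        self._rows = list(rows)        # pretend this is the summary stored in the file
        self.working_dataframe = None  # no subset selected yet

    @property
    def master_dataframe(self):
        # Rebuilt on every access, so it always reflects the current contents.
        return list(self._rows)

    @property
    def dataframe(self):
        # Fall back to the full table until a working subset has been set.
        if self.working_dataframe is None:
            return self.master_dataframe
        return self.working_dataframe


sketch = CollectionSketch([{"station": "YNP01"}, {"station": "WYI17"}])
before = sketch.dataframe  # nothing set yet, so this is the full table

# Setting a working subset changes what `dataframe` returns.
sketch.working_dataframe = [
    row for row in sketch.master_dataframe if "YNP" in row["station"]
]
after = sketch.dataframe

print(len(before), len(after), len(sketch.master_dataframe))
```

The master view keeps its full length after the subset is set, which is the point of keeping the two views separate.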
mc.working_dataframe = mc.master_dataframe[mc.master_dataframe.station.str.contains("YNP")]
mc.dataframe

Use a Bounding Box to set Working DataFrame

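We can also set the working_dataframe by applying a bounding box to the master_dataframe. This can be done with the method apply_bbox. In the call below the first two arguments are the longitude bounds and the last two are the latitude bounds.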

mc.apply_bbox(-112, -109.5, 44, 45.75)
mc.dataframe
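In plain pandas terms, a bounding-box query amounts to two range checks combined with a logical AND. The sketch below shows that idea on made-up coordinates; bbox_filter is a hypothetical helper, the latitude and longitude column names are assumptions, and including the box edges is a choice made here that may differ from what apply_bbox does.

```python
import pandas as pd


def bbox_filter(df, lon_min, lon_max, lat_min, lat_max):
    """Return the rows whose coordinates fall inside the box (edges included)."""
    inside = df["longitude"].between(lon_min, lon_max) & df["latitude"].between(
        lat_min, lat_max
    )
    return df[inside]


# Made-up stations: one inside, one too far west, one too far north, one on a corner.
stations = pd.DataFrame(
    {
        "station": ["S1", "S2", "S3", "S4"],
        "longitude": [-110.5, -113.0, -110.0, -109.5],
        "latitude": [44.6, 44.5, 46.0, 44.0],
    }
)

subset = bbox_filter(stations, -112, -109.5, 44, 45.75)
print(subset["station"].tolist())
```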

MTData Object

Now that we have queried the data to just the stations we want, let’s convert those stations to an MTData object so we can plot and analyze the data. In the next series of notebooks we will demonstrate how to plot and analyze the data from the MTData object.

mt_data = mc.to_mt_data()

Close Collection

Remember it is important to close the collection when we are done so there are no open instances of the H5 file.

mc.close_collection()
24:10:17T12:49:22 | INFO | line:777 |mth5.mth5 | close_mth5 | Flushing and closing /home/jovyan/earthscope-mt-course/notebooks/mtpy/yellowstone_mt_collection.h5
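One way to make sure the file is always closed, even if an error occurs while working with it, is to wrap the open and close calls in a small context manager. The sketch below is an assumption-laden illustration: opened_collection is a hypothetical helper that relies only on the open_collection and close_collection methods used in this notebook, and FakeCollection exists only so the example runs without a real file.

```python
from contextlib import contextmanager


@contextmanager
def opened_collection(collection, filename):
    """Open `collection` on `filename` and always close it afterwards."""
    collection.open_collection(filename)
    try:
        yield collection
    finally:
        # Runs on normal exit and when an exception is raised inside the block.
        collection.close_collection()


class FakeCollection:
    """Stand-in used only to exercise the helper without touching a file."""

    def __init__(self):
        self.is_open = False

    def open_collection(self, filename):
        self.is_open = True

    def close_collection(self):
        self.is_open = False


fake = FakeCollection()
try:
    with opened_collection(fake, "example.h5") as mc_like:
        was_open = mc_like.is_open
        raise RuntimeError("simulated failure while working")
except RuntimeError:
    pass

print(was_open, fake.is_open)
```

With a real collection, the same helper would be used as `with opened_collection(MTCollection(), path) as mc:`, assuming the two methods behave as they do in the cells above.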