Configure file
The configure.yml file is a crucial component of your seismic inversion and forward modeling process. It contains essential parameter settings that determine how your seismic simulations and inversions are carried out. In this document, we will describe the key sections and parameters within the configure.yml file.
Format of configure file
The configure.yml file is written in YAML. Seistorch automatically parses it into a Python dictionary, making it easy to access the configuration parameters within your Seistorch scripts. You can read the .yml file yourself with the following code:
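from yaml import load
from yaml import CLoader as Loader  # C-accelerated loader; requires PyYAML built with LibYAML

# Load the configure file
config_path = r'Your_YML_FILE_PATH.yml'
with open(config_path, 'r') as ymlfile:
    # cfg is a nested Python dictionary
    cfg = load(ymlfile, Loader=Loader)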
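Because the parsed configuration is an ordinary nested dictionary, a parameter that this document writes with arrows, such as training->implicit->use, corresponds to a chain of key lookups. The sketch below illustrates this with a hand-written dictionary standing in for the parsed file. The helper get_param is hypothetical and is not part of Seistorch.

```python
from functools import reduce

# A small dictionary standing in for the result of parsing configure.yml.
# The values are illustrative only.
cfg = {
    "training": {"implicit": {"use": False}, "minibatch": True},
    "geom": {"multiple": False},
}

def get_param(config, path, sep="->"):
    """Look up a nested value using the arrow notation of this document.

    Hypothetical helper, not part of Seistorch.
    Raises KeyError if any key along the path is missing.
    """
    return reduce(lambda node, key: node[key], path.split(sep), config)

# Equivalent to cfg["training"]["implicit"]["use"]
use_implicit = get_param(cfg, "training->implicit->use")
minibatch = get_param(cfg, "training->minibatch")
```

A missing key raises KeyError, which surfaces typos in parameter names early.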
Parameters in Configure File
This section provides an overview of key parameters in the configuration file that are related to modeling and inversion.
Boolean parameters
For boolean parameters, valid values are:
true: Represents a positive or affirmative condition.
false: Represents a negative condition.
Example:
training:
  implicit:
    use: false
  minibatch: true
training->implicit->use
Only works in inversion. Whether to use an implicit neural network (SIREN) to represent the model parameters.
training->minibatch and training->batch_size
Works in both classic FWI and source-encoding FWI.
In classic FWI, setting minibatch to true means that only batch_size randomly selected shots are used to compute the gradient in each epoch.
$$\mathrm{gradient} = \sum_{i}^{\mathrm{batchsize}} \frac{\partial\, \mathrm{loss}\left(w(t, \mathbf{x}^i_s), \mathbf{m}\right)}{\partial \mathbf{m}}$$
In source-encoding FWI, setting minibatch to true means that batch_size individual shots are encoded into a single super-shot for one forward modeling run. This accelerates the inversion by reducing the computational cost.
$$\mathrm{gradient} = \frac{\partial\, \mathrm{loss}\left(w(t, \sum_{i}^{\mathrm{batchsize}} \mathbf{x}^i_s), \mathbf{m}\right)}{\partial \mathbf{m}}$$
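The difference between the two modes can be sketched in plain Python. This is a toy illustration under stated assumptions, not Seistorch's implementation: the function names, the shot representation (a dictionary mapping a source position to its wavelet samples), and the use of Python's random module are all hypothetical.

```python
import random

def select_minibatch(n_shots, batch_size, seed=None):
    """Randomly pick batch_size distinct shot indices (hypothetical helper)."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(n_shots), min(batch_size, n_shots)))

def encode_supershot(shots):
    """Merge several single-source shots into one super-shot.

    Each shot maps a source position to its wavelet samples. Sources that
    share a position are summed sample by sample; the others are kept side
    by side so that they fire in the same simulation.
    """
    supershot = {}
    for shot in shots:
        for position, wavelet in shot.items():
            if position in supershot:
                supershot[position] = [a + b for a, b in zip(supershot[position], wavelet)]
            else:
                supershot[position] = list(wavelet)
    return supershot

# Toy data: 6 shots, each with one source at a different x position.
all_shots = [{(x, 0): [0.0, 1.0, 0.0]} for x in range(0, 60, 10)]
picked = select_minibatch(len(all_shots), batch_size=3, seed=0)

# Classic FWI: one forward simulation per selected shot; the gradients are summed.
classic_runs = [all_shots[i] for i in picked]

# Source-encoding FWI: the selected shots are combined and simulated once.
supershot = encode_supershot(all_shots[i] for i in picked)
```

In this toy setup the classic mode needs three forward simulations per epoch, while the encoded mode needs one, which is where the saving in computational cost comes from.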
geom->multiple
Works in both forward modeling and inversion.
When multiple is set to true, the absorbing boundary condition on the upper boundary of the model is deactivated. When set to false, absorbing boundaries are applied on all sides of the model.
geom->boundary_saving
Only works in inversion.
When boundary_saving is set to true, the boundary saving strategy is used during the loss.backward() operation to reconstruct the wavefield, which reduces GPU memory usage. When set to false, pure automatic differentiation is used.
geom->wavelet_inverse
Works in both forward modeling and inversion. This parameter controls whether the polarity of the Ricker wavelet is reversed.
invlist
Only works in inversion. The entries in invlist specify which model parameters are inverted.
In multi-parameter inversion, you can configure the inversion to update specific parameters while keeping the others fixed. This is useful when the model has many parameters but you are primarily interested in estimating the subset most relevant to your research or application.
Example: In elastic wave inversion, if you only want to invert for the P-wave velocity and S-wave velocity while keeping the density unchanged, you can configure it as follows.
equation: elastic
geom:
  initPath:
    vp: ./velocity/init_vp.npy
    vs: ./velocity/init_vs.npy
    rho: ./velocity/init_rho.npy
  invlist:
    vp: true
    vs: true
    rho: false
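Once parsed, invlist is a mapping from parameter name to a flag. A minimal sketch of how a script might split the parameters into inverted and fixed groups follows. The variable names are hypothetical, and the dictionary is written by hand to mirror the elastic example.

```python
# Hand-written stand-in for the invlist section of the parsed configuration.
invlist = {"vp": True, "vs": True, "rho": False}

# Parameters that are updated during the inversion.
inverted = [name for name, flag in invlist.items() if flag]
# Parameters that keep their initial values.
fixed = [name for name, flag in invlist.items() if not flag]

print("inverted:", inverted)  # inverted: ['vp', 'vs']
print("fixed:", fixed)        # fixed: ['rho']
```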
Restricted choice parameters
Parameters of this type accept only a limited set of valid choices. A value outside the predefined list causes an error.
dtype
The data type in torch. The default is float32.
equation
The wave equation used for modeling. All implemented wave equations are stored in seistorch/equations. For example, equation: acoustic calls the functions from the file with the same name, acoustic.py, for both forward modeling and inversion.
geom->source_type and geom->receiver_type
The source_type parameter specifies the type of seismic source, which determines the wavefield component into which the source wavelet is loaded. This parameter allows you to control how the source wavelet influences the simulation.
The receiver_type parameter specifies the type of seismic receiver, which determines the wavefield component that is recorded and used for analysis.
Both source_type and receiver_type can be a list of valid wavefield names from the class Wavefield in seistorch/eqconfigure.py.
Example: In elastic modeling, if you want to load the seismic source into the txx and tzz components and record the velocity components, you can configure it as follows.
equation: elastic
source_type:
  - txx
  - tzz
receiver_type:
  - vx
  - vz
geom->boundary->type
Valid options are habc and pml. Please note that habc is only valid for second-order acoustic equations.
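Since an invalid choice causes an error, it can help to check values before starting a long run. The following sketch checks geom->boundary->type against the two options listed above. The function name and the error message are hypothetical and do not reflect how Seistorch itself reports the problem; the configuration values are illustrative.

```python
VALID_BOUNDARY_TYPES = ("habc", "pml")  # the two options named in this document

def check_boundary_type(cfg):
    """Return the boundary type, or raise ValueError if it is not a valid choice."""
    boundary_type = cfg["geom"]["boundary"]["type"]
    if boundary_type not in VALID_BOUNDARY_TYPES:
        raise ValueError(
            f"geom->boundary->type must be one of {VALID_BOUNDARY_TYPES}, "
            f"got {boundary_type!r}"
        )
    return boundary_type

# Hand-written stand-in for a parsed configuration (illustrative values).
cfg = {"equation": "acoustic", "geom": {"boundary": {"type": "pml", "bwidth": 50}}}
boundary_type = check_boundary_type(cfg)
```

The same pattern extends to other restricted-choice parameters by swapping in their list of valid values.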
Scalar parameters
seed
The seed is the value used to initialize random processes. It can be used to reproduce an experiment or a random outcome.
training->batch_size
See training->minibatch and training->batch_size under Boolean parameters.
training->N_epochs
The number of epochs in each scale.
training->lr
The initial learning rate of the optimizer.
training->scale_decay and training->lr_decay
The parameter scale_decay represents the decay rate of the learning rate across different scales, while lr_decay is the exponential decay rate of the learning rate within each scale.
In the $i$-th epoch of the $n$-th frequency band, the learning rate is given by:
$$lr(n, i) = lr_{\mathrm{init}} \times \mathrm{scale\_decay}^{\,n} \times \mathrm{lr\_decay}^{\,i}$$
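As a worked example, the sketch below tabulates the learning rate under one reading of the two decay parameters: the initial rate is multiplied by scale_decay once per scale and by lr_decay once per epoch within a scale. This multiplicative form is an assumption made for illustration, as are the function name and the numbers. Check the Seistorch source for the exact schedule.

```python
def learning_rate(lr_init, scale_decay, lr_decay, n, i):
    """Learning rate at epoch i (0-based) of scale n (0-based).

    Assumes multiplicative decay across scales and exponential decay
    within a scale; this is an illustrative reading, not a confirmed API.
    """
    return lr_init * scale_decay ** n * lr_decay ** i

# Illustrative values only.
lr_init, scale_decay, lr_decay = 10.0, 0.5, 1.0

for n in range(3):
    rates = [learning_rate(lr_init, scale_decay, lr_decay, n, i) for i in range(2)]
    print(f"scale {n}: {rates}")
```

With lr_decay set to 1.0 the rate stays constant within a scale, and with scale_decay set to 1.0 every scale starts from the same initial rate.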
training->filter_ord
The order of the filter. Recommended values: 1 to 4.
training->smooth
Seistorch uses scipy.ndimage.gaussian_filter to smooth the gradients before calling optimizer.step().
smooth:
  counts: 10 # must be an int; the number of times the smoothing is applied
  radius: # radius of the Gaussian kernel in the x and z directions
    x: 5
    z: 10
  sigma: # sigma of the Gaussian kernel in the x and z directions
    x:
    z:
geom->wavelet_delay
The delay of the Ricker wavelet.
geom->multiscale
The parameter multiscale specifies the dominant frequency for each scale in a multi-scale inversion. A low-pass filter is applied at each scale. The keyword all uses the original data and the original wavelet for inversion.
Example:
geom:
  multiscale:
    - 1.0
    - 3.0
    - 5.0
    - all
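One way to picture how the list is consumed is a loop over the scales. The sketch below assumes the parsed list is a flat sequence of frequencies that may end with the keyword all. The helper describe_scale is hypothetical: it only reports what would be done at each scale and performs no filtering.

```python
# Hand-written stand-in for geom->multiscale after parsing (assumed flat list).
multiscale = [1.0, 3.0, 5.0, "all"]

def describe_scale(scale):
    """Describe the data used at one scale (hypothetical helper, no filtering is done)."""
    if scale == "all":
        return "original data and wavelet, no filtering"
    return f"data and wavelet low-pass filtered for a dominant frequency of {float(scale)}"

plan = [describe_scale(scale) for scale in multiscale]
for index, line in enumerate(plan):
    print(f"scale {index}: {line}")
```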
geom->dt
The parameter dt is the time step used in simulations. It can also be the time interval between samples of a source wavelet.
geom->nt
The number of time samples for simulation and recording.
geom->fm
The dominant frequency of the Ricker wavelet. Only used when the parameter wavelet is left blank.
geom->h
The grid size in both the x and z directions.
geom->Nshots
The number of shots for forward modeling. This parameter can be larger than the length of the source and receiver lists defined in geom->sources and geom->receivers.
The actual number of shots simulated is the minimum of Nshots and the length of the sources list. This ensures that the number of simulated shots does not exceed the available seismic source data.
$$\mathrm{actual\_shots} = \min(\mathrm{Nshots}, \mathrm{len}(\mathrm{sources}))$$
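For instance, with the made-up values below, asking for more shots than there are sources simply simulates every available source. The coordinates and their tuple format are illustrative assumptions.

```python
# Illustrative values; the coordinates are made up for this example.
Nshots = 10
sources = [(50, 1), (100, 1), (150, 1)]  # three source positions

actual_shots = min(Nshots, len(sources))
print(actual_shots)  # 3
```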
geom->boundary->bwidth
The width of the absorbing boundary.