Configuration

WRF Ensembly is configured through a single TOML configuration file, typically named config.toml in the experiment directory. This file contains all the settings needed to run an ensemble assimilation experiment, from model directories to SLURM job parameters.

The configuration is structured into several sections, each controlling different aspects of the experiment. Below is a comprehensive reference of all available configuration options.

Configuration Structure

The configuration file is organized into the following main sections:

metadata - Basic experiment information
directories - Paths to model installations and data
domain_control - Grid and projection settings
time_control - Experiment timing and cycle configuration
data - Input data locations and settings
assimilation - Ensemble and DART configuration
observations - Observation processing settings
geogrid - Geographical data preprocessing
perturbations - Initial condition perturbation settings
slurm - SLURM job configuration
postprocess - Post-processing and output settings
environment - Environment variables
wrf_namelist - WRF namelist overrides

Metadata

Basic information about the experiment.

[metadata]
name = "my_experiment"
description = "A description of what this experiment does"

Field	Type	Description
`name`	string	Required. Name of the experiment
`description`	string	Required. Description of the experiment

Directories

Paths to model installations and data directories.

[directories]
wrf_root = "/path/to/WRF"
wps_root = "/path/to/WPS"
dart_root = "/path/to/DART"
scratch_root = "./scratch"

Field	Type	Description
`wrf_root`	Path	Required. Root directory of the WRF model. Should contain the `run` directory with a compiled `real.exe`
`wps_root`	Path	Required. Root directory of WPS. Should contain the `geogrid.exe`, `metgrid.exe` and `ungrib.exe` executables
`dart_root`	Path	Required. Root directory of DART. Should contain a compiled `models/wrf` directory
`scratch_root`	Path	Scratch directory for temporarily storing model output files before post-processing. If relative, will be inside the experiment directory. Default: `./scratch`
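
The rule for relative `scratch_root` values can be sketched as follows. `resolve_scratch_root` is a hypothetical helper written for illustration, assuming POSIX paths; the tool's own resolution logic may differ in detail.

```python
from pathlib import Path


def resolve_scratch_root(experiment_dir: str, scratch_root: str = "./scratch") -> Path:
    """Resolve scratch_root; relative paths are placed inside the experiment directory."""
    path = Path(scratch_root)
    if path.is_absolute():
        return path
    return Path(experiment_dir) / path


print(resolve_scratch_root("/experiments/dust"))                   # /experiments/dust/scratch
print(resolve_scratch_root("/experiments/dust", "/fast/scratch"))  # /fast/scratch
```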

Domain Control

Grid configuration and map projection settings.

[domain_control]
xy_resolution = [30, 30]  # km
xy_size = [340, 130]      # grid points
projection = "lambert"
ref_lat = 20.0
ref_lon = -17.0
truelat1 = 20.0
truelat2 = 18.0
stand_lon = -11.0

Field	Type	Description
`xy_resolution`	[int, int]	Required. Space between grid points in x and y directions (kilometers). Corresponds to WRF's `dx` and `dy`
`xy_size`	[int, int]	Required. Number of grid points in x and y directions. Corresponds to WRF's `e_we` and `e_sn`
`projection`	string	Required. Map projection for the grid
`ref_lat`	float	Required. Reference latitude for the projection
`ref_lon`	float	Required. Reference longitude for the projection
`truelat1`	float	Required. First true latitude for the projection
`truelat2`	float	Second true latitude for the projection
`stand_lon`	float	Standard longitude for the projection
`pole_lat`	float	Pole latitude for the projection
`pole_lon`	float	Pole longitude for the projection
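
To make the correspondence with WRF's grid variables concrete, here is a small sketch that converts the two list settings into WRF-style values. WRF's namelist expects `dx` and `dy` in meters, so the kilometer values are multiplied by 1000. `grid_summary` is an illustrative helper, and the extent it reports is only an approximation based on the number of intervals between grid points.

```python
def grid_summary(xy_resolution, xy_size):
    """Convert the domain_control grid settings to WRF-style values."""
    dx_m, dy_m = (km * 1000 for km in xy_resolution)  # WRF uses meters for dx/dy
    e_we, e_sn = xy_size
    # Approximate extent: (points - 1) intervals between grid points.
    extent_km = ((e_we - 1) * xy_resolution[0], (e_sn - 1) * xy_resolution[1])
    return {"dx": dx_m, "dy": dy_m, "e_we": e_we, "e_sn": e_sn, "extent_km": extent_km}


print(grid_summary([30, 30], [340, 130]))
# {'dx': 30000, 'dy': 30000, 'e_we': 340, 'e_sn': 130, 'extent_km': (10170, 3870)}
```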

Time Control

Experiment timing, cycle configuration, and I/O settings.

[time_control]
start = 2025-03-01T00:00:00Z
end = 2025-04-30T00:00:00Z
boundary_update_interval = 180  # minutes
output_interval = 60           # minutes
analysis_interval = 180        # minutes
runtime_io = ["+:h:0:EDUST1,EDUST2,EDUST3,EDUST4,EDUST5"]

# Per-cycle overrides
[time_control.cycles.0]
duration = 120
output_interval = 30

Field	Type	Description
`start`	datetime	Required. Start timestamp of the experiment
`end`	datetime	Required. End timestamp of the experiment
`boundary_update_interval`	int	Time between successive sets of incoming real data (lateral boundary conditions), in minutes. Default: 180
`output_interval`	int	Time between output (history) files in minutes. Default: 60
`analysis_interval`	int	Time between analysis/assimilation cycles in minutes. Default: 360
`runtime_io`	[string]	Runtime I/O options for WRF. Written to a text file in each member directory, which WRF reads through its `iofields_filename` option. See the WRF I/O documentation
`cycles`	dict	Per-cycle configuration overrides. Keys are cycle numbers (0-indexed)
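
Each `runtime_io` entry follows WRF's runtime I/O line format: an operator (`+` to add, `-` to remove), a stream type (`h` for history, `i` for input), a stream number, and a comma-separated variable list. The sketch below splits an entry into those parts to show how the example value reads. `parse_runtime_io` is a hypothetical helper for illustration only; WRF Ensembly passes the strings on to WRF rather than parsing them this way.

```python
def parse_runtime_io(entry: str) -> dict:
    """Split one WRF runtime I/O line into its four colon-separated fields."""
    op, stream_type, stream_id, names = entry.split(":", 3)
    return {
        "action": {"+": "add", "-": "remove"}[op],
        "stream_type": {"h": "history", "i": "input"}[stream_type],
        "stream_id": int(stream_id),
        "variables": [name.strip() for name in names.split(",")],
    }


parsed = parse_runtime_io("+:h:0:EDUST1,EDUST2,EDUST3,EDUST4,EDUST5")
print(parsed["action"], parsed["stream_type"], parsed["stream_id"])  # add history 0
print(parsed["variables"])
```

Read this way, the example adds the five `EDUST` variables to history stream 0.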

Per-Cycle Configuration

You can override certain settings for specific cycles:

Field	Type	Description
`duration`	int	Override the cycle duration in minutes
`output_interval`	int	Override the output interval for this cycle
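
The following sketch shows one way the cycle schedule could be derived from `start`, `end`, `analysis_interval` and the per-cycle `duration` overrides. It rests on assumptions that are not confirmed by this page: cycles run back to back, an overridden `duration` shifts the start of all later cycles, and the last cycle is truncated at `end`. `plan_cycles` is an illustrative name.

```python
from datetime import datetime, timedelta, timezone


def plan_cycles(start, end, analysis_interval=360, cycles=None):
    """Return (index, cycle_start, cycle_end) tuples covering start..end."""
    # TOML table keys are strings ("0"), so normalise them to integers.
    overrides = {int(key): value for key, value in (cycles or {}).items()}
    plan, index, current = [], 0, start
    while current < end:
        minutes = overrides.get(index, {}).get("duration", analysis_interval)
        if minutes <= 0:
            raise ValueError(f"cycle {index} has a non-positive duration")
        cycle_end = min(current + timedelta(minutes=minutes), end)
        plan.append((index, current, cycle_end))
        current, index = cycle_end, index + 1
    return plan


start = datetime(2025, 3, 1, tzinfo=timezone.utc)
end = datetime(2025, 3, 1, 12, tzinfo=timezone.utc)
for index, begin, finish in plan_cycles(start, end, 180, {"0": {"duration": 120}}):
    print(index, begin.strftime("%H:%M"), finish.strftime("%H:%M"))
# 0 00:00 02:00
# 1 02:00 05:00
# 2 05:00 08:00
# 3 08:00 11:00
# 4 11:00 12:00
```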

Data

Input data locations and processing settings.

[data]
wps_geog = "/path/to/WPS_GEOG"
meteorology = "/path/to/meteorology"
meteorology_glob = "*.grib"
meteorology_vtable = "Vtable.ERA-interim.pl"
per_member_meteorology = false
manage_chem_ic = false

# Chemistry data (optional)
[data.chemistry]
path = "/path/to/chemistry/data"
model_name = "cams_global_forecasts"

Field	Type	Description
`wps_geog`	Path	Required. Directory containing WPS geographical data
`meteorology`	Path	Required. Directory containing meteorological GRIB files
`meteorology_glob`	string	Glob pattern for finding meteorological files. Default: `"*.grib"`
`meteorology_vtable`	Path	Vtable file for meteorological data. Default: `"Vtable.ERA-interim.pl"`
`per_member_meteorology`	bool	Whether to use separate meteorology for each member. If true, `meteorology` should contain the `%MEMBER%` placeholder. Default: false
`manage_chem_ic`	bool	Whether to manage chemical initial conditions. Sets `chem_in_opt` to 0 for `real.exe` and 1 for `wrf.exe`. Default: false
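
A rough sketch of how the `%MEMBER%` placeholder and the glob pattern might be applied is shown below. Both helpers are hypothetical, and the way the member number is formatted (plain integer here, no zero padding) is an assumption that should be checked against the tool before naming directories.

```python
from fnmatch import fnmatch


def member_meteorology_dir(meteorology: str, member: int) -> str:
    """Substitute the member number into the meteorology path."""
    # Assumption: the placeholder is replaced by the plain member number.
    return meteorology.replace("%MEMBER%", str(member))


def select_files(filenames, meteorology_glob="*.grib"):
    """Keep only the file names matching the glob pattern, sorted."""
    return sorted(name for name in filenames if fnmatch(name, meteorology_glob))


print(member_meteorology_dir("/data/met/member_%MEMBER%", 3))  # /data/met/member_3
print(select_files(["b.grib", "a.grib", "notes.txt"]))         # ['a.grib', 'b.grib']
```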

Chemistry Data

Optional configuration for chemistry model data (used with WRF-CHEM):

Field	Type	Description
`path`	Path	Required. Directory containing chemistry data in YYYY-MM-DD subdirectories
`model_name`	string	Required. Name of the chemistry model (e.g., "cams_global_forecasts")
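
The YYYY-MM-DD layout means the subdirectory for a given day can be derived from the date alone. `chemistry_dir` below is an illustrative helper; which date the tool looks up for a given cycle, and what files it expects inside, is not covered here.

```python
from datetime import date
from pathlib import Path


def chemistry_dir(path: str, day: date) -> Path:
    """Return the YYYY-MM-DD subdirectory of the chemistry data path."""
    return Path(path) / day.strftime("%Y-%m-%d")


print(chemistry_dir("/data/CAMS", date(2025, 3, 1)))  # /data/CAMS/2025-03-01
```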

Assimilation

Ensemble configuration and DART settings.

[assimilation]
n_members = 30
cycled_variables = ["U", "V", "P", "PH", "THM", "MU", "QVAPOR"]
state_variables = ["U", "V", "W", "PH", "THM", "MU", "QVAPOR", "PSFC"]
filter_mpi_tasks = 24

Field	Type	Description
`n_members`	int	Required. Number of ensemble members
`cycled_variables`	[string]	Required. Variables to carry forward from the previous cycle
`state_variables`	[string]	Required. Variables to include in the state vector for assimilation
`filter_mpi_tasks`	int	Number of MPI tasks for the DART filter. If set to anything other than 1, filter runs with MPI. Default: 1
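
The effect of `filter_mpi_tasks` can be pictured as a choice between two command lines. This is a sketch under assumptions: the exact command WRF Ensembly builds is not shown on this page, the `./filter` path is illustrative, and `-n` is the task-count flag accepted by common `mpirun` implementations.

```python
def filter_command(filter_mpi_tasks: int = 1, mpirun_command: str = "mpirun") -> list[str]:
    """Build an argument list for running DART's filter, with or without MPI."""
    if filter_mpi_tasks != 1:
        # Assumption: the launcher takes the task count through -n.
        return [mpirun_command, "-n", str(filter_mpi_tasks), "./filter"]
    return ["./filter"]


print(filter_command())    # ['./filter']
print(filter_command(24))  # ['mpirun', '-n', '24', './filter']
```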

Observations

Observation processing and quality control settings.

[observations]
boundary_width = 2.0
boundary_error_factor = 2.5
boundary_error_width = 1.0

Field	Type	Description
`boundary_width`	float	Number of grid points by which to shrink the domain when removing observations that fall outside it. Default: 0
`boundary_error_factor`	float	Factor to inflate observation errors near the boundary. Default: 2.5
`boundary_error_width`	float	Width in grid points where boundary error inflation is applied. Set to 0 to disable. Default: 1.0
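
One possible reading of how the three settings interact is sketched below. It is an interpretation, not the tool's implementation: the sketch assumes that distance is measured in grid points from the nearest edge of the full domain, that the inflation band starts at the edge of the shrunken domain, and that the error is multiplied by a constant factor inside the band rather than ramped. `treat_observation` is a hypothetical name.

```python
def treat_observation(i, j, nx, ny, error,
                      boundary_width=0.0,
                      boundary_error_factor=2.5,
                      boundary_error_width=1.0):
    """Return the (possibly inflated) error, or None if the observation is dropped.

    i, j are fractional grid coordinates of the observation; nx, ny the grid size.
    """
    # Distance, in grid points, to the nearest edge of the full domain.
    distance = min(i, j, (nx - 1) - i, (ny - 1) - j)
    if distance < boundary_width:
        return None  # outside the shrunken domain
    inside_band = distance < boundary_width + boundary_error_width
    if boundary_error_width > 0 and inside_band:
        return error * boundary_error_factor
    return error


print(treat_observation(1.0, 50.0, 340, 130, 1.0, boundary_width=2.0))    # None
print(treat_observation(2.5, 50.0, 340, 130, 1.0, boundary_width=2.0))    # 2.5
print(treat_observation(100.0, 60.0, 340, 130, 1.0, boundary_width=2.0))  # 1.0
```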

Geogrid

Geographical data preprocessing settings.

[geogrid]
table = "GEOGRID.TBL.ARW_CHEM"

Field	Type	Description
`table`	string	Name of the GEOGRID table file to use. Default: `"GEOGRID.TBL"`

Perturbations

Initial condition perturbation settings for ensemble generation.

[perturbations]
seed = 42
apply_perturbations_every_cycle = false

# Per-variable perturbation settings
[perturbations.variables.DUST_EMIS_WEIGHT]
operation = "multiply"
mean = 2.6
sd = 0.8
rounds = 20
boundary = 0
min_value = 0.1
max_value = 10.0

Field	Type	Description
`seed`	int	Random seed for perturbation generation. If not set, a seed is generated randomly
`apply_perturbations_every_cycle`	bool	Whether to apply perturbations at the start of every cycle. Default: false
`variables`	dict	Per-variable perturbation configuration

Per-Variable Perturbation Settings

Field	Type	Description
`operation`	"add" or "multiply"	Required. Whether to add or multiply the perturbation
`mean`	float	Mean of the perturbation field. Default: 1.0
`sd`	float	Standard deviation of the perturbation field. Default: 1.0
`rounds`	int	Number of smoothing rounds to apply. Default: 10
`boundary`	int	Size of the perturbation boundary in grid points. If greater than 0, the domain edges are not perturbed. Default: 0
`min_value`	float	Minimum value for the perturbation field
`max_value`	float	Maximum value for the perturbation field
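
The sketch below illustrates how `operation`, `mean`, `sd`, `min_value`, `max_value` and `seed` fit together, using a flat list of values in place of a model field. It is deliberately simplified and is not the tool's algorithm: the smoothing controlled by `rounds` and the edge handling controlled by `boundary` are left out, and drawing independent Gaussian samples per point is an assumption. `perturb` is an illustrative name.

```python
import random


def perturb(values, operation, mean=1.0, sd=1.0,
            min_value=None, max_value=None, seed=None):
    """Apply a clipped Gaussian perturbation to each value (no smoothing)."""
    if operation not in ("add", "multiply"):
        raise ValueError("operation must be 'add' or 'multiply'")
    rng = random.Random(seed)  # the same seed reproduces the same perturbations
    result = []
    for value in values:
        sample = rng.gauss(mean, sd)
        if min_value is not None:
            sample = max(sample, min_value)
        if max_value is not None:
            sample = min(sample, max_value)
        result.append(value + sample if operation == "add" else value * sample)
    return result


field = [1.0] * 5
first = perturb(field, "multiply", mean=2.6, sd=0.8, min_value=0.1, max_value=10.0, seed=42)
second = perturb(field, "multiply", mean=2.6, sd=0.8, min_value=0.1, max_value=10.0, seed=42)
print(first == second)  # True
```

Because the input field is all ones and the operation is `multiply`, each output value equals the clipped perturbation itself, so every value lies between `min_value` and `max_value`.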

SLURM

SLURM job configuration and resource allocation.

[slurm]
sbatch_command = "sbatch --parsable"
command_prefix = "micromamba run -n wrf"
mpirun_command = "mpirun"
env_modules = ["intel/2021.4"]
pre_commands = ["export OMP_NUM_THREADS=1"]

# Job resource configurations
[slurm.directives_large]
partition = "compute"
nodes = 2
ntasks-per-node = 24
cpus-per-task = 1
mem = "64G"

[slurm.directives_small]
partition = "compute"
nodes = 1
ntasks-per-node = 8
cpus-per-task = 1
mem = "16G"

[slurm.directives_postprocess]
partition = "compute"
nodes = 1
ntasks = 24
cpus-per-task = 1
mem = "32G"

Field	Type	Description
`sbatch_command`	string	Command for submitting SLURM jobs. Default: `"sbatch --parsable"`
`command_prefix`	string	Prefix for all `wrf-ensembly` commands (e.g., conda environment activation)
`mpirun_command`	string	Command for running MPI jobs. Default: `"mpirun"`
`env_modules`	[string]	Environment modules to load in each job
`pre_commands`	[string]	Commands to run at the start of each job
`directives_large`	dict	SLURM directives for large jobs (ensemble member advance)
`directives_small`	dict	SLURM directives for small jobs (Python steps)
`directives_postprocess`	dict	SLURM directives for post-processing jobs
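
To show how a directives table relates to a batch script, the sketch below renders each key/value pair as an `#SBATCH --key=value` line, followed by `module load` lines and the pre-commands. The script that WRF Ensembly actually generates may be laid out differently; `job_header` is a hypothetical helper.

```python
def job_header(directives, env_modules=(), pre_commands=()):
    """Render SLURM directives, module loads and pre-commands as script lines."""
    lines = ["#!/bin/bash"]
    lines += [f"#SBATCH --{key}={value}" for key, value in directives.items()]
    lines += [f"module load {module}" for module in env_modules]
    lines += list(pre_commands)
    return "\n".join(lines)


directives_large = {"partition": "compute", "nodes": 2, "ntasks-per-node": 24}
header = job_header(directives_large, ["intel/2021.4"], ["export OMP_NUM_THREADS=1"])
print(header)
# #!/bin/bash
# #SBATCH --partition=compute
# #SBATCH --nodes=2
# #SBATCH --ntasks-per-node=24
# module load intel/2021.4
# export OMP_NUM_THREADS=1
```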

Postprocess

Post-processing settings for model output.

[postprocess]
variables_to_keep = ["DUST_\\d", "U", "V", "wind_.*"]
compression_filters = "shf|dfl"
ppc_filter = "default=3#Z.*=6#X.*=6"
keep_per_member = false
compute_ensemble_statistics_in_job = true
processor_cores = 1
statistics_cores = 24
concatenate_cores = 24
cdo_path = "cdo"
ncrcat_cmd = "ncrcat"

# Custom processors
[[postprocess.processors]]
processor = "script"
params = { script = "python enhance_data.py {in} {out}" }

[[postprocess.processors]]
processor = "/path/to/custom_processor.py:MyProcessor"
params = { custom_param = "value" }

Field	Type	Description
`variables_to_keep`	[string]	Regular expressions for variables to keep in output. If not set, all variables are kept
`compression_filters`	string	NCO compression filters to apply. Default: `"shf|zst,3"`
`ppc_filter`	string	Lossy quantization settings for precision control. Default: `"default=3#Z.*=6#X.*=6"`
`keep_per_member`	bool	Whether to keep per-member files in addition to ensemble statistics. Default: false
`compute_ensemble_statistics_in_job`	bool	Whether to compute ensemble statistics in SLURM jobs. Default: true
`processor_cores`	int	Number of cores for processor pipeline. Default: 1
`statistics_cores`	int	Number of cores for statistics computation. Default: 1
`concatenate_cores`	int	Number of cores for concatenation step. Default: 1
`cdo_path`	string	Path to CDO executable. Default: `"cdo"`
`ncrcat_cmd`	string	Path to ncrcat executable. Default: `"ncrcat"`
`processors`	[ProcessorConfig]	List of custom data processors to apply
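
The `variables_to_keep` patterns can be tried out with Python's `re` module before running an experiment. The sketch assumes each pattern must match the whole variable name (`re.fullmatch`); whether the tool uses full or partial matching is not stated here, and with partial matching a pattern such as `"U"` would also select `MU`. `select_variables` is an illustrative helper.

```python
import re


def select_variables(available, variables_to_keep=None):
    """Return the variable names matching any pattern; keep all if none are set."""
    if not variables_to_keep:
        return list(available)
    patterns = [re.compile(pattern) for pattern in variables_to_keep]
    # Assumption: a pattern has to match the entire name.
    return [name for name in available if any(p.fullmatch(name) for p in patterns)]


names = ["DUST_1", "DUST_EMIS", "U", "MU", "V", "T2", "wind_speed"]
print(select_variables(names, [r"DUST_\d", "U", "V", "wind_.*"]))
# ['DUST_1', 'U', 'V', 'wind_speed']
```

Note that the TOML value `"DUST_\\d"` reaches the regular expression engine as `DUST_\d`, since TOML basic strings treat the backslash as an escape character.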

Data Processors

WRF Ensembly supports custom data processors for post-processing model output. The built-in XWRFProcessor is always applied first.

Built-in Processors

script: Execute external scripts (for backward compatibility)

Custom Processors

You can specify custom processors using:

Module path: "my_package.processors:CustomProcessor"
File path: "/path/to/file.py:MyProcessor"

Each processor can have custom parameters passed via the params dictionary.
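
Both specification styles share the `location:ClassName` shape. The sketch below shows how such a string could be resolved to a class with the standard `importlib` machinery. It is an assumption about the mechanism, not the tool's loader, and it says nothing about the interface a processor class must implement. `load_processor_class` is a hypothetical name.

```python
import importlib
import importlib.util
from pathlib import Path


def load_processor_class(spec: str):
    """Resolve 'module.path:Class' or '/path/to/file.py:Class' to a class object."""
    location, _, class_name = spec.rpartition(":")
    if not location or not class_name:
        raise ValueError(f"expected 'location:ClassName', got {spec!r}")
    if location.endswith(".py"):
        # File path: load the file as a module named after the file.
        module_spec = importlib.util.spec_from_file_location(Path(location).stem, location)
        module = importlib.util.module_from_spec(module_spec)
        module_spec.loader.exec_module(module)
    else:
        # Module path: import it like any installed package.
        module = importlib.import_module(location)
    return getattr(module, class_name)


# A standard-library class stands in for a real processor here.
print(load_processor_class("collections:OrderedDict").__name__)  # OrderedDict
```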

Environment

Environment variables to set when running the experiment.

[environment]
# Applied to all commands
universal = { OMP_NUM_THREADS = "1", MALLOC_TRIM_THRESHOLD_ = "536870912" }

# Applied only to WRF/WPS commands
wrf = { WRF_EM_CORE = "1" }

# Applied only to DART commands
dart = { DART_DEBUG = "1" }

Field	Type	Description
`universal`	dict	Environment variables applied to all commands
`wrf`	dict	Environment variables applied only to WRF/WPS commands
`dart`	dict	Environment variables applied only to DART commands
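
The three tables can be thought of as layers: `universal` applies everywhere, and `wrf` or `dart` is added on top for the matching kind of command. The sketch below merges them that way. The precedence when the same variable appears in both layers (the specific table wins) is an assumption, and `build_environment` is an illustrative helper.

```python
def build_environment(environment: dict, kind: str) -> dict:
    """Merge universal variables with those for one command kind ('wrf' or 'dart')."""
    merged = dict(environment.get("universal", {}))
    # Assumption: kind-specific values override universal ones.
    merged.update(environment.get(kind, {}))
    return merged


environment = {
    "universal": {"OMP_NUM_THREADS": "1"},
    "wrf": {"WRF_EM_CORE": "1"},
    "dart": {"DART_DEBUG": "1"},
}
print(build_environment(environment, "wrf"))
# {'OMP_NUM_THREADS': '1', 'WRF_EM_CORE': '1'}
```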

WRF Namelist

WRF namelist overrides and per-member customizations.

[wrf_namelist]
# Global namelist overrides
[wrf_namelist.time_control]
history_interval = 60
restart_interval = 3600

[wrf_namelist.domains]
time_step = 180
max_dom = 1

[wrf_namelist.physics]
mp_physics = 10
ra_lw_physics = 4

# Per-member namelist overrides
[wrf_namelist_per_member.member_001.physics]
mp_physics = 8

[wrf_namelist_per_member.member_002.physics]
mp_physics = 6

You can override any WRF namelist variable by specifying it in the appropriate section. The structure follows the WRF namelist format with sections like time_control, domains, physics, etc.

For per-member customizations, use the wrf_namelist_per_member section with the member name (e.g., member_001) as the key.
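
The combination of global and per-member overrides can be sketched as a section-by-section merge in which the member's values take precedence. That precedence is the natural reading of the text above but remains an assumption, and `member_namelist` is a hypothetical helper rather than the tool's own code.

```python
def member_namelist(global_overrides: dict, per_member: dict, member: str) -> dict:
    """Combine global namelist overrides with those of a single member."""
    merged = {section: dict(values) for section, values in global_overrides.items()}
    for section, values in per_member.get(member, {}).items():
        # Assumption: per-member values replace global ones with the same name.
        merged.setdefault(section, {}).update(values)
    return merged


wrf_namelist = {
    "domains": {"time_step": 180, "max_dom": 1},
    "physics": {"mp_physics": 10, "ra_lw_physics": 4},
}
wrf_namelist_per_member = {
    "member_001": {"physics": {"mp_physics": 8}},
    "member_002": {"physics": {"mp_physics": 6}},
}
print(member_namelist(wrf_namelist, wrf_namelist_per_member, "member_001")["physics"])
# {'mp_physics': 8, 'ra_lw_physics': 4}
```

Members without an entry, such as a `member_003` in this example, simply receive the global overrides.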

Example Configuration

Here's a complete example configuration file:

[metadata]
name = "dust_experiment"
description = "North African dust assimilation experiment"

[directories]
wrf_root = "/opt/WRF-4.5"
wps_root = "/opt/WPS-4.5"
dart_root = "/opt/DART"

[domain_control]
xy_resolution = [30, 30]
xy_size = [340, 130]
projection = "lambert"
ref_lat = 20.0
ref_lon = -17.0
truelat1 = 20.0
truelat2 = 18.0
stand_lon = -11.0

[time_control]
start = 2025-03-01T00:00:00Z
end = 2025-03-31T00:00:00Z
boundary_update_interval = 180
output_interval = 60
analysis_interval = 360

[data]
wps_geog = "/data/WPS_GEOG"
meteorology = "/data/ERA5"
meteorology_vtable = "Vtable.ERA-interim.pl"
manage_chem_ic = true

[data.chemistry]
path = "/data/CAMS"
model_name = "cams_global_forecasts"

[assimilation]
n_members = 20
cycled_variables = ["U", "V", "P", "PH", "THM", "MU", "QVAPOR", "DUST_1", "DUST_2", "DUST_3", "DUST_4", "DUST_5"]
state_variables = ["U", "V", "W", "PH", "THM", "MU", "QVAPOR", "PSFC", "DUST_1", "DUST_2", "DUST_3", "DUST_4", "DUST_5"]
filter_mpi_tasks = 20

[perturbations]
apply_perturbations_every_cycle = false

[perturbations.variables.DUST_EMIS_WEIGHT]
operation = "multiply"
mean = 1.0
sd = 0.5
rounds = 15
min_value = 0.1
max_value = 3.0

[slurm.directives_large]
partition = "compute"
nodes = 1
ntasks-per-node = 20
mem = "64G"

[postprocess]
variables_to_keep = ["DUST_\\d", "U", "V", "wind_.*"]
compression_filters = "shf|dfl"
compute_ensemble_statistics_in_job = true

This configuration sets up a dust assimilation experiment with 20 ensemble members, running on a Lambert conformal conic projection grid over North Africa, with 6-hour assimilation cycles.
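
As a quick check of the timing in this example, the arithmetic below converts `analysis_interval` to hours and counts the cycles between `start` and `end`. It assumes the cycles run back to back with no per-cycle overrides.

```python
from datetime import datetime, timezone

start = datetime(2025, 3, 1, tzinfo=timezone.utc)
end = datetime(2025, 3, 31, tzinfo=timezone.utc)
analysis_interval = 360  # minutes, as in [time_control]

total_minutes = int((end - start).total_seconds() // 60)
cycle_hours = analysis_interval / 60
n_cycles = total_minutes // analysis_interval
print(cycle_hours, n_cycles)  # 6.0 120
```

Thirty days at four cycles per day gives 120 assimilation cycles for the 20-member ensemble.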