
Configuration

This page summarizes the main configuration objects exposed through the current API.

The canonical configuration definitions live in:

  • foreblocks/config.py for ModelConfig and TrainingConfig

ModelConfig

ModelConfig is a lightweight dataclass for architecture-level settings.

Core fields:

| Field | Purpose |
| --- | --- |
| model_type | selects the backbone family, such as LSTM-style vs transformer-style workflows |
| input_size | feature dimension presented to the model |
| output_size | output feature dimension |
| hidden_size | latent width for recurrent or sequence blocks |
| seq_len | source window length |
| target_len | forecast horizon |
| strategy | forecasting strategy, such as seq2seq |
| teacher_forcing_ratio | decoder teacher forcing during training |

Extended architecture fields include dim_feedforward, dropout, num_encoder_layers, num_decoder_layers, multi_encoder_decoder, latent_size, and nheads.

It is useful when you want a typed container for model-level settings, even though many examples instantiate modules directly.
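
For orientation, here is a minimal construction sketch using the core fields above. The values are illustrative, and the "lstm" selector string is an assumption about what model_type accepts; check foreblocks/config.py for the accepted values and defaults.

```python
from foreblocks.config import ModelConfig

# Minimal sketch; values are illustrative, not recommended defaults.
config = ModelConfig(
    model_type="lstm",          # backbone family (selector string assumed)
    input_size=8,               # features per timestep
    output_size=1,              # forecast feature dimension
    hidden_size=64,             # latent width for sequence blocks
    seq_len=96,                 # source window length
    target_len=24,              # forecast horizon
    strategy="seq2seq",         # forecasting strategy
    teacher_forcing_ratio=0.5,  # decoder teacher forcing during training
)
```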

TrainingConfig

TrainingConfig controls the training loop.

Core training

| Field | Purpose |
| --- | --- |
| num_epochs | maximum epoch count |
| learning_rate | optimizer learning rate |
| weight_decay | optimizer weight decay |
| batch_size | default batch size used in examples and helpers |
| patience | early-stopping patience |
| min_delta | minimum improvement threshold for early stopping |
| use_amp | enables automatic mixed precision |
| gradient_clip_val | gradient clipping threshold |
| gradient_accumulation_steps | number of mini-batches to accumulate before stepping |
| l1_regularization | optional L1 penalty |
| kl_weight | optional KL weight for compatible models |

Scheduler and logging

| Field | Purpose |
| --- | --- |
| scheduler_type | scheduler selector |
| lr_step_size | step scheduler interval |
| lr_gamma | step scheduler decay |
| min_lr | lower learning-rate bound |
| verbose | trainer logging verbosity |
| log_interval | batch logging cadence |
| save_best_model | whether to retain the best checkpoint state |
| save_model_path | optional filesystem path for saved weights |
| experiment_name | MLTracker experiment name when tracking is enabled |

Mixture-of-experts logging

| Field | Purpose |
| --- | --- |
| moe_logging | enables MoE report collection |
| moe_log_latency | records latency metrics in MoE logs |
| moe_condition_name | condition label for segmented MoE analysis |
| moe_condition_cardinality | category count for condition-aware reports |

NAS settings

| Field | Purpose |
| --- | --- |
| train_nas | enables alternating optimization for architecture parameters |
| nas_alpha_lr | learning rate for architecture parameters |
| nas_alpha_weight_decay | weight decay for architecture parameters |
| nas_warmup_epochs | epochs before alpha updates begin |
| nas_alternate_steps | alpha-step frequency |
| nas_use_val_for_alpha | uses validation loss for alpha updates |
| nas_discretize_at_end | discretizes architecture choices after training |
| nas_discretize_threshold | threshold used during discretization |
| nas_log_alphas | logs alpha values during training |

Conformal prediction settings

Shared and method-specific fields:

| Field | Purpose |
| --- | --- |
| conformal_enabled | master switch |
| conformal_method | method name, such as split, rolling, agaci, enbpi, cptc, or afocp |
| conformal_quantile | target coverage level |
| conformal_knn_k | neighborhood size for local conformal methods |
| conformal_local_window | calibration window for local methods |
| conformal_aci_gamma | adaptation rate for ACI-style methods |
| conformal_rolling_alpha | rolling update rate |
| conformal_agaci_gammas | gamma grid for AgACI |
| conformal_enbpi_B | bootstrap count for EnbPI |
| conformal_enbpi_window | rolling window for EnbPI |
| conformal_tsp_lambda | regularization parameter for TSP |
| conformal_tsp_window | calibration window for TSP |
| conformal_cptc_window | state-aware rolling window |
| conformal_cptc_tau | state filter sharpness |
| conformal_cptc_hard_state_filter | switches CPTC filtering mode |
| conformal_afocp_feature_dim | feature dimension for AFOCP |
| conformal_afocp_attn_hidden | attention hidden size for AFOCP |
| conformal_afocp_window | context window for AFOCP |
| conformal_afocp_tau | temperature-like scaling for AFOCP |
| conformal_afocp_internal_feat_hidden | hidden size of the internal feature network |
| conformal_afocp_internal_feat_depth | depth of the internal feature network |
| conformal_afocp_internal_feat_dropout | dropout inside the internal feature network |
| conformal_afocp_online_lr | online update learning rate |
| conformal_afocp_online_steps | number of online update steps |
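
As a sketch of how the shared switch combines with method-specific fields, an EnbPI-style setup built only from the fields above might look like this; the numeric values are illustrative, not tuned defaults.

```python
# Sketch: EnbPI conformal setup; values are illustrative.
config = TrainingConfig(
    conformal_enabled=True,      # master switch
    conformal_method="enbpi",    # method selector
    conformal_quantile=0.90,     # target coverage level
    conformal_enbpi_B=30,        # bootstrap count for EnbPI
    conformal_enbpi_window=200,  # rolling window for EnbPI
)
```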

Helper methods

TrainingConfig also provides two helper methods (usage sketched below):

  • update(**kwargs) for safe field updates
  • get_conformal_params() to build the trainer-facing conformal configuration payload
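
A short usage sketch. It assumes update rejects unknown field names (the "safe" part) and that get_conformal_params returns a plain dictionary; verify both against foreblocks/config.py.

```python
from foreblocks.config import TrainingConfig

config = TrainingConfig(num_epochs=10)

# Adjust known fields after construction instead of setting
# attributes ad hoc.
config.update(learning_rate=1e-4, patience=5)

# Collect the conformal_* fields into the trainer-facing payload.
params = config.get_conformal_params()
print(params)
```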

Workflow recipes

Smoke test — no optional features:

```python
from foreblocks.config import TrainingConfig  # canonical definition, per above

config = TrainingConfig(
    num_epochs=5,
    batch_size=16,
    patience=3,
    use_amp=False,
)
# Assumes `model` and `Trainer` are already constructed/imported.
trainer = Trainer(model, config=config, auto_track=False)
```

Preprocessing-heavy run — longer training, AMP, gradient clipping:

```python
config = TrainingConfig(
    num_epochs=50,
    learning_rate=3e-4,
    batch_size=32,
    patience=10,
    use_amp=True,
    gradient_clip_val=1.0,
    scheduler_type="cosine",
    min_lr=1e-6,
    save_best_model=True,
    experiment_name="revin_decomp_baseline",
)
```

DARTS / NAS run — architecture parameter alternation:

```python
config = TrainingConfig(
    num_epochs=30,
    learning_rate=1e-3,
    batch_size=32,
    train_nas=True,
    nas_alpha_lr=3e-4,
    nas_warmup_epochs=5,
    nas_alternate_steps=1,
    nas_use_val_for_alpha=True,
    nas_discretize_at_end=True,
    nas_log_alphas=True,
)
```

MoE monitoring run — enable router diagnostics:

```python
config = TrainingConfig(
    num_epochs=40,
    learning_rate=3e-4,
    use_amp=True,
    moe_logging=True,
    moe_log_latency=True,
    moe_condition_name="season",
    moe_condition_cardinality=4,
    experiment_name="moe_router_audit",
    save_best_model=True,
)
```

Conformal prediction run — split conformal with 90% coverage:

```python
config = TrainingConfig(
    num_epochs=30,
    batch_size=32,
    conformal_enabled=True,
    conformal_method="split",
    conformal_quantile=0.90,
)
```

Tracking-oriented production run:

```python
config = TrainingConfig(
    experiment_name="baseline_direct",
    save_best_model=True,
    save_model_path="checkpoints/best.pt",
    use_amp=True,
    gradient_clip_val=1.0,
    scheduler_type="cosine",
)
```

TimeSeriesHandler settings

Important preprocessor controls include the following (a constructor sketch follows the list):

  • window_size
  • horizon
  • normalize
  • differencing
  • detrend
  • apply_filter
  • apply_ewt
  • remove_outliers
  • apply_imputation
  • generate_time_features
  • self_tune
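
A constructor sketch using only the controls above. The import path and the assumption that these are plain keyword arguments are unverified here; treat the dedicated guide below as authoritative.

```python
from foreblocks import TimeSeriesHandler  # import path is an assumption

# Conservative preprocessing setup; values are illustrative.
handler = TimeSeriesHandler(
    window_size=96,              # source window length
    horizon=24,                  # forecast horizon
    normalize=True,
    differencing=False,
    detrend=False,
    apply_filter=False,
    apply_ewt=False,             # empirical wavelet transform off
    remove_outliers=True,
    apply_imputation=True,
    generate_time_features=True,
    self_tune=False,
)
```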

Use the dedicated preprocessing guide for more detail.

Practical guidance

  • For first runs, keep configuration small and explicit.
  • Disable optional features until the core training loop is working.
  • Enable conformal and NAS only after validating your baseline training path.
  • Use auto_track=False on Trainer when you do not want MLTracker involved in local smoke tests.
