DARTS Search Pipeline¶

This page maps the conceptual DARTS workflow onto the actual ForeBlocks modules.

Why this pipeline exists¶

ForeBlocks does not treat DARTS as a single opaque training call. The search stack is deliberately staged so you can:

prune bad candidates early with cheap signals
spend short differentiable-search budgets only on promoted candidates
derive a discrete architecture explicitly
retrain the final model from scratch when desired

That structure makes it easier to debug, benchmark, and reason about the search than a monolithic NAS loop.

Phase-to-module map¶

Phase	Main code location	Role
search-space definition	`foreblocks/darts/config.py`	defines architecture modes, op pools, and search budgets
candidate generation	`foreblocks/darts/search/orchestrator.py`	samples candidate configs and coordinates evaluation
zero-cost ranking	`foreblocks/darts/search/zero_cost.py`	computes cheap pre-training metrics
bilevel search training	`foreblocks/darts/training/darts_loop.py`	runs the mixed-architecture DARTS phase
multi-fidelity orchestration	`foreblocks/darts/search/multi_fidelity.py`	promotes top candidates and manages staged budgets
final fixed-model retraining	`foreblocks/darts/training/final_trainer.py`	retrains the derived architecture
result analysis	`foreblocks/darts/evaluation/analyzer.py`	builds search-result summaries and plots
public entry surface	`foreblocks/darts/trainer.py`	exposes the staged workflow as `DARTSTrainer` methods

Public control flow¶

The high-level path is:

instantiate DARTSTrainer
call multi_fidelity_search(...)
inspect the returned candidates and final metrics
save the best final model

Internally, the call chain is roughly:

DARTSTrainer.multi_fidelity_search(...)
  -> search/multi_fidelity.py
      -> search/orchestrator.py
      -> search/zero_cost.py
      -> training/darts_loop.py
      -> trainer.derive_final_architecture(...)
      -> training/final_trainer.py

Result artifacts by phase¶

Candidate generation and zero-cost ranking¶

Produces:

raw candidates
cheap ranking metrics
promoted top_k candidates

This is the best phase for debugging search-space quality.

Bilevel DARTS phase¶

Produces:

mixed searched model
training and validation losses
alpha histories
intermediate metrics

This is the best phase for debugging collapse, instability, or bad promotion thresholds.

Final retraining phase¶

Produces:

the fixed final_model
final metrics
training info

This is the artifact that should matter most for downstream use.

The important design choice: fixed-model retraining¶

ForeBlocks explicitly separates search-time mixed architectures from final deployment artifacts. That means the model you save after derive_final_architecture(...) is not just the same mixed model with a different label; it is a discrete architecture intended to be retrained and validated on its own merits.

This separation matters because:

mixed-model performance can overestimate the final discrete architecture
retraining reveals whether the search signal really transfers
saving and sharing the final model becomes cleaner

Stats collection and benchmarking¶

When collect_stats=True, the multi-fidelity pipeline stores timing and worker-parallelism artifacts. This is useful when you want to compare:

how expensive phase 1 really is
whether more workers help in practice
whether candidate generation or search training is the true bottleneck

Treat this as an experiment-benchmarking feature, not something you need for every search run.

DARTS Search Pipeline¶

Why this pipeline exists¶

Phase-to-module map¶

Public control flow¶

Result artifacts by phase¶

Candidate generation and zero-cost ranking¶

Bilevel DARTS phase¶

Final retraining phase¶

The important design choice: fixed-model retraining¶

Stats collection and benchmarking¶

Related pages¶