Quick Start¶
This guide gets you from raw model output to a rigorous statistical report in under 5 minutes.
What you'll learn
- Install RetroCast and initialize a project
- Configure an adapter for your model's output format
- Run the ingest → score → analyze pipeline
- Generate statistical reports with confidence intervals
1. Install¶
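Install RetroCast as a standalone CLI tool. The `uv tool install` pattern below mirrors the viz-extras note later in this guide; a plain `pip install` should also work, assuming the package is published on PyPI under the same name.

```bash
# Install the RetroCast CLI (same uv pattern used for the optional viz extras later)
uv tool install retrocast
```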
Verify installation:
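A quick smoke test; `--version` is assumed here to be a standard flag, and `retrocast config` (used later in this guide) works just as well:

```bash
# Assumed standard flag; any successful invocation (e.g. retrocast --help) confirms the install
retrocast --version
```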
2. Initialize Project¶
Go to your working directory and create the default configuration and directory structure:
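A sketch of this step; `init` is a hypothetical subcommand name here, so check `retrocast --help` for the actual initialization command:

```bash
cd path/to/your/project

# Hypothetical subcommand name - creates retrocast-config.yaml and the data/retrocast/ tree
retrocast init
```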
This creates:
- `retrocast-config.yaml` - Configuration file
- `data/retrocast/` - Structured data directories (1-benchmarks, 2-raw, 3-processed, 4-scored, 5-results)
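Taken together, a freshly initialized project should look roughly like this (names taken from the list above):

```
.
├── retrocast-config.yaml
└── data/retrocast/
    ├── 1-benchmarks/
    ├── 2-raw/
    ├── 3-processed/
    ├── 4-scored/
    └── 5-results/
```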
Custom data directory
You can customize the data directory location via:
- CLI flag: `retrocast --data-dir ./my-data <command>`
- Environment variable: `export RETROCAST_DATA_DIR=./my-data`
- Config file: Add `data_dir: ./my-data` to `retrocast-config.yaml`
Run retrocast config to see the resolved paths.
Configure Your Model¶
Open retrocast-config.yaml and register your model. You need to tell RetroCast which adapter to use to parse your files.
```yaml
models:
  # The name you will use in CLI commands
  my-new-model: # (1)!
    # The parser logic (see docs/developers/adapters.md)
    adapter: aizynth # (2)!
    # The filename RetroCast looks for in 2-raw/
    raw_results_filename: predictions.json # (3)!
    sampling: # (4)!
      strategy: top-k
      k: 10
```
1. Choose a descriptive name (lowercase with hyphens)
2. See supported adapters - includes AiZynthFinder, Retro*, DMS, SynPlanner, Syntheseus, ASKCOS, and more
3. Must match the filename you'll place in `2-raw/` within your data directory
4. Optional: Limit routes per target (omit to keep all routes)
3. The Workflow (Ingest → Score → Analyze)¶
RetroCast enforces a structured workflow to ensure reproducibility:
```mermaid
graph LR
    A[Place Raw Data<br/>2-raw/] --> B[Ingest<br/>Standardize]
    B --> C[Score<br/>Evaluate]
    C --> D[Analyze<br/>Statistics]
    B -.-> E[3-processed/]
    C -.-> F[4-scored/]
    D -.-> G[5-results/]
```
All paths are relative to your data directory (default: data/retrocast/).
Step A: Place Raw Data¶
Put your model's raw output file in the 2-raw/ directory (within your data directory) following this structure:
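The expected layout, inferred from the example below: one folder per model, one folder per benchmark inside it, containing the file named by `raw_results_filename` in your config.

```
data/retrocast/2-raw/
└── <model-name>/             # e.g. my-new-model
    └── <benchmark-name>/     # e.g. mkt-cnv-160
        └── predictions.json  # must match raw_results_filename
```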
Example:
```bash
mkdir -p data/retrocast/2-raw/my-new-model/mkt-cnv-160
cp predictions.json data/retrocast/2-raw/my-new-model/mkt-cnv-160/
```
Available benchmarks
See Benchmarks Guide for details on evaluation sets:
- Market Series (`mkt-*`): Practical utility with commercial stock
- Reference Series (`ref-*`): Algorithm comparison with ground-truth stock
Step B: Ingest¶
Convert raw output into the canonical RetroCast Route format. This standardizes data and removes duplicates.
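A plausible invocation sketch; the `--model` and `--benchmark` flag names are assumptions, so check `retrocast --help` for the real signature:

```bash
# Hypothetical flags; reads raw files from 2-raw/ and writes standardized routes to 3-processed/
retrocast ingest --model my-new-model --benchmark mkt-cnv-160
```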
Output: data/retrocast/3-processed/my-new-model/mkt-cnv-160/routes.json.gz
Step C: Score¶
Evaluate routes against the benchmark's defined stock.
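Again a hedged sketch with assumed flag names:

```bash
# Hypothetical flags; evaluates the ingested routes against the benchmark stock, writing to 4-scored/
retrocast score --model my-new-model --benchmark mkt-cnv-160
```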
Output: data/retrocast/4-scored/my-new-model/mkt-cnv-160/scores.json.gz
Step D: Analyze¶
Generate final report with bootstrapped confidence intervals and visualization plots.
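A sketch with assumed flag names; only `--make-plots` is confirmed by the output list below:

```bash
# Hypothetical flags apart from --make-plots; writes report.md and plots to 5-results/
retrocast analyze --model my-new-model --benchmark mkt-cnv-160 --make-plots
```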
Output: data/retrocast/5-results/mkt-cnv-160/my-new-model/
- `report.md` - Statistical summary
- `*.html` - Interactive plots (add the `--make-plots` flag and make sure to install the `viz` dependency group, i.e. `uv tool install "retrocast[viz]"`)
You're done!
Check data/retrocast/5-results/mkt-cnv-160/my-new-model/report.md for your results!
Alternative: Quick Evaluation¶
Just want to score one file?
If you don't want to set up a full project structure, use the score-file command:
```bash
retrocast score-file \
  --benchmark data/retrocast/1-benchmarks/definitions/mkt-cnv-160.json.gz \
  --routes my_predictions.json.gz \
  --stock data/retrocast/1-benchmarks/stocks/buyables-stock.txt \
  --output scores.json.gz \
  --model-name "Quick-Check"
```
This skips the config setup and directly evaluates a single predictions file.
Next Steps¶
Learn the Concepts
Read Concepts to understand why we use adapters and manifests.
Use the Python API
Want to use RetroCast inside your own scripts? See the Library Guide.
Write Custom Adapters
Need to support a new output format? Learn how to write an Adapter.
Full CLI Reference
See all available commands in the CLI Reference.
Explore Benchmarks
Learn about stratified evaluation sets in the Benchmarks Guide.