ML-Guided Heuristics
ML-guided heuristics are an optional optimizer advisor lane. They are disabled by default and cannot change source semantics. Default compilation uses
deterministic hand-written heuristics.
Run the bounded evidence gate with:
make ml-heuristics-research-check
This gate validates source-owned metadata, model identity, fallback behavior,
compile-cost budgets, decision logs, release reports, diagnostics, and negative
fixtures. It does not train models, run broad benchmarks, self-host the
compiler, deploy, or publish remote artifacts.
Enablement
The only accepted opt-in flag is:
--enable-ml-heuristics
Without that explicit flag, the advisor returns the deterministic fallback for
the decision family.
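The opt-in rule can be sketched in Python as follows. This is a minimal illustration, not the advisor's actual code: the `advise` function, the `FALLBACKS` table, and the fallback action names are hypothetical, and the real per-decision fallbacks are recorded in ml-heuristics/advisor-policy.tsv.

```python
from typing import Optional, Tuple

# Hypothetical fallback actions, for illustration only. The real per-decision
# fallbacks are recorded in ml-heuristics/advisor-policy.tsv.
FALLBACKS = {
    "inline-size-speed": "handwritten-inline-threshold",
    "register-eviction": "handwritten-eviction-order",
}


def advise(decision_id: str, ml_enabled: bool = False,
           model_action: Optional[str] = None) -> Tuple[str, str]:
    """Return (source, action) for one decision.

    The deterministic fallback wins unless ML was explicitly enabled
    and a model actually produced an action.
    """
    fallback = FALLBACKS[decision_id]
    if not ml_enabled or model_action is None:
        return ("fallback", fallback)
    return ("ml", model_action)


print(advise("inline-size-speed"))
print(advise("inline-size-speed", ml_enabled=True, model_action="inline"))
```

In this sketch the fallback is looked up before the flag is inspected, so a decision with no fallback entry fails immediately instead of silently relying on the model.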
Pilot Decisions
The first supported advisor decisions are:
- inline-size-speed;
- register-eviction;
- branch-probability;
- layout-policy.
Each pilot decision has a fallback heuristic in
ml-heuristics/advisor-policy.tsv.
Feature Schemas
Feature schemas live in ml-heuristics/feature-schemas.tsv. Across the
supported schemas, the evidence includes:
- IR level;
- MachineIR operation class;
- hotness;
- code size;
- register pressure;
- loop depth;
- target features.
The advisor rejects inputs that do not satisfy the schema for the selected
decision.
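Schema rejection might look like the sketch below. The feature names and the per-decision requirements in `REQUIRED_FEATURES` are assumptions made for illustration, as are the helper names; the authoritative schemas are the rows of ml-heuristics/feature-schemas.tsv.

```python
from typing import Dict, List

# Hypothetical schemas: which features each decision requires. The real
# schemas are defined in ml-heuristics/feature-schemas.tsv.
REQUIRED_FEATURES = {
    "inline-size-speed": {"ir_level", "hotness", "code_size"},
    "register-eviction": {"register_pressure", "loop_depth", "target_features"},
}


def missing_features(decision_id: str, features: Dict[str, object]) -> List[str]:
    """Return the sorted names of required features absent from the input."""
    required = REQUIRED_FEATURES[decision_id]
    return sorted(required - set(features))


def check_input(decision_id: str, features: Dict[str, object]) -> None:
    """Reject the input with ValueError if any required feature is missing."""
    missing = missing_features(decision_id, features)
    if missing:
        raise ValueError(f"{decision_id}: missing features {missing}")


# A complete input passes silently; an incomplete one would raise ValueError.
check_input("inline-size-speed",
            {"ir_level": "mid", "hotness": 0.9, "code_size": 120})
```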
Model Artifacts
Model artifacts live outside the compiler binary under
ml-heuristics/models/. The registry records:
- model id;
- decision id;
- version;
- artifact path;
- artifact hash;
- detached attestation;
- offline training source;
- rollback artifact.
The compiler binary must not embed these model artifacts. A missing,
incompatible, or hash-mismatched model falls back to the deterministic
heuristic.
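The hash check and fallback can be sketched as follows. This assumes the registry's artifact hash is a hex-encoded SHA-256 digest, which the text above does not state; the function names are hypothetical, and the compatibility check is omitted.

```python
import hashlib
from pathlib import Path


def model_is_usable(artifact_path: str, expected_sha256: str) -> bool:
    """Return True only if the artifact exists and its SHA-256 matches."""
    path = Path(artifact_path)
    if not path.is_file():
        return False  # missing model: the caller should use the fallback
    digest = hashlib.sha256(path.read_bytes()).hexdigest()
    return digest == expected_sha256.lower()


def choose_source(artifact_path: str, expected_sha256: str) -> str:
    """Pick "ml" only for a verified artifact, otherwise "fallback"."""
    if model_is_usable(artifact_path, expected_sha256):
        return "ml"
    return "fallback"
```

Any failure here leads to the deterministic heuristic, never to an unverified model, which matches the fail-closed behavior described above.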
Advisor Command
The source-owned advisor command is:
python3 tools/ml_heuristics_advisor.py --input ml-heuristics/fixtures/valid-advisor-input.json
By default this returns a fallback decision. Explicit ML evaluation requires:
python3 tools/ml_heuristics_advisor.py \
  --input ml-heuristics/fixtures/valid-advisor-input.json \
  --enable-ml yes \
  --model-id inline-size-speed-v1
The output includes decision id, source, model id, action, fallback reason,
confidence, compile cost, and semantic effect.
Release Evidence
Release evidence lives in:
- release/ml-heuristics-decision-report.tsv;
- release/ml-heuristics-feature-schema-report.tsv;
- release/ml-heuristics-fallback-report.tsv;
- release/ml-heuristics-model-report.tsv;
- release/ml-heuristics-benchmark-report.tsv;
- release/ml-heuristics-rollback-report.tsv;
- release/ml-heuristics-diagnostics-report.tsv;
- perf/ml-heuristics-performance.tsv;
- compat/ml-heuristics-contract.tsv.
Diagnostics
The gate reports stable diagnostics:
- MLH_DEFAULT_ENABLED;
- MLH_MISSING_FEATURE;
- MLH_MODEL_HASH;
- MLH_NO_FALLBACK;
- MLH_COMPILE_COST;
- MLH_COMPILER_EMBEDDED;
- MLH_ROLLBACK.
Negative fixtures under ml-heuristics/fixtures/ prove the gate fails closed
for unsafe defaults, incomplete schemas, bad model identity, missing fallback,
and excessive compile cost.