ML-Guided Heuristics

ML-guided heuristics are an optional optimizer advisor lane. They are disabled
by default and cannot change source semantics. Default compilation uses
deterministic hand-written heuristics.

Run the bounded evidence gate with:


make ml-heuristics-research-check

This gate validates source-owned metadata, model identity, fallback behavior,
compile-cost budgets, decision logs, release reports, diagnostics, and negative
fixtures. It does not train models, run broad benchmarks, self-host the
compiler, deploy, or publish remote artifacts.

Enablement

The only accepted opt-in flag is:


--enable-ml-heuristics

Without that explicit flag, the advisor returns the deterministic fallback for
the decision family.
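
The opt-in rule can be pictured as a small gate in front of the model. The
following is a minimal sketch only, not the compiler's actual code: the names
Decision, advise, fallback, and model are hypothetical and exist purely for
illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Decision:
    action: str
    source: str  # "fallback" or "ml" (hypothetical labels)


def advise(
    features: dict,
    fallback: Callable[[dict], str],
    model: Optional[Callable[[dict], str]] = None,
    enable_ml: bool = False,
) -> Decision:
    """Return the fallback decision unless ML was explicitly enabled."""
    # Without the explicit opt-in (or without a usable model), the
    # deterministic heuristic always decides.
    if not enable_ml or model is None:
        return Decision(action=fallback(features), source="fallback")
    return Decision(action=model(features), source="ml")
```

Keeping the fallback as the default branch means that forgetting to pass the
opt-in can only ever produce the deterministic result.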

Pilot Decisions

The first supported advisor decisions are:

inline-size-speed;
register-eviction;
branch-probability;
layout-policy.

Each pilot decision has a fallback heuristic in
ml-heuristics/advisor-policy.tsv.

Feature Schemas

Feature schemas live in ml-heuristics/feature-schemas.tsv. Across the
supported schemas, the evidence includes:

IR level;
MachineIR operation class;
hotness;
code size;
register pressure;
loop depth;
target features.

The advisor rejects inputs that do not satisfy the schema for the selected
decision.
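
Schema rejection of this kind might look like the sketch below. It is an
illustration under assumptions: the feature names, their types, and the
validate_features helper are hypothetical, and the real column layout of
ml-heuristics/feature-schemas.tsv is not shown here.

```python
def validate_features(features: dict, schema: dict) -> list[str]:
    """Return a list of problems; an empty list means the input is accepted."""
    errors = []
    for name, expected_type in schema.items():
        if name not in features:
            errors.append(f"missing feature: {name}")
            continue
        value = features[name]
        # bool is a subclass of int in Python, so reject it explicitly;
        # this sketch assumes no boolean features.
        if isinstance(value, bool) or not isinstance(value, expected_type):
            errors.append(f"wrong type for {name}: {type(value).__name__}")
    for name in features:
        if name not in schema:
            errors.append(f"unexpected feature: {name}")
    return errors


# Hypothetical schema for one decision; names and types are assumptions.
INLINE_SCHEMA = {"hotness": float, "code_size": int, "loop_depth": int}
```

A caller would treat any non-empty result as a rejection and use the
deterministic fallback instead.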

Model Artifacts

Model artifacts live outside the compiler binary under
ml-heuristics/models/. The registry records:

model id;
decision id;
version;
artifact path;
artifact hash;
detached attestation;
offline training source;
rollback artifact.

The compiler binary must not embed these model artifacts. A missing,
incompatible, or hash-mismatched model falls back to the deterministic
heuristic.
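
The hash check can be sketched as follows. This is an illustration, not the
project's loader: the use of SHA-256 is an assumption (the registry's hash
algorithm is not stated here), and load_verified_model is a hypothetical
helper name.

```python
import hashlib
from pathlib import Path
from typing import Optional


def load_verified_model(artifact_path: Path, expected_sha256: str) -> Optional[bytes]:
    """Return the artifact bytes, or None when the caller must fall back."""
    try:
        data = artifact_path.read_bytes()
    except OSError:
        # Missing or unreadable artifact: signal fallback.
        return None
    # SHA-256 is assumed for this sketch only.
    if hashlib.sha256(data).hexdigest() != expected_sha256.lower():
        # Hash mismatch: signal fallback.
        return None
    return data
```

Returning None for every failure mode leaves the caller with one simple rule:
no verified bytes, no ML decision.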

Advisor Command

The source-owned advisor command is:


python3 tools/ml_heuristics_advisor.py --input ml-heuristics/fixtures/valid-advisor-input.json

By default this returns a fallback decision. Explicit ML evaluation requires:


python3 tools/ml_heuristics_advisor.py \
  --input ml-heuristics/fixtures/valid-advisor-input.json \
  --enable-ml yes \
  --model-id inline-size-speed-v1

The output includes decision id, source, model id, action, fallback reason,
confidence, compile cost, and semantic effect.
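
A consumer of the advisor output could check that all of these fields are
present. The sketch below assumes the output is a JSON object with snake_case
keys; neither the output format nor the key spellings are confirmed here, and
missing_output_fields is a hypothetical helper.

```python
import json

# Hypothetical key names derived from the field list above.
REQUIRED_FIELDS = (
    "decision_id",
    "source",
    "model_id",
    "action",
    "fallback_reason",
    "confidence",
    "compile_cost",
    "semantic_effect",
)


def missing_output_fields(raw: str) -> list[str]:
    """Return the required fields that are absent from a JSON record."""
    record = json.loads(raw)
    return [name for name in REQUIRED_FIELDS if name not in record]
```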

Release Evidence

Release evidence lives in:

release/ml-heuristics-decision-report.tsv;
release/ml-heuristics-feature-schema-report.tsv;
release/ml-heuristics-fallback-report.tsv;
release/ml-heuristics-model-report.tsv;
release/ml-heuristics-benchmark-report.tsv;
release/ml-heuristics-rollback-report.tsv;
release/ml-heuristics-diagnostics-report.tsv;
perf/ml-heuristics-performance.tsv;
compat/ml-heuristics-contract.tsv.

Diagnostics

The gate reports stable diagnostics:

MLH_DEFAULT_ENABLED;
MLH_MISSING_FEATURE;
MLH_MODEL_HASH;
MLH_NO_FALLBACK;
MLH_COMPILE_COST;
MLH_COMPILER_EMBEDDED;
MLH_ROLLBACK.

Negative fixtures under ml-heuristics/fixtures/ prove the gate fails closed
for unsafe defaults, incomplete schemas, bad model identity, missing fallback,
and excessive compile cost.
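
"Fails closed" means that absent or unreadable information counts as a
failure rather than a pass. The sketch below illustrates the idea for three
of the diagnostics. The pairing of each code with a condition is inferred
from the code names, and the configuration keys and the gate_diagnostics
helper are hypothetical.

```python
def gate_diagnostics(config: dict) -> list[str]:
    """Return diagnostic codes; the gate passes only when the list is empty."""
    diagnostics = []
    # A missing key is treated as the unsafe value, so the check fails closed.
    if config.get("enabled_by_default", True):
        diagnostics.append("MLH_DEFAULT_ENABLED")
    if not config.get("fallback"):
        diagnostics.append("MLH_NO_FALLBACK")
    cost = config.get("compile_cost_ms")
    budget = config.get("compile_cost_budget_ms")
    # Unknown cost or budget counts as over budget.
    if cost is None or budget is None or cost > budget:
        diagnostics.append("MLH_COMPILE_COST")
    return diagnostics
```

An empty configuration therefore triggers every check in the sketch, which
is the behavior a negative fixture would be expected to demonstrate.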