Independent Compiler Guide
The independent compiler milestone adds a second implementation that can
cross-check the main compiler. It must be a real compiler implementation, not a
wrapper around compiler/main.0x0 and not a generated-C-only path.
Required Source
The expected implementation path is:
compiler2/main.0x0
It may import only these compiler2-owned backend support modules:
compiler2/encoding.0x0
compiler2/object-format.0x0
compiler2/elf-runtime.0x0
compiler2/encoding.0x0 contains shared deterministic hex, endian, alignment,
padding, and source-path encoding helpers used by compiler2 object and ELF
output. compiler2/object-format.0x0 contains ELF64 relocatable header and
section-header writers used by compiler2 object output. compiler2/elf-runtime.0x0
contains executable ELF packaging and syscall snippets used by the direct
compiler2 backend. All support modules are hashed with the independent compiler
fixtures and release inputs. They must not import compiler/main.0x0, shell out
to a compiler, or contain parser, semantic checker, optimizer, or main compiler
backend internals.
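The kind of deterministic helpers described for compiler2/encoding.0x0 can be illustrated with a minimal Python sketch. It is not the module's actual code: the function names (to_hex, u32_le, u64_le, align_up, pad_to) are hypothetical, and zero-byte padding is an assumption.

```python
import struct


def to_hex(data: bytes) -> str:
    """Lowercase hex with no separators, so equal bytes always give equal text."""
    return data.hex()


def u32_le(value: int) -> bytes:
    """Encode an unsigned 32-bit value in little-endian byte order."""
    return struct.pack("<I", value)


def u64_le(value: int) -> bytes:
    """Encode an unsigned 64-bit value in little-endian byte order."""
    return struct.pack("<Q", value)


def align_up(offset: int, alignment: int) -> int:
    """Round offset up to the next multiple of a power-of-two alignment."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    return (offset + alignment - 1) & ~(alignment - 1)


def pad_to(data: bytes, alignment: int) -> bytes:
    """Append zero bytes (an assumption) until the length is aligned."""
    return data + b"\x00" * (align_up(len(data), alignment) - len(data))
```

Helpers like these are pure functions of their inputs, which is what makes the emitted bytes reproducible enough to hash.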
Current implemented slice:
- independent lexer with text-kind tokens;
- independent recursive parser with compiler2-prefixed diagnostics;
- independent front-end summary API:
```txt
compiler2-summary-file(path)
compiler2-summary-module(source)
```
The front-end slice covers submilestone 7.1. It does not claim backend emission,
cross-build support, or release participation.
Current semantic slice:
- supported top-level form validation;
- duplicate function and parameter rejection;
- module, top-level doc, import, and export form-shape validation;
- explicit export validation for missing and duplicate exported functions;
- file-backed import signature loading for unaliased and namespaced imports;
- package import signature loading through 0x0.lock;
- imported call arity, concrete argument type, and explicit export filtering
checks;
- parameter and local-binding scope checks;
- user function arity checks;
- builtin arity checks for the current pure expression slice;
- type annotation shape and parameter arity validation;
- argument, return, equality, and if condition type checks when concrete
types are inferable;
- capability annotation validation;
- pure capability checks for effectful builtins and user calls;
- unbound symbol rejection;
- diagnostic-class compatibility for parser, duplicate declaration, unbound
symbol, arity, declaration-shape, and effect failures.
This is submilestone 7.2 for the current pure front-end slice. Full type
coverage for every runtime/library value, larger backend emission, and
cross-build support are still separate submilestones.
Current backend slice:
- canonical compiler API for the supported source slice:
```txt
compile-file(path)
compile-module(source)
compile-elf-program-file(path)
compile-elf-program-module(source)
compile-elf-compiler-builder-file(path)
compile-elf-compiler-builder-chunks-file(path)
compiler2-compile-file(path)
compiler2-compile-module(source)
```
- compiler2-owned OISA text emission for validated functions in the current
semantic slice, including imported function declarations that compiler2 can
load through file or package imports;
- canonical compile-elf-program-file(path) and
compile-elf-program-module(source) APIs that emit direct Linux ELF64
executables for the implemented compiler2 I64 and Text direct backend slices;
- compile-elf-compiler-builder-file(path) and
compile-elf-compiler-builder-chunks-file(path) for the current cross-matrix
artifact slice, backed by compiler2-owned ELF emission;
- compiler2-elf-program-file(path);
- compiler2-elf-program-module(source);
- compiler2-elf-compiler-builder-file(path);
- compiler2-elf-compiler-builder-chunks-file(path);
- compiler2-object-main-i64-file(path);
- compiler2-object-main-i64-module(source);
- compiler2-object-i64-file(path);
- compiler2-object-i64-module(source);
- compiler2-elf-main-i64-file(path);
- compiler2-elf-main-i64-module(source);
- compiler2-elf-main-text-file(path);
- compiler2-elf-main-text-module(source);
- direct Linux ELF64 executable output for validated main -> I64 integer
literal, same-file direct-call, recursive-call, binary arithmetic,
comparison, conditional, sequencing, local-binding, parameter, and argument
passing programs in the current backend slice;
- direct Linux ELF64 executable output for constant text-len and text-eq?
expressions over literal/constant-text-cat values, lowered to I64/Bool
values without runtime Text allocation;
- direct Linux ELF64 executable output for constant text-at, text-slice,
is-space?, is-digit?, and is-alpha? expressions over literal and
constant text values, lowered without runtime Text allocation;
- direct Linux ELF64 executable output for constant int-to-text and
text-to-int expressions, lowered without runtime conversion allocation;
- direct Linux ELF64 executable output for validated main -> Text literal
and constant text-cat programs that write the resulting bytes to stdout
through the Linux syscall boundary;
- ELF64 ET_REL object output for validated functions returning signed
32-bit integer literals, sign-extended as ABI I64;
- ELF64 ET_REL object output for inline I64 arithmetic, comparison,
conditional, sequencing, and local-binding expressions that require no nested
relocation-bearing calls;
- same-object direct calls between integer-returning functions, represented as
x86-64 call rel32 stubs with R_X86_64_PC32 entries in .rela.text,
including the current I64 argument-register setup slice;
- imported pure helper calls for the current object backend slice, including
Unit -> I64 and (I64 I64) -> I64 signatures loaded from imported source
files, represented as undefined global function symbols resolved by
zero-link;
- import/local symbol conflict rejection before object emission;
- explicit imported-module export lists constrain which external symbols are
accepted by the compiler2 object backend;
- .text, .rela.text, .note.0x0.abi, .note.0x0.source, .symtab,
.strtab, and .shstrtab;
- one local .text.local section symbol before exported function symbols, with
.symtab.sh_info pointing at the first global symbol;
- ABI-compatible output that links with zero-link.
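The same-object direct-call item above pairs an x86-64 call rel32 stub with an R_X86_64_PC32 entry in .rela.text. The Python sketch below shows one plausible shape for that pair. The Elf64_Rela layout and the relocation type number come from the ELF and x86-64 psABI specifications; the function names and the -4 addend convention are assumptions about compiler2, not its confirmed behavior.

```python
import struct

R_X86_64_PC32 = 2  # relocation type number defined by the x86-64 psABI


def call_stub() -> bytes:
    """x86-64 `call rel32`: opcode E8 followed by a 4-byte placeholder."""
    return b"\xE8" + b"\x00" * 4


def rela_entry(offset: int, symbol_index: int, rel_type: int, addend: int) -> bytes:
    """Pack one Elf64_Rela record: r_offset, r_info, r_addend (little-endian)."""
    r_info = (symbol_index << 32) | rel_type
    return struct.pack("<QQq", offset, r_info, addend)


def call_relocation(call_offset: int, symbol_index: int) -> bytes:
    """Relocation for a call stub that starts at call_offset inside .text.

    The patched field begins one byte after the E8 opcode. The addend is -4
    (assumed convention) because rel32 is measured from the end of the field.
    """
    return rela_entry(call_offset + 1, symbol_index, R_X86_64_PC32, -4)
```

Each record is 24 bytes, so a .rela.text section holding N call relocations occupies 24 * N bytes.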
The current backend accepts a signed 32-bit integer literal for each emitted
return stub and sign-extends it to the ABI I64 return register.
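As an illustration of the sign-extension described here, the following Python sketch encodes a return stub with the x86-64 instruction mov rax, imm32 (bytes 48 C7 C0, which sign-extends the immediate to 64 bits) followed by ret. Whether compiler2 emits exactly this instruction sequence is an assumption; return_stub and register_value are hypothetical names.

```python
import struct

MASK_64 = (1 << 64) - 1


def return_stub(literal: int) -> bytes:
    """Encode `mov rax, imm32` (sign-extended) followed by `ret`."""
    if not -(1 << 31) <= literal < (1 << 31):
        raise ValueError("literal outside the signed 32-bit range")
    # 48 C7 C0 id : REX.W mov rax, imm32 ; C3 : ret
    return b"\x48\xC7\xC0" + struct.pack("<i", literal) + b"\xC3"


def register_value(stub: bytes) -> int:
    """64-bit pattern left in rax after the CPU sign-extends the immediate."""
    (imm32,) = struct.unpack_from("<i", stub, 3)
    return imm32 & MASK_64
```

For example, return_stub(-1) carries the immediate FF FF FF FF, and the sign-extended register pattern is all ones, which reads back as -1 when interpreted as a signed I64.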
This is submilestone 7.3 for the current signed integer-literal backend slice,
the first direct executable backend slice, and the first same-object direct-call
relocation slice, plus the first 7.5 mixed-output slice. The backend follows the
production object symbol-table contract for local text metadata, so compiler2
objects can be archived and linked without exporting local implementation
symbols. Wider integer immediates, non-Unit -> I64 imported signatures, data
relocations, larger mixed compiler object suites, ELF compiler-builder emission
from compiler2, and compiler cross-builds remain separate submilestones.
Executable outputs are inspected as metadata with zero-elf-info before the
gate runs them.
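A rough idea of what metadata inspection before execution can look like is sketched below in Python. It reads only standard ELF64 header fields; the real zero-elf-info output format is not documented here, so the elf_summary function and its dictionary keys are hypothetical.

```python
import struct

ET_REL, ET_EXEC = 1, 2  # e_type values from the ELF specification
EM_X86_64 = 62          # e_machine value for x86-64


def elf_summary(image: bytes) -> dict:
    """Read a few ELF64 header fields without executing anything."""
    if len(image) < 64 or image[:4] != b"\x7fELF":
        raise ValueError("not an ELF image")
    if image[4] != 2 or image[5] != 1:
        raise ValueError("expected ELFCLASS64, little-endian")
    e_type, e_machine = struct.unpack_from("<HH", image, 16)
    e_entry, e_phoff, e_shoff = struct.unpack_from("<QQQ", image, 24)
    (e_phnum,) = struct.unpack_from("<H", image, 56)
    return {
        "type": e_type,
        "machine": e_machine,
        "entry": e_entry,
        "phoff": e_phoff,
        "shoff": e_shoff,
        "phnum": e_phnum,
    }
```

A gate built this way could refuse to run any output whose type is not ET_EXEC or whose machine is not EM_X86_64.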
The second compiler may share:
- the language guide;
- ABI documentation;
- conformance programs;
- package manifests and lockfiles;
- reusable library modules whose behavior is independently tested.
It must not copy the main compiler's parser, semantic checker, optimizer, or
backend internals unless that code has been deliberately extracted into a shared
library with its own tests.
Cross-Build Matrix
The gate requires:
compiler A builds compiler A
compiler A builds compiler B
compiler B builds compiler A
compiler B builds compiler B
Compiler A is compiler/main.0x0. Compiler B is compiler2/main.0x0.
The current 7.4 source gate is make independent-cross-matrix-source-check.
It does not claim the heavy cross-build has passed. It verifies that the heavy
matrix gate is explicit before the project spends memory on it:
- the heavy matrix uses artifact_kind=elf-compiler-builder for every edge;
- both compiler sources must provide compile-elf-compiler-builder-file;
- the bootstrap entries are trusted->A and trusted->B;
- the cross-check entries are A->A, A->B, B->A, and B->B;
- the gate records a tab-separated matrix and hashes every produced compiler
output;
- the gate fails before expensive compilation if either compiler source lacks the
builder entry point.
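The matrix rules above can be sketched in Python as follows. This is only an outline under stated assumptions: SHA-256 as the hash, a plain substring test for the builder entry point, and the row layout (edge, artifact kind, digest) are all illustrative choices, not the gate's documented behavior.

```python
import hashlib

BUILDER_ENTRY = "compile-elf-compiler-builder-file"
ARTIFACT_KIND = "elf-compiler-builder"

# Bootstrap edges first, then the four cross-check edges.
EDGES = [
    ("trusted", "A"), ("trusted", "B"),
    ("A", "A"), ("A", "B"), ("B", "A"), ("B", "B"),
]


def check_sources(sources: dict) -> None:
    """Fail before any expensive compilation if a builder entry is missing."""
    for name, text in sources.items():
        if BUILDER_ENTRY not in text:  # naive substring check (assumption)
            raise ValueError(f"compiler {name} lacks {BUILDER_ENTRY}")


def matrix_rows(outputs: dict) -> list:
    """One tab-separated row per edge, with a digest of the produced output."""
    rows = []
    for builder, target in EDGES:
        digest = hashlib.sha256(outputs[(builder, target)]).hexdigest()
        rows.append("\t".join([f"{builder}->{target}", ARTIFACT_KIND, digest]))
    return rows
```

Calling check_sources before matrix_rows mirrors the ordering the gate requires: the cheap source check runs first, and hashing happens only for outputs that were actually produced.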
Validation
make independent-independence-check verifies the source-level independence
contract without compiling either compiler. It fails if compiler2/main.0x0
imports the main compiler, calls main compiler entry points or internals, shells
out to a compiler command, drops required compiler2 public APIs, or stops
maintaining a substantial c2-* implementation namespace. The same source gate
also verifies that compiler2 keeps its object backend contract, ABI/source-note
metadata, backend diagnostics, and backend/mixed-link test hooks in place.
make independent-frontend-check verifies the current 7.1 slice by parsing real
source files and checking malformed-source diagnostics without importing or
wrapping compiler/main.0x0.
make independent-semantics-check verifies the current 7.2 semantic slice with
valid source files, explicit exports, file-backed imports, namespaced imports,
package imports, and duplicate-function, duplicate-parameter, unbound-symbol,
arity, missing-export, duplicate-export, malformed-import, unexported-import,
imported-call-arity, imported-call-type, missing-package-dependency,
capability-shape, effectful-builtin, effectful-user-call, return-type,
argument-type, type-arity, and if-condition-type failures.
make independent-backend-check verifies the current 7.3 backend slice. It
exercises:
- compiler2-owned ELF64 relocatable objects for positive and negative signed
integer returns, inline expression lowering, and same-object direct calls with
and without I64 arguments;
- compiler2-owned direct ELF executables for integer-return, local-call,
arithmetic, and argument-passing source slices including examples/add.0x0,
plus recursive call, comparison, conditional, sequencing, and local-binding
source slices;
- compiler2-owned main -> Text literal and constant text-cat stdout
executables;
- constant text-len, text-eq?, text-at, text-slice, is-space?, is-digit?,
and is-alpha? I64/Bool/Text lowering;
- constant int-to-text and text-to-int conversion lowering;
- constant list, cons, tail, reverse, empty?, len, head, and nth list
lowering for list expressions known at compile time;
- compile-time local substitution for constant Text/list/I64 bindings used by
those constant builtin lowerings;
- same-file constant Text -> Text call evaluation for direct Text executable
output;
- a first runtime Text path that writes (nth (argv) 0) from the process
argument vector, lowers (print <constant Text>),
(print <same-file constant Text call>), (print (nth (argv) 0)),
(print (read-stdin)), and (print (read-file (nth (argv) 0))) to stdout with
the documented trailing newline, streams (read-stdin) to stdout without
loading the whole stream, and streams (read-file (nth (argv) 0)) to stdout
without loading the whole file;
- (write-file (nth (argv) 1) (read-file (nth (argv) 0))) as a streaming
argv-to-argv file copy;
- (write-file (nth (argv) 0) <constant Text>) and
(write-file (nth (argv) 1) <constant Text>) as direct output-file writers
that stream embedded payloads without routing through stdout.
These runtime slices exit with 2 for missing argv, 3 for input/output open
failure, and 4 for output write failure.
The gate then checks the same I64 and Text paths through compiler2's canonical
compile-elf-program-file API, inspects ABI, symbol, segment, and .rela.text
metadata, links objects with zero-link, inspects linked executable metadata
with zero-elf-info, and runs the results.
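The exit-code contract for the runtime slices (2 for missing argv, 3 for open failure, 4 for write failure) can be modelled with a short Python sketch of the streaming argv-to-argv copy. It is a behavioral model only: copy_file is a hypothetical name, the chunk size is arbitrary, and the argv list is assumed to hold the input path at index 0 and the output path at index 1, as in the expressions above.

```python
EXIT_OK = 0
EXIT_MISSING_ARGV = 2
EXIT_OPEN_FAILED = 3
EXIT_WRITE_FAILED = 4


def copy_file(argv: list, chunk_size: int = 65536) -> int:
    """Stream argv[0] into argv[1] in chunks and return a process exit code."""
    if len(argv) < 2:
        return EXIT_MISSING_ARGV
    try:
        src = open(argv[0], "rb")
    except OSError:
        return EXIT_OPEN_FAILED
    with src:
        try:
            # Unbuffered so a write error surfaces at the write call itself.
            dst = open(argv[1], "wb", buffering=0)
        except OSError:
            return EXIT_OPEN_FAILED
        with dst:
            while True:
                chunk = src.read(chunk_size)
                if not chunk:
                    return EXIT_OK
                try:
                    dst.write(chunk)
                except OSError:
                    return EXIT_WRITE_FAILED
```

Because only one chunk is held in memory at a time, the copy never loads the whole file, which is the property the gate checks for the streaming slices.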
make independent-mixed-link-check verifies the current 7.5 mixed-output slice
by linking a main-compiler object that calls helper with a compiler2-emitted
helper object, then linking a compiler2 object that imports and calls
helper with a main-compiler-emitted helper object. It also links a
compiler2 object that imports a main-compiler (I64 I64) -> I64 helper and
emits I64 argument setup before the external relocation. All paths inspect
object and linked executable metadata with zero-object-info and
zero-elf-info, then run the result. The same gate also packages the compiler2
helper object into a deterministic archive, verifies that .text.local does
not enter the archive symbol index, links the main-compiler object against that
archive, compares the archive-linked executable with the direct executable, and
runs it.
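The relationship between local symbols, .symtab.sh_info, and the archive symbol index can be sketched in Python. ELF requires local symbols to precede globals, with sh_info holding the index of the first non-local symbol. The tuple representation and function names here are hypothetical, and a real archive index also filters out undefined symbols, which this sketch ignores.

```python
STB_LOCAL, STB_GLOBAL = 0, 1  # symbol binding values from the ELF specification


def order_symbols(symbols: list) -> tuple:
    """Return (table, sh_info) with the null symbol and locals before globals.

    Each symbol is a (name, binding) pair. sh_info is the index of the first
    non-local symbol in the returned table.
    """
    null_symbol = ("", STB_LOCAL)
    local_syms = [s for s in symbols if s[1] == STB_LOCAL]
    global_syms = [s for s in symbols if s[1] != STB_LOCAL]
    table = [null_symbol] + local_syms + global_syms
    return table, 1 + len(local_syms)


def archive_index(table: list) -> list:
    """Names an archive index would list: global symbols only (simplified)."""
    return [name for name, binding in table if binding == STB_GLOBAL and name]
```

With a local .text.local entry and a global helper entry, helper lands at the sh_info index and is the only name that reaches the archive index.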
make independent-compiler-api-check verifies the current compiler2 compiler
API slice by running compiler2's own compile-file and canonical
compile-module entry points through the source runner and checking the emitted
OISA module and function declarations. It also compares compiler2's OISA for
the supported examples/add.0x0 slice against the canonical compiler's OISA for
the same source, and repeats that byte-for-byte comparison for a source file that
uses a pkg: package import. The same API gate exercises
compile-elf-program-module on an in-memory main -> Text module, decodes the
returned ELF hex, inspects ABI/segment metadata with zero-elf-info, runs the
executable, and compares stdout bytes.
make diagnostic-classes-check verifies the current 7.6 diagnostic-class slice
by running actual failing inputs through the main compiler and compiler2, then
checking the shared classes documented in docs/diagnostics.html. The shared
classes include backend unsupported-slice failures, so object backend limits do
not get hidden under generic arity diagnostics.
make diagnostic-source-check is the lightweight companion gate for the same
slice. It verifies the documented class set, classifier patterns, compiler
diagnostic strings, and compiler2 smoke inputs without invoking either
compiler.
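A pattern-based classifier of the kind these gates check might look like the Python sketch below. The class names and regular expressions are invented for illustration; the real class set lives in docs/diagnostics.html and is not reproduced here. The one design point taken from the text is ordering: the backend unsupported-slice pattern is tested first so such failures are not reported as generic arity diagnostics.

```python
import re

# Hypothetical classes and patterns, checked in order; first match wins.
CLASS_PATTERNS = [
    ("backend-unsupported", re.compile(r"unsupported .*backend|backend .*unsupported")),
    ("duplicate-declaration", re.compile(r"duplicate")),
    ("unbound-symbol", re.compile(r"unbound")),
    ("arity", re.compile(r"arity|expects \d+ argument")),
    ("parser", re.compile(r"parse|unexpected token")),
]


def classify(message: str) -> str:
    """Map a diagnostic string to its class, or 'unclassified'."""
    lowered = message.lower()
    for name, pattern in CLASS_PATTERNS:
        if pattern.search(lowered):
            return name
    return "unclassified"


def same_class(main_message: str, compiler2_message: str) -> bool:
    """True when both compilers report a failure of the same known class."""
    main_class = classify(main_message)
    return main_class != "unclassified" and main_class == classify(compiler2_message)
```

Comparing classes rather than exact strings lets the two compilers keep their own wording and prefixes while still agreeing on what kind of failure occurred.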
make independent-compiler-check must:
- build both compiler artifacts from the released trusted compiler;
- cross-build both compilers with each produced compiler;
- compile the ABI, lib0x0, object, linker, and optimizer conformance programs
through both compilers;
- prove mixed outputs are ABI-compatible;
- record hashes for every compiler output in build/independent-compiler.
The milestone is not complete until both implementations can participate in the
normal release trust chain.