Compiler And Backend Guide

compiler/main.0x0 is the canonical compiler implementation. The seed in

seed/zero.s may run this source and its emitted OISA, but it must not duplicate

the compiler pipeline.

Pipeline

The production phase ownership contract lives in docs/compiler-pipeline.html.

The production memory contract and budgets live in docs/compiler-memory.html.

This guide describes the externally visible compiler entry points and backend

behavior.

The compiler has six major responsibilities:

1. Parse .0x0 source into a small tagged AST represented with lists.

2. Validate the current semantic slice before any backend emits output.

3. Emit OISA for the self-hosting compiler path.

4. Emit C for the compatibility native path.

5. Emit GAS x86-64 assembly for linkable object files in the current integer

subset.

6. Emit ELF64/x86-64 bytes for direct native executables and the self-hosted ELF

compiler artifact.

The public OISA entry points are:


compile-file
compile-module

compile-file loads (↥ "path.0x0") imports before validation and emission.

Relative import paths are resolved from the importing file's directory.

Transitive imports share a threaded seen set, so common dependencies are loaded

once across sibling imports. Import forms may also include an alias:


(↥ "path.0x0" alias)

Aliased imports expose imported functions as alias.name. Imported modules may

declare exported functions with (↦ name another-name). Without ↦, all

functions are exported for aliased imports.

Import paths beginning with pkg: resolve through 0x0.lock local dependency

entries before loading. For example, (↥ "pkg:core-map") resolves the

dep core-map ... lockfile line and loads that path.
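
The relative resolution and threaded seen set described above can be sketched in Python. This is a minimal illustration, not the compiler's implementation: load_with_imports, read_source, and the regular expression are hypothetical names, and pkg: imports are skipped because the lockfile format is not spelled out here.

```python
import posixpath
import re

# Hypothetical matcher for (↥ "path.0x0") and (↥ "path.0x0" alias).
IMPORT_RE = re.compile(r'\(↥\s+"([^"]+)"(?:\s+([^\s()]+))?\s*\)')


def load_with_imports(path, read_source, seen=None, order=None):
    """Return file paths in load order, visiting each file at most once.

    read_source is a callable mapping a path to its source text.
    The same seen set is threaded through every recursive call, so a
    dependency shared by sibling imports is loaded only once.
    """
    seen = set() if seen is None else seen
    order = [] if order is None else order
    path = posixpath.normpath(path)
    if path in seen:
        return order
    seen.add(path)
    source = read_source(path)
    # Relative imports resolve from the importing file's directory.
    base_dir = posixpath.dirname(path)
    for target, _alias in IMPORT_RE.findall(source):
        if target.startswith("pkg:"):
            continue  # lockfile lookup is out of scope for this sketch
        load_with_imports(posixpath.join(base_dir, target), read_source, seen, order)
    order.append(path)  # dependencies are listed before their importer
    return order
```

Passing the reader as a callable keeps the sketch independent of the filesystem; a real loader would read from disk and would also need to carry the alias forward.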

compile-module compiles an in-memory source string only; it does not resolve

imports because there is no source path to use as an import root.

The native compatibility entry points live in compiler/compat-main.0x0 and

are used only to build bin/zero-native for compatibility checks and seed

recovery:


compile-native-compiler-file
compile-native-program-file

The legacy linkable object assembly entry point also lives in

compiler/compat-main.0x0 until the production object-file milestone replaces

it with direct relocatable object output:


compile-object-asm-program-file

The direct ELF entry points are:


compile-elf-program-file
compile-elf-compiler-file

AST Representation

Tokens and AST nodes keep kind and value in their first two fields, with

source line and column stored after them:


(list kind value line column)

Token kinds include lp, rp, sym, str, and int. AST node kinds include

list, sym, str, and int.

This representation is intentionally simple because it must work in the

bootstrap evaluator, generated OISA, generated C runtime, and direct ELF runtime.

Emitters continue to read only kind and value; diagnostics can read the

compiler-owned line and column metadata.

The lexer/parser now reports source locations for parser-owned failures such as

unexpected closing parentheses, missing closing parentheses, trailing tokens, and

unterminated strings.
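
To make the (list kind value line column) shape concrete, here is a small Python sketch of a lexer and parser that produce nodes in that layout and report locations for the four parser-owned failures listed above. It is an illustration under assumptions: the real lexer's rules for symbols, integers, and string escapes are not documented here, so this version treats strings as single-line with no escapes, and the names ParseError, lex, and parse are hypothetical.

```python
import re

INT_RE = re.compile(r"-?[0-9]+")  # assumed integer syntax


class ParseError(Exception):
    def __init__(self, message, line, column):
        super().__init__(f"{line}:{column}: {message}")
        self.line = line
        self.column = column


def lex(source):
    """Return tokens as [kind, value, line, column] lists."""
    tokens, i, line, col = [], 0, 1, 1
    while i < len(source):
        ch = source[i]
        if ch == "\n":
            i, line, col = i + 1, line + 1, 1
        elif ch.isspace():
            i, col = i + 1, col + 1
        elif ch in "()":
            tokens.append(["lp" if ch == "(" else "rp", ch, line, col])
            i, col = i + 1, col + 1
        elif ch == '"':
            end = source.find('"', i + 1)
            if end < 0:
                raise ParseError("unterminated string", line, col)
            tokens.append(["str", source[i + 1:end], line, col])
            i, col = end + 1, col + (end + 1 - i)
        else:
            start = i
            while i < len(source) and not source[i].isspace() and source[i] not in '()"':
                i += 1
            text = source[start:i]
            if INT_RE.fullmatch(text):
                tokens.append(["int", int(text), line, col])
            else:
                tokens.append(["sym", text, line, col])
            col += len(text)
    return tokens


def parse_node(tokens, pos):
    if pos >= len(tokens):
        raise ParseError("unexpected end of input", 1, 1)
    kind, value, line, col = tokens[pos]
    if kind == "rp":
        raise ParseError("unexpected closing parenthesis", line, col)
    if kind != "lp":
        return [kind, value, line, col], pos + 1
    items, pos = [], pos + 1
    while pos < len(tokens) and tokens[pos][0] != "rp":
        child, pos = parse_node(tokens, pos)
        items.append(child)
    if pos >= len(tokens):
        # Report the location of the opening parenthesis that was never closed.
        raise ParseError("missing closing parenthesis", line, col)
    return ["list", items, line, col], pos + 1


def parse(source):
    tokens = lex(source)
    node, pos = parse_node(tokens, 0)
    if pos < len(tokens):
        raise ParseError("trailing tokens", tokens[pos][2], tokens[pos][3])
    return node
```

An emitter written against this shape only needs node[0] and node[1]; the trailing line and column fields are there for diagnostics.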

OISA Backend

The OISA emitter preserves source expression structure while normalizing module

and function forms:


(oisa-module <name>
  (func <name> (params ...) (body ...)))

Type and documentation annotations are skipped during emission. Type annotations

are checked by the semantic validation pass before emission; documentation

annotations remain source metadata.

File imports are resolved before OISA emission. Unaliased imported functions are

appended to the same function set used by validation and emission. Aliased

imports are appended with qualified alias.name function names, while the root

file's module declaration remains the emitted OISA module name.
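
The way imported functions join the function set, and the normalized output shape, can be sketched as follows. This is a hedged illustration only: qualify and emit_oisa are hypothetical helpers, functions are modeled as (name, params, body_text) tuples, and the body text in the usage example is made-up syntax rather than real .0x0 source.

```python
def qualify(functions, alias=None):
    """Return imported functions, renamed to alias.name when aliased."""
    if alias is None:
        return list(functions)
    return [(f"{alias}.{name}", params, body) for name, params, body in functions]


def emit_oisa(module_name, functions):
    """Render the normalized (oisa-module ...) form as text.

    module_name is the root file's module name; imports never change it.
    """
    lines = [f"(oisa-module {module_name}"]
    for name, params, body in functions:
        param_form = " ".join(["params", *params])
        lines.append(f"  (func {name} ({param_form}) (body {body}))")
    return "\n".join(lines) + ")"


# Usage: root functions first, imported functions appended after them.
root_functions = [("main", [], "(math.add 1 2)")]
imported = [("add", ["a", "b"], "(+ a b)")]
print(emit_oisa("app", root_functions + qualify(imported, "math")))
```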

Semantic Validation

The validation pass is shared by the OISA, C, and direct ELF entry points. It

currently enforces:

sides are inferable.

hatch.

=/!= operands while preserving Any as the dynamic escape hatch.

env, read-file, write-file, print, panic, and functions that are not

pure by annotation or default.

The checker intentionally returns Any for complex or not-yet-modeled type

cases instead of claiming full type inference. A future type slice should

replace that escape hatch with a complete type environment and source-span

diagnostics. The current accepted capability names are pure, io, file,

network, and process; lib/core/file.0x0 uses file for safe path-checked

file wrappers. Process metadata is available through argv and env, while

socket and subprocess runtime support remain future slices.

C Compatibility Backend

The C backend emits a complete C program as text from

compiler/compat-main.0x0. It exists for compatibility, cross-checking, seed

recovery, and native user-program execution through cc.

It is not the seed and it is not allowed to own language semantics. If source

syntax or compiler lowering changes, the authoritative production

implementation remains in compiler/main.0x0. Normal compiler production after

v0.1.0 uses the released ELF compiler builder; the C backend is optional

compatibility infrastructure and is not emitted into the normal compiler

artifact.

Linkable Object Backend

The legacy object backend in compiler/compat-main.0x0 emits GAS x86-64

assembly, which the generated native compiler can then assemble into a real .o

with cc -c. The object is linkable by the platform linker and currently uses

libc printf for integer result printing. The production object milestone will

replace this compatibility path with direct relocatable object output in

compiler/main.0x0.

The current object slice supports:

arguments;

Unsupported dynamic/text/list/file constructs fail at object-backend compile

time. This keeps the object slice honest while the direct ELF backend remains

the larger native runtime target.

Commands:


./bin/zero-native asm examples/add.0x0 --out build/native/add.obj.s
./bin/zero-native build-obj examples/add.0x0 -o build/native/add.obj.o
cc build/native/add.obj.o -o build/native/add.obj
./build/native/add.obj

The convenience command assembles and links in one step:


./bin/zero-native build-linked examples/add.0x0 -o build/native/add.linked

Direct ELF Backend

The ELF backend emits a Linux ELF64/x86-64 executable directly as hex text. The

wrapper turns that hex into bytes and marks the result executable.
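
The wrapper step amounts to decoding hex text and setting the execute bits. A minimal Python sketch of that idea follows; write_executable_from_hex is a hypothetical name, and the project's actual wrapper may be implemented differently.

```python
import os
import stat


def write_executable_from_hex(hex_text, out_path):
    """Decode hex text to bytes, write them, and mark the file executable.

    Returns the number of bytes written. bytes.fromhex ignores ASCII
    whitespace between byte pairs, so line-wrapped hex is accepted.
    """
    data = bytes.fromhex(hex_text)
    with open(out_path, "wb") as handle:
        handle.write(data)
    mode = os.stat(out_path).st_mode
    os.chmod(out_path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return len(data)
```

A valid ELF file begins with the bytes 7f 45 4c 46, so checking that prefix before writing would be a cheap sanity test in a real wrapper.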

The backend currently owns:

sequencing.

signed integer conversion for the supported slice.

integer/text result printing.

Runtime values in the ELF backend are passed as:


rax = value payload
r15 = tag

The current tag convention is:


0 = nil / Unit
1 = integer or boolean payload
2 = NUL-terminated text pointer
3 = cons/list pointer
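
For readers writing tooling around this convention (for example a debugger helper that inspects rax and r15), the tag table can be mirrored as a small Python enum. The Tag and describe names are hypothetical; only the numeric values come from the table above.

```python
from enum import IntEnum


class Tag(IntEnum):
    NIL = 0   # nil / Unit
    INT = 1   # integer or boolean payload
    TEXT = 2  # NUL-terminated text pointer
    CONS = 3  # cons/list pointer


def describe(payload, tag):
    """Render a (payload, tag) pair as a short human-readable string."""
    tag = Tag(tag)  # raises ValueError for a tag outside the convention
    if tag is Tag.NIL:
        return "nil"
    if tag is Tag.INT:
        # Booleans share this tag, so they cannot be told apart from integers here.
        return f"int {payload}"
    if tag is Tag.TEXT:
        return f"text@{payload:#x}"
    return f"cons@{payload:#x}"
```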

Cons nodes are 24 bytes:


offset 0  = car payload
offset 8  = car tag
offset 16 = cdr payload

This layout is deliberately small, but it is now an ABI. Changes to it must

update this guide, the source comments near the backend, and the self-host gates.
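
The 24-byte layout can be checked against a simulated heap with Python's struct module. This sketch makes two assumptions that the layout above does not state: fields are little-endian 64-bit words (natural for x86-64), and a cdr payload of 0 marks the end of a list. The helper names are hypothetical.

```python
import struct

# car payload (signed), car tag, cdr payload (pointer): three 64-bit words.
CONS = struct.Struct("<qqQ")
assert CONS.size == 24

TAG_INT = 1  # from the tag convention: integer or boolean payload


def build_int_list(values, base=0x1000):
    """Lay out a list of integers as consecutive cons nodes.

    Returns (heap_bytes, head_address). Addresses are base + offset.
    Assumption: cdr payload 0 terminates the list.
    """
    heap = bytearray()
    for index, value in enumerate(values):
        is_last = index == len(values) - 1
        next_addr = 0 if is_last else base + (index + 1) * CONS.size
        heap += CONS.pack(value, TAG_INT, next_addr)
    head = base if values else 0
    return heap, head


def read_int_list(heap, head, base=0x1000):
    """Walk the cons chain starting at head and collect integer cars."""
    out, addr = [], head
    while addr != 0:
        car, tag, addr = CONS.unpack_from(heap, addr - base)
        if tag != TAG_INT:
            raise ValueError(f"unexpected car tag {tag}")
        out.append(car)
    return out
```

Note that the node stores a tag for the car but not for the cdr, which is why the sketch has to assume how a terminating cdr is recognized.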

Self-Host Gates

The normal self-host gate is:


compiler/main.0x0 --seed--> build/stage1.oisa
build/stage1.oisa      --vm----> build/stage2.oisa
build/stage2.oisa      --vm----> build/stage3.oisa
cmp build/stage2.oisa build/stage3.oisa
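
The final cmp step is a byte-for-byte fixed-point check: stage2 and stage3 must be identical. A Python equivalent that also reports where two artifacts first diverge is sketched below; first_difference is a hypothetical helper, not part of the repository.

```python
def first_difference(path_a, path_b, chunk_size=65536):
    """Return the 0-based offset of the first differing byte, or None.

    None means the files are byte-identical, which is what the self-host
    gate requires of stage2 and stage3. If one file is a strict prefix of
    the other, the offset returned is the length of the shorter file.
    """
    offset = 0
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        while True:
            chunk_a = a.read(chunk_size)
            chunk_b = b.read(chunk_size)
            if chunk_a != chunk_b:
                for i, (x, y) in enumerate(zip(chunk_a, chunk_b)):
                    if x != y:
                        return offset + i
                return offset + min(len(chunk_a), len(chunk_b))
            if not chunk_a:
                return None
            offset += len(chunk_a)
```

Knowing the first differing offset is useful when a gate fails, because it points at the first emitted form where the two stages disagree.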

The full self-host gate also proves that the direct ELF OISA compiler artifact

can compile the compiler to the same OISA. The artifact is emitted by the

generated native helper after the seed path proves OISA self-hosting:


compiler/main.0x0 --zero-native compiler-elf--> build/native/zero-oisa-compiler
build/native/zero-oisa-compiler compiler/main.0x0 build/native/zero-elf-stage.oisa
cmp build/stage2.oisa build/native/zero-elf-stage.oisa

The trusted release compiler builder emits the next compiler executable:


build/release/v0.1.0/bin/zero-elf-compiler compiler/main.0x0 build/zero-next
./build/zero-next compiler/main.0x0 build/stage2
./build/stage2 compiler/main.0x0 build/stage3
cmp build/stage2 build/stage3

That path is the normal succession path after v0.1.0. The seed remains only for

make bootstrap-from-seed and make verify-seed-bootstrap.

Every repository change must pass the enforced guard before commit:


make selfhost-guard

That guard runs a clean normal self-host, the full ELF compiler self-host, and

the smoke suite so language, runtime, editor, package, and documentation changes

cannot silently drift away from the self-host chain.

Memory-sensitive compiler changes must also pass:


make memory-check

That gate samples peak RSS for the released compiler chain, verifies

stage2 == stage3, and writes the memory report included in release metadata.

Backend Change Checklist

A backend change is not complete until:

they touch linkable native output.

Do not rely on an untested backend promise. If the direct ELF backend cannot run

the slice, document the exact gap as a runtime maturity issue instead of hiding it

behind the C compatibility path.