MinZ Programming Language

15 verified examples — from abs_diff (4 instructions, optimal) to @smc compiled sprites (demoscene-quality output from high-level source). Every code block is actual compiler output, no hand-editing.

Highlight | Output
abs_diff u8 | SUB C / RET NC / NEG / RET (4 instructions, identical to hand-ASM)
popcount LUTGen | LD H,lut^H / LD L,C / LD A,(HL) (3 instructions + 256B table)
min_of(a,b) | JP minmax (1 instruction, tail-call elimination)
@smc draw_row | LD HL,0 / LD (HL),195 / INC HL / LD (HL),60 / RET (compiled sprite from Row2{b0:195,b1:60})
multi-return annotation | -> (u16=HL, u16=DE) (correct, verified against allocator)

Full showcase with source, output, T-state analysis and quality table


Hot off the press

  • HIR text parser + .hir file support: mz file.hir now works end-to-end. ParseHIR() round-trips any --emit=hir output identically. This enables writing compiler frontends in any language (emit HIR text, pipe to mz). 6 tests, 1,434-line parser. HIR Guide
  • @smc parameters, Phase A: baked immediates and compiled sprites. Declaring @smc r0: u16 bakes a function parameter as a LD HL, imm16 immediate instead of an ABI register. Writing r0^ = Row2{ b0: 195, b1: 60 } emits LD (HL), 195 / INC HL / LD (HL), 60, with zero alloca and zero register passing. The auto-synthesised patcher draw_row_set_r0(new_addr: u16) patches the baked imm16 in ~40T. Two @smc params give HL+DE independently. This is the foundation for ZX Spectrum compiled sprites: draw in ~36T/row vs ~168T/row LDIR. Showcase section §10 | ADR-0019
  • Codegen correctness fixes (2026-03-11) — Three Z80 codegen bugs squashed: (1) function annotation multi-return showed -> (u16=HL, u16=HL) — now reads actual allocated registers via TermRet scan, correctly showing -> (u16=HL, u16=DE); (2) clobbers: BC false-positive in callers — ExtraRets of CALL instructions no longer counted as caller-side clobbers; (3) PUSH A; POP HL invalid widening — emitMov now detects 8-bit→16-bit and emits LD L, A; LD H, 0 instead. Bonus: ADD DE, HL invalid Z80 instruction fixed — 16-bit ADD now routes through HL with EX DE,HL when destination is DE.
  • @smc showcase section added to ASM Showcase — §10 shows real compiled output: single-row 36T draw, two-row sprite with HL+DE addressing, composition table (4 pieces that clicked together), and the full 16×8 sprite projection. ex10b_fib_iter.nanz, ex10c_fib_fold.nanz added to showcase-src/2026-03-11/.
  • The Nanz Language Book v2 — Complete rewrite grounded in actual source code. 13 chapters + 5 appendices: full pipeline walk-through (parse→HIR→MIR2→Z80), PBQP register allocator internals, optimization passes (BranchEquiv, CondRetSink, CmpSubCarry, LUTGen), iterator chain fusion, zero-cost abstractions (UFCS, interfaces, lambdas), QBE correctness oracle, PL/M-80 corpus seeding with idiomatic translation examples, and a compiled output gallery with real .a80 assembly from mz. Supersedes v1.
  • Open Bugs & Root Cause Analysis — 6 tracked MIR2/codegen bugs with full RCA: BUG-001 GCD parallel-copy bloat (PBQP affinity edges missing), BUG-002 forEach constant rematerialization, BUG-003 ptr[i] in while loop (invalid EX DE,HL / ADD F,DE), BUG-004 non-zero-lo LUT contract mismatch, BUG-005 applySubSwapNeg missing u16 guard, BUG-006 zero-size struct globals not emitted. Priority/severity/fix-size for each.
  • Nanz Z80 Output Quality & Allocator Trilogy: 823-line deep-dive covering three layers shipped in one session: (1) struct literal syntax Color{r: 255, g: 0, b: 0} with in-place construction (no alloca for assignment targets), enabling HL-chain optimization automatically; (2) LD (HL), n immediate store folding: constant fields now emit LD (HL), 255 (10T/2B) instead of LD C, 255; LD (HL), C (14T/3B), saving 4T/1B per constant field store; (3) StorageClass taxonomy and ADR-0019: three tiers of global storage, StorageNormal (DB data section), StorageSMCGetter (values baked as instruction immediates in getter function) and StoragePhantom (per-read-site immediates), plus StorageBakedSprite (ZX Spectrum compiled sprite: pixel data AND screen addresses baked as instruction immediates in synthesised draw function, 346T vs ~1344T LDIR for 16×8 sprite). Foundation for ZX Spectrum baked sprite codegen: set_pos() patches address immediates; set_frame() patches pixel immediates.
  • Nanz Real ASM Showcase: abs_diff Optimal Z80 — 4 Instructions — Full five-pass optimization chain that transforms abs_diff(a,b:u8) from 8 naive instructions to SUB B / RET NC / NEG / RET (4 instructions, ~19T fast path). Passes: CondRetSink (hoist trivial return), SubSwapNeg (sub(b,a)→neg(r)), HoistReorderSubBeforeCmp (Sub before Cmp), CmpSubCarry (carry already in F — no CP emitted), PBQP interference elimination (ClassAcc mismatch gone). Inline survival analysis: TermCondRet becomes TermBrIfJR NC, .label at the call site — same 4 instructions, zero CALL/RET overhead. Also covers the u16 case: AND A; SBC HL,DE; RET NC fast path (30T) with 16-bit NEG via NEG+SBC sequence.
  • Closure capture fix: forEach lambdas that write to outer local variables (s = s + x where s is declared outside the lambda) no longer panic. The hasFreeVars pass in LowerModule detects free-variable references in lambda bodies and skips standalone lowering; the lambda is only lowered inline via lowerFusedForEach where outer vars are correctly threaded as SSA block params. TestLambdaCapture added.
  • LUT Pointer Selection & PBQP Edge Costs — Page-aligned LUT fast path trimmed from 21T → 18T (−14%, −1 byte) by replacing LD HL, sym with LD H, sym^H — the low byte was immediately overwritten by the index so only the page base (high byte) matters. Full pointer-register timing table: HL/DE/BC all achieve 18T for page-aligned LUT; BC★ achieves 14T when the index is already in C (planned codegen check); IX/IY cost 38T due to the 12T (IX+d) DD-prefix penalty and are excluded from LUT access. Analysis documents where this belongs architecturally: instruction selection (current fix), post-allocation codegen check (BC★), and full PBQP edge costs (Phase 6e) — edge costs are the correct formulation for correlated allocation decisions (idx→C + ptr→BC cheaper than independent assignments). ADR-0017 written.
  • Phase 6: Register Allocator Revolution — PBQP, IX/IY, Copy Coalescing — Three phases shipped in one session: (6a) IX/IY indexed addressing in Z80 codegen — (IX+0) / (IX+1) displacement, undocumented LD IXl,L / LD IXh,H byte-copy (16T vs 21T for PUSH/POP), no more invalid bare (IX); (6b) full PBQP allocator replacing greedy — weighted cost vectors (useCount × slotCost), R0/R1/RN reduction rules, delta sort (2nd_best − best) ensures hot registers claim zero-cost locations first: r_heavy(10×) → A (0T), r_light(1×) → C (6T) vs greedy's arbitrary ordering; (6c) post-allocation copy coalescing eliminates trampolines at block boundaries by matching block param and arg physical locations — single-pass with recolored lock to avoid rotation cycles in loop phi-webs. Combined result: four simultaneously-live ClassPointer registers → HL/DE/BC/IX, zero $F0xx spills, correct (IX+0) addressing, 10 new tests, 23/23 packages green.
  • Nanz Week 1: Struct Methods, UFCS, Zero-Cost Interfaces + Phase 6 RCA — Go-style interfaces (interface Animal { speak }) compile to direct CALL — zero vtable, zero indirection. Three animals, three speak implementations, one all_speak caller: CALL Dog_speak / CALL Cat_speak / CALL Bird_speak — each 17T on Z80 vs ~55T for Go's interface dispatch. Struct methods (fun Dog.speak(self: Dog)), UFCS dispatch, operator overloading, struct-typed parameters, and interface declarations all wired into the HIR→MIR2→QBE pipeline. 3 new E2E tests verify the full chain (parse → HIR → MIR2 → QBE → native binary). Honest RCA of current Z80 codegen bugs (LD A, HL invalid instruction, self-param spill to $F0xx) with root causes traced to VarRefExpr.Ty = TyU8 hardcoding and missing PtrAdd(x, 0) → x fold. Phase 6 plan: spill cost model + ClassIX/IY as overflow pool, full PBQP graph coloring, copy coalescing.
  • MIR2→QBE: Native Backend & Correctness Oracle. pkg/mir2qbe compiles MIR2 modules to QBE IL and runs them natively on arm64/x86_64. Full pipeline: Nanz/PL/M → HIR → MIR2 → QBE → cc → native binary. Correctness oracle: both paths share the frontend, HIR and MIR2 stages, so if the Z80 emulator and the native binary disagree, the bug is most likely in Z80 codegen. 4/4 E2E tests: PL/M abs_diff + fib, Nanz sum_array (real ptr[i] loop with bidirectional pointer type inference), Nanz abs_diff. Side-by-side QBE IL vs hand-written + arm64 disasm comparison. brew install qbe is all you need.
  • E2E Overview: Architecture, Frontends, MIR2, and PBQP Roadmap — comprehensive deep-dive: what Nanz/MinZ/PL/M-80 each are and why, MIR1 vs MIR2 design philosophy, how register classes map to PBQP, interprocedural contract optimization with before/after assembly, LUTGen, flag-return ABI, JRS, and the full roadmap to graph-coloring PBQP. Honest gap list included.
  • JRS pseudo-instruction in MZA: codegen now emits JRS for all local-label branches. MZA expands it to JR (2 bytes) when the offset fits and the condition is JR-compatible (NZ/Z/NC/C), auto-promotes to JP (3 bytes) via the existing multi-pass convergence when the displacement falls outside JR's signed 8-bit range (−128..+127), and emits JP directly for conditions JR doesn't support (PE/PO/P/M). Zero codegen complexity: MZA sorts it out.
  • LUTGen: compile-time lookup tables from ranged types — annotate a parameter with u8<0..255> and the compiler evaluates the function body at compile time for all 256 inputs, emitting a page-aligned DB table. A popcount loop that runs 8 iterations becomes 3 instructions + RET at runtime (LD HL, lut / LD L, C / LD A, (HL) / RET).
  • Nanz & PL/M: Factual Status, Real Examples — three separate pipelines, what each can do, real compiled Z80 output, honest gaps. Corrects earlier claim that "Nanz is what MinZ lowers to".
  • Native PL/M-80 V4.0 vs MIR2 Z80 Backend: plm80c (Intel PL/M-80 V4.0, built from source) compared with our MIR2 backend: −46% code size (80B→43B), zero memory traffic in register-allocated loop body vs 6 loads/4 stores per iteration. Full side-by-side listing with T-state analysis.
  • Full Pipeline Walk-Through: PL/M → Nanz → HIR → MIR2 → Z80 — all intermediate stages with real output: --emit=nanz, --emit=hir, --emit=mir2-raw, --emit=mir2, .a80, and mzd disassembly. Three-path comparison (PLM direct / PLM→Nanz / native Nanz). HIR dump reveals a type-inference bug in the PL/M frontend fixed by the Nanz round-trip.
  • PL/M-80 E2E Pipeline: mz file.plm -o file.com. compileViaHIR function wired, binary verified via MZE emulator (fib(10)=55, abs_diff(10,3)=7, max3(5,12,7)=12). --emit=mir2, --emit=mir2-raw, --emit=hir flags added.
  • PL/M-80 Frontend: 26/26 corpus, 1338 functions → HIR — full PL/M-80 parser (100% Intel 80 Tools corpus), preprocessor with $INCLUDE + LITERALLY alias chains, 1338 functions / 11661 statements lowered to HIR. ADR: 0014. Pipeline: PL/M-80 → HIR → MIR2 → Z80 asm end-to-end wired.
  • PL/M-80 Parser: 26/26 corpus coverage — LITERALLY macro chains, $INCLUDE CP/M resolution, '' escaped quotes, binary literals, record field access. ADR-0014.
  • MIR2 Architecture & Progress (~22%) — topology-aware holdsPhys, SoA256 layout (H=field/L=index), PBQP domain map, progress bar. ADRs: 0011 0012. Roadmap.
  • MIR2 Codegen Quality Sprint — 42 tests, 9 verified Z80 functions (gcd, max3, popcount, min8 + prior), shadow register guard, DSE pass, AND/OR/XOR immediate peepholes. Real assembly quality: min8 = 5 instructions, max3 = 7 on hot path.
  • v0.20.1: Profiler + Emulator upgrades — 7-channel profiler (exec/read/write/stack push/pop/IO + memory snapshot), stderr port $25, DI+HALT exit with A register as process exit code. Stack depth tracking via SP-delta detection.
  • Honest Assessment — Code-Verified Status — Every claim verified by live test runs: 75% compile rate, 1 production backend, what actually works vs. what doesn't
  • MIR Backend Test Suite — 11 handcrafted .mir programs, full MIR→Z80→binary→emulate pipeline validation (9/11 pass)
  • VSCode: Edit, Compile & Run in One Click — Cmd+Alt+R compiles and runs MinZ in the terminal. 3 SMC codegen fixes, loop rerolling in action, 25% binary size savings. Try it: examples/cpm/playground.minz
  • MIR Language Compatibility Deep Dive — why PL/M scores 9/10 and Ada 4/10, what to fix, MIR vs SDCC/cc65/z88dk/QBE/ACK (comparison)
  • MIR Analysis: Multi-Language IR? — 118 opcodes, 24 types, 13+ optimizer passes. Can it compile PL/M and Ada? (architecture guide)
  • VSCode Tooling Sprint — LSP server, full syntax highlighting, SLD source maps, DeZog debugging, 10 compile commands (guide)
  • Register Allocator Overhaul — 7.8x iterator speedup (207T → 26T per element), full MinZ→asm pipeline walkthrough
  • Iterator Reality Check — honest status: 11/11 E2E correct, before/after the overhaul
  • Iterator Status — 11/11 E2E, 26T/element post-overhaul, operation matrix, known bugs
  • Project Status — v0.19 roadmap and priorities


Modern Programming Language for Vintage Hardware


Write modern code. Run it on Z80, eZ80, 6502, and more.

Quick Start | Features | Examples | Targets | Toolchain


What is MinZ?

MinZ is a programming language that compiles modern, readable code to efficient assembly for retro hardware — primarily Z80 and eZ80 systems. It includes a self-contained toolchain: compiler, assembler, emulator, and remote runner. No external dependencies.

import stdlib.cpm.bdos;

fun main() -> void {
    @print("Hello from MinZ!");
    let fib_a: u16 = 0;
    let fib_b: u16 = 1;
    for i in 0..10 {
        print_u16(fib_a);
        putchar(32);  // space
        let next = fib_a + fib_b;
        fib_a = fib_b;
        fib_b = next;
    }
}

Programs like this compile to Z80 assembly, assemble to a .com binary, and run on CP/M (shown here for fibonacci_cpm.minz):

$ mz fibonacci_cpm.minz -b z80 --target cpm -o fib.a80 && mza fib.a80 -o fib.com
$ mze fib.com -t cpm
Fibonacci:
0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55

Quick Start

Build from Source

git clone https://github.com/oisee/minz.git
cd minz/minzc
make all            # Build all 9 tools
make install-user   # Install to ~/.local/bin/

No external dependencies. Pure Go.

Compile and Run

# Compile MinZ to Z80 assembly
./mz ../examples/hello_print.minz -o hello.a80

# Assemble to binary
./mza hello.a80 -o hello.tap

# Run in emulator
./mze hello.tap

Multi-Target

mz program.minz -b z80 --target spectrum -o prog.a80   # ZX Spectrum
mz program.minz -b z80 --target cpm -o prog.a80        # CP/M
mz program.minz -b z80 --target agon -o prog.a80       # Agon Light 2
mz program.minz -b c -o prog.c                         # C99 (partial — simple programs only)
mz program.minz -b crystal -o prog.cr                  # Crystal (stub — not functional)

Features

Working

Feature | Description
Types | u8, u16, i8, i16, bool, void, pointers
Functions | fun/fn declaration, overloading, multiple returns
Control flow | if/else, while, for i in 0..n, loop {}
Structs | Declaration, field access, UFCS method syntax
Arrays | Declaration, indexing
Globals | global counter: u8 = 0;
String interpolation | "Hello #{name}!" (Ruby-style)
Inline assembly | asm { LD A, 42 } blocks, [addr] bracket indirection
CTIE | Compile-Time Interface Execution (trait monomorphization)
True SMC | Self-modifying code optimization
@extern FFI | extern fun putchar(c: u8) at 0x10; with RST optimization
Operator overloading | v1 + v2 via impl blocks
Error propagation | @error(code) with CY flag ABI
Enums | enum State { IDLE, RUNNING } with values
Module system | import stdlib.cpm.bdos;
Lambdas | Closure syntax, zero-cost transform
PL/M-80 frontend | Parse + HIR lowering for all 26 Intel 80 Tools corpus files (100%); 1338 functions, 11661 statements
Nanz frontend | New active source language for the MIR2 backend; arithmetic, control flow, loops, function calls
LUTGen | u8<lo..hi> ranged type annotation → compile-time table generation; popcount loop → 3-instruction LUT at runtime
Flag-return ABI | Functions returning bool from a comparison pass the result via carry flag, with no LD A, 0/1 materialization
Interprocedural CC opt | Register class chosen per call-site: params coerced to A/B/C/HL/DE based on callee contract
JRS pseudo-instruction | Codegen emits JRS for all branches; MZA picks JR (2B) or JP (3B) based on offset and condition
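
The JRS row above describes a size-selection rule that can be sketched in a few lines. This is an illustrative model only, not MZA's code: choose_branch and its parameters are hypothetical names, and it assumes the standard Z80 encoding in which JR's displacement is a signed byte measured from the address following the 2-byte instruction.

```python
# Hypothetical model of the JRS expansion rule; not MZA's actual code.
JR_CONDITIONS = {None, "NZ", "Z", "NC", "C"}  # None = unconditional branch


def choose_branch(cond, source_addr, target_addr):
    """Return (mnemonic, size_in_bytes) for a branch located at source_addr."""
    if cond not in JR_CONDITIONS:
        return ("JP", 3)  # PE/PO/P/M have no JR encoding
    # JR's displacement is relative to the instruction after the 2-byte JR.
    displacement = target_addr - (source_addr + 2)
    if -128 <= displacement <= 127:
        return ("JR", 2)
    return ("JP", 3)  # out of range: promote to absolute jump
```

In a real assembler the chosen size shifts every later address, which is why the rule has to be re-applied until the layout stops changing (the multi-pass convergence mentioned in the JRS entry near the top).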

Partial / In Progress

Feature | Status
Pattern matching | Syntax parses, codegen partial
Iterator chains | 9 ops on Z80 + inline lambda filters + fusion optimizer (inlines callbacks in DJNZ loops). 87+ tests, 11/11 E2E hex-verified, all pass. enumerate/reduce at MIR level (Z80 needs OpPush fix). See Status
MIR interpreter | Arrays/structs working, not complete

Known Limitations

  • Register allocator has bugs with overlapping lifetimes in complex loops
  • Some loop/arithmetic combinations produce incorrect code
  • loadToHL can use stale values in multi-expression contexts
  • Loop rerolling can be too aggressive across function call boundaries

These are documented and being worked on. Simple programs (hello world, fibonacci, demos) work correctly. Complex programs with nested loops and heavy arithmetic may hit edge cases.


Code Examples

Nanz: New Active Frontend for MIR2

Nanz is the primary language for the HIR→MIR2→Z80 pipeline. Real compiled output:

fun abs_diff(a: u8, b: u8) -> u8 {
    if a > b { return a - b }
    return b - a
}

fun clamp(x: u8, lo: u8, hi: u8) -> u8 {
    if x < lo { return lo }
    if x > hi { return hi }
    return x
}

Generated Z80 (actual mz output):

abs_diff:
    CP C
    JR Z, .abs_diff_if_join2
    JR C, .abs_diff_if_join2
.abs_diff_if_then1:
    SUB C
    LD C, A
    RET
.abs_diff_if_join2:
    NEG
    ADD A, C
    LD C, A
    RET

clamp:
    CP D                    ; x vs lo
    JR NC, .clamp_if_join2
.clamp_if_then1:
    LD A, D
    RET
.clamp_if_join2:
    CP C                    ; x vs hi
    JR Z, .clamp_if_join4
    JR C, .clamp_if_join4
.clamp_if_then3:
    LD A, C
    RET
.clamp_if_join4:
    RET
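
The highlights table lists a shorter four-instruction form of abs_diff (SUB C / RET NC / NEG / RET). The following sketch models that sequence in Python and checks it against abs() for every pair of 8-bit inputs. It assumes a arrives in A and b in C, as the CP C in the listing above suggests; abs_diff_z80 is a hypothetical name used only for this check.

```python
def abs_diff_z80(a, c):
    """Model SUB C / RET NC / NEG / RET with a in A and b in C (8-bit values)."""
    carry = a < c              # SUB C sets carry on unsigned borrow
    acc = (a - c) & 0xFF       # SUB C: A = A - C (mod 256)
    if not carry:              # RET NC: a >= b, so A already holds a - b
        return acc
    return (-acc) & 0xFF       # NEG: A = 0 - A, which equals b - a (mod 256)


# Exhaustive check over all 65,536 input pairs.
mismatches = [(a, b) for a in range(256) for b in range(256)
              if abs_diff_z80(a, b) != abs(a - b)]
print(len(mismatches))
```

The check works because the carry left behind by SUB is exactly the "a < b" condition, so no separate CP is needed before the conditional return.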

PBQP Allocator: Hot Registers in Cheap Slots

(examples/nanz/05_four_pointers.nanz · 06_pbqp_weighted.nanz · 07_ix_load_store.nanz)

The PBQP allocator weights each virtual register's cost by its use count. A register used 10× pays 10× the slot cost, so the solver puts it in the cheapest location — even when that means displacing a low-use register.

Four simultaneously-live pointer registers → HL / DE / BC / IX (no spill):

// examples/nanz/05_four_pointers.nanz
fun four_ptrs(p0: ptr, p1: ptr, p2: ptr, p3: ptr) -> u8 {
    var v0: u8 = p0[0]
    var v1: u8 = p1[0]
    var v2: u8 = p2[0]
    var v3: u8 = p3[0]    // p3 → IX under register pressure
    var s01: u8 = v0 + v1
    var s23: u8 = v2 + v3
    return s01 + s23
}

four_ptrs:
    LD C, (HL)      ; p0 → HL  (cost 0)
    LD D, (DE)      ; p1 → DE  (cost 4)
    LD E, (BC)      ; p2 → BC  (cost 6)
    LD H, (IX+0)    ; p3 → IX  (cost 8) ← (IX+0) not $F0xx memory!
    LD A, C
    ADD A, D
    LD C, A
    LD A, E
    ADD A, H
    ...
    RET

High-use vs low-use — PBQP always puts the hot reg in the cheap slot:

// examples/nanz/06_pbqp_weighted.nanz
fun weighted(x: u8) -> u8 {
    var light: u8 = 1          // used 1×  — displaced to C
    var heavy: u8 = x          // used 10× — stays in A (0T per use)
    heavy = heavy + x          // ... repeated 9 more times
    ...
    return heavy + light
}

weighted:
    LD C, 1          ; light → C  (1× use, forced out of A)
    ADD A, A         ; heavy stays in A throughout (10× use, 0T/use)
    LD D, A
    ADD A, D
    ...              ; 8 more iterations — all in A, zero memory traffic
    ADD A, C         ; final: heavy(A) + light(C)
    RET
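
The weighting idea can be illustrated with a deliberately simplified greedy stand-in. This is not the PBQP solver: it ignores interference edges and register classes, the slot costs are illustrative values (A = 0T and C = 6T follow the annotations above, D and E are assumed), and allocate is a hypothetical name.

```python
# Illustrative per-use slot costs in T-states; D and E are assumed values.
SLOT_COST = {"A": 0, "C": 6, "D": 6, "E": 6}


def allocate(use_counts):
    """Give each virtual register a distinct slot, largest delta first."""
    def delta(vreg):
        # Cost vector entry = use count x slot cost; delta = 2nd best - best.
        costs = sorted(use_counts[vreg] * c for c in SLOT_COST.values())
        return costs[1] - costs[0]

    free = dict(SLOT_COST)
    assignment = {}
    for vreg in sorted(use_counts, key=delta, reverse=True):
        slot = min(free, key=free.get)   # cheapest slot still available
        assignment[vreg] = slot
        del free[slot]
    return assignment


print(allocate({"light": 1, "heavy": 10}))
```

With these inputs heavy has a delta of 60 and light a delta of 6, so heavy claims A first and light falls to C, matching the register assignment in the weighted listing.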

IX store/load — undocumented HL→IX copy (16T vs 21T PUSH/POP):

// examples/nanz/07_ix_load_store.nanz
fun roundtrip_ix(hl_ptr: ptr, de_ptr: ptr, bc_ptr: ptr, val: u8) -> u8 {
    bc_ptr[0] = val           // bc_ptr overflows to IX under 4-reg pressure
    var a: u8 = hl_ptr[0]
    var b: u8 = de_ptr[0]
    var back: u8 = bc_ptr[0]
    return a + b + back
}

roundtrip_ix:
    LD IXH, H         ; undocumented DD 67 — copy HL→IX (16T, not PUSH/POP=21T)
    LD IXL, L         ; undocumented DD 6D
    LD (IX+0), C      ; store val through IX pointer
    LD C, (DE)
    LD D, (BC)
    LD E, (HL)
    ...
    RET

LUTGen: Compile-Time Lookup Tables

Annotate with u8<0..255> — the compiler evaluates the function for all 256 values and emits a page-aligned table:

fun popcount(x: u8<0..255>) -> u8 {
    var n: u8 = 0
    var v: u8 = x
    while v != 0 {
        n = n + (v & 1)
        v = v >> 1
    }
    return n
}

The loop above never runs at runtime. Generated Z80:

popcount:
    LD HL, popcount_lut
    LD L, C                 ; C = input (index into table)
    LD A, (HL)              ; table lookup — H unchanged = page base
    RET

    ALIGN 256
popcount_lut:
    DB 0, 1, 1, 2, 1, 2, 2, 3, ...   ; 256 bytes, evaluated at compile time
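
What "evaluates the function for all 256 values" amounts to can be sketched outside the compiler. The Python below is an assumption-laden illustration, not the LUTGen pass: build_lut and emit_db are hypothetical helpers, and the DB layout of 16 values per line is arbitrary.

```python
def popcount(x):
    """Reference popcount, mirroring the loop in the Nanz source above."""
    n = 0
    v = x
    while v != 0:
        n += v & 1
        v >>= 1
    return n


def build_lut(fn, lo=0, hi=255):
    """Evaluate fn once for every value in lo..hi (inclusive)."""
    return [fn(x) & 0xFF for x in range(lo, hi + 1)]


def emit_db(label, table, per_line=16):
    """Format the table as ALIGN/DB assembler text."""
    lines = ["    ALIGN 256", f"{label}:"]
    for i in range(0, len(table), per_line):
        row = ", ".join(str(b) for b in table[i:i + per_line])
        lines.append(f"    DB {row}")
    return "\n".join(lines)


lut = build_lut(popcount)
print(emit_db("popcount_lut", lut))
```

Page alignment is what makes the runtime lookup so short: with the table at an address whose low byte is zero, the input value can be loaded straight into L as the index.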

Structs and Methods (UFCS)

struct Vec2 { x: i16, y: i16 }

impl Vec2 {
    fun add(self, other: Vec2) -> Vec2 {
        return Vec2 { x: self.x + other.x, y: self.y + other.y };
    }
    fun length_sq(self) -> i16 {
        return self.x * self.x + self.y * self.y;
    }
}

fun main() -> void {
    let v1 = Vec2 { x: 3, y: 4 };
    let v2 = Vec2 { x: 1, y: 2 };
    let v3 = v1 + v2;            // Zero-cost: CALL Vec2_add
    let len = v3.length_sq();    // Zero-cost: CALL Vec2_length_sq
}

Compile-Time Execution (CTIE)

@ctie
fun fibonacci(n: u8) -> u8 {
    if n <= 1 { return n; }
    return fibonacci(n-1) + fibonacci(n-2);
}

let fib10 = fibonacci(10);  // Becomes: LD A, 55 (no runtime cost)

Inline Assembly

asm fun fast_clear_screen() {
    LD HL, $4000
    LD DE, $4001
    LD BC, 6143
    LD (HL), 0
    LDIR
}

CP/M Program

import stdlib.cpm.bdos;

fun main() -> void {
    @print("Hello, CP/M!");
    putchar(13);
    putchar(10);
    let ch = getchar();
    putchar(ch);
}

Agon Light 2 Program

import stdlib.agon.mos;
import stdlib.agon.vdp;

fun main() -> void {
    mos_puts("Hello from Agon Light 2!");
    set_mode(3);
    fill_rect(10, 10, 100, 80, 4);
}

Error Handling

enum FileError { None, NotFound, Permission }

fun read_file?(path: u8) -> u8 ? FileError {
    if path == 0 {
        @error(FileError.NotFound);
    }
    return path;
}

Self-Modifying Code (True SMC)

@abi("smc")
fun draw_pixel(x: u8, y: u8) -> void {
    // Parameters patched directly into instruction immediates
    // Single-byte opcode changes: 7-20 T-states vs 44+ for memory reads
    let screen_addr = y * 32 + x;
    // ...
}
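
The "@smc parameters" entry near the top describes a draw routine whose address is a baked LD HL, imm16 operand that a synthesised patcher rewrites. The sketch below models that mechanism on raw bytes. It is an illustration under stated assumptions, not compiler output: set_r0 and run are hypothetical names, the interpreter understands only the four opcodes involved, and the byte values use the standard Z80 encodings (21 = LD HL,nn; 36 = LD (HL),n; 23 = INC HL; C9 = RET).

```python
# Machine code for: LD HL,0 / LD (HL),195 / INC HL / LD (HL),60 / RET
DRAW_ROW = bytearray([0x21, 0x00, 0x00,   # LD HL, imm16 (baked address, initially 0)
                      0x36, 195,          # LD (HL), 195
                      0x23,               # INC HL
                      0x36, 60,           # LD (HL), 60
                      0xC9])              # RET


def set_r0(code, new_addr):
    """Patch the imm16 operand of the leading LD HL (stored little-endian)."""
    code[1] = new_addr & 0xFF
    code[2] = (new_addr >> 8) & 0xFF


def run(code, memory):
    """Interpret only the four opcodes used by DRAW_ROW."""
    pc, hl = 0, 0
    while True:
        op = code[pc]
        if op == 0x21:                      # LD HL, nn
            hl = code[pc + 1] | (code[pc + 2] << 8)
            pc += 3
        elif op == 0x36:                    # LD (HL), n
            memory[hl] = code[pc + 1]
            pc += 2
        elif op == 0x23:                    # INC HL
            hl = (hl + 1) & 0xFFFF
            pc += 1
        elif op == 0xC9:                    # RET
            return
        else:
            raise ValueError(f"unsupported opcode {op:#04x}")


memory = bytearray(0x10000)
set_r0(DRAW_ROW, 0x4000)   # 0x4000 is the start of ZX Spectrum screen memory
run(DRAW_ROW, memory)
print(memory[0x4000], memory[0x4001])
```

The four drawing instructions cost 10 + 10 + 6 + 10 = 36 T-states on a Z80, which lines up with the ~36T/row figure quoted for compiled sprites.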

Zero-Cost Iterator Chains & Lambda Fusion (In Development)

MinZ aims to bring functional-style iterator chains to Z80 — with zero runtime overhead. The compiler fuses chains like .map().filter().forEach() into a single tight loop, inlining all lambdas and using DJNZ where possible.

Target syntax:

// Functional iterator chain — compiles to ONE loop, zero allocations
scores.iter()
    .map(|x| x + 5)
    .filter(|x| x >= 90)
    .forEach(|x| print_u8(x));

// In-place mutation with ! variants
enemies.filter!(|e| e.health > 0);
particles.forEach!(|p| p.update());

// Generators (planned)
gen fibonacci() -> u16 {
    let a: u16 = 0;
    let b: u16 = 1;
    loop {
        yield a;
        let tmp = a + b;
        a = b;
        b = tmp;
    }
}

What the compiler produces — the entire chain fuses into ~25 T-states/element:

; scores.iter().map(|x| x + 5).filter(|x| x >= 90).forEach(|x| print_u8(x))
;
; No intermediate arrays. No function call overhead. Just one DJNZ loop.

    LD HL, scores            ; source pointer
    LD B, scores_len         ; counter in B for DJNZ
.loop:
    LD A, (HL)               ; load element         (7 T)
    ADD A, 5                 ; .map(|x| x + 5)      (7 T)
    CP 90                    ; .filter(|x| x >= 90) (7 T)
    JR C, .skip              ; skip if < 90
    CALL print_u8            ; .forEach(...)
.skip:
    INC HL                   ; next element          (6 T)
    DJNZ .loop               ; dec B, loop          (13 T)

Compare: a naive indexed loop with separate map/filter passes would cost 60-150+ T-states/element and allocate intermediate arrays. The fused version uses O(1) memory and runs 3-5x faster.
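
The difference between the staged and fused forms can be sketched in Python. This models the semantics only, with 8-bit wraparound on the map step; staged and fused are hypothetical names, and appending to a list stands in for the print_u8 call.

```python
def staged(scores):
    """Unfused: each stage materialises an intermediate list."""
    mapped = [(x + 5) & 0xFF for x in scores]   # .map(|x| x + 5)
    kept = [x for x in mapped if x >= 90]       # .filter(|x| x >= 90)
    return kept                                 # .forEach would visit these


def fused(scores):
    """Fused: one pass, one accumulator, no intermediate lists."""
    out = []
    for x in scores:            # LD A,(HL) ... INC HL / DJNZ
        a = (x + 5) & 0xFF      # ADD A,5 with 8-bit wraparound
        if a < 90:              # CP 90 / JR C,.skip
            continue
        out.append(a)           # stands in for CALL print_u8
    return out


scores = [80, 85, 90, 250, 255]
print(staged(scores) == fused(scores))
```

Both functions visit the same elements in the same order; the fused one simply never stores the mapped or filtered sequences, which is where the O(1) memory claim comes from.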

Key optimizations:

  • Lambda inlining — closures compile to direct CALL or inline code, never heap-allocated
  • Iterator fusion — multi-stage chains merge into a single loop at compile time
  • DJNZ loops — arrays ≤255 elements use Z80's dedicated loop instruction (13 T-states vs 25+ for compare-jump)
  • Pointer arithmetic: HL walks the array with INC HL, no index multiplication

Testing (v0.19.5): 87+ tests across 7 layers — every stage of the pipeline has dedicated coverage:

Layer | Tests | Status
E2E shell (hex-verified output) | 11 | all pass
Corpus (full compile to Z80) | 18 | all pass
Fusion optimizer (callback inlining) | 7 | all pass
MIR VM (DJNZ execution) | 8 | all pass
Codegen (Z80 patterns) | 7 | all pass
Semantic (IR generation) | 20 | all pass
Parser (chain conversion) | 18 | all pass

9 operations fully working on Z80: forEach, map, filter, take, skip, peek, inspect, takeWhile, and inline lambda filters (filter(|x| x > N) compiles to CP N+1 + JR C — no function call, ~27 T-states saved per iteration). Fusion optimizer inlines small callbacks directly into DJNZ loop bodies, eliminating CALL/RET overhead and enabling bare DJNZ instruction. enumerate and reduce work at MIR level, Z80 blocked by OpPush routing. See Iterator Implementation Status for details.
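
The CP N+1 + JR C lowering of filter(|x| x > N) relies on the identity x > N ⇔ x ≥ N+1 for unsigned bytes. A quick exhaustive check, written as a sketch with a hypothetical helper name, confirms it for every N that leaves room for the +1:

```python
def filter_gt_via_cp(x, n):
    """Model CP n+1 / JR C,skip: the element is kept when carry is clear."""
    assert 0 <= n < 255, "n = 255 would overflow the 8-bit immediate"
    carry = x < (n + 1)      # CP sets carry when A < operand (unsigned)
    return not carry


ok = all(filter_gt_via_cp(x, n) == (x > n)
         for n in range(255) for x in range(256))
print(ok)
```

The excluded case N = 255 needs no comparison at all, since no 8-bit value is greater than 255.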



Platform Targets

Z80 Targets (Primary)

Target | Status | Binary | Notes
ZX Spectrum | Working | .tap | Main development target, tested via mze + ZXSpeculator
CP/M | Working | .com | BDOS stdlib, tested via mze with CP/M mode
Agon Light 2 | Working | .bin | eZ80/ADL mode, MOS + VDP stdlib, structural testing only
MSX | Compiles | varies | Target config exists, limited testing

Backends

Backend | Status | Notes
Z80 | ✅ Production | Full-featured, optimized, 5500+ lines, MIR2 active target
QBE (native) | ✅ Working | MIR2→QBE IL→arm64/x86_64. Correctness oracle: 4/4 E2E tests. brew install qbe
C99 | ⚠️ Partial | Produced real binaries; variable redeclaration bug in scoped locals
M68k | 🧪 Untested | Most complete non-Z80 (28 opcodes, real register allocator); never assembled
i8080 | 🧪 Untested | Structurally correct (all-memory approach); never assembled
6502 | ❌ Broken | Arithmetic uses $00 placeholder; never assembled
LLVM | ❌ Broken | JumpIf fallthrough hardcoded, type errors; llc fails
WASM | ❌ Broken | Label/jump emit as comments; WAT validation fails
Crystal | ❌ Stub | Control flow emits comments, function args always empty
Game Boy | ❌ Stub | Add, Sub, LoadVar, StoreVar all emit only comments

Only Z80 is production-quality. QBE is new (2026-03-09): pkg/mir2qbe translates MIR2 directly to QBE IL, which compiles to native arm64/x86_64 via qbe + cc. It is used as a correctness oracle: the same MIR2 module runs on the Z80 emulator and as a native binary, and agreement between the two is evidence that the Z80 codegen is correct. See Report #045.

Language Frontends

Three source languages are supported. Nanz and PL/M-80 share the HIR → MIR2 → Z80 backend; MinZ still compiles through the older MIR1 path:

Frontend | Status | Pipeline | Notes
Nanz | Active (primary MIR2 frontend) | Nanz → HIR → MIR2 → Z80 | The new surface language; arithmetic, control flow, loops, LUTGen, flag-return ABI, interprocedural CC opt
PL/M-80 | Working | PL/M-80 → HIR → MIR2 → Z80 | 26/26 Intel 80 Tools corpus (100%); 1338 functions, 11661 statements
MinZ | Frozen on MIR1 | MinZ → MIR1 → old Z80 codegen | Not being developed; will eventually route through HIR→MIR2 once the Participle parser → HIR wiring is done

Three pipelines, one backend. .nanz and .plm files go through compileViaHIR() → HIR → MIR2 → Z80. .minz files use the old MIR1 path (pkg/codegen/z80.go, 5,800 LOC). MIR1 is frozen; all new work goes into MIR2.

Nanz is a minimal, type-safe language designed as a clean target for the MIR2 backend. Working features: arithmetic/bitwise ops, if/else, while, for-range, function calls, u8/u16/i8/i16/bool types, u8<lo..hi> ranged types for LUT generation, interprocedural calling-convention optimization, flag-return ABI (comparison results passed via carry flag — no bool materialization).

PL/M-80 coverage (Intel 80 Tools corpus): algolm compiler, BASIC-E compiler/parser/synthesizer, ML80 assembler (l81/l82/l83/m81), TeX, CP/M utilities, Kermit — 1338 functions / 943 globals / 11661 statements lowered to HIR from 26 source files. Handles LITERALLY macro chains, $INCLUDE with CP/M device designators, binary literals, record field access, EXTERNAL procedures, all PL/M-80 statement forms. See ADR-0014.

Pipeline emit flags (works with .plm and .nanz input):

mz program.plm --emit=nanz       # Transpile to Nanz surface syntax (round-trip)
mz program.plm --emit=hir        # HIR typed-tree dump (types on every node)
mz program.plm --emit=mir2-raw   # MIR2 before optimisation (DSE/ReorderBlocks)
mz program.plm --emit=mir2       # MIR2 after optimisation passes
mz program.plm                   # .a80 assembly (default)
mz program.plm -o prog.com -t cpm  # Assemble to CP/M binary

The Nanz transpiler is lossless: mz prog.plm --emit=nanz | mz --stdin produces byte-identical assembly to compiling .plm directly.


Toolchain — End-to-End Development Ecosystem

MinZ provides a complete, self-contained development ecosystem. Every tool you need — from source code to running program to screenshot — is a single Go binary with zero external dependencies. No fragile toolchain of third-party assemblers, separate emulators, or external debuggers. One make builds everything.

Source Code                          Running Program
    |                                      |
    v                                      v
  [mz] compile ──> [mza] assemble ──> [mze] run (CP/M, headless)
    |                                  [mzx] run (ZX Spectrum, graphical)
    |                                  [mzrun] run (remote, DZRP)
    |                                      |
    v                                      v
  [mzd] disassemble <──────────────── [mzx --screenshot] capture

Tool | Purpose | Usage
mz | MinZ compiler | mz program.minz -o program.a80
mza | Z80 assembler (table-driven, all Z80 ops including undocumented, [addr] bracket syntax) | mza program.a80 -o program.com
mze | Z80 emulator (1335/1335 FUSE tests, profiler, console I/O, stderr port) | mze program.com -t cpm --console-io
mzx | ZX Spectrum emulator (T-state accurate, AY, profiler, .sna/.tap/.trd/.scl, console I/O) | mzx --snapshot game.sna
mzd | Z80 disassembler (IDA-like analysis, xrefs, ROM tables) | mzd program.bin --org 0x8000
mzrun | Remote runner (DZRP protocol) | mzrun program.minz --reset
mzv | MIR VM runner (breakpoints, tracing, PNG export) | mzv program.mir
mzr | Interactive REPL | ❌ Broken; compilation pipeline not wired
mzlsp | LSP server (diagnostics, hover, goto-def, completion) | auto-started by VSCode extension

MZX — ZX Spectrum Emulator

T-state accurate emulation with real display output. Supports 48K and Pentagon 128K models.

# Interactive emulation
mzx --snapshot game.sna
mzx --tap game.tap
mzx --model pentagon --rom 128-0.rom --rom1 trdos.rom --trd game.trd

# Load raw binary and run (no ROM needed)
mzx --load code.bin@8000 --set PC=8000,SP=FFFF,DI
mzx --run code.bin@8000   # shortcut for --load + --set PC + SP + DI

# Bare-metal console I/O (no ROM needed)
mzx --run code.bin@8000 --frames DI:HALT --console-io
# OUT ($23),A → stdout | IN A,($23) → stdin | OUT ($25),A → stderr
# DI + HALT → exit with A register as process exit code

# Console I/O with custom port or AY serial
mzx --run code.bin@8000 --frames DI:HALT --console-to-port '$FF'
mzx --run code.bin@8000 --frames DI:HALT --console-to-port ay

# BASIC console (RST $10, needs ROM)
mzx --snapshot game.sna --console

# Headless screenshots (for CI, automated testing, book illustrations)
mzx --snapshot game.sna --screenshot shot.png --frames 100
mzx --tap game.tap --screenshot shot.png --screenshot-on-stable 3

# Execution profiling (7-channel heatmap + memory snapshot)
mzx --snapshot demo.sna --profile heatmap.json --frames 500
# Profile includes: exec, read, write, stack_push, stack_pop, io, mem_snapshot
mzx --snapshot demo.sna --trace trace.jsonl --trace-frames 100:200

# Debugging
mzx --warn-on-halt --verbose --diag --snapshot game.sna

Features: FrameMap ULA rendering, beeper + AY-3-8912 audio (AYumi), ULA contention, .sna/.tap/.trd/.scl format support, full TR-DOS function dispatch, 7-channel execution profiler (exec/read/write/stack push/pop/IO + memory snapshot), basic-block tracer, conditional screenshots, T-state snapshots, DI+HALT exit with A as exit code, bare-metal console I/O (port $23 stdout, $25 stderr, or AY serial), 48K ROM included.
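
For CI use, the bare-metal console mode can be wrapped in a small driver script. The following is a minimal Python sketch, assuming `mzx` is on `PATH` and that load addresses are written as bare hex, as in the `code.bin@8000` examples above. The helper names `mzx_bare_metal_args` and `run_bare_metal` are hypothetical and not part of the toolchain; only flags shown in this README are used.

```python
import subprocess


def mzx_bare_metal_args(binary, org=0x8000, mzx="mzx"):
    """Build the argv for a headless bare-metal run with console I/O.

    Mirrors the README example:
        mzx --run code.bin@8000 --frames DI:HALT --console-io
    The address is formatted as bare hex (assumption based on that example).
    """
    return [mzx, "--run", f"{binary}@{org:04X}", "--frames", "DI:HALT", "--console-io"]


def run_bare_metal(binary, org=0x8000, stdin_data=b"", timeout=30):
    """Run the binary under mzx and return (exit_code, stdout, stderr).

    Per the README, DI + HALT ends the run and the A register becomes
    the process exit code, so exit_code can carry a test result.
    """
    proc = subprocess.run(
        mzx_bare_metal_args(binary, org),
        input=stdin_data,       # fed to the program through the stdin port
        capture_output=True,    # collect the stdout and stderr port output
        timeout=timeout,
    )
    return proc.returncode, proc.stdout, proc.stderr
```

A test harness could then call `run_bare_metal("code.bin")` and assert on the returned exit code and captured output.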

Live Testing with DZRP

For ZX Spectrum development, mzrun compiles, assembles, and uploads to a running emulator in one command:

# Start ZXSpeculator with DZRP enabled, then:
export DZRP_HOST=localhost DZRP_PORT=11000
mzrun game.minz --reset -v

Debug Flags

mz program.minz --dump-mir       # Show MIR intermediate representation
mz program.minz --dump-ast       # AST in JSON format
mz program.minz --viz out.dot    # MIR visualization (Graphviz)
mz program.minz -d               # Verbose compilation details
mz program.minz --compile-trace  # Structured log of all optimization decisions

Standard Library

Stdlib modules are organized by domain. Quality varies — some modules are well-tested, others are experimental.

Tested and Working

| Module | Description |
|--------|-------------|
| cpm/bdos | CP/M BDOS calls: putchar, getchar, print_string, file I/O |
| agon/mos | Agon MOS API: mos_putchar, mos_puts, file I/O (eZ80 ADL mode) |
| agon/vdp | Agon VDP graphics: modes, shapes, sprites, buffer commands |
| text/format | Number formatting: u8_to_str, u16_to_hex |
| mem/copy | Fast memory ops: memcpy, memset (LDIR-based) |

Available but Less Tested

| Module | Description |
|--------|-------------|
| math/fast | Sin/cos/sqrt lookup tables (256 entries) |
| math/random | LFSR PRNG, noise functions |
| graphics/screen | Pixel/line/circle drawing (ZX Spectrum) |
| input/keyboard | Keyboard matrix, debouncing |
| text/string | strlen, strcmp, strcpy, strcat |
| sound/beep | Beeper SFX |
| time/delay | Frame timing, delays |

Experimental

| Module | Description |
|--------|-------------|
| glsl/* | GLSL-style shader library: fixed-point math, raymarching, SDFs |

Optimization Pipeline

MinZ applies optimizations at multiple levels:

  1. CTIE — Pure functions with constant args execute at compile time
  2. MIR optimizer — Constant folding, strength reduction, dead code elimination
  3. True SMC — Self-modifying code patches parameters into instruction immediates
  4. Loop rerolling — Detects repeated call sequences, collapses to loops
  5. Peephole optimizer — 35+ Z80-specific assembly patterns

Example: fibonacci(10) with CTIE generates LD A, 55 — zero runtime cost.
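
The idea behind the CTIE step can be illustrated with a toy model. This Python sketch is not the MinZ implementation; the registry `PURE_FUNCTIONS` and the helper `lower_call` are hypothetical names. It shows only the principle: when a function is known to be pure and all arguments are constants, the call is evaluated during compilation and replaced by a single immediate load.

```python
def fibonacci(n):
    """Iterative Fibonacci with fibonacci(0) == 0 and fibonacci(1) == 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


# Hypothetical registry of functions the compiler has proven pure.
PURE_FUNCTIONS = {"fibonacci": fibonacci}


def lower_call(name, args):
    """Lower a call whose u8 result lands in register A.

    args holds ints for compile-time constants and strings for values
    that are only known at runtime (e.g. a register name).
    """
    fn = PURE_FUNCTIONS.get(name)
    if fn is not None and all(isinstance(a, int) for a in args):
        value = fn(*args) & 0xFF          # evaluate now, keep the low 8 bits
        return [f"LD A, {value}"]         # no call survives into the output
    return [f"CALL {name}"]               # runtime call (argument setup omitted)


print(lower_call("fibonacci", [10]))      # constant argument: folded
print(lower_call("fibonacci", ["C"]))     # runtime argument: real call
```

With a constant argument the call collapses to `LD A, 55`, matching the example above; with a runtime argument the sketch falls back to an ordinary call.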


Project Structure

minz/
  minzc/             Compiler & toolchain (Go, ~90K LOC)
    cmd/               CLI tools
      minzc/             mz — MinZ compiler
      mza/               mza — Z80 assembler
      mze/               mze — Z80 emulator (headless)
      mzx/               mzx — ZX Spectrum emulator (graphical)
      mzd/               mzd — Z80 disassembler
      mzrun/             mzrun — DZRP remote runner
      mzr/               mzr — REPL
    pkg/               Core packages
      parser/            Participle-based parser
      semantic/          Type checking, analysis (~11K lines)
      ir/                Intermediate representation
      codegen/           Z80 (production), C (partial), + 8 experimental backends
      optimizer/         MIR + peephole optimizers
      z80asm/            Z80 assembler engine (table-driven)
      spectrum/          ZX Spectrum emulation (ULA, AY, memory, ports)
      emulator/          Z80 CPU emulation (remogatto/z80, FUSE-tested)
      disasm/            Disassembler with IDA-like analysis
  stdlib/            Standard library (.minz)
    agon/              Agon Light 2 (MOS, VDP)
    cpm/               CP/M (BDOS)
    graphics/          Screen drawing
    math/              Fast math, PRNG
    text/              String, formatting
    ...
  examples/          270+ example programs
  docs/              Technical documentation
  reports/           Progress reports (date-numbered)

Current Status (March 2026)

MinZ is under active development. The Z80 backend is mature and produces working binaries for ZX Spectrum, CP/M, and Agon Light 2. A new compiler backend — MIR2 — is now the active development target, fed by the Nanz and PL/M-80 frontends via a typed HIR layer.

What works well:

  • MIR2 pipeline: Nanz/PL/M-80 → HIR → MIR2 → Z80, fully wired end-to-end
  • LUTGen: u8<lo..hi> ranged types → compile-time table generation, verified via emulator
  • Flag-return ABI + interprocedural CC optimization: comparison results travel via flags, no bool materialization
  • JRS pseudo-instruction: codegen emits JRS, MZA picks JR vs JP based on offset — saves 1B per short branch
  • Complete self-contained toolchain: compile → assemble → emulate → screenshot
  • T-state accurate ZX Spectrum emulation with display, audio, tape/disk support
  • Execution profiler with memory/IO heatmaps and basic-block trace export
  • Multi-target compilation (same source for Spectrum, CP/M, Agon)
  • Compile-time execution (CTIE) for constant expressions
  • Z80 CPU emulation verified against FUSE test suite (gold standard)
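
The JR-versus-JP choice behind the JRS pseudo-instruction follows from the Z80 encodings: `JR e` is 2 bytes with a signed 8-bit displacement measured from the address after the instruction, while `JP nn` is 3 bytes with an absolute address. The Python sketch below shows that selection rule for an unconditional jump. The function name `encode_jrs` is hypothetical, and how MZA resolves labels whose addresses shift as instruction sizes change is not described here, so that part is left out.

```python
JR_SIZE = 2  # JR e  : opcode 0x18 + signed 8-bit displacement


def encode_jrs(instr_addr, target_addr):
    """Encode an unconditional jump, preferring the 2-byte relative form."""
    # The displacement is relative to the address following the JR instruction.
    disp = target_addr - (instr_addr + JR_SIZE)
    if -128 <= disp <= 127:
        return bytes([0x18, disp & 0xFF])                 # JR e
    # Out of range: fall back to JP nn (opcode 0xC3, little-endian address).
    return bytes([0xC3, target_addr & 0xFF, (target_addr >> 8) & 0xFF])


print(encode_jrs(0x8000, 0x8010).hex())  # short forward branch
print(encode_jrs(0x8000, 0x9000).hex())  # far branch
```

The short branch encodes as `180e` and the far branch as `c30090`, which is where the one-byte saving per short branch comes from.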

What needs work:

  • MIR2: pointer-indexed array access (ptr[i] in loops) — broken due to HL conflict between base pointer and index arithmetic; use ForEachStmt (sequential scan) instead
  • MIR2: LUTs with a non-zero lower bound (e.g. u8<10..20>): contract optimization changes the parameter class after LUTGen has built the body; unit tests pass, but the full pipeline is broken
  • MinZ (.minz): register allocator stale HL tracking in loops — blocks complex programs (ADR-0006)
  • MinZ: 9/11 advanced feature tests fail
  • Non-Z80 backends: only the C backend produces working binaries; the rest are stubs or broken
  • MZR REPL: broken — compilation pipeline not wired through semantic analysis
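
To make the ranged-type LUT idea concrete, here is a generic Python sketch of table generation for an input range `lo..hi`. It is not LUTGen itself: the helper names are hypothetical, the range is assumed to be inclusive, and the `x - lo` index adjustment is simply the textbook way to handle a non-zero lower bound, not a statement about how MinZ does or should do it.

```python
def popcount8(x):
    """Number of set bits in an 8-bit value."""
    return bin(x & 0xFF).count("1")


def build_lut(fn, lo, hi):
    """Precompute fn for every input in lo..hi (assumed inclusive)."""
    return bytes(fn(x) & 0xFF for x in range(lo, hi + 1))


def lookup(table, lo, x):
    """Read the precomputed result; the index is rebased by the lower bound."""
    return table[x - lo]


full = build_lut(popcount8, 0, 255)     # 256-byte table, index == input
narrow = build_lut(popcount8, 10, 20)   # 11-byte table, index == input - 10

print(len(full), lookup(full, 0, 0xC3))
print(len(narrow), lookup(narrow, 10, 15))
```

When `lo` is 0 the rebasing step disappears and the lookup is a direct index, which is the simple case; a non-zero `lo` needs the extra subtraction (or an equivalently offset table base).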

Metrics (verified 2026-03-09):

  • 71/73 core examples compile (97%), 131/173 all examples (76%)
  • ~125K lines of Go in the compiler + toolchain
  • MIR2: 53/53 unit tests pass; E2E fib(1..10), clamp, abs_diff, max3 verified via MZE
  • MIR2→QBE: 4/4 E2E tests — PL/M + Nanz → native arm64 binary via QBE (correctness oracle)
  • PL/M-80: 26/26 Intel 80 Tools corpus files parse + compile → Z80 (100%)
  • 1335/1335 FUSE Z80 tests pass — gold-standard CPU verification including all undocumented opcodes
  • 87+ iterator tests across 7 layers — 11/11 E2E hex-verified (MinZ/MIR1 pipeline)
  • 24/24 Go test packages pass, 0 fail
  • 9 working toolchain binaries (mzr REPL is broken), all pure Go, zero external dependencies

Development

See docs/GenPlan.md for the development roadmap and current priorities.


Contributing

# Build all tools
cd minzc
make all

# Run all tests (emulator, assembler, spectrum, parser, etc.)
make test-all

# Test an example end-to-end
./mz ../examples/hello_print.minz -o /tmp/hello.a80
./mza /tmp/hello.a80 -o /tmp/hello.tap
./mze /tmp/hello.tap

# Screenshot an example
./mzx --rom roms/48.rom --snapshot demo.sna --screenshot shot.png --frames 50

Report issues at github.com/oisee/minz/issues.


License

MIT. See LICENSE for details.


MinZ: Modern syntax for vintage hardware.

About

Minz /mɪnts/ - Systems programming for Z80. Features TRUE SMC lambdas, revolutionary ABI for seamless ASM integration, Lua metaprogramming. TSMC delivers 14.4% fewer instructions vs C. Optimized Z80 assembly for retro/embedded.
