uip	0133
title	Urbit Testing Procedures
description	Establishes procedures for testing across Arvo and Vere
author	~hanfel-dovned, ~mopfel-winrux
status	Last Call
type	Process
created	2025-04-01

Abstract

This UIP proposes the adoption of standardized testing practices for the Arvo kernel. It defines clear expectations for unit, integration, and regression testing within Arvo, and establishes the groundwork for expanded guidance across Vere and emerging testing methodologies.

The aim is to improve software correctness, developer onboarding, and long-term maintainability of the Urbit stack by institutionalizing a culture of structured, thoughtful testing.

Motivation

Historically, Urbit's development emphasized correctness and conceptual elegance, yet testing infrastructure and discipline remained ad hoc and unevenly applied. This fragmentation leads to:

Undocumented expectations for contributors
Difficulties catching regressions across Kelvin decrements
Limited visibility into test coverage via CI

Unifying standards and aligning expectations across the ecosystem addresses each of these problems. This UIP captures and formalizes the best emerging practices from Arvo and Vere development and lays a foundation for continued iteration.

Status

Discussions will continue to finalize this document.

Specification

This document formalizes the following testing standards for all code in the %base desk.

Arvo Standards

Arvo's functional architecture makes it well-suited to precise unit testing and isolated module validation. Cores expose well-defined interfaces, allowing developers to reason about behavior and test correctness without relying on dynamic analysis or anomaly detection. This design emphasizes thorough unit and integration tests as the primary means of ensuring robustness.

Unit Testing

Unit testing ensures the correct functionality of isolated code modules. Arvo's structure encourages small, focused tests that validate each core's behavior in isolation.

Hoon Language & Libraries

All Hoon language features and standard library functions—including hoon.hoon, lull.hoon, zuse.hoon, and arvo.hoon, as well as all files in /lib—MUST have unit tests.

Tests should use the -test thread, typically in conjunction with the associated /lib/test library, optionally supplemented with the %quiz property testing library. Test files should reside in /tests and follow the naming convention:

urbit/tests/[base-desk-file-path].hoon

For partial tests of individual arms, append the arm name to the path.

Example:
This test verifies addition behavior, including identity and basic arithmetic. It should be located at tests/sys/hoon/math/add.hoon.

/+  *test
|%
::
::  Test addition (+)
::
++  test-add
  ;:  weld
    ::  Checks standard addition
    ::
    %+  expect-eq
      !>  2
      !>  (add 1 1)
    ::  Checks identity property (0 + n = n)
    ::
    %+  expect-eq
      !>  5
      !>  (add 0 5)
  ==
--
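
Failure paths can be covered in the same style as the addition example. The following is a minimal sketch rather than a prescribed test. It assumes the expect-fail arm of /lib/test, which takes a trap and passes only if running it crashes, and a hypothetical file location tests/sys/hoon/math/sub.hoon chosen to follow the naming convention above. The Dojo command in the header comment is the assumed way to run this one file with the -test thread.

```hoon
::  tests/sys/hoon/math/sub.hoon  (hypothetical path, for illustration)
::
::  Assumed Dojo invocation for this file:
::    -test %/tests/sys/hoon/math/sub ~
::
/+  *test
|%
::
::  Test subtraction (sub)
::
++  test-sub
  ::  Checks standard subtraction
  ::
  %+  expect-eq
    !>  3
    !>  (sub 5 2)
::
::  Test that subtraction crashes on underflow, since atoms are unsigned
::
++  test-sub-underflow
  %-  expect-fail
    |.
    (sub 2 5)
--
```

Keeping the success and crash cases in separate arms means a failure report names the exact behavior that broke.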

Agents

Agent tests should use the test-agent library for structured verification of state and cards.

Example:
This test builds the %time agent, initializes it, pokes it, and checks for the expected %wait card. It should be located at tests/app/time.hoon.

/+  *test-agent
/=  time-agent  /app/time
|%
++  test-poke
  %-  eval-mare
  =/  m  (mare ,~)
  ^-  form:m
  ;<  *  bind:m  (do-init %time time-agent)
  ;<  caz=(list card)  bind:m  (do-poke noun+!>(~))
  ;<  =bowl  bind:m  get-bowl
  %+  ex-cards
    caz
  :~  (ex-arvo /(scot %da now.bowl) %b %wait `@da`+(now.bowl))
  ==
--

Vanes

Each vane should have unit tests covering its API. The style used in current Eyre tests is preferred and should be adopted across vanes when possible. A universal test-vane library may be impractical, but per-vane test libraries are encouraged to maintain consistency and reduce duplication.

Marks

Mark files require unit tests for both grab and grow arms, ensuring correct serialization and deserialization behavior.

Example:
This test validates both JSON and noun conversions for the loob mark. It should be located at tests/mar/loob.hoon.

/+  *test
/=  loob-mar  /mar/loob
|%
++  test-grow-json
  %+  expect-eq
    !>  [%b %.y]
    !>  json:~(grow loob-mar %.y)
++  test-grow-noun
  %+  expect-eq
    !>  %.y
    !>  noun:~(grow loob-mar %.y)
++  test-grab-noun
  %+  expect-eq
    !>  %.y
    !>  (noun:~(grab loob-mar *?) 0)
--
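
Serialization and deserialization can also be checked together with a round-trip test. The sketch below is illustrative only and uses the same noun arms of the loob mark as the example above. It assumes that grabbing from a noun accepts the value produced by growing to a noun, and the arm name test-noun-round-trip is hypothetical.

```hoon
/+  *test
/=  loob-mar  /mar/loob
|%
::
::  Growing a loob to a noun and grabbing it back should yield the original
::
++  test-noun-round-trip
  ;:  weld
    ::  Round-trips %.y
    ::
    %+  expect-eq
      !>  %.y
      !>  (noun:~(grab loob-mar *?) noun:~(grow loob-mar %.y))
    ::  Round-trips %.n
    ::
    %+  expect-eq
      !>  %.n
      !>  (noun:~(grab loob-mar *?) noun:~(grow loob-mar %.n))
  ==
--
```

A round-trip test catches mismatches between the grab and grow arms that separate one-directional tests can miss.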

Regression Testing

Regression tests confirm that known bugs remain fixed and do not recur. Each test should reproduce the original conditions of a bug and verify that the issue has been resolved.

Regression tests MUST accompany bug fixes. They should be integrated into the relevant test suite, or placed in /tests/bug/ if standalone.

Example:
GitHub issue #6095 involves a crash in the Hoon parser due to rune ordering. The pull request that fixes this bug should include the following test, located at /tests/bug/gh-6095.hoon.

/+  *test
|%
::
::  Test that a core with a luslus prior to all lusbars successfully compiles
::
++  test-chapter
  %-  expect-success
    |.
    %-  ream
    '|%  ++  foo  ~  +|  %bar  ++  baz  ~  --'
--

Integration Testing

Integration testing verifies behavior across code boundaries, explicitly identifying module interaction surfaces and ensuring reliable inter-component communication. However, the combinatorial explosion of emergent behaviors across these boundaries makes integration test coverage difficult to measure, so heuristics are needed to target the cases most likely to expose issues. For example, establishing many vanes' basic functionality inherently involves integration testing due to interactions with Vere I/O. Generators and threads, while viable for unit testing in theory, interact with vane state or even external endpoints so often that they fall firmly within integration testing territory.

For this reason, it is difficult to establish a clear organizational structure for integration testing akin to Arvo's unit testing methodology. Instead, Urbit development should build discipline around pairing new core features with userspace implementations that exercise those features in real-world scenarios.

Example:
This test validates the interaction between a %wait task sent to Behn and the expected %wake response after a delay.

/-  spider
/+  *strandio
=,  strand=strand:spider
^-  thread:spider
|=  arg=vase
=/  m  (strand ,vase)
^-  form:m
=/  delay=@dr  (need !<((unit @dr) arg))
;<  t1=@da  bind:m  get-time
=/  =task:behn  [%wait (add delay t1)]
=/  =card:agent:gall  [%pass /timer %arvo %b task]
;<  ~  bind:m  (send-raw-card card)
;<  res=(pair wire sign-arvo)  bind:m  take-sign-arvo
?>  ?=([%timer ~] p.res)
?>  ?=([%behn %wake *] q.res)
%-  (slog ~[leaf+"Gift: {<+.q.res>}"])
?~  error.q.res
  ;<  t2=@da  bind:m  get-time
  %-  (slog ~[leaf+"Time elapsed: {<`@dr`(sub t2 t1)>}"])
  (pure:m !>(~))
%-  (slog u.error.q.res)
(pure:m !>(~))

Backwards Compatibility

This UIP does not alter the behavior of existing code but RECOMMENDS a consistent review and enforcement policy going forward. Contributors MAY need to add tests retroactively when modifying untested components.

Security Considerations

Adopting these standards improves the security posture of the Urbit codebase by:

Catching regressions that may reintroduce known vulnerabilities
Identifying logic bugs that only manifest during execution
Ensuring correctness of serialization, cryptographic operations, and system boundaries
Providing a foundation for automated verification in CI
