| uip | 0133 |
|---|---|
| title | Urbit Testing Procedures |
| description | Establishes procedures for testing across Arvo and Vere |
| author | ~hanfel-dovned, ~mopfel-winrux |
| status | Last Call |
| type | Process |
| created | 2025-04-01 |
This UIP proposes the adoption of standardized testing practices for the Arvo kernel. It defines clear expectations for unit, integration, and regression testing within Arvo, and establishes the groundwork for expanded guidance across Vere and emerging testing methodologies.
The aim is to improve software correctness, developer onboarding, and long-term maintainability of the Urbit stack by institutionalizing a culture of structured, thoughtful testing.
Historically, Urbit's development emphasized correctness and conceptual elegance, yet testing infrastructure and discipline remained ad hoc and unevenly applied. This fragmentation leads to:
- Undocumented expectations for contributors
- Difficulties catching regressions across Kelvin decrements
- Limited visibility into test coverage via CI
Unifying standards and aligning expectations across the ecosystem addresses each of these problems. This UIP captures and formalizes the best emerging practices from both Arvo and Vere development and lays a foundation for continued iteration.
Discussions will continue to finalize this document.
This document formalizes the following testing standards for all code in the %base desk.
Arvo's functional architecture makes it well-suited to precise unit testing and isolated module validation. Cores expose well-defined interfaces, allowing developers to reason about behavior and test correctness without relying on dynamic analysis or anomaly detection. This design emphasizes thorough unit and integration tests as the primary means of ensuring robustness.
Unit testing ensures the correct functionality of isolated code modules. Arvo's structure encourages small, focused tests that validate each core's behavior in isolation.
All Hoon language features and standard library functions—including hoon.hoon, lull.hoon, zuse.hoon, and arvo.hoon, as well as all files in /lib—MUST have unit tests.
Tests should use the -test thread, typically in conjunction with the associated /lib/test library, optionally supplemented with the %quiz property testing library. Test files should reside in /tests and follow the naming convention:
urbit/tests/[base-desk-file-path].hoon
For partial tests of individual arms, append the arm name to the path.
Example:
This test verifies addition behavior, including identity and basic arithmetic. It should be located at tests/sys/hoon/math/add.hoon.
/+  *test
|%
::
::  Test addition (+)
::
++  test-add
  ;:  weld
    ::  Checks standard addition
    ::
    %+  expect-eq
      !>  2
      !>  (add 1 1)
    ::  Checks identity property (0 + n = n)
    ::
    %+  expect-eq
      !>  5
      !>  (add 0 5)
  ==
--
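The same pattern can cover crash behavior as well as return values. The sketch below is illustrative and not taken from the existing test suite: it assumes the `expect-fail` arm of /lib/test, which takes a trap and passes only if evaluating it crashes, and it assumes `dec` would be filed alongside `add`, giving a path of tests/sys/hoon/math/dec.hoon.

```hoon
/+  *test
|%
::
::  Test decrement (dec)
::
++  test-dec
  ;:  weld
    ::  Checks ordinary decrement
    ::
    %+  expect-eq
      !>  4
      !>  (dec 5)
    ::  Checks that decrementing zero crashes rather than wrapping
    ::
    %-  expect-fail
      |.  (dec 0)
  ==
--
```

Each arm whose name begins with `test-` produces a `tang` of failure messages, so welding the individual checks together lets one arm report several failures at once.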
Agent tests should use the test-agent library for structured verification of state and cards.
Example:
This test builds the %time agent, initializes it, pokes it, and checks for the expected %wait card. It should be located at tests/app/time.hoon.
/+  *test-agent
/=  time-agent  /app/time
|%
++  test-poke
  %-  eval-mare
  =/  m  (mare ,~)
  ^-  form:m
  ;<  *  bind:m  (do-init %time time-agent)
  ;<  caz=(list card)  bind:m  (do-poke noun+!>(~))
  ;<  =bowl  bind:m  get-bowl
  %+  ex-cards
    caz
  :~  (ex-arvo /(scot %da now.bowl) %b %wait `@da`+(now.bowl))
  ==
--
Each vane should have unit tests covering its API. The style used in current Eyre tests is preferred and should be adopted across vanes when possible. A universal test-vane library may be impractical, but per-vane test libraries are encouraged to maintain consistency and reduce duplication.
Mark files require unit tests for both grab and grow arms, ensuring correct serialization and deserialization behavior.
Example:
This test validates both JSON and noun conversions for the loob mark. It should be located at tests/mar/loob.hoon.
/+  *test
/=  loob-mar  /mar/loob
|%
++  test-grow-json
  %+  expect-eq
    !>  [%b %.y]
    !>  json:~(grow loob-mar %.y)
++  test-grow-noun
  %+  expect-eq
    !>  %.y
    !>  noun:~(grow loob-mar %.y)
++  test-grab-noun
  %+  expect-eq
    !>  %.y
    !>  (noun:~(grab loob-mar *?) 0)
--
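For context, the arms exercised above belong to a mark shaped roughly like the following. This is a sketch of a minimal boolean mark written for illustration, assuming `noun` and `json` arms under `grow` and a `noun` arm under `grab`; the actual /mar/loob.hoon in the %base desk may differ in detail.

```hoon
::  Sketch of a boolean mark (illustrative, not the file as shipped)
::
|_  a=?
++  grad  %noun
++  grow
  |%
  ::  Convert the boolean sample to a raw noun
  ::
  ++  noun  a
  ::  Convert the boolean sample to a JSON boolean
  ::
  ++  json  [%b a]
  --
++  grab
  |%
  ::  Validate an incoming noun as a boolean
  ::
  ++  noun  ?
  --
--
```

Each `grow` arm converts from the mark's sample to another mark, and each `grab` arm converts into it, so testing one case per arm covers every conversion the mark declares.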
Regression tests confirm that known bugs remain fixed and do not recur. Each test should reproduce the original conditions of a bug and verify that the issue has been resolved.
Regression tests MUST accompany bug fixes. They should be integrated into the relevant test suite, or placed in /tests/bug/ if standalone.
Example:
GitHub Issue #6095 involves a crash in the Hoon parser due to rune ordering. The pull request that fixes this bug should include the following test, located at /tests/bug/gh-6095.hoon.
/+  *test
|%
::
::  Test that a core with a luslus prior to all lusbars successfully compiles
::
++  test-chapter
  %-  expect-success
  |.
  %-  ream
  '|%  ++  foo  ~  +|  %bar  ++  baz  ~  --'
--
Integration testing verifies behavior across code boundaries, explicitly identifying module interaction surfaces and ensuring reliable inter-component communication. However, the combinatorial explosion of emergent behaviors across these boundaries makes integration test coverage difficult to measure, so heuristics are needed to target the cases most likely to expose issues. For example, establishing the basic functionality of many vanes inherently involves integration testing because of their interactions with Vere I/O. Generators and threads, while viable for unit testing in theory, so often interact with vane state or even external endpoints that they fall firmly within integration testing territory.
For this reason, it is difficult to establish a clear organizational structure for integration testing akin to Arvo's unit testing methodology. Instead, Urbit development should build greater discipline around pairing new core features with userspace implementations that exercise those features in real-world scenarios.
Example:
This test validates the interaction between a %wait task sent to Behn and the expected %wake response after a delay.
/-  spider
/+  *strandio
=,  strand=strand:spider
^-  thread:spider
|=  arg=vase
=/  m  (strand ,vase)
^-  form:m
=/  delay=@dr  (need !<((unit @dr) arg))
;<  t1=@da  bind:m  get-time
=/  =task:behn  [%wait (add delay t1)]
=/  =card:agent:gall  [%pass /timer %arvo %b task]
;<  ~  bind:m  (send-raw-card card)
;<  res=(pair wire sign-arvo)  bind:m  take-sign-arvo
?>  ?=([%timer ~] p.res)
?>  ?=([%behn %wake *] q.res)
%-  (slog ~[leaf+"Gift: {<+.q.res>}"])
?~  error.q.res
  ;<  t2=@da  bind:m  get-time
  %-  (slog ~[leaf+"Time elapsed: {<`@dr`(sub t2 t1)>}"])
  (pure:m !>(~))
%-  (slog u.error.q.res)
(pure:m !>(~))
This UIP does not alter the behavior of existing code but RECOMMENDS a consistent review and enforcement policy going forward. Contributors MAY need to add tests retroactively when modifying untested components.
Adopting these standards improves the security posture of the Urbit codebase by:
- Catching regressions that may reintroduce known vulnerabilities
- Identifying logic bugs that only manifest during execution
- Ensuring correctness of serialization, cryptographic operations, and system boundaries
- Providing a foundation for automated verification in CI
Copyright and related rights waived via CC0.