Merged
14 changes: 14 additions & 0 deletions Makefile
@@ -1,9 +1,23 @@
.PHONY: help

include make/zcashd-compat.mk
include make/perf.mk

help:
	@echo "Available targets:"
	@echo ""
	@echo " Perf harness (deterministic isolated-cohort bench):"
	@echo " perf-build-local Build the instrumented (commit-metrics) bench binary"
	@echo " perf-run Run an isolated bench against the cohort (PERF_LABEL/PERF_STOP)"
	@echo " perf-analyze Bottleneck attribution over the CSV window (PERF_LABEL/PERF_LO/PERF_HI)"
	@echo " perf-dashboard Live metrics dashboard for the running bench node"
	@echo " perf-verify-isolation Confirm the bench sees only the two cohort peers"
	@echo " perf-seed-serving Deploy + sync the two serving nodes from public mainnet"
	@echo " perf-peers Capture serving node_id@ip:8234 into cohort.env"
	@echo " perf-freeze-serving Redeploy serving nodes cohort-isolated (static range)"
	@echo " perf-status Report serving node service state + version"
	@echo ""
	@echo " zcashd-compat:"
	@echo " compat-docker-build Build Docker zcashd-compat image"
	@echo " compat-zcashd-prepare Fetch/verify zcashd-compat artifact for Docker build"
	@echo " compat-docker-start Start Docker zcashd-compat with mounted snapshots"
1 change: 1 addition & 0 deletions book/src/SUMMARY.md
@@ -32,6 +32,7 @@
- [Mempool Architecture](dev/diagrams/mempool-architecture.md)
- [Upgrading the State Database](dev/state-db-upgrades.md)
- [Pruned Storage Mode](dev/pruned-storage.md)
- [Private Zakura Dev Networks](dev/private-zakura-network.md)
- [Zebra versioning and releases](dev/release-process.md)
- [Continuous Integration](dev/continuous-integration.md)
- [Continuous Delivery](dev/continuous-delivery.md)
69 changes: 69 additions & 0 deletions book/src/dev/private-zakura-network.md
@@ -0,0 +1,69 @@
# Private Zakura Dev Networks

When the team iterates on breaking changes to the Zakura (v2) P2P stack, separate
experiments must not interfere with each other or leak into the public network.
A node bootstraps from a few peers, but discovery and gossip then pull in the
rest of the network, so two engineers testing different changes at the same time
would otherwise collide.

A _private Zakura dev network_ (a "cohort") solves this without changing
consensus. Set a tag in the config, and the node forms Zakura (v2) connections
only with peers that advertise the same tag. Public nodes and other cohorts
ignore the tagged node, and it ignores them. It still runs on the real chain
(same genesis, network magic, and activation heights), so it validates exactly
what production does.

This only scopes the **Zakura v2 overlay**. A dev node may still maintain legacy
TCP connections to public peers; the isolation applies to the v2 stack that is
under test. The tag has no effect unless `v2_p2p` is enabled.

## Configuration

Give every node in the group the same tag under `[network.zakura]`:

```toml
[network.zakura]
# Any node sharing this exact string joins the same private overlay.
# A different string, or no string, is a different (or the public) overlay.
dev_network = "evan-breaking-change"

# Seed the cohort by listing each other's native Zakura endpoints as
# `node_id@direct_addr`. The cohort then self-organizes via cohort-tagged
# discovery; it never reaches public or other-cohort peers.
bootstrap_peers = [
"ed25519nodeid1@10.0.0.1:8234",
"ed25519nodeid2@10.0.0.2:8234",
]
```

```toml
[network]
# Keep v2 enabled so the Zakura overlay runs.
v2_p2p = true
```
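Each `bootstrap_peers` entry above has the shape `node_id@direct_addr`. As a minimal sketch of how such an entry splits into its parts, the helper below parses one string. `BootstrapPeer` and `parse_bootstrap_peer` are hypothetical names used only for illustration, and the sketch assumes `direct_addr` is always `host:port`; the node's real parser may accept other forms.

```python
from typing import NamedTuple


class BootstrapPeer(NamedTuple):
    """One parsed `node_id@host:port` entry (hypothetical type)."""
    node_id: str
    host: str
    port: int


def parse_bootstrap_peer(entry: str) -> BootstrapPeer:
    """Split `node_id@host:port`; raise ValueError on a malformed entry."""
    node_id, sep, addr = entry.partition("@")
    if not sep or not node_id:
        raise ValueError(f"expected node_id@direct_addr, got {entry!r}")
    # Split on the last colon so a bracketed IPv6 host keeps its inner colons.
    host, sep, port = addr.rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError(f"expected host:port after '@', got {addr!r}")
    return BootstrapPeer(node_id, host, int(port))


peer = parse_bootstrap_peer("ed25519nodeid1@10.0.0.1:8234")
print(peer.node_id, peer.host, peer.port)
```

Checking entries like this before starting a node can catch a missing `@` or port early, since a cohort with no reachable bootstrap peer has nothing to discover.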

## How it works

The tag scopes the node's Zakura identity at a single point. With a tag set, the
node advertises `ZakuraNetworkId::Configured` and a chain id derived from the
real genesis hash and the tag (a domain-separated hash, so a cohort id can never
collide with a real chain's genesis). Both fields are already exchanged and
validated in the Zakura handshake, the legacy→Zakura upgrade prelude, and signed
discovery records, so:

- a **public mainnet** Zakura node (`network_id = Mainnet`) and a dev node reject
each other with `WrongNetwork`, falling back to a legacy connection;
- two **different cohorts** (both `Configured`, different chain id) reject each
other with `WrongChain`, and their discovery records fail validation on import;
- **same-tag** peers match on both fields, complete the Zakura upgrade, and gossip
cohort-tagged records so the group grows among itself.

Because the tag is mixed only into the Zakura peer-matching chain id and never
into block validation, consensus is unaffected.
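The matching rules above can be illustrated with a small model. This is a sketch under stated assumptions, not the node's implementation: the hash function (SHA-256), the `DOMAIN` separator string, the length-prefixed encoding, and the choice of the raw genesis hash as the public chain id are all assumptions. Only the `Mainnet`/`Configured` network ids and the `WrongNetwork`/`WrongChain` outcomes come from the text.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional

# Assumed domain separator; the real constant is not shown in this document.
DOMAIN = b"zakura-dev-network"


@dataclass(frozen=True)
class ZakuraIdentity:
    network_id: str  # "Mainnet" or "Configured"
    chain_id: bytes


def identity(genesis_hash: bytes, dev_network: Optional[str]) -> ZakuraIdentity:
    """Derive the advertised identity; a tag switches to a cohort chain id."""
    if dev_network is None:
        # Assumption: an untagged node uses the genesis hash as its chain id.
        return ZakuraIdentity("Mainnet", genesis_hash)
    h = hashlib.sha256()
    # Length-prefix each part so different (genesis, tag) pairs cannot
    # produce the same byte stream.
    for part in (DOMAIN, genesis_hash, dev_network.encode()):
        h.update(len(part).to_bytes(4, "big"))
        h.update(part)
    return ZakuraIdentity("Configured", h.digest())


def check_peer(local: ZakuraIdentity, remote: ZakuraIdentity) -> str:
    """Return the handshake outcome for a remote peer's identity."""
    if local.network_id != remote.network_id:
        return "WrongNetwork"
    if local.chain_id != remote.chain_id:
        return "WrongChain"
    return "Ok"


genesis = bytes(32)  # placeholder, not a real genesis hash
public = identity(genesis, None)
cohort_a = identity(genesis, "evan-breaking-change")
cohort_b = identity(genesis, "someone-else")
print(check_peer(public, cohort_a))  # WrongNetwork
print(check_peer(cohort_a, cohort_b))  # WrongChain
print(check_peer(cohort_a, identity(genesis, "evan-breaking-change")))  # Ok
```

Nothing in this model feeds into block validation; the derived id is used only for comparing peers, which mirrors why the tag leaves consensus untouched.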

## Verifying isolation

Launch two nodes with the same `dev_network` and `bootstrap_peers` pointing at
each other, plus a third node without the tag. Watch the `zakura.p2p.handshake.*`
metrics (or the Zakura trace tables): the two tagged nodes upgrade to Zakura and
discover each other, while the untagged node stays on legacy with no v2 upgrade.
All three keep syncing the same chain.
5 changes: 5 additions & 0 deletions deploy/deployer/.gitignore
@@ -0,0 +1,5 @@
# Local build cache (binaries keyed on commit SHA) and fetched logs.
.build-cache/
logs/
# Operator's real fleet config (keep nodes.example.toml tracked).
nodes.toml
72 changes: 72 additions & 0 deletions deploy/deployer/README.md
@@ -0,0 +1,72 @@
# zebrad deploy tool

A small, dependency-free operator tool to build `zebrad` from a per-node commit,
distribute it to a fleet over SSH, run it as a systemd service that logs to a
deterministic file, and pull those logs back by node name.

It reuses the build → scp → install-with-`.bak`-backup → `systemctl restart` →
rollback pattern from `.github/workflows/deploy-zcashd-compat.yml`, generalized to
a dynamic multi-node config.
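The install, restart, and rollback steps of that pattern can be sketched as follows. This is an illustration under assumptions rather than the tool's code: `install_with_rollback` and the `Run` callback are hypothetical names, and the exact remote commands (`cp`, `install`, `mv`) are a plausible choice, not taken from `deploy.py` or the workflow.

```python
import shlex
from typing import Callable

# Hypothetical runner type: executes one shell command on the node (for
# example through ssh) and returns True when it exits with status 0.
Run = Callable[[str], bool]


def install_with_rollback(run: Run, staged: str, bin_path: str, service: str) -> bool:
    """Install `staged` at `bin_path`, restart `service`, roll back on failure."""
    target = shlex.quote(bin_path)
    backup = shlex.quote(bin_path + ".bak")
    unit = shlex.quote(service)

    # Keep the previous binary; a first deploy has nothing to back up.
    has_backup = run(f"test -f {target}") and run(f"cp -f {target} {backup}")

    if not run(f"install -m 0755 {shlex.quote(staged)} {target}"):
        return False
    if run(f"systemctl restart {unit}"):
        return True

    # Restart failed: restore the previous binary and start it again.
    if has_backup:
        run(f"mv -f {backup} {target}")
        run(f"systemctl restart {unit}")
    return False
```

Passing the runner in as a callback keeps the sequencing testable without a real host: a fake runner can fail the first restart and record that the `.bak` copy was restored.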

## Requirements

- Python 3.11+ (uses the stdlib `tomllib`; no third-party packages).
- A working SSH key for every node's `ssh_string` (key-based auth; the tool runs
ssh in `BatchMode`, so password prompts are not supported).
- A local Rust toolchain + `protoc` to build `zebrad` (same as a normal workspace
build). Builds run on this host; the resulting binary is copied to every node,
so nodes must share the build host's architecture and a compatible glibc
(DigitalOcean Ubuntu x86_64 droplets do).

## Config

Copy `nodes.example.toml` to `nodes.toml` and edit. Each `[[nodes]]` entry needs:

- `name` — used for `--node` selection and `logs/<name>.log`.
- `ssh_string` — the ssh/scp destination, e.g. `root@167.99.162.47`.
- `commit` — branch, tag, or SHA to build from (must be fetched locally).

`[defaults]` supplies fleet-wide values (service name, paths, network, ssh
`port`); any field can be overridden per node. `nodes.toml` is gitignored.

## Commands

```bash
cd deploy/deployer

# Build each unique commit into .build-cache/zebrad-<sha> (reused if present).
python3 deploy.py build --config nodes.toml

# Build-if-needed, distribute, install the unit, restart. Parallel; rolls back
# a node to <bin_path>.bak if its restart fails. Non-zero exit if any node fails.
python3 deploy.py deploy --config nodes.toml
python3 deploy.py deploy --config nodes.toml --node node-a # one node
python3 deploy.py deploy --config nodes.toml --no-restart # stage only

# Service state + version per node.
python3 deploy.py status --config nodes.toml

# Pull logs (deterministic log_file from the rendered config).
python3 deploy.py logs fetch --config nodes.toml # -> logs/<name>.log
python3 deploy.py logs fetch --config nodes.toml --lines 2000 # last N lines only
python3 deploy.py logs follow --config nodes.toml --node node-a # live tail -F
```

## How the build cache works

`commit` is resolved to a full SHA (`git rev-parse`). The binary is cached at
`.build-cache/zebrad-<sha>`. A cached binary is reused only if its embedded
`zebrad --version` matches the SHA; otherwise it is rebuilt. Two nodes on the same
commit share a single build. Each build happens in a throwaway detached
`git worktree`, so your dirty working tree is never touched. Use `--force` to
rebuild unconditionally.
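The reuse decision can be sketched as a pure function. The names `cache_path`, `needs_build`, and `unique_shas` are hypothetical. The sketch also assumes the version output carries a (possibly abbreviated) commit SHA that a caller-supplied `embedded_sha` function extracts; how `zebrad --version` actually encodes the commit is not shown here.

```python
from pathlib import Path
from typing import Callable


def cache_path(cache_dir: Path, sha: str) -> Path:
    """Location of the cached binary for a fully resolved commit SHA."""
    return cache_dir / f"zebrad-{sha}"


def needs_build(
    cache_dir: Path,
    sha: str,
    embedded_sha: Callable[[Path], str],
    force: bool = False,
) -> bool:
    """Return True when the binary for `sha` must be (re)built."""
    binary = cache_path(cache_dir, sha)
    if force or not binary.is_file():
        return True
    # Assumption: the binary reports a prefix of the commit it was built from.
    embedded = embedded_sha(binary)
    return not (embedded and sha.startswith(embedded))


def unique_shas(resolved: list[str]) -> list[str]:
    """Drop duplicates while keeping order, so each commit builds once."""
    return list(dict.fromkeys(resolved))
```

In practice `embedded_sha` would run the cached binary with `--version` and parse the result; injecting it keeps the decision logic separate from process execution.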

## What gets installed on a node

- Binary at `bin_path` (default `/usr/local/bin/zebrad`), previous kept as `.bak`.
- Rendered config at `config_path` (default `/etc/zebrad/zebrad.toml`) with
`[tracing] log_file` pointed at `log_file`.
- Unit at `/etc/systemd/system/<service_name>.service` running
`zebrad -c <config_path> start` with `Restart=always`.

The deterministic `log_file` is the single source of truth shared by the running
node (writer) and `logs fetch`/`logs follow` (reader).
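As a rough illustration of the unit described above, the sketch below renders a unit file from the documented paths. `render_unit` is a hypothetical helper, and the `[Unit]` and `[Install]` sections are assumed boilerplate; only the unit path, the `ExecStart` command line, and `Restart=always` come from the list above.

```python
def render_unit(
    service_name: str,
    bin_path: str = "/usr/local/bin/zebrad",
    config_path: str = "/etc/zebrad/zebrad.toml",
) -> tuple[str, str]:
    """Return (unit file path, unit file contents) for one node."""
    unit_path = f"/etc/systemd/system/{service_name}.service"
    unit = (
        "[Unit]\n"
        f"Description={service_name}\n"
        "After=network-online.target\n"
        "Wants=network-online.target\n"
        "\n"
        "[Service]\n"
        f"ExecStart={bin_path} -c {config_path} start\n"
        "Restart=always\n"
        "\n"
        "[Install]\n"
        "WantedBy=multi-user.target\n"
    )
    return unit_path, unit


path, text = render_unit("zebrad")
print(path)
print(text)
```

The log destination does not appear in the unit because, as described above, it lives in the rendered config under `[tracing] log_file`.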