Interface skeleton describing the shape of a safety-hook + context-injection layer for agent-facing tool surfaces. Production safety controls and production context schemas live in the proprietary UnboxAPI runtime at unboxapi.pro.
This repository ships interface declarations and one illustrative example hook. It is NOT a production safety control, NOT a content classifier, NOT a prompt-injection defence, and NOT a spend or authorization gate.
The example logging-passthrough hook allows every call and logs to stdout. It exists to show the shape of the interface; it must not be deployed as a safety mechanism. Consumers who need real spend caps, prompt-injection defences, EU AI Act disclosure, audit-log integrity, or any other safety property must implement those themselves (or use the proprietary UnboxAPI runtime).
If you deploy this repository's example hook into a production agent pipeline, you have no safety layer.
UnboxAPI-SafetyKit defines, in a language-agnostic spec plus TypeScript
declaration files:
- the shape of a SafetyHook: a before/after veto interface that an agent runtime can call around every tool invocation;
- the shape of a ContextProvider: an interface that returns per-tool-call context to inject into the prompt;
- a trivial reference implementation (logging-passthrough) that demonstrates the contract and does nothing safety-relevant.
The full language-agnostic specification is in
docs/interface-spec.md. TypeScript declarations
are in interface/. The reference hook is in
examples/logging-passthrough.ts.
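As a rough illustration of the ContextProvider idea described above, the sketch below shows one possible shape. It is not copied from interface/: the method name provide, the ToolCallContext fields shown, and the staticProvider example are all assumptions made for illustration. The declaration files remain the authoritative contract.

```typescript
// Hypothetical shapes for illustration only; the real declarations live in interface/.
interface ToolCallContext {
  toolName: string;
  args: Record<string, unknown>;
}

// Assumed shape: one method that returns text to inject into the prompt
// for a single tool call. The method name `provide` is an assumption.
interface ContextProvider {
  provide(ctx: ToolCallContext): Promise<string> | string;
}

// A trivial provider that only describes the call. It carries no safety value.
const staticProvider: ContextProvider = {
  provide(ctx) {
    const argCount = Object.keys(ctx.args).length;
    return `Tool ${ctx.toolName} is being called with ${argCount} argument(s).`;
  },
};
```

A runtime would await the returned value and treat it as untrusted text before adding it to a prompt.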
This repository ships only the interface skeleton. It does not ship:
- spend caps, quantity caps, allow-lists, blocked-parameter lists, or precondition enforcement;
- prompt-injection classifiers, content filters, or jailbreak detectors;
- the UnboxAPI production rule library;
- production context schemas (brand voice, operational rules, sales context, product knowledge) or their LLM-generated values;
- EU AI Act disclosure wrappers or audit-log integrity guarantees.
Those live in the proprietary UnboxAPI runtime at https://unboxapi.pro.
v0.1.0 — interface-only public release.
- The interface shape is stable for v0.x. Breaking changes will bump the minor version pre-1.0 and the major version post-1.0.
- External contributions are not accepted at v0.1.0. Please open an Issue with feedback; pull requests will be closed. A CLA and contribution guide will be added in a later release.
// interface/safety-hook.d.ts (excerpt)
export interface SafetyHook {
  before(ctx: ToolCallContext): Promise<HookOutcome> | HookOutcome;
  after?(ctx: ToolCallContext, response: Readonly<unknown>):
    Promise<HookOutcome> | HookOutcome;
}
export interface HookOutcome {
  allowed: boolean;
  reason?: string;
  sanitizedArgs?: Record<string, unknown>;
}

// examples/logging-passthrough.ts (excerpt — NOT A SAFETY CONTROL)
export const loggingPassthrough: SafetyHook = {
  before(ctx) {
    console.log(`[safetykit] before ${ctx.toolName}`);
    return { allowed: true };
  },
};

See docs/interface-spec.md for the full contract.
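To make the before/after veto flow concrete, here is a minimal sketch of how a runtime might wrap one tool call with a hook. The helper invokeWithHook, the Tool type, and the ToolCallContext fields are hypothetical and not part of this repository. Treating sanitizedArgs as a replacement for the original arguments is also an assumption; docs/interface-spec.md defines the actual semantics.

```typescript
// Local copies of the shapes from the excerpt, so the sketch is self-contained.
// The ToolCallContext fields here are assumed for illustration.
interface ToolCallContext {
  toolName: string;
  args: Record<string, unknown>;
}
interface HookOutcome {
  allowed: boolean;
  reason?: string;
  sanitizedArgs?: Record<string, unknown>;
}
interface SafetyHook {
  before(ctx: ToolCallContext): Promise<HookOutcome> | HookOutcome;
  after?(ctx: ToolCallContext, response: Readonly<unknown>):
    Promise<HookOutcome> | HookOutcome;
}

// Hypothetical tool signature.
type Tool = (args: Record<string, unknown>) => Promise<unknown>;

// Hypothetical helper: runs one tool call with a hook wrapped around it.
async function invokeWithHook(
  hook: SafetyHook,
  tool: Tool,
  ctx: ToolCallContext,
): Promise<unknown> {
  const pre = await hook.before(ctx);
  if (!pre.allowed) {
    throw new Error(`blocked before call: ${pre.reason ?? "no reason given"}`);
  }
  // Assumption: sanitized arguments, when present, replace the originals.
  const args = pre.sanitizedArgs ?? ctx.args;
  const response = await tool(args);
  if (hook.after) {
    const post = await hook.after({ ...ctx, args }, response);
    if (!post.allowed) {
      throw new Error(`blocked after call: ${post.reason ?? "no reason given"}`);
    }
  }
  return response;
}
```

With loggingPassthrough plugged in, this wrapper would log and then allow every call, which is why the example hook provides no protection.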
This repository's threat surface is documented in
docs/threat-model.md and covers three explicit
risks: (a) the skeleton being mistaken for a production safety control,
(b) prompt-injection / rule-bypass paths through any implementation of this
interface, (c) supply-chain integrity of the published artifact.
The headline risk is (a): a consumer wires loggingPassthrough into a real
agent pipeline and assumes they have a safety layer. They do not. The
"Not production safety" banner above, the file-level comments on the
example hook, and the inline import-time assertion exist to make this
mistaken assumption as hard to reach as possible.
If you implement this interface in your own runtime, you MUST:
- Validate every field of args against your own schema before acting. This interface does not constrain argument shapes.
- Treat string fields (reason, any metadata value, the response payload) as untrusted user data. Never inline-concatenate hook outputs into LLM system prompts; pass as bounded, escaped user-role content.
- Enforce spend, quantity, and authorization caps at the call site, independent of any hook decision. Hooks are advisory unless your own code makes them load-bearing.
- Time-bound every before/after call (e.g. 250 ms hard timeout). A misbehaving hook implementation must not stall the tool surface.
- Apply your own audit-log integrity controls (signed log entries, append-only storage). The interface does not provide an audit log.
- Cap the size of any string returned by a ContextProvider (e.g. ≤ 4 KiB) before concatenating into a prompt.
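Two of the requirements above, the hard timeout and the size cap, can be sketched as follows. The helper names withHookTimeout and capUtf8 are hypothetical, and the defaults simply reuse the example figures from the list (250 ms, 4 KiB). Denying the call when a hook times out or throws is one possible policy chosen for this sketch, not something the interface mandates.

```typescript
interface HookOutcome {
  allowed: boolean;
  reason?: string;
}

// Hypothetical helper: resolves to a denial if the hook does not answer in time
// or throws, so a misbehaving hook cannot stall the tool surface.
function withHookTimeout(
  call: () => Promise<HookOutcome> | HookOutcome,
  timeoutMs = 250,
): Promise<HookOutcome> {
  return new Promise<HookOutcome>((resolve) => {
    const timer = setTimeout(
      () => resolve({ allowed: false, reason: `hook timed out after ${timeoutMs} ms` }),
      timeoutMs,
    );
    // Starting from a resolved promise also catches synchronous throws in `call`.
    Promise.resolve()
      .then(call)
      .then(
        (outcome) => {
          clearTimeout(timer);
          resolve(outcome);
        },
        (err) => {
          clearTimeout(timer);
          resolve({ allowed: false, reason: `hook threw: ${String(err)}` });
        },
      );
  });
}

// Hypothetical helper: truncates text to at most maxBytes of UTF-8
// without splitting a multi-byte character.
function capUtf8(text: string, maxBytes = 4096): string {
  const bytes = new TextEncoder().encode(text);
  if (bytes.length <= maxBytes) return text;
  let end = maxBytes;
  // Back up while the byte at the cut is a UTF-8 continuation byte (10xxxxxx).
  while (end > 0 && (bytes[end]! & 0xc0) === 0x80) end--;
  return new TextDecoder().decode(bytes.subarray(0, end));
}
```

Note that the timeout only stops the runtime from waiting; it does not cancel work the hook has already started. Capping by bytes rather than by string length keeps the limit meaningful for non-ASCII context.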
See SECURITY.md for the coordinated-disclosure policy and threat-model summary.
Releases are signed and Sigstore-attested. To verify a release tarball:
cosign verify-attestation \
  --type slsaprovenance \
  --certificate-identity-regexp '.*' \
  --certificate-oidc-issuer https://token.actions.githubusercontent.com \
  unboxapi-safetykit-v0.1.0.tar.gz

A CycloneDX SBOM is attached to every GitHub release.
Apache License, Version 2.0. See LICENSE and NOTICE.
This repository is maintained by UnboxAPI. The proprietary UnboxAPI runtime, production safety rule library, vetted vertical SemanticMaps, and compliance tooling are at https://unboxapi.pro.
- UnboxAPI-OpenSpec: the SemanticMap schema spec (DHA-26).