founder-OmniPA/UnboxAPI-SafetyKit

UnboxAPI SafetyKit

Interface skeleton describing the shape of a safety-hook + context-injection layer for agent-facing tool surfaces. Production safety controls and production context schemas live in the proprietary UnboxAPI runtime at unboxapi.pro.



⚠️ NOT PRODUCTION SAFETY

This repository ships interface declarations and one illustrative example hook. It is NOT a production safety control, NOT a content classifier, NOT a prompt-injection defence, and NOT a spend or authorization gate.

The example logging-passthrough hook allows every call and logs to stdout. It exists to show the shape of the interface — it must not be deployed as a safety mechanism. Consumers who need real spend caps, prompt-injection defences, EU AI Act disclosure, audit-log integrity, or any other safety property must implement those themselves (or use the proprietary UnboxAPI runtime).

If you deploy this repository's example hook into a production agent pipeline, you have no safety layer.


What this is

UnboxAPI-SafetyKit defines, in a language-agnostic spec plus TypeScript declaration files:

  • the shape of a SafetyHook — a before/after veto interface that an agent runtime can call around every tool invocation;
  • the shape of a ContextProvider — an interface that returns per-tool-call context to inject into the prompt;
  • a trivial reference implementation (logging-passthrough) that demonstrates the contract and does nothing safety-relevant.

The full language-agnostic specification is in docs/interface-spec.md. TypeScript declarations are in interface/. The reference hook is in examples/logging-passthrough.ts.

What this is not

This repository ships only the interface skeleton. It does not ship:

  • spend caps, quantity caps, allow-lists, blocked-parameter lists, or precondition enforcement;
  • prompt-injection classifiers, content filters, or jailbreak detectors;
  • the UnboxAPI production rule library;
  • production context schemas (brand voice, operational rules, sales context, product knowledge) or their LLM-generated values;
  • EU AI Act disclosure wrappers or audit-log integrity guarantees.

Those live in the proprietary UnboxAPI runtime at https://unboxapi.pro.

Status

v0.1.0 — interface-only public release.

  • The interface shape is stable for v0.x. Breaking changes will bump the minor version pre-1.0 and the major version post-1.0.
  • External contributions are not accepted at v0.1.0. Please open an Issue with feedback; pull requests will be closed. A CLA and contribution guide will be added in a later release.

Quick look

// interface/safety-hook.d.ts (excerpt)
export interface SafetyHook {
  before(ctx: ToolCallContext): Promise<HookOutcome> | HookOutcome;
  after?(ctx: ToolCallContext, response: Readonly<unknown>):
    Promise<HookOutcome> | HookOutcome;
}

export interface HookOutcome {
  allowed: boolean;
  reason?: string;
  sanitizedArgs?: Record<string, unknown>;
}

// examples/logging-passthrough.ts (excerpt — NOT A SAFETY CONTROL)
export const loggingPassthrough: SafetyHook = {
  before(ctx) {
    console.log(`[safetykit] before ${ctx.toolName}`);
    return { allowed: true };
  },
};

See docs/interface-spec.md for the full contract.
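To make the before/after contract concrete, here is a minimal sketch of how a runtime might call a hook around a single tool invocation. It is not taken from this repository: `invokeWithHook`, the `Tool` type, and the two-field `ToolCallContext` are hypothetical, the interfaces are local stand-ins that mirror the excerpt (with `response` simplified to `unknown`), and the choice to let `sanitizedArgs` replace the original arguments is an assumption. The authoritative semantics are in docs/interface-spec.md.

```typescript
// Local stand-ins mirroring the excerpt; ToolCallContext is an assumed minimal shape.
interface ToolCallContext {
  toolName: string;
  args: Record<string, unknown>;
}

interface HookOutcome {
  allowed: boolean;
  reason?: string;
  sanitizedArgs?: Record<string, unknown>;
}

interface SafetyHook {
  before(ctx: ToolCallContext): Promise<HookOutcome> | HookOutcome;
  after?(ctx: ToolCallContext, response: unknown):
    Promise<HookOutcome> | HookOutcome;
}

// Hypothetical tool signature used only for this sketch.
type Tool = (args: Record<string, unknown>) => Promise<unknown>;

// Hypothetical runtime helper: runs `before`, then the tool, then `after`.
// A veto at either stage throws, so the caller never sees the response.
async function invokeWithHook(
  hook: SafetyHook,
  ctx: ToolCallContext,
  tool: Tool,
): Promise<unknown> {
  const pre = await hook.before(ctx);
  if (!pre.allowed) {
    throw new Error(`blocked before call: ${pre.reason ?? "no reason given"}`);
  }

  // Assumption: when present, sanitizedArgs replaces the original args.
  const args = pre.sanitizedArgs ?? ctx.args;
  const response = await tool(args);

  if (hook.after) {
    const post = await hook.after({ ...ctx, args }, response);
    if (!post.allowed) {
      throw new Error(`blocked after call: ${post.reason ?? "no reason given"}`);
    }
  }
  return response;
}
```

Throwing on a veto keeps the denied path out of the normal return value, so a caller cannot accidentally use a response that the hook rejected. A runtime could equally return a tagged result instead; that is a design choice, not something this interface prescribes.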

Threat model (READ THIS)

This repository's threat surface is documented in docs/threat-model.md and covers three explicit risks: (a) the skeleton being mistaken for a production safety control, (b) prompt-injection / rule-bypass paths through any implementation of this interface, and (c) supply-chain integrity of the published artifact.

The headline risk is (a): a consumer wires loggingPassthrough into a real agent pipeline and assumes they have a safety layer. They do not. The "NOT PRODUCTION SAFETY" banner above, the file-level comments on the example hook, and the inline import-time assertion exist to make this mistaken assumption as hard as possible.

Consumer hardening (READ THIS TOO)

If you implement this interface in your own runtime, you MUST:

  1. Validate every field of args against your own schema before acting. This interface does not constrain argument shapes.
  2. Treat string fields (reason, any metadata value, the response payload) as untrusted user data. Never concatenate hook outputs inline into LLM system prompts; pass them as bounded, escaped user-role content.
  3. Enforce spend, quantity, and authorization caps at the call site, independent of any hook decision. Hooks are advisory unless your own code makes them load-bearing.
  4. Time-bound every before / after call (e.g. 250 ms hard timeout). A misbehaving hook implementation must not stall the tool surface.
  5. Apply your own audit-log integrity controls (signed log entries, append-only storage). The interface does not provide an audit log.
  6. Cap the size of any string returned by a ContextProvider (e.g. ≤ 4 KiB) before concatenating into a prompt.
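
Items 4 and 6 can be sketched in a few lines. The helpers below are illustrative only: `callHookWithTimeout` and `capUtf8` are hypothetical names, not part of this repository, and treating a timeout or a thrown error as a denial (fail closed) is an assumption; the list above only requires that a slow hook must not stall the tool surface.

```typescript
interface HookOutcome {
  allowed: boolean;
  reason?: string;
}

// Hypothetical helper: races a hook call against a timer.
// Assumption: a timeout or a thrown error is treated as a denial (fail closed).
async function callHookWithTimeout(
  call: () => Promise<HookOutcome> | HookOutcome,
  timeoutMs = 250,
): Promise<HookOutcome> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<HookOutcome>((resolve) => {
    timer = setTimeout(
      () => resolve({ allowed: false, reason: "hook timed out" }),
      timeoutMs,
    );
  });
  try {
    // The async wrapper turns a synchronous throw into a rejection.
    return await Promise.race([(async () => call())(), timeout]);
  } catch {
    return { allowed: false, reason: "hook threw" };
  } finally {
    clearTimeout(timer);
  }
}

// Hypothetical helper: truncates text to at most maxBytes of UTF-8
// without splitting a multi-byte character.
function capUtf8(text: string, maxBytes = 4096): string {
  const bytes = new TextEncoder().encode(text);
  if (bytes.length <= maxBytes) return text;
  let end = maxBytes;
  // Step back while the first excluded byte is a continuation byte (10xxxxxx).
  while (end > 0 && (bytes[end] & 0xc0) === 0x80) end--;
  return new TextDecoder().decode(bytes.subarray(0, end));
}
```

The race only stops the caller from waiting; it does not cancel a hook that is still running, so a hook with side effects may still complete after the timeout. Capping by bytes rather than by string length keeps the limit meaningful for non-ASCII context.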

See SECURITY.md for the coordinated-disclosure policy and threat-model summary.

Verifying release artifacts

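Releases are signed and Sigstore-attested. To verify a release tarball, run the command below, replacing the permissive '.*' identity pattern with the signing workflow identity you expect, since '.*' accepts any identity from the issuer: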

cosign verify-attestation \
  --type slsaprovenance \
  --certificate-identity-regexp '.*' \
  --certificate-oidc-issuer https://token.actions.githubusercontent.com \
  unboxapi-safetykit-v0.1.0.tar.gz

A CycloneDX SBOM is attached to every GitHub release.

License

Apache License, Version 2.0. See LICENSE and NOTICE.

Maintainers

This repository is maintained by UnboxAPI. The proprietary UnboxAPI runtime, production safety rule library, vetted vertical SemanticMaps, and compliance tooling are at https://unboxapi.pro.
