33 changes: 33 additions & 0 deletions research-assistant-prompt-safety-guard/README.md
@@ -0,0 +1,33 @@
# Research Assistant Prompt Safety Guard

This module adds a focused safety gate for the AI-Powered Research Assistant Suite in issue #16. It protects auto peer-review and reproducibility packets from two failure modes that are specific to AI research assistants:

- hostile or hidden instructions in a manuscript or supplement that try to steer the assistant or extract its private prompts
- unsupported assistant-generated review claims, missing evidence anchors, and hallucinated citations that would otherwise reach a reviewer packet

The implementation is dependency-free and uses synthetic data only. It does not call external APIs, read private files, or require credentials.
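
As a rough illustration of the first failure mode, the sketch below scans manuscript text for instruction-like phrases in visible text and in two hidden channels (HTML comments and zero-width characters). The function name `findInjectionCandidates`, the rule ids, and the patterns are all hypothetical; they are not taken from `index.js`, whose actual rule set may differ.

```javascript
'use strict';

// Hypothetical rules for illustration only; not the module's real rule set.
const INSTRUCTION_PATTERNS = [
  { id: 'override-instructions', regex: /ignore (all )?(previous|prior|above) instructions/i },
  { id: 'reveal-prompt', regex: /(reveal|print|show) (your |the )?(system|hidden|private) prompt/i },
  { id: 'steer-verdict', regex: /recommend accept(ance|ing)?/i }
];

// Two hidden channels assumed for this sketch.
const HTML_COMMENT = /<!--([\s\S]*?)-->/g;
const ZERO_WIDTH = /[\u200B\u200C\u200D\u2060\uFEFF]/g;

// Return one finding per rule that matches the given text.
function scanChannel(text, channel) {
  return INSTRUCTION_PATTERNS
    .filter(({ regex }) => regex.test(text))
    .map(({ id }) => ({ rule: id, channel }));
}

function findInjectionCandidates(manuscriptText) {
  // Pull HTML comments out so they are scanned as a hidden channel.
  const commentBodies = [];
  const withoutComments = manuscriptText.replace(HTML_COMMENT, (match, inner) => {
    commentBodies.push(inner);
    return ' ';
  });

  const visible = scanChannel(withoutComments, 'visible');

  // Phrases that only match after zero-width characters are stripped
  // were obfuscated, so report them as hidden.
  const normalized = withoutComments.replace(ZERO_WIDTH, '');
  const obfuscated = scanChannel(normalized, 'hidden:zero-width')
    .filter((finding) => !visible.some((v) => v.rule === finding.rule));

  const inComments = commentBodies.flatMap((body) =>
    scanChannel(body.replace(ZERO_WIDTH, ''), 'hidden:html-comment')
  );

  return [...visible, ...obfuscated, ...inComments];
}
```

Pattern matching of this kind only flags candidates for a human reviewer; it makes no claim to catch every phrasing of an injected instruction.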

## Run

```bash
node research-assistant-prompt-safety-guard/test.js
node research-assistant-prompt-safety-guard/demo.js
```

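The demo writes three files under `research-assistant-prompt-safety-guard/reports/`:

- `reports/prompt-safety-packet.json`
- `reports/prompt-safety-report.md`
- `reports/summary.svg`

`reports/demo.mp4` is not written by `demo.js`; it is a separate artifact that the `ffprobe` check in `acceptance-notes.md` inspects.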

## Scope

The guard evaluates manuscript text, supplements, declared claims, artifacts, citation corpus entries, and draft assistant review findings. It emits:

- prompt-injection findings from visible and hidden manuscript channels
- evidence-support checks for assistant review findings
- blocker/warning/action summaries for reviewer packet readiness
- a deterministic Markdown reviewer report and SVG summary

This is intentionally separate from prior issue #16 slices such as broad assistant suites, protocol/evidence traces, statistics review, citation context reconciliation, benchmark leakage audits, figure/table consistency, uncertainty calibration, grant-fit review, limitations disclosure, and supplement readiness.
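
The evidence-support check and the blocker/warning summary listed above can be sketched roughly as follows. Only `status`, `summary.blockers`, and `summary.warnings` mirror fields that `demo.js` reads from the result; the function name `checkFindingSupport`, the input field names (`evidenceAnchors`, `citations`, `id`), the reason strings, and the status values are assumptions made for illustration.

```javascript
'use strict';

// Hypothetical input shape; field names are illustrative, not the module's schema.
function checkFindingSupport(findings, artifacts, citationCorpus) {
  const artifactIds = new Set(artifacts.map((artifact) => artifact.id));
  const citationIds = new Set(citationCorpus.map((entry) => entry.id));
  const blockers = [];
  const warnings = [];

  for (const finding of findings) {
    const anchors = finding.evidenceAnchors || [];
    const citations = finding.citations || [];

    // A finding with no anchor at all is flagged but does not block.
    if (anchors.length === 0) {
      warnings.push({ findingId: finding.id, reason: 'missing-evidence-anchor' });
    }
    // An anchor that points at no known artifact blocks the packet.
    for (const anchor of anchors) {
      if (!artifactIds.has(anchor)) {
        blockers.push({ findingId: finding.id, reason: 'unknown-evidence-anchor', ref: anchor });
      }
    }
    // A citation absent from the corpus is treated as possibly hallucinated.
    for (const citation of citations) {
      if (!citationIds.has(citation)) {
        blockers.push({ findingId: finding.id, reason: 'citation-not-in-corpus', ref: citation });
      }
    }
  }

  // Sort with a plain string comparison so identical input gives identical output.
  const key = (item) => `${item.findingId}|${item.reason}|${item.ref || ''}`;
  const byKey = (a, b) => (key(a) < key(b) ? -1 : key(a) > key(b) ? 1 : 0);
  blockers.sort(byKey);
  warnings.sort(byKey);

  // Status names are assumed; the real module may use different values.
  let status = 'ready';
  if (blockers.length > 0) {
    status = 'blocked';
  } else if (warnings.length > 0) {
    status = 'needs-review';
  }

  return { status, summary: { blockers, warnings } };
}
```

Sorting before returning is one simple way to keep a downstream report deterministic, since the output order no longer depends on the order in which findings were drafted.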
39 changes: 39 additions & 0 deletions research-assistant-prompt-safety-guard/acceptance-notes.md
@@ -0,0 +1,39 @@
# Acceptance Notes

This PR is designed as a narrow, reviewable slice for issue #16 rather than a broad research-assistant clone.

## Distinctness

The scope is prompt-injection and assistant-output safety for AI peer-review packets. It avoids duplicating previous issue #16 submissions around:

- broad assistant suites
- evidence or protocol trace modules
- statistics review
- research-gap replication planning
- rebuttal packs
- ethics/data availability
- citation-context reconciliation
- benchmark leakage
- figure/table consistency
- analysis-variable provenance
- domain review templates
- grant fit
- limitations disclosure
- uncertainty calibration
- supplement readiness

## Verification

Expected local checks:

```bash
node research-assistant-prompt-safety-guard/test.js
node research-assistant-prompt-safety-guard/demo.js
# node --check inspects only the first script argument, so check each file separately
for f in index.js sample-data.js test.js demo.js; do
  node --check "research-assistant-prompt-safety-guard/$f"
done
git diff --check
ffprobe -v error -show_entries format=duration,size -show_entries stream=codec_name,width,height -of default=noprint_wrappers=1 research-assistant-prompt-safety-guard/reports/demo.mp4
```

## Safety

The module uses synthetic data only. It does not execute model output or shell commands found in manuscript text, make network requests, or read credentials.
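
One way to keep manuscript or assistant text inert when it is embedded in generated output is to escape it before writing it into the SVG summary. The helpers below are a minimal sketch under that assumption; `escapeXml` and `svgTextLine` are hypothetical names and are not taken from `index.js`.

```javascript
'use strict';

// Hypothetical helper: replace the five XML special characters with entities.
// The ampersand is handled first so later entities are not double-escaped.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Hypothetical helper: render untrusted text as a single SVG <text> element.
// The y coordinate is coerced to a number so it cannot carry markup.
function svgTextLine(untrusted, y) {
  return `<text x="16" y="${Number(y)}">${escapeXml(untrusted)}</text>`;
}
```

With escaping applied at the point of rendering, a string such as `<script>` from a manuscript is displayed as literal text in the summary rather than interpreted as markup.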
30 changes: 30 additions & 0 deletions research-assistant-prompt-safety-guard/demo.js
@@ -0,0 +1,30 @@
'use strict';

const fs = require('fs');
const path = require('path');
const sampleData = require('./sample-data');
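const {
  evaluateResearchAssistantSafety,
  toMarkdownReport,
  toSvgSummary
} = require('./index');

// Write all demo outputs next to this script, creating the directory if needed.
const reportsDir = path.join(__dirname, 'reports');
fs.mkdirSync(reportsDir, { recursive: true });

// Evaluate the synthetic sample once and reuse the result for every output.
const result = evaluateResearchAssistantSafety(sampleData);

// Machine-readable packet, pretty-printed with a trailing newline.
fs.writeFileSync(
  path.join(reportsDir, 'prompt-safety-packet.json'),
  `${JSON.stringify(result, null, 2)}\n`
);
// Markdown reviewer report.
fs.writeFileSync(
  path.join(reportsDir, 'prompt-safety-report.md'),
  toMarkdownReport(result)
);
// SVG summary.
fs.writeFileSync(
  path.join(reportsDir, 'summary.svg'),
  toSvgSummary(result)
);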

console.log(`status=${result.status}, blockers=${result.summary.blockers.length}, warnings=${result.summary.warnings.length}`);