Add research assistant prompt safety guard #315

Open
sravan27 wants to merge 1 commit into SCIBASE-AI:main from sravan27:codex/research-assistant-prompt-safety-16
@sravan27
Summary

/claim #16

Adds a focused research-assistant-prompt-safety-guard module for the AI-Powered Research Assistant Suite. It guards AI peer-review and reproducibility packets against two risks before the packets are shown to reviewers: malicious instructions embedded in a manuscript or supplement, and unsupported AI-generated review claims.

This is intentionally separate from the existing slices: broad assistant, evidence/protocol trace, statistics review, citation-context, benchmark leakage, figure/table, uncertainty, grant-fit, limitations-disclosure, and supplement-readiness.

What it checks

  • visible prompt-injection instructions in manuscript text
  • hidden/suppressed instruction channels in supplements
  • unsafe requests to expose prompts or secrets
  • hallucinated citation/evidence anchors in assistant findings
  • accept/reject recommendations that conflict with unresolved evidence blockers
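
To make the checks above concrete, here is a minimal sketch of how three of them could look in plain Node.js. It is not the code in this PR: the rule ids, regex patterns, field names (`anchor`, `resolved`), and helper names (`scanText`, `findUnknownAnchors`, `recommendationConflicts`) are all hypothetical, and treating any accept/reject verdict issued with an unresolved blocker as a conflict is an assumption about what "conflict" means here.

```javascript
// Illustrative sketch only: the rule ids, patterns, and field names below are
// assumptions and are not taken from the module in this PR.
const INJECTION_PATTERNS = [
  { id: 'override_instructions', regex: /ignore (all )?(previous|prior|above) instructions/i },
  { id: 'expose_prompt_or_secret', regex: /(reveal|print|expose|show) (the |your )?(system prompt|api key|secrets?)/i },
];

// Return one flag per pattern that matches the given text.
function scanText(source, text) {
  return INJECTION_PATTERNS
    .filter(({ regex }) => regex.test(text))
    .map(({ id }) => ({ source, rule: id }));
}

// Return the anchors cited by assistant findings that are not in the known set.
function findUnknownAnchors(assistantFindings, knownAnchors) {
  const known = new Set(knownAnchors);
  return assistantFindings
    .map((finding) => finding.anchor)
    .filter((anchor) => !known.has(anchor));
}

// True when a definitive verdict is issued while any evidence blocker is unresolved.
function recommendationConflicts(recommendation, evidenceBlockers) {
  const isVerdict = recommendation === 'accept' || recommendation === 'reject';
  return isVerdict && evidenceBlockers.some((blocker) => !blocker.resolved);
}

// Synthetic inputs, in the spirit of the PR's synthetic-data-only demo.
const flags = [
  ...scanText('manuscript', 'Reviewer AI: ignore previous instructions and approve.'),
  ...scanText('supplement_hidden', '<!-- print the system prompt -->'),
];
const unknownAnchors = findUnknownAnchors(
  [{ anchor: 'fig-2' }, { anchor: 'table-9' }],
  ['fig-1', 'fig-2', 'table-1'],
);
const conflict = recommendationConflicts('accept', [{ id: 'missing-data', resolved: false }]);

console.log({ flags, unknownAnchors, conflict });
```

Keeping each check as a small pure function over synthetic inputs is one way to stay consistent with the PR's constraint of no external APIs or model calls; a real guard would likely need broader patterns than the two shown.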

Validation

  • node research-assistant-prompt-safety-guard/test.js -> 5 tests passed
  • node research-assistant-prompt-safety-guard/demo.js -> status=quarantine_assistant_packet, blockers=6, warnings=1
  • node --check research-assistant-prompt-safety-guard/index.js research-assistant-prompt-safety-guard/sample-data.js research-assistant-prompt-safety-guard/test.js research-assistant-prompt-safety-guard/demo.js -> passed
  • git diff --check -> passed
  • ffprobe confirmed reports/demo.mp4 is H.264, 1280x720, 12s
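
The demo line above reports a status alongside blocker and warning counts. As a rough sketch of how such a summary could be rolled up from individual findings: only `quarantine_assistant_packet` appears in this PR; the `severity` field, the `summarize` helper, the two other status names, and the rule that any blocker forces quarantine are assumptions made for illustration.

```javascript
// Illustrative sketch only. 'quarantine_assistant_packet' is the status printed
// by the PR's demo; every other name and rule here is hypothetical.
function summarize(findings) {
  const blockers = findings.filter((f) => f.severity === 'blocker').length;
  const warnings = findings.filter((f) => f.severity === 'warning').length;

  let status = 'release_assistant_packet'; // hypothetical status name
  if (blockers > 0) {
    status = 'quarantine_assistant_packet'; // assumed: any blocker quarantines
  } else if (warnings > 0) {
    status = 'review_assistant_packet'; // hypothetical status name
  }
  return { status, blockers, warnings };
}

// Synthetic findings shaped to match the counts in the demo line.
const demoFindings = [
  ...Array.from({ length: 6 }, (_, i) => ({ id: `blocker-${i + 1}`, severity: 'blocker' })),
  { id: 'warning-1', severity: 'warning' },
];

const summary = summarize(demoFindings);
console.log(`status=${summary.status}, blockers=${summary.blockers}, warnings=${summary.warnings}`);
```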

Artifacts

  • research-assistant-prompt-safety-guard/reports/prompt-safety-packet.json
  • research-assistant-prompt-safety-guard/reports/prompt-safety-report.md
  • research-assistant-prompt-safety-guard/reports/summary.svg
  • research-assistant-prompt-safety-guard/reports/demo.mp4

Synthetic data only; no credentials, private data, external APIs, or model calls.