
feat: add content moderation module #745

Open
Deywumi-debug wants to merge 1 commit into rinafcode:main from Deywumi-debug:feat/content-moderation

Conversation

@Deywumi-debug
Contributor

Summary

Adds a ModerationModule that scans submitted content for inappropriate material before it reaches the platform. Every POST /moderation/check request runs all local checks, plus the OpenAI check when OPENAI_API_KEY is set, and any violation auto-rejects the content.

Changes

| File | Change |
| --- | --- |
| src/modules/moderation/moderation.service.ts | Core moderation logic |
| src/modules/moderation/moderation.controller.ts | POST /moderation/check endpoint |
| src/modules/moderation/moderation.module.ts | Module wiring with HttpModule |
| src/modules/moderation/dto/moderate-content.dto.ts | Input DTO |
| src/modules/moderation/dto/moderation-result.dto.ts | Output DTO |
| src/modules/moderation/moderation.service.spec.ts | 14 unit tests |
| src/app.module.ts | Register module behind ENABLE_MODERATION flag |

Acceptance Criteria

  • OpenAI moderation API — calls POST https://api.openai.com/v1/moderations via HttpService; requires OPENAI_API_KEY env var; skipped gracefully if absent or on network failure
  • Profanity filtering — local regex patterns matching word stems, case-insensitive, no external dependency
  • Spam detection — four signals: 10+ repeated characters, 3+ URLs, known spam phrases, >70% uppercase
  • Auto-reject obvious violations — any triggered flag sets allowed: false, autoRejected: true, and a human-readable reason
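
For reviewers, the four spam signals listed above could look roughly like the sketch below. It is not the code in moderation.service.ts: the function name, the signal labels, the phrase list, and the minimum-letter threshold for the uppercase check are all assumptions made for illustration.

```typescript
// Hypothetical phrase list; the PR does not enumerate its spam phrases.
const SPAM_PHRASES: string[] = ["buy now", "click here", "free money"];

// Assumption: ignore the uppercase ratio for very short inputs such as "OK".
const MIN_LETTERS_FOR_CAPS_CHECK = 10;

// Returns the names of the spam signals triggered by the content.
function detectSpamSignals(content: string): string[] {
  const signals: string[] = [];

  // Signal 1: the same character repeated 10 or more times in a row.
  if (/(.)\1{9,}/.test(content)) {
    signals.push("repeated-characters");
  }

  // Signal 2: three or more URLs in one submission.
  const urls = content.match(/https?:\/\/\S+/gi) ?? [];
  if (urls.length >= 3) {
    signals.push("url-flooding");
  }

  // Signal 3: a known spam phrase, matched case-insensitively.
  const lower = content.toLowerCase();
  if (SPAM_PHRASES.some((phrase) => lower.includes(phrase))) {
    signals.push("spam-phrase");
  }

  // Signal 4: more than 70% of the ASCII letters are uppercase.
  const letters = content.match(/[a-z]/gi) ?? [];
  const upper = content.match(/[A-Z]/g) ?? [];
  if (
    letters.length >= MIN_LETTERS_FOR_CAPS_CHECK &&
    upper.length / letters.length > 0.7
  ) {
    signals.push("excessive-caps");
  }

  return signals;
}
```

Returning the individual signal names, rather than a single boolean, keeps it easy to build the human-readable reason later.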

API

POST /moderation/check
Content-Type: application/json

{ "content": "string (max 10,000 chars)" }

Response:

{
  "allowed": false,
  "autoRejected": true,
  "flags": ["spam", "profanity"],
  "reason": "detected as spam; contains profanity"
}
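
One way the response above could be assembled from accumulated flags is sketched here. The type and function names are hypothetical, as is the "openai" flag name and its reason text; only the "spam" and "profanity" flags and their reason strings are taken from the example response.

```typescript
// "spam" and "profanity" appear in the PR; "openai" is an assumed flag name.
type ModerationFlag = "spam" | "profanity" | "openai";

interface ModerationResult {
  allowed: boolean;
  autoRejected: boolean;
  flags: ModerationFlag[];
  reason?: string;
}

// Reason fragments per flag; the "openai" wording is an assumption.
const REASONS: Record<ModerationFlag, string> = {
  spam: "detected as spam",
  profanity: "contains profanity",
  openai: "flagged by OpenAI moderation",
};

// Any flag rejects the content; no flags means it is allowed.
function buildResult(flags: ModerationFlag[]): ModerationResult {
  if (flags.length === 0) {
    return { allowed: true, autoRejected: false, flags: [] };
  }
  return {
    allowed: false,
    autoRejected: true,
    flags,
    reason: flags.map((flag) => REASONS[flag]).join("; "),
  };
}
```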

Configuration

OPENAI_API_KEY=sk-...     # optional — local checks still run without it
ENABLE_MODERATION=true    # feature flag, default: true
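
A minimal sketch of how the optional OPENAI_API_KEY might be handled, assuming the check is skipped without a key and that any request failure counts as "not flagged". The PR uses NestJS HttpService; to stay self-contained, this sketch takes a hypothetical `post` function instead, and `checkWithOpenAi` is an illustrative name rather than the method in the PR.

```typescript
// Hypothetical transport signature standing in for NestJS HttpService.
type PostJson = (
  url: string,
  body: unknown,
  headers: Record<string, string>,
) => Promise<unknown>;

// Only the fields of the moderation response that this sketch reads.
interface OpenAiModerationResponse {
  results?: Array<{ flagged: boolean }>;
}

const OPENAI_MODERATION_URL = "https://api.openai.com/v1/moderations";

// Resolves to true only when OpenAI explicitly flags the content.
// A missing key or any failure resolves to false, so an outage
// cannot cause a false rejection.
async function checkWithOpenAi(
  content: string,
  apiKey: string | undefined,
  post: PostJson,
): Promise<boolean> {
  if (!apiKey) {
    return false; // no key configured: skip the remote check
  }
  try {
    const data = (await post(
      OPENAI_MODERATION_URL,
      { input: content },
      { Authorization: `Bearer ${apiKey}` },
    )) as OpenAiModerationResponse;
    return data.results?.some((result) => result.flagged) ?? false;
  } catch {
    return false; // network or parsing error: degrade gracefully
  }
}
```

Failing open keeps the local checks as the only gate during an outage. Whether that trade-off is acceptable depends on how much the platform relies on the remote check.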

Testing

14 unit tests covering: clean content pass-through, profanity (case-insensitive), spam signals (repeated chars, URL flooding, spam phrases, excessive caps), OpenAI flagged/not-flagged/error/no-key, auto-reject with reason, and multiple flag accumulation.

npm test -- src/modules/moderation/moderation.service.spec.ts --no-coverage

Notes

  • ModerationService is exported from the module so other modules (courses, posts, etc.) can inject it to gate content submission at the service layer.
  • OpenAI failures degrade gracefully — a network error does not cause a false rejection.
  • The module is feature-flagged (ENABLE_MODERATION) and can be disabled without touching application code.

Closes #560 (Add automated content safety scanning)
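
The ENABLE_MODERATION flag mentioned in the notes could be read along these lines. This is a sketch under assumptions: the helper name is hypothetical, and treating only the literal value "false" as disabled is a guess at the parsing rule, not the logic in src/app.module.ts.

```typescript
// Reads the feature flag from an environment-like object.
// Assumption: unset or empty means enabled (the documented default),
// and only the value "false" (any casing) disables the module.
function isModerationEnabled(
  env: Record<string, string | undefined>,
): boolean {
  const raw = env["ENABLE_MODERATION"];
  if (raw === undefined || raw.trim() === "") {
    return true; // default: true
  }
  return raw.trim().toLowerCase() !== "false";
}

// In app.module.ts the result could decide whether ModerationModule is
// added to the imports array, e.g. isModerationEnabled(process.env).
```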

- OpenAI moderation API integration (POST /moderation/check)
- Local profanity filtering with stem-matching regex patterns
- Spam detection (repeated chars, URL flooding, spam phrases, excessive caps)
- Auto-reject on any violation with flags and human-readable reason
- ModerationService exported for use by other modules
- 14 unit tests, all passing
- Gated behind ENABLE_MODERATION feature flag
@RUKAYAT-CODER
Contributor

Kindly resolve the merge conflict and fix the workflow.

