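LLM Multi-Agent Web Penetration Testing Automation Framework

A web security assessment automation system built on the Claude Agent SDK, in which five agents collaborate

One-Line Project Summary

An automation framework in which five LLM agents collaborate on reconnaissance, planning, candidate verification, judgment, and report writing against authorized web applications such as OWASP Juice Shop and DVWA.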
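
The five-stage flow summarized above can be pictured with a minimal sketch. It assumes each agent is a plain function that reads a shared state dictionary and returns the keys it produced. The stage functions, their return shapes, and `run_pipeline` are hypothetical stand-ins and do not call an LLM; only the stage order and the output names (`recon_output`, `test_plan`, `execution_results`, `findings`, `report`) come from this README.

```python
from typing import Callable

# Hypothetical stage signature: read the shared state, return new keys.
Stage = Callable[[dict], dict]


def recon(state: dict) -> dict:
    # Stub: a real Recon agent would collect pages, endpoints, and forms.
    return {"recon_output": {"target": state["target"], "endpoints": ["/login"]}}


def planner(state: dict) -> dict:
    # Stub: map each discovered endpoint to a placeholder test item.
    endpoints = state["recon_output"]["endpoints"]
    return {"test_plan": [{"endpoint": e, "test_id": "placeholder"} for e in endpoints]}


def executor(state: dict) -> dict:
    # Stub: no request is sent; evidence stays empty.
    return {"execution_results": [{"plan": p, "evidence": None} for p in state["test_plan"]]}


def verifier(state: dict) -> dict:
    # Findings are candidates, never confirmed vulnerabilities.
    return {
        "findings": [
            {"plan": r["plan"], "status": "candidate"}
            for r in state["execution_results"]
        ]
    }


def reporter(state: dict) -> dict:
    return {"report": f"{len(state['findings'])} candidate finding(s) for {state['target']}"}


PIPELINE: list[Stage] = [recon, planner, executor, verifier, reporter]


def run_pipeline(target: str) -> dict:
    """Run the five stages in order, accumulating outputs in one state dict."""
    state: dict = {"target": target}
    for stage in PIPELINE:
        state.update(stage(state))
    return state
```

Keeping every stage behind the same signature is one way to let team members replace a stub with a real agent without touching the orchestration loop.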

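Key Differentiators

This project does not trust the LLM unconditionally. In the current MVP skeleton, the orchestrator applies evaluate_pretooluse before each execution point to enforce safety decisions; SDK hook wiring will be integrated in the next stage.

Blocks out-of-scope URLs and search queries (WebFetch, WebSearch, and Playwright navigate are all checked)
Blocks destructive Bash commands (rm, dd, shutdown, mkfs, etc.)
Enforces the catalog's forbidden_payloads at the code level (a second layer of defense alongside the LLM prompt)
Applies a per-agent HTTP method whitelist (Recon is limited to GET/HEAD, with an exception only for the login POST)
Allows Write paths only under output/
Blocks the next call before it is made once the cost cap is exceeded
Records block and allow decisions as an audit log in output/run-log.md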
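
A few of the rules above can be illustrated with a small sketch. This is not the project's evaluate_pretooluse implementation, whose signature this README does not show. The function names, the argument keys (`url`, `command`, `path`), the allowed host, and the regular expression are all assumptions chosen for illustration.

```python
import re
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Illustrative values only; real rules would come from the policy and safety modules.
ALLOWED_HOSTS = {"localhost:3000"}
DESTRUCTIVE = re.compile(r"(^|[\s;&|])(rm|dd|shutdown|mkfs(\.\w+)?)(\s|$)")
RECON_METHODS = {"GET", "HEAD"}


def check_url(url: str) -> bool:
    """Allow only URLs whose host:port is in the approved scope."""
    return urlparse(url).netloc in ALLOWED_HOSTS


def check_bash(command: str) -> bool:
    """Reject commands that invoke a known destructive program."""
    return DESTRUCTIVE.search(command) is None


def check_write_path(path: str) -> bool:
    """Allow only relative paths under output/ with no parent traversal."""
    parts = PurePosixPath(path).parts
    return len(parts) > 1 and parts[0] == "output" and ".." not in parts


def check_recon_method(method: str, is_login: bool = False) -> bool:
    """Recon may use GET/HEAD, plus POST only for the login request."""
    m = method.upper()
    return m in RECON_METHODS or (m == "POST" and is_login)


def evaluate_sketch(tool: str, args: dict) -> tuple[str, str]:
    """Return ("allow" | "block", reason) for one proposed tool call."""
    if tool in {"WebFetch", "navigate"} and not check_url(args["url"]):
        return "block", "out-of-scope URL"
    if tool == "Bash" and not check_bash(args["command"]):
        return "block", "destructive command"
    if tool == "Write" and not check_write_path(args["path"]):
        return "block", "write outside output/"
    return "allow", "passed all checks"
```

Because these checks run in code before the tool call, they hold even when the prompt-level instructions are ignored by the model. A keyword pattern like this one is easy to bypass, so treat it as a starting point only.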

See 05-safety-filter.md for details. These safety mechanisms are the project's strongest selling point.
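
The cost cap and audit log rules could look roughly like the sketch below. `CostGuard`, its method names, the USD unit, and the log line format are hypothetical; the README only states that a call is blocked before it is made once the cap is exceeded and that decisions are written to output/run-log.md.

```python
from datetime import datetime, timezone
from pathlib import Path


class CostGuard:
    """Tracks cumulative spend and gates the next call (hypothetical helper)."""

    def __init__(self, cap_usd: float, log_path: Path) -> None:
        self.cap_usd = cap_usd
        self.spent_usd = 0.0
        self.log_path = log_path

    def record(self, cost_usd: float) -> None:
        """Add the cost of a completed call to the running total."""
        self.spent_usd += cost_usd

    def allow_next_call(self, agent: str) -> bool:
        """Decide before the call is made, and log the decision either way."""
        allowed = self.spent_usd <= self.cap_usd
        decision = "allow" if allowed else "block"
        self._log(f"{decision} agent={agent} spent={self.spent_usd:.2f} cap={self.cap_usd:.2f}")
        return allowed

    def _log(self, message: str) -> None:
        # Append-only log so earlier decisions are never overwritten.
        self.log_path.parent.mkdir(parents=True, exist_ok=True)
        stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
        with self.log_path.open("a", encoding="utf-8") as fh:
            fh.write(f"- {stamp} {message}\n")
```

Checking the budget before the call rather than after it means an over-budget run stops without incurring one more charge.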

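Quick Overview

Purpose: Automate the web penetration testing procedure with LLM multi-agents
Track: Completion project for the K-Shield Junior penetration testing and vulnerability assessment track
Period: April 27 to May 13, 2026 (17 days)
First demo: April 29, 2026 (Wednesday meeting)
MVP gate: May 7, 2026 (D11)
Presentation and submission: May 13, 2026 (Wed, D17)
Team: 4 members
Tech stack: Claude Agent SDK + Playwright MCP + Python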

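Documentation

00-glossary.md: Glossary (definitions of core terms such as finding, verdict, and candidate; wording principles)
01-project-overview.md: Overall project plan
02-architecture.md: System structure and data flow
03-multi-agent-pattern.md: Analysis of multi-agent patterns
04-agent-design.md: Detailed design of the five agents
05-safety-filter.md: Rules for blocking dangerous commands
06-test-catalog.md: Verifiable test items
07-development-guide.md: Development environment and SDK usage
08-mvp-roadmap.md: Schedule and role assignment
09-collaboration-playbook.md: Collaboration kickoff procedure and parallel development rules
10-consistency-audit.md: Pre-release audit report on consistency between docs and code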

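Role Assignment Proposal (Team-Agreed)

Role	Core responsibilities	Key deliverables	Docs to read first
Orchestrator/Integration lead	Build the end-to-end pipeline execution flow; connect agent inputs and outputs; handle cost, error, and execution logs	orchestrator, execution log, cost log	01, 02, 04, 07, 09
Recon/Executor lead	Collect pages, endpoints, forms, and parameters with Playwright; run limited verification within the policy-approved scope; store evidence	recon_output, execution_result, evidence	02, 04, 05, 06, 07
Planner/Verifier lead	Generate test_plan from Recon results and catalog mapping; analyze Executor results and assess confidence	test_plan, verification_result, findings	02, 04, 06, 07
Safety/Reporter/QA/Docs lead	Define policy rules, masking, and blocking conditions; test the Safety Filter; generate report drafts; maintain documentation consistency	policy, test results, report, doc revisions	01, 04, 05, 07, 08, 09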

Current Scaffolding Structure

multi-agent-pentest-framework/
├── docs/                         # Design and development docs
├── orchestrator.py               # Main entry point
├── agents/                       # Stubs for the five agents
│   ├── recon.py
│   ├── planner.py
│   ├── executor.py
│   ├── verifier.py
│   └── reporter.py
├── prompts/                      # Agent system prompts
├── safety/                       # Safety Filter module
│   ├── filter.py
│   ├── patterns.py
│   ├── rate_limiter.py
│   └── masking.py
├── catalog/                      # Test catalog
│   └── tests.yaml
├── mappings/                     # Mapping to KISA critical infrastructure guide (주통기) items
│   └── kisa_mapping.yaml
├── targets/                      # Target definitions
│   ├── juice-shop.yaml
│   └── dvwa.yaml
├── policies/                     # Authorized scope and policies
│   └── default-policy.yaml
├── utils/                        # Shared utilities
│   ├── config.py
│   ├── contracts.py
│   ├── logging.py
│   ├── cost_tracker.py
│   └── subagent.py
├── scripts/                      # Helper scripts for running
│   ├── hello_agent.py
│   ├── check_baseline.py
│   └── check_smoke.py
├── tests/                        # Contract tests and fixtures
│   ├── test_agent_contracts.py
│   └── fixtures/danger_cases.yaml
├── CONTRIBUTING.md               # Collaboration rules
├── requirements.txt
├── pyproject.toml
├── Makefile
├── .env.example
└── output/                       # Run results
    ├── recon_output.json
    ├── test_plan.json
    ├── findings.json
    ├── run-log.md
    ├── report.md
    ├── execution_results/
    └── evidence/

Getting Started with Team Collaboration

# 1) Virtual environment + dependencies
make setup

# 2) Create the environment variable file
cp .env.example .env

# 3) Check the collaboration baseline
make baseline
make test

# 4) Run the execution smoke check (optional)
make smoke

# 5) Run the skeleton
.venv/bin/python orchestrator.py \
  --target targets/juice-shop.yaml \
  --policy policies/default-policy.yaml \
  --mapping mappings/kisa_mapping.yaml

Collaboration rules: CONTRIBUTING.md
Parallel development playbook: 09-collaboration-playbook.md
Contract baseline check (deterministic): scripts/check_baseline.py (make baseline)
Execution smoke check (environment-dependent): scripts/check_smoke.py (make smoke)
Structure baseline: tests/test_repository_structure.py is included in make test and checks for missing required directories and files
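
A structure check of the kind described in the last line could be sketched as follows. This is not the content of tests/test_repository_structure.py; `REQUIRED_PATHS` is a partial, illustrative list taken from the scaffolding tree, and `missing_paths` is a hypothetical helper.

```python
from pathlib import Path

# Illustrative subset of the scaffolding tree; the real test may check more.
REQUIRED_PATHS = [
    "agents",
    "safety",
    "catalog/tests.yaml",
    "policies/default-policy.yaml",
    "orchestrator.py",
]


def missing_paths(root: Path, required: list[str] = REQUIRED_PATHS) -> list[str]:
    """Return the required paths that do not exist under root, in list order."""
    return [rel for rel in required if not (root / rel).exists()]
```

A pytest case would then assert that `missing_paths(repo_root)` returns an empty list, so a deleted directory shows up by name in the failure message.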

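Limitations and Disclaimer

This tool is developed for learning and portfolio purposes.
Automatically generated findings are candidates, not confirmed vulnerabilities; final confirmation requires analyst review.
This tool must be used only in authorized test environments; unauthorized testing is prohibited.
For detailed limitations, see the last section of 01-project-overview.md.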
