feat: archive build/deploy logs to MinIO for post-eviction retrieval #119
Open
vigneshrajsb wants to merge 4 commits into main
Adds a pino formatters.level option so logs include string severity labels (e.g. "level":"info") rather than numeric codes (e.g. "level":30). This fixes log severity mapping in Groundcover.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
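A minimal sketch of the kind of formatter this commit describes. pino's `formatters.level` hook receives the level label and its numeric value and returns the object merged into the log line. The helper name `levelAsLabel` is hypothetical, and the pino wiring is shown only as a comment (an assumption, not taken from the diff) so the snippet runs without the dependency.

```typescript
// Hypothetical helper name; the (label, number) => object shape follows
// pino's `formatters.level` hook.
function levelAsLabel(label: string, _levelNumber: number): { level: string } {
  // Emit the string label ("info") in place of the numeric code (30).
  return { level: label };
}

// Assumed wiring, not taken from the PR diff:
//   import pino from 'pino';
//   const logger = pino({ formatters: { level: levelAsLabel } });
```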
Build and deploy job logs are permanently lost once k8s Job pods are
evicted or TTL-expired (~24h). This adds MinIO as an optional in-cluster
object store to archive logs at completion time, serving them back to the
UI even after the live pods are gone.
## New files
- src/server/lib/objectStore/s3Client.ts
  MinIO client singleton configured via MINIO_* env vars
- src/server/services/logArchival.ts
  LogArchivalService with archiveLogs, getArchivedLogs, getArchivedMetadata,
  listArchivedJobs, ensureBucket, configureRetention
- src/server/services/types/logArchival.ts
  ArchivedJobMetadata interface
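As an illustration of configuring a client from MINIO_* env vars, the sketch below reads the six variables named in this commit and falls back to defaults. The `MinioConfig` shape, the `loadMinioConfig` name, and every default value are assumptions; the commit only says the defaults are safe, not what they are.

```typescript
// Hypothetical config shape. The env var names come from the commit message;
// the default values below are illustrative assumptions.
interface MinioConfig {
  endPoint: string;
  port: number;
  accessKey: string;
  secretKey: string;
  bucket: string;
  useSSL: boolean;
}

type EnvVars = Record<string, string | undefined>;

function loadMinioConfig(env: EnvVars): MinioConfig {
  const parsedPort = parseInt(env['MINIO_PORT'] ?? '', 10);
  return {
    endPoint: env['MINIO_ENDPOINT'] ?? 'localhost',
    // Fall back when the variable is unset or not a number.
    port: isNaN(parsedPort) ? 9000 : parsedPort,
    accessKey: env['MINIO_ACCESS_KEY'] ?? '',
    secretKey: env['MINIO_SECRET_KEY'] ?? '',
    bucket: env['MINIO_BUCKET'] ?? 'lifecycle-logs',
    // Only the literal string "true" turns SSL on.
    useSSL: env['MINIO_USE_SSL'] === 'true',
  };
}
```

In a Node process this would typically be called as `loadMinioConfig(process.env)`; taking the environment as a parameter keeps the function easy to test.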
## Modified files
- src/shared/config.ts / next.config.js
  Export MINIO_ENDPOINT, MINIO_PORT, MINIO_ACCESS_KEY, MINIO_SECRET_KEY,
  MINIO_BUCKET, MINIO_USE_SSL (all with safe defaults)
- src/server/services/types/globalConfig.ts
  Add logArchival?: { enabled: boolean; retentionDays: number } to GlobalConfig
- src/server/services/types/logStreaming.ts
  Add 'Archived' to status union; add archivedLogs?: string field
- src/server/lib/nativeBuild/engines.ts
  After waitForJobAndGetLogs(), archive logs when logArchival.enabled=true
  Both success and error paths are covered
- src/server/lib/nativeHelm/helm.ts
  Same pattern for native Helm deploy jobs
- src/server/lib/kubernetes/getNativeBuildJobs.ts
  Merge archived build jobs (not present in live k8s) into the listing
  Add source?: 'live' | 'archived' field to BuildJobInfo
- src/server/lib/kubernetes/getDeploymentJobs.ts
  Same for deploy jobs / DeploymentJobInfo
- src/server/services/logStreaming.ts
  When k8s returns NotFound, attempt archived log lookup before returning
  NotFound. Returns status='Archived' with archivedLogs when found.
- helm/web-app/Chart.yaml + helm/environments/local/lifecycle.yaml
  Add minio subchart dependency (disabled by default in local values)
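The logStreaming change in the list above can be sketched as a small fallback function: the archive is consulted only when the live lookup reports NotFound. The type and function names are hypothetical, the 'Active' status member is a placeholder for whatever the real union contains, and the archive lookup is modeled as a synchronous callback for brevity (a real object-store read would be asynchronous).

```typescript
// 'NotFound' and 'Archived' come from the commit message; 'Active' is a
// placeholder for the other members of the real status union.
type LogStatus = 'Active' | 'NotFound' | 'Archived';

interface LogStreamResponse {
  status: LogStatus;
  archivedLogs?: string;
}

// Hypothetical helper: wraps a live k8s result with an archive fallback.
function withArchiveFallback(
  live: LogStreamResponse,
  findArchivedLogs: () => string | undefined,
): LogStreamResponse {
  // Any live result other than NotFound is returned untouched.
  if (live.status !== 'NotFound') return live;

  // Only now is the archive consulted.
  const archivedLogs = findArchivedLogs();
  return archivedLogs === undefined ? live : { status: 'Archived', archivedLogs };
}
```

Passing the lookup as a callback means the object store is never queried for jobs that are still visible in the cluster.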
## Storage schema
lifecycle-logs/
  {namespace}/{jobType}/{serviceName}/{jobName}/
    logs.txt      - full log content
    metadata.json - job info (status, duration, sha, engine, timestamps)
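The layout above maps naturally onto object keys. The sketch below builds them from the four path segments; it assumes `lifecycle-logs` is the bucket name (so it is not part of the key) and that `{jobType}` is either `build` or `deploy`. All type and function names are hypothetical.

```typescript
// Assumption: {jobType} takes one of these two values.
type ArchiveJobType = 'build' | 'deploy';

interface ArchiveLocation {
  namespace: string;
  jobType: ArchiveJobType;
  serviceName: string;
  jobName: string;
}

// {namespace}/{jobType}/{serviceName}/{jobName}/
function archivePrefix(loc: ArchiveLocation): string {
  return [loc.namespace, loc.jobType, loc.serviceName, loc.jobName].join('/') + '/';
}

// Full log content for one job.
function logsKey(loc: ArchiveLocation): string {
  return archivePrefix(loc) + 'logs.txt';
}

// Job info (status, duration, sha, engine, timestamps) for one job.
function metadataKey(loc: ArchiveLocation): string {
  return archivePrefix(loc) + 'metadata.json';
}
```

Because every key for a job shares one prefix, listing archived jobs for a service reduces to a prefix listing under `{namespace}/{jobType}/{serviceName}/`.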
## Enabling
All archival ops are gated on globalConfig.logArchival.enabled.
Insert into global_config to activate:
{ "logArchival": { "enabled": true, "retentionDays": 14 } }
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
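A sketch of how the enable flag might gate an archival call. The `logArchival` shape matches the commit message; `archiveIfEnabled`, the `Archiver` signature, and the `warn` callback are hypothetical. The catch block reflects the PR's stated intent that archival failures are logged as warnings and never fail the build/deploy flow.

```typescript
interface LogArchivalConfig {
  enabled: boolean;
  retentionDays: number;
}

interface GlobalConfig {
  logArchival?: LogArchivalConfig;
}

// Hypothetical archiver signature; the real service method may differ.
type Archiver = (jobName: string, logs: string) => Promise<void>;

async function archiveIfEnabled(
  config: GlobalConfig,
  jobName: string,
  logs: string,
  archive: Archiver,
  warn: (message: string) => void,
): Promise<boolean> {
  // Gate: nothing is archived unless the flag is explicitly true.
  if (config.logArchival?.enabled !== true) return false;

  try {
    await archive(jobName, logs);
    return true;
  } catch (err) {
    // Non-blocking: report the failure and let the caller carry on.
    const reason = err instanceof Error ? err.message : String(err);
    warn(`log archival failed for ${jobName}: ${reason}`);
    return false;
  }
}
```

The boolean return value is a convenience for callers and tests; a caller that only cares about the build result can ignore it.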
- Fix JobMonitor log ordering: wait for job completion before fetching logs so the full output is captured rather than a mid-run snapshot
- Add startedAt/completedAt/duration to JobMonitor.getJobStatus via kubectl job JSON, thread timing through engines.ts and helm.ts so archived metadata has accurate timestamps
- Upgrade live k8s jobs with no pod to source='archived' when an archive exists in MinIO, so they remain selectable in the UI
- Extend logStreaming archived fallback to also trigger when the k8s job exists but its pod has been cleaned up (!podInfo.podName)
- Add source field to NativeBuildJobInfo OpenAPI schema
- Add MinIO helm_resource to Tiltfile; remove erroneous minio subchart dependency from helm/web-app/Chart.yaml
- Add ALLOWED_ORIGINS to local lifecycle.yaml
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
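The listing behavior described in this commit (upgrade pod-less live jobs that have an archive, then append archive-only jobs) could look roughly like the sketch below. `JobListing` and `mergeJobListings` are hypothetical names, and deduplication is keyed on `jobName`, as the PR description states.

```typescript
interface JobListing {
  jobName: string;
  podName?: string;
  source?: 'live' | 'archived';
}

function mergeJobListings(live: JobListing[], archived: JobListing[]): JobListing[] {
  // Prototype-free lookup tables so any job name is a safe key.
  const hasArchive = Object.create(null) as Record<string, boolean>;
  for (const job of archived) hasArchive[job.jobName] = true;

  const seen = Object.create(null) as Record<string, boolean>;
  const merged: JobListing[] = [];

  for (const job of live) {
    seen[job.jobName] = true;
    // A live job whose pod is gone is served from the archive when one exists.
    const source = !job.podName && hasArchive[job.jobName] ? 'archived' : 'live';
    merged.push({ ...job, source });
  }

  for (const job of archived) {
    // Deduplicate by jobName: skip anything already listed from k8s.
    if (seen[job.jobName]) continue;
    seen[job.jobName] = true;
    merged.push({ ...job, source: 'archived' });
  }

  return merged;
}
```

Live entries come first, so a job that is still completing keeps its position and appears exactly once.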
## Problem
Build and deploy job logs are permanently lost once k8s Job pods are evicted or TTL-expired (~24h):
- […] (`getNativeBuildJobs`, `getDeploymentJobs`)
- […] `NotFound` state with no recovery path
## Solution
Add MinIO as an optional in-cluster S3-compatible object store. Logs are archived at job completion time and served back transparently: the UI sees a new `Archived` status instead of `NotFound`.
## Architecture
## Changes
### New files
- `src/server/lib/objectStore/s3Client.ts`
- `src/server/services/logArchival.ts`
- `src/server/services/types/logArchival.ts`
### Modified files
- `src/shared/config.ts`
- `next.config.js`
- `src/server/services/types/globalConfig.ts`: add `logArchival?: { enabled, retentionDays }`
- `src/server/services/types/logStreaming.ts`: add `'Archived'` status; add `archivedLogs?` field
- `src/server/lib/nativeBuild/engines.ts`
- `src/server/lib/nativeHelm/helm.ts`
- `src/server/lib/kubernetes/getNativeBuildJobs.ts`: add `source` field to `BuildJobInfo`
- `src/server/lib/kubernetes/getDeploymentJobs.ts`: add `source` field to `DeploymentJobInfo`
- `src/server/services/logStreaming.ts`
- `helm/web-app/Chart.yaml`
- `helm/environments/local/lifecycle.yaml`: `minio:` config section
## Key design decisions
- Feature-gated: all MinIO calls check `globalConfig.logArchival?.enabled`. Enabling the infra (MinIO pod) is safe; nothing archives until the flag is set in the DB.
- Non-blocking: archival failures are caught and logged as warnings, so they never fail the build/deploy flow.
- Deduplication: merged archived jobs are deduplicated by `jobName` against live k8s results, so a completing job never appears twice.
## Enabling
`global_config`: `{ "logArchival": { "enabled": true, "retentionDays": 14 } }`
## Related PRs
## Test plan
- `pnpm lint` passes ✅
- `pnpm ts-check`: no new errors (3 pre-existing in `engines.ts`) ✅
- `pnpm test`: 951/951 pass ✅
- `logArchival.enabled=false` (default): system behavior identical to before, no MinIO calls
- `logArchival.enabled=true`: trigger a build, verify `logs.txt` + `metadata.json` appear in MinIO bucket
- […] `source='archived'`
- […] `staticContent` (not WebSocket)
🤖 Generated with Claude Code