@undead2146 undead2146 commented Jan 14, 2026

Overview

This PR introduces an architectural overhaul and feature expansion for GenHub's content management system. It implements a Universal Content Pipeline and a Creator Publishing System, changing how content is discovered, managed, and installed.

Features

1. Universal Content Pipeline

A standardized 3-tier architecture for consistent content handling:

  • Orchestration: System-wide coordination via IContentOrchestrator.
  • Providers: Source-specific facades (GitHub, ModDB, CNC Labs, AOD Maps, Generals Online).
  • Components: Plug-and-play Discoverers, Resolvers, and Manifest Factories.
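
As a rough illustration of how the tiers fit together, here is a minimal sketch. Only the interface name `IContentDiscoverer` comes from this PR; the member signatures, the `SearchResult` record, `InMemoryDiscoverer`, and `SketchOrchestrator` are assumptions made for the example and are not GenHub's actual API. The provider tier is left out for brevity.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical result shape; the real GenHub types are richer.
public record SearchResult(string Id, string Name, string SourceName);

// Component tier: one discoverer per source (signature assumed).
public interface IContentDiscoverer
{
    string SourceName { get; }
    Task<IReadOnlyList<SearchResult>> DiscoverAsync(string searchTerm);
}

// Stand-in discoverer that filters a fixed list instead of calling a real source.
public sealed class InMemoryDiscoverer : IContentDiscoverer
{
    private readonly IReadOnlyList<string> _names;

    public InMemoryDiscoverer(string sourceName, params string[] names)
    {
        SourceName = sourceName;
        _names = names;
    }

    public string SourceName { get; }

    public Task<IReadOnlyList<SearchResult>> DiscoverAsync(string searchTerm)
    {
        IReadOnlyList<SearchResult> hits = _names
            .Where(n => n.Contains(searchTerm, StringComparison.OrdinalIgnoreCase))
            .Select(n => new SearchResult($"{SourceName}:{n}", n, SourceName))
            .ToList();
        return Task.FromResult(hits);
    }
}

// Orchestration tier: fans a query out to every discoverer and merges the results.
public sealed class SketchOrchestrator
{
    private readonly IReadOnlyList<IContentDiscoverer> _discoverers;

    public SketchOrchestrator(IEnumerable<IContentDiscoverer> discoverers)
        => _discoverers = discoverers.ToList();

    public async Task<IReadOnlyList<SearchResult>> SearchAllAsync(string searchTerm)
    {
        var perSource = await Task.WhenAll(_discoverers.Select(d => d.DiscoverAsync(searchTerm)));
        return perSource.SelectMany(r => r).ToList();
    }
}
```

The point of the sketch is that the orchestrator depends only on the interface, so adding a source means registering one more discoverer rather than changing the coordination code.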

2. Creator Publishing System

Empowers creators to take control of their content delivery:

  • Self-Hosted Catalogs: Creators host a PublisherCatalog (JSON) on their own infrastructure.
  • Subscription Model: Users subscribe to creator URLs to receive real-time updates.
  • Cross-Publisher Referrals: Creators can refer users to other trusted publishers within the catalog.
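
To make the catalog idea concrete, the following is a minimal sketch of reading a self-hosted catalog with `System.Text.Json`. The property names (`PublisherId`, `Content`, `Referrals`, and so on), the `CatalogEntry` record, and the `CatalogReader` helper are hypothetical; the real `PublisherCatalog` schema in this PR may look different.

```csharp
#nullable enable
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical catalog shape, for illustration only.
public record CatalogEntry(string Id, string Name, string Version, string DownloadUrl);

public record PublisherCatalog(
    string PublisherId,
    string PublisherName,
    List<CatalogEntry> Content,
    List<string> Referrals);

public static class CatalogReader
{
    // Accept camelCase JSON keys for the PascalCase record properties.
    private static readonly JsonSerializerOptions Options = new()
    {
        PropertyNameCaseInsensitive = true,
    };

    // Returns null when the JSON document is the literal "null".
    public static PublisherCatalog? Parse(string json)
        => JsonSerializer.Deserialize<PublisherCatalog>(json, Options);
}
```

Under these assumptions, a subscription would fetch the JSON from the creator's URL, pass the text to `CatalogReader.Parse`, and treat each string in `Referrals` as the URL of another publisher's catalog.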

3. Redesigned Downloads Browser

A premium browsing experience for discovery:

  • Publisher Sidebar: Quick navigation between different content sources.
  • Source-Specific Filters: Custom filter panels (e.g., ModDB sections, CNCLabs map types) for precise searching.
  • Interactive Acquisition: Real-time progress tracking for downloads and manifest generation.

4. Advanced Web Discovery

  • Playwright Integration: Robust WAF bypass and automation detection handling for sources like ModDB.
  • Multi-Variant Handling: Support for releases containing multiple games or performance variants (30Hz vs 60Hz).

🛠 Technical Details

  • Core Interfaces: IContentDiscoverer, IContentResolver, IContentDeliverer, IPublisherManifestFactory.
  • Infrastructure: Added PlaywrightService for robust scraping.
  • UI: Redesigned DownloadsBrowserViewModel with advanced filtering logic.

📚 Documentation

New documentation available:

  • Content Pipeline Architecture
  • Creator Publishing Guide
  • Universal Parser Details

Greptile Overview

Greptile Summary

This PR delivers a major architectural transformation introducing a Universal Content Pipeline and Creator Publishing System for GenHub. The implementation standardizes content discovery, resolution, and delivery across multiple sources (GitHub, ModDB, CNC Labs, AOD Maps, Generals Online) through a 3-tier architecture with orchestration, provider-specific facades, and pluggable components.

Key Changes

  • Universal Content Pipeline: standardized IContentDiscoverer, IContentResolver, IContentDeliverer interfaces with source-specific implementations for 7+ providers
  • Creator Publishing System: self-hosted JSON catalog support enabling creators to publish content via subscriptions and cross-publisher referrals
  • Playwright Integration: robust WAF bypass using headless Chromium for scraping ModDB and other protected sources
  • Redesigned Downloads Browser: publisher sidebar navigation, dynamic filter panels (ModDB sections, CNC Labs map types), infinite scroll pagination
  • Rich Web Parsing: comprehensive ModDBPageParser extracting files, videos, images, articles, reviews, and comments from ModDB pages
  • Version Selection: VersionSelector supporting Latest/Stable/Specific policies for content releases
  • Multi-Variant Support: handling releases with multiple game variants (30Hz vs 60Hz) or performance options
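
The Latest/Stable/Specific policies mentioned above could work roughly as follows. This is a sketch under stated assumptions, not the PR's `VersionSelector`: the `VersionPolicy` enum, the `Release` record with its `IsPrerelease` flag, and the `Select` signature are all hypothetical.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

public enum VersionPolicy { Latest, Stable, Specific }

// Hypothetical release shape: a parsed version plus a prerelease flag.
public record Release(Version Version, bool IsPrerelease);

public static class VersionSelectorSketch
{
    // Returns the release matching the policy, or null when nothing qualifies.
    public static Release? Select(
        IEnumerable<Release> releases,
        VersionPolicy policy,
        Version? specific = null)
    {
        // Highest version first, so FirstOrDefault picks the newest match.
        var ordered = releases.OrderByDescending(r => r.Version);
        return policy switch
        {
            VersionPolicy.Latest => ordered.FirstOrDefault(),
            VersionPolicy.Stable => ordered.FirstOrDefault(r => !r.IsPrerelease),
            VersionPolicy.Specific => ordered.FirstOrDefault(r => r.Version == specific),
            _ => throw new ArgumentOutOfRangeException(nameof(policy)),
        };
    }
}
```

In this sketch "Stable" simply means "newest release not flagged as a prerelease"; how the real implementation decides stability is not stated in the PR description.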

Issues Found

Resource disposal issues flagged in previous review threads remain unaddressed:

  • Static _browserLock disposal in PlaywrightService (singleton but incorrect pattern)
  • Static Playwright fields in ModDBDiscoverer never disposed
  • Static HttpClient in ContentDetailViewModel never disposed
  • Instance _fileLock in PublisherSubscriptionStore not disposed

Debug markers [TEMP] present in ModDBResolver, ModDBManifestFactory, CNCLabsMapResolver.

Minor: hardcoded "ModDB" string in ModDBPageParser.cs:610 should use constant.

No dedicated tests for new PublisherCatalog, VersionSelector, or ModDBPageParser classes.

Architecture Quality

The content pipeline architecture is well-designed with clear separation of concerns. The 3-tier pattern (Orchestration → Providers → Components) provides excellent extensibility. Documentation is comprehensive with flowcharts and detailed guides. The creator publishing system enables decentralized content distribution which aligns with the project's community-driven nature.

Confidence Score: 3/5

  • safe to merge with known resource disposal issues that should be addressed in follow-up
  • architecture is solid and well-documented, but multiple resource disposal issues (static fields, semaphores, HttpClient) exist from previous review that remain unaddressed; debug markers [TEMP] need cleanup; missing test coverage for core new features (PublisherCatalog, VersionSelector, ModDBPageParser); no critical logic errors detected
  • pay attention to PlaywrightService.cs, ModDBDiscoverer.cs, ContentDetailViewModel.cs, and PublisherSubscriptionStore.cs for resource disposal issues

Important Files Changed

| Filename | Overview |
| --- | --- |
| GenHub/GenHub/Features/Content/Services/Parsers/ModDBPageParser.cs | new 894-line parser for ModDB pages extracting files, videos, images, articles, reviews, comments; uses AngleSharp + Playwright; one minor hardcoded string |
| GenHub/GenHub/Features/Downloads/ViewModels/DownloadsBrowserViewModel.cs | new 640-line downloads browser with publisher sidebar, filter panels, pagination, and parallel discovery across providers |
| GenHub/GenHub/Features/Content/Services/Tools/PlaywrightService.cs | new 243-line Playwright service for WAF bypass; has disposal issue with static _browserLock (already flagged) |
| GenHub/GenHub/Features/Content/Services/ContentDiscoverers/ModDBDiscoverer.cs | refactored ModDB discoverer using Playwright; static Playwright fields never disposed (already flagged); verbose debug logs |
| GenHub/GenHub/Features/Downloads/ViewModels/ContentDetailViewModel.cs | new 556-line detail view with download, install, rich metadata display; static HttpClient disposal issue (already flagged); outdated TODO comment (already flagged) |
| GenHub/GenHub/Features/Content/Services/Catalog/PublisherSubscriptionStore.cs | new 279-line subscription store for creator catalogs; _fileLock semaphore not disposed (already flagged) |
| GenHub/GenHub/Infrastructure/DependencyInjection/ContentPipelineModule.cs | refactored DI module registering all content pipelines, discoverers, resolvers, deliverers; PlaywrightService registered as singleton |

Sequence Diagram

sequenceDiagram
    participant User
    participant UI as DownloadsBrowserViewModel
    participant Orch as ContentOrchestrator
    participant Disc as IContentDiscoverer<br/>(ModDB/GitHub/etc)
    participant PW as PlaywrightService
    participant Res as IContentResolver
    participant Parser as ModDBPageParser
    participant MF as ManifestFactory
    participant Pool as ContentManifestPool
    participant DL as DownloadService

    User->>UI: Select Publisher + Apply Filters
    UI->>Disc: DiscoverAsync(ContentSearchQuery)
    
    alt ModDB with WAF
        Disc->>PW: FetchAndParseAsync(url)
        PW->>PW: Launch Chromium Browser
        PW-->>Disc: IDocument (AngleSharp)
    end
    
    Disc->>Parser: ParseAsync(url, html)
    Parser-->>Disc: ParsedWebPage (files, images, articles)
    Disc-->>UI: ContentDiscoveryResult (SearchResults)
    
    User->>UI: Click Download on Item
    UI->>Res: ResolveAsync(SearchResult)
    Res->>Parser: ParseAsync(detailUrl)
    Parser-->>Res: ParsedWebPage with file metadata
    Res->>MF: CreateManifestAsync(ParsedWebPage)
    MF-->>Res: ContentManifest
    Res-->>UI: OperationResult<ContentManifest>
    
    UI->>DL: DownloadFileAsync(url, progress)
    DL-->>UI: Download Progress Updates
    
    UI->>Pool: AddManifestAsync(manifest, sourceDir)
    Pool->>Pool: Store in CAS + Create Reference
    Pool-->>UI: Success
    
    UI-->>User: Download Complete

Context used (3)

  • Rule from dashboard - What: All compiler warnings and linter warnings across the entire codebase must be resolved before m... (source)
  • Context from dashboard - Use dedicated constants classes instead of hardcoding constants string, integers or variables in ser... (source)
  • Context from dashboard - Coding style used in the application which PullRequests and coding style has to be applied to. (source)

@undead2146 undead2146 changed the base branch from main to development January 14, 2026 17:15
@undead2146 undead2146 changed the title from "UI/downloads" to "feat(content-pipeline): implement content pipeline and creator publishing system" Jan 14, 2026
@community-outpost community-outpost deleted a comment from greptile-apps bot Jan 14, 2026
greptile-apps[bot]

This comment was marked as resolved.

…tor publishing system

- Standardize content discovery, resolution, and acquisition across multiple sources (GitHub, ModDB, CNCLabs, AODMaps).
- Introduce Creator Publishing system with self-hosted JSON catalogs and subscriptions.
- Redesign Downloads browser with publisher-specific filters and sidebar navigation.
- Implement Playwright-based scraping for robust web discovery.
- Extensive documentation and architectural flowcharts for the new system.
greptile-apps[bot]

This comment was marked as resolved.

@undead2146

@greptile Review this PR regarding the overall infrastructure and implementation.

@greptile-apps greptile-apps bot left a comment

105 files reviewed, 1 comment

private readonly ILogger<PublisherSubscriptionStore> _logger;
private readonly IConfigurationProviderService _configurationProvider;
private readonly string _subscriptionsFilePath;
private readonly SemaphoreSlim _fileLock = new(1, 1);
style: instance-level _fileLock SemaphoreSlim is never disposed

The _fileLock semaphore is created as an instance field, but PublisherSubscriptionStore implements neither IDisposable nor IAsyncDisposable, so the semaphore is never explicitly disposed. Because the class is registered as a singleton (ContentPipelineModule.cs:127), the lock lives for the application lifetime and the practical impact is small, but implementing disposal is still best practice.

Suggested change
private readonly SemaphoreSlim _fileLock = new(1, 1);
private readonly SemaphoreSlim _fileLock = new(1, 1);

Add IAsyncDisposable to the class and dispose the lock:

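public class PublisherSubscriptionStore : IPublisherSubscriptionStore, IAsyncDisposable
{
    // ... existing code ...

    // Not marked async: nothing is awaited here, and an async method
    // without an await would raise compiler warning CS1998.
    public ValueTask DisposeAsync()
    {
        _fileLock.Dispose();
        GC.SuppressFinalize(this);
        return ValueTask.CompletedTask;
    }
}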
Path: GenHub/GenHub/Features/Content/Services/Catalog/PublisherSubscriptionStore.cs
Line: 27:27


@undead2146 undead2146 closed this Jan 24, 2026
@undead2146 undead2146 reopened this Jan 24, 2026
@greptile-apps greptile-apps bot left a comment

1 file reviewed, 1 comment

}

/// <inheritdoc />
public string ParserId => "ModDB";
hardcoded "ModDB" string should use ModDBConstants.PublisherType or define a parser ID constant

Per constants.md style guide, string literals should be defined in dedicated constants classes.

Context Used: Context from dashboard - Use dedicated constants classes instead of hardcoding constants string, integers or variables in ser... (source)

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
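
One way to address the comment above is sketched below. The class name `ModDBParserConstants` and the cut-down `ModDBPageParserSketch` are hypothetical and exist only for illustration; the review also suggests reusing `ModDBConstants.PublisherType`, whose value is not shown in this conversation.

```csharp
// Hypothetical constants class; the project may prefer reusing ModDBConstants instead.
public static class ModDBParserConstants
{
    // Single definition of the parser identifier, replacing the inline "ModDB" literal.
    public const string ParserId = "ModDB";
}

// Cut-down stand-in for ModDBPageParser showing only the affected property.
public sealed class ModDBPageParserSketch
{
    public string ParserId => ModDBParserConstants.ParserId;
}
```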

Path: GenHub/GenHub/Features/Content/Services/Parsers/ModDBPageParser.cs
Line: 610:610

Comment:
hardcoded `"ModDB"` string should use `ModDBConstants.PublisherType` or define a parser ID constant

Per constants.md style guide, string literals should be defined in dedicated constants classes.

**Context Used:** Context from `dashboard` - Use dedicated constants classes instead of hardcoding constants string, integers or variables in ser... ([source](https://app.greptile.com/review/custom-context?memory=53453b3b-b708-4856-b1b0-0cbc8bfe5330))

<sub>Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!</sub>

How can I resolve this? If you propose a fix, please make it concise.
