diff --git a/.github/workflows/weekly-tag-scan.yml b/.github/workflows/weekly-tag-scan.yml index f8f2752..8dbf601 100644 --- a/.github/workflows/weekly-tag-scan.yml +++ b/.github/workflows/weekly-tag-scan.yml @@ -41,9 +41,16 @@ jobs: GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} run: node scripts/export-tags.js + - name: Create App Token + id: app-token + uses: actions/create-github-app-token@v3 + with: + client-id: ${{ vars.APP_CLIENT_ID }} + private-key: ${{ secrets.APP_PRIVATE_KEY }} + - name: Compute new tags and manage PR env: - GH_TOKEN: ${{ secrets.PR_TOKEN }} + GH_TOKEN: ${{ steps.app-token.outputs.token }} REPO_URL: ${{ github.server_url }}/${{ github.repository }} run: | # Collect lines added vs main (new tags). diff --git a/README.md b/README.md index 243c3ec..e7d8774 100644 --- a/README.md +++ b/README.md @@ -1,6 +1,6 @@ # AI & Biodiversity Change Global Center Catalog -Repository for web-based ABC Center code, data, model, and spaces catalog. This catalog is designed to use the GitHub API for searching all code repositories created under the [ABC GitHub Organization](https://github.com/ABC-Center) and the Hugging Face API for searching all dataset, model, and spaces repositories created under the [ABC Hugging Face Organization](https://huggingface.co/ABC-Center). Non-ABC Org GitHub repositories can be manually included through the `ADDITIONAL_REPOS` parameter in the [config file](public/config.yaml). +Repository for web-based ABC Center code, data, model, and spaces catalog. This catalog is designed to use the GitHub API for searching all code repositories created under the [ABC GitHub Organization](https://github.com/ABC-Center) and the Hugging Face API for searching all dataset, model, and spaces repositories created under the [ABC Hugging Face Organization](https://huggingface.co/ABC-Center). 
Non-ABC Org GitHub and Hugging Face repositories can be manually included through the `ADDITIONAL_REPOS` and `ADDITIONAL_HF_REPOS` parameters, respectively, in the [config file](public/config.yaml). This repository was generated and personalized from the [Imageomics Catalog](https://github.com/Imageomics/catalog). The following sections are pulled from the source repo. @@ -8,7 +8,7 @@ This repository was generated and personalized from the [Imageomics Catalog](htt The website is styled using the [tailwindcss](https://tailwindcss.com/) package. -* **Real-time Data Fetching:** Displays all public Imageomics repositories, fetched through the GitHub and Hugging Face APIs. Includes semantically meaningful virtual markers: +* **Real-time Data Fetching:** Displays all public ABC Center repositories, fetched through the GitHub and Hugging Face APIs. Includes semantically meaningful virtual markers: * "New" badge highlights products created within the last 30 days; * "🚀 version-tag" badge indicates a new release within the last 2 weeks for GitHub repos, and links to that release; * Star (⭐️) or like (❤️) counts displayed for GitHub or Hugging Face repos, respectively; @@ -18,7 +18,7 @@ The website is styled using the [tailwindcss](https://tailwindcss.com/) package. * **Sorting:** Sort items by last updated, date created, stars/likes ascending or descending, or alphabetically. * **URL Parameter Support:** Persist and share search states via URL hash (`#type=datasets&q=fish`) or query parameters (`?type=datasets`). Supports `type`, `q` (search query), `sort`, and `tag` parameters. * **Responsive Design:** The layout is optimized for use on computers and mobile devices. -* **Thematic Styling:** Uses Imageomics color scheme for a cohesive look and feel. +* **Thematic Styling:** Uses ABC color scheme for a cohesive look and feel. 
* **Longevity:** This site is run through GitHub Pages, ensuring continued access through GitHub without needing to otherwise provision dedicated infrastructure. ## Project Structure @@ -28,8 +28,7 @@ The site runs based on four primary files: * `public/config.yaml`: Contains all customizable settings including organization names, colors, branding, and API settings. This is the main file to edit for personalization. Placed in `public/` so Vite copies it to `dist/` without bundling, keeping it editable after deployment. * `index.html`: The main HTML file that provides the structure of the webpage and links to the CSS and JavaScript files. Config values are applied dynamically from `config.yaml`. * `style.css`: Custom styling for the application, including color schemes, layout, and animations. Colors are set via CSS custom properties that are populated from `config.yaml`. -* `main.js`: Handles the application's logic, including API calls, data filtering, sorting, and dynamic rendering of the catalog items. - * **Note:** Model API calls do ***not*** return `cardData` unless explicitly fetched *by model*, so there is extra logic required to fetch Model metadata. +* `main.js`: Handles the application's logic, including config loading, API calls, data filtering, sorting, and dynamic rendering of the catalog items. Relies on the build-time Node script `fetch-releases.js` for the version-tag badge feature. Two additional files support the build tooling: @@ -54,6 +53,14 @@ nvm use A `.nvmrc` file is included, so `nvm use` will automatically select the correct version in the project directory. +### Formatting Standard + +**What is needed:** VS Code "Format on Save" enabled with CSS & HTML format enabled or [linter(s)](https://github.com/caramelomartins/awesome-linters) for package languages (JavaScript, HTML, and CSS) with the following settings: + + * **Indent Size:** 4 + * **Wrap Line Length:** 120 + * **Rules:** Remove trailing whitespace and empty tabs. 
+ ## Testing Tests are written with [Vitest](https://vitest.dev/) and run automatically in CI on every pull request to `main`. To run them locally: `npm test` (single run) or `npm run test:watch` (watch mode). diff --git a/docs/app-authentication.md b/docs/app-authentication.md new file mode 100644 index 0000000..a94aae3 --- /dev/null +++ b/docs/app-authentication.md @@ -0,0 +1,69 @@ +# App Authentication + +For this app to run the full [weekly tag scan workflow](../.github/workflows/weekly-tag-scan.yml), it must be allowed to open and modify pull requests. GitHub recommends using [GitHub Apps](https://docs.github.com/en/apps/overview) for authentication, over Personal Access Tokens (PATs). + +## Creating Your Catalog Automation App + +This App is essentially a more secure, drop-in replacement for a PAT. It is used to create a transient token with which the tag scan workflow can create or modify a PR (as per the [tag grouping process](tag-grouping-process.md)). I mostly followed [this guide](https://aembit.io/blog/replacing-a-github-personal-access-token-with-a-github-application/) while "creating" the App (registering it), but will outline the decisions based on [GitHub's instructions for Registering a GitHub App for an Organization](https://docs.github.com/en/apps/creating-github-apps/registering-a-github-app/registering-a-github-app#registering-a-github-app): + +### "Create" a Private App + +> [!NOTE] +> Only Organization Owners can complete this process. + +1. Navigate to `Organization settings > Developer settings > GitHub Apps` + +2. Select "New GitHub App" and give it a descriptive name, e.g., `ORGANIZATION_NAME Catalog Automation App`. + This way it will be unique across GitHub. + +3. Describe the App, for instance: + ``` + App for automated ORGANIZATION_NAME Catalog workflows (weekly tag-scan PR). + ``` + +4. 
Set the Homepage URL to your GitHub Organization (`https://github.com/ORGANIZATION_NAME`) + +Skip all remaining sections until you get to "Permissions": + +5. Expand the **Repository permissions** section and set the required permissions, specifically: + - "Contents": "Read and write" + - "Pull requests": "Read and write" + - "Metadata" is set to "Read-only" by default. Do not touch this. + +6. When asked "Where can this GitHub App be installed?", limit installation to your org by selecting "Only on this account". + +7. Select "Create GitHub App". + +**Congratulations, you've made an App for authentication!** + + The App has no associated code, no webhooks, and is private within the Organization. + +### Install your App + +Follow [GitHub's Instructions to install your App in your Org](https://docs.github.com/en/apps/using-github-apps/installing-your-own-github-app). + +> [!IMPORTANT] +> During installation, be sure to choose "Only select repositories" under "Repository Access", then select your Catalog repository from the dropdown. + +### Authenticate with your App + +Now that the App is installed in your Org with access to your Catalog repository, you need to provide the repository with the means to authenticate your App. For this to work, the App Client ID must be [stored as a variable](https://docs.github.com/en/actions/how-tos/write-workflows/choose-what-workflows-do/use-variables#defining-configuration-variables-for-multiple-workflows) and the App's private key as a [secret](https://docs.github.com/en/actions/how-tos/write-workflows/choose-what-workflows-do/use-secrets?tool=webui#creating-secrets-for-a-repository). +Together that will allow for the [Authentication as an App](https://docs.github.com/en/apps/creating-github-apps/authenticating-with-a-github-app/authenticating-as-a-github-app-installation), so the tag-scan PR is clearly automated. 
+ +#### Get App Identifiers + +Navigate to your GitHub App settings and generate a private key ([GitHub's Instructions](https://docs.github.com/en/apps/creating-github-apps/authenticating-with-a-github-app/managing-private-keys-for-github-apps#generating-private-keys)). This will automatically download a `.pem` file to your computer. ***Keep this secure; do not share it!*** (See also [GitHub's recommendations](https://docs.github.com/en/apps/creating-github-apps/authenticating-with-a-github-app/managing-private-keys-for-github-apps#storing-private-keys) for managing private keys.) + +Before leaving this page, copy your App's "Client ID". + +#### Add Identifiers to Catalog Repo + +In your Catalog repository, navigate to `Settings > Secrets and variables > Actions`. +- Under the "Variables" Tab, create a new **repository variable**: + - Paste your App's Client ID into the "Value" box. + - Name the variable `APP_CLIENT_ID`. +- Under "Secrets", create a new **repository secret**: + - Name it "APP_PRIVATE_KEY". + - Paste your private key in the "Secret" box. Be sure to include both the header (`-----BEGIN RSA PRIVATE KEY-----`) and footer (`-----END RSA PRIVATE KEY-----`). +> [!Important] +> ***Clear your clipboard after!*** diff --git a/docs/personalization.md b/docs/personalization.md new file mode 100644 index 0000000..3a6d161 --- /dev/null +++ b/docs/personalization.md @@ -0,0 +1,89 @@ +# Personalizing Your Catalog + +Welcome to your new catalog repo! The primary way to personalize this catalog is through the `config.yaml` file, which contains all customizable settings. After using the template, you'll need to update the following: + +## Primary Configuration File + +[**`public/config.yaml`**](../public/config.yaml): This is the main file to edit. It contains all configuration options (e.g., organization names, colors, branding, and API settings) with inline comments explaining each setting. 
Replace Imageomics-specific values with those appropriate for your organization and catalog repository. + +### Organization & Repository Settings + + * `ORGANIZATION_NAME`: Your GitHub/Hugging Face organization name (lowercase for API calls) + * `ORG_NAME`: Display name for your organization (can differ from API name); used as fallback site title if `CATALOG_TITLE` is not set + * `CATALOG_REPO_NAME`: Repository name for the catalog itself (used for stats badge) + +### Branding + + * `CATALOG_TITLE`: Page title and main heading + * `CATALOG_DESCRIPTION`: Subtitle/description text displayed under the title + * `LOGO_URL`: URL to your organization's logo image (used in `main.js` line 565) + * `FAVICON_URL`: URL to your favicon image (used in `index.html` line 80) + + For both `LOGO_URL` and `FAVICON_URL`, you can use an external URL, a relative path if the image is in your repo (e.g., `./images/logo.png` or `images/logo.png`), or GitHub's raw URL format (e.g., `https://github.com/username/repo/raw/branch/path/to/image.png`) + +#### Colors + + * `COLORS.primary`: Primary brand color (used for heading) + * `COLORS.secondary`: Secondary brand color (used for borders, GitHub ribbon) + * `COLORS.accent`: Accent color (used for links, focus states, "New" badge) + * `COLORS.accentDark`: Dark mode accent color (used for link hover states in dark mode) + * `COLORS.tag`: Tag background color + +### API & Behavior Settings + + * `PLATFORM`: Coding platform used (default: 'github', Codeberg and GitLab support in development) + * `API_BASE_URL`: Hugging Face API base URL (default: `"https://huggingface.co/api/"`) + * `REFRESH_INTERVAL_DAYS`: Number of days to consider an item "new" (default: `30`) + * `ADDITIONAL_REPOS`: Array of forked or non-org GitHub repositories to include, formatted `/` (non-forks are included by default). Use `[]` if there are none you wish to include + * `ADDITIONAL_HF_REPOS`: Array of Hugging Face repos from outside the org to include. 
Each entry specifies `repo` (`/`) and `type` (`datasets`, `models`, or `spaces`). Use `[]` if there are none you wish to include + +### Typography + + * `FONT_FAMILY`: Font family for the site (default: `"Inter"`) + +After modifying `config.yaml`, refresh your browser to see changes. The color scheme will automatically apply to all UI elements throughout the site. + +## Version and Requirements + +[**`package.json`**](../package.json): Update this file with your information and that of your catalog repository (version and URL). This file will auto-update the `package-lock.json` through `npm install`, and should have the version updated for new releases. + +In this file, update: + +- [ ] name: What is the name of this repository? +- [ ] version*: What software version is *this* catalog (start with 1.0.0)? +- [ ] description: Describe your catalog repo: what is the org it represents? +- [ ] repository URL: URL for the repository hosting your catalog. +- [ ] author: Who is the repo creator? +- [ ] bug URL: Link to the repository issue tracker or other reporting mechanism. +- [ ] homepage URL: Link to repository README. + +*Version will need to be updated each time you release a new version of your catalog. + +### Versioning + +When releasing a new version, be sure to run the following in the repo root, then push the updated `package-lock.json` to your repository. + +```console +npm install +``` + +### Local Preview + +To preview the production build locally, in the repo root, run: + +```console +npm run preview +``` + +Then open the local URL printed by Vite (typically ) in your browser of choice. + +## Setting Up Tag Groups + +Tags from GitHub topics and Hugging Face card metadata are free-form text, so the same concept often appears under multiple spellings (`computer-vision`, `computer vision`, `cv`). Tag groups normalize these into a single canonical tag shown in the filter dropdown, and are configured in `public/tag-groups.js`. 
+ +When first setting up your catalog, run the export script to generate a full list of your organization's current raw tags (saved to `scripts/tag-export.txt`), then use that list to build your initial `tag-groups.js`. A weekly GitHub Actions workflow will automatically open a pull request whenever 5 or more new tags (relative to the last committed baseline in `scripts/tag-export.txt`) are detected, keeping your tag groups up to date over time. + +> [!IMPORTANT] +> **Required token**: The weekly tag scan workflow requires a fine-grained access token with **Pull requests: Read and write** permission on the catalog repo. Follow the instructions in [App Authentication](app-authentication.md) to create and install a private Catalog Automation App for token generation. + +See **[tag-grouping-process.md](tag-grouping-process.md)** for full setup instructions, conventions, and guidance on using AI assistance for the initial grouping pass. diff --git a/docs/tag-grouping-process.md b/docs/tag-grouping-process.md index e1277d6..b69a121 100644 --- a/docs/tag-grouping-process.md +++ b/docs/tag-grouping-process.md @@ -20,7 +20,7 @@ const TAG_GROUPS = { - The **value array** lists every raw API tag that should be normalized to that key. - Raw tags not present in any array pass through unchanged and appear as-is in the UI. - Raw tags that contain a colon (e.g. `license:mit`, `format:parquet`) are automatically - filtered out as Hugging Face system metadata so they never reach the UI. This can be changed in [main.js](../main.js), in the `normalizeTag` function. + filtered out as Hugging Face system metadata so they never reach the UI. This can be changed in [src/normalizeTag.js](../src/normalizeTag.js), by removing the marked option line. - Raw tags are maintained and matched-against for keyword searching and do appear in repo cards. 
--- @@ -71,9 +71,8 @@ the GitHub and Hugging Face APIs, diffs them against the committed baseline in `scripts/tag-export.txt`, and opens (or updates) a pull request titled **`[Tag Scan] New tags detected — review tag-groups.js`** whenever 5 or more new tags appear. -> **Prerequisite:** The workflow requires a fine-grained PAT stored as a repository secret named -> `PR_TOKEN` with **Pull requests: Read and write** permission. Without it, the workflow will find -> new tags and push the branch successfully, but fail when attempting to open the PR. +> [!IMPORTANT] +> **Required token**: The weekly tag scan workflow requires a fine-grained access token with **Pull requests: Read and write** permission on the catalog repo. Follow the instructions in [App Authentication](app-authentication.md) to create and install a private Catalog Automation App for token generation. You should update `public/tag-groups.js` when that PR is opened or updated. diff --git a/index.html b/index.html index 6863ce8..0630243 100644 --- a/index.html +++ b/index.html @@ -23,20 +23,18 @@ - - +