Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
11 changes: 10 additions & 1 deletion .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,15 @@ dist/
# Environments
.env
.venv
venv/

# Python caches / local test artifacts
__pycache__/
*.py[cod]
.pytest_cache/
.hypothesis/
data/run/

# API keys
doubao_api.txt
doubao_api.txt
.gitnexus
61 changes: 59 additions & 2 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -83,6 +83,7 @@ As shown above, our method produces results that are more accurate, visually ali
- `html_generator.py`: Takes the detected component data and generates a complete HTML layout with generated code for each module.
- `image_replacer.py`: A script to replace placeholder divs in the final HTML with actual cropped images.
- `mapping.py`: Maps the detected UIED components to logical page regions.
- `docs/asset-aware-html-generation.md`: Design plan for asset-aware screenshot-to-HTML generation and handoff to a reference UI asset system.
- `requirements.txt`: Lists all the necessary Python dependencies for the project.
- `doubao_api.txt`: API key file for the Doubao model (should be kept private and is included in `.gitignore`).

Expand Down Expand Up @@ -112,6 +113,64 @@ As shown above, our method produces results that are more accurate, visually ali

The typical workflow is a multi-step process as follows:

### Convert a Custom Screenshot to a Rough Asset-Aware HTML Preview

For the current ScreenCoder handoff workflow, use the asset-aware preview path when you need a fast end-to-end result from a UI design image without calling a model API. It reads the input image size, builds a semantic UI schema, extracts crop-source assets, and writes a standalone HTML preview plus handoff files for `web-ui-reference`:

```bash
python screen_to_schema.py \
--image data/input/menu.png \
--output data/run/menu/screen-schema.json \
--preview data/run/menu/index.html \
--handoff data/run/menu/handoff.md \
--reference-json data/run/menu/reference-handoff.json
```

Optional hosted OCR can enrich region content with recognized text blocks. The preferred open-source-model path is PaddleOCR PP-OCRv4 hosted on Replicate:

```bash
export REPLICATE_API_TOKEN=...
python screen_to_schema.py \
--image data/input/menu.png \
--ocr-provider replicate_paddleocr \
--ocr-output data/run/menu/ocr.json \
--output data/run/menu/screen-schema.json \
--preview data/run/menu/index.html \
--handoff data/run/menu/handoff.md \
--reference-json data/run/menu/reference-handoff.json
```

If the token is missing, ScreenCoder records an OCR manual gate in the schema handoff notes and still writes the non-OCR preview package.

Key outputs:

- `screen-schema.json`: structured page/region/component/asset-role schema.
- `asset-crops/`: source-image snippets for `crop-source` roles.
- `index.html`: rough standalone HTML/CSS reconstruction preview.
- `handoff.md` and `reference-handoff.json`: component/asset intake notes for `web-ui-reference`.

This path is intentionally approximate: ScreenCoder identifies and packages an editable first draft; `web-ui-reference` remains responsible for final reusable component implementations, asset licensing, visual regression, and Godot contracts.

### Convert a Custom Screenshot through the Legacy Model/UIED Pipeline

Put the target screenshot in the project, for example:

```bash
mkdir -p data/input
cp your-menu.png data/input/menu.png
```

Then run the unified pipeline:

```bash
python screen_to_html.py \
--image data/input/menu.png \
--work-dir data/run/menu \
--api-key doubao_api.txt
```

The final HTML is written to `data/run/menu/menu_layout_final.html`. Intermediate files in the same directory include detected layout boxes, gray placeholder HTML, UIED component boxes, mapping overlays, and cropped image assets.

1. **Initial Generation with Placeholders:**
Run the Python script to generate the initial HTML code for a given screenshot.
- Block Detection:
Expand Down Expand Up @@ -157,5 +216,3 @@ The typical workflow is a multi-step process as follows:
## Acknowledgements

This project builds upon several outstanding open-source efforts. We would like to thank the authors and contributors of the following projects: [UIED](https://github.com/MulongXie/UIED), [DCGen](https://github.com/WebPAI/DCGen), [Design2Code](https://github.com/NoviScl/Design2Code)


32 changes: 25 additions & 7 deletions UIED/run_single.py
Original file line number Diff line number Diff line change
@@ -1,8 +1,10 @@
from os.path import join as pjoin
import argparse
import cv2
import os
import numpy as np
import multiprocessing
from pathlib import Path


def resize_height_by_longest_edge(img_path, resize_length=800):
Expand Down Expand Up @@ -30,6 +32,13 @@ def color_tips():


if __name__ == '__main__':
parser = argparse.ArgumentParser(description="Run UIED component detection for one screenshot.")
parser.add_argument("--image", default="data/input/test4.png", help="Input screenshot path.")
parser.add_argument("--output-root", default="data/tmp", help="Output root directory.")
parser.add_argument("--output-json", default=None, help="Expected or copied output JSON path.")
parser.add_argument("--show", action="store_true", help="Show debug windows when supported.")
args = parser.parse_args()

# Set multiprocessing start method to 'spawn' for macOS compatibility.
# This must be done at the very beginning of the main block.
try:
Expand Down Expand Up @@ -61,14 +70,14 @@ def color_tips():
key_params = {'min-grad':10, 'ffl-block':5, 'min-ele-area':50,
'merge-contained-ele':True, 'merge-line-to-paragraph':False, 'remove-bar':True}

# set input image path
input_path_img = 'data/input/test4.png'
output_root = 'data/tmp'
input_path_img = args.image
output_root = args.output_root

resized_height = resize_height_by_longest_edge(input_path_img, resize_length=800)
color_tips()
if args.show:
color_tips()

is_ip = False
is_ip = True
is_clf = False
is_ocr = False
is_merge = False
Expand All @@ -90,7 +99,7 @@ def color_tips():
classifier['Elements'] = CNN('Elements')
# classifier['Noise'] = CNN('Noise')
ip.compo_detection(input_path_img, output_root, key_params,
classifier=classifier, resize_by_height=resized_height, show=False)
classifier=classifier, resize_by_height=resized_height, show=args.show)

if is_merge:
import detect_merge.merge as merge
Expand All @@ -99,4 +108,13 @@ def color_tips():
compo_path = pjoin(output_root, 'ip', str(name) + '.json')
ocr_path = pjoin(output_root, 'ocr', str(name) + '.json')
merge.merge(input_path_img, compo_path, ocr_path, pjoin(output_root, 'merge'),
is_remove_bar=key_params['remove-bar'], is_paragraph=key_params['merge-line-to-paragraph'], show=True)
is_remove_bar=key_params['remove-bar'], is_paragraph=key_params['merge-line-to-paragraph'], show=args.show)

if args.output_json:
expected = Path(output_root) / "ip" / f"{Path(input_path_img).stem}.json"
requested = Path(args.output_json)
if expected.exists() and expected.resolve() != requested.resolve():
requested.parent.mkdir(parents=True, exist_ok=True)
requested.write_text(expected.read_text())
if not requested.exists():
raise FileNotFoundError(f"UIED output JSON was not created: {requested}")
154 changes: 154 additions & 0 deletions asset_roles.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,154 @@
"""Asset role constants and prefab defaults for asset-aware HTML generation."""

from __future__ import annotations


ALLOWED_ASSET_ROLES = frozenset(
{
"ritual-background",
"ornate-border",
"paper-panel",
"brass-plate",
"red-seal",
"corner-rivet",
"symbolic-icon",
"portrait-frame",
"quick-card-illustration",
"title-logotype",
"ink-divider",
"blood-smear",
"rope-knot",
"hanging-tag",
"progress-petal",
"damage-scratch",
}
)


ALLOWED_ASSET_STRATEGIES = frozenset(
{
"css-procedural",
"crop-source",
"reference-asset",
"manual-art",
}
)


_DEFAULT_STRATEGY_BY_ROLE = {
"ritual-background": "css-procedural",
"ornate-border": "reference-asset",
"paper-panel": "reference-asset",
"brass-plate": "reference-asset",
"red-seal": "css-procedural",
"corner-rivet": "reference-asset",
"symbolic-icon": "reference-asset",
"portrait-frame": "reference-asset",
"quick-card-illustration": "crop-source",
"title-logotype": "manual-art",
"ink-divider": "css-procedural",
"blood-smear": "css-procedural",
"rope-knot": "reference-asset",
"hanging-tag": "reference-asset",
"progress-petal": "css-procedural",
"damage-scratch": "css-procedural",
}


PREFAB_ASSET_ROLE_HINTS = {
"ritual-title-stack": [
"title-logotype",
"ink-divider",
"red-seal",
"blood-smear",
],
"ornate-action-plaque": [
"paper-panel",
"brass-plate",
"ornate-border",
"red-seal",
"corner-rivet",
],
"journey-status-panel": [
"paper-panel",
"ornate-border",
"hanging-tag",
"progress-petal",
"damage-scratch",
],
"relic-quick-card": [
"paper-panel",
"portrait-frame",
"quick-card-illustration",
"symbolic-icon",
"corner-rivet",
],
"title-stack": [
"title-logotype",
"ink-divider",
"red-seal",
],
"primary-actions": [
"paper-panel",
"brass-plate",
"ornate-border",
"red-seal",
"corner-rivet",
],
"recent-run-panel": [
"paper-panel",
"ornate-border",
"hanging-tag",
"progress-petal",
"damage-scratch",
],
"quick-links": [
"paper-panel",
"symbolic-icon",
"rope-knot",
"hanging-tag",
],
"corner-system-actions": [
"brass-plate",
"corner-rivet",
"symbolic-icon",
],
"background-ornament": [
"ritual-background",
"ornate-border",
"blood-smear",
"damage-scratch",
],
}


DEFAULT_PREFABS = frozenset(PREFAB_ASSET_ROLE_HINTS)


def normalize_asset_role(role: str) -> str:
"""Normalize and validate an asset role name."""

if not isinstance(role, str):
raise TypeError("asset role must be a string")

normalized = role.strip().lower().replace("_", "-")
if normalized not in ALLOWED_ASSET_ROLES:
raise ValueError(f"unknown asset role: {role!r}")

return normalized


def default_roles_for_prefab(prefab: str) -> list[str]:
"""Return default asset roles for a prefab, or an empty list when unknown."""

if not isinstance(prefab, str):
raise TypeError("prefab must be a string")

return list(PREFAB_ASSET_ROLE_HINTS.get(prefab.strip(), ()))


def default_strategy_for_role(role: str) -> str:
"""Return the default generation strategy for an asset role."""

normalized = normalize_asset_role(role)
return _DEFAULT_STRATEGY_BY_ROLE[normalized]
Loading