This project runs a GUI app in a Docker container on an isolated X server (Xvfb), exposes it through VNC/noVNC, and records/replays mouse/keyboard automation against the app window.
The included example is Firefox (docker-compose-firefox.yml), but the recording/replay flow is generic and can be reused for other GUI apps by changing compose environment values.
- Containerized GUI stack: Xvfb + window manager + x11vnc + noVNC
- Host-side recorder: record-macro.py
- Host-side replayer: replay-macro.py
- Timed macro converter/player: timed_xmacro.py
- Screenshot assertions during replay via Check ... lines (check-image.sh)
- #APP metadata stored inside recordings so replay can find/start the right window
Start the container:
docker compose -f docker-compose-firefox.yml up --build
Open the GUI in your browser:
http://localhost:6080/vnc.html
The compose file starts Firefox and exposes the isolated display through noVNC.
Record against the Firefox compose file:
./record-macro.py docker-compose-firefox.yml
During recording:
- Open noVNC (http://localhost:6080/vnc.html)
- Interact with the app
- Press Pause to stop recording (default stop key)
- Press F2 anytime to capture a screenshot and insert a Check ... line (default screenshot/check key)
Default hotkeys:
- STOP_KEYSYM=Pause
- CHECK_KEYSYM=F2
Override if your browser/keyboard/noVNC intercepts them:
STOP_KEYSYM=F9 CHECK_KEYSYM=F3 ./record-macro.py docker-compose-firefox.yml
Notes:
- record-macro.py auto-arms the recorder and injects the stop key for xmacrorec2.
- CHECK_KEYSYM and STOP_KEYSYM must be different.
- F2 is the default to avoid Firefox focus behavior triggered by F6.
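The hotkey rules above (defaults of Pause and F2, and the requirement that the two keys differ) can be sketched as a small validation step. This is an illustration only, not the code in record-macro.py; the function name resolve_hotkeys is hypothetical.

```python
import os


def resolve_hotkeys(env=None):
    """Return (stop_keysym, check_keysym), applying the documented defaults.

    Hypothetical helper: record-macro.py may validate these differently.
    """
    env = os.environ if env is None else env
    stop = env.get("STOP_KEYSYM", "Pause")
    check = env.get("CHECK_KEYSYM", "F2")
    # The two hotkeys must be distinct, otherwise a screenshot request
    # could not be told apart from a stop request.
    if stop == check:
        raise ValueError(
            f"STOP_KEYSYM and CHECK_KEYSYM must be different (both are {stop!r})"
        )
    return stop, check
```

Passing a plain dict instead of os.environ keeps the helper easy to try out without touching the real environment.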
For docker-compose-firefox.yml, the derived run name is firefox, so outputs go to:
- recordings/firefox/firefox.xmacro
- recordings/firefox/firefox-check-001.png
- recordings/firefox/firefox-check-002.png
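The mapping from compose file to output paths might look like the sketch below. The rule used here (strip a docker-compose- prefix and the file extension) is an assumption inferred from the single firefox example; the actual derivation in record-macro.py may differ. All function names are hypothetical.

```python
from pathlib import PurePosixPath


def derive_run_name(compose_file):
    """Guess the run name from a compose file name (assumed rule)."""
    stem = PurePosixPath(compose_file).stem  # e.g. "docker-compose-firefox"
    prefix = "docker-compose-"
    return stem[len(prefix):] if stem.startswith(prefix) else stem


def macro_path(run_name):
    """Path of the recorded macro for a run."""
    return f"recordings/{run_name}/{run_name}.xmacro"


def check_image_path(run_name, index):
    """Path of the Nth screenshot reference image (three-digit counter)."""
    return f"recordings/{run_name}/{run_name}-check-{index:03d}.png"
```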
Replay the recorded macro:
./replay-macro.py docker-compose-firefox.yml
Optional speed override:
REPLAY_SPEED=2.0 ./replay-macro.py docker-compose-firefox.yml
REPLAY_SPEED=0.5 ./replay-macro.py docker-compose-firefox.yml
replay-macro.py:
- Loads recordings/<run-name>/<run-name>.xmacro
- Reads #APP metadata from the recording
- Focuses/positions the app window in the container
- Replays actions using xdotool
- Executes screenshot checks when it encounters Check ... lines
If a Check fails, replay exits non-zero.
Recorded macros include metadata lines like:
#APP startcommand='firefox https://browserbench.org/Speedometer3.1/ --no-default-browser-check'
#APP windowtitle=Firefox
#APP windowclass=firefox
These are used by replay (and screenshot checks) to find the app window by class/title and optionally start the app if the window is missing.
Supported fields:
- startcommand (optional): shell command to launch inside the container
- windowtitle: matcher for xdotool search --name
- windowclass: matcher for xdotool search --class
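To make the metadata format concrete, here is a minimal parsing sketch for the #APP lines shown above. It assumes shell-style quoting for values (as in the startcommand example) and that a later line overrides an earlier one with the same key; neither is confirmed to match timed_xmacro.py, and parse_app_metadata is a hypothetical name.

```python
import shlex


def parse_app_metadata(lines):
    """Collect key=value pairs from '#APP ...' lines into a dict."""
    meta = {}
    for line in lines:
        line = line.strip()
        if not line.startswith("#APP "):
            continue  # not a metadata line
        # shlex.split honours the single quotes around startcommand values.
        for token in shlex.split(line[len("#APP "):]):
            key, sep, value = token.partition("=")
            if sep:
                meta[key] = value  # assumed: last occurrence wins
    return meta
```

The resulting dict carries the same three fields listed above, with startcommand absent when the recording does not define it.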
record-macro.py reads defaults from the compose file environment:
- APP_STARTCOMMAND
- APP_WINDOW_TITLE
- APP_WINDOW_CLASS
You can override them at record time with host environment variables:
APP_STARTCOMMAND='xterm' \
APP_WINDOW_TITLE='xterm' \
APP_WINDOW_CLASS='xterm' \
./record-macro.py docker-compose-firefox.yml
During recording, pressing F2 inserts a line like:
Check firefox/firefox-check-001.png
During replay, the container script:
- Finds the app window (using #APP class/title metadata)
- Captures the current window image
- Compares it to the reference image
- Fails replay if the RMSE exceeds the threshold
Default threshold:
CHECK_MAX_RMSE=0.01
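For intuition about what a threshold of 0.01 means, the sketch below computes a root-mean-square error normalized to the range 0 to 1 over two grayscale images given as nested lists of 0-255 values. How check-image.sh actually computes its metric is not described here, so the normalization and the function names are assumptions.

```python
import math


def normalized_rmse(reference, actual):
    """RMSE between two equally sized grayscale images, scaled to 0..1."""
    same_shape = len(reference) == len(actual) and all(
        len(r) == len(a) for r, a in zip(reference, actual)
    )
    if not same_shape:
        raise ValueError("images must have identical dimensions")
    squared = [
        ((r - a) / 255.0) ** 2
        for row_r, row_a in zip(reference, actual)
        for r, a in zip(row_r, row_a)
    ]
    if not squared:
        raise ValueError("images must not be empty")
    return math.sqrt(sum(squared) / len(squared))


def check_passes(reference, actual, max_rmse=0.01):
    """True when the difference stays within the allowed threshold."""
    return normalized_rmse(reference, actual) <= max_rmse
```

Under this definition, identical images score 0.0 and an all-black versus all-white comparison scores 1.0, so 0.01 tolerates only very small differences.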
Example override:
CHECK_MAX_RMSE=0.02 ./replay-macro.py docker-compose-firefox.yml
Ignore dynamic regions (toolbars, timestamps, animations):
CHECK_IGNORE_RECT=0,0,420,40 ./replay-macro.py docker-compose-firefox.yml
Multiple rectangles (semicolon-separated):
CHECK_IGNORE_RECT="0,0,420,40;300,580,120,30" ./replay-macro.py docker-compose-firefox.yml
Rectangle format is x,y,width,height.
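The rectangle syntax can be illustrated with a short parser. This is a sketch of the documented x,y,width,height format with semicolon separators; the helper names are hypothetical, and treating the right and bottom edges as exclusive is an assumption.

```python
def parse_ignore_rects(value):
    """Parse 'x,y,w,h;x,y,w,h' into a list of (x, y, w, h) integer tuples."""
    rects = []
    for chunk in value.split(";"):
        chunk = chunk.strip()
        if not chunk:
            continue  # tolerate empty input or a trailing semicolon
        parts = chunk.split(",")
        if len(parts) != 4:
            raise ValueError(f"expected x,y,width,height but got {chunk!r}")
        x, y, w, h = (int(p) for p in parts)
        if w <= 0 or h <= 0:
            raise ValueError(f"width and height must be positive in {chunk!r}")
        rects.append((x, y, w, h))
    return rects


def is_ignored(px, py, rects):
    """True if pixel (px, py) lies inside any ignore rectangle."""
    return any(x <= px < x + w and y <= py < y + h for x, y, w, h in rects)
```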
The recorded .xmacro files are timed macros generated by timed_xmacro.py and include:
- #WAIT_SEC <seconds> comments between events
- #APP ... metadata lines
- xmacro events (MotionNotify, ButtonPress, KeyStrPress, etc.)
- Check <path>.png assertion lines
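The four line kinds listed above can be told apart with a simple prefix check, sketched below. This is not the parser in timed_xmacro.py. The total_wait helper also assumes that REPLAY_SPEED divides the recorded waits (so 2.0 would halve them); that interpretation is not confirmed by this README.

```python
def classify_line(line):
    """Return (kind, payload) for one line of a timed macro file."""
    text = line.strip()
    if text.startswith("#WAIT_SEC"):
        fields = text.split()
        # Payload is the delay in seconds; a missing value counts as 0.
        return ("wait", float(fields[1]) if len(fields) > 1 else 0.0)
    if text.startswith("#APP "):
        return ("meta", text[len("#APP "):])
    if text.startswith("Check "):
        return ("check", text[len("Check "):].strip())
    if not text or text.startswith("#"):
        return ("other", text)  # blank line or unrelated comment
    return ("event", text)  # e.g. MotionNotify, ButtonPress, KeyStrPress


def total_wait(lines, speed=1.0):
    """Sum all #WAIT_SEC delays, scaled by an assumed speed divisor."""
    if speed <= 0:
        raise ValueError("speed must be positive")
    waits = (payload for kind, payload in map(classify_line, lines) if kind == "wait")
    return sum(waits) / speed
```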
The container can force the app window to a fixed size/position before replay/checking. Configure in compose environment:
- AUTO_POSITION (1 or 0)
- WINDOW_X
- WINDOW_Y
- WINDOW_WIDTH
- WINDOW_HEIGHT
- APP_WINDOW_CLASS
- APP_WINDOW_TITLE
This helps keep screenshots and click coordinates stable.
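As a rough illustration of how these settings could translate into xdotool calls, the sketch below builds argument lists for xdotool search, windowmove, and windowsize. The container's real positioning script is not shown in this README, so the function names and the fallback values used here (position 0,0 and size 1280x720) are hypothetical.

```python
def search_command(env):
    """Build an xdotool search command from the class or title setting."""
    if env.get("APP_WINDOW_CLASS"):
        return ["xdotool", "search", "--class", env["APP_WINDOW_CLASS"]]
    if env.get("APP_WINDOW_TITLE"):
        return ["xdotool", "search", "--name", env["APP_WINDOW_TITLE"]]
    raise ValueError("set APP_WINDOW_CLASS or APP_WINDOW_TITLE")


def geometry_commands(window_id, env):
    """Build move/resize commands, or nothing when AUTO_POSITION is 0."""
    if env.get("AUTO_POSITION", "1") != "1":
        return []
    # Fallback values below are placeholders, not the project's defaults.
    x = int(env.get("WINDOW_X", "0"))
    y = int(env.get("WINDOW_Y", "0"))
    width = int(env.get("WINDOW_WIDTH", "1280"))
    height = int(env.get("WINDOW_HEIGHT", "720"))
    wid = str(window_id)
    return [
        ["xdotool", "windowmove", wid, str(x), str(y)],
        ["xdotool", "windowsize", wid, str(width), str(height)],
    ]
```

Each returned list could be passed to subprocess.run inside the container, where xdotool and the X display are available.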
timed_xmacro.py also exposes subcommands directly:
- record: read xmacrorec2 output and write timed macro
- replay: emit timed xmacro lines to stdout
- replay-xdotool: emit normalized xdotool actions to stdout
- app-meta: read effective #APP metadata from a macro file
Example:
python3 timed_xmacro.py app-meta --input recordings/firefox/firefox.xmacro --format tsv
- If the stop key does not work in noVNC/browser, set STOP_KEYSYM to another key (for example F9).
- If screenshot/check hotkey affects app UI (Firefox F6 focuses toolbar), use a different CHECK_KEYSYM. The default is F2.
- If screenshot checks fail due to minor UI variation, increase CHECK_MAX_RMSE slightly or mask dynamic regions with CHECK_IGNORE_RECT.
- If window geometry or theme changes move UI elements, rebuild/restart and re-record the macro.
- X access control is disabled (Xvfb -ac) for local convenience.
- VNC/noVNC is exposed without a password in this demo setup.
- Use only on a trusted local machine/network unless you harden the configuration.