Collaborator

@juj juj commented Oct 21, 2025

One of the optimizations in Closure compiler is to choose the string quote character that results in fewer escapes in the string itself.

I.e. Closure compiler will turn

var x = '\'\'\'\'\'';
var y = "\"\"\"\"\"";

into

var x = "'''''";
var y = '"""""';

by selecting the string quote char that allows the contents to have fewer escapes.

However, in the SINGLE_FILE mode, we emit the Wasm code inside the .js file after Closure has run. So all Closure sees is

  return binaryDecode("<<< WASM_BINARY_DATA >>>");

when optimizing.

This PR implements the same smart string quote selection optimization directly in the binaryEncode() function, by checking whether the binary content to be encoded contains fewer 's or fewer "s, and quoting the string with whichever character occurs less often.
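As a rough illustration of the idea (not the actual binaryEncode() code from this PR), the selection can be sketched as below. The function name `quoteWithFewestEscapes` is hypothetical, the tie-break toward double quotes is an arbitrary assumption, and the sketch only escapes backslashes and the chosen quote character, ignoring newlines and other characters a real encoder would need to handle.

```javascript
// Hypothetical helper for illustration only; not Emscripten's binaryEncode().
// Wraps `s` in whichever quote character requires fewer backslash escapes.
function quoteWithFewestEscapes(s) {
  let singles = 0;
  let doubles = 0;
  for (const ch of s) {
    if (ch === "'") singles++;
    else if (ch === '"') doubles++;
  }
  // Assumption: on a tie, fall back to double quotes.
  const quote = singles < doubles ? "'" : '"';
  // Escape backslashes first, then only the chosen quote character.
  const body = s.replace(/\\/g, '\\\\').split(quote).join('\\' + quote);
  return quote + body + quote;
}

console.log(quoteWithFewestEscapes("'''''")); // prints: "'''''"
console.log(quoteWithFewestEscapes('"""""')); // prints: '"""""'
```

Each avoided escape saves one byte in the emitted .js file, so the benefit depends on how unevenly the two quote characters are distributed in the encoded data.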

juj added 2 commits October 22, 2025 00:30
This is an automatic change generated by tools/maint/rebaseline_tests.py.

The following (1) test expectation files were updated by
running the tests with `--rebaseline`:

```
codesize/test_codesize_hello_single_file.json: 5404 => 5366 [-38 bytes / -0.70%]

Average change: -0.70% (-0.70% - -0.70%)
```
@juj juj force-pushed the optimize_binary_encoding_string_quotes branch from 4757d23 to b9068ad Compare October 22, 2025 00:26
Collaborator Author

juj commented Oct 27, 2025

Ping, any thoughts here?

Collaborator

@sbc100 sbc100 left a comment

I wonder if it's worth the extra complexity here in the code generation.

Is it worth those 7 bytes?

BTW, are you making use of this in Unity? Do you, or your users, use the -sSINGLE_FILE setting a lot? (And why?)

Collaborator Author

juj commented Oct 27, 2025

> I wonder if it's worth the extra complexity here in the code generation.
>
> Is it worth those 7 bytes?

Those 7 bytes come from a test case that is itself only about 5,400 bytes in size.

Running the encoder on one of the smallest .wasm files from Unity output:

  • old encoding: 1,924,603 bytes
  • new encoding: 1,896,794 bytes (-27,809 bytes)

So the wins scale with the size of the .wasm file; they are not just a constant 7 bytes.
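To put the Unity figures above in relative terms, here is a small sketch of the arithmetic. The byte counts are the ones quoted in this comment; the function name `relativeSavings` is made up for illustration.

```javascript
// Illustrative arithmetic only; byte counts are taken from the comment above.
function relativeSavings(oldBytes, newBytes) {
  const saved = oldBytes - newBytes;
  return { saved, percent: (saved / oldBytes) * 100 };
}

const unity = relativeSavings(1924603, 1896794);
console.log(unity.saved, unity.percent.toFixed(2) + '%'); // prints: 27809 1.44%
```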

> BTW, are you making use of this in Unity? Do you, or your users, use the -sSINGLE_FILE setting a lot? (And why?)

Yes. The SINGLE_FILE output from Unity is for use in the MRAID standard. What they do is standardize on the CDN storage and lifecycle of an interactive (playable) ad.

The restrictions that the CDN people state are that an ad needs to be a single .html file with everything embedded in it (no extra XHRs or Fetches allowed), and that the size limit is <= 5MB uncompressed.

Collaborator

@sbc100 sbc100 left a comment

Thanks for the explanation.

@juj juj merged commit 4fc1036 into emscripten-core:main Oct 27, 2025
32 of 34 checks passed