Commit 6b3aa95
Fix loading jars with encoded paths (#2759)

Fixes #2576 and #2103, both of which report
that the model assembler can't load jars if the jar's path has
percent-encoded characters. This may happen for artifacts resolved by
coursier (see linked issues).

The model assembler can load Smithy files from a jar (model discovery).
This works by looking for a 'META-INF/smithy/manifest'
file within the jar, which contains the paths of smithy model files
within the jar. You can give the model assembler a class loader to use
for model discovery, but you can also add a jar to discover models from
using a regular `addImport`. The problem is with the latter case.
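
As a rough illustration of what discovery looks for, the sketch below writes a small jar containing only a `META-INF/smithy/manifest` entry and reads the listed paths back with `java.util.jar.JarFile`. The names `ManifestSketch`, `writeJar`, and `readManifest` are made up for this example, and it assumes the manifest lists one model path per line. It is not the assembler's actual code.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;

// Hypothetical helper for illustration only; not part of Smithy.
class ManifestSketch {
    static final String MANIFEST_PATH = "META-INF/smithy/manifest";

    // Writes a jar whose only entry is a manifest listing the given model paths.
    static Path writeJar(Path jar, List<String> modelPaths) throws IOException {
        try (OutputStream out = Files.newOutputStream(jar);
             JarOutputStream jos = new JarOutputStream(out)) {
            jos.putNextEntry(new JarEntry(MANIFEST_PATH));
            jos.write(String.join("\n", modelPaths).getBytes(StandardCharsets.UTF_8));
            jos.closeEntry();
        }
        return jar;
    }

    // Reads the manifest entry and returns its non-empty lines.
    // Assumption: one model path per line.
    static List<String> readManifest(Path jar) throws IOException {
        List<String> models = new ArrayList<>();
        try (JarFile jarFile = new JarFile(jar.toFile())) {
            JarEntry entry = jarFile.getJarEntry(MANIFEST_PATH);
            if (entry == null) {
                return models; // no manifest, so nothing to discover
            }
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(jarFile.getInputStream(entry), StandardCharsets.UTF_8))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    if (!line.trim().isEmpty()) {
                        models.add(line.trim());
                    }
                }
            }
        }
        return models;
    }
}
```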

Because jars are like zip files, you can't read files within the jar
using regular file system operations. Instead, you can either use the
`JarFile` API to open the jar and read specific entries (like a zip
file), or you can directly read specific entries via a special URL with
the format `jar:file:/path/to/jar.jar!/path/in/jar`. The latter is the
way that our model discovery APIs work. The problem was that `URL` doesn't
perform any encoding or decoding itself, so callers/consumers have to
take care of it, which we were not doing. For example, if the jar file
has a path of `/foo/bar%45baz.jar`, we would create a URL like
`jar:file:/foo/bar%45baz.jar!/META-INF/smithy/manifest`, and when the
JDK tried to use this URL to read from the file system, it would decode
`%45` to `E` and attempt to read `/foo/barEbaz.jar`. Since we weren't
encoding the URL to begin with, this was only a problem when the path
contained `%`.
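
The failure described above can be reproduced with only the JDK. The sketch below (class and method names are hypothetical) creates a jar whose file name contains a literal `%45`, then tries to read its manifest through a `jar:` URL built two ways: by plain string concatenation, and from `Path::toUri`. Based on the behavior described in this commit message, the first attempt is expected to fail and the second to succeed.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;

// Hypothetical demo class; not part of Smithy.
class PercentPathDemo {
    static final String MANIFEST_PATH = "META-INF/smithy/manifest";

    // Creates <dir>/bar%45baz.jar containing a one-line manifest.
    static Path writeJar(Path dir) throws IOException {
        Path jar = dir.resolve("bar%45baz.jar");
        try (OutputStream out = Files.newOutputStream(jar);
             JarOutputStream jos = new JarOutputStream(out)) {
            jos.putNextEntry(new JarEntry(MANIFEST_PATH));
            jos.write("foo.smithy\n".getBytes(StandardCharsets.UTF_8));
            jos.closeEntry();
        }
        return jar;
    }

    // Unencoded form: the literal '%45' ends up in the URL as-is.
    static String naiveSpec(Path jar) {
        return "jar:file:" + jar.toAbsolutePath() + "!/" + MANIFEST_PATH;
    }

    // Encoded form: Path::toUri escapes '%' as '%25'.
    static String encodedSpec(Path jar) {
        return "jar:" + jar.toUri() + "!/" + MANIFEST_PATH;
    }

    // Returns the first line behind the URL, or null if it cannot be opened.
    @SuppressWarnings("deprecation") // URL(String) is deprecated on recent JDKs
    static String firstLine(String spec) {
        try {
            URLConnection connection = new URL(spec).openConnection();
            connection.setUseCaches(false); // don't keep the jar open in the JDK cache
            try (InputStream in = connection.getInputStream();
                 BufferedReader reader = new BufferedReader(
                         new InputStreamReader(in, StandardCharsets.UTF_8))) {
                return reader.readLine();
            }
        } catch (IOException e) {
            return null;
        }
    }

    public static void main(String[] args) throws IOException {
        Path jar = writeJar(Files.createTempDirectory("percent-demo"));
        System.out.println(naiveSpec(jar) + " -> " + firstLine(naiveSpec(jar)));
        System.out.println(encodedSpec(jar) + " -> " + firstLine(encodedSpec(jar)));
    }
}
```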

The `URL` class has a section in its javadoc about
this footgun, strongly recommending `Path::toUri` followed by `URI::toURL`
when you want to make URLs from paths. In more recent JDK versions (20
and later), `URL` constructors are deprecated in favor of `URI::toURL`. I found that
the `Path::toUri` part is particularly important, because it handles
encoding for paths correctly, and makes sure Windows paths are handled
properly. I couldn't use `URI::toURL` though, because the manifest path
needs to be appended, and the scheme changed to `jar:` (technically the
`file:/...` is all the "scheme-specific part"). So instead, when creating
a manifest URL from a path, `createSmithyJarManifestUrl` now creates
a `Path` and converts that to a `URI` to get an encoded `file:/` URI that
is also correct for Windows, then constructs the final `URL` using the
string form of that `URI`. This should still meet `URL`'s requirement
that it be created with an already-correct URL.
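
A minimal sketch of the path -> URI -> URL conversion described above, using only JDK classes. `ManifestUrlSketch` and `manifestUrlForPath` are hypothetical names chosen for this example, and the body only approximates what `createSmithyJarManifestUrl` is described as doing for path inputs.

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical sketch; not the actual Smithy implementation.
class ManifestUrlSketch {
    static final String MANIFEST_PATH = "META-INF/smithy/manifest";

    @SuppressWarnings("deprecation") // URL(String) is deprecated on recent JDKs
    static URL manifestUrlForPath(String jarPath) throws MalformedURLException {
        // Path::toUri percent-encodes reserved characters (including '%')
        // and produces a platform-correct file: URI.
        Path path = Paths.get(jarPath);
        String fileUri = path.toUri().toString();
        // URI::toURL can't be used directly because the jar: prefix and the
        // entry path still need to be added, so build from the string form.
        return new URL("jar:" + fileUri + "!/" + MANIFEST_PATH);
    }

    public static void main(String[] args) throws MalformedURLException {
        System.out.println(manifestUrlForPath("/foo/bar%45baz.jar"));
        System.out.println(manifestUrlForPath("/foo/my models.jar"));
    }
}
```

With this approach a literal `%` in the path is escaped as `%25` and a space as `%20`, so the JDK's later decoding step yields the original file name.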

Another thing to note is that `createSmithyJarManifestUrl` can also accept
URL-like strings such as `file:/foo.jar` or `jar:file:/foo.jar`. This change does
not apply to those - they obviously aren't paths, and we don't know
if they were properly encoded by the caller. If they were, trying to
extract the path and do the same path -> URI -> URL conversion would
result in a double-encoded URL that doesn't point to the correct
location, and trying to instead just do URI -> URL is pointless (also
`jar:file:/foo.jar` can't be directly turned into a `URI` - the `jar:`
scheme requires the `!/` part). So for these, the previous behavior
is kept, and I added a bit to the javadoc to say they need to be valid
URLs.
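
The double-encoding concern can be seen with a small JDK-only sketch. `DoubleEncodingDemo`, `reencode`, and `passThrough` are hypothetical names, and `passThrough` is only an assumption about what keeping the previous behavior looks like: the scheme prefix and the manifest suffix are added, and nothing is re-encoded.

```java
import java.net.URI;
import java.nio.file.Paths;

// Hypothetical demo; not part of Smithy.
class DoubleEncodingDemo {
    static final String MANIFEST_PATH = "META-INF/smithy/manifest";

    // What NOT to do with an already-encoded file: URL. The raw path still
    // contains percent escapes, so Path::toUri escapes the '%' a second time.
    static String reencode(String fileUrl) {
        String rawPath = URI.create(fileUrl).getRawPath();
        return Paths.get(rawPath).toUri().toString();
    }

    // Assumed pass-through for URL-like input: trust the caller's encoding.
    static String passThrough(String fileOrJarUrl) {
        String withScheme = fileOrJarUrl.startsWith("jar:")
                ? fileOrJarUrl
                : "jar:" + fileOrJarUrl;
        return withScheme + "!/" + MANIFEST_PATH;
    }

    public static void main(String[] args) {
        String encoded = "file:/foo/my%20models.jar";
        System.out.println(reencode(encoded));    // '%20' becomes '%2520'
        System.out.println(passThrough(encoded)); // encoding left untouched
    }
}
```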

This technically changes the `SourceLocation::filename` for jars that
are `addImport`ed, when the jar path has reserved characters (like
spaces). Such jars were importable before because `URL` doesn't validate
encoding, and the JDK decoding the URL would be a no-op since it wasn't
encoded to begin with. So we could _just_ encode `%`, meaning only the
newly-allowed paths would have any encoding. Disregarding the fact that
this side-steps the assumed pre-conditions of `URL`, it would actually
exacerbate an existing inconsistency between `SourceLocation::filename`
of models discovered on the classpath, and models discovered from
imports. The JDK encodes the URLs of resources on the classpath, so
their models' source locations will be encoded. There was an additional
inconsistency on Windows, where imported jar models would have a URL
like `jar:file:C:\foo\bar.jar!/META-INF/smithy/foo.smithy`, which the
JDK could still gracefully read from, but which is incorrect and inconsistent
with discovered models, which correctly look like
`jar:file:/C:/foo/bar.jar!/META-INF/smithy/foo.smithy`. So this change
means that all `jar:file:` `SourceLocation::filename`s will look the
same and are well-formed URLs/URIs.

As a consequence of changing `SourceLocation::filename`, I had to fix
some code in `PluginContext` which was relying on being able to compare
the source location filename with a stringified version of each
configured source path, in order to figure out which shapes/metadata
are part of sources. This was working because (1) encoded paths didn't
work at all, and (2) `jar:file:` filenames on Windows had Windows path
separators. So I updated this comparison code to compare `jar:file:`
filenames to a string version of each configured source's _URI_.
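
One way such a comparison could look is sketched below. This is an assumption-heavy illustration, not the actual `PluginContext` code: `SourceMatchSketch` and `isFromSources` are hypothetical names, and the handling of non-jar filenames is a guess added only to make the example complete.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical sketch; not the actual PluginContext logic.
class SourceMatchSketch {
    static boolean isFromSources(String filename, List<Path> sources) {
        for (Path source : sources) {
            if (filename.startsWith("jar:")) {
                // Compare against the encoded URI form of the source jar, which
                // is the shape jar filenames are described as having now.
                String jarPrefix = "jar:" + source.toUri() + "!/";
                if (filename.startsWith(jarPrefix)) {
                    return true;
                }
            } else if (Paths.get(filename).startsWith(source)) {
                // Assumption: plain file names are compared as paths.
                return true;
            }
        }
        return false;
    }
}
```

Under this sketch, a filename built from the raw path string (for example one containing an unescaped space) would no longer match a source jar, which is why the comparison has to use the URI form on both sides.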

The fact that the specific format of the filename was being relied on is
mildly concerning, but considering the fact that models discovered on
the classpath already had a different format, I think this is ok. I
would also be surprised if there's a lot of code out there manually
importing jars, instead of providing a classloader.

1 parent 5bd4acf
File tree
8 files changed, +213 -12 lines changed
- smithy-build/src
  - main/java/software/amazon/smithy/build
  - test/java/software/amazon/smithy/build/plugins
- smithy-model/src
  - main/java/software/amazon/smithy/model/loader
  - test
    - java/software/amazon/smithy/model/loader
    - resources/software/amazon/smithy/model/loader/assembler-valid-jar/META-INF
      - smithy
Lines changed: 26 additions & 0 deletions
Lines changed: 85 additions & 0 deletions
Lines changed: 22 additions & 11 deletions
Lines changed: 53 additions & 0 deletions
Lines changed: 19 additions & 1 deletion
Lines changed: 2 additions & 0 deletions
Lines changed: 1 addition & 0 deletions
Lines changed: 5 additions & 0 deletions