Opened 3 years ago
Closed 2 years ago
#2130 closed defect (fixed)
http proxy should silently drop illegal characters + trailing text in URI rather than mark as invalid
Reported by: | Reportage | Owned by: | zzz |
---|---|---|---|
Priority: | minor | Milestone: | 0.9.40 |
Component: | apps/i2ptunnel | Version: | 0.9.32 |
Keywords: | http proxy | Cc: | |
Parent Tickets: | Sensitive: | no |
Description (last modified by )
Proxy generates error when square braces are featured in URL; URL is correctly parsed when not using proxy. In the event that an illegal character is detected, it would be helpful to identify the offending character in the proxy error message.
Warning: Invalid Request URI The request URI is invalid, and probably contains illegal characters. If you clicked a link, check the end of the URI for any characters the browser has mistakenly added on.
Subtickets
Change History (12)
comment:1 Changed 3 years ago by
Description: | modified (diff) |
---|
comment:2 Changed 3 years ago by
Component: | other → apps/i2ptunnel |
---|---|
Resolution: | → not a bug |
Status: | new → closed |
comment:3 Changed 3 years ago by
Resolution: | not a bug |
---|---|
Status: | closed → reopened |
You're right about the characters, they are technically illegal. The bug was incorrectly described. The expected behavior when the proxy identifies an illegal character is to mirror the behavior of an unproxied browser, which appears to be to drop everything in the URL trailing and including the illegal character.
Compare: http://uj3wazyk5u4hnvtk.onion/torrent/19562797/Vikings.S05E07.HDTV.x264-KILLERS[ettv] and http://thepiratebay.i2p/torrent/19562797/Vikings.S05E07.HDTV.x264-KILLERS[ettv]
So, to summarize:
- Illegal characters in URLs passed through the proxy should be dropped, and anything trailing
- A warning level log entry could indicate which characters have been silently dropped and the resulting modified URL
comment:4 Changed 3 years ago by
Summary: | http proxy marks legitimate characters in URI as invalid → http proxy should silently drop illegal characters + trailing text in URI rather than mark as invalid |
---|
comment:5 Changed 3 years ago by
Priority: | major → minor |
---|
So you're asking that we be 'liberal in what we accept'.
Some history:
We used to do our own URI parsing in the proxy. That led to a lot of bugs, corner cases, and possible security issues. So we switched to the Java URI class quite a while ago. It happens to be a strict parser.
One result is that we now rely on browsers to be strict in what they output. I think this is reasonable, they have years and years of experience in dealing with malformed input and fixing it up into something compliant.
Unfortunately you didn't say what browser you are testing with. I did some testing using Firefox 57.0.3 on linux. I didn't see that it ever dropped illegal chars and things trailing, at least when entering in the URI bar. I didn't try creating a html page with bad links and clicking on them. So I don't see any basis for your proposal, and silently dropping things could have other security issues.
Interestingly, trac's URI parser for this page includes the [ but drops the ] , see your comment 3 above.
What I did find is that for the most part, firefox does do the fixup. Illegal chars are %-encoded before sending to the proxy.
But for the 8 chars in the 'unwise' section of RFC 2396 2.3.4, it's a grab bag:
{}` are %-encoded
[]| are passed thru unchanged resulting in URI error in our proxy
\ is converted to /
RFC 2396 has been replaced by 3986 but does not appear to affect things.
I checked the javadocs for URI in Java 7, 8, and 9, and they all still reference 2396, like the javadoc I linked in comment 2.
Here's an example of the exception thrown from URI, we could output the message part of it to the error page, do you think that would be an improvement?
java.net.URISyntaxException: Illegal character in path at index 17: http://stats.i2p/]foo at java.net.URI$Parser.fail(URI.java:2848) at java.net.URI$Parser.checkChars(URI.java:3021) at java.net.URI$Parser.parseHierarchical(URI.java:3105) at java.net.URI$Parser.parse(URI.java:3053) at java.net.URI.<init>(URI.java:588) at net.i2p.i2ptunnel.I2PTunnelHTTPClient.clientConnectionRun(I2PTunnelHTTPClient.java:485) at net.i2p.i2ptunnel.I2PTunnelClientBase$BlockingRunner.run(I2PTunnelClientBase.java:807) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748)
In summary: Firefox does the fixup for us, for all but a few chars.
I don't know why firefox doesn't fixup []| chars for us. For further research.
comment:6 Changed 3 years ago by
I'm testing in Firefox, which does display the offending characters in the URL bar, though from testing urls containing square braces, it appears that they aren't parsed/required to reach the url, which is why I suggested dropping the character + trailing text to mirror the behavior I'm seeing in Firefox.
e.g. http://uj3wazyk5u4hnvtk.onion/torrent/19562797/Vikings.S05E07.HDTV.x264-KILLERS[ettv] can also be reached via http://uj3wazyk5u4hnvtk.onion/torrent/19562797/Vikings.S05E07.HDTV.x264-KILLERS which suggests that the everything after and including the opening square brace is being displayed in the URL bar but silently dropped before it's parsed.
comment:7 Changed 3 years ago by
Research (not the easiest thing to google):
Chrome doesn't escape []. Firefox used to but doesn't now.
https://stackoverflow.com/questions/40568/are-square-brackets-permitted-in-urls
https://bugs.chromium.org/p/chromium/issues/detail?id=74424
It may be that they don't escape [] in paths since they are allowed to wrap IPv6 host addresses, so it would be a little more difficult.
If we do a fixup, it would only be to escape the three chars that firefox doesn't, unless we find others.
comment:8 Changed 3 years ago by
Added exception text to error page (and eepget command line output) in 7153b4300ac4ec2708b21e6834e46c067aa55af8 to be 0.9.32-18
comment:9 Changed 3 years ago by
Milestone: | undecided → 0.9.33 |
---|---|
Resolution: | → fixed |
Status: | reopened → closed |
escape []| chars in path or query in 318433e6dbf4af518cc72271a9c0b5fe62917ded 0.9.32-18
comment:10 Changed 2 years ago by
Resolution: | fixed |
---|---|
Status: | closed → reopened |
Firefox 65 and Chromium 71 does not url-encode curly brackets ({}) in GET requests, which I2P client HTTP tunnel can't handle. Re-opening this ticket.
comment:11 Changed 2 years ago by
Milestone: | 0.9.33 → 0.9.40 |
---|---|
Owner: | set to zzz |
Status: | reopened → accepted |
comment:12 Changed 2 years ago by
Resolution: | → fixed |
---|---|
Status: | accepted → closed |
This was only an issue with the curly brackets in the query string. Firefox seemed to escape them in the path.
In 7812513c6f456db6a1a7e73a4a2bdf77c5359ba7 0.9.39-1
https://www.ietf.org/rfc/rfc2396.txt
https://docs.oracle.com/javase/6/docs/api/java/net/URI.html
RFC 2396 sec. 2.4.3 and 3.3
"[]" are excluded chars and must be escaped in the path or query
if you think I'm reading it wrong, feel free to reopen