Opened 17 months ago

Closed 2 months ago

#2130 closed defect (fixed)

http proxy should silently drop illegal characters + trailing text in URI rather than mark as invalid

Reported by: Reportage Owned by: zzz
Priority: minor Milestone: 0.9.40
Component: apps/i2ptunnel Version: 0.9.32
Keywords: http proxy Cc:
Parent Tickets:

Description (last modified by Reportage)

Proxy generates error when square braces are featured in URL; URL is correctly parsed when not using proxy. In the event that an illegal character is detected, it would be helpful to identify the offending character in the proxy error message.

Warning: Invalid Request URI

The request URI is invalid, and probably contains illegal characters. If you clicked a link, check the end of the URI for any characters the browser has mistakenly added on.

Subtickets

Change History (12)

comment:1 Changed 17 months ago by Reportage

  • Description modified (diff)

comment:2 Changed 17 months ago by zzz

  • Component changed from other to apps/i2ptunnel
  • Resolution set to not a bug
  • Status changed from new to closed

https://www.ietf.org/rfc/rfc2396.txt
https://docs.oracle.com/javase/6/docs/api/java/net/URI.html

RFC 2396 sec. 2.4.3 and 3.3
"[]" are excluded chars and must be escaped in the path or query

if you think I'm reading it wrong, feel free to reopen

comment:3 Changed 17 months ago by Reportage

  • Resolution not a bug deleted
  • Status changed from closed to reopened

You're right about the characters, they are technically illegal. The bug was incorrectly described. The expected behavior when the proxy identifies an illegal character is to mirror the behavior of an unproxied browser, which appears to be to drop everything in the URL trailing and including the illegal character.

Compare: http://uj3wazyk5u4hnvtk.onion/torrent/19562797/Vikings.S05E07.HDTV.x264-KILLERS[ettv] and http://thepiratebay.i2p/torrent/19562797/Vikings.S05E07.HDTV.x264-KILLERS[ettv]

So, to summarize:

  • Illegal characters in URLs passed through the proxy should be dropped, and anything trailing
  • A warning level log entry could indicate which characters have been silently dropped and the resulting modified URL

comment:4 Changed 17 months ago by Reportage

  • Summary changed from http proxy marks legitimate characters in URI as invalid to http proxy should silently drop illegal characters + trailing text in URI rather than mark as invalid

comment:5 Changed 17 months ago by zzz

  • Priority changed from major to minor

So you're asking that we be 'liberal in what we accept'.

Some history:
We used to do our own URI parsing in the proxy. That led to a lot of bugs, corner cases, and possible security issues. So we switched to the Java URI class quite a while ago. It happens to be a strict parser.

One result is that we now rely on browsers to be strict in what they output. I think this is reasonable, they have years and years of experience in dealing with malformed input and fixing it up into something compliant.

Unfortunately you didn't say what browser you are testing with. I did some testing using Firefox 57.0.3 on linux. I didn't see that it ever dropped illegal chars and things trailing, at least when entering in the URI bar. I didn't try creating a html page with bad links and clicking on them. So I don't see any basis for your proposal, and silently dropping things could have other security issues.

Interestingly, trac's URI parser for this page includes the [ but drops the ] , see your comment 3 above.

What I did find is that for the most part, firefox does do the fixup. Illegal chars are %-encoded before sending to the proxy.

But for the 8 chars in the 'unwise' section of RFC 2396 2.3.4, it's a grab bag:

{}` are %-encoded
[]| are passed thru unchanged resulting in URI error in our proxy
\ is converted to /

RFC 2396 has been replaced by 3986 but does not appear to affect things.
I checked the javadocs for URI in Java 7, 8, and 9, and they all still reference 2396, like the javadoc I linked in comment 2.

Here's an example of the exception thrown from URI, we could output the message part of it to the error page, do you think that would be an improvement?

java.net.URISyntaxException: Illegal character in path at index 17: http://stats.i2p/]foo
	at java.net.URI$Parser.fail(URI.java:2848)
	at java.net.URI$Parser.checkChars(URI.java:3021)
	at java.net.URI$Parser.parseHierarchical(URI.java:3105)
	at java.net.URI$Parser.parse(URI.java:3053)
	at java.net.URI.<init>(URI.java:588)
	at net.i2p.i2ptunnel.I2PTunnelHTTPClient.clientConnectionRun(I2PTunnelHTTPClient.java:485)
	at net.i2p.i2ptunnel.I2PTunnelClientBase$BlockingRunner.run(I2PTunnelClientBase.java:807)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)

In summary: Firefox does the fixup for us, for all but a few chars.
I don't know why firefox doesn't fixup []| chars for us. For further research.

comment:6 Changed 17 months ago by Reportage

I'm testing in Firefox, which does display the offending characters in the URL bar, though from testing urls containing square braces, it appears that they aren't parsed/required to reach the url, which is why I suggested dropping the character + trailing text to mirror the behavior I'm seeing in Firefox.

e.g. http://uj3wazyk5u4hnvtk.onion/torrent/19562797/Vikings.S05E07.HDTV.x264-KILLERS[ettv] can also be reached via http://uj3wazyk5u4hnvtk.onion/torrent/19562797/Vikings.S05E07.HDTV.x264-KILLERS which suggests that the everything after and including the opening square brace is being displayed in the URL bar but silently dropped before it's parsed.

comment:7 Changed 17 months ago by zzz

Research (not the easiest thing to google):
Chrome doesn't escape []. Firefox used to but doesn't now.

https://stackoverflow.com/questions/40568/are-square-brackets-permitted-in-urls
https://bugs.chromium.org/p/chromium/issues/detail?id=74424

It may be that they don't escape [] in paths since they are allowed to wrap IPv6 host addresses, so it would be a little more difficult.

If we do a fixup, it would only be to escape the three chars that firefox doesn't, unless we find others.

comment:8 Changed 17 months ago by zzz

Added exception text to error page (and eepget command line output) in 7153b4300ac4ec2708b21e6834e46c067aa55af8 to be 0.9.32-18

comment:9 Changed 17 months ago by zzz

  • Milestone changed from undecided to 0.9.33
  • Resolution set to fixed
  • Status changed from reopened to closed

escape []| chars in path or query in 318433e6dbf4af518cc72271a9c0b5fe62917ded 0.9.32-18

comment:10 Changed 2 months ago by val

  • Resolution fixed deleted
  • Status changed from closed to reopened

Firefox 65 and Chromium 71 does not url-encode curly brackets ({}) in GET requests, which I2P client HTTP tunnel can't handle. Re-opening this ticket.

comment:11 Changed 2 months ago by zzz

  • Milestone changed from 0.9.33 to 0.9.40
  • Owner set to zzz
  • Status changed from reopened to accepted

comment:12 Changed 2 months ago by zzz

  • Resolution set to fixed
  • Status changed from accepted to closed

This was only an issue with the curly brackets in the query string. Firefox seemed to escape them in the path.
In 7812513c6f456db6a1a7e73a4a2bdf77c5359ba7 0.9.39-1

Note: See TracTickets for help on using tickets.