Version 27 (modified by zzz, 8 years ago)


Comments on thesis

General comments:

*please add*

Specific comments:

Sec. 3.1:

A routerInfo does not contain a "self-signed certificate". It contains a pair of public keys and a "null certificate", which is really just a placeholder for future stuff. It is, however, signed by one of the router's private keys.

Sec. 3.3 Eepsites:

The "identifier" (what we call a "destination") is actually 387 bytes in binary and 516 characters when encoded in Base 64, not 517 bytes. Leasesets are looked up in the netDB using the 32-byte SHA256 Hash of the destination.
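The sizes above can be checked with a short sketch. The random bytes are only a stand-in for a real destination, and the alternate Base 64 alphabet ('-' and '~' in place of '+' and '/') is an assumption about I2P's encoding; it does not affect the lengths.

```python
import base64
import hashlib
import os

DEST_LEN = 387  # binary destination length in bytes, as stated above

# Placeholder: random bytes stand in for a real destination.
destination = os.urandom(DEST_LEN)

# Assumption: I2P's Base 64 alphabet swaps '+' and '/' for '-' and '~'.
encoded = base64.b64encode(destination, altchars=b"-~").decode("ascii")

# The netDB lookup key is the SHA256 hash of the binary destination.
lookup_key = hashlib.sha256(destination).digest()

print(len(encoded))     # 516 characters (387 = 3 * 129, so no padding)
print(len(lookup_key))  # 32 bytes
```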

Sec. 4.1:

Long paths might be much harder than in ref. 22; our absolute limit is 7 hops (due to the max length of the tunnel build request message). As of 0.8.4, there are further restrictions enforced: Router A will not build a tunnel A-A. An unmodified router B will not build a tunnel A-B-A (although a hostile B could build this tunnel). A tunnel A-B-C-A cannot be prevented even with non-hostile B and C. So the longest without cooperation would be A-B-C-A-D-E-A. More complex long paths using multiple tunnels or garlic messages may be possible, although there are also maximum message expiration times enforced.
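As a rough sketch of the restrictions just listed (not the router's actual code), the check below treats each entry in a path as one hop, which is an assumption, and rejects anything an unmodified router would refuse to build. The function name and constant are hypothetical.

```python
MAX_HOPS = 7  # absolute limit cited above (assumed to count path entries)

def path_is_buildable(path):
    """Return True if honest, unmodified routers would all agree to build it."""
    if len(path) > MAX_HOPS:
        return False
    for i in range(1, len(path)):
        # A router will not build a tunnel to itself (A-A).
        if path[i] == path[i - 1]:
            return False
        # An unmodified middle router refuses when its previous and
        # next hops are the same router (A-B-A).
        if i + 1 < len(path) and path[i - 1] == path[i + 1]:
            return False
    return True

print(path_is_buildable(list("ABCADEA")))  # True: the longest case above
print(path_is_buildable(list("ABA")))      # False
```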

Actually, I2P doesn't use peers from the same /16 in the same tunnel. It does allow multiples in the fast tier. Since your attack doesn't require two attackers in the same tunnel, the /16 restriction may not be relevant here.
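For illustration only, the /16 rule for a single tunnel could be expressed as below. This is a sketch for IPv4 addresses with hypothetical function names, not the router's peer selection code.

```python
import ipaddress

def same_slash16(ip_a, ip_b):
    """True if two IPv4 addresses share the same /16 prefix."""
    net_a = ipaddress.ip_network(f"{ip_a}/16", strict=False)
    return ipaddress.ip_address(ip_b) in net_a

def tunnel_respects_slash16(peer_ips):
    """True if no two peers in the tunnel come from the same /16."""
    prefixes = [ipaddress.ip_network(f"{ip}/16", strict=False) for ip in peer_ips]
    return len(set(prefixes)) == len(prefixes)

print(same_slash16("192.0.2.1", "192.0.77.9"))  # True
print(tunnel_respects_slash16(["192.0.2.1", "198.51.100.7", "203.0.113.5"]))  # True
```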

Fig. 4.2: The outbound tunnel is labeled as inbound. Also, the "monitor peers" from Fig. 4.1 with red and black stripes are now labeled "A" in this figure, which is confusing.

Sec. 5:

You say that each peer was configured for 64 KBps max but isn't that true only for the 40 attack peers? What was the bandwidth configuration for the 30 monitor peers? Was 64 KBps really high enough to be included in the victim's fast tier? Or were the monitor peers modified such that they would only provide "fast" service for the victim?

Figures 5.4 and 5.5: What's the difference between these two figures? Just two different examples?

Table 5.5:

What about 3-hop, which is the default for eepsites? 2-hop eepsites are not very secure and 1-hop is trivial. Really need data for 3-hop.

"for the duration of the measurement": How long was it: minutes, hours, days? The time-to-deanonymize would be good to include here. It isn't clear whether you deanonymize in one tunnel lifetime (10 minutes) or whether it takes multiple successful placements of the monitor peers over a long time period.

Sec 6 Discussion:

The I2P network is still relatively small but is growing quickly. How about a prediction or sensitivity analysis for a network 10X, 100X larger? The analysis starts with "a" monitor peers out of the victim's fast pool of 30 peers. There's no analysis or discussion of how many monitor peers of a given bandwidth you need in the entire network to attain the number "a" in the victim's fast tier.

In fact, most fast peers are from a Class "O" (greater than 128 KBytes/sec) group of routers and those are about 20% of the network - so there's perhaps 400 peers that could potentially be in the fast group in today's network of 2000 - 3000 routers.
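To make the missing analysis concrete, here is a deliberately crude sketch. It assumes the victim's 30-peer fast tier is a uniform random sample of the roughly 400 fast-capable routers estimated above; real selection is performance-based, so this is only a first-order illustration. The function names are hypothetical.

```python
def expected_hostile_in_fast_tier(hostile, pool, tier_size=30):
    """Expected hostile peers in the fast tier under uniform sampling
    without replacement (mean of a hypergeometric distribution)."""
    if not 0 <= hostile <= pool:
        raise ValueError("hostile must be between 0 and pool")
    return min(tier_size, pool) * hostile / pool

def hostile_needed(target, pool, tier_size=30):
    """Smallest hostile count whose expected fast-tier presence reaches target."""
    return -(-target * pool // tier_size)  # integer ceiling division

print(expected_hostile_in_fast_tier(40, 400))  # 3.0
print(hostile_needed(3, 400))                  # 40
```

Under the same assumption, a 10X larger pool of fast routers would need 10X as many hostile peers for the same "a".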

So isn't this really about an adversary taking over a large proportion of the entire network, or at least of the network's fast routers? Is I2P any more vulnerable at X% hostile peers compared to other networks? Once you have a large number of hostile fast peers in the network, is the traffic analysis of your attack any quicker or more reliable than other attacks, e.g. first and last node in a tunnel (ref: "one ping enough" paper or blog post about Tor)?

Also not discussed - effect of leaseset size (number of leases or inbound tunnels), which is user-configurable from 1 to 6. It can also be configured to be dynamic, with fewer leases when the server is idle. A high number of leases makes it quicker for an adversary to enumerate the fast peers. I assume you used the default setting of 2 leases for your experimental victim.

Also not discussed - you started from the end, i.e. the identity of the victim, then your whole experiment was to confirm it. To find the victim from scratch would require another O(n) in time or resources, where n is the size of the network, i.e. you have to run the experiment on each router in the network.

Unidirectional tunnels as a "bad design decision":

A bold and absolute statement not fully supported by the paper. Unidirectional tunnels clearly mitigate other attacks, and it's not clear how to trade off the risk of the attack described here (either currently or after implementing one or more of the recommendations below) against the risk of attacks on a bidirectional tunnel architecture.

Also, given future increases in network size and implementation of recommendations, the tradeoff of analysis time vs. false-positive rate may change. For example, a 30x increase in analysis time for unidirectional tunnels is not insignificant. What if the fast tier size was increased to 1000? Then the time would be 1000x.

Paper's recommendations:

1) Limit churn:

Possibilities: Increase 45 sec evaluation cycle, increase 30-peer fast max and/or 75-peer high-cap max. Downside: larger group makes it more likely for an attacker to eventually be in a tunnel.

Not a possibility: Increasing 10-minute tunnel lifetime (unfortunately it is essentially hard-coded in the network now).

2) Distributed HTTP services:

This is supported via "multihoming", whereby multiple routers may host an eepsite. This requires some additional setup, and of course requires the user to operate multiple routers. Truly distributed hosting is under development through a port of the Tahoe-LAFS distributed file system to I2P.

3) Use random peers for leases (guard nodes):

By this you mean, I think, using random peers outside the fast tier for the inbound tunnel's gateway. We could also keep these peers semi-constant, or more stable, by attempting to recreate the same tunnel at expiration, while still changing them on rejection. This could be done either from the fast tier or by using a random peer. Perhaps each destination could maintain its own "guard tier" that changes slowly.

Benefits / downsides? What happens if an adversary attacks guard nodes (either in I2P or Tor)?

Additional possible changes to I2P not mentioned:

1) Increase resistance to low-bandwidth tunnel building DDoS attack:

Changes made in 0.8.4, more coming in 0.8.5, further research needed

2) Connection limits (limiting number of requests per client in a minute, hour, or day) are supported in I2P but not enabled by default. These limits, if enabled, would prevent a single client from making requests every 15 seconds forever, although a distributed attack would still be possible. In addition, the server currently sends a connection reset if the limit is exceeded; it would have to drop the request instead.

3) Would server-imposed response delays help?

4) Disallow multiples from the same /16 in the fast tier

5) Increase fast and high-capacity tier maximum sizes
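The connection limits in item 2 above can be sketched as a per-client sliding window that silently drops excess requests instead of answering them. This is an illustrative sketch only; the class, its parameters, and the example limit are hypothetical and do not reflect I2P's actual implementation.

```python
import time
from collections import defaultdict, deque

class ConnectionLimiter:
    """Per-client sliding-window limiter (hypothetical helper)."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self._max = max_requests
        self._window = window_seconds
        self._clock = clock
        self._history = defaultdict(deque)  # client id -> request timestamps

    def allow(self, client_id):
        """Return True to serve the request, False to drop it silently."""
        now = self._clock()
        recent = self._history[client_id]
        # Forget requests that have aged out of the window.
        while recent and now - recent[0] >= self._window:
            recent.popleft()
        if len(recent) >= self._max:
            return False  # caller drops the request; no reset is sent
        recent.append(now)
        return True

# Example limit (arbitrary): at most 20 requests per client per hour.
limiter = ConnectionLimiter(max_requests=20, window_seconds=3600)
```

As noted above, a limiter of this kind only constrains a single client; requests spread across many client identities would still get through.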

Sec 7 Conclusion:

What the "devs decided":

1) Timetable of 0.8.4 release:

Released March 2, installed in 25% of network by ~March 4, 50% by ~March 6, 75% by ~March 14 (source )

2) Relevant changes in 0.8.4 release:

a) Prevent tunnel-building DoS by a single source. This was done in reaction to the attack.

b) Penalize peers more due to tunnel rejections. This did not change the time constants of the capacity formulas, just changed (a + r) to (a + 2r) in the denominator of the formula in section A.1. However, it may have had the effect of reacting faster to a DoS attack. This change was not made in reaction to the attack, but was previously planned and is part of a strategy to spread the traffic across more peers in the network and adjust the formula in response to network conditions that have changed markedly in the past two years.

3) More changes to detect and prevent DoS are upcoming in 0.8.5 (scheduled for release the week of April 18) but these are not a complete solution. A fully distributed tunnel-building DDoS is difficult to prevent completely.

Reacting to performance changes as a "bad idea":

A bold and absolute statement not fully supported by the paper. Clearly there are steps we can take to mitigate the attack, especially by detecting and reacting to the DDoS and other items discussed at the end of Section 6. Also, the network is still very small. We aren't making absolute claims about our anonymity, especially when the whole network is 2000 - 3000 routers. Yes, we still have work to do to ensure anonymity while doing performance-based peer selection, and there are tradeoffs involved. But don't write it off as a "bad idea". A more nuanced statement would be that anonymity vs. performance is a fundamental tradeoff, that the speed of reaction to a peer's perceived performance change affects that tradeoff, and that perhaps I2P's reaction time is too fast for the anonymity that its users expect and should be adjusted.

Sec. A.2 Integration value:

This isn't used in I2P for anything except a display and isn't relevant to the paper. You may also wish to remove the information about the well-integrated tier from sections 3.2.4, 3.2.5, and B.1.