Peer Profiling and Selection in I2P

PET-CON 2009.1
DRAFT - last updated by zzz, Feb. 26, 17:00 UTC
zzz@i2pmail.org
Note - this looks like a one-hour talk, but the PET-CON limit is 10-15 minutes, so some editing is needed... Maybe this is the outline for a paper, not a 10-minute talk.

Overview of I2P

    - echelon does this part
  • Unidirectional tunnels, garlic (~= onion) routing, hidden services...
  • Started by jrandom in 2003; jrandom vanished in late 2007
  • ~800 routers == peers == users at any given time
  • ~50-100 new routers/day
  • 7 releases in 2008 (pic from stats.i2p)
  • Total usage currently: ~10MBps, ~40K tunnel hops
  • Network growth: doubled in last 6 months (pic from stats.i2p)
  • Hidden services oriented - P2P friendly - only one HTTP outproxy (~= "exit node")

Tunnel Overview

    - echelon will do this part
  • Client tunnels - for all user-generated traffic
  • Exploratory tunnels - low bw, used by the router for tunnel builds, tunnel tests, netdb queries... and for "exploration", as we will see below...
  • Tunnel lifetime 10m
  • Peers for each tunnel are selected by the tunnel's creator
  • Strict peer ordering within a tunnel (defends against the predecessor attack)
  • Tunnel build request can be rejected or dropped for a number of reasons
  • Periodic testing of each tunnel (~1KB per minute)

NetDB Overview

    - echelon will write this
  • RouterInfo for each router, and LeaseSets (~= "hidden service descriptors", not discussed here)
  • K/L/M/N/O bw classes ( <12 / 12+ / 32+ / 64+ / 128+ KBps of configured shared capacity) for each router; see the sketch after this list
  • Information only, not used for any routing decisions...
  • ...Except that K routers are not used
  • Lots of stats in there for network debugging, but NOTHING is trusted or used except IP/port (and "K")
  • Serious anonymity and DOS implications if we trust claimed capacity
  • L is the default, so about 96.5% of routers are L-O, and route for others (pic from stats.i2p)
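
The class thresholds above can be restated in code. A minimal illustration, in Java since the router itself is written in Java; the classify() helper and its name are ours, not I2P's API:

    // Hypothetical sketch of the K/L/M/N/O bandwidth-class mapping.
    // Thresholds follow the list above (KBps of configured shared capacity).
    public class BandwidthClass {
        public static char classify(int sharedKBps) {
            if (sharedKBps < 12)  return 'K';  // not used for routing
            if (sharedKBps < 32)  return 'L';  // the default class
            if (sharedKBps < 64)  return 'M';
            if (sharedKBps < 128) return 'N';
            return 'O';
        }
    }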

Peer Profiling and Tiers

  • Continuous gathering of stats, per-peer and per-tunnel
  • Keep totals and averages of each stat for several time periods (1m, 10m, 1h, 1d)
  • Re-sort into tiers every 30s
  • Stats are persistent across restarts

Peer selection within I2P is simply the process of choosing which routers on the network locally-generated messages (and the replies to those messages) will go through - in other words, which peers are used for tunnels. To accomplish this, the router maintains a database of each peer's performance, called the "profile". The router uses that data to estimate the bandwidth of a peer, how often a peer will accept tunnel build requests, and whether a peer seems to be overloaded or otherwise unable to perform reliably.

The profile contains several statistics, continuously updated. For each statistic, averages, event counts, and maximums for 1 minute, 10 minute, 1 hour, and 24 hour periods are available. The profiles are also stored locally on disk, so that they are persistent across restarts. Also, the profile stores several timestamps, such as the last time a peer was heard from, the last time it accepted a tunnel request, and many more.

Each peer has a set of data points collected about them, including statistics about how long it takes for them to reply to a network database query, how often their tunnels fail, and how many new peers they are able to introduce us to, as well as simple data points such as when we last heard from them or when the last communication error occurred.

Much, if not most, of the profile data is unused in the current software. It remains a "reserve" defense that can be easily used to enhance the router's resistance to theoretical or actual attacks, such as denial-of-service attempts, selective packet drops, network database (floodfill) disruption, and others.
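
As a data structure, such a profile might look roughly like the following sketch. The field and class names are illustrative only; the real router keeps richer "rate" objects per statistic:

    import java.io.Serializable;
    import java.util.EnumMap;
    import java.util.Map;

    // Hypothetical sketch of the per-peer profile described above: each
    // statistic tracks an average, event count, and maximum per time period,
    // and the whole object is persisted to disk across restarts.
    public class PeerProfile implements Serializable {
        enum Period { ONE_MINUTE, TEN_MINUTES, ONE_HOUR, ONE_DAY }

        static class Stat implements Serializable {
            final Map<Period, Double> average = new EnumMap<>(Period.class);
            final Map<Period, Long>   events  = new EnumMap<>(Period.class);
            final Map<Period, Double> maximum = new EnumMap<>(Period.class);
        }

        final Stat dbResponseTime     = new Stat(); // netDb query latency
        final Stat tunnelTestFailures = new Stat(); // failed tunnel tests
        final Stat newPeersIntroduced = new Stat(); // integration

        long lastHeardFrom;        // timestamps, ms since epoch
        long lastTunnelAccepted;
        long lastCommError;
        // ... many more statistics and timestamps in the real profile
    }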

  • 3 Tiers:
    • Not Failing Tier: pretty much everybody that is responsive; typically 300-500 peers
    • High Capacity Tier: peers that are Not Failing and have above-average tunnels-built-per-hour; typically 10-30 peers
    • Fast Tier: peers that are High Capacity and have above-average bandwidth-per-tunnel; typically 8-15 peers
  • Use only locally-generated traffic for these measurements, not routed traffic (we don't know whether the peer before or after us in a tunnel is a true participant or the originator or destination, so that data would not be valid)
  • The two metrics above are relative, not absolute, as they depend on traffic
  • Each tier contains a minimum of 8 peers; peers are 'promoted' from the tier below if a tier is too small (sample table from profiles.jsp below; a sketch of the re-sort follows the table)
    Peer (385, hiding 51)  Groups (Caps)                    Speed      Capacity  Integration  Failing?
    16BDe7                 Fast, High Capacity (OR 0.6.5)   16,582.88     37.47         0.00
    2Mnb~b                 Fast, High Capacity (OR 0.6.5)    2,944.37     40.74         0.00
    2hHaG5                 Fast, High Capacity (OR 0.6.5)    7,103.61     34.30         0.00
    CIIICF                 Fast, High Capacity (OR 0.6.5)   18,248.68     14.64       397.00
    KQ~Tdx                 Fast, High Capacity (OR 0.6.5)    8,025.64      8.97         0.00
    M92fGW                 Fast, High Capacity (OR 0.6.5)    2,623.86     44.22         0.00
    g247zU                 Fast, High Capacity (OR 0.6.5)   18,113.82     82.02         0.00
    nF6ArZ                 Fast, High Capacity (OR 0.6.5)    1,776.98     46.54         0.00

    1oDXZQ                 High Capacity (OR 0.6.5)          3,722.18     10.45         0.00  4/20 Test Fails
    BE-r13                 High Capacity (OR 0.6.5)          4,504.26     37.51         0.00  8/56 Test Fails
    LHxb70                 High Capacity (LR 0.6.5)          2,459.38     19.63         0.00
    OPZ3i5                 High Capacity (LR 0.6.5)            321.74      7.39         0.00
    UM5Wvv                 High Capacity (NR 0.6.5)          2,137.30     11.03         0.00
    er9GCL                 High Capacity (LR 0.6.5)            207.26      6.81         0.00
    vXE9dG                 High Capacity (LR 0.6.5)            241.74     16.22         0.00
    xHKlu4                 High Capacity (MR 0.6.5)              0.00      7.40         0.00

    -L9FoJ                 Not Failing (OR 0.6.5)              626.84      5.21         0.00
    -Nvi-8                 Not Failing (LR 0.6.5)                0.00      1.00         0.00
    -dlmDx                 Not Failing (OR 0.6.5)            2,290.46      5.09         0.00
    -qO3ah                 Not Failing (LU 0.6.5)                0.00      5.00         0.00
    -yUfot                 Not Failing (KR 0.6.5)                0.00      5.00         0.00
    02YO6l                 Not Failing (LR 0.6.5)                0.00      5.00         0.00  Unreachable
    064wSq                 Not Failing (LU 0.6.5)                0.00      5.00         0.00
    08vuC5                 Not Failing                           0.00      1.00         0.00  Unreachable
    0rmD6z                 Not Failing (NR 0.6.5)            5,057.75      0.00         0.00  3/6 Test Fails
    0uOOg6                 Not Failing (LR 0.6.5)               23.98      5.00         0.00
    0xl1nh                 Not Failing (NR 0.6.5)                0.00      5.00         0.00
    12oGTF                 Not Failing (LR 0.6.5)                0.00      5.00         0.00
    135LOS                 Not Failing (LR 0.6.5)              154.58      5.60         0.00
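
The 30-second re-sort might be sketched as below. The "above-average" thresholds are relative, per the list above; which population each average is taken over, and the details of the promotion logic, are our assumptions:

    import java.util.ArrayList;
    import java.util.Comparator;
    import java.util.List;

    // Hypothetical sketch of the periodic re-sort into tiers. Each tier is
    // padded to a minimum of 8 peers by promoting the best candidates from
    // the tier below.
    class TierSorter {
        static final int MIN_TIER_SIZE = 8;

        void resort(List<Peer> notFailing) {
            // High Capacity: Not Failing peers with above-average capacity.
            double avgCapacity = notFailing.stream()
                    .mapToDouble(p -> p.capacity).average().orElse(0);
            List<Peer> highCap = new ArrayList<>();
            for (Peer p : notFailing)
                if (p.capacity > avgCapacity) highCap.add(p);
            promote(highCap, notFailing,
                    Comparator.comparingDouble((Peer p) -> p.capacity).reversed());

            // Fast: High Capacity peers with above-average speed.
            double avgSpeed = highCap.stream()
                    .mapToDouble(p -> p.speed).average().orElse(0);
            List<Peer> fast = new ArrayList<>();
            for (Peer p : highCap)
                if (p.speed > avgSpeed) fast.add(p);
            promote(fast, highCap,
                    Comparator.comparingDouble((Peer p) -> p.speed).reversed());
            // fast is a subset of highCap, which is a subset of notFailing
        }

        // Pad 'tier' to MIN_TIER_SIZE with the best peers from 'pool'.
        private void promote(List<Peer> tier, List<Peer> pool, Comparator<Peer> best) {
            List<Peer> candidates = new ArrayList<>(pool);
            candidates.removeAll(tier);
            candidates.sort(best);
            for (Peer p : candidates) {
                if (tier.size() >= MIN_TIER_SIZE) break;
                tier.add(p);
            }
        }

        static class Peer { double speed, capacity; }
    }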

Rating Details

  • Speed - simple - the average amount of data per tunnel we have sent through the peer in the last minute

    The speed calculation (as implemented here) simply goes through the profile and estimates how much data we can send or receive on a single tunnel through the peer in a minute. For this estimate it just looks at past performance. Previous algorithms used a longer time, weighing more recent data more heavily, and extrapolating it for the future. Another previous calculation used total (rather than per-tunnel) bandwidth, but it was decided that this method overrated slow, high-capacity peers (that is, slow in bandwidth but high-capacity in number of tunnels). Even earlier methods were much more complex. The current speed calculation has remained unchanged since early 2006, and it is a topic for further study to verify its effectiveness on today's network. (A sketch of both calculations follows this list.)

  • Capacity - not so simple - the weighted number of successful tunnel builds through the peer. Let r(t) be the successful builds per hour, measured over the previous time period t; then R = 4*r(10m) + 3*r(30m) + 2*r(1h) + r(1d)

    The capacity calculation (as implemented here) simply goes through the profile and estimates how many tunnels we think the peer would agree to participate in over the next hour. For this estimate it looks at how many the peer has agreed to lately, how many the peer rejected, and how many of the agreed-to tunnels failed, and extrapolates the data. In addition, it includes a 'growth factor' (when the peer isn't failing or rejecting requests) so that we will keep trying to push their limits. The tunnel-failure part of the calculation was disabled for a long period, but it was reinstated for release 0.6.2, as it became apparent that recognizing and avoiding unreliable and unreachable peers is critically important. Unfortunately, as the tunnel tests require the participation of several peers, it is difficult to positively identify the cause of a failure. The recent changes assign a probability of failure to each of the peers, and use that probability in the capacity calculation. These recent changes may require additional adjustment, and are a topic for further study to verify their effectiveness on today's network. (See the sketch after this list.)

  • Growth factor adds a little bit each time, so that new peers are tried
  • Not required for measurement: any special-purpose 'load'
  • Not used for rating: latency, ....
  • "Bonuses" can be used to manually adjust preferences for individual peers

Capacity: Crime, Blame, and Punishment

  • Raw build capacity isn't sufficient - we have to decrement for bad behavior, because tunnel build attempts are expensive
    • A peer that accepts 10/10 tunnels is better than one that accepts 10/100.
    • A peer that explicitly rejects a request is better than one that drops it.
    • A peer that accepts a request but later drops data through the tunnel is the worst.
  • Punishment: Capacity is decremented for build request rejections, build request timeouts, and tunnel test failures.
  • Problem: We don't know who to blame when a request or tunnel test message is dropped
    • Naive solution: Don't bother blaming anybody. This doesn't work well at all!
    • Much better solution: Blame everybody, with a weighted value. The bad peers will be discovered over time.
  • Example 1: Tunnel build request expired (3 outbound hops A-B-C), reply was due back through our inbound exploratory tunnel (2 hops D-E). Blame A, B, and C with weight 1/3, blame D-E with weight 1/2. Alternately: It was probably D's fault, because that's the usual failure point in I2P (the gateway of the inbound tunnel can't be reached), so blame D with weight 3/4 and E with weight 1/4. (pic needed here)
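
Example 1 in code, as a sketch: addFailureProbability() is a hypothetical stand-in for however the profile accumulates blame, and the weights come straight from the example above.

    import java.util.List;

    class BlameAssigner {
        interface Profile {
            void addFailureProbability(double weight);
        }

        // A build request expired: blame every hop of the outbound tunnel
        // equally (1/3 each for A-B-C) and every hop of the inbound reply
        // tunnel equally (1/2 each for D-E).
        // Alternative (per the text): weight the inbound gateway D more
        // heavily, e.g. 3/4 vs. 1/4, since it is the usual failure point.
        void blameExpiredBuild(List<Profile> outboundHops, List<Profile> inboundReplyHops) {
            for (Profile p : outboundHops)
                p.addFailureProbability(1.0 / outboundHops.size());
            for (Profile p : inboundReplyHops)
                p.addFailureProbability(1.0 / inboundReplyHops.size());
        }
    }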

Peer Selection

  • Client tunnels from Fast tier
  • Exploratory tunnels from the Not Failing tier, but a varying percentage comes from High Capacity, more when build failure is high (maintains a minimum level of service while continuing to explore); see the sketch below
  • Equal weight within each tier
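
A sketch of the selection policy, assuming the tier lists produced by the re-sort above. Exactly how the High Capacity percentage varies with the recent build failure rate is our guess; here it is used directly as a probability:

    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.List;
    import java.util.Random;

    // Hypothetical sketch: client tunnels draw from the Fast tier; exploratory
    // tunnels draw mostly from Not Failing, with a slice from High Capacity
    // that grows as recent build failure increases. Peers within a tier are
    // weighted equally (uniform random pick).
    class PeerSelector {
        private final Random rnd = new Random();

        List<Peer> selectClientHops(List<Peer> fastTier, int hops) {
            return pick(fastTier, hops);
        }

        List<Peer> selectExploratoryHops(List<Peer> notFailing, List<Peer> highCap,
                                         double recentBuildFailureRate, int hops) {
            List<Peer> chosen = new ArrayList<>();
            for (int i = 0; i < hops; i++) {
                // More build failures -> lean harder on the High Capacity tier.
                boolean useHighCap = rnd.nextDouble() < recentBuildFailureRate;
                chosen.addAll(pick(useHighCap ? highCap : notFailing, 1));
            }
            return chosen;
        }

        private List<Peer> pick(List<Peer> tier, int n) {
            List<Peer> shuffled = new ArrayList<>(tier);
            Collections.shuffle(shuffled, rnd);
            return shuffled.subList(0, Math.min(n, shuffled.size()));
        }

        static class Peer { }
    }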

How It Works, Lessons for Other Implementations

Great! The core system has been in place since the beginning of I2P, and it includes pretty much everything proposed in "A Tune-up for Tor" (ref).

Stability - The Fast and High Capacity tier members don't change rapidly. This is because when a router uses a peer for tunnels, it tends to increase that peer's speed and capacity metrics. This keeps it in the pool. This is a desirable quality for anonymity, as using a large pool for tunnel building tends to assist predecessor attacks (ref).

The tier system is fast-reacting to individual peer failure or overload, and to increased local demand for bandwidth. The speed and capacity metrics are highly weighted for recent performance. When a peer starts to drop test messages, or fails to respond to tunnel build requests, it will quickly be demoted out of the high-capacity pool. As bandwidth demand increases, the speed metric for individual peers will rapidly increase, and the fast tier will quickly become reorganized to include the newly-recognized fast peers.

The tier system tends to use the highest-bandwidth peers when the network is not congested. As congestion increases, the total network traffic "spreads out" to lower-capacity peers. From an overall network perspective, this is optimal as it maintains a similar level of service for all routers.

The profiling system does not over-optimize. The router uses its own, normally-generated traffic for peer profiling. No high-bandwidth test messages are required or used. When a router does not require high bandwidth or a high number of tunnels, the metrics for each peer are correspondingly lower. Therefore, even a low-bandwidth peer may be classified as "fast" and used for tunnels. This tends to, at least partially, spread low-bandwidth tunnels broadly across the network, and leaves the high-capacity peers for high-speed traffic. However, even very-low-bandwidth routers tend to accurately find a few fast peers and thus are well-prepared when higher bw is demanded.

The profiling system is persistent across restarts, and maintains measurements for the last 24 hours. This allows a recently-started router to quickly re-integrate to the network, whether the router was just restarted or has been down for some time.

The parameters, weighting, and calculations are VERY powerful "knobs" to turn. They are difficult to test and debug in a distributed network, impossible to fully optimize, and can take months or years to get right. The I2P router contains a framework for local network simulation and testing; however, we have not used this framework for profiling and selection testing. As described above, the routers include bandwidth and tunnel build success statistics in the network database entry they publish to the floodfill routers. While this information is not trusted or used in the router, it is gathered by the stats.i2p website (ref). On that website, several network performance graphs are presented, and the I2P developers make heavy use of this facility to monitor the network and judge the effectiveness of software changes in each release.

Here are some other lessons for those wishing to "tune up" their own network.

The basic measurements are much simpler than they used to be. The speed calculation, for example, was at one time over 400 lines of code; it is now essentially a one-liner. The punishment for bad behavior is what keeps the network running well, and it is also an area for further research. How heavily you punish determines how fast the load spreads out across the network as it gets busy, and how quickly an overloaded peer gets avoided. The implementation contains an implicit estimate of the cost of a tunnel build request, as it rates the relative performance of a rejected request versus a dropped request, or, alternatively, an accepted request versus a request not made at all.

Another way of looking at it is to establish a baseline of a peer that we have not made any requests of; call that a capacity rating of zero. What percentage (or absolute number) of request rejections, dropped requests, and test failures is required to drop the capacity rating below zero? If peers are punished too heavily, the network will tend toward congestion collapse, as most peers are driven to negative capacity ratings and tunnel load spreads quickly throughout the network while routers attempt to route tunnels through very-low-capacity peers. If peers are punished too lightly, routers will be slow to react to overloaded peers, and will maintain the same high-capacity peers for too long. In other words, a router will accept poor performance from a peer even when better peers may be available.
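
As an illustrative example (the numbers here are ours, not the router's): take the untried-peer baseline as zero and suppose a rejection costs as much as a successful build earns. A peer that accepts 10 of 100 requests then rates 10 - 90 = -80, far below the baseline, and is avoided almost immediately. If a rejection instead costs one-tenth of a build, the same peer rates 10 - 9 = +1 and keeps receiving requests. The first weighting reacts quickly but pushes load off busy peers aggressively; the second is forgiving but slow to route around overload.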

Start simple and add complexity only if necessary; you will probably have to simplify again later (for example, our speed calculation used to be pages and pages of code, and is now a one-liner).

You don't need to generate large amounts of special-purpose data for testing; normal traffic will do. Our exploratory tunnels average only 100KB in 10 minutes (~150Bps), and our unused client tunnels average only 10KB in 10 minutes from tunnel tests (~15Bps), but this is sufficient. When you are low-bandwidth, you don't care how accurately you sort into tiers; when you push a lot of data, your tiering will be better. Even very-low-bandwidth routers tend to accurately find fast peers and thus are well-prepared when higher bandwidth is demanded.

References

  • This paper includes material adapted from http://www.i2p2.de/, written by jrandom and others.
  • Robin Snader and Nikita Borisov, "A Tune-up for Tor: Improving Security and Performance in the Tor Network", NDSS 2008, University of Illinois at Urbana-Champaign.
  • I2P website: http://www.i2p2.de/