
Peer Profiling and Selection in I2P

PET-CON 2009.1
DRAFT 2009-02-26 20:00 UTC
zzz@i2pmail.org

Abstract

The I2P anonymous communication network uses a system of peer profiling for selecting a subset of peers for tunnel building. This system maintains anonymity while optimizing bandwidth. The system uses opportunistic measurement of both per-tunnel bandwidth and the number of tunnels accepted. The profiling and selection system has been used in I2P since early 2004, and while it continues to be adjusted, the fundamentals and effectiveness of the system have been demonstrated.

Here we document the algorithms used in I2P's peer profiling and selection, discuss the strengths and weaknesses of the system, and describe opportunities for future work. We write this paper in the hope that it will be useful for guiding improvements in I2P and other anonymous networks with similar features.

Overview of I2P

    - echelon does this part
  • Unidirectional tunnels, garlic (~= onion) routing, hidden services...
  • Started by jrandom in 2003; jrandom vanished in late 2007
  • ~800 routers == peers == users at any given time
  • ~50-100 new routers/day
  • 7 releases in 2008 (pic from stats.i2p)
  • Total usage currently: ~10 MByte/sec, ~40K tunnel hops
  • Network growth: doubled in last 6 months (pic from stats.i2p)
  • Hidden services oriented - P2P friendly - only one HTTP outproxy (~= "exit node")

I2P

Abstract

In the following we explain the I2P network, a message-based, low-latency mix network built for anonymity and security. In particular, we describe the functions used to measure and rank the capacity and speed of the network nodes. We show that this approach works better than the fixed bandwidth ratings used, for example, in the Tor network.

I2P Overview

I2P itself is an overlay network based on unidirectional tunnels routed between the I2P nodes. The garlic routing used is similar to the onion routing used by Tor: data is transmitted from hop to hop until the endpoint of the tunnel is reached. In contrast to Tor, I2P uses almost exclusively hidden services, with the addition of a few outproxies operated by private individuals. The ancestor of the I2P project was IIP, a specialized IRC server and clients transmitting their data among themselves via a mix network. I2P was first started as a specialized low-latency transport protocol for The Freenet Project but soon grew into an independent software project, driven by its author jrandom from 2003 until the end of 2007. Other people joined and left in the meantime, and after jrandom left the project at the end of 2007, zzz and complication took over active development. The codebase itself is public domain or licensed under the GPL with exception, MIT, or other free licenses. Most code is written in Java to support the three main operating systems: Windows, Linux, and MacOS X. To support 1024-bit public-key encryption, external libraries such as GMP (the GNU MP Bignum Library) are interwoven with the Java code. The codebase is under active development, with 7 releases in 2008 and more than 100 development builds between those public releases. Besides the core router itself, some bundled software (e.g. I2PSnark, a BitTorrent client based on Snark) is maintained by zzz and complication.

At the time of writing, 830 active nodes participate in the I2P network. This number varies over time, as 50-100 new nodes join the network every day and others vanish. On weekends the network typically peaks at about 1000 active nodes, while on weekday nights it drops to a low of about 700 active routers. This shows a quite stable base network on which the ~550 active destinations can operate. I2P is oriented towards hidden services: all destinations are inside the network, hosted on the nodes themselves. Only one HTTP outproxy (an exit node in Tor terms) is known to be running, alongside an IRC gateway for I2P development discussion. In contrast to Tor, I2P is peer-to-peer friendly and distributes the encrypted traffic among all participating nodes. The current bandwidth used by all nodes is roughly 11 MByte/sec, with peaks at 16 MByte/sec, and the network builds ~40,000 tunnels producing this load. In the last 6 months the network grew from 400 to 800 nodes, from 5 to 10 MByte/sec, and from ~20,000 to ~40,000 tunnels; the vital statistics of the network doubled in half a year, and growth continues to be stable.

Tunnel Overview

    - echelon will do this part
  • Client tunnels - for all user-generated traffic
  • Exploratory tunnels - low bw, used by the router for tunnel builds, tunnel tests, netdb queries... and for "exploration", as we will see below...
  • Tunnel lifetime 10m
  • Selected by each peer
  • Strict ordering within a tunnel (predecessor attack)
  • Tunnel build request can be rejected or dropped for a number of reasons
  • Periodic testing of each tunnel (~1KB per minute)
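
To make the testing bullet above concrete, here is a minimal sketch (in Java, the language most of the router is written in) of scheduling a ~1 KB test message through each tunnel roughly once per minute and feeding the result into the peer profiles. The Tunnel and PeerProfile interfaces and all method names are hypothetical, not the actual I2P router API.

    // Hypothetical sketch of periodic tunnel testing; names are illustrative only.
    import java.util.List;
    import java.util.Random;
    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;

    public class TunnelTester {
        private final ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        private final Random random = new Random();

        /** Schedule a ~1 KB test message through each live tunnel about once per minute. */
        public void start(List<Tunnel> tunnels) {
            timer.scheduleAtFixedRate(() -> {
                for (Tunnel t : tunnels) {
                    byte[] probe = new byte[1024];
                    random.nextBytes(probe);               // dummy payload, ~1 KB
                    boolean ok = t.sendTestMessage(probe); // round-trip through the tunnel
                    for (PeerProfile p : t.peerProfiles()) {
                        if (ok) p.recordTunnelTestSuccess();
                        else    p.recordTunnelTestFailure(); // feeds the capacity penalty later on
                    }
                }
            }, 0, 1, TimeUnit.MINUTES);
        }

        /** Minimal interfaces assumed by this sketch. */
        interface Tunnel {
            boolean sendTestMessage(byte[] payload);
            List<PeerProfile> peerProfiles();
        }
        interface PeerProfile {
            void recordTunnelTestSuccess();
            void recordTunnelTestFailure();
        }
    }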

NetDB Overview

    - echelon will write this
  • RouterInfo for each router, and LeaseSet (~= "hidden service descriptor"), not discussed here
  • K/L/M/N/O bw classes ( <12 / 12+ / 32+ / 64+ / 128+ KBps of configured shared capacity) for each router (see the sketch after this list)
  • Information only, not used for any routing decisions...
  • ...Except that K routers are not used
  • Lots of stats in there for network debugging, but NOTHING is trusted or used except IP/port (and "K")
  • Serious anonymity and DOS implications if we trust claimed capacity ("low-resource attacks")
  • L is the default, so about 96.5% of routers are L-O, and route for others (pic from stats.i2p)
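
The K/L/M/N/O classes listed above are a straightforward mapping from configured shared capacity to a letter; a sketch of that mapping follows. The method name is hypothetical; only the thresholds come from the list above.

    // Illustrative mapping from configured shared bandwidth to the published
    // K/L/M/N/O class, using the thresholds listed above (<12 / 12+ / 32+ / 64+ / 128+ KBps).
    public class BandwidthClass {
        /** @param sharedKBps configured shared capacity in KBytes/sec */
        public static char classify(int sharedKBps) {
            if (sharedKBps >= 128) return 'O';
            if (sharedKBps >= 64)  return 'N';
            if (sharedKBps >= 32)  return 'M';
            if (sharedKBps >= 12)  return 'L';
            return 'K';  // K routers are not used for routing by others
        }

        public static void main(String[] args) {
            System.out.println(classify(16));  // L (the default class)
            System.out.println(classify(200)); // O
        }
    }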

Peer Profiling and Tiers

  • Continuous gathering of stats, per-peer and per-tunnel
  • Keep totals and average of each stat for several time periods (1m, 10m, 1h, 1d)
  • Re-sort into tiers every 30s
  • Stats are persistent across restarts

Peer profiling was proposed for the I2P network in December 2003 (ref) by jrandom. It was introduced in the 0.3 release in March 2004 (ref).

Peer selection within I2P is simply the process of choosing which routers on the network to route locally-generated messages (and replies to those messages) through. In other words, which peers are used for tunnels. To accomplish this, the router maintains a database of each peer's performance, called the "profile". The router uses that data to estimate the bandwidth of a peer, how often a peer will accept tunnel build requests, and whether a peer seems to be overloaded or otherwise unable to perform reliably.

The profiling system includes mechanisms similar to the "opportunistic bandwidth measurement algorithm" proposed for Tor (ref), and much more.

The profile contains several statistics, continuously updated. For each statistic, averages, event counts, and maximums for 1 minute, 10 minute, 1 hour, and 24 hour periods are available. The profiles are also stored locally on disk, so that they are persistent across restarts. Also, the profile stores several timestamps, such as the last time a peer was heard from, the last time it accepted a tunnel request, and many more.

Each peer has a set of data points collected about them, including statistics about how long it takes for them to reply to a network database query, how often their tunnels fail, and how many new peers they are able to introduce us to, as well as simple data points such as when we last heard from them or when the last communication error occurred.
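
A minimal sketch of how such a multi-period statistic might be kept is shown below. This is not the actual I2P profile code; the class and method names are illustrative, and the pruning strategy is an assumption.

    // Sketch of a per-peer statistic kept over several periods (1m, 10m, 1h, 24h).
    import java.util.ArrayDeque;
    import java.util.Deque;

    public class PeriodStat {
        private static final long[] PERIODS_MS = { 60_000L, 600_000L, 3_600_000L, 86_400_000L };
        private final Deque<long[]> events = new ArrayDeque<>(); // each entry: {timestampMs, value}

        /** Record one event (e.g. bytes sent through a tunnel, or 1 for a tunnel accept). */
        public synchronized void addEvent(long value) {
            events.addLast(new long[] { System.currentTimeMillis(), value });
            prune();
        }

        /** Average event value over the given period (ms); 0 if no events. */
        public synchronized double average(long periodMs) {
            prune();
            long cutoff = System.currentTimeMillis() - periodMs;
            long total = 0;
            int count = 0;
            for (long[] e : events) {
                if (e[0] >= cutoff) { total += e[1]; count++; }
            }
            return count == 0 ? 0.0 : (double) total / count;
        }

        /** Drop events older than the longest period (24 hours). */
        private void prune() {
            long cutoff = System.currentTimeMillis() - PERIODS_MS[PERIODS_MS.length - 1];
            while (!events.isEmpty() && events.peekFirst()[0] < cutoff) {
                events.removeFirst();
            }
        }
    }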

Much, if not most, of the profile data are unused in the current software. It remains a "reserve" defense that can be easily used to enhance the router's resistance to theoretical or actual attacks, such as denial-of-service attempts, selective packet drops, network database (floodfill) disruption, and others.

The profile data is periodically coalesced and sorted into tiers of peers that are used for various functions, as described further below. The picture below shows a page from the I2P router console, with a portion of the profiles sorted into tiers.

  • Use only my own traffic for these measurements, not routed traffic (we don't know if the peer before or after us in a tunnel is a true participant or the originator or destination, so that data wouldn't be valid)
  • The speed and capacity metrics are relative, not absolute, as they depend on the amount of locally-generated traffic
  • Each tier contains 8 minimum, peers are 'promoted' if the tier is too small (pic: sample table from profiles.jsp)
Peer (385, hiding 51) | Groups (Caps) | Speed | Capacity | Integration | Failing?
16BDe7 | Fast, High Capacity (OR 0.6.5) | 16,582.88 | 37.47 | 0.00 |
2Mnb~b | Fast, High Capacity (OR 0.6.5) | 2,944.37 | 40.74 | 0.00 |
2hHaG5 | Fast, High Capacity (OR 0.6.5) | 7,103.61 | 34.30 | 0.00 |
CIIICF | Fast, High Capacity (OR 0.6.5) | 18,248.68 | 14.64 | 397.00 |
KQ~Tdx | Fast, High Capacity (OR 0.6.5) | 8,025.64 | 8.97 | 0.00 |
M92fGW | Fast, High Capacity (OR 0.6.5) | 2,623.86 | 44.22 | 0.00 |
g247zU | Fast, High Capacity (OR 0.6.5) | 18,113.82 | 82.02 | 0.00 |
nF6ArZ | Fast, High Capacity (OR 0.6.5) | 1,776.98 | 46.54 | 0.00 |

1oDXZQ | High Capacity (OR 0.6.5) | 3,722.18 | 10.45 | 0.00 | 4/20 Test Fails
BE-r13 | High Capacity (OR 0.6.5) | 4,504.26 | 37.51 | 0.00 | 8/56 Test Fails
LHxb70 | High Capacity (LR 0.6.5) | 2,459.38 | 19.63 | 0.00 |
OPZ3i5 | High Capacity (LR 0.6.5) | 321.74 | 7.39 | 0.00 |
UM5Wvv | High Capacity (NR 0.6.5) | 2,137.30 | 11.03 | 0.00 |
er9GCL | High Capacity (LR 0.6.5) | 207.26 | 6.81 | 0.00 |
vXE9dG | High Capacity (LR 0.6.5) | 241.74 | 16.22 | 0.00 |
xHKlu4 | High Capacity (MR 0.6.5) | 0.00 | 7.40 | 0.00 |

-L9FoJ | Not Failing (OR 0.6.5) | 626.84 | 5.21 | 0.00 |
-Nvi-8 | Not Failing (LR 0.6.5) | 0.00 | 1.00 | 0.00 |
-dlmDx | Not Failing (OR 0.6.5) | 2,290.46 | 5.09 | 0.00 |
-qO3ah | Not Failing (LU 0.6.5) | 0.00 | 5.00 | 0.00 |
-yUfot | Not Failing (KR 0.6.5) | 0.00 | 5.00 | 0.00 |
02YO6l | Not Failing (LR 0.6.5) | 0.00 | 5.00 | 0.00 | Unreachable
064wSq | Not Failing (LU 0.6.5) | 0.00 | 5.00 | 0.00 |
08vuC5 | Not Failing | 0.00 | 1.00 | 0.00 | Unreachable
0rmD6z | Not Failing (NR 0.6.5) | 5,057.75 | 0.00 | 0.00 | 3/6 Test Fails
0uOOg6 | Not Failing (LR 0.6.5) | 23.98 | 5.00 | 0.00 |
0xl1nh | Not Failing (NR 0.6.5) | 0.00 | 5.00 | 0.00 |
12oGTF | Not Failing (LR 0.6.5) | 0.00 | 5.00 | 0.00 |
135LOS | Not Failing (LR 0.6.5) | 154.58 | 5.60 | 0.00 |

Rating Details

    Speed - simple - the average (FIXME OR IS IT PEAK) amount of data per tunnel we have sent through the peer in the last minute

    The speed calculation (as implemented here) simply goes through the profile and estimates how much data we can send or receive on a single tunnel through the peer in a minute. For this estimate it just looks at past performance. Previous algorithms used a longer time horizon, weighing more recent data more heavily, and extrapolated it into the future. Another previous calculation used total (rather than per-tunnel) bandwidth, but it was decided that this method overrated slow, high-capacity peers (that is, slow in bandwidth but high-capacity in number of tunnels). Even earlier methods were much more complex. The current speed calculation has remained unchanged since early 2006, and it is a topic for further study to verify its effectiveness on today's network.

    This is similar to the 'opportunistic bandwidth measurement' proposed in A Tune-up for Tor (ref). However, I2P uses a per-tunnel measurement. As we use the peak (FIXME?) and not the average, this works for us. (FIXME)
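
    Below is a rough sketch of the per-tunnel speed estimate described above: bytes of locally-generated traffic sent through the peer in the last minute, divided by the number of our tunnels through that peer. Whether the router uses the average or the peak per-tunnel figure is flagged as an open question above; this sketch uses the average. Types and method names are hypothetical.

    // Sketch of the per-tunnel speed estimate; not the actual I2P SpeedCalculator.
    import java.util.List;

    public class SpeedEstimator {
        /**
         * @param bytesLastMinutePerTunnel bytes we sent in the last minute,
         *        one entry per tunnel currently routed through this peer
         * @return estimated bytes per tunnel per minute
         */
        public static double speed(List<Long> bytesLastMinutePerTunnel) {
            if (bytesLastMinutePerTunnel.isEmpty()) return 0.0;
            long total = 0;
            for (long b : bytesLastMinutePerTunnel) total += b;
            return (double) total / bytesLastMinutePerTunnel.size();
        }

        public static void main(String[] args) {
            // Two tunnels through the peer: 400 KB and 100 KB sent in the last minute.
            System.out.println(speed(List.of(400_000L, 100_000L))); // 250000.0
        }
    }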

    Crucial to the smooth operation of an I2P router is an estimate of peer tunnel capacity, defined as weighted number of successful tunnel builds through the peer in a time period. As tunnel building is expensive, it is important to rate peers based on their willingness to accept tunnels. Peers may reject or drop tunnel requests for any number of reasons. Generally the rejection or drop is caused by a bandwidth limit (from participating tunnels or locally generated traffic), a processing (CPU) limit which manifests itself as large request processing delay, or an absolute limit on participating tunnel count.

    Let r(t) be the rate of successful tunnel builds per hour through the peer, measured over the time period t. The capacity rating is then R = 4*r(10m) + 3*r(30m) + 2*r(1h) + r(1d).

    The capacity calculation (as implemented here) simply goes through the profile and estimates how many tunnels we think the peer would agree to participate in over the next hour. For this estimate it looks at how many tunnels the peer has agreed to lately, how many the peer rejected, and how many of the agreed-to tunnels failed, and extrapolates from that data. In addition, it includes a 'growth factor' (when the peer isn't failing or rejecting requests) so that we will keep trying to push their limits. The tunnel-failure part of the calculation was disabled for a long period, but it was reinstated for release 0.6.2, as it became apparent that recognizing and avoiding unreliable and unreachable peers is critically important. Unfortunately, as the tunnel tests require the participation of several peers, it is difficult to positively identify the cause of a failure. The recent changes assign a probability of failure to each of the peers, and use that probability in the capacity calculation. These recent changes may require additional adjustment, and are a topic for further study to verify their effectiveness on today's network.

    Growth factor adds a little bit each time, so that new peers are tried.
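
    Below is a sketch of the capacity rating, combining the weighted build-rate formula above with penalty terms for rejections, timeouts, and (probabilistically weighted) test failures, plus a small growth factor. The penalty weights and the growth-factor value are illustrative assumptions, not the values used in the I2P router.

    // Sketch of the capacity rating R = 4*r(10m) + 3*r(30m) + 2*r(1h) + r(1d),
    // minus penalties for bad behavior, plus a growth factor.
    public class CapacityCalculator {
        static final double GROWTH_FACTOR = 5.0;   // assumed bonus per recalculation

        /** r(t): successful builds during the period, normalized to builds per hour. */
        static double r(long accepted, double periodHours) {
            return accepted / periodHours;
        }

        public static double capacity(long acc10m, long acc30m, long acc1h, long acc1d,
                                      long rejected, long timedOut, double weightedTestFailures) {
            double base = 4 * r(acc10m, 10.0 / 60.0)
                        + 3 * r(acc30m, 30.0 / 60.0)
                        + 2 * r(acc1h, 1.0)
                        +     r(acc1d, 24.0);
            // Punishment: rejections hurt, drops (timeouts) hurt more, test failures most.
            // The relative weights here are assumptions.
            double penalty = 1.0 * rejected + 2.0 * timedOut + 4.0 * weightedTestFailures;
            return base - penalty + GROWTH_FACTOR;
        }

        public static void main(String[] args) {
            // A peer that accepted 5 builds in the last 10 minutes, 12 in 30m, 20 in 1h,
            // 150 in 24h, with 2 rejections, 1 timeout, and 0.5 weighted test failures.
            System.out.println(capacity(5, 12, 20, 150, 2, 1, 0.5));
        }
    }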

    For both metrics, "Bonuses" can be used to manually adjust preferences for individual peers.

    The profiling system does not use "active bandwidth probing" or generation of special-purpose traffic for measurement. Nor does it use several other statistics available in the profile, including latency.

    Capacity: Crime, Blame, and Punishment

    Raw build capacity isn't sufficient - we have to decrement for bad behavior, because tunnel build attempts are expensive. A tunnel build request is about 4KB, and the reply is identically-sized. Generation of the build request also consumes significant CPU and entropy to create the cryptographic keys. As an example of the factors that we must consider:

    • A peer that accepts 10/10 tunnels is better than one that accepts 10/100.
    • A peer that explicitly rejects a request is better than one that drops it.
    • A peer that accepts a request but later drops data through the tunnel is the worst.

    Therefore, the capacity is adjusted as a form of "punishment": Capacity is decremented for build request rejections, build request timeouts, and tunnel test failures.

    Problem: We don't know which of the tunnel peers to blame when a request or tunnel test message is dropped. This is because tunnel builds requests are handed off from peer to peer along the path, and because tunnels are unidirectional. What should the router do with this incomplete information?

    • Naive solution: Don't bother blaming anybody. This doesn't work well at all!
    • Much better solution: Blame everybody, with a weighted value. The bad peers will be discovered over time because they will be consistently bad. The metric for good peers will still be pretty good.

    Example 1: A tunnel build request expired (4 outbound hops A-B-C-D); the reply was due back through our inbound exploratory tunnel (2 hops E-F). Blame A, B, C, and D with weight 1/8 each, and blame E and F with weight 1/4 each. Alternately: it was probably E's fault, because that is the usual failure point in I2P (the gateway of the inbound tunnel cannot be reached), so blame E with weight 3/8 and F with weight 1/8 (no change to the outbound blame). (pic needed here)
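
    A sketch of the weighted-blame rule from Example 1: half the blame is spread over the outbound hops and half over the inbound reply hops, divided equally per hop (1/8 each for A-D, 1/4 each for E-F). The interface and method names are illustrative, not the actual router API.

    // Sketch of weighted blame assignment for an expired tunnel build request.
    import java.util.List;

    public class BlameAssigner {
        interface PeerProfile {
            void addTunnelFailureBlame(double weight); // feeds into the capacity penalty
        }

        public static void blameForExpiredBuild(List<PeerProfile> outboundHops,
                                                List<PeerProfile> inboundReplyHops) {
            double outboundWeight = 0.5 / outboundHops.size();    // e.g. 1/8 each for 4 hops
            double inboundWeight  = 0.5 / inboundReplyHops.size(); // e.g. 1/4 each for 2 hops
            for (PeerProfile p : outboundHops)     p.addTunnelFailureBlame(outboundWeight);
            for (PeerProfile p : inboundReplyHops) p.addTunnelFailureBlame(inboundWeight);
        }
    }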

    Tiers and sorting

    Every 30 seconds, all the profiles are sorted into three tiers:

    • Not Failing Tier: pretty much everybody that is responsive; typically 300-500 peers
    • High Capacity Tier: Not Failing peers with an above-average rate of tunnels built per hour; typically 10-30 peers
    • Fast Tier: High Capacity peers with above-average bandwidth-per-tunnel; typically 8-15 peers
    Both the speed and capacity metrics are skewed, with a small number of high ratings and a "long tail". By using the average and not the median, we select a small number of peers for the top two tiers. FIXME move this down
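
    A sketch of the 30-second re-sort described above: High Capacity means non-failing with above-average capacity, Fast means High Capacity with above-average speed. The minimum-tier-size promotion rule is omitted, and the population over which the averages are taken (here, all non-failing peers) is an assumption; field and class names are illustrative.

    // Sketch of sorting peer profiles into the three tiers, intended to run every 30 seconds.
    import java.util.ArrayList;
    import java.util.List;

    public class TierSorter {
        static class PeerProfile {
            String peer; double speed; double capacity; boolean failing;
            PeerProfile(String peer, double speed, double capacity, boolean failing) {
                this.peer = peer; this.speed = speed; this.capacity = capacity; this.failing = failing;
            }
        }

        static class Tiers {
            final List<PeerProfile> notFailing = new ArrayList<>();
            final List<PeerProfile> highCapacity = new ArrayList<>();
            final List<PeerProfile> fast = new ArrayList<>();
        }

        static Tiers sort(List<PeerProfile> profiles) {
            Tiers tiers = new Tiers();
            double capSum = 0, speedSum = 0;
            for (PeerProfile p : profiles) {
                if (p.failing) continue;
                tiers.notFailing.add(p);
                capSum += p.capacity;
                speedSum += p.speed;
            }
            if (tiers.notFailing.isEmpty()) return tiers;
            double avgCapacity = capSum / tiers.notFailing.size();
            double avgSpeed = speedSum / tiers.notFailing.size();
            for (PeerProfile p : tiers.notFailing) {
                if (p.capacity > avgCapacity) {
                    tiers.highCapacity.add(p);          // above-average capacity
                    if (p.speed > avgSpeed) tiers.fast.add(p); // and above-average speed
                }
            }
            return tiers;
        }
    }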

    Peer Selection

    To review, exploratory tunnels are generally low-bandwidth, and are used for router operation, including building and testing other tunnels.

    Client tunnels are used for all user client and server traffic, including accessing internal I2P network destinations or "hidden services" such as "eepsites", connections to external gateways ("inproxies" and "outproxies"), and other uses.

    The Not Failing tier is everybody, including the following two tiers.

    The High-Capacity tier includes peers with above-average rating for accepting tunnel build requests, and the following tier.

    The Fast tier includes peers from the High-Capacity tier with above-average rating for bandwidth-per-tunnel.

    So here's how it works: client tunnels are built from peers in the Fast tier. That's it.

    Exploratory tunnels are built from peers in either the Not Failing tier or the High Capacity tier. The tier is chosen on a per-build basis, using a weighted random function. This function compares exploratory tunnel build success against client tunnel build success to adjust the weighting. As exploratory build success declines, the router builds more exploratory tunnels from the High Capacity tier, to limit the amount of effort spent on the expensive tunnel build request operation. The selection therefore maintains a balance between the cost of tunnel build requests and the need to explore peers. It may seem curious to use the Not Failing tier (generally the lowest-bandwidth, lowest-capacity peers) for exploratory tunnels, since these tunnels are required to function for a router to build client tunnels. However, failing exploratory tunnels are recognized quickly, so in practice this works out acceptably (FIXME). And the benefit, of course, is opportunistic bandwidth and capacity measurement.

    Peers are selected with equal weight (uniformly at random) within each tier.
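
    A sketch of the weighted random tier choice for exploratory builds described above: as exploratory build success falls behind client build success, more exploratory builds draw their candidates from the High Capacity tier. The linear weighting function is an assumption; the router's actual weighting may differ.

    // Sketch of choosing the candidate pool for one exploratory tunnel build.
    import java.util.List;
    import java.util.Random;

    public class ExploratoryPeerSelector {
        private final Random random = new Random();

        /**
         * @param exploratoryBuildSuccess recent exploratory build success rate (0..1)
         * @param clientBuildSuccess      recent client build success rate (0..1)
         */
        public <T> List<T> choosePool(List<T> notFailingTier, List<T> highCapacityTier,
                                      double exploratoryBuildSuccess, double clientBuildSuccess) {
            // Probability of falling back to the High Capacity tier grows as the
            // exploratory success rate lags behind the client success rate.
            double gap = Math.max(0.0, clientBuildSuccess - exploratoryBuildSuccess);
            double pHighCap = Math.min(1.0, gap);   // assumed linear weighting
            return (random.nextDouble() < pHighCap) ? highCapacityTier : notFailingTier;
        }
    }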

    How It Works, Lessons for Other Implementations

    Great! The core system has been in place since the beginning of I2P. It includes pretty much everything proposed in "A Tune-up for Tor", and much more. First and foremost, it does not use claimed bandwidth, which prevents low-resource attacks. It also works well enough that peer selection is not a big obstacle to I2P network performance; there are plenty of other issues we are working on. The basic method has been in place for five years, and has been tuned only modestly in the last two years. Other characteristics:

    Stability - The Fast and High Capacity tier members don't change rapidly. This is because when a router uses a peer for tunnels, it tends to increase that peer's speed and capacity metrics. This keeps it in the pool. This is a desirable quality for anonymity, as using a large pool for tunnel building tends to assist predecessor attacks (ref).

    The tier system is fast-reacting to individual peer failure or overload, and to increased local demand for bandwidth. The speed and capacity metrics are highly weighted for recent performance. When a peer starts to drop test messages, or fails to respond to tunnel build requests, it will quickly be demoted out of the high-capacity pool. As bandwidth demand increases, the speed metric for individual peers will rapidly increase, and the fast tier will quickly become reorganized to include the newly-recognized fast peers.

    The tier system tends to use the highest-bandwidth peers when the network is not congested. As congestion increases, the total network traffic "spreads out" to lower-capacity peers. From an overall network perspective, this is optimal as it maintains a similar level of service for all routers.

    The profiling system does not over-optimize. The router uses its own, normally-generated traffic for peer profiling. No high-bandwidth test messages are required or used. When a router does not require high bandwidth or a high number of tunnels, the metrics for each peer are correspondingly lower. Therefore, even a low-bandwidth peer may be classified as "fast" and used for tunnels. This tends to, at least partially, spread low-bandwidth tunnels broadly across the network, and leaves the high-capacity peers for high-speed traffic. However, even very-low-bandwidth routers tend to accurately find a few fast peers and thus are well-prepared when higher bandwidth is demanded.

    Also, there is no need for a complete global optimum. I2P routers generally know only a subset of the active peers in the network, usually 20% to 80%. Through exploratory tunnel building and other peer discovery mechanisms, routers have no difficulty finding a sufficient portion of peers and preventing network partitioning. As the I2P network grows, the percentage of peers known to each router will decline as we implement additional mechanisms to limit memory usage, TCP and UDP connections, and other resources in the router. This poses no threat to the profiling system.

    The profiling system is persistent across restarts, and maintains measurements for the last 24 hours. This allows a recently-started router to quickly re-integrate to the network, whether the router was just restarted or has been down for some time.

    The parameters, weightings, and calculations are VERY powerful "knobs" to turn: they are difficult to test and debug in a distributed network, impossible to fully optimize, and can take months or years to get right. The I2P router contains a framework for local network simulation and testing; however, we have not used this framework for profiling and selection testing. As described above, routers include bandwidth and tunnel build success statistics in the network database entry they publish to the floodfill routers. While this information is not trusted or used in the router, it is gathered by the stats.i2p website (ref). On that website, several network performance graphs are presented, and the I2P developers make heavy use of this facility to monitor the network and judge the effectiveness of software changes in each release.

    Here are some other lessons for those wishing to "tune up" their own network.

    The basic measurements are much simpler than they used to be. The speed calculation, for example, was at one time over 400 lines of code, and it is now essentially a one-liner. The punishment for bad behavior is what keeps the network running well, and it is also an area for further research. How heavily you punish determines how fast the load spreads out across the network as it gets busy, and how quickly an overloaded peer gets avoided. The implementation contains an implicit estimate of the cost of a tunnel build request, as it rates the relative performance of a rejected request and a dropped request, or, alternatively, an accepted request vs. a request not made at all. Another way of looking at it is to establish a baseline of a peer that we have not made any requests of; call that a capacity rating of zero. What percentage (or absolute number) of request rejections, dropped requests, and test failures is required to drop the capacity rating below zero?

    If peers are punished too heavily, the network will tend toward congestion collapse, as most peers are driven to negative capacity ratings and tunnel load spreads out quickly throughout the network, with routers attempting to route tunnels through very-low-capacity peers. If peers are punished too lightly, routers will be slow to react to overloaded peers, and will maintain the same high-capacity peers for too long. In other words, a router will accept poor performance from a peer even when better peers may be available.

    Start simple, get more complex if necessary, and expect to simplify again later (for example, our speed calculation used to be pages and pages of code, and is now a one-liner).

    As the Tune-up paper says, active bandwidth probing (i.e. generating large amounts of special-purpose data for testing) is not practical. Nor is it necessary. Our exploratory tunnels average only 100KB in 10 minutes (~150 Bps), and our unused client tunnels average only 10KB in 10 minutes from tunnel tests (~15 Bps), but this is sufficient. When you are pushing little data, you don't care how accurately you sort peers into tiers; when you push a lot of data, your tiering will be better. Even very-low-bandwidth routers tend to accurately find fast peers and thus are well-prepared when higher bandwidth is demanded.

References

  • This paper includes material adapted from http://www.i2p2.de/ written by jrandom and others.
  • Robin Snader and Nikita Borisov (UIUC), "A Tune-up for Tor: Improving Security and Performance in the Tor Network", NDSS 2008.
  • I2P Development Meeting 68, December 9, 2003 http://www.i2p2.de/meeting68 (profiling described)
  • I2P Development Meeting 82, March 23, 2004 http://www.i2p2.de/meeting82 (profiling released in 0.3)
  • Predecessor attack papers
  • http://www.i2p2.de/