Changes between Version 5 and Version 6 of petconpaper


Ignore:
Timestamp:
Feb 26, 2009 7:02:44 PM (10 years ago)
Author:
anonymous
Comment:

Legend:

Unmodified
Added
Removed
Modified
  • petconpaper

    v5 v6  
    66PET-CON 2009.1
    77<br>
    8 DRAFT - last update zzz feb. 26 17:00 UTC
     8DRAFT 2009-02-26 19:00 UTC
    99<br>
    1010zzz@i2pmail.org
    1111<br>
    12 Note - this looks like a one-hour talk, Pet-con limit is 10-15m,
    13 so a little editing is needed...
    14 Maybe this is the outline for a paper, not a 10m talk.
     12
     13<p><b>Abstract</b>
     14<p>
     15The I2P anonymous communication network uses a system of peer profiling for selecting a subset of peers for tunnel building. This system maintains anonymity while optimizing bandwidth. The system uses opportunistic measurement of both per-tunnel bandwidth and the number of tunnels accepted. The profiling and selection system has been used in I2P since early 2004, and while it continues to be adjusted, the fundamentals and effectiveness of the system have been demonstrated.
     16<p>
     17Here we document the algorithms used in I2P's peer profiling and selection, discuss the strengths and weaknesses of the system, and describe opportunities for future work. We write this paper in the hope that it will be useful for guiding improvements in I2P and other anonymous networks with similar features.
     18
    1519
    1620<h2>Overview of I2P</h2><ul> - echelon does this part
     
    4852        <li> ...Except that K routers are not used
    4953        <li> Lots of stats in there for network debugging, but NOTHING is trusted or used except IP/port (and "K")
    50         <li> Serious anonymity and DOS implications if we trust claimed capacity
     54        <li> Serious anonymity and DOS implications if we trust claimed capacity ("low-resource attacks")
    5155        <li> L is the default, so about 96.5% of routers are L-O, and route for others
    5256          (pic from stats.i2p)
     
    6367<p>
    6468Peer selection within I2P is simply the process of choosing which routers on the network to route locally-generated messages (and replies to those messages) to go through. In other words, which peers are used for tunnels. To accomplish this, the router maintains a database of each peer's performance, called the "profile". The router uses that data to estimate the bandwidth of a peer, how often a peer will to accept tunnel build requests, and whether a peer seems to be overloaded or otherwise unable to reliably perform.
     69<p>
     70The profiling system includes mechanisms similar to the "opportunistic bandwidth measurement algorithm" proposed for Tor (ref), and much more.
    6571<p>
    6672The profile contains several statistics, continuously updated. For each statistic, averages, event counts, and maximums for 1 minute, 10 minute, 1 hour, and 24 hour periods are available. The profiles are also stored locally on disk, so that they are persistent across restarts. Also, the profile stores several timestamps, such as the last time a peer was heard from, the last time it accepted a tunnel request, and many more.
     
    138144<h2>Rating Details            </h2><ul>
    139145
    140         <li> Speed - simple - the average amount of data per tunnel we have sent through
     146        <p> Speed - simple - the average (FIXME OR IS IT PEAK) amount of data per tunnel we have sent through
    141147          the peer in the last minute
    142148<p>
     
    147153The current speed calculation has remained unchanges since early 2006, and it is a topic for further study to verify its effectiveness on today's network.
    148154
    149         <li> Capacity - not so simple - The weighted number of successful tunnel builds
    150           through the peer.
     155<p> This is similar to the 'opportunistic bandwidth measurement' proposed in tune-up for Tor (ref). However, I2P uses a per-tunnel measurement. As we use Peak (FIXME?) and not average, this works for us. Fixme
     156
     157        <p>
     158Crucial to the smooth operation of an I2P router is an estimate of peer tunnel capacity, defined as weighted number of successful tunnel builds through the peer in a time period. As tunnel building is expensive, it is important to rate peers based on their willingness to accept tunnels. Peers may reject or drop tunnel requests for any number of reasons. Generally the rejection or drop is caused by a bandwidth limit (from participating tunnels or locally generated traffic), a processing (CPU) limit which manifests itself as large request processing delay, or an absolute limit on participating tunnel count.
     159
     160<p> blahblah
    151161          Let r(t) be the successful builds per hour, over a certain time period;
    152162          R = 4*r(10m) + 3*r(30m) + 2*r(1h) + r(1d);
     
    156166The tunnel-failure part of the calculation was disabled for a long period, but it was reinstated for release 0.6.2, as it became apparent that recognizing and avoiding unreliable and unreachable peers is critically important. Unfortunately, as the tunnel tests require the participation of several peers, it is difficult to positively identify the cause of a failure. The recent changes assign a probability of failure to each of the peers, and uses that probability in the capacity calculation. These recent changes may require additional adjustment, and are a topic for further study to verify their effectiveness on today's network.
    157167
    158         <li> Growth factor adds a little bit each time, so that new peers are tried
    159         <li> Not required for measurement: any special-purpose 'load'
    160         <li> Not used for rating: latency, ....
    161         <li> "Bonuses" can be used to manually adjust preferences for individual peers
    162 
    163 </ul>
    164 <h2>Capacity: Crime, Blame, and Punishment </h2><ul>
    165 
    166         <li> Raw build capacity isn't sufficient - we have to decrement for bad behavior, because
    167           tunnel build attempts are expensive
     168        <p> Growth factor adds a little bit each time, so that new peers are tried.
     169
     170<p> For both metrics, "Bonuses" can be used to manually adjust preferences for individual peers.
     171     
     172   <p>The profiling system does not use "active bandwidth probing" or generation of special-purpose traffic for measurement.
     173         Nor does it used several other statistics available in the profile, including latency.
     174       
     175
     176
     177<h2>Capacity: Crime, Blame, and Punishment </h2>
     178<p> Raw build capacity isn't sufficient - we have to decrement for bad behavior, because
     179          tunnel build attempts are expensive. A tunnel build request is about 4KB, and the reply is identically-sized. Generation of the build request also consumes significant CPU and entropy to create the cryptographic keys. As an example of the factors that we must consider:
    168180        <ul>
    169181                <li> A peer that accepts 10/10 tunnels is better than one that accepts 10/100.
     
    171183                <li> A peer that accepts a request but later drops data through the tunnel is the worst
    172184        </ul>
    173         <li> Punishment: Capacity is decremented for build request rejections, build request timeouts,
     185
     186
     187        <p> Therefore, the capacity is adjusted as a form of "punishment": Capacity is decremented for build request rejections, build request timeouts,
    174188          and tunnel test failures.
    175         <li> Problem: We don't know who to blame when a request or tunnel test message is dropped
     189
     190        <p> Problem: We don't know which of the tunnel peers to blame when a request or tunnel test message is dropped. What should the router do with this incomplete information?
    176191        <ul>
    177192                <li> Naive solution: Don't bother blaming anybody. This doesn't work well at all!
    178                 <li> Much better solution: Blame everybody, with a weighted value. The bad peers will be discovered over time.
     193                <li> Much better solution: Blame everybody, with a weighted value. The bad peers will be discovered over time because they will be consistently bad. The metric for good peers will still be pretty good.
    179194        </ul>
    180         <li> Example 1: Tunnel build request expired (3 outbound hops A-B-C), reply was due back through
    181           our inbound exploratory tunnel (2 hops D-E).
    182           Blame A, B, and C with weight 1/3, blame D-E with weight 1/2.
    183           Alternately: It was probably D's fault, because that's the usual failure point in I2P
    184           (the gateway of the inbound tunnel can't be reached), so blame D with weight 3/4 and
    185           E with weight 1/4.
     195        <p> Example 1: Tunnel build request expired (4 outbound hops A-B-C-D), reply was due back through
     196          our inbound exploratory tunnel (2 hops E-F).
     197          Blame A, B, C, and D with weight 1/8, blame E-F with weight 1/4.
     198          Alternately: It was probably E's fault, because that's the usual failure point in I2P
     199          (the gateway of the inbound tunnel can't be reached), so blame E with weight 3/8 and
     200          E with weight 1/8. (no change to the outbound blame).
    186201          (pic needed here)
    187202
    188 </ul>
    189 <h2>Peer Selection                 </h2><ul>
    190 
    191         <li> Client tunnels from Fast tier
    192         <li> Exploratory tunnels from Not Failing tier, but a varying % is from High Capacity,
    193           more when build failure is high (maintains a minimum level of service
    194           while continuing to explore)
    195         <li> Equal weight within each tier
    196 
    197 </ul>
     203
     204
     205<h2>Peer Selection                 </h2>
     206<p>
     207
     208To review, exploratory tunnels are generally low-bandwidth, and are used for router operation, including building and testing other tunnels.
     209
     210<p>Client tunnels are used for all user client and server traffic, including accessing internal I2P network destinations or "hidden services" such as "eepsites", connection to external gateways ("inproxies" and "outproxies") and other uses.
     211
     212<p> The Not Failing tier is everybody.
     213
     214<p>The High-Capacity tier includes peers with above-average rating for accepting tunnel build requests.
     215
     216<p>The Fast tier includes peers from the High-Capacity tier with above-average rating for bandwidth-per-tunnel.
     217
     218<p>So here's how it works. Client tunnels are build from peers in the Fast tier.
     219
     220
     221        <p> Exploratory tunnels are build from peers either in the Not Failing tier or the High Capacity tier. The tier selected is chosen on a per-build basis, using a weighted random function. This function uses data on the exploratory tunnel build success vs. the client tunnel build success to adjust the weighting.
     222As exploratory build success declines, the router builds more tunnels from the high capacity tier, to limit the amount of effort spent on the expensive tunnel build request operation.
     223Therefore the selection maintains the balance between tunnel build requests and the need to explore peers.
     224
     225It may seem curious to use the Not Failing tier (generally the lowest-bandwidth, lowest-capacity peers) for exploratory tunnels, since these tunnels are required to function for a router to build client tunnels. However failing exploratory tunnels are recognized quickly and so this works out ok FIXME.
     226 And the benefit, of course, is opportunistic bandwidth and capacity measurement.
     227
     228        <p> Peers are selected with Equal weight within each tier
     229
     230
    198231<h2>How It Works, Lessons for Other Implementations   </h2>
    199232
    200233        <p> Great! Core system has been in place since the beginning of I2P.
    201              Pretty much includes everything proposed in "Tuneup for Tor"
     234             Pretty much includes everything proposed in "Tuneup for Tor" and much more.
     235First and foremost, it doesn't use claimed bandwidth, which prevents low-resource attacks. It also works well enough that peer selection is not a big obstacle to I2P network performance. There are a lot of other issues that we are working on. The basic method has been in place for five years, and has been tuned only modestly in the last two years. Other characteristics:
    202236
    203237        <p> Stability - The Fast and High Capacity tier members don't change rapidly. This is because when a router uses  a peer for tunnels, it tends to increase that peer's speed and capacity metrics. This keeps it in the pool.
     
    211245<p> The tier system tends to use the highest-bandwidth peers when the network is not congested. As congestion increases, the total network  traffic "spreads out" to lower-capacity peers. From an overall network perspective, this is optimal as it maintains a similar level of service for all routers.
    212246
    213 <p> The profiling system does not over-optimize. The router uses its own, normally-generated traffic for peer profiling. No high-bandwidth test messages are required or used. When a router does not require high bandwidth or a high number of tunnels, the metrics for each peer are correspondingly lower. Therefore, even a low-bandwidth peer may be classified as "fast" and used for tunnels. This tends to, at least partially, spread low-bandwidth tunnels broadly across the network, and leaves the high-capacity peers for high-speed traffic. However, even very-low-bandwidth routers tend to accurately find a few fast peers and thus are  well-prepared when higher bw is demanded.
     247<p> The profiling system does not over-optimize. The router uses its own, normally-generated traffic for peer profiling. No high-bandwidth test messages are required or used. When a router does not require high bandwidth or a high number of tunnels, the metrics for each peer are correspondingly lower. Therefore, even a low-bandwidth peer may be classified as "fast" and used for tunnels. This tends to, at least partially, spread low-bandwidth tunnels broadly across the network, and leaves the high-capacity peers for high-speed traffic. However, even very-low-bandwidth routers tend to accurately find a few fast peers and thus are  well-prepared when higher bw is demanded. Also, there is no need for a complete global optimum. I2P routers generally know only a subset of the active peers in the network, usually 20% to 80%. Through exploratory tunnel building and other peer discovery mechanisms, routers have no difficulty finding a sufficient portion of peers, and preventing network partitioning. As the I2P network grows the percentage of peers known to each router will decline as we implement additional mechanisms to limit memory usage, TCP and UDP connections, and other resources in the router. This poses no threat to the profiling system.
    214248
    215249
     
    236270          (for example, our speed calc used to be pages and pages of code, now a one-liner)
    237271
    238         <p> You don't need to generate large amounts of special-purpose data for testing. Normal traffic will do.
     272        <p> As the tune-up paper says, active bandwidth probing (i.e.  generating large amounts of special-purpose data for testing is not practical. Nore is it necessary.
    239273          Our exploratory tunnels average only 100KB in 10m (~150Bps)...
    240274          Our unused client tunnels average only 10KB in 10m from tunnel tests (~15Bps)...
     
    250284<li><i>A Tune-up for Tor: Improving Security and Performance in the Tor Network</i>
    251285Robin Snader and Nikita Borisov @ UIUC
     286<li>Predecessor attack papers
    252287<li>http://www.i2p2.de/
    253288</ul>