Version 42 (modified by hottuna, 6 years ago)


General P2P networks

Name | Search horizon* | Comments
Gnutella | Limited |
Gnutella2 | Limited |

  • Search horizon describes how much of the network can be searched from a given position in the network graph. A limited search horizon means that a search from one part of the network won't necessarily find results from another part of the network.


DHTs are a good alternative due to their O(log n) lookup time and unlimited search horizon, but they have serious issues when it comes to robustness against attacks.

Name | Search horizon* | Lookup steps | Mutable data | Comments
Kademlia | Unlimited | O(log_{2^b}(n)) [2] | No | Is susceptible to Sybil and Eclipse attacks.*
Freenet | Unlimited | O(log2(n)) [3] | No |
Chord | Unlimited | O(log2(n)*0.5) [8] | No | Is highly susceptible to Sybil and Eclipse attacks.*
Pastry | Unlimited | O(log_{2^b}(n)) [7] | No | Is highly susceptible to Sybil and Eclipse attacks.*

  • Kademlia is less susceptible to eclipse attacks. "For one thing, it is difficult to affect the routing tables of a Kademlia node, as each node tends to keep only highly available peers in its routing table. This increases the required costs for an attacker to convince honest nodes to link to the compromised nodes. Similarly, Kademlia uses iterative routing, exploring multiple nodes at each step, making routing less dependent on specific nodes and thus less vulnerable to attacks." [1]

Kademlia lookups can be optimized by increasing b, the number of ID bits considered at each routing step. With b > 1, the number of lookup steps decreases from O(log2(n)) to O(log_{2^b}(n)), but the number of buckets increases to an expected 2^b * log_{2^b}(n). [2]
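The trade-off can be made concrete with a few lines of Python. This is only a sketch of the two expressions from [2]; the function names and the network size of one million nodes are illustrative assumptions, not values from the proposal.

```python
import math


def expected_lookup_steps(n: int, b: int) -> float:
    """Expected lookup steps, log_{2^b}(n), computed as log2(n) / b."""
    return math.log2(n) / b


def expected_bucket_count(n: int, b: int) -> float:
    """Expected number of buckets, 2^b * log_{2^b}(n)."""
    return (2 ** b) * expected_lookup_steps(n, b)


if __name__ == "__main__":
    n = 1_000_000  # hypothetical network size, chosen for illustration only
    for b in range(1, 6):
        steps = expected_lookup_steps(n, b)
        buckets = expected_bucket_count(n, b)
        print(f"b={b}: steps={steps:.2f}, buckets={buckets:.1f}")
```

Each increment of b divides the step count by a constant factor relative to b = 1, while the bucket count grows roughly exponentially in b, which is why large b values quickly become expensive in memory.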

Kademlia Defence Mechanisms

Sybil Defence

Sybil attacks are based on the idea of creating a large number of participating nodes. The Sybil attack does not damage the DHT by itself, but can be used as a vector to artificially create a majority of colluding malicious nodes in the overlay.

Name | Source | Description

Eclipse Defence

Eclipse attacks are attacks on routing and on routing tables.

Name | Source | Description
Random lookups | R5N [4] | Before initiating a recursive Kademlia lookup, do a random walk in the network graph to determine the starting point of the lookup.
Control in/out-degrees | [5][1] | Control the in-degree and out-degree of nodes via anonymous auditing, at the cost of slower average lookups.

Storage Defence

Storage attacks are attacks which attempt to provide bogus responses to queries.

Name | Source | Description
Recursive lookups | R5N [4] | Make FIND_VALUE requests recursive: each node forwards the query to the next hop, and the answer is returned back along the same chain, each node replying to the previous requester. This allows a reliability metric for nodes to be obtained, which can be used in conjunction with the last_seen attribute of k-bucket entries to create a combined eviction policy.

Kademlia Performance Improvements

The performance of standard Kademlia can be improved with the following modifications.

Name | Source | Description
Recursive lookups | [6] | Make FIND_VALUE requests recursive by forwarding the query recursively and returning the answer directly to the original source of the request.


This is a draft of the proposal.

Kademlia is preferable to Freenet due to its lookup speed and extensibility. Kademlia is preferable to Chord/Pastry because it is as fast or faster and more resilient against Eclipse attacks.

The ultimate goal is to provide a Kademlia implementation that supports three types of FIND_VALUE: Recursive, Iterative and Random Recursive.
Recursive is the fastest means of lookup. However, it is very vulnerable to Sybil attacks in which a node in the recursion simply answers 'no result' instead of continuing the recursion.
Iterative is the standard Kademlia means of lookup. It is resistant to attacks since it queries multiple nodes in every iteration.
Random Recursive is the R5N [4] means of lookup. It is resistant to most if not all forms of attack since it will eventually find a path to the data if one exists.

Method | Lookup time (avg.) | Reliability
Recursive | (log_{2^b}(n) + 1) * RTT/2 | Low
Iterative | log_{2^b}(n) * RTT | Medium
Random Recursive | (log_{2^b}(n) + rand + 1) * RTT/2 | High
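The three estimates in the table can be compared numerically with a small Python sketch. The function names are hypothetical, and `rand` is assumed here to be the number of extra random-walk hops taken before the Kademlia part of the lookup starts; the table itself does not define it further.

```python
import math


def lookup_steps(n: int, b: int) -> float:
    """Expected Kademlia lookup steps, log_{2^b}(n)."""
    return math.log2(n) / b


def recursive_time(n: int, b: int, rtt: float) -> float:
    """(log_{2^b}(n) + 1) * RTT/2: one-way hops out plus one hop back."""
    return (lookup_steps(n, b) + 1) * rtt / 2


def iterative_time(n: int, b: int, rtt: float) -> float:
    """log_{2^b}(n) * RTT: a full round trip for every step."""
    return lookup_steps(n, b) * rtt


def random_recursive_time(n: int, b: int, rtt: float, rand: int) -> float:
    """(log_{2^b}(n) + rand + 1) * RTT/2, with rand assumed to be extra hops."""
    return (lookup_steps(n, b) + rand + 1) * rtt / 2
```

With these formulas, a Random Recursive lookup stays faster than an Iterative one as long as `rand + 1` is smaller than the number of Kademlia lookup steps.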

The three lookup mechanisms could be used in sequential failover mode, hybrid mode or parallel mode.
Sequential failover mode would produce the lowest possible network load, while still maintaining the reliability of Random Recursive lookups. However, the lookup speed will suffer if failovers are needed.
Hybrid mode would use success rates for the different lookup methods to try to decide which method to use. This might be a bad mechanism since it would likely be susceptible to attacks that provide high success rates for most but not all keys.
Parallel mode would increase network load considerably, but provide the fastest possible lookup times at all times.
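A minimal sketch of the sequential failover and parallel modes is shown below. It assumes each lookup method is a plain callable that takes a key and returns the value, or None when the lookup fails; that interface and the function names are assumptions made for illustration and do not come from i2p.zzz.kademlia. Hybrid mode is left out because the proposal does not specify how success rates would be tracked.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Optional, Sequence

# Hypothetical interface: a lookup returns the value, or None on failure.
Lookup = Callable[[bytes], Optional[bytes]]


def sequential_failover(key: bytes, methods: Sequence[Lookup]) -> Optional[bytes]:
    """Try each method in order; move to the next one only after a failure."""
    for method in methods:
        value = method(key)
        if value is not None:
            return value
    return None


def parallel(key: bytes, methods: Sequence[Lookup]) -> Optional[bytes]:
    """Start all methods at once and return the first successful answer.

    Leaving the `with` block still waits for the remaining lookups to
    finish; a real implementation would want to cancel them instead.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(methods))) as pool:
        futures = [pool.submit(method, key) for method in methods]
        for future in as_completed(futures):
            value = future.result()
            if value is not None:
                return value
    return None
```

For sequential failover, passing the methods in the order Recursive, Iterative, Random Recursive tries the fastest method first and keeps the most reliable one as the last resort.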

1: Implement Kademlia

A full implementation of Kademlia will be used as a base for further enhancements.

  • Investigate the i2p.zzz.kademlia implementation
  • Make sure that i2p.zzz.kademlia implements Kademlia fully
    • Implement k-bucket merging

2: Improve performance

For Kademlia to be a viable alternative, lookup speed must be kept high. This means that the number of lookup steps should be kept low and that as many round-trips as possible should be avoided.

  • Implement recursive lookups
    • Investigate performance

3: Attack resistance

The two main attack resistance methods will be recursive and random recursive lookups. Recursive lookups are not only fast, but can also be used to provide a lookup success metric that is useful when doing k-bucket evictions. Random recursive lookups allow FIND_VALUE requesters to eventually find a path to the data if one exists.

  • Implement recursive lookup metric
    • Change k-bucket eviction policy to also use recursive lookup success metric
  • Implement random recursive lookups
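One possible shape for a combined eviction policy is sketched below. Everything in it is an assumption made for illustration: the `BucketEntry` fields other than last_seen, the neutral 0.5 rate for nodes with no history, the one-hour `max_age`, and the equal weighting of staleness and unreliability. None of these values come from the proposal or from [4].

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class BucketEntry:
    node_id: bytes
    last_seen: float        # Unix timestamp of the last contact
    lookups_ok: int = 0     # recursive lookups via this node that returned a result
    lookups_total: int = 0  # all recursive lookups forwarded via this node

    def success_rate(self) -> float:
        # Nodes without any history get a neutral rate (assumption).
        if self.lookups_total == 0:
            return 0.5
        return self.lookups_ok / self.lookups_total


def eviction_score(entry: BucketEntry, now: float,
                   max_age: float = 3600.0, weight: float = 0.5) -> float:
    """Return a score in [0, 1]; a higher score means a better eviction candidate."""
    staleness = min(max(now - entry.last_seen, 0.0) / max_age, 1.0)
    unreliability = 1.0 - entry.success_rate()
    return weight * staleness + (1.0 - weight) * unreliability


def pick_eviction_candidate(bucket: Sequence[BucketEntry], now: float) -> BucketEntry:
    """Pick the entry that is both stalest and least reliable under the score."""
    return max(bucket, key=lambda entry: eviction_score(entry, now))
```

Note that a missing result is not always malicious, since the key may simply not exist, so a real metric would probably need to discount failed lookups for keys that no method can find.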

Unresolved issues

k-bucket building for recursive lookups

k-buckets need to be continually updated. With iterative searching this is not a problem, but with recursive searches the search originator will not get the usual information about nodes that are close to the queried key.

[6] suggests two solutions to this issue:
  • Direct Mode: All nodes on the routing path send k nodes that are close to the key back to the originator of the query.
  • Source-Routing Mode: All nodes on the routing path send k nodes that are close to the key back to the previous hop on the routing path, which appends its own k closest nodes to the key and repeats.

However, Direct Mode may provide an excellent DDoS attack vector if the originator address is not verified.

Performance/Memory trade-off for b

A performance/memory usage trade-off exists for the b-value. Plotted below are the lookup steps and the memory usage for differing b-values. k is set to 10, which might be a reasonable setting, but more discussion is needed on that point.

[Figure: DHT lookup steps]
[Figure: DHT memory usage]

Further work

Use the locally known nodes for routing

If our local NetDB contains nodes whose XOR distance to the key, d_xor, is lower than any distance found in the Kademlia routing tables, they are good candidates for FIND_VALUE querying. However, to find such candidates the local NetDB has to be sorted, which is expensive.
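As a rough illustration, the candidate check itself can be written as a single linear pass over the NetDB rather than a sort. This is a sketch under assumptions: node IDs are taken to be equal-length byte strings, and the function names and the way the NetDB and routing table are passed in are hypothetical. A linear pass per lookup still touches every NetDB entry, so it reduces the cost but does not remove it.

```python
from typing import Iterable, List


def xor_distance(a: bytes, b: bytes) -> int:
    """Kademlia XOR distance between two equal-length IDs, as an integer."""
    return int.from_bytes(a, "big") ^ int.from_bytes(b, "big")


def netdb_candidates(key: bytes,
                     netdb_ids: Iterable[bytes],
                     routing_table_ids: Iterable[bytes]) -> List[bytes]:
    """Return NetDB node IDs that are closer to key than any routing table entry."""
    best_known = min(
        (xor_distance(key, node_id) for node_id in routing_table_ids),
        default=None,
    )
    if best_known is None:
        # Empty routing table: every locally known node is a candidate.
        return list(netdb_ids)
    return [n for n in netdb_ids if xor_distance(key, n) < best_known]
```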

[1] A Survey of DHT Security Techniques
[2] Kademlia: A Peer-to-peer information system based on the XOR Metric
[3] Searching in a Small World
[4] R5N: Randomized Recursive Routing for Restricted-Route Networks
[5] Eclipse attacks on overlay networks: Threats and defenses
[6] R/Kademlia: Recursive and Topology-aware Overlay Routing (slides)
[7] Pastry: Scalable, decentralized object location and routing for large-scale peer-to-peer systems
[8] Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications
