wiki:NetDB/NextBackend

Version 28 (modified by hottuna, 6 years ago) (diff)

--

General P2P networks

Name Search horizon* Comments
Gnutella Limited
Gnutella2 Limited


  • Search horizon describes how much the network that can be searched from a certain position in the network graph. Limited search horizon means that a search from one part of the network won't necessarily find results from another part of the network.

DHTs

DHTs are a good alternative due to O(log n) lookup time and a unlimited search horizon. But have serious issues when it comes to being robust against attacks.

Name Search horizon* Lookup steps Mutable data Comments
Kademlia Unlimited Max: O(log2 n)
Min: O(log2b n)
No Is susceptible to sybil and eclipse attacks.*
Freenet Unlimited O(log2 n) [2]No
Choord Unlimited No Is highly susceptible to sybil and eclipse attacks.*
Pastry Unlimited No Is highly susceptible to sybil and eclipse attacks.*


  • Kademlia is less susceptible to eclipse attacks. "For one thing, it is difficult to affect the routing tables of a Kademlia node, as each node tends to keep only highly available peers in its routing table. This increases the required costs for an attacker to convince honest nodes to link to the compromised nodes. Similarly, Kademlia uses iterative routing, exploring multiple nodes at each step, making routing less dependent on specific nodes and thus less vulnerable to attacks." [1]

Kademlia lookups can be optimized by enlarging how many bits of IDs, b, that are considered for each bucket. With b > 1 lookup steps would be decreased from O(log2 n) to O(log2b n) but the number of buckets would be increased to an expected 2blog2bn. [2]


Kademlia Defence Mechanisms

Sybil Defence

Sybil attacks are based in the idea of creating a large number of participating nodes. The Sybil attack does not damage the DHT by itself, but can be used as a vector to artificially create a major- ity of colluding malicious nodes in the overlay.

Name Source Description

Eclipse Defence

Eclipse attacks are attacks on the routing / routing tables.

Name Source Description
Random lookups R5N[4] Before initiating a recursive kad lookup, do a random walk in the network graph to determine the start of the kad lookup.
Control in/out-degrees [5][1] Control of the in-degree and out-degree of nodes via anonymous auditing. At the cost of slower avg. lookups.

Storage Defence

Storage attacks are attacks which attempt to provide bogus responses to queries.

Name Source Description
Recursive lookups R5N[4] Make FIND_VALUE request recursive by forwarding the query recursively and (recursively, to the previous requester in the chain of the recursion) returning the answer, a reliability metric of nodes can be obtained. Which can be used in conjunction with the last_seen attribute of k-bucket entries to create a combined eviction policy.

Kademlia Performance Improvements

Standard Kademlia performance can be improved by modifying it.

Name Source Description
Recursive lookups [6] Make FIND_VALUE request recursive by forwarding the query recursively and returning the answer directly to the original source of the request.

Proposal

This is a draft of the proposal.

Kademlia is preferable to Freenet due to its lookup speed and extendibility. Kademlia is preferable to Chord/Pastry? due to being as fast or faster and more resilient against eclipse attacks.

The ultimate goal is to provide a Kademlia implementation that supports three types of FIND_VALUE. Recursive, iterative and recursive random.

Recursive is the fastest means of lookup. However it is very vulnerable to Sybil attacks where a node in the recursion simply answers 'no result' instead of continuing the recursion.

Iterative is the standard Kademlia means of lookup. It is resistant to Eclipse attacks and Storage attacks since if will query a nodes for every iteration.

Recursive random is the R5N[4] means of lookup. It is resistant to most if not all forms of attacks since it eventually will find a path to the data if one exists.

Method Lookup time (avg.) Reliability
Recursive (log2b(n) + 1)*RTT/2 Low
Iterative log2b(n)*RTT Medium
Random recursive (log2b(n) + rand + 1)*RTT/2 High

1: Implement Kademlia

A full implementation of Kademlia will be used as a base for further enhancements.

  • Investigate the i2p.zzz.kademlia implementation
  • Make sure that i2p.zzz.kademlia implements Kademlia fully
    • Implement k-bucket merging

2: Improve performance

For Kademlia to be a viable alternative, lookup speed must be kept high. This means that lookup steps should be kept low and that as many round-trips as possibly should be avoided.

  • Implement recursive lookups
    • Investigate performance

3: Attack resistance

The two main attack resistance methods will be recursive and random recursive lookups. Recursive lookups are not only fast, but can be be used to provide a lookup success metric that is useful when doing the k-bucket evictions. Random recursive lookups allows FIND_VALUE requesters to eventually find a path to the data if one exists.

  • Implement recursive lookup metric
    • Change k-bucket eviction policy to also use recursive lookup success metric
  • Implement random recursive lookups

Unresolved issues

k-bucket building for recursive lookups

k-buckets need to be continually updated. In the case of iterative searching, this is not a problem. But when doing recursive searches, the search originator will not get the usual information about nodes that are close to the queried key.

[6] suggests two solutions to this issue:
Source: Direct Mode: All nodes on the routing path send back k nodes that are close to key back to the originator of the query.
Source: Source-Routing Mode: All nodes on the routing path send back k nodes that are close to key back to the previous hop on routing path which appends its own k closest nodes to key and repeats.

How

Further work

Use the locally known nodes as for routing

If we have nodes in our local NetDB that have distance, dxor , that is lower than any distance found in the Kademlia routing tables they are good candidates for FIND_VALUE querying. However to find such candidates the local NetDB has to be sorted, which is expensive.



[1] A Survey of DHT Security Techniques _
[2] Kademlia: A Peer-to-peer information system based on the XOR Metric _
[3] Searching in a Small World _
[4] R5N : Randomized Recursive Routing for Restricted-Route Networks _
[5] Eclipse attacks on overlay networks: Threats and defenses _
[6] R/Kademlia: Recursive and Topology-aware Overlay Routing _ slides

Attachments (2)

Download all attachments as: .zip