Changes between Version 62 and Version 63 of NetDB/NextBackend


Ignore:
Timestamp:
Jun 4, 2013 7:47:44 AM (6 years ago)
Author:
hottuna
Comment:

--

Legend:

Unmodified
Added
Removed
Modified
  • NetDB/NextBackend

    v62 v63  
    1 == Why? ==
     1= Why? =
    22'''Scalability:''' Currently we lack O(log(n)) lookup scalabilty if more !FloodFill nodes than the local NetDB contains exist.
    33
     
    66
    77
    8 === General P2P networks ===
     8= Alternatives =
     9== General P2P networks ==
    910
    1011|| '''Name''' || '''Search horizon*''' ||  '''Comments''' ||
     
    1516* Search horizon describes how much the network that can be searched from a certain position in the network graph. Limited search horizon means that a search from one part of the network won't necessarily find results from another part of the network.
    1617
    17 === DHTs ===
     18== DHTs ==
    1819DHTs are a good alternative due to O(log n) lookup time and a unlimited search horizon. But have serious issues when it comes to being robust against attacks.
    1920
     
    6364|| Recursive lookups || [6] || Make FIND_VALUE request recursive by forwarding the query recursively and returning the answer directly to the original source of the request. ||
    6465
    65 == Proposal ==
     66= Proposal =
    6667'''This is a draft of the proposal.'''
    6768
    68 === Why? ===
     69== Why? ==
    6970Kademlia is preferable to Freenet due to its lookup speed and extendibility.
    7071
     
    7677
    7778
    78 === ''Key'' ===
    79 ''Key'' rotation might be an interesting idea in the form of hash(dest+low_res_timestamp). It may help out against eclipse attacks if the results of the (unpublished) UCSB paper holds up.
    80 
    81 
    82 === Store ===
    83 '''Idea: Random Recursive Stores:''' By making stores travel a short number of random steps (short to avoid sybil nodes) and then begin recursive store. Where a value is stored in the ~''k'' closest nodes for each node in the path of the recursive insertion.
    84 
    85 ==== Recursion ====
    86 ''r'' STOREs are initiated at the originator node. Every node should forward r,,local_recursion,, STORE request.
    87 
    88 === Lookup ===
     79== Lookup ==
    8980
    9081The ultimate goal is to provide a Kademlia implementation that supports three types of FIND_VALUE. Recursive, Iterative and Random Recursive. [[BR]]
     
    9384Recursive Random is the R5N[4] means of lookup. It is resistant to most if not all forms of attacks since it eventually will find a path to the data if one exists.
    9485
    95 || '''Method''' || '''Lookup time (avg.)''' || '''Reliability''' || '''Usage scenario''' ||
     86|| '''Method''' || '''Lookup time (avg.)''' || '''Reliability''' ||
    9687|| Recursive || (log2^b^(n) + 1)*RTT/2 || Low ||
    9788|| Iterative || log2^b^(n)*RTT || Medium ||
     
    10394Parallel mode would increase network load considerably, but provide the fastest possible lookup times at all times.[[BR]]
    10495
    105 ==== Recursion ====
     96=== Recursion ===
    10697''r'' FIND_VALUEs are initiated at the originator node. Every node should forward r,,local_recursion,, FIND_VALUEs requests.
    10798
    10899
    109 === End of Recursion ===
     100== Store ==
     101Stores should ideally always be performed using Random Recursion to increase the spread of nodes that contain the inserted data.
     102Iterative and Recursive STORE queries are possible, however they both have limited spread of inserted data. Additionally the benefits of non-random Recursive STOREs would be insertion speed. Something that likely isn't a very important factor.
     103
     104|| '''Method''' || '''Store time''' || '''Spread''' ||
     105|| Recursive || (log2^b^(n) + 1)*RTT/2 || Low ||
     106|| Iterative || log2^b^(n)*RTT || Medium ||
     107|| Random Recursive || (log2^b^(n) + rand + 1)*RTT/2 || High ||
     108
     109=== Recursion ===
     110''r'' STOREs are initiated at the originator node. Every node should forward r,,local_recursion,, STORE request.
     111
     112
     113== End of Recursion ==
    110114Recursion can be used to slow DOS the network.
    111115
    112 ==== Metric 1 ====
     116=== Metric 1 ===
    113117Each node in the path of the recursion will stop forwarding the query if it is closer to the ''key'' than any node in its routing table.
    114118This is what is used by R5N.
    115119
    116 ==== Metric 2 ====
     120=== Metric 2 ===
    117121A Hops-travelled counter, ''h,,travelled,,'', can be used. The recursion can be terminated using ''h,,travelled,,'' ≤ 2*''T''.
    118122
     
    120124''T'' = log2^b^(N), which is the average number of steps needed to go anywhere in the network.
    121125
    122 ==== Metric 1 && 2 ====
     126=== Metric 1 && 2 ===
    123127   "(''h'' > ''T'' && closest) || (''h'' > 4''T'') right now (4''T'' or 3''T'' or 2''T'', not sure); that way, the random walk (''h'' <= ''T'') doesn't terminate at a local min by accident." -- C. Grothoff [4]
    124128
    125129
    126130
    127 === Length of Random path ===
     131== Length of Random path ==
    128132Estimated Buckets entries = 2^b^log2^b^(N) <=> N = (2^b^)^bucket_entries*pow(2,-b)^ [[BR]]
    129133rand_path_length = log2^b^(N), which is the average number of steps needed to go anywhere in the network [4].
    130134
    131135
    132 === Replication factor ===
     136== Replication factor ==
    133137In standard kad, ''k'' is used as the replication factor, or how many STORE request that will be sent by the originator. The replication factor, henceforth ''r'' is a trade-of between FIND_VALUE performance and the cost of STORE. Since no data is permanent in the NetDB a relatively high replication factor should be acceptable.
    134138
    135139''r'' is determined by, ''r'' = (N/c)^1/2^, where c is a constant describing the number of nodes with direct TCP/UDP connection to this node. For our purposes a value for c would be chosen.
    136140
    137 ==== Recursive Replication factor ====
     141=== Recursive Replication factor ===
    138142For recursive stores a replication factor has to be calculated for each node on the path of recursion.
    139143
     
    142146
    143147
    144 
    145 === How ===
    146 
    147 ==== 1: Implement Kademlia ====
     148== ''Key''-rotation ==
     149''Key''-rotation might be an interesting idea in the form of hash(dest+low_res_timestamp). It may help out against eclipse attacks if the results of the (unpublished) UCSB paper holds up. This will have to verified before any implementation of it is meaningful.
     150
     151It is only interesting if the time it takes for malicious Eclipse-nodes to integrate into the DHT is significant.
     152nodes_needed_for_eclipse = (60/key_rot_interval)*eclipse_integration_time*attackers_per_eclipse[[BR]]
     153nodes_needed_for_eclipse = (60/10)*24*20 = 2880. A key rotation interval of 10 minutes is chosen, the time it takes to integrate a new malicious node into the DHT is guessed to be 24hrs and finally 20 attackers per destination is what was used in the UCSB paper to Eclipse a ''key''.[[BR]]
     154
     155== How ==
     156
     157=== 1: Implement Kademlia ===
    148158A full implementation of Kademlia will be used as a base for further enhancements.
    149159
     
    152162   * Implement ''k''-bucket merging
    153163
    154 ==== 2: Improve performance ====
     164=== 2: Improve performance ===
    155165For Kademlia to be a viable alternative, lookup speed must be kept high. This means that lookup steps should be kept low and that as many round-trips as possibly should be avoided.
    156166
     
    158168   * Investigate performance
    159169
    160 ==== 3: Attack resistance ====
     170=== 3: Attack resistance ===
    161171The two main attack resistance methods will be recursive and random recursive lookups.
    162172Recursive lookups are not only fast, but can be be used to provide a lookup success metric that is useful when doing the ''k''-bucket evictions.
    163 Random recursive lookups allows FIND_VALUE requesters to eventually find a path to the data if one exists.
     173Random Recursive lookups allows FIND_VALUE requesters to eventually find a path to the data if one exists.
    164174
    165175 * Implement recursive lookup metric
     
    169179
    170180
    171 === Unresolved issues ===
    172 ==== ''k''-bucket building for recursive lookups ====
     181== Unresolved issues ==
     182=== ''k''-bucket building for recursive lookups ===
    173183''k''-buckets need to be continually updated.
    174184In the case of iterative searching, this is not a problem. But when doing recursive searches, the search originator will not get the usual information about nodes that are close to the queried key. [[BR]][[BR]]
     
    179189However '''Source: Direct Mode''' may provide an excellent DDOS attack vector if the originator address is not verified.
    180190
    181 ==== !Performance/Memory trade-of for ''b'' ====
     191=== !Performance/Memory trade-of for ''b'' ===
    182192A performance/memory usage trade-of exists for the ''b''-value. Plotted below are lookup steps and the memory usage for differing ''b''-values. ''k'' is set to 10, which might be a reasonable setting. But more discussion is needed regarding that.
    183193
     
    186196[[Image(dht_mem_usage.svg, 450px)]]
    187197
    188 === Further work ===
    189 ==== Use the locally known nodes as for routing ====
     198== Further work ==
     199=== Use the locally known nodes as for routing ===
    190200If we have nodes in our local NetDB that have distance, d,,xor,, , that is lower than any distance found in the Kademlia routing tables they are good candidates for FIND_VALUE querying. However to find such candidates the local NetDB has to be sorted, which is expensive.
    191201