Opened 6 weeks ago

Last modified 5 weeks ago

## #2694 accepted enhancement

# FloodfillPeerSelector: Remove bottlenecks

Reported by: | jogger | Owned by: | zzz |
---|---|---|---|

Priority: | minor | Milestone: | 0.9.46 |

Component: | router/netdb | Version: | 0.9.44 |

Keywords: | Cc: | ||

Parent Tickets: | Sensitive: | no |

### Description

The addall() to the sorted treeset costs lots of CPU due to poor implementation of mass additions on the Java level. Collections.sort() is much faster.

the _sorted treeset is accessed much too often at the end of get().

Patch:

--- i2p-0.9.44/router/java/src/net/i2p/router/networkdb/kademlia/FloodfillPeerSelector.java +++ 44dev/router/java/src/net/i2p/router/networkdb/kademlia/FloodfillPeerSelector.java @@ -13,6 +13,7 @@ import java.util.Collections; import java.util.HashSet; import java.util.Iterator; +import java.util.LinkedList; import java.util.List; import java.util.Set; import java.util.TreeSet; @@ -203,13 +204,12 @@ * @param kbuckets now unused */ private List<Hash> selectFloodfillParticipantsIncludingUs(Hash key, int howMany, Set<Hash> toIgnore, KBucketSet<Hash> kbuckets) { - List<Hash> ffs = selectFloodfillParticipants(toIgnore, kbuckets); - TreeSet<Hash> sorted = new TreeSet<Hash>(new XORComparator<Hash>(key)); - sorted.addAll(ffs); + LinkedList<Hash> sorted = new LinkedList<Hash>(selectFloodfillParticipants(toIgnore, kbuckets)); + Collections.sort(sorted, new XORComparator<Hash>(key)); List<Hash> rv = new ArrayList<Hash>(howMany); - List<Hash> okff = new ArrayList<Hash>(ffs.size()); - List<Hash> badff = new ArrayList<Hash>(ffs.size()); + List<Hash> okff = new ArrayList<Hash>(sorted.size()); + List<Hash> badff = new ArrayList<Hash>(sorted.size()); int found = 0; long now = _context.clock().now(); long installed = _context.getProperty("router.firstInstalled", 0L); @@ -229,14 +229,13 @@ // 5 == FNDF.MAX_TO_FLOOD + 1 int limit = Math.max(5, howMany); - limit = Math.min(limit, ffs.size()); + limit = Math.min(limit, sorted.size()); MaskedIPSet maskedIPs = new MaskedIPSet(limit * 3); // split sorted list into 3 sorted lists for (int i = 0; found < howMany && i < limit; i++) { - Hash entry = sorted.first(); + Hash entry = sorted.poll(); if (entry == null) break; // shouldn't happen - sorted.remove(entry); // put anybody in the same /16 at the end RouterInfo info = _context.netDb().lookupRouterInfoLocally(entry); MaskedIPSet entryIPs = new MaskedIPSet(_context, entry, info, 2); @@ -442,11 +441,10 @@ } // are we corrupting _sorted here? for (int i = rv.size(); i < howMany; i++) { - if (_sorted.isEmpty()) + Hash entry = _sorted.pollFirst(); + if (entry == null) break; - Hash entry = _sorted.first(); rv.add(entry); - _sorted.remove(entry); } return rv; }

### Subtickets

### Change History (10)

### comment:1 Changed 6 weeks ago by

Milestone: | undecided → 0.9.46 |
---|---|

Status: | new → accepted |

### comment:2 Changed 6 weeks ago by

Without having benchmarked.. insertion into a TreeSet is O(logN) but there is a new node created for each element which could be causing the issue. If that's the case it's better to use an ArrayList, also it might sort even faster than a LinkedList.

The inefficiency of removing from the head of an ArrayList can be mitigated because it is known in advance `howMany`

are needed, so a single System.arraycopy will be sufficient.

### comment:3 Changed 6 weeks ago by

Java uses TimSort? which is O(n log(n)) https://en.wikipedia.org/wiki/Timsort

If java addAll() inserts one at a time, then in total it's going to be O(nsquared log(n)), right?

### comment:4 Changed 6 weeks ago by

No, addAll is still O(n*log(n)) or at least should be (you're adding N elements, log(N) each). The devil must be in the details somewhere.

### comment:5 Changed 6 weeks ago by

Right, sort() has the same complexity of O(n * log(n)). The difference is the overhead associated with each step. Adding to the tree means reorganizing the tree each time and that moves much more data around.

Ideally a mass insertion is done by sorting the added data first and doing a merge afterwards which is just what my hint does as the target is empty.

Collections.sort() sorts linked lists as arrays so there is no extra overhead for using a linked list.

### comment:6 Changed 6 weeks ago by

Collections.sort() sorts linked lists as arrays so there is no extra overhead for using a linked list.

List::sort creates a new array and copies all elements to it, and that will involve walking the entire LinkedList, whereas an ArraysList sorts the underlying array in-place.

### comment:7 Changed 6 weeks ago by

OK, so back to my question in comment 1 - have any empirical evidence of the savings with n ~= 1700, or a link to any post by anybody who has done similar analysis?

### comment:8 Changed 5 weeks ago by

Maybe 10% savings? Tweak test code as you like

Size: 1700 runs: 50000 list time: 9873 treeset time: 10938

int SZ = 1700; List<Integer> list = new ArrayList<Integer>(SZ); for (int i = 0; i < SZ; i++) { list.add(Integer.valueOf(i)); } long ltime = 0; long stime = 0; int RUNS = 50000; for (int i = 0; i < RUNS; i++) { Collections.shuffle(list); List<Integer> l2 = new ArrayList<Integer>(SZ); long start1 = System.currentTimeMillis(); l2.addAll(list); Collections.sort(l2); ltime += System.currentTimeMillis() - start1; long start2 = System.currentTimeMillis(); Set<Integer> set = new TreeSet<Integer>(); set.addAll(list); stime += System.currentTimeMillis() - start2; } System.out.println("Size: " + SZ + " runs: " + RUNS + " list time: " + ltime + " treeset time: " + stime);

### comment:9 Changed 5 weeks ago by

This is getting tricky. Converted int to string and used a custom comparator. List came out first on JDK11, Graal 20 and JDK 13 were better on the treeset. The flaw is the shuffle. Timsort expects real-world data, not randomized input.

So the comparison is unfair, the treeset always performs the same regardless of input, the sort takes longer due to randomizing.

### comment:10 Changed 5 weeks ago by

That's why I posted it, adjust it to your liking and do your own tests.

FYI my test was on Java 8.

**Note:**See TracTickets for help on using tickets.

I'll take a look for 0.9.46.

You have any reference link or measurement that this would make a difference for order of 1700 entries?