Opened 11 months ago

Closed 9 months ago

## #2694 closed enhancement (fixed)

# FloodfillPeerSelector: Remove bottlenecks

Reported by: | jogger | Owned by: | zzz |
---|---|---|---|

Priority: | minor | Milestone: | 0.9.46 |

Component: | router/netdb | Version: | 0.9.44 |

Keywords: | Cc: | ||

Parent Tickets: | Sensitive: | no |

### Description

The addall() to the sorted treeset costs lots of CPU due to poor implementation of mass additions on the Java level. Collections.sort() is much faster.

the _sorted treeset is accessed much too often at the end of get().

Patch:

--- i2p-0.9.44/router/java/src/net/i2p/router/networkdb/kademlia/FloodfillPeerSelector.java +++ 44dev/router/java/src/net/i2p/router/networkdb/kademlia/FloodfillPeerSelector.java @@ -13,6 +13,7 @@ import java.util.Collections; import java.util.HashSet; import java.util.Iterator; +import java.util.LinkedList; import java.util.List; import java.util.Set; import java.util.TreeSet; @@ -203,13 +204,12 @@ * @param kbuckets now unused */ private List<Hash> selectFloodfillParticipantsIncludingUs(Hash key, int howMany, Set<Hash> toIgnore, KBucketSet<Hash> kbuckets) { - List<Hash> ffs = selectFloodfillParticipants(toIgnore, kbuckets); - TreeSet<Hash> sorted = new TreeSet<Hash>(new XORComparator<Hash>(key)); - sorted.addAll(ffs); + LinkedList<Hash> sorted = new LinkedList<Hash>(selectFloodfillParticipants(toIgnore, kbuckets)); + Collections.sort(sorted, new XORComparator<Hash>(key)); List<Hash> rv = new ArrayList<Hash>(howMany); - List<Hash> okff = new ArrayList<Hash>(ffs.size()); - List<Hash> badff = new ArrayList<Hash>(ffs.size()); + List<Hash> okff = new ArrayList<Hash>(sorted.size()); + List<Hash> badff = new ArrayList<Hash>(sorted.size()); int found = 0; long now = _context.clock().now(); long installed = _context.getProperty("router.firstInstalled", 0L); @@ -229,14 +229,13 @@ // 5 == FNDF.MAX_TO_FLOOD + 1 int limit = Math.max(5, howMany); - limit = Math.min(limit, ffs.size()); + limit = Math.min(limit, sorted.size()); MaskedIPSet maskedIPs = new MaskedIPSet(limit * 3); // split sorted list into 3 sorted lists for (int i = 0; found < howMany && i < limit; i++) { - Hash entry = sorted.first(); + Hash entry = sorted.poll(); if (entry == null) break; // shouldn't happen - sorted.remove(entry); // put anybody in the same /16 at the end RouterInfo info = _context.netDb().lookupRouterInfoLocally(entry); MaskedIPSet entryIPs = new MaskedIPSet(_context, entry, info, 2); @@ -442,11 +441,10 @@ } // are we corrupting _sorted here? for (int i = rv.size(); i < howMany; i++) { - if (_sorted.isEmpty()) + Hash entry = _sorted.pollFirst(); + if (entry == null) break; - Hash entry = _sorted.first(); rv.add(entry); - _sorted.remove(entry); } return rv; }

### Subtickets

### Change History (11)

### comment:1 Changed 11 months ago by

Milestone: | undecided → 0.9.46 |
---|---|

Status: | new → accepted |

### comment:2 Changed 11 months ago by

Without having benchmarked.. insertion into a TreeSet is O(logN) but there is a new node created for each element which could be causing the issue. If that's the case it's better to use an ArrayList, also it might sort even faster than a LinkedList.

The inefficiency of removing from the head of an ArrayList can be mitigated because it is known in advance `howMany`

are needed, so a single System.arraycopy will be sufficient.

### comment:3 Changed 11 months ago by

Java uses TimSort? which is O(n log(n)) https://en.wikipedia.org/wiki/Timsort

If java addAll() inserts one at a time, then in total it's going to be O(nsquared log(n)), right?

### comment:4 Changed 11 months ago by

No, addAll is still O(n*log(n)) or at least should be (you're adding N elements, log(N) each). The devil must be in the details somewhere.

### comment:5 Changed 11 months ago by

Right, sort() has the same complexity of O(n * log(n)). The difference is the overhead associated with each step. Adding to the tree means reorganizing the tree each time and that moves much more data around.

Ideally a mass insertion is done by sorting the added data first and doing a merge afterwards which is just what my hint does as the target is empty.

Collections.sort() sorts linked lists as arrays so there is no extra overhead for using a linked list.

### comment:6 Changed 11 months ago by

Collections.sort() sorts linked lists as arrays so there is no extra overhead for using a linked list.

List::sort creates a new array and copies all elements to it, and that will involve walking the entire LinkedList, whereas an ArraysList sorts the underlying array in-place.

### comment:7 Changed 11 months ago by

OK, so back to my question in comment 1 - have any empirical evidence of the savings with n ~= 1700, or a link to any post by anybody who has done similar analysis?

### comment:8 Changed 11 months ago by

Maybe 10% savings? Tweak test code as you like

Size: 1700 runs: 50000 list time: 9873 treeset time: 10938

int SZ = 1700; List<Integer> list = new ArrayList<Integer>(SZ); for (int i = 0; i < SZ; i++) { list.add(Integer.valueOf(i)); } long ltime = 0; long stime = 0; int RUNS = 50000; for (int i = 0; i < RUNS; i++) { Collections.shuffle(list); List<Integer> l2 = new ArrayList<Integer>(SZ); long start1 = System.currentTimeMillis(); l2.addAll(list); Collections.sort(l2); ltime += System.currentTimeMillis() - start1; long start2 = System.currentTimeMillis(); Set<Integer> set = new TreeSet<Integer>(); set.addAll(list); stime += System.currentTimeMillis() - start2; } System.out.println("Size: " + SZ + " runs: " + RUNS + " list time: " + ltime + " treeset time: " + stime);

### comment:9 Changed 11 months ago by

This is getting tricky. Converted int to string and used a custom comparator. List came out first on JDK11, Graal 20 and JDK 13 were better on the treeset. The flaw is the shuffle. Timsort expects real-world data, not randomized input.

So the comparison is unfair, the treeset always performs the same regardless of input, the sort takes longer due to randomizing.

### comment:10 Changed 11 months ago by

That's why I posted it, adjust it to your liking and do your own tests.

FYI my test was on Java 8.

### comment:11 Changed 9 months ago by

Resolution: | → fixed |
---|---|

Status: | accepted → closed |

Still not sure I understand comment 9, real-world data _is_ random.

However, I agree with OP that one-time sort of List is better than TreeSet?.addAll(). ArrayList? even better than LinkedList?.

Took first part of patch, tweaked to avoid additional copy by sorting the first list.

After that, no longer removing items from front of sorted.

Reduced size of okff and badff to max required.

Did not take the latter part of OP's patch to the Collector, that code is already known to be slow and I think rarely used now, to be investigated further, perhaps we can get rid of it completely by converting the remaining users of the slow (kbucket) methods to the faster (selectFloodfill) ones.

In 03c112ebc403b94b94031be3afbfb68a417c3a91 to be 0.9.45-18

**Note:**See TracTickets for help on using tickets.

I'll take a look for 0.9.46.

You have any reference link or measurement that this would make a difference for order of 1700 entries?