Opened 7 years ago

Closed 7 years ago

#1125 closed defect (fixed)

net.i2p.client.naming.DummyNamingServiceTest failure

Reported by: Zlatin Balevsky Owned by: Zlatin Balevsky
Priority: major Milestone: 0.9.9
Component: router/netdb Version:
Keywords: Cc: Zlatin Balevsky, zzz, dg
Parent Tickets: Sensitive: no


<zab> my intuition suggests it is related to the behavior with eepsites where the first time you click on an eepsite it times out, but when you retry it opens immediately
<dg> oh that happens to everyone?
<zab> because I'm seeing the exact same behavior - the first time I run the test it times out waiting for a response to a DestLookupMessage
<zab> the second and following times it immediately receives a response


Change History (8)

comment:1 Changed 7 years ago by Zlatin Balevsky

Traced this all the way to net.i2p.router.networkdb.kademlia.SearchJob. It must be finding the leaseSet but for some reason that job invokes the onFailure job.

comment:2 Changed 7 years ago by Zlatin Balevsky

Component: router/generalrouter/netdb
Owner: set to Zlatin Balevsky
Priority: minormajor
Status: newaccepted

comment:3 Changed 7 years ago by Zlatin Balevsky

I am observing the following behavior which I can't explain:

  1. Client issues a leaseSet lookup by hash
  2. leaseSet is not present locally, netDB starts an IterativeSearchJob
  3. That job sends encrypted DLM's to 5 peers, 3 seconds apart
  4. None of the peers respond (or at least I can't find it in the logs)
  5. The job fails because it reaches the total timeout (15000 ms)
  6. Somehow after this loop is done, the leaseSet is present in the netDB(?!?!) However, the client has already timed out so it does not receive it.

I tweaked the IterativeSarchJob.retry() method to check the netDB locally for the leaseSet and terminate sooner if it's found. That has solved the problem of the failing unit test as well as the problem of eepsites timing out on first request. But I still don't understand the how & why:

# old_revision [6e99142e0ff990ce195df0e37bcfa98c40dd1684]
# patch "router/java/src/net/i2p/router/networkdb/kademlia/"
#  from [5e98cd578cd06c6b1aee9b708717622dafef0fa4]
#    to [e88d25eac8f8ba1ffa73e66236b9ba791130148a]
--- router/java/src/net/i2p/router/networkdb/kademlia/	5e98cd578cd06c6b1aee9b708717622dafef0fa4
+++ router/java/src/net/i2p/router/networkdb/kademlia/	e88d25eac8f8ba1ffa73e66236b9ba791130148a
@@ -188,6 +188,12 @@ class IterativeSearchJob extends FloodSe
         while (true) {
+            if (_facade.lookupLeaseSetLocally(getKey()) != null) {
+                if (_log.shouldLog(Log.DEBUG))
+                    _log.debug("somehow key was found locally ?!?! "+getKey());
+                success();
+                return;
+            }
             Hash peer;
             synchronized (this) {
                 if (_dead) return;

comment:4 Changed 7 years ago by zzz

<zzz> the root cause is probably the matcher expiring or not being triggered
<zzz> I recently fixed the identical problem with LS verifies
<zzz> your patch could still be helpful in some cases but it would need some tweaks

  • zzz researches further

<zzz> 6dc5bed94321ae2b290cfe351511d18465e08f91 is the LS verify fix
<zzz> yeah there's a problem here
<zzz> InboundMessageDistributor? is handling it instead of putting it in the InNetMessagePool? so the matcher never gets called
<zzz> how to do this safely…
<zzz> and does it affect RIs too?

comment:5 Changed 7 years ago by zzz

I'll do the IMD fix. Fixes to the above patch that would be necessary in ISJ:

  • Handle both LS and RI. Probably using the new KNDF.lookupLocallyWithoutValidation()
  • Store old timestamp at start, and then compare it later, as there may be an old version in the db. Use DatabaseEntry?.getDate()

comment:6 Changed 7 years ago by zzz

IMD fix http://pastethis.i2p/show/PmGly4r1CJfcry7K3hRH/

We do not request encrypted responses to RI lookups by default.

comment:7 Changed 7 years ago by zzz

Status: acceptedtesting

Above patch pushed 1615807be24a13c9a1540fae231936bb768139e5

comment:8 Changed 7 years ago by Zlatin Balevsky

Resolution: fixed
Status: testingclosed

Released in production in 0.9.9

Note: See TracTickets for help on using tickets.