Opened 5 years ago

Closed 5 years ago

Last modified 5 years ago

#1412 closed defect (fixed)

i2p loses client tunnels in 0.9.16-9rc

Reported by: Eche|on Owned by:
Priority: major Milestone: 0.9.17
Component: router/general Version: 0.9.16
Keywords: Cc:
Parent Tickets: Sensitive: no

Description

I used some downtime to upgrade my Linux router to 0.9.16-rc9.
It starts up and builds client tunnels as expected.
But as soon as the 20-minute startup period is over, it shows a job lag of 1 msec and a message delay of 0, and all tunnels go to the yellow state. Participating tunnels look fine.
I used the package from the killyourtv repo.
The logs show:
11/26/14 11:14:20 AM WARN [cheduler 3/4] net.i2p.util.Clock : Warning - Updating target clock offset to 6352ms from -16904ms, Stratum 8
Nothing else of note. I will add the wrapper.log soon (with threads dumped).
Logs section:

Subtickets

Attachments (1)

wrapper.log (270.1 KB) - added by Eche|on 5 years ago.
wrapper.log thread dump


Change History (5)

Changed 5 years ago by Eche|on

Attachment: wrapper.log added

wrapper.log thread dump

comment:1 Changed 5 years ago by Eche|on

Sorry, I left this out:

I2P version: 0.9.16-9-rc-1
Java version: Oracle Corporation 1.8.0_25 (Java™ SE Runtime Environment 1.8.0_25-b17)

Wrapper version: 3.5.25
Server version: 8.1.16.v20140903
Servlet version: Jasper JSP 2.1 Engine
Platform: Linux amd64 3.2.0-4-amd64
Processor: Sandy Bridge H/M (corei)
Jbigi: Locally optimized native BigInteger library loaded from file
Encoding: UTF-8
Charset: UTF-8

comment:2 Changed 5 years ago by zzz

Discussed at length in IRC today. Happened twice. Rolled back to 0.9.16-0 successfully.

The router in question has 13-15 active destinations. There is nothing in the logs other than that the BuildExecutor is waiting, presumably because there's no non-zero-hop expl. tunnel to use as the paired tunnel for a build.

I'm reducing the delay from 250 to 25 ms in -10-rc, but I'm not optimistic this will help. Another possibility is to increase the number of expl. tunnels from 2 to 5 to avoid them being overwhelmed at startup.

Neither theory explains why .16-0 works. I found no clues in the diff from -0 after looking carefully. There are very few changes in the router since .16-0.

comment:3 Changed 5 years ago by Eche|on

Priority: blocker → major
Resolution: fixed
Status: new → closed

Applied .16-10rc with 5 expl. tunnels, and it works. I'll keep an eye on it.
Lifting blocker.

comment:4 Changed 5 years ago by zzz

Another thing that may be causing trouble, or something to experiment with. Putting it here so it doesn't get lost:

<zzz> router.tunnelConcurrentBuilds=10
<zzz> the default is dynamically determined based on bandwidth and cpu but for a fast box the default is 10
<zzz> you could try raising it to see if that keeps the dog from barking
<zzz> if you have 220 tunnels, and a success of 50%, thats 440 builds in 10 minutes… average of 10 seconds… that's an average of 8 in progress at a time
<zzz> so once you get behind, that limit could be a problem
<zzz> else if (allowed > MAX_CONCURRENT_BUILDS) allowed = MAX_CONCURRENT_BUILDS; Never go beyond 10, that is uncharted territory (old limit was 5)
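
For reference, the back-of-the-envelope estimate in the log above can be reproduced with a few lines of Java. This is only a sketch of the arithmetic, not router code: the class and method names are hypothetical, the 220 tunnels, 50% success rate, 10-minute lifetime and 10-second build time are the figures quoted above, and the constant merely mirrors the quoted cap of 10. With those inputs the average comes out at roughly 7.3 builds in progress, which the log rounds to 8.

```java
public class BuildLoadEstimate {
    // Hypothetical constant mirroring the cap quoted in the log; not read from the router.
    static final int MAX_CONCURRENT_BUILDS = 10;

    // Build attempts needed to end up with `tunnels` successful builds.
    static double attemptsNeeded(int tunnels, double successRate) {
        return tunnels / successRate;
    }

    // Average number of builds in progress at once:
    // (attempts per second) * (seconds each attempt stays in progress).
    static double averageInFlight(int tunnels, double successRate,
                                  double lifetimeSeconds, double buildSeconds) {
        double attemptsPerSecond = attemptsNeeded(tunnels, successRate) / lifetimeSeconds;
        return attemptsPerSecond * buildSeconds;
    }

    // Same shape as the quoted clamp: never allow more than the cap.
    static int clampAllowed(int allowed) {
        return Math.min(allowed, MAX_CONCURRENT_BUILDS);
    }

    public static void main(String[] args) {
        double attempts = attemptsNeeded(220, 0.5);
        double inFlight = averageInFlight(220, 0.5, 600.0, 10.0);
        System.out.printf("build attempts per 10 minutes: %.0f%n", attempts);
        System.out.printf("average builds in progress: %.2f%n", inFlight);
        System.out.printf("headroom under the cap: %.2f%n", MAX_CONCURRENT_BUILDS - inFlight);
    }
}
```

The headroom under the cap is small, so a lower success rate or slower builds would push the average past 10, which matches the remark that the limit could become a problem once the router falls behind.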
