Opened 7 years ago

Closed 6 years ago

#690 closed defect (fixed)

Bote and/or seedless creating too many connection objects

Reported by: Zlatin Balevsky
Owned by: HungryHobo
Priority: major
Milestone:
Component: apps/plugins
Version: 0.9.1
Keywords: halt OOM
Cc: sponge
Parent Tickets:
Sensitive: no

Description

"user" reported a router halt. Lots of stack traces available here:

http://t5qds7twb7eyyvp2fchhbn74tgdzv774rlru5guit5jkn4a6do7a.b32.i2p/again/

Looking at stack_20 for example we see that one of the runner threads is holding the I2PTunnelRunner.slock monitor while waiting for some other operation to finish. A ton of other I2PTunnelRunner threads are blocked waiting to acquire the same monitor.

"I2PTunnelRunner 8198" daemon prio=10 tid=0x00007f59545d5800 nid=0x7011 in Object.wait() [0x00007f593b7f6000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        at net.i2p.client.streaming.Connection.packetSendChoke(Connection.java:214)
        - locked <0x00000000f7972570> (a java.util.TreeMap)
        at net.i2p.client.streaming.PacketLocal.waitForAccept(PacketLocal.java:215)
        at net.i2p.client.streaming.MessageOutputStream.flushAvailable(MessageOutputStream.java:490)
        at net.i2p.client.streaming.MessageOutputStream.flush(MessageOutputStream.java:341)
        at net.i2p.client.streaming.MessageOutputStream.flush(MessageOutputStream.java:305)
        at net.i2p.i2ptunnel.I2PTunnelRunner.run(I2PTunnelRunner.java:161)
        - locked <0x00000000e2d6aea0> (a java.lang.Object)
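
To illustrate the pattern in the trace above, here is a minimal standalone sketch; it is not the I2PTunnelRunner code. One thread takes an outer monitor and then does a timed wait on a second object, while other threads pile up in the BLOCKED state on the outer monitor. The class name, the `slock` and `inner` objects, and the timings are hypothetical stand-ins chosen for the demo.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class LockConvoyDemo {

    /**
     * Starts one holder thread and `waiters` contender threads, then reports
     * how many contenders are BLOCKED on the outer monitor.
     */
    static int countBlocked(int waiters, long holdMillis) {
        final Object slock = new Object();   // stand-in for the shared outer monitor
        final Object inner = new Object();   // stand-in for the object being waited on
        final CountDownLatch holderInside = new CountDownLatch(1);

        Thread holder = new Thread(() -> {
            synchronized (slock) {
                holderInside.countDown();
                synchronized (inner) {
                    try {
                        // A timed wait releases `inner` but keeps `slock` held.
                        inner.wait(holdMillis);
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                }
            }
        }, "holder");
        holder.setDaemon(true);
        holder.start();

        Thread[] contenders = new Thread[waiters];
        int blocked = 0;
        try {
            holderInside.await();
            for (int i = 0; i < waiters; i++) {
                contenders[i] = new Thread(() -> {
                    synchronized (slock) {
                        // Nothing to do; the point is only to contend for the monitor.
                    }
                }, "contender-" + i);
                contenders[i].setDaemon(true);
                contenders[i].start();
            }
            // Poll until every contender is BLOCKED or half the hold time has passed.
            long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(holdMillis / 2);
            while (System.nanoTime() < deadline) {
                blocked = 0;
                for (Thread t : contenders) {
                    if (t.getState() == Thread.State.BLOCKED) {
                        blocked++;
                    }
                }
                if (blocked == waiters) {
                    break;
                }
                Thread.sleep(10);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return blocked;
    }

    public static void main(String[] args) {
        System.out.println("blocked contenders: " + countBlocked(5, 2000));
    }
}
```

In a thread dump of this demo, the holder would show as TIMED_WAITING while owning the outer monitor, and the contenders as "waiting to lock" it, which is the same shape as the trace above.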

There are some other problems in his report, such as 28468 instances of net.i2p.client.streaming.Connection. It is unclear if they are the cause or the result of the above condition. I'll be updating this post if I gain more insight.

Change History (11)

comment:1 Changed 7 years ago by Zlatin Balevsky

Current working theory:

  1. Client app (bote?) has a bug or legitimate reason for tunnel creation burst
  2. There are no limits to number of tunnels, so many new I2PTunnelRunner threads get created
  3. However, because of that monitor, only one such thread at a time can initialize a tunnel; the rest queue up
  4. Client app has a timeout on its side after which it tries to re-create the tunnels, so it keeps recreating them.
  5. The rate of newly created tunnels is faster than the rate at which old ones fail (250ms), so the JVM runs out of memory
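
The arithmetic behind step 5 can be sketched as follows. This is a back-of-the-envelope model, not a measurement: it assumes the monitor lets exactly one queued runner finish or fail every 250 ms, and the arrival rates are hypothetical values picked for illustration.

```java
public class TunnelBacklogSketch {

    /**
     * Runners still pending after `seconds`, if `arrivalsPerSecond` new ones are
     * created and one leaves the queue every `serviceMillis` (assumed model).
     */
    static long backlogAfter(long seconds, long arrivalsPerSecond, long serviceMillis) {
        long created = seconds * arrivalsPerSecond;
        long drained = seconds * 1000 / serviceMillis;
        return Math.max(0, created - drained);
    }

    public static void main(String[] args) {
        // Hypothetical burst: 10 new tunnels per second for one minute, 250 ms each.
        System.out.println("backlog at 10/s: " + backlogAfter(60, 10, 250));
        // Hypothetical quiet case: 2 new tunnels per second, same service time.
        System.out.println("backlog at 2/s:  " + backlogAfter(60, 2, 250));
    }
}
```

Under this model the queue drains at 4 per second, so any sustained creation rate above that grows without bound, and each queued runner holds its thread and connection objects until it is served.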

comment:2 Changed 7 years ago by zzz

Link above is dead so I'm not following completely, not sure what is stuck where? deadlock or what?

But:

  • yes I think i2pbote has some bugs, HungryHobo is working on at least some of them
  • no we don't generally have outbound limits, only inbound. If you die you did it to yourself.

comment:3 Changed 7 years ago by zzz

Cc: sponge added

cc'ing sponge as, (not completely sure but afaik) bote uses a direct I2CP connection, except for seedless-inside-bote which goes through i2ptunnel. But again, without seeing the traces linked above, just a wild guess.

comment:4 Changed 7 years ago by zzz

Cc: HungryHobo added

… and HH

see also #686

comment:5 Changed 7 years ago by DISABLED

Link is alive again, and content gzip'ed :)

comment:6 Changed 7 years ago by DISABLED

It could well be i2pbote-induced, because since disabling i2pbote it has not happened again

comment:7 Changed 7 years ago by zzz

thx user for bringing back the traces.

Looking at the last one, stack_51, there's 89 threads in HTTPClient:

> grep clientConnectionRun stack_51|sort | uniq -c
     17 	at net.i2p.i2ptunnel.I2PTunnelHTTPClient.clientConnectionRun(I2PTunnelHTTPClient.java:913)
     72 	at net.i2p.i2ptunnel.I2PTunnelHTTPClient.clientConnectionRun(I2PTunnelHTTPClient.java:982)

…and the majority of them (72) are waiting for a connection. The max timeout is 60 seconds, so that means more than one request per second was being issued. The others (17) are waiting for a naming lookup, via lookupDest().

As I said above, all indications are that this is seedless-within-bote. I'm not going to work on it further, it's up to HH and sponge. Something looks majorly out of control inside bote. I really know nothing about it, and am not prepared to start asking what version you are running, etc. It's up to them. It's another indication of the inefficiency of going through the HTTP proxy when you already have an I2CP session. And doing (presumably) b32 lookups through there too. Really bad idea. Use I2PSocketEepGet through your own session.

Now, stack_20 is very different from stack_51, of course:

> grep d6aea0 stack_20|sort | uniq -c
      1 	- locked <0x00000000e2d6aea0> (a java.lang.Object)
    182 	- waiting to lock <0x00000000e2d6aea0> (a java.lang.Object)
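
For anyone without grep at hand, the same tally can be done in Java. This is a small sketch of the `grep | sort | uniq -c` pipeline above, not an I2P tool; the class name and the command-line arguments (dump file path, then address fragment) are made up for the example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class MonitorTally {

    /** Counts identical (trimmed) dump lines that mention the given monitor address fragment. */
    static Map<String, Integer> tally(List<String> dumpLines, String addressFragment) {
        Map<String, Integer> counts = new TreeMap<>(); // sorted, like `sort | uniq -c`
        for (String line : dumpLines) {
            if (line.contains(addressFragment)) {
                counts.merge(line.trim(), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // Usage (hypothetical): java MonitorTally stack_20 d6aea0
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        for (Map.Entry<String, Integer> e : tally(lines, args[1]).entrySet()) {
            System.out.printf("%7d %s%n", e.getValue(), e.getKey());
        }
    }
}
```

One "locked" line against many "waiting to lock" lines for the same address is the signature of a single holder with a queue behind it.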

Again, looks like seedless-gone-insane, but slowed down by the lock. Maybe we're doing seedless a favor?

I went back thru the mtn history and the lock at line 161 of I2PTunnelRunner has been there from the beginning. The comment? I added that. So should we get rid of it? For further study…

If anybody has another favorite stack_xx file, please comment. Those are the only two I looked at.

comment:8 Changed 7 years ago by Zlatin Balevsky

More evidence pointing to seedless and/or bote spinning out of control, 23668 connection objects:

http://t5qds7twb7eyyvp2fchhbn74tgdzv774rlru5guit5jkn4a6do7a.b32.i2p/new/

this is from a router with the monitor in I2PTunnelRunner removed. Is it possible to rename issues in trac?

comment:9 Changed 7 years ago by Zlatin Balevsky

Component: apps/i2ptunnel → apps/other
Summary: Nested locking in I2PTunnelRunner causes halt → Bote and/or seedless creating too many connection objects

renaming issue

comment:10 Changed 7 years ago by zzz

Cc: HungryHobo removed
Component: apps/other → apps/plugins
Owner: set to HungryHobo
Status: new → assigned

I2PTunnelRunner lock removed in 0.9.1-13.

comment:11 Changed 6 years ago by HungryHobo

Resolution: fixed
Status: assigned → closed