Opened 6 years ago

Closed 4 years ago

#758 closed defect (fixed)

I2CP Congestion Control

Reported by: guest Owned by: zzz
Priority: minor Milestone:
Component: api/i2cp Version: 0.9.3
Keywords: review Cc: zab@…
Parent Tickets:

Description (last modified by zzz)

I2P version: 0.9.3-0
Java version: Oracle Corporation 1.7.0_09 (OpenJDK Runtime Environment 1.7.0_09-b30)
Wrapper version: 3.1.1
Server version: 6.1.26
Servlet version: Jasper JSP 2.1 Engine
Platform: Linux amd64 3.5.0-17-generic
Processor: Phenom II / Opteron Gen 3 (Shanghai/Deneb/Heka/Callisto, 45 nm) (athlon64)
Jbigi: Locally optimized native BigInteger library loaded from file
Encoding: UTF-8
Charset: UTF-8

Running 0.9.3 for some hours. Now these two error messages pop up in the log several times, and participating tunnels have dropped from >1000 to about 200:

[P reader 2/4] uter.client.MessageReceivedJob: Error writing out the message status message I2CP write to queue failed
     at net.i2p.router.client.QueuedClientConnectionRunner.doSend(
     at net.i2p.router.client.MessageReceivedJob.messageAvailable(
     at net.i2p.router.client.MessageReceivedJob.runJob(
     at net.i2p.router.client.ClientConnectionRunner.receiveMessage(
     at net.i2p.router.client.ClientManager$HandleJob.runJob(
     at net.i2p.router.client.ClientManager.messageReceived(
     at net.i2p.router.client.ClientManagerFacadeImpl.messageReceived(
     at net.i2p.router.tunnel.InboundMessageDistributor.handleClove(
     at net.i2p.router.message.GarlicMessageReceiver.handleClove(
     at net.i2p.router.message.GarlicMessageReceiver.receive(
     at net.i2p.router.tunnel.InboundMessageDistributor.distribute(
     at net.i2p.router.tunnel.TunnelParticipant$DefragmentedHandler.receiveComplete(
     at net.i2p.router.tunnel.FragmentHandler.receiveComplete(
     at net.i2p.router.tunnel.FragmentHandler.receiveSubsequentFragment(
     at net.i2p.router.tunnel.FragmentHandler.receiveFragment(
     at net.i2p.router.tunnel.FragmentHandler.receiveTunnelMessage(
     at net.i2p.router.tunnel.TunnelParticipant.dispatch(
     at net.i2p.router.tunnel.TunnelDispatcher.dispatch(
     at net.i2p.router.InNetMessagePool.doShortCircuitTunnelData(
     at net.i2p.router.InNetMessagePool.shortCircuitTunnelData(
     at net.i2p.router.InNetMessagePool.add(
     at net.i2p.router.transport.TransportManager.messageReceived(
     at net.i2p.router.transport.TransportImpl.messageReceived(
     at net.i2p.router.transport.ntcp.NTCPConnection$ReadState.receiveLastBlock(
     at net.i2p.router.transport.ntcp.NTCPConnection$ReadState.receiveSubsequent(
     at net.i2p.router.transport.ntcp.NTCPConnection$ReadState.receiveBlock(
     at net.i2p.router.transport.ntcp.NTCPConnection.recvUnencryptedI2NP(
     at net.i2p.router.transport.ntcp.NTCPConnection.recvEncryptedFast(
     at net.i2p.router.transport.ntcp.NTCPConnection.recvEncryptedI2NP(
     at net.i2p.router.transport.ntcp.Reader.processRead(
     at net.i2p.router.transport.ntcp.Reader.access$400(
     at net.i2p.router.transport.ntcp.Reader$

[nal Reader 3] ent.ClientMessageEventListener: Error delivering the payload I2CP write to queue failed
     at net.i2p.router.client.QueuedClientConnectionRunner.doSend(
     at net.i2p.router.client.ClientMessageEventListener.handleReceiveBegin(
     at net.i2p.router.client.ClientMessageEventListener.messageReceived(
     at net.i2p.internal.QueuedI2CPMessageReader$


Change History (10)

comment:1 Changed 6 years ago by zzz

  • Component changed from unspecified to api/i2cp
  • Description modified (diff)
  • Owner set to zzz
  • Status changed from new to accepted

Excellent. As part of the whole bufferbloat project I put limits on the I2CP queue sizes. Some destination on your router must have been very busy.

I'd completely forgotten about this change. I'll take another look at the limits and probably change the error to a warning. And maybe change the queues to CoDel with priority.

Nothing to worry about, and not a direct cause of your participating tunnel count dropping - although the high local traffic would cause participating tunnels to drop.

Thanks for the report. Great to see that a new limit kicked in for somebody.
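For readers following along, here is a minimal sketch of how a size limit on a send queue produces a "write to queue failed" error once a busy destination fills it. The class names `BoundedSendQueue` and `QueueOverflowException` and the limit value are hypothetical illustrations, not the router's actual code or settings.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical exception type, used here only for illustration.
class QueueOverflowException extends RuntimeException {
    QueueOverflowException(String msg) { super(msg); }
}

// Hypothetical stand-in for a per-client send queue with a size limit.
class BoundedSendQueue {
    private final BlockingQueue<String> queue;

    BoundedSendQueue(int maxSize) {
        // A LinkedBlockingQueue built with a capacity rejects offers once full.
        queue = new LinkedBlockingQueue<>(maxSize);
    }

    /** Non-blocking enqueue: fails immediately when the limit is reached. */
    void doSend(String msg) {
        if (!queue.offer(msg)) {
            throw new QueueOverflowException("write to queue failed");
        }
    }

    int size() {
        return queue.size();
    }
}
```

With a limit of 2, the third `doSend()` call throws while the first two messages stay queued, which is the behavior a slow consumer would trigger.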

comment:2 Changed 6 years ago by zzz

Changed to log a warning instead of throwing an exception in 0.9.3-3.

Leaving open as I'm considering switching these queues to CoDelPriorityBlockingQueue.

comment:3 Changed 6 years ago by zab

  • Cc zab@… added

I would be curious to find out why the destination was very busy. Also, was the destination over NTCP or SSU? This may be a symptom of another problem (starvation, deadlock, $UNKNOWN).

I'm trying to think what would be a good way to diagnose this. Thread and heap dumps would be ideal if we can catch them the moment the queue overfills. I'll open a separate ticket with some related ideas.

comment:4 Changed 6 years ago by guest

Well, the only thing I can add is that my router is always quite busy (especially since 0.9.2) at about 700-1000kb/s upload. And I'm seeding a lot of torrents, so it's half i2psnark and half participating.
But it never shows any delayed tasks and message delay is about 100-400ms.

comment:5 Changed 6 years ago by zzz

@zab you're conflating local destinations and transports. The queue that overflowed was in I2CP, unrelated to transport (NTCP/SSU) queues.

I'm not happy with my simple fix in -3 as it will leak resources. I'm going to redo it for -4 and plug more leaks along the way.

Also in -4 I'm going to implement a change to reduce the number of I2CP messages which will keep that queue from filling up so quickly.

Snark does have some issues with holding locks for a long time during checking operations. There are also timer congestion/blocking issues in streaming and in the I2CP bandwidth throttler. Any of these could cause snark to fall behind.

comment:6 Changed 6 years ago by zzz

Improved fix in 0.9.3-4: throw the exception so resources can be reclaimed, but log it as WARN where caught.
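A rough sketch of that pattern, assuming a bounded queue and a simple counter standing in for whatever resources the router holds per message. `Deliverer`, `QueueFullException`, and `buffersInUse` are made-up names for illustration; the real classes in the stack traces above are not reproduced here.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical exception type for a full queue.
class QueueFullException extends RuntimeException {
    QueueFullException(String msg) { super(msg); }
}

// Hypothetical caller that holds a resource while a message is in flight.
class Deliverer {
    private static final Logger LOG = Logger.getLogger("Deliverer");
    private final BlockingQueue<byte[]> queue = new LinkedBlockingQueue<>(1);
    private int buffersInUse = 0; // stand-in for per-message resources

    /** Low level: throw on overflow so the caller learns about the drop. */
    void doSend(byte[] payload) {
        if (!queue.offer(payload)) {
            throw new QueueFullException("I2CP write to queue failed");
        }
    }

    /** Caller: reclaim the resource on failure and log at WARN, not ERROR. */
    boolean deliver(byte[] payload) {
        buffersInUse++; // acquire
        try {
            doSend(payload);
            return true;
        } catch (QueueFullException e) {
            buffersInUse--; // reclaim what the dropped message held
            LOG.log(Level.WARNING, "Dropping message: " + e.getMessage());
            return false;
        }
    }

    int buffersInUse() {
        return buffersInUse;
    }
}
```

Swallowing the failure at the low level (as in -3) would leave `buffersInUse` incremented for a message that never got queued, which is the kind of leak the exception path avoids.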

comment:7 Changed 6 years ago by guest

btw.: thanks for the great work!

comment:8 Changed 6 years ago by zzz

  • Milestone changed from 0.9.4 to 0.9.5
  • Summary changed from 2 errors in 0.9.3 to I2CP Congestion Control

0.9.3-5 contains additional fixes for dropped leaseset requests over I2CP.

Leaving this ticket open to fully review I2CP for correct handling of dropped messages and ensuring no resource leaks. Right now there's no cleaner for one map on the router side, but I worked around it by implementing 'fast receive' so we don't use that map.

I also have priorities and CoDel all coded up but probably won't check it in anytime soon. Part of the problem is a blocking put() on the client side that we lose if we go to PBQ (PriorityBlockingQueue).
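To illustrate the put() point: `java.util.concurrent.PriorityBlockingQueue` is unbounded, so its insert operations never block or fail for lack of space, and a producer gets no backpressure. The sketch below uses a timed `offer()` as a stand-in for a blocking `put()` so it terminates; `BackpressureDemo` and `fill` are hypothetical names.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.TimeUnit;

class BackpressureDemo {
    /**
     * Tries to enqueue `count` items, waiting up to `timeoutMs` for space
     * on each one. Returns how many items the queue accepted.
     */
    static int fill(BlockingQueue<Integer> q, int count, long timeoutMs) {
        int accepted = 0;
        try {
            for (int i = 0; i < count; i++) {
                if (q.offer(i, timeoutMs, TimeUnit.MILLISECONDS)) {
                    accepted++;
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve interrupt status
        }
        return accepted;
    }

    public static void main(String[] args) {
        // Bounded queue: the producer waits, then gives up once it is full.
        System.out.println(fill(new LinkedBlockingQueue<>(2), 5, 10));
        // The constructor argument is only an initial capacity, not a limit.
        System.out.println(fill(new PriorityBlockingQueue<>(2), 5, 10));
    }
}
```

With nothing draining the queues, the bounded queue accepts only as many items as its capacity, while the priority queue accepts all of them. Keeping a limit with a priority queue would need a separate bound layered on top.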

Some issues are also different for in-JVM and external clients.

The whole issue of congestion control and anti-bufferbloat in I2CP needs some attention.

The immediate issue in the OP should be addressed, so I'm renaming the ticket and pushing it out a release for further study.

comment:9 Changed 6 years ago by str4d

  • Keywords review added
  • Milestone 0.9.5 deleted

comment:10 Changed 4 years ago by zzz

  • Resolution set to fixed
  • Status changed from accepted to closed

declaring fixed, as much as we are going to.
