Opened 14 months ago

Closed 12 months ago

Last modified 10 months ago

#2314 closed enhancement (worksforme)

Snark penalized under higher load

Reported by: jogger
Owned by:
Priority: major
Milestone: undecided
Component: router/general
Version: 0.9.36
Keywords:
Cc:
Parent Tickets:
Sensitive: no

Description

I have seen throughput issues with Snark (this should affect other webapps as well) under higher system load, around 80%. I was seeding quite a few torrents from 6 instances. Outbound Snark traffic was around 600 kBps. It rose to 1 MBps when the share % was lowered or a shutdown was initiated, thus lowering system load. It dropped back to 600 kBps when the share limit was lifted or the shutdown was cancelled.

It also works the other way round: with the system running at 65% load, a scan of some torrents is initiated, occupying one CPU core for half an hour and bringing system load to 80%. Snark throughput drops, but the number of participating tunnels and their traffic stay the same.

It does not matter whether Snark is connected standalone through I2CP or run within the router.

From TunnelGatewayPumper.java:

  • TODO this combines IBGWs and OBGWs, do we wish to separate the two
  • and/or prioritize OBGWs (i.e. our outbound traffic) over IBGWs (participating)?

Maybe this is the hint to solve the issue: deliver packets for internal destinations first and send our own traffic first. I think this is important for targeting low-end platforms.
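
For illustration only, the idea in the quoted TODO (serve our own outbound gateways before participating inbound gateways) could be sketched roughly as below. This is not taken from TunnelGatewayPumper.java; the class `GatewayQueueSketch`, the enum `Kind`, and the methods `enqueue` and `drain` are hypothetical names invented for this example, and the real pumper may be organised quite differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch; none of these names come from the I2P source. */
final class GatewayQueueSketch {

    /** Declaration order is service order: the lower ordinal is served first. */
    enum Kind { OUTBOUND_LOCAL, INBOUND_PARTICIPATING }

    /** One queued gateway; seq keeps FIFO order among entries of the same kind. */
    static final class Entry implements Comparable<Entry> {
        final Kind kind;
        final String name;
        final long seq;

        Entry(Kind kind, String name, long seq) {
            this.kind = kind;
            this.name = name;
            this.seq = seq;
        }

        @Override
        public int compareTo(Entry other) {
            int byKind = kind.compareTo(other.kind);
            return byKind != 0 ? byKind : Long.compare(seq, other.seq);
        }
    }

    private final PriorityBlockingQueue<Entry> queue = new PriorityBlockingQueue<>();
    private final AtomicLong counter = new AtomicLong();

    /** Registers a gateway that has data waiting to be pumped. */
    void enqueue(Kind kind, String name) {
        queue.add(new Entry(kind, name, counter.getAndIncrement()));
    }

    /** Removes everything currently queued and returns the names in service order. */
    List<String> drain() {
        List<String> served = new ArrayList<>();
        Entry e;
        while ((e = queue.poll()) != null) {
            served.add(e.name);
        }
        return served;
    }
}
```

Strict ordering like this can starve the lower class entirely under sustained local load, so a real change would probably need some fairness bound; that trade-off is an assumption of this sketch, not something measured in this ticket.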

Change History (5)

comment:1 Changed 14 months ago by zzz

Not sure if there's any bug here. As load increases, latencies are going to increase, and throughput is going to go down.

Additionally, we always prioritize traffic for internal destinations over participating traffic (as OP proposes in last paragraph). Participating traffic will eventually move to other routers if performance is poor, but that may take minutes or hours - remember that tunnels are a 10 minute commitment, and there's no central authority to rate routers.

Snark is marked as lower priority than other local traffic (e.g. IRC, HTTP) but it sounds like all your local traffic is snark.

Never heard of anybody getting even close to 1 MBps for snark, that's crazy fast.

comment:2 Changed 14 months ago by jogger

I have narrowed down the issue. It may be Linux only. The basic reason is the well-known fact that Linux CFS does not play well with Java: Snark threads do not get CPU often enough. Please close this one; I will analyse this further and then come up with a suggestion concerning the number and priority of threads.

And yes, over 1 MBps Snark output is possible on ARM 32, given enough leechers that take 50 to 100 kBps each through a single tunnel.
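
As a rough illustration of what adjusting thread priority looks like at the Java level, a thread factory can assign a fixed priority to every worker it creates. This is only a sketch: `PriorityThreadFactory` and its parameters are hypothetical names and not part of Snark or the router. Whether the priority has any effect on scheduling depends on the JVM and operating system; as far as I know, HotSpot on Linux does not map Java priorities to native ones under its default settings.

```java
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper: creates daemon threads with a fixed Java priority. */
final class PriorityThreadFactory implements ThreadFactory {
    private final String prefix;
    private final int priority;
    private final AtomicInteger count = new AtomicInteger();

    PriorityThreadFactory(String prefix, int priority) {
        if (priority < Thread.MIN_PRIORITY || priority > Thread.MAX_PRIORITY) {
            throw new IllegalArgumentException("priority out of range: " + priority);
        }
        this.prefix = prefix;
        this.priority = priority;
    }

    @Override
    public Thread newThread(Runnable task) {
        Thread t = new Thread(task, prefix + "-" + count.incrementAndGet());
        t.setDaemon(true);
        // A hint to the scheduler only; the JVM and OS decide what it means.
        t.setPriority(priority);
        return t;
    }
}
```

Such a factory could be passed to `java.util.concurrent.Executors.newFixedThreadPool(int, ThreadFactory)` to control both the number and the priority of worker threads in one place.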

comment:3 Changed 13 months ago by jogger

Can't currently reproduce; my router has not had enough load for some weeks (maybe thanks to NTCP2?). However, maybe this is just #1411 resurfacing? See the description in #1943. This is on an 8-core machine, so when one CPU maxes out the machine keeps going quite well.

comment:4 Changed 12 months ago by zzz

Resolution: worksforme
Status: new → closed

closing as requested

comment:5 Changed 10 months ago by jogger

After zab's patch and some tweaks I was able to run the same large torrent off two identical machines with 200 and 2500+ tunnels for some weeks. The performance penalty for the loaded machine was < 10%, measured as bytes seeded, so it works for me now.
