Opened 3 years ago
Last modified 3 years ago
#2251 accepted enhancement
Remove sleep from EventPumper
Reported by: | Zlatin Balevsky | Owned by: | zzz |
---|---|---|---|
Priority: | minor | Milestone: | 0.9.36 |
Component: | router/transport | Version: | 0.9.34 |
Keywords: | ntcp nio | Cc: | |
Parent Tickets: | Sensitive: | no |
Description (last modified by )
Just opening a discussion on this one. I've been running a router with patches from #2247 + removed sleep() for a few hours with YourKit profiling enabled on OpenJDK 8 and cpu usage looks good. Bandwidth usage also looks higher than normal, but that could be due to it being Friday night, so I will wait for a few days and post a graph.
Subtickets
Attachments (1)
Change History (9)
comment:1 Changed 3 years ago by
comment:2 Changed 3 years ago by
Description: | modified (diff) |
---|---|
Milestone: | undecided → 0.9.36 |
comment:3 Changed 3 years ago by
We put that sleep in there out of desperation, but it is effective.
Unfortunately, we have no easy way to know or find out if 100% CPU issues have been eliminated on all platforms, for all possible JREs.
Perhaps we need a better way to determine if a sleep is necessary, perhaps by counting consecutive loops with no valid keys. We'd have to minimize pointless wakeup()s first.
Following probably applies to #2247 but putting it here re: minimizing wakeup()s:
In Jetty outside the selector (see their SelectChannelEndPoint?.java) they have a local AtomicInteger? for wanted local interest ops. Then they compareAndSet, and if different, queue the change to the selector.
Then in selector thread (see their ManagedSelector?.java) they have a state with an AtomicReference? for a state enum. They compareAndSet that, and then wakeup if changed.
Changed 3 years ago by
Attachment: | bw-no-sleep.png added |
---|
BW for 10 days with patches from 2247 and no sleep()
comment:4 Changed 3 years ago by
Ok, if sleep() is here to stay I'm going to stop my experiment (but doesn't that bandwidth graph look just beautiful?) and move on to #2252
comment:5 Changed 3 years ago by
survey of throttles per day:
medium-speed home router: 2
medium-fast non-ff: 30
fast ff: 3000
so the sleep isn't so harmless on a big router.
obviously loops in 2 sec will go up with more keys.
changing threshold to increase with increased keys, testing now
comment:6 Changed 3 years ago by
I didn't count the actual events on my busy router, but from YourKit profiling over a period of 60 seconds, 1.5 seconds were spent in sleep = 1500/25 = 60 such events in that one minute period.
Here's another thought - why don't we declare some platform combinations "fixed" and remove the sleep just for those? For example (x86, OpenJDK/OracleJDK, v1.7+) can be one such combination. It doesn't need to be an exhaustive list but as long as we can cover the most common ones it will still be impactful.
comment:7 Changed 3 years ago by
Status: | new → accepted |
---|
Existing stat is ntcp.failsafeThrottle
Threshold changed as in comment 5 in ca3e097fd6f16a2c304ee67a679f87af96e918ff 0.9.35-13 to increases the limit from 512 to max(512, 2*connections)
Test results: reduced throttles-per-day to zero for the fast ff; no significant change for medium and medium-fast routers.
Leaving open for additional criteria, as proposed in comment 6
comment:8 Changed 3 years ago by
Here is a patch that disables selector sleep on { x86_64, (Oracle | Open)JDK, Java7+) platform combo
http://zerobin.i2p/?827d4b81f047b696#Y53PWnL7atnKA8VaK9VuiQo18YckEB8GPB5ixzL1Z6I=
Er, I meant ticket #2247