Opened 18 months ago
Last modified 17 months ago
#2619 new defect
NTCP Writer: Do not run dry
Reported by:    | jogger           | Owned by:  | zzz
Priority:       | minor            | Milestone: | undecided
Component:      | router/transport | Version:   | 0.9.42
Keywords:       |                  | Cc:        |
Parent Tickets: |                  | Sensitive: | no
Description
Writer.java has a comment in wantsWrite() that highlights the bug: if a con is already transmitting, the next write request for the same con wakes up another Runner instance, which falls back into wait() immediately.
Putting the notify() where the comment is solves the issue. NTCP Writer then runs less frequently, with a longer average time slice, and CPU usage is clearly down. The code is a bit hard to read; using only two sets would make it much more readable.
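For readers unfamiliar with the wait()/notify() pattern in question, here is a minimal, hypothetical two-set model of the idea (class and field names are illustrative, not the actual Writer.java): the producer only calls notify() when a connection was actually queued, so an idle runner is never woken just to fall back into wait().

    import java.util.LinkedHashSet;
    import java.util.Set;

    // Illustrative sketch only -- not the real net.i2p.router.transport.ntcp.Writer.
    class PendingWriteQueue {
        private final Set<String> pending = new LinkedHashSet<>();
        private final Set<String> live = new LinkedHashSet<>();

        // Producer side, analogous to wantsWrite(): notify only when work was queued.
        void wantsWrite(String con) {
            synchronized (pending) {
                if (live.contains(con))
                    return; // a runner is already writing this con; no wakeup needed
                            // (the real Writer also remembers it for a follow-up write)
                if (pending.add(con))
                    pending.notify(); // wake exactly one idle runner
                // the pre-patch code called notify() unconditionally here, waking a
                // runner that found nothing new and fell straight back into wait()
            }
        }

        // Consumer side, analogous to Runner.run(): sleep while nothing is pending.
        void runnerLoop() throws InterruptedException {
            while (true) {
                String con;
                synchronized (pending) {
                    while (pending.isEmpty())
                        pending.wait();
                    con = pending.iterator().next();
                    pending.remove(con);
                    live.add(con);
                }
                // ... write to con outside the lock ...
                synchronized (pending) {
                    live.remove(con);
                }
            }
        }
    }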
Subtickets
Change History (13)
comment:1 Changed 18 months ago by
This would be easier if you supplied a patch, but I think you want to move the notify() call from line 67 to 65?
I don't understand your "two sets" comment or whether it's related at all to the notify() thing.
comment:2 Changed 18 months ago by
Yes, that's line 65, and there is another unnecessary notify() in connectionClosed(). Patch below. I was referring to the fact that having three sets for just four states (inactive, pending, active, or both) makes the code hard to read, and that might be the reason nobody ever resolved this. So I removed one of them.
I should also explain why the _liveWrites logic is retained even though it partially defeats multithreading: NTCP Writer runs for a much shorter time than the minimum Linux time slice, so looping over the same con avoids context switching and cache invalidation. Also, under load another thread would spend about as much time waiting on the processor run queue before running as the current writer needs to process the packet.
--- i2p-0.9.42/router/java/src/net/i2p/router/transport/ntcp/Writer.java
+++ 42dev/router/java/src/net/i2p/router/transport/ntcp/Writer.java
@@ -21,17 +21,15 @@
     private final Log _log;
     private final Set<NTCPConnection> _pendingConnections;
     private final Set<NTCPConnection> _liveWrites;
-    private final Set<NTCPConnection> _writeAfterLive;
     private final List<Runner> _runners;
-
+
     public Writer(RouterContext ctx) {
         _log = ctx.logManager().getLog(getClass());
         _pendingConnections = new LinkedHashSet<NTCPConnection>(16);
         _runners = new ArrayList<Runner>(5);
         _liveWrites = new HashSet<NTCPConnection>(5);
-        _writeAfterLive = new HashSet<NTCPConnection>(5);
     }
-
+
     public synchronized void startWriting(int numWriters) {
         for (int i = 1; i <=numWriters; i++) {
             Runner r = new Runner();
@@ -46,25 +44,22 @@
             r.stop();
         }
         synchronized (_pendingConnections) {
-            _writeAfterLive.clear();
+            _liveWrites.clear();
+            _pendingConnections.clear();
             _pendingConnections.notifyAll();
         }
     }
-
+
     public void wantsWrite(NTCPConnection con, String source) {
         //if (con.getCurrentOutbound() != null)
         //    throw new RuntimeException("Current outbound message already in play on " + con);
-        boolean already = false;
-        boolean pending = false;
+        boolean already;
+        boolean pending;
         synchronized (_pendingConnections) {
-            if (_liveWrites.contains(con)) {
-                _writeAfterLive.add(con);
-                already = true;
-            } else {
-                pending = _pendingConnections.add(con);
-                // only notify here if added?
-            }
-            _pendingConnections.notify();
+            already = _liveWrites.contains(con);
+            pending = _pendingConnections.add(con);
+            if (!already)
+                _pendingConnections.notify();
         }
         if (_log.shouldLog(Log.DEBUG))
             _log.debug("wantsWrite: " + con + " already live? " + already + " added to pending? " + pending + ": " + source);
@@ -72,18 +67,15 @@
     public void connectionClosed(NTCPConnection con) {
         synchronized (_pendingConnections) {
-            _writeAfterLive.remove(con);
+            _liveWrites.remove(con);
             _pendingConnections.remove(con);
-            // necessary?
-            _pendingConnections.notify();
         }
     }
-
+
     private class Runner implements Runnable {
-
+
        /** a scratch space to serialize and encrypt messages */
        private final NTCPConnection.PrepBuffer _prepBuffer;
-
+
        private volatile boolean _stop;

        public Runner() {
@@ -98,27 +90,31 @@
            while (!_stop) {
                try {
                    synchronized (_pendingConnections) {
-                        boolean keepWriting = (con != null) && _writeAfterLive.remove(con);
+                        _liveWrites.remove(con);
+                        boolean keepWriting = (con != null) && _pendingConnections.contains(con);
                        if (keepWriting) {
                            // keep on writing the same one
                            if (_log.shouldLog(Log.DEBUG))
                                _log.debug("Keep writing on the same connection: " + con);
                        } else {
-                            _liveWrites.remove(con);
                            con = null;
                            if (_pendingConnections.isEmpty()) {
-                                if (_log.shouldLog(Log.DEBUG))
-                                    _log.debug("Done writing, but nothing pending, so wait");
-                                _pendingConnections.wait();
-                            } else {
-                                Iterator<NTCPConnection> iter = _pendingConnections.iterator();
+                                if (_log.shouldLog(Log.DEBUG))
+                                    _log.debug("Done writing, but nothing pending, so wait");
+                                _pendingConnections.wait();
+                            }
+                            Iterator<NTCPConnection> iter = _pendingConnections.iterator();
+                            while (iter.hasNext()) {
                                con = iter.next();
-                                iter.remove();
-                                _liveWrites.add(con);
-                                if (_log.shouldLog(Log.DEBUG))
-                                    _log.debug("Switch to writing on: " + con);
+                                if (_liveWrites.add(con)) {
+                                    if (_log.shouldLog(Log.DEBUG))
+                                        _log.debug("Switch to writing on: " + con);
+                                    break;
+                                }
+                                con = null;
                            }
                        }
+                        _pendingConnections.remove(con);
                    }
                } catch (InterruptedException ie) {}
                if (!_stop && (con != null)) {
comment:3 Changed 18 months ago by
I'm testing moving the notify (only) in Writer and Reader. Haven't looked closely at the other changes yet.
related: #2620
comment:4 Changed 18 months ago by
No ill effects seen so far. In case anybody else wants to play along:
============================================================
--- router/java/src/net/i2p/router/transport/ntcp/Reader.java   288ad24419cf980ee9841954b9e7d9ddcebab57f
+++ router/java/src/net/i2p/router/transport/ntcp/Reader.java   c182983c351e1a4689518652915f5bb3b0b1e333
@@ -64,9 +64,8 @@ class Reader {
                 already = true;
             } else {
                 _pendingConnections.add(con);
-                // only notify here if added?
+                _pendingConnections.notify();
             }
-            _pendingConnections.notify();
         }
         if (_log.shouldLog(Log.DEBUG))
             _log.debug("wantsRead: " + con + " already live? " + already);
============================================================
--- router/java/src/net/i2p/router/transport/ntcp/Writer.java   7e120839ba0da16f20d6ef46f90681c676d5c594
+++ router/java/src/net/i2p/router/transport/ntcp/Writer.java   ecffb6382dcc3ab23774809b3e70440860efc015
@@ -62,9 +62,8 @@ class Writer {
                 already = true;
             } else {
                 pending = _pendingConnections.add(con);
-                // only notify here if added?
+                _pendingConnections.notify();
             }
-            _pendingConnections.notify();
         }
         if (_log.shouldLog(Log.DEBUG))
             _log.debug("wantsWrite: " + con + " already live? " + already + " added to pending? " + pending + ": " + source);
comment:5 Changed 18 months ago by
20 readings from eepget on a testnet with patch from comment 4:
073.17 145.50 144.13 145.60 120.55 146.98 100.33 155.60 156.12 169.70 152.36 098.94 147.73 158.96 155.32 103.25 036.98 146.73 047.75 080.38
No artificial delay or loss was added, and SSU was not disabled, so there is another variable to consider, but so far throughput looks about the same as vanilla.
Let me know if you would like me to disable SSU and retest.
comment:6 Changed 18 months ago by
I assume you are not running in a CPU-constrained environment, where the above patches have the most benefit. Before I post patches I precheck them for a positive effect on context switches and/or CPU. It would be nice if you could provide such figures along with your test results. I could provide you with a bash script for Linux to get started.
I am also not sure whether your eepget test captures any latency effects. Those can only be seen when there is a lot of request/response traffic at the message level, so that the context switches add up.
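No script was actually posted in the ticket; as a rough sketch of the kind of per-thread figures being discussed (voluntary context switches and run-queue wait), here is a small Java program. It assumes Linux with a readable /proc, and that the second field of /proc/<pid>/task/<tid>/schedstat is the run-queue wait time in nanoseconds; the class name and output format are illustrative.

    import java.io.IOException;
    import java.nio.file.*;
    import java.util.stream.Stream;

    // Hypothetical helper: dump thread name, voluntary context switches and
    // run-queue wait time for every thread of a given PID (default: this JVM).
    public class ThreadStats {
        public static void main(String[] args) throws IOException {
            String pid = args.length > 0 ? args[0] : "self";
            try (Stream<Path> tasks = Files.list(Paths.get("/proc/" + pid + "/task"))) {
                tasks.forEach(ThreadStats::dump);
            }
        }

        private static void dump(Path task) {
            try {
                String name = "?", volCtxt = "?";
                for (String line : Files.readAllLines(task.resolve("status"))) {
                    if (line.startsWith("Name:"))
                        name = line.substring(5).trim();
                    else if (line.startsWith("voluntary_ctxt_switches:"))
                        volCtxt = line.substring(line.indexOf(':') + 1).trim();
                }
                // schedstat: cpu time (ns), run-queue wait time (ns), number of timeslices
                String[] sched = Files.readAllLines(task.resolve("schedstat")).get(0).split("\\s+");
                System.out.printf("%-15s vol_ctxt=%s runq_wait_ns=%s%n", name, volCtxt, sched[1]);
            } catch (IOException e) {
                // the thread may have exited between listing and reading; ignore
            }
        }
    }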
comment:7 Changed 17 months ago by
OK, so zab's test shows no ill effects but no evidence of improvement, and jogger has provided a possible explanation. Jogger, I know you have tested the patch in comment 2, but have you tested the one in comment 4? If it does provide a benefit I can get it in for .43.
comment:8 Changed 17 months ago by
Moving the statements as in comment 4 was the first thing I tried, before the other improvements, so I had already tested that OK before providing comment 2.
comment:9 Changed 17 months ago by
@jogger thanks, you say the change in comment 4 "tested OK", but does it provide any measurable benefit? Do you have any test results?
comment:10 Changed 17 months ago by
OK, so here is the long-term logging I use to test the usefulness of all my improvements on the live net. These are two long samples (>1000 sec), both taken at 22.6% total CPU, with all values averaged per second. Column 2 is the number of voluntary context switches, column 6 the time spent waiting on the processor run queue. The comparison is fair, as the number of pumper activations was nearly exactly the same. As one can see, the number of context switches for the NTCP reader/writer is clearly down compared to the pumper, and the time spent on the run queue is cut by more than half (helped also by the complete elimination of the Tunnel GW Pumper and NTCP Send Finisher).
Stock 0.9.42
226
BuildExecut    5    6   10   26   14  2527 1350  5495 2935
BuildHandle    4   32   36   94   22  2632  606 24167 5569
I2CP_Reader   82   41  123  185   42  1504  345  2247  515
I2CP_Writer   82    0   82    7   15    88  183    88  183
NTCP_Pumper  854  146  999  325  106   325  106   380  124
NTCP_reader  741  189  930  416  265   448  285   562  357
NTCPSendFin  953    0  953   55  114    58  120    58  120
NTCP_writer 1054    0 1054  125  154   119  146   119  146
Tunnel_GW_p  462   36  498  229  142   460  285   496  307
UDP_ACK_sen   11    0   11   30    2  2586  183  2647  187
UDP_Establi    7    5   12   21    5  1699  426  3057  767
UDP_message  167    1  168   51   29   301  172   303  173
UDP_packet_  124  106  230  244   68  1058  294  1967  547
UDP_Packet_  222    4  226   64   33   282  147   287  149
UDPReceiver  241    0  241   26   42   109  175   109  175
UDPSender_1  228    0  228   34   37   150  163   150  163
My current dev version, containing my published patches and then some:
226
BuildExec   4   1   5  24  2  4548 355  5647 441
BuildHand   4   6   9 105  3 11432 323 29220 826
I2CP_Read  87   0  88  31 18   356 204   357 205
I2CP_Writ  71   0  71   8  5   115  75   115  75
JobQueue_ 267  54 321 479 83  1493 259  1795 312
NTCP_Pump 857  71 928 234 43   252  47   273  50
NTCP_read 495  20 514 365 41   710  80   738  83
NTCP_writ 842   9 851 135 53   158  63   160  64
UDP_ACK_s  18   2  20  56  3  2822 146  3108 161
UDP_Estab   7   1   8  24  1  2834 121  3381 145
UDP_messa 297   2 299 134 19   447  65   450  65
UDP_packe 190  32 222 196 19   882  83  1031  98
UDP_Packe 314  25 339 102 23   301  67   325  72
UDPReceiv 348   2 351  42 44   121 124   122 125
UDPSender 354   6 360  48 18   133  50   135  51
comment:11 Changed 17 months ago by
Thanks for the data. I was hoping for test results for the comment 4 patch *only* - since zab's test didn't show any benefit, we don't really have any good reason to check in that patch right now.
comment:12 Changed 17 months ago by
Here is a quick run with just your patch; traffic levels are slightly different, but it also shows the significant reduction in context switches for reader/writer compared to the pumper, and the large savings in run-queue wait time.
238
BuildExec   4   1   5  29  2  5359 386  7025 506
BuildHand   3   5   8 104  3 12360 353 31160 890
I2CP_Read  58   8  66 151 15  2301 228  2618 259
I2CP_Writ  52   0  52   6  4   123  67   123  67
JobQueue_  51   3  54  77  7  1443 127  1529 135
NTCP_Pump 833  66 900 218 34   242  38   261  41
NTCP_read 453  42 495 359 36   726  73   793  80
NTCPSendF 711   0 712  48 28    67  40    67  40
NTCP_writ 756   1 756 122 35   161  47   161  47
Tunnel_GW 322   6 328 173 25   526  76   536  78
UDP_ACK_s  18   1  19  58  2  3035  80  3222  85
UDP_Estab   8   1   9  28  1  3124 125  3706 149
UDP_messa 292   4 296 145 17   492  58   498  58
UDP_packe 138  46 184 356 15  1930  79  2578 106
UDP_Packe 301  27 327 117 17   358  51   390  55
UDPReceiv 344   1 345  43 37   123 107   124 107
UDPSender 355  12 366  58 17   159  46   165  47
comment:13 Changed 17 months ago by
Thank you. Patch from comment 4 in e1163891c6249ea2bdd626fec783c1f88ad462fa 0.9.42-11-rc
Leaving ticket open for evaluation of the other changes in comment 2.