Opened 9 years ago

Last modified 18 months ago

#698 open defect

Message delay calculation

Reported by: zzz
Owned by: zzz
Priority: minor
Milestone:
Component: router/transport
Version: 0.9.1
Keywords: performance time
Cc: Zlatin Balevsky
Parent Tickets:
Sensitive: no


An important part of tunnel accept throttling is the current outbound message delay. Unfortunately that delay is calculated differently for NTCP and SSU. For NTCP it is near zero as it doesn't include the round-trip ack. For SSU it does include the round trip.

Therefore, a router that has mostly SSU connections, or has NTCP disabled, has a relatively high average message delay and accepts way too few tunnel requests.
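To illustrate the effect, here is a minimal sketch of how the blended average moves with the transport mix. It is not router code: the class and method names are hypothetical, and the delay values (5 ms for NTCP, 800 ms for SSU) are assumptions picked only to show the shape of the problem.

```java
import java.util.Locale;

// Hypothetical illustration only; none of these names exist in the router.
class BlendedDelay {
    /**
     * Weighted mean of the two per-transport delays.
     * ssuShare is the fraction of messages sent over SSU (0.0 to 1.0).
     */
    static double blendedDelayMs(double ssuShare, double ntcpDelayMs, double ssuDelayMs) {
        return ssuShare * ssuDelayMs + (1.0 - ssuShare) * ntcpDelayMs;
    }

    public static void main(String[] args) {
        double ntcpDelayMs = 5;   // assumed: near zero, no round-trip ack included
        double ssuDelayMs = 800;  // assumed: includes the round trip
        for (double share : new double[] {0.1, 0.5, 1.0}) {
            System.out.printf(Locale.ROOT, "SSU share %.0f%% -> average delay %.1f ms%n",
                    share * 100, blendedDelayMs(share, ntcpDelayMs, ssuDelayMs));
        }
    }
}
```

With the same per-message behavior, the reported average depends almost entirely on the share of SSU traffic, so a single fixed threshold cannot be right for every router.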

Fix the throttling threshold, or fix the delay calculation, or both.


Change History (14)

comment:1 Changed 9 years ago by Zlatin Balevsky

Cc: Zlatin Balevsky added

comment:2 Changed 9 years ago by zzz

See http://zzz.i2p/topics/1255 for discussion of changes in 0.9.3 that have affected this stat (but not fixed the issue in the OP).

comment:3 Changed 9 years ago by zzz


In addition to the NTCP vs SSU problem, on the SSU side, the stat is highly influenced by messages that are retransmitted before being acked. As the stat is a straight average, it's the outliers that drive the stat.

Since the max RTO was quintupled in 0.9.3 (from 3 to 15 seconds), the outliers now have much more impact on the stat. That's what I was getting at in the zzz.i2p thread.

Median is of course not easy to do, so let's not… but maybe we only contribute to the stat average for messages acked without a retransmission?
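A rough sketch of that idea, to show how much a single retransmitted message can move a straight average. The AckSample and DelayStat classes are hypothetical and the sample values are made up; this is not how the router's stat code is structured.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sample type: one acked message and how often it was retransmitted.
class AckSample {
    final long delayMs;
    final int retransmissions;

    AckSample(long delayMs, int retransmissions) {
        this.delayMs = delayMs;
        this.retransmissions = retransmissions;
    }
}

// Hypothetical helper comparing the two averaging policies.
class DelayStat {
    /** Straight average over every acked message (the behavior described above). */
    static double meanAll(List<AckSample> samples) {
        if (samples.isEmpty()) return 0;
        long sum = 0;
        for (AckSample s : samples) sum += s.delayMs;
        return (double) sum / samples.size();
    }

    /** Average over messages that were acked without any retransmission. */
    static double meanFirstTryOnly(List<AckSample> samples) {
        long sum = 0;
        int count = 0;
        for (AckSample s : samples) {
            if (s.retransmissions == 0) {
                sum += s.delayMs;
                count++;
            }
        }
        return count == 0 ? 0 : (double) sum / count;
    }

    public static void main(String[] args) {
        // Made-up data: four quick acks and one message that hit a 15 second RTO.
        List<AckSample> samples = Arrays.asList(
                new AckSample(100, 0), new AckSample(120, 0),
                new AckSample(110, 0), new AckSample(90, 0),
                new AckSample(15000, 3));
        System.out.println("all messages:   " + meanAll(samples) + " ms");
        System.out.println("first try only: " + meanFirstTryOnly(samples) + " ms");
    }
}
```

One caveat with this approach: dropping retransmitted messages also hides real congestion, so the filtered stat may understate delay on a lossy link.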

Or, perhaps, we completely rework the stat to include only local queueing, and not the full time-to-ack.

The stat right now is a complete disaster, which wouldn't matter if we weren't throttling build requests based on it.

comment:4 Changed 8 years ago by str4d

Milestone: 0.9.4

comment:5 Changed 7 years ago by str4d

Keywords: performance time added

comment:6 Changed 6 years ago by str4d

Status: new → open

comment:7 Changed 5 years ago by Mysterious

I had a look into this metric, and how we can make it more useful.

Please check this blog post for an example patch: http://mysterious.i2p/blog/poking-into-the-i2p-router.html

comment:9 Changed 5 years ago by Mysterious

Does it seem sensible to make the stat UDP transport only? (see above patches)

comment:10 Changed 5 years ago by zzz

A few reasons why it may make sense to leave NTCP in there:

  • In severe congestion situations ("backlog"), the NTCP number will be greater than zero
  • We may wish to someday add a real end-to-end RTT measurement in there for NTCP, either at the I2NP-layer, or in NTCP2 http://zzz.i2p/topics/1577 - but it's clearly not a high-impact patch, easy to restore if we have some meaningful NTCP measurement available
  • To remove NTCP now, with all the zeros, may actually disrupt the thresholds we have in there now. But, after all, that's the point of this ticket.

Having said all that, not opposed philosophically to the patch, if we think through the issues. Have any test results?

comment:11 Changed 5 years ago by Mysterious

I've had time to gather some statistics. They are limited to my personal situation.

Most of the time the stat is between 500 ms and 1000 ms. I've never seen it below 400 ms. There are outliers up to 1500 ms, and in some rare cases up to 2000 ms. The outliers seem to improve when there is more connection capacity, but that is not the typical scenario for a node. So I'd say 500 ms to 1000 ms, with outliers up to 2000 ms. I don't have data for low-bandwidth or poor-quality connections. I could gather lower-bandwidth metrics if that would be valuable, but my network path is fairly solid (ping to a nearby server is typically ~5 ms), so it won't resemble an intermittent or high-latency connection.

comment:12 Changed 4 years ago by Mysterious

Personally I see the UDP-only metrics as more meaningful, but strictly speaking we should probably bump up the 2250 ms limit if we go this route. I've gotten close to it on a fairly decent connection, as I mentioned in my previous comment.

comment:13 Changed 18 months ago by Zlatin Balevsky

Sensitive: unset

Quoting comment:3: "Or, perhaps, we completely rework the stat to include only local queueing, and not the full time-to-ack."

I vote for this option. Isn't it as simple as:

--- a/router/java/src/net/i2p/router/transport/
+++ b/router/java/src/net/i2p/router/transport/
@@ -285,7 +285,7 @@ public abstract class TransportImpl implements Transport {
         //if (true)
         //    _log.error("(not error) I2NP message sent? " + sendSuccessful + " " + msg.getMessageId() + " after " + msToSend + "/" + msg.getTransmissionTime());
-        long lifetime = msg.getLifetime();
+        long lifetime = msg.getLifetime() - msg.getSendBegin();
         if (lifetime > 3000) {
             int level = Log.DEBUG;
             //if (!sendSuccessful)

comment:14 Changed 18 months ago by Zlatin Balevsky

I'm getting some weird readings on the sidebar, like "Message Delay: -31 years", with the change from comment:13, so consider it just an illustration of an idea.
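One plausible explanation for the negative reading, assuming getLifetime() returns an elapsed duration in milliseconds while getSendBegin() returns an absolute clock timestamp: subtracting a timestamp from a duration produces a large negative number. The sketch below uses a hypothetical MessageTiming class, not the router's message class, to show the mismatch and the two intervals that could be measured instead.

```java
// Hypothetical stand-in for a message's timing fields; not the router's class.
class MessageTiming {
    final long createdMs;    // absolute timestamp when the message was queued
    final long sendBeginMs;  // absolute timestamp when the transport started sending

    MessageTiming(long createdMs, long sendBeginMs) {
        this.createdMs = createdMs;
        this.sendBeginMs = sendBeginMs;
    }

    /** Elapsed time since creation: a duration, not a timestamp. */
    long lifetimeMs(long nowMs) {
        return nowMs - createdMs;
    }

    /** Mixes a duration with an absolute timestamp, as the diff above appears to do. */
    long mismatchedDelayMs(long nowMs) {
        return lifetimeMs(nowMs) - sendBeginMs;
    }

    /** Local queueing only: time spent waiting before the send began. */
    long queueDelayMs() {
        return sendBeginMs - createdMs;
    }

    /** Time from the start of the send until now (for example, until the ack). */
    long sendTimeMs(long nowMs) {
        return nowMs - sendBeginMs;
    }

    public static void main(String[] args) {
        MessageTiming m = new MessageTiming(1_000_000L, 1_000_050L);
        long now = 1_000_900L;
        System.out.println("lifetime:   " + m.lifetimeMs(now) + " ms");
        System.out.println("mismatched: " + m.mismatchedDelayMs(now) + " ms");
        System.out.println("queueing:   " + m.queueDelayMs() + " ms");
        System.out.println("send time:  " + m.sendTimeMs(now) + " ms");
    }
}
```

If the goal is the local-queueing-only stat from comment:3, the interval of interest would be the time between creation and the start of the send, with both values taken from the same clock.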
