Opened 2 weeks ago
Last modified 7 days ago
#2589 new enhancement
Fine-tune congestion avoidance growth factor
Reported by: | Zlatin Balevsky | Owned by: | zzz |
---|---|---|---|
Priority: | minor | Milestone: | undecided |
Component: | streaming | Version: | 0.9.41 |
Keywords: | testnet | Cc: | |
Parent Tickets: | Sensitive: | no |
Description (last modified by )
I changed the congestion avoidance growth rate factor to be a double instead of an integer and tested the following scenarios:
Factor: 0.5, 1.0, 1.5
Loss probability: 0.001%, 0.5%, 1% per node (total loss rate is 1 - (1-P)^12)
Delay: 75 ms per node == 150 ms RTT
The results are in the attached spreadsheet. As expected, throughput decreases significantly as the loss probability increases. Surprisingly, increasing the factor does not help throughput at higher loss rates; in fact, the opposite is true.
I don't have a dimension for the number of retransmissions. If needed, I will dig it out from the logs.
Subtickets
Attachments (2)
Change History (8)
Changed 2 weeks ago by
Attachment: | Congestion_Avoidance_analysis.ods added |
---|
Changed 2 weeks ago by
Attachment: | Congestion_Avoidance_analysis2.ods added |
---|
comment:1 Changed 2 weeks ago by
See Congestion_Avoidance_analysis2.ods for counts of re-transmitted packets versus the other two variables. I only took 10 samples in each configuration. My conclusion is that there is no statistical difference between them and that the congestion avoidance factor is not related to the number of retransmissions.
comment:2 Changed 2 weeks ago by
Description: | modified (diff) |
---|
comment:3 Changed 8 days ago by
Component: | router/update → streaming |
---|
I assume we're talking about streaming here.
Interesting. Because we window based on messages (multiples of 1730 bytes), not based on bytes outstanding, we have to simulate AIMD based on probability:
int shouldIncrement = _context.random().nextInt(con.getOptions().getCongestionAvoidanceGrowthRateFactor()*newWindowSize);
This causes the rampup to be somewhat unpredictable. Perhaps a growth rate higher than 1 would make things better.
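A minimal sketch of the probabilistic growth described above, with the factor as a double as proposed in this ticket. It is not the streaming library's code: the class and method names are hypothetical, and it assumes the window grows by one message whenever the random draw comes up zero, i.e. with probability 1 / (factor * window) per acked message.

```java
import java.util.Random;

/** Hypothetical sketch of probabilistic window growth; not the I2P streaming code. */
public class GrowthSketch {

    /** Chance that a single acked message grows the window by one message. */
    public static double incrementProbability(double factor, int windowSize) {
        if (factor <= 0 || windowSize <= 0) {
            throw new IllegalArgumentException("factor and windowSize must be positive");
        }
        // Assumed rule: grow with probability 1 / (factor * window), capped at 1.
        return Math.min(1.0, 1.0 / (factor * windowSize));
    }

    /** Feeds loss-free acks through the rule and returns the final window size. */
    public static int simulate(double factor, int startWindow, int acks, long seed) {
        Random random = new Random(seed);
        int window = startWindow;
        for (int i = 0; i < acks; i++) {
            if (random.nextDouble() < incrementProbability(factor, window)) {
                window++;
            }
        }
        return window;
    }

    public static void main(String[] args) {
        // Start at 24, the slow-start cap mentioned in this thread.
        for (double factor : new double[] {0.5, 1.0, 1.5}) {
            int window = simulate(factor, 24, 2000, 42L);
            System.out.printf("factor %.1f -> window %d after 2000 acks%n", factor, window);
        }
    }
}
```

Under this assumed rule a smaller factor raises the per-ack growth probability, so the stairstep climbs faster; the sketch ignores loss and window decreases entirely.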
Your results are in line with my expectations: growth rate has little to do with drop probability, at least in the steady state. It would mainly affect the shape of the window-size-vs-time graph, that is, whether it is a fast or slow stairstep.
If we didn't have 'slow start' (which really means fast start), or TCBShare (remembering the last window) the growth factor would affect the rampup at the beginning of a connection. Since we do have both, I don't think the growth factor knob will have much effect.
We do limit the 'slow start' (i.e. fast start) phase to a window of 24, so in a test network where the window wants to go way higher than that, the growth factor will slow the bandwidth rampup and you probably would see a difference with a higher growth rate. See CPH line 441 for details.
TL;DR growth rate mainly affects bandwidth rampup at the start of a connection, and not much even there. Should have very little impact on packet loss.
comment:4 Changed 8 days ago by
I'm proposing we decrease the factor (which increases the probability of the window growing) as that improves throughput even at very high loss rate. (see first spreadsheet)
I don't have numbers for the live network and it's very hard to get those, but I believe the 0.5% loss rate is higher and the 1% rate much higher than what's really out there. A 0.5% loss rate at each hop translates to about 5.8% loss across 12 hops, and 1% to about 11.4%.
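The per-hop to end-to-end conversion can be checked with a short sketch of the 1 - (1-P)^12 formula from the description. The class and method names are hypothetical, and it assumes losses at each hop are independent.

```java
/** Hypothetical helper for the compound loss formula 1 - (1 - P)^hops. */
public class PathLoss {

    /** End-to-end loss rate when each of `hops` hops independently drops with probability perHopLoss. */
    public static double endToEndLoss(double perHopLoss, int hops) {
        if (perHopLoss < 0.0 || perHopLoss > 1.0 || hops < 0) {
            throw new IllegalArgumentException("perHopLoss must be in [0, 1] and hops non-negative");
        }
        return 1.0 - Math.pow(1.0 - perHopLoss, hops);
    }

    public static void main(String[] args) {
        // Per-node loss probabilities tested in this ticket: 0.001%, 0.5%, 1%.
        double[] perHop = {0.00001, 0.005, 0.01};
        for (double p : perHop) {
            System.out.printf("per hop %.3f%% -> 12 hops %.2f%%%n",
                    p * 100.0, endToEndLoss(p, 12) * 100.0);
        }
    }
}
```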
Regarding slow start and TCBShare: for the second spreadsheet I had to restart the nodes after each sample was taken so that I could count the retransmitted packets in the logs. In the first spreadsheet, however, I did not need to restart between samples, so TCBShare was in effect.
comment:5 Changed 7 days ago by
Re: growth factor, I was confused; as you said, a lower factor is faster. Other than that I stand by my comments: I don't think it will influence loss or bandwidth much, except at the beginning.
Re: loss probability, I'm very confused still. Is this a dependent or independent variable? That is, are you observing loss, or artificially adding it with tc? I'm assuming the former but maybe I'm wrong?
comment:6 Changed 7 days ago by
The loss probability (x-axis in the graph) is artificially added with tc. The y-axis in the first spreadsheet is the throughput in KB; in the second spreadsheet it is the number of retransmitted packets during the transfer of a 1MB file. (Sorry, I didn't label the axes correctly in the second spreadsheet.)
[Figure: retransmission vs loss rate vs factor]