Opened 4 years ago

Closed 3 years ago

#1702 closed defect (fixed)

Tunnels become unusable

Reported by: Zlatin Balevsky
Owned by: Zlatin Balevsky
Priority: major
Milestone: 0.9.26
Component: api/utils
Version: 0.9.22
Keywords: irc
Cc:
Parent Tickets:
Sensitive: no

Description

After several hours to a day of uptime, my IRC2P tunnels become unusable. The stars are green on the console, but the tunnels cannot establish any connections. The router is firewalled. Sometimes the IRC client reports NoRouteToHost exceptions.

Subtickets

Attachments (2)

log-debug.txt.gz (107.2 KB) - added by Zlatin Balevsky 4 years ago.
DEBUG logs of a connection attempt
Q9SH.txt.gz (19.5 KB) - added by Zlatin Balevsky 4 years ago.
WARN log of a router from restart to connection attempt. Notice that destination Q9SH shows as negatively cached even though it has not failed crypto and has not been retried in more than two minutes.

Change History (6)

Changed 4 years ago by Zlatin Balevsky

Attachment: log-debug.txt.gz added

DEBUG logs of a connection attempt

Changed 4 years ago by Zlatin Balevsky

Attachment: Q9SH.txt.gz added

WARN log of a router from restart to connection attempt. Notice that destination Q9SH shows as negatively cached even though it has not failed crypto and has not been retried in more than two minutes.

comment:1 Changed 4 years ago by zzz

There's a small chance that the fix for #1650 and #1698 in 0.9.22-23 will help, but probably not. I know that zab is looking at netdb lookups (IterativeSearchJob) and negative caching, and has added some advanced config options for experimentation. Another possibility is the floodfill (ff) selection heuristics in FloodfillPeerSelector. It could also just be typical connection-limit issues. No good leads at this point.

Re: the Q9SH.txt.gz attachment: it's gzipped twice, so rename and gunzip again to look at it.

There are two sections in that log where it's negatively cached: 18:04:30 - 18:07:48 and 18:13:06 - 18:16:34. Both are longer than two minutes. At 18:03:00 - 18:03:27 you can see the three attempts that trigger the negative cache, but I don't see any others. I don't know whether the cache clear timer didn't run, or whether there were other unlogged lookup failures that incremented the cache counter.

To be investigated further.
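
To make the two hypotheses above concrete, here is a minimal sketch of a counter-based negative lookup cache with a periodic clear. It is not the router's actual code: the class name, method names, the threshold of three failures, and the two-minute clear interval are all assumptions inferred from the wording of this ticket.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Illustrative negative lookup cache. NOT the I2P router's implementation;
 * every name and constant here is hypothetical.
 */
class NegativeLookupCache {
    // Assumed threshold, matching the three attempts seen in the log.
    static final int MAX_FAILS = 3;
    // Assumed clear period, matching the "two minutes" mentioned in the ticket.
    static final long CLEAR_INTERVAL_MS = 2 * 60 * 1000L;

    private final Map<String, Integer> failures = new HashMap<>();
    private long lastClear;

    NegativeLookupCache(long now) {
        this.lastClear = now;
    }

    /** Record one failed lookup for the given destination. */
    synchronized void lookupFailed(String dest) {
        failures.merge(dest, 1, Integer::sum);
    }

    /** True once the destination has reached the failure threshold. */
    synchronized boolean isCached(String dest) {
        return failures.getOrDefault(dest, 0) >= MAX_FAILS;
    }

    /** Timer callback: forget all failures once the interval has elapsed. */
    synchronized void maybeClear(long now) {
        if (now - lastClear >= CLEAR_INTERVAL_MS) {
            failures.clear();
            lastClear = now;
        }
    }
}
```

In a design like this, a destination stays negatively cached for longer than the interval in exactly two cases: `maybeClear` is never invoked (the timer did not run), or `lookupFailed` is called again after a clear by a code path that does not log the failure.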

comment:2 Changed 4 years ago by Zlatin Balevsky

After upgrading to 0.9.22-25 and un-firewalling the router I cannot reproduce this. Uptime is almost 22 days and everything is operating normally.

comment:3 Changed 4 years ago by str4d

Status: new → open

comment:4 Changed 3 years ago by zzz

Component: unspecified → api/utils
Milestone: undecided → 0.9.26
Resolution: fixed
Status: open → closed

Probable duplicate of #1776.