Opened 3 years ago

Last modified 4 months ago

#1893 open defect

i2psnark: data loss on restarts

Reported by: zzz Owned by: zzz
Priority: minor Milestone: 0.9.30
Component: apps/i2psnark Version: 0.9.28
Keywords: Cc:
Parent Tickets: Sensitive: no

Subtickets

#1903: Way to reproduce issues described in 1893closedzzz
#2508: I2PSnark: Torrent metadata is lost on restartclosedzzz

Change History (21)

comment:1 Changed 3 years ago by jogger

Add a subticket #1903.

comment:2 Changed 3 years ago by zzz

Status: newinfoneeded_new

#1903 describes a way to reproduce, where the per-torrent config files under i2psnark.config.d are deleted. Those config files contain state information including completion and whether to remap filenames to "safe" characters. When the files are deleted, we forget that information. Torrents will be rescanned but we dont guess what the file mapping will be in the absence of a config file.

I don't understand why the config files would be deleted or corrupted on shutdown, I'll try to come up with a theory.

Please provide the info requested over in the forum thread: quote:

Please provide the version information from the top of /logs in the console, and the number of torrents you have running.

comment:3 Changed 3 years ago by jogger

Status: infoneeded_newnew

Found better way to reproduce:

Installed new i2p machine, 0% participating traffic, pointed Snark at data dir with 72 torrents already successfully created and seeded on a different machine.

On first start all torrents were scanned. Files with "unsafe" characters were recreated and redownloaded as described in the forum. Torrents without "unsafe" characters were checked 100% complete.

First restart after some hours: All torrents rescanned - obviously all config files corrupted.

Second restart after some more hours: 19 torrents directly loaded, rest rescanned - obviously most config files corrupted.

Third restart after some more hours: 7 torrents directly loaded, rest rescanned - obviously most config files corrupted.

This has been seen on all my i2p machines on wheezy, jessie, stretch and macOS with 4 different HW platforms, all with journaled file systems. Others also have this problem as there are often leechers appearing being 80 - 95% complete on torrents that have been inactive for weeks. This must be previous seeders exposed to the same problem.

Conclusion 1: The "safe" file name mapping causes leechers to create file names different from the original torrent. If the config file of the original torrent gets corrupted, it can only properly be recreated by recreating the .torrent file. Recreating by rescan loses the original names. Possible solution: "Unsafe" file names get renamed before torrent creation.

Conclusion 2: The config file corruption problem caused by crash or by restart without participating tunnels really exists on a variety of platforms and i2p versions. Possible solution: If a torrent is rescanned, check file content against the torrent file, do not trust the config file. Introduce a way of integrity checking for the config files themselves.

I2P version: 0.9.28-0
Java version: Oracle Corporation 1.8.0_25 (Java™ SE Runtime Environment 1.8.0_25-b17)
Wrapper version: 3.5.30
Server version: 8.1.21.v20160908
Servlet version: Jasper JSP 2.1 Engine
JSTL version: standard-taglib 1.2.0
Platform: Mac OS X x86_64 10.12.2
Jcpuid version: 0
Processor: uninitialized (unrecognized)
Jbigi: Native BigInteger? library jbigi not loaded - using pure Java - poor performance may result - see http://i2p-projekt.i2p/jbigi for help
Jbigi version: 0
GMP version: unknown

comment:4 Changed 3 years ago by zzz

Milestone: undecided0.9.29
Status: newaccepted

OK. Thanks for the detailed information.

I think I was confused by your use of the word "corrupted". If a torrent gets rescanned in this scenario, it's probably not that the config file was corrupted or deleted - it probably just didn't get updated (i.e. rewritten) on shutdown. Since the torrent data will then be more recent than the config file, the torrent will be rescanned at the next startup.

In addition, the config file contains the setting on mapping of unsafe characters. Without that file, we make some assumption. We set mapping if we downloaded the torrent; we don't map if we created the torrent. If the torrent just appears in the directory, that's where we don't know what to do. The code may be able to handle this better, not sure. In any case, it's unrelated to the shutdown issue, and I may make a separate ticket for it.

I'll be looking at ways to speed up and parallelize the shutdown, and wait longer for it to complete.

comment:5 Changed 3 years ago by zzz

Status: acceptedtesting

In 4f859c6acce391a4c0b3bee3bbe96cded077c60f to be 0.9.28-5:

  • Parallelize the context shutdown tasks, so the jetty shutdown starts sooner
  • Increase max time for shutdown tasks from 10 sec to 60 sec (120 on ARM)
  • Change 20 ms delay between unannounces to every 8th unannounce

Even with the previous 20 ms delay in the loop, we should have been able to get almost 500 torrents to shut down cleanly (maybe 450… I think we have Jetty configured to wait a second), so I still don't fully understand what's going on. With this change we should be able to do a clean shutdown with several thousand torrents.

The only thing that's threaded is the unannounces themselves, so even if some of those aren't completed before shutdown, it shouldn't affect the config file.

The torrents are all closing quickly for me, perhaps though it takes longer if a lot of peers are connected.

Opening ticket #1938 for the unsafe character mapping issue.

Logs after change:

xx:50:21.519 WARN  [le Jetty-122] net.i2p.router.Router         : Router state change from RUNNING to GRACEFUL_SHUTDOWN
xx:50:21.519 CRIT  [ShutdownHook] net.i2p.router.Router         : Graceful shutdown progress: No more tunnels, starting final shutdown
xx:50:23.519 WARN  [ShutdownHook] net.i2p.router.Router         : Router state change from GRACEFUL_SHUTDOWN to FINAL_SHUTDOWN_1
xx:50:23.519 WARN  [ShutdownHook] net.i2p.router.Router         : Router state change from FINAL_SHUTDOWN_1 to FINAL_SHUTDOWN_2
xx:50:23.519 CRIT  [ShutdownHook] net.i2p.router.Router         : Starting final shutdown(2)
xx:50:23.521 WARN  [ShutdownHook] net.i2p.router.Router         : Running shutdown task class net.i2p.data.SDSCache$Shutdown
xx:50:23.521 WARN  [ShutdownHook] net.i2p.router.Router         : Running shutdown task class net.i2p.data.SDSCache$Shutdown
xx:50:23.521 WARN  [ShutdownHook] net.i2p.router.Router         : Running shutdown task class net.i2p.data.SDSCache$Shutdown
xx:50:23.522 WARN  [ShutdownHook] net.i2p.router.Router         : Running shutdown task class net.i2p.router.util.DecayingBloomFilter$Shutdown
xx:50:23.522 WARN  [ShutdownHook] net.i2p.router.Router         : Running shutdown task class net.i2p.util.SimpleTimer2$Shutdown
xx:50:23.522 WARN  [ShutdownHook] net.i2p.router.Router         : Running shutdown task class net.i2p.router.time.RouterTimestamper$Shutdown
xx:50:23.522 WARN  [ShutdownHook] net.i2p.router.Router         : Running shutdown task class net.i2p.router.util.DecayingBloomFilter$Shutdown
xx:50:23.524 WARN  [ShutdownHook] net.i2p.router.Router         : Running shutdown task class net.i2p.router.util.DecayingBloomFilter$Shutdown
xx:50:23.525 WARN  [ShutdownHook] net.i2p.router.Router         : Running shutdown task class net.i2p.router.web.ConfigServiceHandler$UpdateWrapperOrRekeyTask
xx:50:23.525 WARN  [ShutdownHook] net.i2p.router.Router         : Running shutdown task class net.i2p.client.naming.BlockfileNamingService$Shutdown
xx:50:23.525 WARN  [ShutdownHook] net.i2p.router.Router         : Running shutdown task class net.i2p.router.util.DecayingBloomFilter$Shutdown
xx:50:23.525 WARN  [ShutdownHook] net.i2p.router.Router         : Running shutdown task class net.i2p.router.web.StatSummarizer$Shutdown
xx:50:23.533 WARN  [ShutdownHook] net.i2p.router.Router         : Running shutdown task class net.i2p.router.startup.RouterAppManager$Shutdown
xx:50:23.533 WARN  [ShutdownHook] net.i2p.router.Router         : Running shutdown task class net.i2p.router.util.DecayingBloomFilter$Shutdown
xx:50:23.533 WARN  [ShutdownHook] net.i2p.router.Router         : Running shutdown task class net.i2p.router.util.DecayingBloomFilter$Shutdown
xx:50:23.534 WARN  [ShutdownHook] net.i2p.router.Router         : Running shutdown task class net.i2p.util.SimpleTimer2$Shutdown
xx:50:23.535 WARN  [ger$Shutdown] outer.startup.RouterAppManager: Shutting down client Router Console
xx:50:23.535 INFO  [ger$Shutdown] outer.startup.RouterAppManager: Client Router Console is now STOPPING 
xx:50:23.536 INFO  [ShutdownHook] net.i2p.router.Router         : Shutdown task complete: Shutdown task net.i2p.data.SDSCache$Shutdown
xx:50:23.536 INFO  [ShutdownHook] net.i2p.router.Router         : Shutdown task complete: Shutdown task net.i2p.data.SDSCache$Shutdown
xx:50:23.536 INFO  [ShutdownHook] net.i2p.router.Router         : Shutdown task complete: Shutdown task net.i2p.data.SDSCache$Shutdown
xx:50:23.536 INFO  [ShutdownHook] net.i2p.router.Router         : Shutdown task complete: Shutdown task net.i2p.router.util.DecayingBloomFilter$Shutdown
xx:50:23.536 INFO  [ShutdownHook] net.i2p.router.Router         : Shutdown task complete: Shutdown task net.i2p.util.SimpleTimer2$Shutdown
xx:50:23.536 INFO  [ShutdownHook] net.i2p.router.Router         : Shutdown task complete: Shutdown task net.i2p.router.time.RouterTimestamper$Shutdown
xx:50:23.536 INFO  [ShutdownHook] net.i2p.router.Router         : Shutdown task complete: Shutdown task net.i2p.router.util.DecayingBloomFilter$Shutdown
xx:50:23.536 INFO  [ShutdownHook] net.i2p.router.Router         : Shutdown task complete: Shutdown task net.i2p.router.util.DecayingBloomFilter$Shutdown
xx:50:23.536 INFO  [ShutdownHook] net.i2p.router.Router         : Shutdown task complete: Shutdown task net.i2p.router.web.ConfigServiceHandler$UpdateWrapperOrRekeyTask
xx:50:23.536 INFO  [ShutdownHook] net.i2p.router.Router         : Shutdown task complete: Shutdown task net.i2p.client.naming.BlockfileNamingService$Shutdown
xx:50:23.536 INFO  [ShutdownHook] net.i2p.router.Router         : Shutdown task complete: Shutdown task net.i2p.router.util.DecayingBloomFilter$Shutdown
xx:50:23.570 INFO  [ShutdownHook] net.i2p.router.Router         : Shutdown task complete: Shutdown task net.i2p.router.web.StatSummarizer$Shutdown
xx:50:24.563 WARN  [ger$Shutdown] org.klomp.snark.SnarkManager  : Snark stop() begin
java.lang.Exception: I did it
	at org.klomp.snark.SnarkManager.stop(SnarkManager.java:307)
	at org.klomp.snark.web.I2PSnarkServlet.destroy(I2PSnarkServlet.java:102)
	at org.eclipse.jetty.servlet.ServletHolder.destroyInstance(ServletHolder.java:395)
	at org.eclipse.jetty.servlet.ServletHolder.doStop(ServletHolder.java:368)
	at org.eclipse.jetty.util.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:89)
	at org.eclipse.jetty.servlet.ServletHandler.doStop(ServletHandler.java:240)
	at org.eclipse.jetty.util.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:89)
	at org.eclipse.jetty.server.handler.HandlerWrapper.doStop(HandlerWrapper.java:107)
	at org.eclipse.jetty.security.SecurityHandler.doStop(SecurityHandler.java:392)
	at org.eclipse.jetty.security.ConstraintSecurityHandler.doStop(ConstraintSecurityHandler.java:464)
	at org.eclipse.jetty.util.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:89)
	at org.eclipse.jetty.server.handler.HandlerWrapper.doStop(HandlerWrapper.java:107)
	at org.eclipse.jetty.server.session.SessionHandler.doStop(SessionHandler.java:136)
	at org.eclipse.jetty.util.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:89)
	at org.eclipse.jetty.server.handler.HandlerWrapper.doStop(HandlerWrapper.java:107)
	at org.eclipse.jetty.server.handler.ContextHandler.doStop(ContextHandler.java:815)
	at org.eclipse.jetty.servlet.ServletContextHandler.doStop(ServletContextHandler.java:160)
	at org.eclipse.jetty.webapp.WebAppContext.doStop(WebAppContext.java:518)
	at org.eclipse.jetty.util.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:89)
	at org.eclipse.jetty.server.handler.HandlerCollection.doStop(HandlerCollection.java:250)
	at org.eclipse.jetty.util.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:89)
	at org.eclipse.jetty.server.handler.HandlerCollection.doStop(HandlerCollection.java:250)
	at org.eclipse.jetty.util.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:89)
	at org.eclipse.jetty.server.handler.HandlerWrapper.doStop(HandlerWrapper.java:107)
	at org.eclipse.jetty.server.Server.doStop(Server.java:343)
	at org.eclipse.jetty.util.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:89)
	at net.i2p.router.web.RouterConsoleRunner.shutdown(RouterConsoleRunner.java:233)
	at net.i2p.router.startup.RouterAppManager.shutdown(RouterAppManager.java:190)
	at net.i2p.router.startup.RouterAppManager$Shutdown.run(RouterAppManager.java:201)
	at java.lang.Thread.run(Thread.java:745)
	at net.i2p.util.I2PThread.run(I2PThread.java:103)
xx:50:24.566 WARN  [ger$Shutdown] org.klomp.snark.SnarkManager  : SnarkManager final shutdown
xx:50:30.485 WARN  [ger$Shutdown] org.klomp.snark.SnarkManager  : Snark stop() end
xx:50:30.544 INFO  [ger$Shutdown] outer.startup.RouterAppManager: Client Router Console is now STOPPED 
xx:50:30.544 WARN  [ger$Shutdown] outer.startup.RouterAppManager: Shutting down client i2ptunnel
xx:50:30.544 INFO  [ger$Shutdown] outer.startup.RouterAppManager: Client i2ptunnel is now STOPPING 
xx:50:30.545 INFO  [ShutdownHook] net.i2p.router.Router         : Shutdown task complete: Shutdown task net.i2p.router.startup.RouterAppManager$Shutdown
xx:50:30.545 INFO  [ShutdownHook] net.i2p.router.Router         : Shutdown task complete: Shutdown task net.i2p.router.util.DecayingBloomFilter$Shutdown
xx:50:30.545 INFO  [ShutdownHook] net.i2p.router.Router         : Shutdown task complete: Shutdown task net.i2p.router.util.DecayingBloomFilter$Shutdown
xx:50:30.545 INFO  [ShutdownHook] net.i2p.router.Router         : Shutdown task complete: Shutdown task net.i2p.util.SimpleTimer2$Shutdown
xx:50:35.196 WARN  [ShutdownHook] net.i2p.router.Router         : Router state change from FINAL_SHUTDOWN_2 to FINAL_SHUTDOWN_3
xx:50:35.197 CRIT  [ShutdownHook] net.i2p.router.Router         : Shutdown(2) complete

comment:6 Changed 3 years ago by jogger

Thanks for your effort. I will test.

However you are clearly describing that a clean shutdown is necessary to avoid torrent rescans at startup. That corresponds to my reports that this bug produces the same symptoms as a machine crash. I propose a real fix for the issue that will survive a "kill -9" by

  • either writing out the config files frequently with every change
  • or separating the config files into two (metadata and status) with one that is never touched once the torrent is complete and thus allows a clean reload after a restart.

comment:7 Changed 3 years ago by zzz

Thanks for your comments. Unfortunately, if you think about it, it's impossible to guarantee a consistent, clean state after a kill -9. If you stop code at an arbitrary time, you're going to get arbitrary results. The best we can do is recognize that state on restart and do a rescan. With certain changes such as rewriting the config file after every change, we might make it better… or could we make it worse, by increasing the chance that the config file partially written? I'll think about possible further improvements but no-rescan-ever isn't possible.

comment:8 Changed 3 years ago by jogger

The fix described above does not work in all cases.

I did a graceful restart upgrading from 0.9.29-3 to 4 and all config files were dumped out successfully at that time.

Subsequently, config files were written upon creation, deletion and completion of torrents.

I then did a graceful restart in order to upgrade from from 0.9.29-4 to 5 (seeding 99 torrents at that time). Not a single config file was written out. I lost the progress data at least.

Again no config files written upon a graceful restart updating to 0.9.29-6.

Last edited 3 years ago by jogger (previous) (diff)

comment:9 Changed 3 years ago by zzz

Status: testingneeds_work

reopening based on comment 8

comment:10 Changed 3 years ago by zzz

Milestone: 0.9.290.9.30
Status: needs_worktesting

Webapp shutdown was broken by Jetty 9 (0.9.29-4) as reported in comment 8. Good catch!

After that was fixed, all config files for running torrents were being written at shutdown, whether changed or not. This was broken in late 2015, possibly as a temporary thing. Fixed to only write the necessary config files, which will speed up the shutdown. Only write when the storage (i.e. the bitfield) changed, or we uploaded something, so the upload byte count changed.

Also, not directly related to OP's issue, since he's on a Mac, but writing the config files is very slow on Raspberry Pi, made a change in DataHelper? to not sync the file system, which speeds it up by about 2x. I did tests on my macbook, and syncs are very fast, shouldn't be an issue.

All in 0.9.29-14 b2f40838a26c79ba642a60232f1505588783cc12

thanks for testing

comment:11 Changed 6 months ago by zzz

Resolution: fixed
Sensitive: unset
Status: testingclosed

Webapp shutdown was broken again by me in 0.9.38, fixed for 0.9.41, see #2508. Closing this ticket.

comment:12 Changed 6 months ago by Reportage

Resolution: fixed
Status: closedreopened

Still seeing the same metadata loss after the latest fixes have been applied when restarting the router while torrents are running.

comment:13 Changed 6 months ago by Reportage

Add a subticket #2508 (I2PSnark: Torrent metadata is lost on restart).

comment:14 Changed 6 months ago by Reportage

Resolution: fixed
Status: reopenedclosed

comment:15 Changed 5 months ago by Reportage

Resolution: fixed
Status: closedreopened

comment:16 Changed 4 months ago by zzz

Status: reopenedinfoneeded

Cannot reproduce here, please provide reason for reopening and steps to reproduce with 0.9.41-0 or higher.

comment:17 Changed 4 months ago by Reportage

Status: infoneededopen

Doesn't happen upon every restart, but over time the metadata for individual torrents gets lost.

To reproduce, restart the router with torrents running repeatedly, and sooner or later the meta data will disappear.

comment:18 Changed 4 months ago by zzz

"gets lost" as in the config file disappears completely, or is overwritten with a fresh config, or is only partially stored, or some particular field is gone, or?

comment:19 Changed 4 months ago by jogger

Tried restarts and crashes on a Mac with inactive, leeched torrents only. Found no problems. All metadata such as uploaded volume intact.

Will never again dare to try Snark with torrents freshly created from files with "unsafe" characters though.

comment:20 in reply to:  18 Changed 4 months ago by Reportage

Replying to zzz:

"gets lost" as in the config file disappears completely, or is overwritten with a fresh config, or is only partially stored, or some particular field is gone, or?

I haven't been monitoring the config files, but I'm assuming the config is overwritten with a new file. When metadata disappears, it doesn't disappear for all torrents at the same time.. sometimes the metadata will disappear for a couple of torrents, leaving the metadata intact for a several others. When the metadata does disappear, everything disappears including any comments associated with the torrent.

comment:21 Changed 4 months ago by jogger

OK, reappeared on a Mac after a simple restart. Torrents rescanned, metadata such as upload volume lost.

I had recently as a test changed the snark startup delay to 0 after 60 sec which had solved the problem for me in the past.

Noteworthy: This machine is IPv6 only, integrating very, very slowly into the network.

Snark log until checking the first torrent:

Error connecting to I2P - check your I2CP settings!
Connecting to I2P
Unable to connect to I2P!
Error connecting to I2P - check your I2CP settings!
Connecting to I2P
Unable to connect to I2P!
Error connecting to I2P - check your I2CP settings!
Connecting to I2P
Unable to connect to I2P!
Error connecting to I2P - check your I2CP settings!
Connecting to I2P
Unable to connect to I2P!
Error connecting to I2P - check your I2CP settings!
Connecting to I2P
Unable to connect to I2P!
Error connecting to I2P - check your I2CP settings!
Connecting to I2P
Unable to connect to I2P!
Error connecting to I2P - check your I2CP settings!
Connecting to I2P
Unable to connect to I2P!

router log:
2019/08/15 22:00:13 ERROR [… DirMonitor?] …etManagerFactory: Error creating session for socket manager

net.i2p.client.I2PSessionException: No manager yet
at net.i2p.router.client.ClientManagerFacadeImpl?.connect(ClientManagerFacadeImpl?.java:267)
at net.i2p.client.impl.I2PSessionImpl.connect(I2PSessionImpl.java:709)
at net.i2p.client.streaming.I2PSocketManagerFactory.createManager(I2PSocketManagerFactory.java:415)
at net.i2p.client.streaming.I2PSocketManagerFactory.createManager(I2PSocketManagerFactory.java:301)
at net.i2p.client.streaming.I2PSocketManagerFactory.createManager(I2PSocketManagerFactory.java:178)
at net.i2p.client.streaming.I2PSocketManagerFactory.createManager(I2PSocketManagerFactory.java:153)
at org.klomp.snark.I2PSnarkUtil.connect(I2PSnarkUtil.java:288)
at org.klomp.snark.SnarkManager?.monitorTorrents(SnarkManager?.java:2723)
at org.klomp.snark.SnarkManager?$DirMonitor?.run(SnarkManager?.java:2508)
at java.base/java.lang.Thread.run(Thread.java:835)
at net.i2p.util.I2PThread.run(I2PThread.java:103)

Note: See TracTickets for help on using tickets.