Opened 2 years ago

Closed 16 months ago

#2453 closed defect (wontfix)

Fix classloader leaks

Reported by: jogger Owned by:
Priority: minor Milestone: n/a
Component: router/general Version: 0.9.38
Keywords: Cc:
Parent Tickets: Sensitive: no


Today I got a spontaneous restart together with the following misleading log:

2019/03/06 13:37:25 | CRIT [P reader 2/2] net.i2p.router.Router : Thread ran out of memory, shutting down I2P
2019/03/06 13:37:25 | java.lang.OutOfMemoryError: Metaspace
2019/03/06 13:37:54 | CRIT [P reader 2/2] net.i2p.router.Router : free mem: 138986168 total mem: 268435456
2019/03/06 13:37:54 | CRIT [P reader 2/2] net.i2p.router.Router : To prevent future shutdowns, increase in /home/e/i2p/wrapper.config

The metaspace OOM results from classloader leaks that I am unable to debug. I noticed at least two sources of leaks:

  • a sudden hike by 500 classes loaded during unattended operation
  • some classes added by repeated starting and stopping of webapps

I had seen those restarts often before, but this time I was monitoring with jconsole the whole time. The workaround is to set MaxMetaSpace to some insanely high value.
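For reference, that workaround can be expressed as a wrapper.config entry along these lines; the property index and the 512m value are illustrative only, pick an index not already used in your file and a cap comfortably above observed usage:

```
# Hypothetical wrapper.config entry: raise the metaspace cap.
# The index (5) must not collide with existing wrapper.java.additional.N lines.
wrapper.java.additional.5=-XX:MaxMetaspaceSize=512m
```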


Change History (13)

comment:1 Changed 2 years ago by Zlatin Balevsky

The sudden hike you are mentioning sounds very suspicious and I would like to investigate further. If you can, please run with -verbose:class, and I will too.
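If the router is started through the service wrapper, the flag can be passed as an additional JVM property, for example (the index here is illustrative, use any unused one):

```
# Hypothetical wrapper.config entry: log every class load/unload to stdout.
wrapper.java.additional.6=-verbose:class
```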

Regarding MetaSpace, IIRC that was introduced in Java 8 and we're still targeting Java 7…

comment:2 Changed 2 years ago by zzz

what's "misleading" about the log?

comment:3 Changed 2 years ago by jogger

The log is misleading because it points to maxmemory as a possible solution. In this case the JVM clearly reports a metaspace OOM. I suspect the above hint would even be printed if one got an OOM because -Xss was set too low.

Just read comment 1; I will try that. It will need some days to reappear. The hike is even more significant: today the short-term jump was > 700 classes. Visibility for me depends on the graph scaling in jconsole.

Last edited 2 years ago by jogger (previous) (diff)

comment:4 Changed 2 years ago by zzz

Do you have any suggestions on how to improve the log message, or an alternative message if "metaspace" is the problem?

something like this?

if (oom.getMessage().contains("Metaspace"))
    log("Fix your Metaspace!!!");

comment:5 Changed 2 years ago by zzz

btw, case 2 in OP (start/stop of webapps) may be related to the way jetty webapp classloading works, or may be a byproduct of what we do for plugin webapps, where we try to make sure to get the new classes after a plugin is updated. Not sure if the webapp logic forces that for non-plugin webapps or not.

whatever the case, I wouldn't worry about it, as restarting webapps is very rare in normal operation.

comment:6 Changed 2 years ago by jogger

About webapps I agree; fixing this only matters if an uptime of more than a year is desired.

Other apps die silently on an OOM and leave it entirely up to the user to figure out what went wrong. We should acknowledge that most users now run Java 11 as a standard part of their OS and report errors accordingly. Java now manages many distinct memory areas, so error messages should reflect that. In my experience, heap OOMs have only come from my own coding errors, while metaspace and stack-overflow OOMs were more common and should be reported as such. For direct-memory shortage I have only seen performance issues so far. The code cache could also be considered.

We should not log "Fix your Metaspace!!!", but use neutral language there. After all, people with memory restrictions will set memory parameters to allow some headroom above multi-day maximums; if we bust those, it may be our fault.
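A neutral, area-specific classification along those lines could be sketched as follows; the class name, method, and message strings are illustrative only, not actual router code:

```java
// Hypothetical sketch: map an OutOfMemoryError's message to a neutral,
// area-specific hint instead of always pointing at wrapper.java.maxmemory.
public class OomHint {

    /** Returns a neutral hint naming the memory area that was exhausted. */
    public static String hintFor(OutOfMemoryError oom) {
        String msg = oom.getMessage();
        if (msg == null)
            return "Out of memory; consider raising wrapper.java.maxmemory";
        if (msg.contains("Metaspace"))
            return "Metaspace exhausted; consider raising -XX:MaxMetaspaceSize";
        if (msg.contains("Direct buffer memory"))
            return "Direct buffer memory exhausted; consider raising -XX:MaxDirectMemorySize";
        if (msg.contains("unable to create") && msg.contains("thread"))
            return "Could not create a native thread; check OS thread limits and -Xss";
        // "Java heap space", "GC overhead limit exceeded", etc.
        return "Java heap exhausted; consider raising wrapper.java.maxmemory";
    }
}
```

A simple string match like this answers zzz's question in comment 7: contains("Metaspace") is enough to single out the metaspace case, with everything unrecognized falling through to the existing heap advice.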

re comment 1: Classloader peaks are caused by hundreds of java.lang.invoke.LambdaForm$MH/0x….. classes, with the number of loaded classes never returning to previous levels afterwards.

comment:7 Changed 2 years ago by zzz

I don't have any data on Java versions in our userbase (do you?), but I suspect that most of our users are on windows, and most of them are on Java 8.

"Fix your metaspace" was a placeholder, what's your suggestion for a real error message, and is contains("Metaspace") sufficient to detect when to display it?

comment:8 Changed 2 years ago by Zlatin Balevsky

Class loader peaks are caused by hundreds of java.lang.invoke.LambdaForm$MH/0x….. with number of loaded classes never returning to previous levels afterwards.

Very strange. I get a few of those in my log as well, but not hundreds, and they get unloaded at a later time. I wonder if it could be an artifact of having jconsole attached. Can you correlate the timing of their loading with any other activity on the router? Actually, what is the router doing: just routing traffic, seeding torrents, or something else?

comment:9 Changed 2 years ago by jogger

There is no correlation to other activity. It occurs during unattended operation and is also not related to any IP change.

comment:10 Changed 2 years ago by zzz

possibly related: #2471

comment:11 Changed 16 months ago by zzz

Sensitive: unset
Status: new → infoneeded_new

This whole ticket sounds very JVM-specific. Is there anything to be done here or can we close the ticket?

comment:12 Changed 16 months ago by jogger

Status: infoneeded_new → new

I agree. It is very JVM-specific; the metaspace allocated differs between JVM versions. The error only occurs when a limit on metaspace is set, and will not be noticed by most users since the leak stays in the 10-20 MB range over the long term. So probably not worth putting more effort in.

comment:13 Changed 16 months ago by zzz

Milestone: undecided → n/a
Resolution: wontfix
Status: new → closed
Note: See TracTickets for help on using tickets.