Opened 2 years ago

Closed 18 months ago

#1989 closed defect (fixed)

NPE net.i2p.crypto.CryptixRijndael_Algorithm.blockEncrypt(CryptixRijndael_Algorithm.java:392)

Reported by: 2121 Owned by: zzz
Priority: minor Milestone: 0.9.31
Component: api/crypto Version: 0.9.30
Keywords: Cc:
Parent Tickets:

Description

i get this message on my i2p router sporadically and after (re)start:


CRIT 3/4? net.i2p.router.JobQueueRunner? : Error processing job [Outbound client message delayed send] on thread 2: null

java.lang.NullPointerException?
at net.i2p.crypto.CryptixRijndael_Algorithm.blockEncrypt(CryptixRijndael_Algorithm.java:392)
at net.i2p.crypto.CryptixAESEngine.encryptBlock(CryptixAESEngine.java:221)
at net.i2p.crypto.CryptixAESEngine.encrypt(CryptixAESEngine.java:122)
at net.i2p.crypto.CryptixAESEngine.encrypt(CryptixAESEngine.java:79)
at net.i2p.crypto.ElGamalAESEngine.encryptAESBlock(ElGamalAESEngine.java:692)
at net.i2p.crypto.ElGamalAESEngine.encryptAESBlock(ElGamalAESEngine.java:632)
at net.i2p.crypto.ElGamalAESEngine.encryptNewSession(ElGamalAESEngine.java:540)
at net.i2p.crypto.ElGamalAESEngine.encrypt(ElGamalAESEngine.java:411)
at net.i2p.crypto.ElGamalAESEngine.encrypt(ElGamalAESEngine.java:453)
at net.i2p.router.message.GarlicMessageBuilder?.buildMessage(GarlicMessageBuilder?.java:201)
at net.i2p.router.message.GarlicMessageBuilder?.buildMessage(GarlicMessageBuilder?.java:170)
at net.i2p.router.message.OutboundClientMessageJobHelper?.createGarlicMessage(OutboundClientMessageJobHelper?.java:126)
at net.i2p.router.message.OutboundClientMessageOneShotJob?.send(OutboundClientMessageOneShotJob?.java:612)
at net.i2p.router.message.OutboundClientMessageOneShotJob?.access$300(OutboundClientMessageOneShotJob?.java:103)
at net.i2p.router.message.OutboundClientMessageOneShotJob?$SendJob?.runJob(OutboundClientMessageOneShotJob?.java:357)
at net.i2p.router.JobQueueRunner?.runCurrentJob(JobQueueRunner?.java:135)
at net.i2p.router.JobQueueRunner?.run(JobQueueRunner?.java:78)


Subtickets

Change History (19)

comment:1 Changed 2 years ago by zzz

  • Status changed from new to infoneeded_new

Please provide the version information from the top of the /logs page in your console.

comment:2 Changed 2 years ago by zzz

notes to self: CryptixAESEngine.encryptBlock() at line 210 could be improved to guarantee non-null preparedKey passed to CryptixRijndael_Algorithm.blockEncrypt(), but SessionKey?.setData() will throw ISE if called twice, so preparedKey won't be set to null again, so isn't clear how this could ever happen. But will fixup at line 210 just be sure.

comment:3 Changed 2 years ago by zzz

Above cleanups in 3871abfd63f255f4993d3484585f01538cdb4835 to be 0.9.30-4

comment:4 Changed 2 years ago by 2121

  • Status changed from infoneeded_new to new

I2P version: 0.9.30-3
Java version: Oracle Corporation 1.8.0_152-ea (Java(TM) SE Runtime Environment 1.8.0_152-ea-b01)
Wrapper version: 3.5.30
Server version: 9.2.21.v20170120
Servlet version: Jasper JSP 2.3 Engine
JSTL version: standard-taglib 1.2.0
Platform: Linux arm 4.4.0-1034-raspi2
Processor: (armv7)
Jbigi: Locally optimized native BigInteger? library loaded from file
Jbigi version: 4
GMP version: 6.1.2
Encoding: UTF-8
Charset: UTF-8

comment:5 Changed 2 years ago by zzz

  • Component changed from unspecified to api/crypto
  • Milestone changed from undecided to 0.9.31
  • Status changed from new to open

OP reports on IRC that he's seen it on multiple RPi platforms. No reports on x86 so far. He also reports that the above cleanups fix the problem. All of this info is preliminary, will await further testing.

comment:6 Changed 2 years ago by 2121

after the patch the error appeared again

CRIT 1/1? net.i2p.router.JobQueueRunner? : Error processing job [Outbound client message delayed send] on thread 0: null

java.lang.NullPointerException?
at net.i2p.crypto.CryptixRijndael_Algorithm.blockEncrypt(CryptixRijndael_Algorithm.java:392)
at net.i2p.crypto.CryptixAESEngine.encryptBlock(CryptixAESEngine.java:222)
at net.i2p.crypto.CryptixAESEngine.encrypt(CryptixAESEngine.java:125)
at net.i2p.crypto.CryptixAESEngine.encrypt(CryptixAESEngine.java:79)
at net.i2p.crypto.ElGamalAESEngine.encryptAESBlock(ElGamalAESEngine.java:692)
at net.i2p.crypto.ElGamalAESEngine.encryptAESBlock(ElGamalAESEngine.java:632)
at net.i2p.crypto.ElGamalAESEngine.encryptNewSession(ElGamalAESEngine.java:540)
at net.i2p.crypto.ElGamalAESEngine.encrypt(ElGamalAESEngine.java:411)
at net.i2p.crypto.ElGamalAESEngine.encrypt(ElGamalAESEngine.java:453)
at net.i2p.router.message.GarlicMessageBuilder?.buildMessage(GarlicMessageBuilder?.java:201)
at net.i2p.router.message.GarlicMessageBuilder?.buildMessage(GarlicMessageBuilder?.java:170)
at net.i2p.router.message.OutboundClientMessageJobHelper?.createGarlicMessage(OutboundClientMessageJobHelper?.java:126)
at net.i2p.router.message.OutboundClientMessageOneShotJob?.send(OutboundClientMessageOneShotJob?.java:612)
at net.i2p.router.message.OutboundClientMessageOneShotJob?.access$300(OutboundClientMessageOneShotJob?.java:103)
at net.i2p.router.message.OutboundClientMessageOneShotJob?$SendJob?.runJob(OutboundClientMessageOneShotJob?.java:357)
at net.i2p.router.JobQueueRunner?.runCurrentJob(JobQueueRunner?.java:135)
at net.i2p.router.JobQueueRunner?.run(JobQueueRunner?.java:78)

comment:7 Changed 2 years ago by zzz

OK, looking at this again.

The sessionKey passed in should be an array of two int[][], i.e. Ke and Kd. We use Ke for encryption.

sessionKey is not null. sessionKey[0], i.e. Ke, is null. This confirms that the changes I checked in to make double-sure that sessionKey isn't null don't matter. So let's look at makeKey() where the preparedKey comes from.

keyData is always null (we don't do caching). So we always return new Object {Ke, Kd}. Both are always created with new int[][] and can't be null.

So again, this appears to be "impossible". It's either buggy hardware, compiler, or JVM, or there's some Java memory model thing at play here, with how array elements are filled in or visible across threads. If so, perhaps it could be fixed with synchronization somewhere. To be researched. As I've said elsewhere, strange and impossible things seem to happen on the RPi.

comment:8 Changed 2 years ago by zzz

Well, array elements are not volatile in Java. I knew that. Cheezy fix is to just move the return value creation up and hope the cache gets flushed. Now that I look at it, why do we have encryption and decryption session key creation in the same method. Wouldn't we just need one or the other, but never both? This whole thing is crying out for a rewrite.

comment:9 Changed 2 years ago by 2121

At this point i can say with certainty that it happens only on my rpi3 (armv7).

comment:10 Changed 2 years ago by 2121

I can now confirm that the error is specifically tight to SSU/UDP.
Running the same router with TCP/NTCP only is absolutely fine.

comment:11 Changed 2 years ago by 2121

After doing some load testing on the disk tcp also bumped into this error.
It seems to be very RPi3 specific and not an i2p bug.

comment:12 Changed 2 years ago by zzz

Stab in the dark as described in comment 8 above, in d47b99835c50891cc75187820250a9467306780c to be 0.9.30-4.
Not a serious fix, but could reduce or eliminate the errors. Or, won't affect anything. If it does make a difference, will point the way to a better fix.

comment:13 Changed 2 years ago by 2121

With 0.9.30-4 error message is gone.
Memory stats of the router show ~40% less memory consumption right from the start.
This goes for armv7 as well as x86_64 linux.

Last edited 2 years ago by 2121 (previous) (diff)

comment:14 Changed 2 years ago by zzz

  • Owner set to zzz
  • Status changed from open to accepted

Thanks for the test results. It confirms the theory in comment 8.

I don't know why the change would affect memory consumption.

Thinking about whether to leave the fix as-is or make it more robust.

comment:15 Changed 2 years ago by 2121

still no errors after more than 5 days - so i would consider this fixed.
I don't know about making it more 'robust' - is it worth your time or better spent on other matters ?

comment:16 Changed 2 years ago by 2121

as it turns out - i was wrong :-(

CRIT 1/1? net.i2p.router.JobQueueRunner? : Error processing job [Outbound client message delayed send] on thread 0: null

java.lang.NullPointerException?
at net.i2p.crypto.CryptixRijndael_Algorithm.blockEncrypt(CryptixRijndael_Algorithm.java:405)
at net.i2p.crypto.CryptixAESEngine.encryptBlock(CryptixAESEngine.java:222)
at net.i2p.crypto.CryptixAESEngine.encrypt(CryptixAESEngine.java:122)
at net.i2p.crypto.CryptixAESEngine.encrypt(CryptixAESEngine.java:79)
at net.i2p.crypto.ElGamalAESEngine.encryptAESBlock(ElGamalAESEngine.java:692)
at net.i2p.crypto.ElGamalAESEngine.encryptAESBlock(ElGamalAESEngine.java:632)
at net.i2p.crypto.ElGamalAESEngine.encryptNewSession(ElGamalAESEngine.java:540)
at net.i2p.crypto.ElGamalAESEngine.encrypt(ElGamalAESEngine.java:411)
at net.i2p.crypto.ElGamalAESEngine.encrypt(ElGamalAESEngine.java:453)
at net.i2p.router.message.GarlicMessageBuilder?.buildMessage(GarlicMessageBuilder?.java:201)
at net.i2p.router.message.GarlicMessageBuilder?.buildMessage(GarlicMessageBuilder?.java:170)
at net.i2p.router.message.OutboundClientMessageJobHelper?.createGarlicMessage(OutboundClientMessageJobHelper?.java:126)
at net.i2p.router.message.OutboundClientMessageOneShotJob?.send(OutboundClientMessageOneShotJob?.java:612)
at net.i2p.router.message.OutboundClientMessageOneShotJob?.access$300(OutboundClientMessageOneShotJob?.java:103)
at net.i2p.router.message.OutboundClientMessageOneShotJob?$SendJob?.runJob(OutboundClientMessageOneShotJob?.java:357)
at net.i2p.router.JobQueueRunner?.runCurrentJob(JobQueueRunner?.java:135)
at net.i2p.router.JobQueueRunner?.run(JobQueueRunner?.java:78)

comment:17 Changed 2 years ago by zzz

  • Status changed from accepted to testing

well, that's bad news in that we didn't fix it, but good news in that it's consistent with my thesis on the cause.

Attempted 'real' fix in 6d9441402863e2dd71beaea95d214ff3de967649 0.9.30-9, uses a class with final fields instead of an array. Let's see how it works.

comment:18 Changed 2 years ago by 2121

i updated to -9 and will report back in a week.
thanx for the patch

comment:19 Changed 18 months ago by zzz

  • Resolution set to fixed
  • Status changed from testing to closed

presumed fixed

Note: See TracTickets for help on using tickets.