Opened 22 months ago

Last modified 16 months ago

#2036 open enhancement

Improve performance of ElGamal

Reported by: str4d Owned by: str4d
Priority: minor Milestone: soon
Component: router/general Version: 0.9.31
Keywords: performance, metrics Cc:
Parent Tickets: Sensitive: no

Description

I2P's ElGamal engine has remained essentially unchanged since it was written in 2004, around the release of Java 5. It uses BigInteger to implement the algorithm, with two performance features:

  • It uses libgmp via NativeBigInteger for modPow() and modPowCT() on systems where available.
  • The randomness for the protocol is pre-generated and cached (in DHSessionKeyBuilder for transports, and YKGenerator for E2E), to take advantage of low-load periods.
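For orientation, the core of such an engine can be sketched in a few lines of plain BigInteger code. This is a textbook sketch, not I2P's actual ElGamalEngine: the class name, the toy 256-bit prime, and the parameter handling are illustrative stand-ins (the real engine uses fixed 2048-bit constants and padded plaintexts).

```java
import java.math.BigInteger;
import java.security.SecureRandom;

// Illustrative textbook ElGamal over BigInteger. All names here
// (ElGamalSketch, P, G) are hypothetical stand-ins, not I2P code.
public class ElGamalSketch {
    public static final SecureRandom RNG = new SecureRandom();
    // Small demo prime; the real engine uses a fixed 2048-bit prime.
    public static final BigInteger P = BigInteger.probablePrime(256, RNG);
    public static final BigInteger G = BigInteger.valueOf(2);

    // Encrypt m (< p) to the public key y = g^x mod p.
    public static BigInteger[] encrypt(BigInteger y, BigInteger m) {
        BigInteger k = new BigInteger(P.bitLength() - 1, RNG); // ephemeral key
        // The two modPow() calls below are where nearly all the CPU time
        // goes -- these are what NativeBigInteger offloads to libgmp, and
        // the (k, g^k) pair is what YKGenerator precomputes and caches.
        BigInteger a = G.modPow(k, P);
        BigInteger b = y.modPow(k, P).multiply(m).mod(P);
        return new BigInteger[] { a, b };
    }

    // Recover m = b * (a^x)^-1 mod p using the private key x.
    public static BigInteger decrypt(BigInteger x, BigInteger[] ct) {
        return ct[1].multiply(ct[0].modPow(x, P).modInverse(P)).mod(P);
    }
}
```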

The ElGamal subsystem is a particularly heavy part of the I2P router. Several potential improvements have recently been identified:

  • As we have a common fixed base across the entire network, we can pre-calculate tables for it to implement fast Montgomery multiplication.
  • ElGamalEngine.decrypt() uses a suboptimal algorithm, and initial benchmarks indicate that it causes severely degraded performance in pure Java.
  • Initial benchmarks indicate that in modern Java JVMs, the pure-Java performance might outstrip the NativeBigInteger code.
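To make the first bullet concrete: here is a minimal fixed-base precomputation sketch. It uses a plain cached table of g^(2^i) mod p rather than the Montgomery-form tables the ticket actually proposes, but it shows where a network-wide fixed base lets work be done once and amortized across every exponentiation. The class name and parameters are hypothetical.

```java
import java.math.BigInteger;

// Sketch of the fixed-base idea: because g and p are network-wide
// constants, the powers g^(2^i) mod p can be computed once and reused.
// This is simple fixed-base windowing, NOT the Montgomery multiplication
// tables proposed in the ticket; it only illustrates the precomputation.
public class FixedBasePow {
    private final BigInteger p;
    private final BigInteger[] pow2; // pow2[i] = g^(2^i) mod p

    public FixedBasePow(BigInteger g, BigInteger p, int maxExpBits) {
        this.p = p;
        pow2 = new BigInteger[maxExpBits];
        pow2[0] = g.mod(p);
        for (int i = 1; i < maxExpBits; i++)
            pow2[i] = pow2[i - 1].multiply(pow2[i - 1]).mod(p); // repeated squaring
    }

    // g^e mod p using only multiplications against the cached table.
    public BigInteger modPow(BigInteger e) {
        BigInteger r = BigInteger.ONE;
        for (int i = 0; i < e.bitLength(); i++)
            if (e.testBit(i))
                r = r.multiply(pow2[i]).mod(p);
        return r;
    }
}
```

With the table in place, each exponentiation costs only multiplications for the set bits of the exponent; whether this (or a Montgomery variant) beats libgmp's generic modPow is exactly what the benchmarks need to establish.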

We need to set up benchmarks, run them on a variety of systems, validate the above improvements, and deploy the ones that prove useful.

Subtickets

Change History (7)

comment:1 Changed 22 months ago by str4d

Steps to close this:

  • Finish implementing the benchmarks
    • In progress in i2p.i2p.str4d.bench
  • Set up a CI system to build and collect benchmarks.
    • Using Buildbot for control and collection.
      • Will set up an EC2 latent worker for building the I2P JARs, which can then be pulled up to the master for distribution to benchmark machines.
    • Using Codespeed for storage and viewing.
      • Codespeed doesn't have Monotone support, and I can't rewrite revision IDs that aren't in the main branch. So we will use Codespeed only for canonical benchmarks (once the changes are merged into master), and use Buildbot alone for obtaining benchmark values during development.
  • Collect benchmarks on current master.
    • We can set up various hardware + JVM combinations with Buildbot workers.
    • This won't need actual continuous integration; we can just bring up the workers as we need them.
  • Implement and test improvements.
  • Merge improvements and collect benchmarks.

comment:2 Changed 22 months ago by str4d

The benchmarks are now implemented. To test:

  • Check out i2p.i2p.str4d.bench.
  • Set jmh.home in override.properties to a folder containing jmh-core.jar, jmh-generator-annprocess.jar, jopt-simple.jar, and commons-math3.jar.
    • Fetch these from Maven Central.
    • Tested using JMH 1.19 which requires jopt-simple 4.6 and commons-math3 3.2.
  • Run ant bench to compile the benchmarks.
  • Run the benchmarks:
    • ./benchmarks/benchmark.sh -h to see underlying JMH options
    • ./benchmarks/benchmark.sh to run the benchmarks in pure-Java mode
    • ./benchmarks/benchmark.sh --jbigi to run the benchmarks with jbigi.jar in the classpath
    • JAVA=/path/to/java ./benchmarks/benchmark.sh to run the benchmarks with a different JVM
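As a rough sanity check before reaching for the harness above, a crude nanoTime() loop can give ballpark numbers for the hot operation. Everything here (class name, iteration count) is a hypothetical sketch; real measurements should come from the JMH benchmarks, which handle warmup, forking, and dead-code elimination properly.

```java
import java.math.BigInteger;

// A crude standalone approximation of what the JMH benchmarks measure:
// warm up, then time BigInteger.modPow in a loop. This is only for
// orientation -- JMH exists precisely because naive loops like this are
// easy to get wrong (JIT warmup, dead-code elimination, etc.).
public class ModPowTiming {
    public static double nsPerOp(BigInteger g, BigInteger e, BigInteger p, int iters) {
        BigInteger sink = BigInteger.ONE;
        for (int i = 0; i < iters; i++)
            sink = sink.add(g.modPow(e, p)); // warmup pass
        long t0 = System.nanoTime();
        for (int i = 0; i < iters; i++)
            sink = sink.add(g.modPow(e, p)); // timed pass
        long t1 = System.nanoTime();
        if (sink.signum() < 0)               // never true; keeps sink live
            System.out.println(sink);
        return (t1 - t0) / (double) iters;
    }
}
```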

comment:3 Changed 22 months ago by zzz

I don't think we need a new top-level directory, and all our other test code is elsewhere.
Can we move the benchmarks/ top-level dir to core/java/test/benchmarks/? That seems like a better place. Maybe benchmark.sh should go to tests/scripts/?

Can you put the instructions from comment 2 above somewhere? At the top of benchmark.sh, perhaps, or in a README or help file?

comment:4 Changed 22 months ago by zzz

ACK on the choice of JMH (OP's reasons were simplicity and OpenJDK support).

OP asked for comments on the choice of initial test implementations and their parameters. They seem fine.

SHA256 is unlikely to be interesting, as we now unconditionally use the JVM implementation, but it's fine as a demo.

ElG seems to be bearing immediate fruit.

It could be interesting to add more length params to the AES test, particularly around the current 704 boundary between the JVM implementation and ours (see CryptixAESEngine).

As I noted in IRC, porting (or just running) some of the diagnostic output from NBI.main(), SystemVersion.main(), CPUID.main(), and/or others, may be helpful or even necessary to interpret benchmark results. Or just add it as a separate "test".

We also discussed ways to enable/disable jbigi and to use a specific dll/so.

Also, the ctx field in all 3 tests and most of the fields in the AES test can be final.

comment:5 in reply to:  4 Changed 22 months ago by str4d

Replying to zzz:

> I don't think we need a new top-level directory, and all our other test code is elsewhere. Can we move the benchmarks/ top-level dir to core/java/test/benchmarks/? That seems like a better place. Maybe benchmark.sh should go to tests/scripts/?
>
> Can you put the instructions from comment 2 above somewhere? At the top of benchmark.sh, perhaps, or in a README or help file?

Both good suggestions, now implemented.

Replying to zzz:

> It could be interesting to add more length params to the AES test, particularly around the current 704 boundary between the JVM implementation and ours (see CryptixAESEngine).
>
> As I noted in IRC, porting (or just running) some of the diagnostic output from NBI.main(), SystemVersion.main(), CPUID.main(), and/or others, may be helpful or even necessary to interpret benchmark results. Or just add it as a separate "test".

In the CI builder, I will just run these as additional steps before the benchmarks themselves.

> We also discussed ways to enable/disable jbigi and to use a specific dll/so.

For the benchmark.sh script I've stuck with the current jbigi.jar strategy, to simulate what end users will experience. Someone who wants to use a specific libjbigi.so can copy it to ./libjbigi.so and then run:

java -Djava.library.path=. -jar core/java/build/i2p-benchmarks.jar [JMH_ARGS]

> Also, the ctx field in all 3 tests and most of the fields in the AES test can be final.

They can be, but should they be? I want to simulate regular usage as closely as possible, and the final keyword has optimization advantages that could make the benchmarks unrepresentative of the real code.

comment:6 Changed 22 months ago by str4d

Regarding the collection itself, here's an outline of the builders that I'm envisaging:

  • Compilation builder
    • Runs ant bench
    • Uploads benchmark.sh, i2p-benchmarks.jar and jbigi.jar to the build master
  • Benchmark builders (one per environment)
    • Download benchmark files
    • java -cp i2p-benchmarks.jar net.i2p.util.SystemVersion
    • java -cp i2p-benchmarks.jar:jbigi.jar freenet.support.CPUInformation.CPUID
    • java -cp i2p-benchmarks.jar:jbigi.jar net.i2p.util.NativeBigInteger
    • JAVA=java6 ./benchmark.sh
    • JAVA=java6 ./benchmark.sh --jbigi
    • JAVA=java7 ./benchmark.sh
    • JAVA=java7 ./benchmark.sh --jbigi
    • JAVA=java8 ./benchmark.sh
    • JAVA=java8 ./benchmark.sh --jbigi
    • Upload the benchmark.sh results to Codespeed
      • I think I will treat each invocation as a different "executable" in Codespeed parlance, e.g. i2p-java6 vs i2p-java6-jbigi

comment:7 Changed 16 months ago by str4d

Keywords: metrics added
Status: new → open