patternjavaMinor
Fastest way to multiply two byte arrays in Java
Viewed 0 times
arraysbytemultiplywayjavatwofastest
Problem
I have two byte arrays that represent unsigned 256-bit values and I want to multiply them resulting into a new value. What is the fastest ways to do this in Java? And are there other factors I should take into account?
This is my current multiply-function:
This takes ~1500ms on my PC for 10000000 iterations and something tells me this can be done faster.
This is my current multiply-function:
// two input values
// i.e. each 0x00000000000000000000000000000000000000000000000000000000000000ff
byte[] input1Array = new byte[32];
byte[] input2Array = new byte[32];
BigInteger input1 = new BigInteger(1, input1Array);
BigInteger input2 = new BigInteger(1, input2Array);
BigInteger result = input1.multiply(input2);
byte[] bytes = result.toByteArray();
ByteBuffer data = ByteBuffer.allocate(32);
System.arraycopy(bytes, 0, data.array(), 32 - bytes.length, bytes.length);
// expected 0x000000000000000000000000000000000000000000000000000000000000fe01
byte[] resultArray = data.array();This takes ~1500ms on my PC for 10000000 iterations and something tells me this can be done faster.
Solution
Your code is somewhat hypothetical, in the example you give, it is 0 * 0, which will be pretty fast....
Still, let's put things in to perspective. You are performing 10 million loops in 1.5 seconds. A modern Intel CPU runs at 3.9GHz. In 1.5 seconds, that's about 6billion cycles (5.85 billion or so).
about 6 billion cycles for 10 million loops is about 600 cycles per loop. Now, in that loop, you are:
That is an amazing amount of work to do in 600 instruction cycles.I would expect the various object creates (three BigIntegers, two arrays, and a Buffer) are the most expensive things to do as they require multiple instructions, allocations/reservations, and registrations in the object creation, including CPU/cache management, etc.
Frankly, what you have is pretty good.
You could reduce the amount of work you are doing by completely removing the ByteBuffer from the last section. This code:
could instead be:
But, what if the sum is more than 32 bytes?
Also, if you are at/near the limits, and you have negative input values, then you may end up with a negative result. I know your code appears to limit the negative impact by forcing a positive sign, but it could 'overflow' the 32 bytes.
Still, let's put things in to perspective. You are performing 10 million loops in 1.5 seconds. A modern Intel CPU runs at 3.9GHz. In 1.5 seconds, that's about 6billion cycles (5.85 billion or so).
about 6 billion cycles for 10 million loops is about 600 cycles per loop. Now, in that loop, you are:
- creating two byte arrays (OK, maybe you reuse those).
- creating 2 BigInteger objects
- running an add on those BigIntegers and computing a 3rd BigInteger
- converting that third big-integer back to an array.
- creating a ByteBuffer
- copying the array content to the byte buffer
- getting the byte buffer's internal array.
That is an amazing amount of work to do in 600 instruction cycles.I would expect the various object creates (three BigIntegers, two arrays, and a Buffer) are the most expensive things to do as they require multiple instructions, allocations/reservations, and registrations in the object creation, including CPU/cache management, etc.
Frankly, what you have is pretty good.
You could reduce the amount of work you are doing by completely removing the ByteBuffer from the last section. This code:
byte[] bytes = result.toByteArray();
ByteBuffer data = ByteBuffer.allocate(32);
System.arraycopy(bytes, 0, data.array(), 32 - bytes.length, bytes.length);
// expected 0x000000000000000000000000000000000000000000000000000000000000fe01
byte[] resultArray = data.array();could instead be:
byte[] bytes = result.toByteArray();
if (bytes.length < 32) {
byte[] tmp = new byte[32];
System.arrayCopy(byte, 0, tmp, 32 - bytes.length, bytes.length);
bytes = tmp;
}But, what if the sum is more than 32 bytes?
Also, if you are at/near the limits, and you have negative input values, then you may end up with a negative result. I know your code appears to limit the negative impact by forcing a positive sign, but it could 'overflow' the 32 bytes.
Code Snippets
byte[] bytes = result.toByteArray();
ByteBuffer data = ByteBuffer.allocate(32);
System.arraycopy(bytes, 0, data.array(), 32 - bytes.length, bytes.length);
// expected 0x000000000000000000000000000000000000000000000000000000000000fe01
byte[] resultArray = data.array();byte[] bytes = result.toByteArray();
if (bytes.length < 32) {
byte[] tmp = new byte[32];
System.arrayCopy(byte, 0, tmp, 32 - bytes.length, bytes.length);
bytes = tmp;
}Context
StackExchange Code Review Q#54250, answer score: 6
Revisions (0)
No revisions yet.