Array copy performance



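I want to multiply two matrices, so I decided to split the matrix into parts. I wrote two different matrix-split functions, and the result confuses me. One of them uses System.arraycopy and the other uses for loops. I observed that the for-loop version runs faster than the arraycopy version.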

    // Variant 1: copy each row of the requested part with System.arraycopy.
    private static int[][] getPartOfMatrix(int[][] matrix, int size, int part) {
        int[][] newMatrix = new int[size][matrix[0].length];
        for (int i = part * size, r = 0; i < (part + 1) * size; i++, r++) {
            // The destination row index r starts at 0 for every part;
            // indexing newMatrix with i would overflow as soon as part > 0.
            System.arraycopy(matrix[i], 0, newMatrix[r], 0, matrix[i].length);
        }
        return newMatrix;
    }

    // Variant 2: copy element by element with nested for loops.
    private static int[][] getPartOfMatrix2(int[][] matrix, int size, int part) {
        int[][] newMatrix = new int[size][matrix[0].length];
        for (int i = part * size, r = 0; i < (part + 1) * size; i++, r++) {
            for (int j = 0; j < matrix[0].length; j++) {
                newMatrix[r][j] = matrix[i][j];
            }
        }
        return newMatrix;
    }

Which one should I use, and why?

package tests;
import org.openjdk.jmh.annotations.*;
import java.util.concurrent.TimeUnit;
@State(Scope.Benchmark)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
public class CopyArray implements UnsafeConstants {
    @Param({"0", "1", "10", "16", "1000", "1024", "8192"})
    public int arraySize;
    public int[] a;
    public int[] copy;
    @Setup
    public void setup() {
        a = new int[arraySize];
        copy = new int[arraySize];
    }
    @Benchmark
    public int[] arrayCopy(CopyArray state) {
        int[] a = state.a;
        int[] copy = state.copy;
        System.arraycopy(a, 0, copy, 0, a.length);
        return copy;
    }
    @Benchmark
    public int[] forLoop(CopyArray state) {
        int[] a = state.a;
        int arraySize = a.length;
        int[] copy = state.copy;
        for (int i = 0; i < arraySize; i++) {
            copy[i] = a[i];
        }
        return copy;
    }
    @Benchmark
    public int[] unsafeCopyMemory(CopyArray state) {
        int[] a = state.a;
        int arraySize = a.length;
        int[] copy = state.copy;
        U.copyMemory(a, INT_BASE, copy, INT_BASE, arraySize << INT_SCALE_SHIFT);
        return copy;
    }
}
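
The benchmark above implements an UnsafeConstants interface that is not shown, and takes U, INT_BASE and INT_SCALE_SHIFT from it. The following is only a guess at what such an interface could look like, assuming U is sun.misc.Unsafe obtained reflectively, INT_BASE is the offset of the first element of an int[], and INT_SCALE_SHIFT is log2 of the element size. The loadUnsafe helper is a hypothetical name, and recent JDKs print deprecation warnings for these Unsafe methods.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Hypothetical reconstruction: the original interface is not part of the answer.
// Package-private, so it is assumed to live in the same package as the benchmark.
interface UnsafeConstants {
    // Interface fields are implicitly public static final.
    Unsafe U = loadUnsafe();

    // Byte offset of element 0 inside an int[] object.
    long INT_BASE = U.arrayBaseOffset(int[].class);

    // log2 of the int element size, so (length << INT_SCALE_SHIFT) is a byte count.
    int INT_SCALE_SHIFT =
            Integer.numberOfTrailingZeros(U.arrayIndexScale(int[].class));

    // Reads the private static singleton field of sun.misc.Unsafe via reflection.
    static Unsafe loadUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }
}
```

With these definitions, the last argument of U.copyMemory in the benchmark, arraySize << INT_SCALE_SHIFT, is the number of bytes to copy.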

Results:

Benchmark                       (arraySize)  Mode  Samples     Score     Error  Units
t.CopyArray.arrayCopy                     0  avgt       10     3,598 ±   0,385  ns/op
t.CopyArray.arrayCopy                     1  avgt       10     7,566 ±   0,961  ns/op
t.CopyArray.arrayCopy                    10  avgt       10     8,629 ±   0,988  ns/op
t.CopyArray.arrayCopy                    16  avgt       10     9,994 ±   0,667  ns/op
t.CopyArray.arrayCopy                  1000  avgt       10   164,613 ±  19,103  ns/op
t.CopyArray.arrayCopy                  1024  avgt       10   320,658 ±  26,458  ns/op
t.CopyArray.arrayCopy                  8192  avgt       10  2468,847 ± 204,341  ns/op
t.CopyArray.forLoop                       0  avgt       10     2,598 ±   0,194  ns/op
t.CopyArray.forLoop                       1  avgt       10     4,161 ±   0,841  ns/op
t.CopyArray.forLoop                      10  avgt       10    10,056 ±   1,166  ns/op
t.CopyArray.forLoop                      16  avgt       10    11,004 ±   1,477  ns/op
t.CopyArray.forLoop                    1000  avgt       10   207,118 ±  36,371  ns/op
t.CopyArray.forLoop                    1024  avgt       10   206,291 ±  26,327  ns/op
t.CopyArray.forLoop                    8192  avgt       10  1867,073 ± 238,488  ns/op
t.CopyArray.unsafeCopyMemory              0  avgt       10     7,080 ±   0,082  ns/op
t.CopyArray.unsafeCopyMemory              1  avgt       10     8,257 ±   0,184  ns/op
t.CopyArray.unsafeCopyMemory             10  avgt       10     8,424 ±   0,365  ns/op
t.CopyArray.unsafeCopyMemory             16  avgt       10    10,129 ±   0,076  ns/op
t.CopyArray.unsafeCopyMemory           1000  avgt       10   213,239 ±  30,729  ns/op
t.CopyArray.unsafeCopyMemory           1024  avgt       10   310,881 ±  34,527  ns/op
t.CopyArray.unsafeCopyMemory           8192  avgt       10  2419,456 ±  66,557  ns/op

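Conclusions:

  • Unsafe.copyMemory is never an option: in these measurements it is not meaningfully faster than System.arraycopy at any size.
  • The for loop outperforms System.arraycopy when the array size is a large power of 2 (1024 and 8192 here), while System.arraycopy appears ahead at 1000.
  • Otherwise, you need to run additional measurements for your specific matrix row width.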
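
Before measuring the two split variants on your own row width, it is worth confirming that they return the same rows, since a benchmark of a buggy copy tells you nothing. Below is a minimal sketch of such a check. The class name, method names and the 8 x 1000 dimensions are made up for illustration, and both methods index the destination from row 0. The two static methods could then be moved into a JMH class with the row width as a @Param.

```java
import java.util.Arrays;

// Hypothetical helper class; names and dimensions are illustrative only.
class MatrixSplitCheck {

    // Row-wise copy of one part using System.arraycopy.
    static int[][] splitWithArraycopy(int[][] matrix, int size, int part) {
        int[][] out = new int[size][matrix[0].length];
        for (int i = part * size, r = 0; r < size; i++, r++) {
            System.arraycopy(matrix[i], 0, out[r], 0, matrix[i].length);
        }
        return out;
    }

    // Element-wise copy of one part using nested for loops.
    static int[][] splitWithLoop(int[][] matrix, int size, int part) {
        int[][] out = new int[size][matrix[0].length];
        for (int i = part * size, r = 0; r < size; i++, r++) {
            for (int j = 0; j < matrix[0].length; j++) {
                out[r][j] = matrix[i][j];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int rows = 8, width = 1000, size = 2; // assumed test dimensions
        int[][] matrix = new int[rows][width];
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < width; j++) {
                matrix[i][j] = i * width + j; // distinct value per cell
            }
        }
        // Compare both variants for every part of the matrix.
        for (int part = 0; part < rows / size; part++) {
            int[][] a = splitWithArraycopy(matrix, size, part);
            int[][] b = splitWithLoop(matrix, size, part);
            if (!Arrays.deepEquals(a, b)) {
                throw new IllegalStateException("variants differ for part " + part);
            }
        }
        System.out.println("both variants agree for row width " + width);
    }
}
```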
