Cross-platform software performance comparison of various permutations.
(b) No data.
(h) Sponge construction [10] with $c$ = 256 bits, $r$ = 128 bits and 256 bits of output.
- "Hashing 500 bytes": AVR cycles for comparability with [5].
- AEAD timings from [8] are scaled to estimate permutation timings.
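The sponge parameters in the note on Gimli-Hash can be turned into a rough count of permutation calls for the 500-byte benchmark. The sketch below is illustrative only: the function names are hypothetical, and it assumes that padding always adds a (possibly partial) final block and that one extra permutation call separates consecutive output blocks. The cycle counts are the Gimli-Hash figures from the AVR table.

```python
import math

def sponge_permutation_calls(msg_bytes, rate_bytes, out_bytes):
    """Estimate permutation calls for a sponge hash (hypothetical helper).

    Assumption: padding always occupies at least part of one extra block,
    so an n-byte message fills floor(n / rate) + 1 absorbed blocks.
    """
    absorb_blocks = msg_bytes // rate_bytes + 1
    squeeze_blocks = math.ceil(out_bytes / rate_bytes)
    # One call after each absorbed block, plus one between output blocks.
    return absorb_blocks + (squeeze_blocks - 1)

def cycles_per_byte(total_cycles, msg_bytes):
    """Average cost per message byte for a whole-hash measurement."""
    return total_cycles / msg_bytes

RATE_BYTES = 128 // 8   # r = 128 bits
OUT_BYTES = 256 // 8    # 256 bits of output
MSG_BYTES = 500

calls = sponge_permutation_calls(MSG_BYTES, RATE_BYTES, OUT_BYTES)
small = cycles_per_byte(805_110, MSG_BYTES)   # Gimli-Hash small
fast = cycles_per_byte(362_712, MSG_BYTES)    # Gimli-Hash fast
print(calls, small, fast)
```

Because the rate is only a third of the 384-bit state, the per-message-byte cost of hashing is necessarily higher than the per-byte cost reported for the bare permutation.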
Hashing 500 bytes on AVR ATmega

| Hash function | Cycles | ROM Bytes | RAM Bytes |
| --- | --- | --- | --- |
| Spongent [5] | 25 464 000 | 364 | 101 |
| Keccak-f[400] [5] | 1 313 000 | 608 | 96 |
| Gimli-Hash (h) small | 805 110 | 778 | 44 |
| Gimli-Hash (h) fast | 362 712 | 19 218 | 45 |
AVR ATmega

| Permutation | Cycles/B | ROM Bytes | RAM Bytes |
| --- | --- | --- | --- |
| Gimli small | 413 | 778 | 44 |
| ChaCha20 [31] | 238 | – (b) | 132 |
| Salsa20 [19] | 216 | 1 750 | 266 |
| Gimli fast | 213 | 19 218 | 45 |
| AES-128 small [22] | 171 | 1 570 | – (b) |
| AES-128 fast [22] | 155 | 3 098 | – (b) |
ARM Cortex-M0

| Permutation | Cycles/B | ROM Bytes | RAM Bytes |
| --- | --- | --- | --- |
| Gimli | 49 | 4 730 | 64 |
| ChaCha20 [23] | 40 | – (b) | – (b) |
| Chaskey [21] | 17 | 414 | – (b) |
ARM Cortex-M3/M4

| Permutation | Cycles/B | ROM Bytes | RAM Bytes |
| --- | --- | --- | --- |
| Ascon [15] | 196 | – (b) | – (b) |
| Keccak-f[400] [30] | 106 | 540 | – (b) |
| AES-128 [25] | 34 | 3 216 | 72 |
| Gimli | 21 | 3 972 | 44 |
| ChaCha20 [18] | 13 | 2 868 | 8 |
| Chaskey [21] | 7 | 908 | – (b) |
ARM Cortex-A8

| Permutation | Cycles/B | ROM Bytes | RAM Bytes |
| --- | --- | --- | --- |
| Keccak-f[400] (KetjeSR) [8] | 37.52 | – (b) | – (b) |
| Ascon [8] | 25.54 | – (b) | – (b) |
| AES-128 [8] many blocks | 19.25 | – (b) | – (b) |
| Gimli single block | 8.73 | 480 | – (b) |
| ChaCha20 [8] multiple blocks | 6.25 | – (b) | – (b) |
| Salsa20 [8] multiple blocks | 5.48 | – (b) | – (b) |
Intel Haswell

| Permutation | Cycles/B | ROM Bytes | RAM Bytes |
| --- | --- | --- | --- |
| Gimli single block | 4.46 | 252 | – (b) |
| NORX-32-4-1 [8] single block | 2.84 | – (b) | – (b) |
| Gimli two blocks | 2.33 | 724 | – (b) |
| Gimli four blocks | 1.77 | 1 227 | – (b) |
| Salsa20 [8] eight blocks | 1.38 | – (b) | – (b) |
| ChaCha20 [8] eight blocks | 1.20 | – (b) | – (b) |
| AES-128 [8] many blocks | 0.85 | – (b) | – (b) |
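To relate the Cycles/B columns to the cost of a single permutation call, the per-byte figure can be multiplied by the state size. The sketch below is an illustration rather than part of the measurements: it assumes that Cycles/B for Gimli means cycles per byte of permutation state, and that the state is $c + r$ = 384 bits (48 bytes), as implied by the sponge parameters in the notes. The helper name is hypothetical, and the products are estimates because the table values are rounded.

```python
# Assumption: Gimli state size is c + r = 256 + 128 bits = 48 bytes.
STATE_BYTES = (256 + 128) // 8

# Gimli Cycles/B values copied from the tables above.
GIMLI_CYCLES_PER_BYTE = {
    "AVR ATmega (small)": 413,
    "AVR ATmega (fast)": 213,
    "ARM Cortex-M0": 49,
    "ARM Cortex-M3/M4": 21,
    "ARM Cortex-A8 (single block)": 8.73,
    "Intel Haswell (single block)": 4.46,
}

def cycles_per_call(cycles_per_byte, state_bytes=STATE_BYTES):
    """Approximate cycles for one permutation call (hypothetical helper)."""
    return cycles_per_byte * state_bytes

estimates = {
    platform: cycles_per_call(cpb)
    for platform, cpb in GIMLI_CYCLES_PER_BYTE.items()
}

for platform, cycles in estimates.items():
    print(f"{platform}: about {cycles:.0f} cycles per call")
```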
Version:
This is version 2017.06.26 of the Speed web page.