ECHO hash function

The ECHO hash function offers a very wide range of throughput/area tradeoffs for hardware implementations. Additionally, since the main building block is an AES round, it is particularly suited to co-exist with AES implementations.

This page gives an overview of the performance of the ECHO hash function on various types of hardware.

quick overview

                             256-bit hash            512-bit hash
FPGA                         Throughput  #Slices     Throughput  #Slices
                             (Gbps)                  (Gbps)
High speed FPGA (Virtex 6)   29.457      8,071       -           -
High speed FPGA (Virtex 5)   26.390      10,407      7.810       9,097
Low area FPGA (Virtex 5)     0.072       127         -           -

ASIC                         Throughput  #Gates      Throughput  #Gates
                             (Gbps)      (KGates)    (Gbps)      (KGates)
High speed ASIC (0.13 μm)    14.850      521.1       7.750       516.8
Low area ASIC (0.09 μm)      0.204       60.0        -           -

(-: not reported)
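To compare these design points on a single scale, one can compute throughput per unit of area. The snippet below is a minimal Python sketch using the 256-bit FPGA figures of the overview. The ratio (Mbps per slice) is an illustrative metric chosen here, not one reported by this page, and the names `fpga_256` and `mbps_per_slice` are hypothetical.

```python
# 256-bit FPGA figures from the overview: (throughput in Gbps, number of slices).
fpga_256 = {
    "High speed FPGA (Virtex 6)": (29.457, 8071),
    "High speed FPGA (Virtex 5)": (26.390, 10407),
    "Low area FPGA (Virtex 5)": (0.072, 127),
}


def mbps_per_slice(gbps: float, slices: int) -> float:
    """Throughput-to-area ratio in Mbps per slice (illustrative metric)."""
    return gbps * 1000.0 / slices


for name, (gbps, slices) in fpga_256.items():
    print(f"{name}: {mbps_per_slice(gbps, slices):.2f} Mbps/slice")
```

Slice counts are only roughly comparable across device families, so such a ratio is best read as an order-of-magnitude indicator.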


hardware implementation strategies

Because the ECHO hash function echoes the AES structure, there are basically two levels of throughput/area tradeoff:

  • the AES round level, where it is possible to use between 1 and 16 modules for the S-box step, as well as 1 to 4 modules for the MixColumns step. Each of these modules is in turn subject to a specific tradeoff.
  • the ECHO state level, where BIG.SubWords can use 1 to 32 AES round modules and BIG.Mix can use 1 to 64 MixColumns (on 32-bit inputs) modules.

Thus, hardware implementations (FPGA or ASIC) offer a very large choice of speed/area tradeoffs. In addition, it is possible to reuse the MixColumns modules of the BIG.Sub step to perform the BIG.Mix step (at the cost of extra cycles of latency). Finally, it is possible to unroll the whole compression function by pipelining the BIG.Sub+BIG.Mix step.

Classes of implementation strategies can thus be represented by a 6-tuple:

(#Sbox, #MixColumns, #AESRound, ReuseMix, CompactSbox, CompactMix)

where the number of modules (the first half of the tuple) is per compression function iteration, and where the ReuseMix flag indicates whether the MixColumns modules are shared between the BIG.Sub and BIG.Mix steps.
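As an illustration, a strategy tuple can be modelled as a small record. The sketch below is in Python; the class name `Strategy`, its field names and the `as_tuple` helper are hypothetical and only mirror the tuple notation above. Flags that a source does not report (written `?` in the tables of this page) are represented as `None`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Strategy:
    """Hypothetical record mirroring the 6-tuple notation of this page."""
    n_sbox: int                   # #Sbox modules per compression function iteration
    n_mixcolumns: int             # #MixColumns modules per iteration
    n_aes_round: int              # #AESRound, as given in the tuple
    reuse_mix: Optional[bool]     # MixColumns shared between BIG.Sub and BIG.Mix
    compact_sbox: Optional[bool]  # None when the source reports "?"
    compact_mix: Optional[bool]

    def as_tuple(self) -> str:
        """Render the strategy in the page's tuple notation."""
        def flag(value: Optional[bool]) -> str:
            return "?" if value is None else str(int(value))

        return "({}, {}, {}, {}, {}, {})".format(
            self.n_sbox, self.n_mixcolumns, self.n_aes_round,
            flag(self.reuse_mix), flag(self.compact_sbox), flag(self.compact_mix))


# Two strategies quoted on this page.
fully_unrolled = Strategy(512, 192, 2, False, True, False)
most_compact = Strategy(1, 1, 1, True, True, True)

print(fully_unrolled.as_tuple())  # (512, 192, 2, 0, 1, 0)
print(most_compact.as_tuple())    # (1, 1, 1, 1, 1, 1)
```

Keeping unknown flags as `None` rather than guessing 0 or 1 preserves the distinction the tables make between a reported value and a missing one.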

Among the wide spectrum of implementations, we find the two extremes:

detailed FPGA performances

Here is a summary of ECHO double-pipe performance on common FPGA platforms. For comparative performance figures with other candidates, take a look at the SHA-3 Zoo hardware page. The FPGA sections cover the high-speed and low-area extremes. Note that ECHO offers one of the highest throughputs and one of the most compact areas on FPGA.

                        256-bit hash                         512-bit hash
Platform           Ref.   Tput.  #Slices   Freq.   Latency     Tput.  #Slices   Freq.   Latency   Strategy
                         (Gbps)  or #LEs   (MHz)  (cycles)    (Gbps)  or #LEs   (MHz)  (cycles)
high speed
Xilinx Virtex 6    [1]   29.457    8,071   172.6         9         -        -       -         -   (512, 192, 2, 0, 1, 0)
Virtex 5           [1]   26.390   10,407   154.6         9         -        -       -         -   (512, 192, 2, 0, 1, 0)
Virtex 5           [2]   14.860    9,333    87.1         9     7.810    9,097    83.9        11   (512, 192, 2, 0, ?, 0)
Virtex 5 (core)    [4]   23.860   15,006   139.0         9         -        -       -         -   (512, 192, 2, 0, ?, 0)
Virtex 5 (core)    [4]    3.561    2,061   187.0        81         -        -       -         -   (128, 92, 2, 0, ?, 0)
Virtex 5           [5]    2.312    2,827   149.0        99         -        -       -         -   (64, 16, 1, 1, 1, 0)
Altera Cyclone II  [3]    0.397   39,091    70.6       273     0.212   39,091    70.6       341   (16, 68, 1, 0, 0, 0)
low area
Xilinx Virtex 5    [7]    0.072      127   352.0      6593         -        -       -         -   (1, 1, 1, 1, 1, 1)

(-: not reported)

Strategies:

  • (512, 192, 2, 0, 1, 0) and (512, 192, 2, 0, ?, 0): fully unrolled and parallel.
  • (128, 92, 2, 0, ?, 0): BIG.Sub: 1/4th of ECHO (16 SBox/AES, 2 rounds); BIG.Mix: 64 in a row.
  • (64, 16, 1, 1, 1, 0): BIG.Sub: 1/8th of ECHO (16 SBox/AES, 1 round); BIG.Mix: 16 cells/reuse 4.
  • (16, 68, 1, 0, 0, 0): BIG.Sub: 1/32nd of ECHO (16 SBox/AES, 1 round); BIG.Mix: 64 in a row.
  • (1, 1, 1, 1, 1, 1): BIG.Sub: 1/256th of ECHO (1 SBox/AES, 1 round); BIG.Mix: 1 MixColumns, reused.

Sources and devices:

  • [1] Optimized fully unrolled and parallel compress for ECHO. Virtex 6: xc6vlx75t-3ff784 (#Slices LUT: 25,892; #Slices Registers: 6,411). Virtex 5: xc5vlx155t-3ff1136 (#Slices LUT: 33,152; #Slices Registers: 10,870).
  • [2] Hardware Evaluation of SHA-3 Hash Function Candidate ECHO.
  • [3] Implementation and evaluation of SHA-3 candidates on FPGA.
  • [4] SHA-3: FPGA Implementation of ESSENCE and ECHO Hash Algorithm Candidates Using Bluespec. Virtex 5: xc5vlx155t. Fully unrolled design: #Slices LUT: 29,330; #Slices Registers: 4,105. (128, 92, 2, 0, ?, 0) design: #Slices LUT: 14,407; #Slices Registers: 8,800.
  • [5] Evaluation of Hardware Performance for the SHA-3 Candidates Using SASEBO-GII. Virtex 5: xc5vlx30-3ff324 (#Slices LUT: 9,885; #Slices Registers: 4,198).
  • [7] A Compact FPGA Implementation of the SHA-3 Candidate ECHO. Virtex 5: xc5vlx50-2; 127 slices +1 mem.
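The throughput, frequency and latency columns are related. The Python sketch below assumes that throughput equals the message block size times the clock frequency divided by the latency in cycles, with a 1536-bit message block for 256-bit hashes and a 1024-bit block for 512-bit hashes (the block sizes of the ECHO specification). The function name `throughput_gbps` is hypothetical, and since the reported frequencies are rounded, the recomputed values only approximate the published ones.

```python
# Message block sizes (in bits) of ECHO for the two hash sizes.
BLOCK_BITS = {256: 1536, 512: 1024}


def throughput_gbps(hash_bits: int, freq_mhz: float, latency_cycles: int) -> float:
    """Estimate throughput as block_bits * frequency / latency, in Gbps."""
    mbps = BLOCK_BITS[hash_bits] * freq_mhz / latency_cycles
    return mbps / 1000.0


# Fully unrolled Virtex 6 design: 172.6 MHz, 9 cycles per block, 256-bit hash.
print(round(throughput_gbps(256, 172.6, 9), 3))
# Fully unrolled Virtex 5 design, 512-bit hash: 83.9 MHz, 11 cycles per block.
print(round(throughput_gbps(512, 83.9, 11), 3))
```

Under these assumptions the two calls reproduce the 29.457 Gbps and 7.810 Gbps figures to within rounding. Not every row need match this closely, for instance when a published throughput includes interface overhead.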

The VHDL implementations, whenever publicly available, can be downloaded. We also provide a new implementation with very high throughput on Xilinx Virtex 5 and Virtex 6: the synthesis and mapping reports of the Xilinx ISE software are included in the source package.

detailed ASIC performances

Here is a summary of ECHO double-pipe performance on ASIC platforms. As for FPGAs, comparative studies with other candidates can be found on the SHA-3 Zoo for high-speed as well as low-cost designs.

                  256-bit hash                           512-bit hash
Platform     Ref.   Tput.    #Gates    Freq.   Latency     Tput.    #Gates    Freq.   Latency   Strategy
                   (Gbps)  (KGates)    (MHz)  (cycles)    (Gbps)  (KGates)    (MHz)  (cycles)
high speed
UMC 0.09 μm  [6]   13.966     260.0      291        32         -         -        -         -   (128, 32, 2, 0, 1, 0)
UMC 0.13 μm  [2]   14.850     521.1     87.1         9     7.750     516.8     83.3        11   (512, 192, 2, 0, ?, 0)
UMC 0.18 μm  [8]    2.246    141.49   141.84        97         -         -        -         -   (64, 32, 1, 0, ?, 0)
low area
UMC 0.09 μm  [6]    0.204      60.0  137.061      1034         -         -        -         -   (4, 65, 1, 0, 1, 0)
UMC 0.13 μm  [2]    0.373      82.8     66.6       274         -         -        -         -   (16, 68, 1, 0, ?, 0)

(-: not reported)

Strategies:

  • (128, 32, 2, 0, 1, 0): BIG.Sub: 1/4th of ECHO (16 SBox/AES, 2 rounds); BIG.Mix: 16 in a row.
  • (512, 192, 2, 0, ?, 0): fully unrolled and parallel.
  • (64, 32, 1, 0, ?, 0): BIG.Sub: 1/8th of ECHO (16 SBox/AES, 1 round); BIG.Mix: 16 in a row.
  • (4, 65, 1, 0, 1, 0): BIG.Sub: 1/128th of ECHO (4 SBox/AES, 1 round); BIG.Mix: 64 in a row.
  • (16, 68, 1, 0, ?, 0): BIG.Sub: 1/32nd of ECHO (16 SBox/AES, 1 round); BIG.Mix: 64 in a row.

Sources:

  • [2] Hardware Evaluation of SHA-3 Hash Function Candidate ECHO.
  • [6] Developing a Hardware Evaluation Method for SHA-3 Candidates.
  • [8] High-Speed Hardware Implementations of BLAKE, Blue Midnight Wish, CubeHash, ECHO, Fugue, Grøstl, Hamsi, JH, Keccak, Luffa, Shabal, SHAvite-3, SIMD and Skein.

The smallest reported implementation of ECHO requires 60.0 KGE. We claim, however, that this figure is an overestimate. Indeed, the authors of [6, "Developing a Hardware Evaluation Method for SHA-3 Candidates"] use a (4, 65, 1, 0, 1, 0) strategy, which is still far from the (1, 1, 1, 1, 1, 1) strategy used for FPGAs in [7, "A Compact FPGA Implementation of the SHA-3 Candidate ECHO"]. In addition, the most compact implementations of the AES (using a single S-box and a single MixColumns unit) have an area of 3.1 KG [9, "Design and Implementation of Low-area and Low-power AES Encryption Hardware Core"], but the authors use two S-boxes as well as a dedicated key expansion unit. Without those units, the area drops to roughly 1.77 KG. ECHO additionally requires storing the state and a (fixed) salt, a 64-bit key, a 64-bit addition unit, and some logic and counters for the FSM driving the BIG.Sub/BIG.Mix/BIG.Final units. The addition unit and the FSM logic amount to approximately 1.25 KG, and storing the state requires about 5000 bits (i.e. 30 KG, since a bit on a UMC 0.13 μm ASIC costs 6 GE; see for instance [10, "Compact Implementations of Pairings"]). Hence, it should be possible to implement ECHO in about 33 KG.
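The arithmetic behind this estimate can be checked in a few lines. The Python sketch below only adds up the figures quoted in the paragraph above; the constant names are hypothetical, and the result is an estimate, not a measured area.

```python
# Figures quoted in the paragraph above (KG = thousand gate equivalents).
AES_CORE_KG = 1.77        # compact AES without the second S-box and key expansion
ADDER_AND_FSM_KG = 1.25   # 64-bit addition unit plus FSM logic
STORAGE_BITS = 5000       # approximate number of bits to store
GE_PER_BIT = 6            # storage cost per bit on UMC 0.13 μm

storage_kg = STORAGE_BITS * GE_PER_BIT / 1000.0
total_kg = AES_CORE_KG + ADDER_AND_FSM_KG + storage_kg
print(f"storage: {storage_kg:.1f} KG, total: {total_kg:.2f} KG")
# prints "storage: 30.0 KG, total: 33.02 KG"
```

Storage dominates the estimate: roughly 30 of the 33 KG come from holding the state, so the compute units have little influence on the total.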

references