
Improved homomorphic evaluation for hash function based on TFHE

Abstract

Homomorphic evaluation of hash functions offers a solution to the challenge of data integrity authentication in the context of homomorphic encryption. The earliest attempt to achieve homomorphic evaluation of the SHA-256 hash function was proposed by Mella and Susella (in: Cryptography and coding: 14th IMA international conference, IMACC 2013. Lecture notes in computer science, vol 8308. Springer, Heidelberg, pp 28–44, 2013. https://doi.org/10.1007/978-3-642-45239-0_3) based on the BGV scheme. Unfortunately, their implementation faced significant limitations due to the exceedingly high multiplicative depth, rendering it impractical. Recently, a homomorphic implementation of SHA-256 based on the TFHE scheme (Homomorphic evaluation of SHA-256. https://github.com/zama-ai/tfhe-rs/tree/main/tfhe/examples/sha256_bool) brought it from theory to practice; however, its efficiency remains insufficient. In this paper, we revisit the homomorphic evaluation of the SHA-256 hash function in the context of TFHE, further reducing the reliance on gate bootstrapping and improving evaluation latency. Specifically, we primarily utilize ternary gates to reduce the number of gate bootstrappings required for the logic functions in message expansion and for addition modulo \(2^{32}\) in iterative compression. Furthermore, we demonstrate that our optimization techniques are applicable to the Chinese commercial cryptographic hash SM3. Finally, we give specific comparative implementations based on the TFHE-rs library. Experiments demonstrate that our optimization techniques lead to an improvement of approximately 35–50% compared with the state-of-the-art result under different cores.

Introduction

Fully homomorphic encryption (FHE) is a cryptographic technique that allows arbitrary functions to be evaluated on ciphertexts without decryption. This remarkable property makes FHE an ideal solution for addressing security concerns in various domains such as machine learning, cloud computing, medical diagnostics and financial data analysis. Since Gentry (Gentry 2009) proposed the ingenious bootstrapping technique to construct the first true fully homomorphic encryption scheme, extensive research spanning over a decade has resulted in significant advancements in both the theoretical understanding and the practical implementation of FHE. Some representative works include BGV (Brakerski et al. 2012), BFV (Brakerski 2012; Fan and Vercauteren 2012), CKKS (Cheon et al. 2017, 2018), FHEW (Ducas and Micciancio 2015), TFHE (Chillotti et al. 2020) and Final (Bonte et al. 2022).

Indeed, one of the major challenges in FHE is the significant expansion in ciphertext size, which is generally three to six orders of magnitude larger than the plaintext size. Transciphering (Naehrig et al. 2011), which combines FHE with a symmetric encryption scheme, can tackle the challenge of ciphertext size expansion, thereby mitigating the impact on communication costs between the client and the cloud. Specifically, instead of encrypting the data using a fully homomorphic encryption scheme, the client encrypts the data using a traditional symmetric encryption scheme. The encrypted data, in the form of symmetric ciphertexts, is then transmitted to the cloud. In this way, the ciphertext size expansion ratio of the data is only 1 (i.e., the ciphertext size divided by the plaintext size). Some additional operations need to be performed on the server side: the symmetric ciphertext is converted to a homomorphic ciphertext by homomorphically evaluating the decryption circuit of the symmetric encryption scheme. Once the conversion is complete, the cloud can proceed to evaluate the desired function homomorphically. Therefore, optimizing the multiplicative depth of the decryption circuit is vital for achieving efficient execution within the transciphering framework.

Homomorphic evaluation of symmetric encryption schemes, including block ciphers and stream ciphers, has garnered significant attention in recent years. As early as 2012, Gentry et al. (2012) presented a homomorphic evaluation of AES-128 encryption using the BGV scheme, obtaining an execution time of more than 4 min in the leveled mode and a latency of 18 min in the bootstrapped mode (in the updated version of their paper). Since then, optimized evaluations of AES have been developed, and a recent work (Trama et al. 2023) claimed to reduce the evaluation time of an AES block to 30 s. In addition to optimizing AES, researchers have explored the use of lightweight block ciphers to achieve lower evaluation latency. On the other hand, researchers have also investigated specialized FHE-friendly block ciphers (Albrecht et al. 2015) and stream ciphers (Dobraunig et al. 2018; Cid et al. 2022) with lower multiplicative depth and complexity.

Motivation for homomorphic evaluation of hash function

Fully homomorphic encryption in combination with symmetric encryption solves the ciphertext size expansion problem. What about hash functions? A direct application is to verify the integrity of data in a homomorphic sense. The earliest evaluation of a hash function can be traced back to Mella and Susella (2013), who presented a homomorphic evaluation of the SHA-256 hash algorithm based on the BGV scheme. However, the main challenge encountered in evaluating SHA-256 homomorphically is the extremely high multiplicative depth caused by its large number of iteration rounds, and the authors did not provide a practical implementation time. Compared with the BGV scheme, TFHE has the advantage of not being limited by circuit depth; e.g., Lou and Jiang (2019) evaluated deep neural networks by means of TFHE. Recently, Bendoukha et al. (2022) evaluated hash functions constructed from lightweight block ciphers such as PRINCE, SIMON, and LowMC using the TFHE scheme. They also proposed several intriguing application scenarios for homomorphic evaluation of hash functions, such as Homomorphic Data Integrity Check, Single Secret Leader Election, Homomorphic Database Querying and Oblivious Authenticated (Homomorphic) Calculation, which greatly encourage and highlight the need for homomorphic evaluation of hash functions. However, it is worth noting that their homomorphic evaluation of hash functions is directly derived from previous evaluations of lightweight block ciphers, and the constructed hash functions are not standardized, making them difficult to deploy in industry. In this paper, we focus on the well-studied and standardized hash algorithm SHA-256 and the Chinese commercial cryptographic hash SM3 (https://oscca.gov.cn/sca/xxgk/2010-12/17/1002389/files/302a3ada057c4a73830536d03e683110.pdf). We note that a homomorphic implementation of SHA-256 (Homomorphic evaluation 2023) based on the TFHE scheme has been proposed, but there is still significant room for optimization.

Our contributions

In this paper, we revisit the evaluation of SHA-256 in the context of TFHE homomorphic encryption and concentrate on improving the latency of SHA-256 evaluation. We first discuss modifications to the SHA-256 code to make it more friendly to the TFHE scheme. One significant improvement is the utilization of ternary gates, which effectively reduces the number of gate bootstrappings required for evaluating SHA-256. Specifically, the logic functions \(\sigma _0\), \(\sigma _1\), \(s_0\), \(s_1\) and Maj required in message expansion and iterative compression can each be evaluated with only a single bootstrapping. For the expensive addition modulo \(2^{32}\), we present a number of optimization techniques to further minimize the number of required gate bootstrappings. Moreover, we show that our optimization techniques are also applicable to the evaluation of the SM3 hash algorithm. Finally, we provide a concrete implementation based on the TFHE-rs library. Our experimental results show that our optimization techniques achieve efficiency gains of about 35–50% compared with the state-of-the-art under different CPUs.

Related works

The transciphering framework was initially proposed in Naehrig et al. (2011), and early works mainly focused on some popular symmetric ciphers, such as AES (Gentry et al. 2012), SIMON (Lepoint and Naehrig 2014), SPECK (Togan et al. 2015) and PRINCE (Doröz et al. 2016). However, their evaluation efficiency is not satisfactory due to the high multiplicative depth. Two recent works (Stracovsky et al. 2022; Trama et al. 2023) based on TFHE’s programmable bootstrapping technique greatly improve the evaluation latency of AES.

There has been significant research on designing FHE-friendly symmetric cryptographic primitives, aiming to achieve lower multiplicative complexity and depth. LowMC (Albrecht et al. 2015) is the first FHE-friendly cipher; however, it has been found to be vulnerable to algebraic attacks (Dinur et al. 2015; Dobraunig et al. 2015; Rechberger et al. 2018). In 2022, an FHE-friendly block cipher called Chaghri (Ashur et al. 2022) with lower multiplicative depth was proposed, which is 63% faster than the evaluation of AES using the BGV scheme. Another line of research focuses on FHE-friendly stream cipher designs, which allow some expensive computations to be performed offline because their encryption and decryption are simple XORs. Canteaut et al. (2016) first evaluated the Trivium algorithm from the eSTREAM project and proposed Kreyvium with a 128-bit security level. Since then, numerous FHE-friendly stream cipher designs have emerged, such as FLIP-like (Méaux et al. 2016, 2019; Hoffmann et al. 2020; Cosseron et al. 2022) and Rasta-like (Dobraunig et al. 2018; Ha et al. 2020; Hebborn and Leander 2020; Dobraunig et al. 2023; Cid et al. 2022) ciphers. Mandal and Gong (2021) studied the gate complexity of the Boolean circuits of the NIST lightweight cryptography (LWC) round 2 candidates and gave their evaluation latency based on the TFHE scheme. Moreover, Cho et al. (2021) proposed a transciphering framework for approximate homomorphic encryption, called RtF, which consists of a stream cipher over a modular domain and a transformation from BFV to CKKS. They also proposed the stream cipher HERA as a building block of the RtF framework. Ha et al. (2022) proposed the faster Rubato cipher, which is suitable for the RtF framework and has lower multiplicative depth.

The first SHA-256 evaluation based on the BGV scheme was given by Mella and Susella (2013). The required multiplicative depths for the word-sliced, packed and bit-sliced implementations are 2762.5, 3310.5 and 2634, respectively. Due to these extremely high multiplicative depths, it is not possible to give a practical implementation of SHA-256. Bendoukha et al. (2022) homomorphically evaluated hash functions constructed from “FHE-friendly” block ciphers such as PRINCE, LowMC and SIMON. In Homomorphic evaluation (2023), the authors presented a practical implementation of SHA-256 based on the TFHE scheme combined with a number of optimization techniques.

Paper organization

The paper is organized as follows. In “Preliminaries” section, we review the preliminary knowledge required for this paper, in particular, about the TFHE cryptosystem. “Specifications of SHA-256 and SM3” section gives an introduction to the NIST standard hash SHA-256 and the Chinese commercial cryptographic hash SM3. “Hash goes to homomorphic” section provides details about how to convert these two hash algorithms into efficient homomorphic computations. “Implementation and experimental results” section presents specific performance and implementation results. We conclude this paper in “Conclusion” section.

Preliminaries

Notations

Let \({\mathbb {T}} = {\mathbb {R}}/{\mathbb {Z}}\) be the real torus, i.e., the additive group of real numbers modulo 1. We will use \({\mathbb {T}}_N[X]^{k}\) to denote the set of vectors of k polynomials that have coefficients in \({\mathbb {T}}\) and are taken modulo \((X^N+1)\), where N is usually a power of 2. \({\mathbb {B}}_N[X]\) denotes the polynomials with binary coefficients modulo \(X^{N}+1\). \(<,>\) denotes the inner product. We use \(\ggg\) to denote right-rotation and \(\gg\) to denote right-shift; for example, \(x \gg n\) discards the rightmost n bits of x and then adds n zeros on the left.

Hash function

A hash function maps a message (data) of arbitrary length to a hash value of fixed length (also known as a message digest). Hash functions are widely used in cryptography, typically for signatures, encryption, message authentication codes and other forms of authentication. A hash function needs to satisfy the following security properties:

  • Collision resistance: Finding two messages with the same hash value is computationally difficult.

  • Pre-image Resistance: Given the value h, which is the output of some hash function H, finding the message m such that \(h = H(m)\) is computationally hard.

  • Second Pre-image Resistance: Given a message m and its hash value h, i.e., \(H(m)=h\), finding another message \(m'\ne m\) such that \(H(m)=H(m')\) is computationally hard.

The TFHE cryptosystem

TFHE (Chillotti et al. 2020), which builds on the FHEW scheme (Ducas and Micciancio 2015), is currently the fastest scheme to achieve bootstrapping. There are three types of ciphertexts defined in the TFHE scheme, and they play different roles in fast bootstrapping.

TFHE ciphertexts

  • TLWE: \((a, b = <a, s>+m+e) \in {\mathbb {T}}^{n+1}\), where a is uniformly sampled from \({\mathbb {T}}^{n}\), m is the encoded message, the secret key s is uniformly sampled from \({\mathbb {B}}^{n}\), and the error \(e \in {\mathbb {T}}\) is sampled from a Gaussian distribution with mean 0 and standard deviation \(\sigma\).

  • TRLWE: \((a, b = <a, s>+m+e) \in {\mathbb {T}}_N[X]^{k+1}\), where a is uniformly sampled from \({\mathbb {T}}_N[X]^{k}\), m is the encoded phase polynomial, the secret key s is uniformly sampled from \({\mathbb {B}}_N[X]^k\) and the error \(e \in {\mathbb {T}}_N[X]\) is a polynomial whose coefficients are sampled from a Gaussian distribution with mean 0 and standard deviation \(\sigma\). Generally, \(k = 1\).

  • TRGSW: \(2\ell _{PBS}\) fresh TRLWE samples. In detail, TRGSW encrypts the message \(m \in {\mathbb {B}}\) into C as follows:

    $$\begin{aligned} C = \begin{pmatrix} a_1(x) & b_1(x)\\ a_2(x) & b_2(x)\\ \vdots & \vdots \\ a_{\ell _{PBS}}(x) & b_{\ell _{PBS}}(x)\\ a_{\ell _{PBS}+1}(x) & b_{\ell _{PBS}+1}(x)\\ \vdots & \vdots \\ a_{2\ell _{PBS}}(x) & b_{2\ell _{PBS}}(x)\\ \end{pmatrix} + m \cdot \begin{pmatrix} 1/\beta _{PBS} & 0\\ 1/\beta _{PBS}^2 & 0\\ \vdots & \vdots \\ 1/\beta _{PBS}^{\ell _{PBS}} & 0\\ 0 & 1/\beta _{PBS}\\ \vdots & \vdots \\ 0 & 1/\beta _{PBS}^{\ell _{PBS}} \end{pmatrix} \end{aligned}$$

    where \((a_i(x), b_i(x)), \text {for}\ 1 \le i \le 2\ell _{PBS}\), are TRLWE ciphertexts encrypting 0 under the same secret key, \(\beta _{PBS}\) denotes the basis of the gadget decomposition and \(\ell _{PBS}\) is the length of the gadget decomposition.
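As a rough illustration of the TLWE item in the list above, the following sketch encrypts one bit and recovers it from the phase \(b - <a, s>\). It is only a toy model under stated assumptions: the torus is discretized to integers modulo \(2^{32}\), the bit is encoded as \(\pm 1/8\) (the encoding used for gate bootstrapping), and the dimension and noise width are illustrative values, not parameters from this paper or from TFHE-rs.

```python
import random

Q = 1 << 32        # torus discretized as integers modulo 2^32 (assumption)
N_LWE = 16         # toy dimension, far too small for real security
SIGMA = 2 ** -20   # toy noise standard deviation, relative to the torus


def keygen(rng):
    """Binary secret key s, uniformly sampled from B^n."""
    return [rng.randrange(2) for _ in range(N_LWE)]


def encode(bit):
    """Encode 0 as -1/8 and 1 as +1/8 on the discretized torus."""
    return Q // 8 if bit else Q - Q // 8


def encrypt(bit, s, rng):
    """TLWE sample (a, b) with b = <a, s> + m + e."""
    a = [rng.randrange(Q) for _ in range(N_LWE)]
    e = round(rng.gauss(0.0, SIGMA) * Q)
    b = (sum(ai * si for ai, si in zip(a, s)) + encode(bit) + e) % Q
    return a, b


def phase(ct, s):
    """Phase b - <a, s>, which equals m + e modulo 1."""
    a, b = ct
    return (b - sum(ai * si for ai, si in zip(a, s))) % Q


def decrypt(ct, s):
    """The bit is 1 when the phase lies in the positive half of the torus."""
    return 1 if phase(ct, s) < Q // 2 else 0
```

Decryption succeeds as long as the noise stays well below the \(1/8\) margin, which is why the accumulated noise has to be refreshed by bootstrapping.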

Remark

In TFHE’s bootstrapping, the TLWE ciphertext is the input to be bootstrapped, TRLWE is the ciphertext that encodes the test polynomial and will be used as intermediate ciphertext in the bootstrapping. Each part of the TLWE secret key would be encrypted to be TRGSW ciphertext as bootstrapping key, which can be precomputed.

TFHE bootstrapping

Bootstrapping allows refreshing ciphertext with large noise to support further homomorphic computation. The most important feature of the TFHE scheme is the efficient bootstrapping, which consists of three core algorithms: blind rotation, sample extraction and key switching, as shown in Algorithm 2.

Key Switching

Two kinds of Key Switching are proposed by Chillotti et al. (2020). The first one is Public Functional KeySwitching, which allows packing TLWE samples into a TRLWE sample or switching the secret key. It can also evaluate a public linear function f on the input TLWE samples. The second one is Private Functional KeySwitching, which can evaluate a private linear function on the input TLWE samples by encoding the secret f into the KeySwitching key.

Blind Rotation

Blind rotation, as the name implies, rotates a polynomial encrypted as TRLWE ciphertext by an encrypted index, which is the core operation in bootstrapping. In fact, the blind rotation is mainly constructed by successive external products. Algorithm 1 presents the detailed blind rotation operation.

Algorithm 1 BlindRotation (Chillotti et al. 2020)

Sample Extraction

This operation can extract the TLWE ciphertext encrypting any \(m_i\) from the TRLWE ciphertext encrypting the message \(m(x) \in {\mathbb {T}}_N[X]\). For example, SampleExtract\(_0(a(x),b(x))\) is \((a_0, -a_{N-1}, \cdots , -a_1, b_0) \in {\mathbb {T}}^{N+1}\), which encrypts \(m_0\). This can be simply proved by the decryption of TRLWE.
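The claim can be checked numerically with a small sketch. The code below is an illustration under stated assumptions: the torus is discretized to integers modulo \(2^{32}\), \(k = 1\), the ring degree is a toy value, and noise is omitted so that the extracted phase equals \(m_0\) exactly.

```python
import random

Q = 1 << 32   # discretized torus (assumption)
N = 8         # toy ring degree


def negacyclic_mul(a, s):
    """Product a(X) * s(X) modulo X^N + 1 and modulo Q."""
    out = [0] * N
    for i, ai in enumerate(a):
        for j, sj in enumerate(s):
            k = i + j
            if k < N:
                out[k] = (out[k] + ai * sj) % Q
            else:  # X^N = -1, so wrapped terms change sign
                out[k - N] = (out[k - N] - ai * sj) % Q
    return out


def trlwe_encrypt(m, s, rng):
    """Noiseless toy TRLWE sample (a, b = a*s + m) with k = 1."""
    a = [rng.randrange(Q) for _ in range(N)]
    a_times_s = negacyclic_mul(a, s)
    b = [(x + mi) % Q for x, mi in zip(a_times_s, m)]
    return a, b


def sample_extract_0(a, b):
    """TLWE sample (a_0, -a_{N-1}, ..., -a_1, b_0) encrypting m_0."""
    a_vec = [a[0]] + [(-a[N - j]) % Q for j in range(1, N)]
    return a_vec, b[0]


def tlwe_phase(a_vec, b, s):
    """Phase b - <a, s>, with s read as the coefficient vector of the TRLWE key."""
    return (b - sum(x * y for x, y in zip(a_vec, s))) % Q
```

The sign flips in the extracted vector come directly from the reduction modulo \(X^N+1\): the constant coefficient of \(a(x)s(x)\) is \(a_0s_0 - \sum _{j \ge 1} a_{N-j}s_j\).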

Algorithm 2 TFHE’s Gate Bootstrapping (Chillotti et al. 2020)

Specifications of SHA-256 and SM3

SHA-256 (Science 2012) is a hash function developed by the NSA and published by NIST in 2001, while SM3 (https://oscca.gov.cn/sca/xxgk/2010-12/17/1002389/files/302a3ada057c4a73830536d03e683110.pdf) is a Chinese commercial cryptographic hash algorithm standard published by the Chinese National Cryptography Administration in 2010. Both follow the Merkle-Damgård construction, processing the input message in 512-bit blocks and returning a 256-bit hash value. The hash functions SHA-256 and SM3 operate on 32-bit variables, combining NOT, XOR, OR, AND, rotation and addition modulo \(2^{32}\).

Message padding

Assume that the message m has a length of \(\ell\) bits. First append the bit “1” to the end of the message, followed by k zeros, where k is the smallest non-negative integer such that \(\ell + k + 1 \equiv 448 \pmod {512}\). Then append a 64-bit string equal to the binary expansion of \(\ell\). The bit length of the padded message M is a multiple of 512.
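The padding rule can be sketched in a few lines. This is a minimal illustration that assumes byte-aligned messages (so the “1” bit and the first seven zeros form the byte 0x80) and a big-endian length field, as in SHA-256 and SM3; the function name is hypothetical.

```python
def pad_message(message: bytes) -> bytes:
    """Append '1', k zero bits and the 64-bit big-endian bit length."""
    bit_len = 8 * len(message)
    padded = message + b"\x80"                      # bit '1' followed by seven zeros
    padded += b"\x00" * ((56 - len(padded)) % 64)   # zeros up to 448 bits modulo 512
    return padded + bit_len.to_bytes(8, "big")
```

For example, a 3-byte message is padded to a single 64-byte block, while a 56-byte message needs two blocks because the length field no longer fits in the first one.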

Recall on SHA-256 hash function

Some useful logical functions

These useful functions will be used in the message schedule and iterative compression function.

$$\begin{aligned} \begin{aligned} \sigma _0(x)&= (x \ggg 7) \oplus (x \ggg 18) \oplus (x \gg 3) \\ \sigma _1(x)&= (x \ggg 17) \oplus (x \ggg 19) \oplus (x \gg 10) \\ \textit{Ch}(x, y, z)&=(x \wedge y) \oplus (\lnot x \wedge z) \\ \textit{Maj}(x, y, z)&=(x \wedge y) \oplus (x \wedge z) \oplus (y \wedge z) \\ s_0(x)&= (x \ggg 2) \oplus (x \ggg 13) \oplus (x \ggg 22) \\ s_1(x)&= (x \ggg 6) \oplus (x \ggg 11) \oplus (x \ggg 25) \\ \end{aligned} \end{aligned}$$
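For reference, these functions can be written on plain 32-bit words as follows. This is a sketch following FIPS 180-4, where \(s_0\) and \(s_1\) use rotations only while \(\sigma _0\) and \(\sigma _1\) end with a right shift; the Python function names are illustrative.

```python
MASK = 0xFFFFFFFF


def rotr(x, n):
    """Right-rotate a 32-bit word by n positions."""
    return ((x >> n) | (x << (32 - n))) & MASK


def shr(x, n):
    """Right-shift a 32-bit word by n positions, filling with zeros."""
    return x >> n


def sigma0(x):
    return rotr(x, 7) ^ rotr(x, 18) ^ shr(x, 3)


def sigma1(x):
    return rotr(x, 17) ^ rotr(x, 19) ^ shr(x, 10)


def ch(x, y, z):
    return (x & y) ^ (~x & z & MASK)


def maj(x, y, z):
    return (x & y) ^ (x & z) ^ (y & z)


def s0(x):
    return rotr(x, 2) ^ rotr(x, 13) ^ rotr(x, 22)


def s1(x):
    return rotr(x, 6) ^ rotr(x, 11) ^ rotr(x, 25)
```

All six functions act independently on each bit position once the rotations and shifts are resolved, which is what makes them cheap under bit-wise encryption.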

SHA-256 hash computation

Then, each message block \(M^{1}, M^{2}, \ldots , M^{N}\) would be processed using the following four loop steps, for i from 1 to N:

(1)Message schedule:

$$\begin{aligned} w_t = {\left\{ \begin{array}{ll} M^i_{t}, \text {\ if} \ 0 \le t \le 15\\ \sigma _1(w_{t-2})+w_{t-7}+\sigma _0(w_{t-15}) + w_{t-16}, \text {\ if } \ 16 \le t \le 63\ \end{array}\right. } \end{aligned}$$

(2)Initialization:

$$\begin{aligned} a= H^{(i-1)}_0, b= H^{(i-1)}_1, c= H^{(i-1)}_2, d= H^{(i-1)}_3 \\ e= H^{(i-1)}_4, f= H^{(i-1)}_5, g= H^{(i-1)}_6, h= H^{(i-1)}_7 \end{aligned}$$

(3)Iterative compression:

for \(t=0 \text { to } 63:\)

\(\begin{aligned} T_1&= h + s_1(e) + Ch(e,f,g) + K_t +w_t, T_2 = s_0(a) + Maj(a,b,c) \\ h&= g, g = f, f = e, e = d + T_1, d = c, c = b, b = a, a = T_1 + T_2 \end{aligned}\)

(4) Compute the \(i\)-th intermediate hash value \(H^{(i)}\):

$$\begin{aligned} H_{0}^{(i)}&=a+H_{0}^{(i-1)}, H_{1}^{(i)}=b+H_{1}^{(i-1)}, H_{2}^{(i)}=c+H_{2}^{(i-1)}, H_{3}^{(i)}=d+H_{3}^{(i-1)} \\ H_{4}^{(i)}&=e+H_{4}^{(i-1)}, H_{5}^{(i)}=f+H_{5}^{(i-1)}, H_{6}^{(i)}=g+H_{6}^{(i-1)}, H_{7}^{(i)}=h+H_{7}^{(i-1)} \end{aligned}$$

After repeating steps one through four a total of N times, the resulting 256-bit message digest is \((H^{(N)}_0 || H^{(N)}_1 || H^{(N)}_2 || H^{(N)}_3 || H^{(N)}_4 || H^{(N)}_5 || H^{(N)}_6 || H^{(N)}_7)\). Figure 1 illustrates the state update step of SHA-256.

Fig. 1 The state update function of SHA-256
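The four steps can be put together in a compact plaintext reference. The sketch below assumes the constants of FIPS 180-4, which are not listed in this section: the initial value \(H^{(0)}\) and the round constants \(K_t\) are derived from the square and cube roots of the first primes using integer arithmetic. The helper names are illustrative.

```python
from math import isqrt

MASK = 0xFFFFFFFF


def icbrt(n):
    """Integer cube root by binary search."""
    lo, hi = 0, 1 << ((n.bit_length() + 2) // 3 + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo


def first_primes(count):
    primes, n = [], 2
    while len(primes) < count:
        if all(n % p for p in primes):
            primes.append(n)
        n += 1
    return primes


PRIMES = first_primes(64)
# Fractional parts of the square roots of the first 8 primes (FIPS 180-4).
H0 = [isqrt(p << 64) & MASK for p in PRIMES[:8]]
# Fractional parts of the cube roots of the first 64 primes (FIPS 180-4).
K = [icbrt(p << 96) & MASK for p in PRIMES]


def rotr(x, n):
    return ((x >> n) | (x << (32 - n))) & MASK


def compress(state, block):
    # (1) Message schedule
    w = [int.from_bytes(block[4 * t:4 * t + 4], "big") for t in range(16)]
    for t in range(16, 64):
        sig0 = rotr(w[t - 15], 7) ^ rotr(w[t - 15], 18) ^ (w[t - 15] >> 3)
        sig1 = rotr(w[t - 2], 17) ^ rotr(w[t - 2], 19) ^ (w[t - 2] >> 10)
        w.append((sig1 + w[t - 7] + sig0 + w[t - 16]) & MASK)
    # (2) Initialization
    a, b, c, d, e, f, g, h = state
    # (3) Iterative compression
    for t in range(64):
        big_s1 = rotr(e, 6) ^ rotr(e, 11) ^ rotr(e, 25)
        ch = (e & f) ^ (~e & g)
        t1 = (h + big_s1 + ch + K[t] + w[t]) & MASK
        big_s0 = rotr(a, 2) ^ rotr(a, 13) ^ rotr(a, 22)
        maj = (a & b) ^ (a & c) ^ (b & c)
        t2 = (big_s0 + maj) & MASK
        h, g, f, e, d, c, b, a = g, f, e, (d + t1) & MASK, c, b, a, (t1 + t2) & MASK
    # (4) Intermediate hash value
    return [(x + y) & MASK for x, y in zip(state, (a, b, c, d, e, f, g, h))]


def sha256(message):
    bit_len = 8 * len(message)
    padded = message + b"\x80"
    padded += b"\x00" * ((56 - len(padded)) % 64) + bit_len.to_bytes(8, "big")
    state = H0
    for i in range(0, len(padded), 64):
        state = compress(state, padded[i:i + 64])
    return b"".join(x.to_bytes(4, "big") for x in state).hex()
```

Each round contains seven additions modulo \(2^{32}\) (four in \(T_1\), one in \(T_2\), and two for the new a and e), which is why the adder dominates the homomorphic cost.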

Recall on SM3 hash function

SM3 consists of two parts: message expansion and state update transformation. Below, we describe these two parts. The auxiliary functions \(P_{0}\) and \(P_{1}\), which operate on 32-bit words, are defined as follows:

$$\begin{aligned} P_{0}(X)&= X \oplus (X \lll 9) \oplus (X \lll 17)\\ P_{1}(X)&= X \oplus (X \lll 15) \oplus (X \lll 23) \end{aligned}$$

Message expansion

The input here is the 512-bit message block, split into 16 32-bit words \(W_0, \ldots , W_{15}\), which are then expanded to 68 32-bit words \(W_i\):

$$\begin{aligned} W_i = P_1(W_{i-16} \oplus W_{i-9} \oplus (W_{i-3} \lll 15)) \oplus (W_{i-13} \lll 7) \oplus W_{i-6} \end{aligned}$$

for \(16 \le i < 68\) and 64 expanded words \(W_i^{'} = W_{i} \oplus W_{i+4}, \text {\ for } 0 \le i < 64\).

State update transformation

In SM3, the state update transformation starts with fixed initial values of eight 32-bit words and updates them in 64 rounds. Let \(A, B, C, D, E, F, G\ \text {and}\ H\) denote the inner state registers; the j-th round transformation is given by

$$\begin{aligned} SS1&= ((A \lll 12) + E + (T_j \lll j)) \lll 7, \\ SS2&= SS1 \oplus (A \lll 12), \\ TT1&= FF_j(A,B,C)+ D+ SS2+ W_j^{'}, \\ TT2&= GG_j(E,F,G)+ H+ SS1+ W_j, \\ D&= C, C = B \lll 9 , B = A, A = TT1, \\ H&= G, G = F \lll 19, F = E, E = P_0(TT2), \end{aligned}$$

where the bitwise boolean functions \(FF_{j}\) and \(GG_{j}\) are defined by

$$\begin{aligned} FF_{j}(X,Y,Z)&= {\left\{ \begin{array}{ll} X \oplus Y \oplus Z & \text {if } 0 \le j \le 15,\\ (X\wedge Y)\vee (Y\wedge Z)\vee (X\wedge Z) & \text {if } 16 \le j< 64. \end{array}\right. } \\ GG_{j}(X,Y,Z)&= {\left\{ \begin{array}{ll} X \oplus Y \oplus Z & \text {if } 0 \le j \le 15,\\ (X\wedge Y)\vee (\lnot X\wedge Z) & \text {if } 16 \le j < 64.\\ \end{array}\right. } \end{aligned}$$

Note that \(T_j = 0x79cc4519\) for \(0 \le j \le 15\) and \(T_j = 0x7a879d8a\) for \(16 \le j \le 63\). After the last step of the state update transformation, the initial values are combined with the output values of the last step (by bitwise XOR in SM3, rather than the modular addition used in SHA-256). The result is the final hash value or the initial value for the next message block, as in SHA-256.
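A plaintext reference for SM3 can be sketched in the same way. The code below makes several assumptions taken from the SM3 standard rather than from this section: the initial value IV, the reduction of the rotation amount of \(T_j\) modulo 32, the final feed-forward by bitwise XOR, and the same message padding as SHA-256.

```python
MASK = 0xFFFFFFFF
# Initial value from the SM3 standard (not listed in this section).
IV = [0x7380166F, 0x4914B2B9, 0x172442D7, 0xDA8A0600,
      0xA96F30BC, 0x163138AA, 0xE38DEE4D, 0xB0FB0E4E]


def rotl(x, n):
    """Left-rotate a 32-bit word; the amount is taken modulo 32."""
    n %= 32
    return ((x << n) | (x >> (32 - n))) & MASK


def p0(x):
    return x ^ rotl(x, 9) ^ rotl(x, 17)


def p1(x):
    return x ^ rotl(x, 15) ^ rotl(x, 23)


def expand(block):
    """Message expansion: 68 words W_i and 64 words W'_i."""
    w = [int.from_bytes(block[4 * i:4 * i + 4], "big") for i in range(16)]
    for i in range(16, 68):
        w.append(p1(w[i - 16] ^ w[i - 9] ^ rotl(w[i - 3], 15))
                 ^ rotl(w[i - 13], 7) ^ w[i - 6])
    w_prime = [w[i] ^ w[i + 4] for i in range(64)]
    return w, w_prime


def compress(v, block):
    """State update transformation on one 512-bit block."""
    w, w_prime = expand(block)
    a, b, c, d, e, f, g, h = v
    for j in range(64):
        t_j = 0x79CC4519 if j < 16 else 0x7A879D8A
        ss1 = rotl((rotl(a, 12) + e + rotl(t_j, j)) & MASK, 7)
        ss2 = ss1 ^ rotl(a, 12)
        if j < 16:
            ff, gg = a ^ b ^ c, e ^ f ^ g
        else:
            ff = (a & b) | (a & c) | (b & c)
            gg = (e & f) | (~e & g)
        tt1 = (ff + d + ss2 + w_prime[j]) & MASK
        tt2 = (gg + h + ss1 + w[j]) & MASK
        d, c, b, a = c, rotl(b, 9), a, tt1
        h, g, f, e = g, rotl(f, 19), e, p0(tt2)
    # Feed-forward: XOR with the previous state (SM3 standard).
    return [x ^ y for x, y in zip(v, (a, b, c, d, e, f, g, h))]


def sm3(message):
    bit_len = 8 * len(message)
    padded = message + b"\x80"
    padded += b"\x00" * ((56 - len(padded)) % 64) + bit_len.to_bytes(8, "big")
    state = IV
    for i in range(0, len(padded), 64):
        state = compress(state, padded[i:i + 64])
    return b"".join(x.to_bytes(4, "big") for x in state).hex()
```

Compared with SHA-256, the message expansion and the feed-forward are pure XOR networks, and only \(SS1\), \(TT1\) and \(TT2\) require additions modulo \(2^{32}\).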

Hash goes to homomorphic

Indeed, when designing hash functions, it is crucial to ensure efficient computation on software platforms. As shown in “Specifications of SHA-256 and SM3” section, the core computation units of hash functions typically involve basic instructions such as AND, OR, NOT, and ROTATION. The TFHE scheme offers efficient gate bootstrapping, so the evaluation of functions built from gates is more flexible with this scheme and, unlike with the BGV or BFV scheme, is not limited by the circuit depth. Therefore we will present the homomorphic computation of SHA-256 and SM3 by means of TFHE.

It is important to highlight that gate bootstrapping is computationally demanding when gates are used as the basic computational unit in the encrypted domain. To improve the overall computational performance, minimizing the number of gates consumed by the circuit becomes a crucial consideration. In particular, in SHA-256 and SM3, the basic operation mainly consists of functions composed of logic gates and addition of modulo \(2^{32}\). In the following, we present our circuit optimization.

A short reminder of gate bootstrapping.

For ease of representation, in gate bootstrapping, binary messages 0 and 1 are encoded as \(-1/8\) and 1/8 over the torus, respectively. Now assume two TLWE ciphertexts \(c_1\) and \(c_2\), then some basic homomorphic gate operations are as follows:

  • HomNOT(c) = \((\textbf{0}, 0) - c\) (no bootstrapping);

  • HomAND(\(c_1, c_2\)) = Bootstrap(\((\textbf{0}, -1/8) + c_1 + c_2\));

  • HomXOR(\(c_1, c_2\)) = Bootstrap(\((\textbf{0},1/4)+ 2(c_1 \pm c_2)\));

  • HomOR(\(c_1, c_2\)) = Bootstrap(\((\textbf{0}, 1/8) + c_1+ c_2\));

  • HomMUX\((c,d_{0},d_{1})\) can evaluate \(c?d_{1}:d_{0}=(c \wedge d_{1})\oplus ((1-c)\wedge d_{0})\) using two gate bootstrappings and a public key switching.
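The phase arithmetic behind these gates can be checked with a noiseless plaintext model. The sketch below is an assumption-laden illustration, not an encrypted implementation: ciphertexts are replaced by their exact phases, Bootstrap is modeled as a sign test on the torus, the constant offsets are applied to the phase before bootstrapping, NOT is plain negation under the \(\pm 1/8\) encoding, and the multiplexer adds the two bootstrapped AND results with an offset of \(1/8\), following the construction in the original TFHE library.

```python
from fractions import Fraction

EIGHTH = Fraction(1, 8)


def encode(bit):
    """Encode 0 as -1/8 and 1 as +1/8."""
    return EIGHTH if bit else -EIGHTH


def decode(phase):
    """Read a bit from a phase: 1 on the positive half of the torus."""
    return 1 if 0 < phase % 1 < Fraction(1, 2) else 0


def bootstrap(phase):
    """Noiseless model of gate bootstrapping: return a fresh +-1/8 phase."""
    return encode(decode(phase))


def hom_not(c):
    return -c                                   # no bootstrapping


def hom_and(c1, c2):
    return bootstrap(-EIGHTH + c1 + c2)


def hom_or(c1, c2):
    return bootstrap(EIGHTH + c1 + c2)


def hom_xor(c1, c2):
    return bootstrap(Fraction(1, 4) + 2 * (c1 + c2))


def hom_mux(c, d0, d1):
    """c ? d1 : d0 with two bootstrappings; the final sum needs none."""
    return hom_and(c, d1) + hom_and(hom_not(c), d0) + EIGHTH
```

In this model every binary gate costs exactly one call to `bootstrap`, which is the quantity the optimizations in this section try to minimize.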

Trivial gate reduction in SHA-256

In Homomorphic evaluation (2023), the authors proposed optimizations that reduce the number of logic gates in the Ch and Maj functions of the SHA-256 algorithm, thereby reducing the number of gate bootstrappings required. Specifically, for the function \(Ch(x, y, z) = (x \wedge y) \oplus (\lnot x \wedge z)\), it is easy to see that the result is y when \(x=1\) and z when \(x=0\), so Ch behaves like a bitwise multiplexer. In this way, we can replace the 4 gates in the Ch function with a single HomMUX gate in the encrypted domain. Thanks to the Boolean distributive law

$$\begin{aligned} (x \wedge y) \oplus (x \wedge z) = x \wedge (y \oplus z), \end{aligned}$$

the function

$$\begin{aligned} Maj(x,y,z) =(x \wedge y) \oplus (x \wedge z) \oplus (y \wedge z) \end{aligned}$$

can be simplified as

$$\begin{aligned} (x \wedge (y \oplus z)) \oplus (y \wedge z). \end{aligned}$$

As a result, the number of gates required by Maj can be reduced from 5 to 4. While these optimizations do improve the overall evaluation efficiency of the SHA-256 hash, they are still not sufficient for achieving optimal efficiency within the TFHE scheme.
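Both rewrites are easy to confirm exhaustively on single bits. The following check is a small sketch with illustrative function names.

```python
from itertools import product


def ch(x, y, z):
    return (x & y) ^ ((1 - x) & z)


def mux(x, y, z):
    """Select y when x = 1 and z when x = 0."""
    return y if x else z


def maj(x, y, z):
    return (x & y) ^ (x & z) ^ (y & z)       # 5 gates


def maj_reduced(x, y, z):
    return (x & (y ^ z)) ^ (y & z)           # 4 gates


def equivalent(f, g):
    """Compare two 3-input Boolean functions on all 8 inputs."""
    return all(f(*bits) == g(*bits) for bits in product((0, 1), repeat=3))
```

Here `equivalent(ch, mux)` and `equivalent(maj, maj_reduced)` both hold, so the gate savings do not change the computed function.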

Further gate reduction of function in SHA-256

In this subsection, we further reduce the number of gates needed to evaluate the SHA-256 in the encrypted domain. We observe that the \(\sigma _0, \sigma _1, s_0 \text { and } s_1\) functions involve different rotations or shifts of 32-bit word, followed by two consecutive XOR operations. The rotation and shift operations are now free due to bit-wise encryption, and next we will explain how to implement the XOR between the 3 inputs using one gate bootstrapping (i.e., one blind rotation). Moreover, the Maj function can also be implemented with only one gate bootstrapping.

Ternary gates are introduced into the TFHE scheme in Matsuoka et al. (2021), containing XOR3 and 2OF3 gates, where XOR3 is the XOR of 3 inputs, and the 2OF3 gate outputs true if at least two inputs are true.

The implementation of the ternary gates in the encrypted domain is as follows:

$$\begin{aligned} {\textbf {HomXOR3}}(a,b,c)&: \text {Bootstrap}(-2(a+b+c)), \\ {\textbf {Hom2OF3}}(a,b,c)&: \text {Bootstrap}(a+b+c). \end{aligned}$$

Now we give a high-level explanation for their correctness. Note that the test (negacyclic) polynomial in the gate bootstrapping is set to:

$$\begin{aligned} f(X) = \frac{1}{8}+ \frac{1}{8}X+ \cdots + \frac{1}{8}X^{N-1}. \end{aligned}$$

From another point of view, for the \(\textit{XOR3}\) function, the result is equal to the least significant bit of the sum of the 3 inputs. As we show in Fig. 2, when the three plain inputs are 0||0||0 or 1||1||0 (independent of the order), i.e., their encoded phase sum is \(-\frac{3}{8}\) or \(\frac{1}{8}\), the desired result is 0, i.e., \(-\frac{1}{8}\) on the torus; and when the inputs are 1||0||0 or 1||1||1 (independent of the order), i.e., the phase sum is \(-\frac{1}{8}\) or \(\frac{3}{8}\), the desired result is 1, i.e., \(\frac{1}{8}\) on the torus. Therefore, to match the test polynomial, we simply multiply the sum by \(-2\) so that the phases fall into two separate halves of the torus. For \(\textit{2OF3}\), the result is the most significant bit of the sum of the three inputs, which exactly matches the settings of the test polynomial.

Fig. 2 The mapping relationship required for the sum of the three inputs, for XOR3 and 2OF3 respectively

In this way, the \(\sigma _0, \sigma _1, s_0, s_1 \text { and } Maj\) functions can be computed homomorphically by just one expensive blind rotation, while the Ch function needs to be implemented in the encrypted domain using HomMUX at the cost of about two blind rotations. One thing that must be noted is that the ternary gate requires the sum of 3 inputs, and it is better to use larger parameters in order not to affect the correctness of the decryption. In the experiment, we show that the parameter sets satisfy this requirement.
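The phase argument for the two ternary gates can be verified on all eight input combinations. The sketch below is a noiseless plaintext model in which ciphertexts are replaced by exact phases and Bootstrap is a sign test; it illustrates the mapping and is not an encrypted implementation.

```python
from fractions import Fraction
from itertools import product

EIGHTH = Fraction(1, 8)


def encode(bit):
    """Encode 0 as -1/8 and 1 as +1/8."""
    return EIGHTH if bit else -EIGHTH


def bootstrap(phase):
    """Noiseless model: +1/8 on the positive half of the torus, else -1/8."""
    return EIGHTH if 0 < phase % 1 < Fraction(1, 2) else -EIGHTH


def hom_xor3(a, b, c):
    return bootstrap(-2 * (a + b + c))


def hom_2of3(a, b, c):
    return bootstrap(a + b + c)


def truth_table(gate):
    """Decoded gate output for every 3-bit input."""
    return {bits: int(gate(*map(encode, bits)) == EIGHTH)
            for bits in product((0, 1), repeat=3)}
```

With zero, one, two and three inputs set, the phase sums are \(-3/8, -1/8, 1/8, 3/8\); multiplying by \(-2\) maps them to \(-1/4, 1/4, -1/4, 1/4\) modulo 1, which is exactly the parity pattern.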

Addition of modulo \(2^{32}\)

In addition to the logical functions, arithmetic addition modulo \(2^{32}\) is also widely used in SHA-256, and it is the most time-consuming operation. Integer arithmetic can be directly implemented in second-generation FHE schemes such as BGV and BFV, but bootstrapping in these schemes is currently inefficient, which makes them ill-suited to deep circuits. As mentioned in the previous section, we choose the efficient TFHE scheme to implement the hash function homomorphically. A natural question is how to efficiently evaluate the required homomorphic addition modulo \(2^{32}\) via TFHE.

For the addition of two \(n\)-bit integers, a naive method is to use a Ripple Carry Adder (RCA), which is constructed by cascading multiple full adder gates, as illustrated in Fig. 3. An \(n\)-bit adder requires n full adder gates. The output of the full adder can be obtained by the following equation:

$$\begin{aligned} {\left\{ \begin{array}{ll} C_{i+1} & = (a_i \wedge b_i) \vee (C_i \wedge (a_i \oplus b_i)) \ \ \ \ \ \ (*) \\ & = (a_i \wedge b_i) \oplus (C_i \wedge (a_i \oplus b_i)) \ \ \ \ \ \ (**) \\ S_i &= a_i \oplus b_i \oplus C_i \end{array}\right. } \end{aligned}$$

Fig. 3 Bitwise addition of two \(n\)-bit numbers a and b, where \(a_i, b_i, c_i, s_i\) are the i-th bits of a, b, the carry and the result, respectively. Due to the reduction modulo \(2^{32}\), the last carry bit \(c_{n}\) is discarded; in other words, we do not need to compute it

Indeed, \(C_{i+1} = 2\text {OF}3(a_i,b_i, C_i)\) and \(S_i= \text {XOR}3(a_i, b_i, C_i)\). Klemsa and Önen (2022) also apply this to integer addition. Therefore, by utilizing ternary gates, we only need \(32 \times 2-1=63\) instead of \(32 \times 5-3 = 157\) gate bootstrappings to evaluate addition modulo \(2^{32}\).
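The gate count can be reproduced with a plaintext ripple carry adder that tallies one bootstrapping per ternary gate. This is a sketch on clear bits; the `GateCounter` class and helper names are hypothetical and only model the cost, not the encryption.

```python
class GateCounter:
    """Plaintext ternary gates that count one bootstrapping per call."""

    def __init__(self):
        self.bootstraps = 0

    def xor3(self, a, b, c):
        self.bootstraps += 1
        return a ^ b ^ c

    def two_of_three(self, a, b, c):
        self.bootstraps += 1
        return (a & b) | (a & c) | (b & c)


def to_bits(x, n=32):
    """Little-endian list of the n low bits of x."""
    return [(x >> i) & 1 for i in range(n)]


def from_bits(bits):
    return sum(bit << i for i, bit in enumerate(bits))


def add_mod_2n(a_bits, b_bits, gates):
    """Ripple carry addition modulo 2^n using XOR3 and 2OF3 only."""
    n = len(a_bits)
    carry, out = 0, []
    for i in range(n):
        out.append(gates.xor3(a_bits[i], b_bits[i], carry))
        if i < n - 1:  # the final carry is discarded modulo 2^n
            carry = gates.two_of_three(a_bits[i], b_bits[i], carry)
    return out
```

For 32-bit inputs the counter reaches 32 XOR3 plus 31 2OF3 evaluations, matching the figure of 63 above; the carries are still computed sequentially, so the critical path remains linear in the word size.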

Optimization of sequential addition

Note that we have

$$\begin{aligned} w_t = \sigma _1(w_{t-2})+w_{t-7}+\sigma _0(w_{t-15}) + w_{t-16}, \text {\ if } \ 16 \le t \le 63 \end{aligned}$$

in the message schedule and \(T_1 = h + s_1(e) + Ch(e,f,g) + K_t +w_t\) in the iterative compression. These two functions involve successive addition operations, which can be optimized using the Carry Save Adder (CSA).

A CSA has a very small carry propagation delay when adding multiple numbers. The idea behind it is to reduce the sum of three inputs to the sum of two inputs: the carry C and the sum S are computed separately for each bit, without propagating carries between bit positions, which makes it faster.

It is interesting to note that the carry save adder can be constructed by the full adder, so the optimizations we introduced previously for full adder can be extended to CSA as well.
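A word-level sketch of this idea is given below. It works on plain 32-bit integers with illustrative function names: each carry save stage is a bitwise XOR3 for the sum word and a bitwise 2OF3 shifted left by one position for the carry word, and only the last two operands go through a carry-propagating addition.

```python
MASK = 0xFFFFFFFF


def carry_save(x, y, z):
    """Reduce three words to a (sum, carry) pair with x + y + z = s + c mod 2^32."""
    s = x ^ y ^ z                                     # bitwise XOR3
    c = (((x & y) | (x & z) | (y & z)) << 1) & MASK   # bitwise 2OF3, shifted left
    return s, c


def add_many(words):
    """Add several words modulo 2^32 with CSA stages and one final addition."""
    words = list(words)
    while len(words) > 2:
        s, c = carry_save(*words[:3])
        words = words[3:] + [s, c]
    return sum(words) & MASK
```

In the encrypted setting, all bit positions of one CSA stage are independent, so a stage has a depth of a single ternary gate regardless of the word size.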

Parallel implementation

The disadvantage of the RCA is that the carry-in bit of each full adder is derived from the carry-out bit of the previous cascaded full adder, so the critical path of the adder circuit grows with the bit length of the input. The Carry LookAhead Adder (CLA) reduces the depth of the critical path through parallel computation. The CLA computes one or more carry bits before the sum, which reduces the waiting time for the carry bits, so it seems to be very friendly to the BGV scheme. Mella and Susella (2013) first used a CLA for the homomorphic computation of SHA-256 based on the BGV scheme; for 32-bit addition they estimated a multiplicative depth of 10. The multiplicative depth of the CLA for 32-bit addition was further reduced from 10 to 5 in Togan et al. (2015). The idea is to use Equation (**) instead of Equation (*) to compute the carry bit, which eliminates the evaluation of the OR function. Specifically, let \(P_i = a_i \oplus b_i\) and \(G_i = a_i \wedge b_i\), then

$$\begin{aligned} {\left\{ \begin{array}{ll} C_{i+1} &= G_i \oplus (C_i \wedge P_i), \\ S_i &= P_i \oplus C_i . \end{array}\right. } \end{aligned}$$

\(P_i\) and \(G_i\) can be precomputed in parallel when there are more CPUs available, independent of carry bits. In this way, we can rewrite the carry bit of 32-bit adder respectively as follows:

$$\begin{aligned} C_{1}&=G_{0} \oplus C_0P_0, \\ C_{2}&=G_{1} \oplus C_{1} P_{1}=G_{1} \oplus \left( G_{0} P_{1}\right) \oplus (C_0P_0P_1), \\ C_{3}&=G_{2} \oplus C_{2} P_{2} = G_2\oplus \left( G_{1} P_{2}\right) \oplus \left( G_{0} P_{1} P_{2}\right) \oplus (C_0P_0P_1P_2),\\ &\;\;\vdots \\ C_{31}&=G_{30} \oplus C_{30} P_{30}=G_{30} \oplus \left( G_{29} P_{30}\right) \oplus \left( G_{28} P_{29} P_{30}\right) \oplus \cdots \oplus \left( G_{0} P_{1} P_{2} \cdots P_{30}\right) \oplus \left( C_0 P_0 P_1 \cdots P_{30}\right) . \end{aligned}$$

Thus, the sum bits of the 32-bit adder are \(S_i = P_i \oplus C_i\) for \(0 \le i \le 31\).
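The XOR-based recurrence and its expanded form can be checked against ordinary addition on plaintext words. The sketch below is illustrative only: `add_xor_carry` and `carry_expanded` are hypothetical names, and plain integer operations replace the encrypted gates.

```rust
/// 32-bit addition using C_{i+1} = G_i XOR (C_i AND P_i) and S_i = P_i XOR C_i.
fn add_xor_carry(a: u32, b: u32, c0: bool) -> u32 {
    let p = a ^ b; // P_i = a_i XOR b_i
    let g = a & b; // G_i = a_i AND b_i
    let mut carry = c0 as u32; // C_0
    let mut sum = 0u32;
    for i in 0..32 {
        let p_i = (p >> i) & 1;
        let g_i = (g >> i) & 1;
        sum |= (p_i ^ carry) << i; // S_i
        carry = g_i ^ (carry & p_i); // C_{i+1}
    }
    sum
}

/// Carry bit C_k computed directly from the expanded XOR-of-products form,
/// without using any earlier carry bit.
fn carry_expanded(a: u32, b: u32, c0: bool, k: usize) -> bool {
    let p = a ^ b;
    let g = a & b;
    let bit = |w: u32, i: usize| (w >> i) & 1 == 1;
    // Term C_0 P_0 P_1 ... P_{k-1}.
    let mut acc = c0 && (0..k).all(|i| bit(p, i));
    // Terms G_j P_{j+1} ... P_{k-1} for j = 0, ..., k-1.
    for j in 0..k {
        acc ^= bit(g, j) && (j + 1..k).all(|i| bit(p, i));
    }
    acc
}
```

Because \(P_i\) and \(G_i\) can never both be 1, at most one product term in each expansion is 1, which is why XOR can replace OR here.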

Mella and Susella (2013) and Togan et al. (2015) exploited the batch packing capability of the BGV scheme. However, it is hard to give a practical running time for the homomorphic computation of SHA-256 and the SPECK cipher based on the BGV scheme, because the parameters of the leveled BGV scheme depend on the multiplicative depth of the circuit and bootstrapping is not efficient enough.

In the context of the TFHE scheme, we likewise use Equation (**) rather than Equation (*), as in Togan et al. (2015). The reason is that successive XORs leave much room for optimization. For every carry bit \(C_i\) with \(2\le i \le 31\), we can use the HomXOR3 gate to reduce the number of gates required in the encrypted domain.
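To give a rough sense of the saving, the sketch below folds a list of plaintext product terms with a 3-input XOR and counts the gates used. It assumes that each 3-input XOR corresponds to one HomXOR3 evaluation; the function name `xor3_fold` is hypothetical and the code operates on plain booleans, not ciphertexts. Folding \(t\) terms takes \(\lceil (t-1)/2 \rceil\) gates in this way, compared with \(t-1\) two-input XOR gates.

```rust
/// XOR all terms together, consuming two new terms per 3-input XOR gate.
/// Returns the result and the number of gates used.
fn xor3_fold(terms: &[bool]) -> (bool, usize) {
    let mut acc = terms.first().copied().unwrap_or(false);
    let mut gates = 0usize;
    let mut i = 1usize;
    while i < terms.len() {
        if i + 1 < terms.len() {
            // One 3-input XOR absorbs two further terms.
            acc = acc ^ terms[i] ^ terms[i + 1];
            i += 2;
        } else {
            // A single leftover term needs one 2-input XOR.
            acc ^= terms[i];
            i += 1;
        }
        gates += 1;
    }
    (acc, gates)
}
```

For example, the 32 terms of \(C_{31}\) (including the \(C_0\) term) fold in 16 gates instead of 31.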

Unlike the BGV scheme, the TFHE scheme does not support batch processing. A natural solution for TFHE is therefore to parallelize across multiple CPUs, which is reasonable for a cloud server with a large number of CPUs. Several advanced Parallel Prefix Adders (Payal et al. 2015) for CLA structures, such as the Brent–Kung adder, the Kogge–Stone adder and the Ladner–Fischer adder, have been proposed for high-performance arithmetic structures in industry. The implementation in Homomorphic evaluation (2023) uses the Brent–Kung and Ladner–Fischer adders for optimization. See “Appendix A” for a more detailed description. For a fair experimental comparison, we also use these two optimization techniques.
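As an illustration of how a parallel prefix network computes all carries in a logarithmic number of levels, here is a plaintext sketch of the Kogge–Stone structure, chosen because it is the simplest of the three to write down (it is not one of the two adders used in the experiments). The function name `kogge_stone_add` is hypothetical, a `u32` holds all 32 bit positions at once, and \(C_0 = 0\) is assumed.

```rust
/// Word-parallel Kogge-Stone prefix addition on plaintext 32-bit values.
fn kogge_stone_add(a: u32, b: u32) -> u32 {
    let p0 = a ^ b; // preprocessing: propagate bits P_i
    let mut g = a & b; // preprocessing: generate bits G_i
    let mut p = p0;
    let mut d = 1u32;
    while d < 32 {
        // Combine each position with the group ending d positions below it.
        // XOR is used instead of OR, as in the carry recurrence.
        g ^= p & (g << d);
        p &= p << d;
        d *= 2;
    }
    // After 5 levels, bit i of g is the carry out of bits 0..=i, i.e. C_{i+1}.
    // Post-processing: S_i = P_i XOR C_i with C_0 = 0.
    p0 ^ (g << 1)
}
```

All bit positions within one level are independent of each other, which is what makes the structure suitable for evaluation on multiple CPUs.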

Analysis of functions in SM3

In this subsection, we analyze the homomorphic evaluation of the hash algorithm SM3. An interesting observation is that, for \(16 \le j < 64\), the result of the \(GG_j\) function is z if \(x=0\) and y otherwise, which is equivalent to the Ch function, i.e., the mux gate. Likewise, Table 1 shows that, for \(16 \le j < 64\), the \(FF_j\) function implements the same function as Maj. Thus, the \(FF\), \(GG\), \(P_0\) and \(P_1\) functions can each be implemented using only one bootstrapping. For addition modulo \(2^{32}\), SM3 uses fewer consecutive modular additions than SHA-256 in the iterative compression function, which gives it a lower evaluation latency. To evaluate SM3 we use the methods described in the previous section; the implementation results are given in the next section.
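These equivalences can be checked on plaintext words. The sketch below writes \(FF_j\) and \(GG_j\) (for \(16 \le j < 64\)) and the permutations \(P_0\), \(P_1\) in the form given by the SM3 standard, which is not reproduced in this section, next to Maj and Ch. The Rust function names are hypothetical.

```rust
/// SM3 FF_j for 16 <= j < 64, in the OR form of the SM3 standard.
fn ff_hi(x: u32, y: u32, z: u32) -> u32 {
    (x & y) | (x & z) | (y & z)
}

/// SM3 GG_j for 16 <= j < 64.
fn gg_hi(x: u32, y: u32, z: u32) -> u32 {
    (x & y) | (!x & z)
}

/// SHA-256 Maj in XOR form.
fn maj(x: u32, y: u32, z: u32) -> u32 {
    (x & y) ^ (x & z) ^ (y & z)
}

/// SHA-256 Ch, i.e. a bitwise mux: y where x is 1, z where x is 0.
fn ch(x: u32, y: u32, z: u32) -> u32 {
    (x & y) ^ (!x & z)
}

/// SM3 permutation P_0: a 3-input XOR of rotated copies of x.
fn p0(x: u32) -> u32 {
    x ^ x.rotate_left(9) ^ x.rotate_left(17)
}

/// SM3 permutation P_1: a 3-input XOR of rotated copies of x.
fn p1(x: u32) -> u32 {
    x ^ x.rotate_left(15) ^ x.rotate_left(23)
}
```

Since rotations only relabel ciphertext bits in a bit-wise encryption, each output bit of \(P_0\) and \(P_1\) is a single 3-input XOR, and each output bit of \(FF_j\) and \(GG_j\) is a single majority or mux.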

Table 1 The truth table of the function \(FF_j\), if \(16 \le j <64\)

Implementation and experimental results

In this section we describe in detail our implementation for evaluating the hash functions SHA-256 and SM3 based on the TFHE scheme. To the best of our knowledge, the TFHE-rs library (Note 7) is the fastest public implementation of the TFHE scheme among homomorphic encryption libraries (https://www.zama.ai/post/announcing-tfhe-rs). Therefore, we implement our evaluation method in the TFHE-rs library. All tests were conducted on a 12th Gen Intel(R) Core(TM) i5-12500 \(\times\) 12 with 15.3 GB RAM, running the Ubuntu 20.04 operating system.

Experimental parameter setting

Now we present our parameter settings in the TFHE scheme. We use two parameter sets from the TFHE-rs library, as shown in Table 2, both of which provide at least 128 bits of security. “DEFAULT_PARAMS” guarantees an error probability bound of \(2^{-40}\) and “TFHE_LIB_PARAMS” provides a lower decryption error rate of \(2^{-165}\), which can be used for different scenario requirements.

Table 2 Parameter sets of the TFHE scheme

Performance result

In this subsection, we compare our experimental results with existing work. A straightforward implementation of SHA-256 based on the TFHE-rs library is publicly available from Homomorphic evaluation (2023). For a fair experimental comparison, we run their code on our machine. Note that, in addition to bit-wise encryption, a TFHE-rs implementation based on two-bit encryption is available on GitHub (Note 8). However, that implementation takes up to 23 min because this encryption is not suitable for rotation operations, resulting in huge latency even when multiple CPUs are used. Therefore, we did not consider further optimization of that implementation.

As in their experiments, we use Rayon, a multi-threading crate for the Rust programming language, to parallelize the implementation when multiple CPUs are available. Specifically, we control the number of CPUs used by calling rayon::ThreadPoolBuilder::new().num_threads(n).build_global().unwrap(), where n is the desired number of threads. We present the comparison of the homomorphic evaluation of SHA-256 and SM3 based on the parameter sets “DEFAULT_PARAMS” and “TFHE_LIB_PARAMS” for different numbers of CPU cores in Figs. 4 and 5, respectively. For more detailed data, please refer to Table 3 in “Appendix B”.
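The effect of bounding the number of workers can be sketched without any external crate. The following is a standard-library stand-in, not the Rayon-based code used in the experiments: the function name `precompute_pg` and the chunking strategy are assumptions made for illustration, and plaintext booleans replace ciphertext bits. It computes the carry-independent values \(P_i\) and \(G_i\) using at most `threads` scoped threads.

```rust
use std::thread;

/// Compute P_i = a_i XOR b_i and G_i = a_i AND b_i for all bit positions,
/// using at most `threads` worker threads (plaintext stand-in for gates).
fn precompute_pg(a: &[bool], b: &[bool], threads: usize) -> (Vec<bool>, Vec<bool>) {
    assert_eq!(a.len(), b.len());
    let n = a.len();
    let mut p = vec![false; n];
    let mut g = vec![false; n];
    // Split the bit positions into at most `threads` contiguous chunks.
    let t = threads.max(1);
    let chunk = ((n + t - 1) / t).max(1);
    thread::scope(|s| {
        let work = p
            .chunks_mut(chunk)
            .zip(g.chunks_mut(chunk))
            .zip(a.chunks(chunk).zip(b.chunks(chunk)));
        for ((pc, gc), (ac, bc)) in work {
            // Each chunk is handled by its own scoped thread.
            s.spawn(move || {
                for i in 0..ac.len() {
                    pc[i] = ac[i] ^ bc[i];
                    gc[i] = ac[i] & bc[i];
                }
            });
        }
    });
    (p, g)
}
```

With real ciphertexts each gate is expensive, so a thread pool such as Rayon's, which schedules many small tasks over a fixed set of workers, is the more practical choice.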

Fig. 4 Comparison of implementations of SHA-256 and SM3 based on parameter set “DEFAULT_PARAMS” under different CPU cores

Fig. 5 Comparison of implementations of SHA-256 and SM3 based on parameter set “TFHE_LIB_PARAMS” under different CPU cores

Experimental results show that for both SHA-256 and SM3 we achieve an efficiency improvement of about 35–50% compared with the state-of-the-art work, reaching up to 50% when only one CPU is used. We observed that the Brent–Kung adder outperforms the Ladner–Fischer adder, particularly when fewer CPUs are used. The overall SM3 evaluation latency is lower than that of SHA-256 because SM3 uses fewer additions. When the “TFHE_LIB_PARAMS” parameter set is used, the evaluation latency tends to be higher. However, this parameter set offers a lower decryption error rate, which makes the evaluation results more reliable.

Conclusion

In this paper, we explore the application of ternary gates to the various logic functions required by hash functions and further reduce the number of gate bootstrappings required by SHA-256 and SM3 in the context of TFHE, improving efficiency. This advancement holds significant potential for applications in which hash functions play a vital role, including data integrity checking and private database retrieval.

Further optimization directions for hash function evaluation include using the fully homomorphic encryption scheme FINAL (Bonte et al. 2022), which is instantiated with NTRU and LWE and achieves faster gate bootstrapping than TFHE. We believe this can directly reduce the overall runtime latency. Lower latency can also be obtained when a large number of parallel compute units is available, for example on a GPU.

Availability of data and materials

Not applicable.

Notes

  1. \(\sigma _0(x) = (x \ggg 7) \oplus (x \ggg 18) \oplus (x \gg 3)\).

  2. \(\sigma _1(x) = (x \ggg 17) \oplus (x \ggg 19) \oplus (x \gg 10)\).

  3. \(s_0(x) = (x \ggg 2) \oplus (x \ggg 13) \oplus (x \ggg 22)\).

  4. \(s_1(x) = (x \ggg 6) \oplus (x \ggg 11) \oplus (x \ggg 25)\).

  5. \(\textit{Maj}(x, y, z) =(x \wedge y) \oplus (x \wedge z) \oplus (y \wedge z)\).

  6. Generally, 2OF3 is also called Majority(Maj), that is, the output is the value that accounts for the most of the 3 inputs.

  7. https://github.com/zama-ai/tfhe-rs.

  8. https://github.com/JoseSK999/sha256_fhe_int.

References

  • Albrecht MR, Rechberger C, Schneider T, Tiessen T, Zohner M (2015) Ciphers for MPC and FHE. In: EUROCRYPT 2015, vol 9056. Springer, Heidelberg, pp 430–454. https://doi.org/10.1007/978-3-662-46800-5_17

  • Ashur T, Mahzoun M, Toprakhisar D (2022) Chaghri–A fhe-friendly block cipher. In: Proceedings of the 2022 ACM SIGSAC conference on computer and communications security, CCS 2022. ACM, New York, pp 139–150. https://doi.org/10.1145/3548606.3559364

  • Bendoukha A, Stan O, Sirdey R, Quero N, Souza LF (2022) Practical homomorphic evaluation of block-cipher-based hash functions with applications. In: Foundations and practice of security—15th international symposium, FPS 2022. Lecture notes in computer science, vol 13877. Springer, Cham, pp 88–103. https://doi.org/10.1007/978-3-031-30122-3_6

  • Bonte C, Iliashenko I, Park J, Pereira HVL, Smart NP (2022) FINAL: faster FHE instantiated with NTRU and LWE. In: ASIACRYPT 2022, vol 13792. Lecture notes in computer science. Springer, Cham, pp 188–215

  • Brakerski Z (2012) Fully homomorphic encryption without modulus switching from classical GapSVP. In: CRYPTO 2012. Springer, Heidelberg, pp 868–886

  • Brakerski Z, Gentry C, Vaikuntanathan V (2012) (leveled) fully homomorphic encryption without bootstrapping. In: Innovations in theoretical computer science 2012. ACM, New York, pp 309–325

  • Canteaut A, Carpov S, Fontaine C, Lepoint T, Naya-Plasencia M, Paillier P, Sirdey R (2016) Stream ciphers: a practical solution for efficient homomorphic-ciphertext compression. In: FSE 2016. Lecture notes in computer science, vol 9783. Springer, Heidelberg, pp 313–333. https://doi.org/10.1007/978-3-662-52993-5_16

  • Cheon JH, Han K, Kim A, Kim M, Song Y (2018) Bootstrapping for approximate homomorphic encryption. In: EUROCRYPT 2018, vol 10820. Lecture notes in computer science. Springer, Cham, pp 360–384

  • Cheon JH, Kim A, Kim M, Song YS (2017) Homomorphic encryption for arithmetic of approximate numbers. In: ASIACRYPT 2017. Springer, Cham, pp 409–437

  • Chillotti I, Gama N, Georgieva M, Izabachène M (2020) TFHE: fast fully homomorphic encryption over the torus. J Cryptol 33(1):34–91

  • Cho J, Ha J, Kim S, Lee B, Lee J, Lee J, Moon D, Yoon H (2021) Transciphering framework for approximate homomorphic encryption. In: ASIACRYPT 2021. Lecture notes in computer science, vol 13092. Springer, Cham, pp 640–669. https://doi.org/10.1007/978-3-030-92078-4_22

  • Cid C, Indrøy JP, Raddum H (2022) FASTA—a stream cipher for fast FHE evaluation. In: CT-RSA 2022, vol 13161. Lecture notes in computer science. Springer, Cham, pp 451–483

  • Cosseron O, Hoffmann C, Méaux P, Standaert F (2022) Towards globally optimized hybrid homomorphic encryption—featuring the Elisabeth stream cipher. IACR Cryptol ePrint Arch 180

  • Dinur I, Liu Y, Meier W, Wang Q (2015) Optimized interpolation attacks on lowmc. In: ASIACRYPT 2015. Lecture notes in computer science, vol 9453. Springer, Heidelberg, pp 535–560. https://doi.org/10.1007/978-3-662-48800-3_22

  • Dobraunig C, Grassi L, Helminger L, Rechberger C, Schofnegger M, Walch R (2023) Pasta: a case for hybrid homomorphic encryption. IACR Trans Cryptogr Hardw Embed Syst 3:30–73. https://doi.org/10.46586/TCHES.V2023.I3.30-73

  • Dobraunig C, Eichlseder M, Grassi L, Lallemand V, Leander G, List E, Mendel F, Rechberger C (2018) Rasta: a cipher with low ANDdepth and few ANDs per bit. In: CRYPTO 2018. Lecture notes in computer science, vol 10991. Springer, Cham, pp 662–692. https://doi.org/10.1007/978-3-319-96884-1_22

  • Dobraunig C, Eichlseder M, Mendel F (2015) Higher-order cryptanalysis of lowmc. In: ICISC 2015, vol 9558. Lecture notes in computer science. Springer, Cham, pp 87–101

  • Doröz Y, Hu Y, Sunar B (2016) Homomorphic AES evaluation using the modified LTV scheme. Des Codes Cryptogr 80(2):333–358

  • Ducas L, Micciancio D (2015) FHEW: bootstrapping homomorphic encryption in less than a second. In: EUROCRYPT 2015. Springer, Heidelberg, pp 617–640

  • Fan J, Vercauteren F (2012) Somewhat practical fully homomorphic encryption. Cryptology ePrint Archive, Report 2012/144. https://eprint.iacr.org/2012/144

  • Gentry C (2009) A fully homomorphic encryption scheme

  • Gentry C, Halevi S, Smart NP (2012) Homomorphic evaluation of the AES circuit. In: CRYPTO 2012, vol 7417. Springer, Heidelberg, pp 850–867

  • Ha J, Kim S, Choi W, Lee J, Moon D, Yoon H, Cho J (2020) Masta: an he-friendly cipher using modular arithmetic. IEEE Access 8:194741–194751. https://doi.org/10.1109/ACCESS.2020.3033564

  • Ha J, Kim S, Lee B, Lee J, Son M (2022) Rubato: noisy ciphers for approximate homomorphic encryption. In: EUROCRYPT 2022. Springer, Cham, pp 581–610. https://doi.org/10.1007/978-3-031-06944-4_20

  • Hebborn P, Leander G (2020) Dasta—alternative linear layer for rasta. IACR Trans Symmetric Cryptol 2020(3):46–86. https://doi.org/10.13154/TOSC.V2020.I3.46-86

  • Hoffmann C, Méaux P, Ricosset T (2020) Transciphering, using filip and TFHE for an efficient delegation of computation. In: INDOCRYPT 2020, vol 12578. Lecture notes in computer science. Springer, Cham, pp 39–61

  • Homomorphic evaluation of SHA-256 (2023) https://github.com/zama-ai/tfhe-rs/tree/main/tfhe/examples/sha256_bool

  • https://oscca.gov.cn/sca/xxgk/2010-12/17/1002389/files/302a3ada057c4a73830536d03e683110.pdf

  • https://www.zama.ai/post/announcing-tfhe-rs

  • Klemsa J, Önen M (2022) Parallel operations over TFHE-encrypted multi-digit integers. In: CODASPY ’22. ACM, New York, pp 288–299. https://doi.org/10.1145/3508398.3511527

  • Lepoint T, Naehrig M (2014) A comparison of the homomorphic encryption schemes FV and YASHE. In: AFRICACRYPT 2014, vol 8469. Lecture notes in computer science. Springer, Cham, pp 318–335

  • Lou Q, Jiang L (2019) SHE: a fast and accurate deep neural network for encrypted data. In: NeurIPS 2019, pp 10035–10043

  • Mandal K, Gong G (2021) Homomorphic evaluation of lightweight cipher Boolean circuits. In: FPS 2021. Springer, Cham, pp 63–74. https://doi.org/10.1007/978-3-031-08147-7_5

  • Matsuoka K, Hoshizuki Y, Sato T, Bian S (2021) Towards better standard cell library: Optimizing compound logic gates for TFHE. In: WAHC ’21: proceedings of the 9th on workshop on encrypted computing & applied homomorphic cryptography. WAHC@ACM, New York, pp 63–68. https://doi.org/10.1145/3474366.3486927

  • Méaux P, Journault A, Standaert F (2019) Improved filter permutators for efficient FHE: better instances and implementations. In: INDOCRYPT 2019, vol 11898. Springer, Cham, pp 68–91. https://doi.org/10.1007/978-3-030-35423-7_4

  • Méaux P, Journault A, Standaert F, Carlet C (2016) Towards stream ciphers for efficient FHE with low-noise ciphertexts. In: EUROCRYPT 2016. Lecture notes in computer science, vol 9665. Springer, Heidelberg, pp 311–343. https://doi.org/10.1007/978-3-662-49890-3_13

  • Mella S, Susella R (2013) On the homomorphic computation of symmetric cryptographic primitives. In: Cryptography and coding—14th IMA international conference, IMACC 2013. Lecture notes in computer science, vol 8308. Springer, Heidelberg, pp 28–44. https://doi.org/10.1007/978-3-642-45239-0_3

  • Naehrig M, Lauter KE, Vaikuntanathan V (2011) Can homomorphic encryption be practical? In: CCSW 2011. ACM, New York, pp 113–124

  • Payal R, Goel M, Manglik P (2015) Design and implementation of parallel prefix adder for improving the performance of carry lookahead adder. Int J Eng Tech Res 4:12

  • Rechberger C, Soleimany H, Tiessen T (2018) Cryptanalysis of low-data instances of full lowmcv2. IACR Trans Symmetric Cryptol 2018(3):163–181

  • National Institute of Standards and Technology (2012) Secure hash standard (SHS). http://csrc.nist.gov/publications/PubsFIPS.html

  • Stracovsky R, Mahdavi RA, Kerschbaum F (2022) Faster evaluation of AES using TFHE. In: Poster Session, FHE.Org—2022. https://rasoulam.github.io/data/poster-aes-tfhe.pdf

  • Togan M, Lupascu C, Plesca C (2015) Homomorphic evaluation of speck cipher. Proc Roman Acad Ser A: Math Phys Tech Sci Inf Sci 16:375–384

  • Trama D, Clet P, Boudguiga A, Sirdey R (2023) A homomorphic AES evaluation in less than 30 seconds by means of TFHE. In: Proceedings of the 11th workshop on encrypted computing & applied homomorphic cryptography. ACM, New York, pp 79–90. https://doi.org/10.1145/3605759.3625260

  • Wei B, Lu X (2023) Improved homomorphic evaluation for hash function based on TFHE. In: Information security and cryptology—19th international conference, Inscrypt 2023

Acknowledgements

We would like to thank the anonymous reviewers and editors for detailed comments and useful feedback.

Funding

This work was supported by Huawei Technologies Co., Ltd. and the CAS Project for Young Scientists in Basic Research (Grant No. YSBR-035).

Author information

Contributions

BW completed the major work on this paper and XL participated in problem discussions. All authors have read and agreed to contribute.

Corresponding author

Correspondence to Xianhui Lu.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This is a full-text version of the paper published as a poster at Inscrypt 2023 (Wei and Lu 2023).

Appendices

Appendix A: Parallel prefix adder

A parallel prefix adder (PPA) can be designed in many different forms depending on the requirements. PPAs are fast adders used in industry for high-performance arithmetic structures. A parallel prefix addition proceeds in three stages: (1) preprocessing, (2) the carry generation network, and (3) post-processing.

Parallel prefix adders fall mainly into three types according to the carry generation network: the Kogge–Stone adder, the Brent–Kung adder and the Ladner–Fischer adder. The parallel prefix network of the Kogge–Stone structure is shown in Fig. 6. It is characterized by a very small logic depth and fan-out, but a very high number of nodes and long interconnecting wires. The Brent–Kung structure is shown in Fig. 7. It is characterized by a very small fan-out and fewer nodes, but the largest logic depth of the three: it relieves the fan-out pressure at the cost of additional logic depth. The Ladner–Fischer structure is shown in Fig. 8. It has a low logic depth but large fan-out.

Fig. 6 32-bit Kogge–Stone adder

Fig. 7 32-bit Brent–Kung adder

Fig. 8 32-bit Ladner–Fischer adder

Appendix B: Detailed experimental results

Table 3 presents the detailed experimental results discussed in the “Performance result” section.

Table 3 The latency (s) of homomorphic evaluation of SHA-256 and SM3 based on different parameter sets using different CPUs

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Wei, B., Lu, X. Improved homomorphic evaluation for hash function based on TFHE. Cybersecurity 7, 14 (2024). https://doi.org/10.1186/s42400-024-00204-0

  • DOI: https://doi.org/10.1186/s42400-024-00204-0

Keywords