Unifier register to protect an efficient modular exponentiation algorithm

Simple power analysis (SPA) attacks are widely used against several cryptosystems, principally those based on modular exponentiation. Many types of SPA have been reported in the literature in recent years. There is a real need to eliminate the vulnerabilities of cryptosystems, such as CRT-RSA or the Elliptic Curve Cryptosystem, that make them susceptible to these attacks. Many modular exponentiation algorithms try to reinforce the security of these systems; one of them was proposed by Da-Zhi et al. Da-zhi's algorithm was presented as a secure and efficient countermeasure against side channel attacks; however, it was recently shown that its security can be defeated. In this paper, a means of protecting the algorithm is presented. The proposed technique can be applied to any algorithm that computes dummy operations during its execution.


Introduction
A large amount of data is broadcast daily through electronic communication channels. All of this information must be protected because of the importance and sensitivity of the transmitted data. The best way to protect such information is via high-security cryptographic systems.
There are many strong cryptographic systems; two of the most important systems currently in use are the RSA cryptosystem, proposed by Rivest, Shamir, and Adleman [1], and the Elliptic Curve Cryptosystem.
The security of a cryptographic system rests on many aspects, such as the mathematical concepts on which it was founded, the computational difficulty of obtaining the secret keys that ensure data protection, the transmission channel and, in many cases, the way the final user employs it. Previously, a security system could be considered trustworthy if its mathematical argument was reliable and the computational difficulty of finding the keys was high; however, this idea was proven false in 1996, when Kocher opened the door to a new type of attack, called side channel attacks (SCA) [2].
SCA are based on a new and different attack methodology that does not depend on formal mathematical concepts or computing power. This new attack scheme is based on observations of the execution time of an electronic device that is running the modular exponentiation algorithm.
Corresponding author: dativa19@hotmail.com
Kocher knew that the execution time of the device could be observed and measured and that, from these measurements, it is possible to obtain information as valuable as the secret key of the system. Later, based on this first attack scheme, the extraction of important data from electronic devices by measuring their power consumption was presented [3].
Measuring the power consumption of an electronic device while an algorithm is executed inside it created numerous opportunities for physical attacks based on measuring, analyzing, and interpreting any physical signal emanating from the device. Signals such as power consumption, heat, electromagnetic emanations, and any other signal issued by the device can be measured and used to breach the security of information systems.
SCA were quickly used to break the security of cryptosystems based on modular exponentiation (add and double in the elliptic curve cryptosystem), such as RSA and the Diffie-Hellman method for the secure exchange of cryptographic keys [4]. After the SCA, Boneh, DeMillo and Lipton presented fault attacks (FA) [5]. FA are more aggressive than SCA because FA physically disturb the execution of the device that is running the cryptographic algorithm.
Many modular exponentiation algorithms have been proposed to avoid these types of physical attacks. The Square-and-Multiply Always algorithm developed by Coron [6] was the first algorithm specifically designed to avoid such attacks; however, this algorithm was shown to be vulnerable to the safe-error attack (SEA) [7]. SEA exploits the dummy operations that the algorithm calculates to break the security of the system. A dummy operation is an operation that is calculated during execution but does not affect the final result of the algorithm.
There are many physical attacks ([8], [9], [10], [11]) that try to break the different modular exponentiation algorithms ([12], [13], [14], [15], [16], [17]). In 2006, Boreale [18] presented a new type of attack that combines FA and SCA; according to him, it is possible to obtain the binary string of the secret key d using the Jacobi symbol concept. He attacked the Square and Multiply Right-to-Left modular exponentiation and proved that his attack is effective even in the presence of message blinding. Schmidt and Medwed [19] used the Jacobi symbol to create an attack that breaks the security of the Montgomery ladder in its blinded form. In the same way, Chong Hee Kim designed an attack in 2010 [20], also based on the Jacobi symbol, to break the security of the Add-Only and Add-Always algorithms, both proposed by Joye in 2007 [21].
Sun Da-Zhi et al. developed an efficient algorithm against simple power analysis [22] that uses two binary strings instead of one to calculate modular exponentiation; however, a way to defeat the security of Da-zhi's algorithm was presented in [23]. The attack consists of four steps. It is important to highlight that this attack was designed by observing the basic characteristics of Da-zhi's algorithm.

Quadratic residues and Jacobi symbol
For any prime number $p$, $x$ is a quadratic residue modulo $p$ if $\gcd(x, p) = 1$ and $x \equiv y^2 \pmod{p}$ for some $y$. If $\gcd(x, p) = 1$ but $x$ is not a quadratic residue modulo $p$, then $x$ is called a quadratic non-residue modulo $p$.
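In order to illustrate this concept, we will use the modulus $n = 17$. The nonzero squares modulo 17 are 1, 2, 4, 8, 9, 13, 15 and 16, so these values are the quadratic residues modulo 17, and the remaining nonzero values are quadratic non-residues.

The Legendre symbol of $a$ with respect to an odd prime $p$ is

$$\left(\frac{a}{p}\right) = \begin{cases} 1 & \text{if } a \text{ is a quadratic residue mod } p \\ -1 & \text{if } a \text{ is a quadratic non-residue mod } p \\ 0 & \text{if there is a common factor} \end{cases}$$

Then, we have that

$$\left(\frac{a}{n}\right) = \left(\frac{a}{p_1}\right) \cdots \left(\frac{a}{p_k}\right)$$

is the Jacobi symbol (JS), where $n = p_1 \cdots p_k$ and the $p_j$ are the prime factors of $n$. The Jacobi symbol is a generalization of the Legendre symbol.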
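The definitions above can be checked numerically. The following is a minimal sketch, not taken from the paper: the function names `quadratic_residues` and `jacobi` are chosen here for illustration, and `jacobi` uses the standard reciprocity-based procedure instead of factoring $n$.

```python
def quadratic_residues(p):
    """Return the sorted nonzero quadratic residues modulo a prime p."""
    return sorted({(y * y) % p for y in range(1, p)})


def jacobi(a, n):
    """Jacobi symbol (a/n) for a positive odd n, computed without factoring n."""
    if n <= 0 or n % 2 == 0:
        raise ValueError("n must be a positive odd integer")
    a %= n
    result = 1
    while a != 0:
        # Pull out factors of two: (2/n) = -1 exactly when n = 3 or 5 (mod 8).
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        # Quadratic reciprocity: swapping flips the sign when both are 3 (mod 4).
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0


print(quadratic_residues(17))
print([jacobi(a, 17) for a in range(1, 17)])
```

For a prime modulus such as 17, the Jacobi symbol coincides with the Legendre symbol, so the second line printed marks exactly the residues listed by the first.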

Modular exponentiation
Modular exponentiation is the core of several cryptographic algorithms, and it is also the principal intrusion point of different physical attacks. The classical modular exponentiation algorithm is the Square and Multiply Right-to-Left, given as algorithm 1. Binary exponentiation algorithms can be classified into two types, left-to-right and right-to-left, depending on the end of the exponent's binary string at which the algorithm begins its execution.
…
7: end if
8: $R_1 \leftarrow R_1^2 \bmod N$
9: end for
10: Return $R_0$

There are many variations of the modular exponentiation algorithm, all designed to make the security protocol safer. One of them is the atomicity technique [24]. This technique, according to the authors, is a generic scheme and is virtually applicable to any algorithm. The idea is that a process can be seen as a sequence of instructions, each of which is equivalent to all the others. Hence, they cannot be differentiated through an analysis such as a side channel attack; this technique is shown in algorithm 2.
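As a point of reference, the classical right-to-left square-and-multiply procedure can be sketched as follows. This is the textbook form written under our own assumptions: the register names `r0` and `r1` mirror $R_0$ and $R_1$, and the function name is hypothetical. It has no side channel protection.

```python
def square_and_multiply_rtl(m, d, N):
    """Compute m**d mod N by scanning the bits of d from least to most significant."""
    r0 = 1          # accumulator that becomes the result
    r1 = m % N      # running square: m^(2^i) in iteration i
    while d > 0:
        if d & 1:                # multiply only when the current bit is 1
            r0 = (r0 * r1) % N
        r1 = (r1 * r1) % N       # squaring is executed in every iteration
        d >>= 1
    return r0


print(square_and_multiply_rtl(4, 0b1011001, 143) == pow(4, 0b1011001, 143))
```

The conditional multiplication is what makes the unprotected form irregular: the sequence of operations depends directly on the bits of the exponent.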

Jacobi symbol attacks
As there are many modular exponentiation algorithms, there are also many physical attacks attempting to undermine the security of the exponentiation algorithms.
After the appearance of SCA and FA, attacks were developed that use one of these two techniques together with mathematical and numerical concepts, such as the attack proposed in [8].
Other types of attacks that combine physical attacks with numerical concepts use the Jacobi symbol to defeat cryptographic security.
Attacks using the Jacobi symbol began with Boreale, who attacked the binary Square and Multiply Right-to-Left modular exponentiation (algorithm 1) in 2006 [18]. He injects a fault $z$ into $R_1$ when a squaring is executed in iteration $i - 1$ of the Square-and-Multiply algorithm; then, depending on the value of $(S/N)$, where $S$ is the output value of the algorithm and $N$ is the modulus, it is possible to learn the value of the attacked bit $d_i$. This scheme assumes that $(m/N) = 1$, and its behavior is similar to that of the safe-error attack. If the bit in iteration $i$ is equal to 0, the fault does not affect the JS of $R_0$ in iteration $i$, because $z$ is only squared and $(z^2/N) = 1$. If $d_i = 1$, however, $z$ affects the register $R_0$ in iteration $i$, whose JS can then be $-1$. Here, $z$ may or may not be a quadratic residue. If $z$ is a quadratic residue, the final result will be $(S/N) = 1$, but if it is a quadratic non-residue, the final result will be $(S/N) = -1$. For this reason, his attack is a probabilistic model.

After Boreale, Schmidt and Medwed [19] proposed an attack that entails sending a message $m$ with $(m/N) = -1$ to the Fumaroli-Vigilant algorithm [25] and skipping operation 2 of the algorithm. By observing the JS of the resulting value, it is possible to learn about the values of $d_i$ and […]

The attacks mentioned above are easy to implement, and they are powerful because they require only the JS of the value returned by the attacked algorithm to breach the security of a cryptosystem. However, a way to avoid this kind of attack was shown in [26].
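Boreale's observation can be simulated on a toy example. The sketch below rests on several assumptions that are ours, not the paper's: the fault is modelled as replacing the output of the squaring in iteration $i - 1$ by a known value `z`, the modulus $N = 11 \cdot 13$ is far too small for real use, and all function names are hypothetical. A real attacker would not choose `z`, but would repeat random faults and observe only the Jacobi symbol of the output.

```python
from math import gcd


def jacobi(a, n):
    """Jacobi symbol (a/n) for a positive odd n."""
    a %= n
    result = 1
    while a != 0:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0


def faulty_rtl_exp(m, d, N, k, i, z):
    """Right-to-left exponentiation over k bits; the squaring of iteration i-1 outputs z."""
    r0, r1 = 1, m % N
    for j in range(k):
        if (d >> j) & 1:
            r0 = (r0 * r1) % N
        r1 = (r1 * r1) % N
        if j == i - 1:
            r1 = z % N           # injected fault (modelling assumption)
    return r0


def recover_bits(m, d, N, k, faults):
    """Guess bit i as 1 when some faulty output has Jacobi symbol -1, else 0."""
    recovered = {}
    for i in range(1, k):
        symbols = {jacobi(faulty_rtl_exp(m, d, N, k, i, z), N) for z in faults}
        recovered[i] = 1 if -1 in symbols else 0
    return recovered


N, m = 11 * 13, 4                # (m/N) = 1 because m is a square
d, k = 0b1011001, 7
faults = [z for z in range(2, 30) if gcd(z, N) == 1]
print(recover_bits(m, d, N, k, faults))
```

In this model the faulty output equals $m^{d \bmod 2^i} \cdot z^{\lfloor d / 2^i \rfloor}$, so its Jacobi symbol is $(z/N)^{d_i}$. A value of $-1$ can therefore only appear when $d_i = 1$. Bit 0 is not covered by this sketch because there is no earlier squaring to disturb.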

Attack on Da-zhi's algorithm
Sun Da-Zhi et al. published a modular exponentiation algorithm that separates the original binary string of the exponent $d$ into two binary strings $d_1$ and $d_2$. The algorithm is given as algorithm 3, where $\mathrm{sq}^{(y)}(A)$ means performing $y$ modular squarings on the integer $A$ [22].
The key idea in algorithm 3 is to separate the $k$-bit binary string $d$ into two $(k/2)$-bit binary strings $d_1$ and $d_2$; depending on the values of $d_1$ and $d_2$ in each iteration, one of four registers $C_0, \ldots, C_3$ is used to calculate the exponentiation. If $d_{i,1}$ and $d_{i,2}$ (where $d_{i,1}$ is the $i$-th bit of $d_1$ and $d_{i,2}$ is the $i$-th bit of $d_2$) are both equal to 0, the chosen register is $C_0$; if $d_{i,1} = 1$ and $d_{i,2} = 0$, the chosen register is $C_1$, and so on for each register. Algorithm 3 was presented as secure against SCA. However, it was demonstrated in [23] that the security of the algorithm can be defeated by an attack consisting of four steps. The attack is described below:
1. When $d_{i,1} = d_{i,2} = 0$, register $C_0$ is used. Every time this register is picked, Da-zhi's algorithm executes a dummy operation. If an attacker performs an FA when $C_0$ is selected, the final result is not affected. Thus, an attacker can identify the bit combination at the attacked position and, in this way, find a quarter of the total binary string.
2. Line 12 of algorithm 3 shows that the register $C_1$ is always squared. Therefore, if an FA on that register alters its JS, the Jacobi symbol value in the final calculation always becomes 1, because every negative JS changes to a positive JS due to the operation $C = \mathrm{sq}^{(k/2)}(m C_1)$. Thus, by performing several FA and studying the behavior of the returned value, an attacker can easily identify where the register $C_1$ was used; therefore, he learns one of the four bit combinations that the algorithm uses.
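The first step of the attack can be illustrated with a small simulation. Algorithm 3 is not reproduced here, so the sketch below is only a plausible two-string exponentiation built from the description above, under assumptions of our own: $d_1$ is taken as the upper half of $d$ and $d_2$ as the lower half, the register index is $d_{i,1} + 2 d_{i,2}$, $k$ is even, and the fault overwrites the register just written with a value `z`. The function name `split_exp` and the toy parameters are hypothetical.

```python
def split_exp(m, d, N, k, fault=None):
    """Two-string right-to-left exponentiation with four registers (illustrative sketch).

    fault = (i, z): after the register update of iteration i, the register
    that was just written is overwritten with z.
    """
    half = k // 2
    d1 = d >> half                    # assumed: upper half of the exponent
    d2 = d & ((1 << half) - 1)        # assumed: lower half of the exponent
    C = [1, 1, 1, 1]
    S = m % N                         # running square m^(2^i)
    for i in range(half):
        j = ((d1 >> i) & 1) + 2 * ((d2 >> i) & 1)
        C[j] = (C[j] * S) % N         # j == 0 is the dummy operation
        if fault is not None and fault[0] == i:
            C[j] = fault[1] % N       # injected fault (modelling assumption)
        S = (S * S) % N
    high = (C[1] * C[3]) % N          # m^d1
    for _ in range(half):
        high = (high * high) % N      # shift m^d1 up by k/2 bits
    low = (C[2] * C[3]) % N           # m^d2
    return (high * low) % N           # C[0] never contributes to the result


N, m = 11 * 13, 4
d, k = 0b10100110, 8                  # bit pairs, i = 0..3: 00, 11, 01, 10
clean = split_exp(m, d, N, k)
unchanged = [i for i in range(k // 2)
             if split_exp(m, d, N, k, fault=(i, 5)) == clean]
print(clean == pow(m, d, N), unchanged)
```

Positions where the faulty output equals the correct one reveal that $d_{i,1} = d_{i,2} = 0$, which is exactly the safe-error leak that the dummy register $C_0$ introduces.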

Protecting Da-zhi's algorithm using a unifier register
As described previously, the first step in attacking Da-zhi's algorithm is to determine the positions of the exponent's binary string at which a dummy operation is executed. To avoid this situation, it is necessary to unify all of the algorithm's registers $C_0, \ldots, C_3$ so that no dummy operation is performed.
We realized that using the register $C_0$ in an independent manner can permit the unification of the registers; its value can later be allocated to any other register used by algorithm 3. Because the $C_0$ register must be made independent, a first approach to solving this problem involved the technique called atomicity, proposed in [24]. As mentioned previously, this technique requires that each iteration of an algorithm be equal in time and power consumption to the others. The implementation of this concept alone does not solve the problem, but it allows us to work with the $C_0$ register separately.
Algorithm 4 shows the atomicity technique applied to Da-zhi's algorithm.
1: Input m, d, N with …

Algorithm 4 uses all of the registers to calculate the exponentiation result correctly. Thus, it does not matter which register is altered, as the final result will always be incorrect. This is an advantage over the original algorithm, in which a fault placed at $C_0$ left the final result unaltered. It should also be noted that the $C_0$ register is used in every iteration of algorithm 4, which is an important feature of this algorithm. Despite these characteristics, the algorithm is insecure against Jacobi symbol attacks. This vulnerability will be explained in the next section.
Once a way was found to manipulate $C_0$ independently, algorithm 4 was modified to make it safe against FA and JS attacks, thus obtaining algorithm 5, where ⊕ represents the exclusive-OR (XOR) binary operation. This algorithm avoids the use of dummy operations by interconnecting all the registers used in its execution, meaning that a dummy operation is converted into an operation that alters the final result of the calculation.
Algorithm 5 Da-zhi's algorithm with atomicity and the unifier register.
1: Input m, d, N with …

Algorithm 5 uses more registers than algorithm 3, since it adds the $C_4$ register. However, this increase is minimal, and the modification provides security for any information system in which the idea can be implemented. In other words, a small amount of memory is sacrificed in exchange for a gain in security. We now describe how algorithm 5 avoids the dummy operation. When a fault is injected into $C_0$, it modifies the value of this register, and such an erroneous value always disturbs the final result. To spread the disturbance, $C_0$ is multiplied into $C_1$ and $C_4$ when $d_{i,1} = d_{i,2} = 0$. Therefore, $C_0$ is no longer multiplied by itself (the importance of this fact is described in the next section). This means that when an FA is injected into $C_0$, the erroneous value joins another register during the algorithm's execution and, as a result, alters all of the values involved in the calculation of the exponentiation. Thus, $C_0$ becomes a unifier register: the registers are unified, and an erroneous value in $C_0$ disturbs the final result of the operation, thereby eliminating the dummy operations from the algorithm's execution.
The $C_4$ register holds a multiplicative inverse of $C_0$, which eliminates the $C_0$ value from $C_1$ at line 12 of algorithm 5. Note also that only one register is protected with the value of $C_0$; it was not necessary to protect all of the remaining registers.
The register selection differs from that of algorithm 3. In this algorithm, when $d_{i,1} = d_{i,2} = 0$, the registers $C_1$ and $C_4$ are used; when $d_{i,1} = 1$ and $d_{i,2} = 0$, $C_3$ is used; when $d_{i,1} = 0$ and $d_{i,2} = 1$, $C_2$ is used; and finally, when $d_{i,1} = 1$ and $d_{i,2} = 1$, $C_1$ is used. $C_0$ is used in all of the bit combinations: first, $C_0$ is calculated; then, the register corresponding to the combination is used. This algorithm is only slightly more complex than the original one. The first step in its design was to determine the expressions for $b$ and $k$ that make the algorithm perform three multiplications when $d_{i,1} = d_{i,2} = 0$ and two multiplications in any other case. Both formulas were obtained with the help of Karnaugh maps.
The unifier register is not limited to Da-zhi's algorithm. It can be used to protect against FA any algorithm that performs dummy operations during its execution.

Analysis of the displayed algorithm
The Jacobi symbol has been used to attack many exponentiation algorithms. The general form of the attack consists of injecting an FA during the algorithm's execution, calculating the JS of the obtained result, checking whether the JS is 1 or −1, repeating the same attack many times, and comparing the JS values obtained from those attacks. In this way, an attacker can obtain the secret key's binary string.
To verify the security of Da-zhi's algorithm modified with the atomicity technique (algorithm 4) and to compare it with our algorithm implemented with the unifier register (algorithm 5), Jacobi symbol attacks were mounted against both. The attack is described below.
When we performed safety tests of algorithm 4 with respect to the JS, we found that it resists common FA but remains open to attacks based on the JS. If a fault is placed in $C_0$ when $d_{i,1} = d_{i,2} = 0$, the next operation, in iteration $i + 1$ of the algorithm, is $C_0 = C_0 \cdot C_0$. Hence, if the erroneous value had a JS equal to $-1$ in iteration $i$, the JS changes to 1 in iteration $i + 1$. The other three combinations of $d_{i,1}$ and $d_{i,2}$ do not exhibit this behavior because the JS $= -1$ of $C_0$ always affects one of the other three registers in the next step, namely the register with index $2kd_{i,2} + kd_{i,1}$ (where $2kd_{i,2} + kd_{i,1} \neq 0$ in this case). Based on these observations, if a fault is placed when $d_{i,1} = d_{i,2} = 0$, the algorithm will always have a JS $= 1$ in the output value.
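The sign behavior described in this paragraph follows from the multiplicativity of the Jacobi symbol. As a short worked step, assuming $\gcd(C_0, N) = 1$ and writing $C_j$ for any of the other registers (a notation introduced here only for illustration):

```latex
\left(\frac{C_0 \cdot C_0}{N}\right)
  = \left(\frac{C_0}{N}\right)^{2}
  = (-1)^{2} = 1,
\qquad
\left(\frac{C_j \cdot C_0}{N}\right)
  = \left(\frac{C_j}{N}\right)\left(\frac{C_0}{N}\right)
  = -\left(\frac{C_j}{N}\right).
```

Squaring a faulty $C_0$ therefore erases a negative symbol, whereas multiplying it into another register carries the sign change forward.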
Although the implementation of the atomicity technique over Da-zhi's algorithm is useful and ingenious, it is vulnerable to attacks that use the JS. The algorithm implemented with the unifier register (algorithm 5) does not present vulnerabilities against these types of attacks because it does not perform the operation $C_0 = C_0 \cdot C_0$. Thus, a negative JS in $C_0$ is unified with the other registers. In the same manner that an FA can be propagated through the algorithm's execution, a JS value can be propagated through the calculations and can disturb any final result (as explained previously). This prevents the identification of any specific bit combination, as all of them alter the JS of the final result. For this reason, an algorithm with a unifier register has a better security level.
Because the algorithm has an atomized behavior, it is a regular algorithm and is secure against simple side channel attacks.
The version of Da-zhi's algorithm presented here has different characteristics from the original. Table 1 shows a summary of the characteristics of these algorithms.
In table 1, multiplications and squarings are both counted as multiplications. In the column "number of variables", $d_1$, $d_2$, $k$, $i$, $m$, $d$ and $N$ are not counted because they are inherent in modular exponentiation and are not related to its implementation.
Many modular exponentiation algorithms execute more calculations the more 1's the binary form of the exponent has. For this reason, binary representations with fewer 1's are sought. The non-adjacent form (NAF) [28] is a signed binary representation that can be used to reduce the number of nonzero digits in a binary string. NAF has two main characteristics. First, the number of nonzero digits is $n/3$ on average, where $n$ is the bit length. Second, this form can require $n + 1$ digits. For example, $7 = (111)_2$ has the NAF $(1, 0, 0, -1)$, which needs four digits instead of three.

The last characteristic can be a disadvantage when considering that many crypto devices that execute modular exponentiation algorithms are designed to work with a specific bit length of the exponent (the secret key). This means that they are designed to operate with 1024 bits, not 1025. In other words, an electronic device that executes a cryptographic algorithm would have to be redesigned to work properly with signed binary representations, which is very expensive. Thus, it is important to note that the algorithm modified with the unifier register is not compatible with signed binary representations such as NAF. As stated previously, algorithm 5 performs three operations when the bit combination is $d_{i,1} = d_{i,2} = 0$. Therefore, the more bits equal to 0 there are, the more operations are performed and the longer the execution time. In this case, it is not necessary to search for alternative binary representations that minimize the number of 1's. This can be considered an advantage over other algorithms because it can work more efficiently with unsigned binary representations. Hence, it is consistent with crypto devices that use standard bit lengths.
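The two NAF properties mentioned above (fewer nonzero digits, at most one extra digit) can be observed with a short routine. This is a minimal sketch of the standard NAF recoding, not code from the paper; the function name `naf` and the least-significant-first digit order are our own choices.

```python
def naf(n):
    """Return the non-adjacent form of n >= 0 as digits in {-1, 0, 1}, least significant first."""
    digits = []
    while n > 0:
        if n & 1:
            digit = 2 - (n % 4)   # +1 when n = 1 (mod 4), -1 when n = 3 (mod 4)
            n -= digit            # makes n divisible by 4, so the next digit is 0
        else:
            digit = 0
        digits.append(digit)
        n //= 2
    return digits


value = 7
recoded = naf(value)
print(recoded[::-1], "digits:", len(recoded), "bits:", value.bit_length())
```

The extra digit is the point made in the text: a device built for a fixed exponent length cannot store the recoded key without modification.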

Conclusions
In this work, we have presented a way to protect right-to-left algorithms that use dummy operations in their execution. When the unifier register is used, the algorithm's security is greater than that of unprotected algorithms. The protected algorithms can be incompatible with signed binary representations, such as NAF. This means that they are more efficient when the binary string has few bits equal to 0. However, this characteristic can be seen as an advantage when the implementation targets crypto devices that use standard key bit lengths to execute an encryption algorithm. Therefore, it is not necessary to change the hardware to use the unifier register, which may be a very important property.
In the specific case of Da-zhi's algorithm, this simple modification prevents the bits of the binary string that represents the secret key from being obtained as described in [23].
Additionally, this algorithm is secure against simple side channel attacks due to atomicity.
1: Input m, d, N, with d …

In the attack on algorithm 3, to use the register $C_2$, it is necessary that $d_{i,1} = 0$ and $d_{i,2} = 1$. To use the register $C_3$, it is necessary that $d_{i,1} = 1$ and $d_{i,2} = 1$. For both registers, the bit $d_{i,2}$ in iteration $i$ is always equal to 1; therefore, it is possible to determine that all of the unknown bits of $d_2$ are equal to 1.

Table 1. Characteristics of the modified algorithms.