A study of hardware implementations of the CRC computation algorithms

This paper examines hardware implementations of CRC computation algorithms. It considers combinational circuits for CRC8 and CRC32 computation devices that can be embedded in satellites for error checking in configuration memory and data transmission modules. It concludes that the hardware implementation of the matrix-driven algorithm has the advantage of a simple circuit built using only «exclusive OR» logic. Examples of the CRC8 and CRC32 devices operating on different input data under a parametric model are presented.


Introduction
In aerospace and defense electronics, telecommunications, and control systems, it is very important to be able to verify that the configuration data in on-board Field-Programmable Gate Arrays (FPGAs) contains no errors. Ionizing radiation in space can cause unwanted switching of memory cells in the FPGA, which contains configuration memory, user memory, and registers. Such switching may cause equipment failure, which is critical in outer space. CRC (cyclic redundancy code) is a checksum computation algorithm that is applied in various data transmission standards, compression algorithms, coding standards, and bitmaps [2,3]. CRC checksums can be used to detect errors and failures in configuration memory and data transmission modules. There are different types of CRC, such as CRC8, CRC16, and CRC32, which differ in the length of the generator polynomial [1] and, accordingly, in the length of the checksum. Software implementations of the table-driven and matrix-driven algorithms were described in [4,5], which showed that the advantage of the matrix-driven implementation lies in its low memory requirements. This paper proposes asynchronous hardware implementations of CRC8 and CRC32 for FPGAs that can be embedded in satellites. The implementation simplifies the device circuit and applies the features of the matrix-driven algorithm in hardware.
According to the table-driven CRC computation algorithm [4], a data byte is summed modulo 2 with the checksum byte (the initial value of the checksum is set by the parametric model of the algorithm [4]). The result of the addition is then used as an index into a table filled with the remainders of all byte combinations divided by the generator polynomial. The value read from the table is the CRC8 checksum. The CRC32 computation algorithm is similar to CRC8, with a few differences.
In this case, the lengths of the CRC, the generator polynomial, and the table elements each increase to 32 bits. The hardware implementation of CRC32 requires an additional circuit input, init[31..0]. The 32-bit initial value of CRC32 is supplied to the init input according to the parametric model. The low byte of init (bits 7-0) is summed modulo 2 with the data byte. Then, as in CRC8, the table element is selected and summed modulo 2 with the three high bytes of init (bits 31-8), shifted right by 8 bit positions. Figure 3 shows the functional diagram of the CRC32 computation device.
In the table-driven CRC32 implementation, additional «exclusive OR» logic is added to the combinational circuit and the length of the constants increases. By contrast, the hardware implementation of the matrix-driven CRC computation algorithm avoids multiplexers and storage, and its combinational circuit is built using only «exclusive OR» logic.
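The table-driven byte steps described above can be sketched in software as follows. This is a minimal Python sketch under stated assumptions, not the circuit from the paper: the function names are hypothetical, and the generator polynomials (0x07 for CRC8, the bit-reflected form 0xEDB88320 for CRC32) are assumed values, since the parametric models of tables 1 and 2 are not reproduced in the text.

```python
def make_crc8_table(poly=0x07):
    """Remainders of all 256 byte values divided by the generator polynomial.

    poly=0x07 is an assumed polynomial (most significant bit first).
    """
    table = []
    for byte in range(256):
        rem = byte
        for _ in range(8):
            rem = ((rem << 1) ^ poly) & 0xFF if rem & 0x80 else (rem << 1) & 0xFF
        table.append(rem)
    return table


def crc8_table_step(crc, byte, table):
    """One CRC8 step: XOR the data byte with the checksum, then index the table."""
    return table[crc ^ byte]


def make_crc32_table(poly_reflected=0xEDB88320):
    """Same idea with 32-bit table elements (assumed reflected polynomial)."""
    table = []
    for byte in range(256):
        rem = byte
        for _ in range(8):
            rem = (rem >> 1) ^ poly_reflected if rem & 1 else rem >> 1
        table.append(rem)
    return table


def crc32_table_step(init, byte, table):
    """One CRC32 step on a 32-bit init value and one data byte."""
    index = (init & 0xFF) ^ byte       # low byte of init XOR data byte
    return table[index] ^ (init >> 8)  # table element XOR bits 31-8 shifted right by 8


if __name__ == "__main__":
    t8 = make_crc8_table()
    t32 = make_crc32_table()
    print(hex(crc8_table_step(0x00, 0xAA, t8)))
    print(hex(crc32_table_step(0xFFFFFFFF, 0xAA, t32)))
```

In hardware, the 256-entry table corresponds to the constants and the multiplexer that selects among them, which is the cost the matrix-driven variant avoids.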

Hardware implementation of the matrix-driven algorithm
A feature of the matrix-driven CRC computation algorithm is the multiplication modulo 2 of a vector (the input byte) by a matrix [6] calculated from the generator polynomial. This feature allows the matrix-driven algorithm to be implemented as a combinational circuit without storage, using only «exclusive OR» logic. Figure 5 shows the combinational circuit for CRC8 computation by the matrix algorithm. The circuit contains no multiplexers or constants: the CRC is computed without storage elements using only «exclusive OR» logic, which is the distinguishing feature of the matrix algorithm's hardware implementation. The combinational circuit for the CRC32 matrix algorithm is built similarly, but the length of the matrix values increases to 32 bits, which increases the number of «exclusive OR» logic elements.
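The reason the circuit reduces to «exclusive OR» gates is that the CRC remainder is linear over GF(2): the remainder of any byte is the XOR of the remainders of its individual set bits. The following Python sketch illustrates this for CRC8. The polynomial 0x07 and all function names are assumptions made for illustration, and the matrix construction shown here is one straightforward way to derive it, not necessarily the procedure of [6].

```python
def crc8_bitwise(byte, poly=0x07):
    """Reference remainder of one byte (assumed polynomial, MSB first)."""
    rem = byte
    for _ in range(8):
        rem = ((rem << 1) ^ poly) & 0xFF if rem & 0x80 else (rem << 1) & 0xFF
    return rem


def make_crc8_matrix(poly=0x07):
    """Row i is the remainder of the unit vector that has only bit i set."""
    return [crc8_bitwise(1 << i, poly) for i in range(8)]


def crc8_matrix_step(crc, byte, matrix):
    """Multiply the input vector by the matrix modulo 2 (XOR of selected rows)."""
    vector = crc ^ byte
    out = 0
    for i in range(8):
        if (vector >> i) & 1:
            out ^= matrix[i]
    return out


def xor_equations(matrix):
    """For each output bit j, list the input bits that feed its XOR gate tree."""
    return {j: [i for i in range(8) if (matrix[i] >> j) & 1] for j in range(8)}


if __name__ == "__main__":
    m = make_crc8_matrix()
    for out_bit, in_bits in xor_equations(m).items():
        terms = " xor ".join(f"in[{i}]" for i in in_bits)
        print(f"out[{out_bit}] = {terms}")
```

Each printed equation corresponds to one output wire of a purely combinational circuit; no table, multiplexer, or storage element appears in it.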
The developed combinational circuits are packaged as functional blocks (figures 1, 3), each of which computes the CRC for one data byte. The CRC of a data packet is computed by connecting functional blocks in series. Figure 6 shows the CRC8 matrix hardware implementation as part of an emergency protection system. Each subsequent byte is summed modulo 2 with the previously computed CRC, and the result is supplied to the input of the next CRC8 block.
Thus, the asynchronous hardware implementation of the matrix algorithm has an advantage over the table-driven one: the circuit is simpler and contains no multiplexers or storage elements.
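The series connection of functional blocks described in this section can be modeled in software as a chain of pure functions, one per data byte. The sketch below is illustrative only: the polynomial 0x07, the names crc8_block and crc8_packet, and the default initial value are assumptions, not details taken from figure 6.

```python
from functools import reduce

POLY = 0x07  # assumed CRC8 generator polynomial (MSB first)


def unit_remainder(bit, poly=POLY):
    """Remainder of the byte that has only the given bit set."""
    rem = 1 << bit
    for _ in range(8):
        rem = ((rem << 1) ^ poly) & 0xFF if rem & 0x80 else (rem << 1) & 0xFF
    return rem


# Eight 8-bit matrix rows, fixed once the polynomial is chosen.
MATRIX = [unit_remainder(i) for i in range(8)]


def crc8_block(block_input):
    """One combinational block: XOR of the matrix rows selected by the input bits."""
    out = 0
    for i, row in enumerate(MATRIX):
        if (block_input >> i) & 1:
            out ^= row
    return out


def crc8_packet(packet, init=0x00):
    """Chain one block per byte: each byte is XORed with the previous CRC
    and the result drives the input of the next block."""
    return reduce(lambda crc, byte: crc8_block(crc ^ byte), packet, init)


if __name__ == "__main__":
    print(hex(crc8_packet(bytes([0xAA, 0x31]))))
```

Because every block is stateless, the chain for a fixed-length packet is itself one larger combinational circuit, which matches the asynchronous character of the design.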

Testing of the hardware implementations
To verify the operability of the developed devices, CRC computation was simulated on a Cyclone FPGA for different input data. Table 1 presents the parametric model for CRC8 computation [4]. Figure 7 shows the results of CRC8 computation for the table-driven and matrix-driven algorithms. In a hardware implementation of the CRC algorithms, as in a software implementation, the parametric model can be changed through the Poly, Init, RefIn, RefOut, and XorOut parameters. Unlike in software, however, each parameter in hardware requires FPGA I/O pins, which makes run-time control of the parametric model impractical. Usually a fixed parametric model is selected for a hardware device. In the developed devices only the Init parameter can be changed. Accordingly, the values 0xAA and 0x31 are supplied to the data input of the device under test, and the values 0x00 and 0xFF are supplied to the init input. The other values of the parametric model do not change. Figure 8 shows the result of CRC8 computation in software for the same parametric model.
Table 2 presents the parametric model for CRC32 computation [4]. Figure 9 shows the results of CRC32 computation for the table-driven and matrix-driven algorithms using this parametric model. Figure 10 shows the results of CRC32 computation in software with the same parametric model.
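A software reference for such a cross-check can be written directly from the parametric model. The following Python sketch implements a generic bitwise CRC driven by the Poly, Init, RefIn, RefOut, and XorOut parameters named above. The class and function names are hypothetical, and the two parameter sets are commonly published CRC-8 and CRC-32 configurations used here as assumptions; they are not claimed to match tables 1 and 2.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class CrcModel:
    width: int     # checksum length in bits (8 or more in this sketch)
    poly: int      # generator polynomial without the top bit
    init: int      # initial register value
    ref_in: bool   # reflect each input byte
    ref_out: bool  # reflect the final register
    xor_out: int   # value XORed into the final result


def reflect(value, bits):
    """Reverse the order of the lowest `bits` bits of value."""
    out = 0
    for i in range(bits):
        if (value >> i) & 1:
            out |= 1 << (bits - 1 - i)
    return out


def crc(model, data):
    """Bitwise CRC of a byte sequence under the given parametric model."""
    mask = (1 << model.width) - 1
    top = 1 << (model.width - 1)
    reg = model.init & mask
    for byte in data:
        if model.ref_in:
            byte = reflect(byte, 8)
        reg ^= byte << (model.width - 8)
        for _ in range(8):
            reg = ((reg << 1) ^ model.poly) & mask if reg & top else (reg << 1) & mask
    if model.ref_out:
        reg = reflect(reg, model.width)
    return reg ^ model.xor_out


# Assumed parameter sets, for illustration only.
CRC8 = CrcModel(8, 0x07, 0x00, False, False, 0x00)
CRC32 = CrcModel(32, 0x04C11DB7, 0xFFFFFFFF, True, True, 0xFFFFFFFF)

if __name__ == "__main__":
    # Same stimulus pattern as in the text: two data bytes, two Init values.
    for data in (0xAA, 0x31):
        for init in (0x00, 0xFF):
            model = replace(CRC8, init=init)
            print(f"data={data:#04x} init={init:#04x} crc8={crc(model, bytes([data])):#04x}")
```

Only the init field is varied in the loop, mirroring the hardware devices, in which Init is the only parameter exposed on pins.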

Conclusion
This paper discussed hardware implementations of two CRC computation algorithms: the table-driven algorithm, which is widely used in data transmission protocols, and the matrix-driven algorithm, which significantly simplifies the combinational circuit of the CRC computing device. A CRC computing device based on a Cyclone FPGA, which can be used to check satellite configuration memory and data transmission modules, was developed and tested following the parametric model described by R. Williams [4]. The combinational circuit of the matrix-driven hardware implementation is simpler than that of the table-driven one and requires neither complex combinational devices such as multiplexers nor storage elements for constants. In this case the CRC computation circuit is built using only «exclusive OR» logic, which is an advantage in hardware implementation. This allows an FPGA-based CRC computation device to be embedded in satellite data transmission and configuration modules while saving hardware resources for other important modules.