Research on robust audio zero watermarking algorithm based on discrete cosine transform

This paper exploits the characteristics of the discrete cosine transform (DCT), combined with "zero embedding", to improve the robustness of the watermarking algorithm and to balance the trade-off between robustness and transparency.


Watermark generation
This algorithm uses a binary image of size $a \times b$ as the watermark. First, the image is converted into a one-dimensional sequence of length $M = a \times b$. This sequence is then scrambled to obtain the embedded watermark sequence $Q$. Here $\mathrm{Perm}(\cdot)$ is the scrambling function, $M$ is the watermark length, and key1 is the scrambling key; key1 is saved for watermark extraction.
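The watermark generation step can be sketched as follows. This is a minimal illustration, not the authors' implementation: the source does not specify the form of Perm, so the sketch assumes a pseudo-random permutation seeded by key1. The names `flatten_image`, `perm`, and `inverse_perm` are hypothetical.

```python
import random

def flatten_image(image):
    """Row-major flattening of an a x b binary image into a list of M = a*b bits."""
    return [bit for row in image for bit in row]

def perm(sequence, key1):
    """Scramble `sequence` with a permutation seeded by key1 (assumed form of Perm)."""
    order = list(range(len(sequence)))
    random.Random(key1).shuffle(order)
    return [sequence[i] for i in order]

def inverse_perm(scrambled, key1):
    """Undo `perm` by regenerating the same permutation from key1."""
    order = list(range(len(scrambled)))
    random.Random(key1).shuffle(order)
    restored = [0] * len(scrambled)
    for position, source_index in enumerate(order):
        restored[source_index] = scrambled[position]
    return restored

# Toy 2 x 3 binary image, so M = 6.
image = [[1, 0, 1], [0, 0, 1]]
Q = perm(flatten_image(image), key1=2024)
```

Because the permutation is fully determined by key1, the extractor only needs key1 to restore the original bit order.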

Watermark embedding process
First, the algorithm segments the original audio, applies the DCT to each segment, and extracts the DC coefficient (DCC) of each segment. The watermark is then scrambled, and the DC coefficients are sorted by absolute value in descending order. The sign (positive or negative) of a DC coefficient in the DCT domain is not easily changed, and the signs of the DC coefficients with large absolute values are the most stable. Therefore, to improve the robustness of the algorithm, the signs of the ranked DC coefficients are extracted in order, giving a binary feature sequence with the same length as the watermark. This feature sequence is XORed with the watermark bits, and the result is stored as a key for watermark detection, which realizes the embedding of the zero watermark [1,4].
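To illustrate why the sign of the DC coefficient is a stable feature, the sketch below computes an orthonormal DCT-II directly from its definition and compares the DC term of a frame before and after a small perturbation. The frame values and the perturbation are made up for illustration, and the choice of the orthonormal DCT-II is an assumption, since the source does not state which DCT variant is used.

```python
import math

def dct2(frame):
    """Orthonormal DCT-II of one frame, written out directly from its definition."""
    n_samples = len(frame)
    coeffs = []
    for k in range(n_samples):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n_samples)
        total = sum(x * math.cos(math.pi * (2 * n + 1) * k / (2 * n_samples))
                    for n, x in enumerate(frame))
        coeffs.append(scale * total)
    return coeffs

# Hypothetical frame and a small additive disturbance standing in for an attack.
frame = [0.40, 0.35, 0.42, 0.38]
noisy = [x + e for x, e in zip(frame, [0.02, -0.03, 0.01, -0.02])]

dc_clean = dct2(frame)[0]   # DC term = sum(frame) / sqrt(len(frame))
dc_noisy = dct2(noisy)[0]
same_sign = (dc_clean >= 0) == (dc_noisy >= 0)
```

The DC term is proportional to the sum of the samples in the frame, so a disturbance must shift the frame sum across zero to flip its sign; this is harder the larger the absolute value of the coefficient.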

Embedding algorithm
Assume that the original audio signal $S$ has $N$ sampling points and that each frame of audio contains $L$ sampling points, so the total frame count is $U = \lfloor N/L \rfloor$. To ensure that the audio is long enough to embed the whole watermark, $U \ge M$ is required. The embedding steps for the watermark are as follows:
(1) Segment the original audio $S$ into frames and apply the DCT to each frame to obtain its DCT-domain coefficients.
(2) Extract the DC coefficient of each frame.
(3) Sort the DC coefficients by absolute value in descending order. Here $\mathrm{sort}(\cdot)$ is the sort function, DESC denotes descending order, and the elements of key2 are the serial numbers of the sorted coefficients; key2 is kept as a key for watermark extraction.
(4) Extract the sign (positive or negative) of each ranked DC coefficient in $F_{DC}$ to generate a binary feature sequence.
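The embedding steps can be sketched in a few lines. This is a hedged illustration under stated assumptions: the DC coefficient is taken from an orthonormal DCT-II (for which it equals the frame sum divided by $\sqrt{L}$), a non-negative sign is mapped to bit 1, and key2 keeps only the first M ranked frame indices. None of these details is confirmed by the source, and the function names are hypothetical.

```python
import math

def dc_coefficient(frame):
    """DC (k = 0) coefficient of the orthonormal DCT-II of one frame."""
    return sum(frame) / math.sqrt(len(frame))

def sign_features(audio, frame_len, num_bits):
    """Return (key2, features) for the num_bits frames with the largest |DC|."""
    num_frames = len(audio) // frame_len              # U = floor(N / L)
    if num_frames < num_bits:
        raise ValueError("audio too short: need U >= M")
    dcs = [dc_coefficient(audio[u * frame_len:(u + 1) * frame_len])
           for u in range(num_frames)]
    # Frame indices ranked by |DC| in descending order; kept as key2.
    key2 = sorted(range(num_frames), key=lambda u: abs(dcs[u]), reverse=True)[:num_bits]
    # Assumed sign-to-bit mapping: non-negative -> 1, negative -> 0.
    features = [1 if dcs[u] >= 0 else 0 for u in key2]
    return key2, features

def embed_zero_watermark(audio, watermark_bits, frame_len):
    """XOR the sign features with the (scrambled) watermark; the audio is not modified."""
    key2, features = sign_features(audio, frame_len, len(watermark_bits))
    detection_key = [f ^ w for f, w in zip(features, watermark_bits)]
    return key2, detection_key

# Toy signal: 4 frames of length 2, 3-bit watermark.
audio = [0.5, 0.5, -1.0, -1.0, 0.1, 0.1, -0.2, -0.3]
key2, detection_key = embed_zero_watermark(audio, [1, 1, 0], frame_len=2)
```

Note that the function returns only keys; the audio samples are never written to, which is what makes the scheme a zero watermark.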

Watermark extraction
Watermark detection is the inverse process of watermark embedding and can be carried out according to the following steps:
(1) Let the audio signal under test be $S'$. Divide $S'$ into $U$ frames of $L$ sampling points each.
(2) Apply the DCT to each frame of audio to obtain $f_k$, and extract the DC coefficient of each frame.
(3) Using the key key2, obtain the feature sequence of the audio under test.
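A round trip through embedding and extraction can be sketched as follows. The final XOR with the stored detection key is an assumption inferred from extraction being the inverse of embedding; the source text does not spell that step out. The sign-to-bit mapping, the orthonormal DCT-II scaling, and all function names are likewise illustrative assumptions.

```python
import math

def frame_dcs(audio, frame_len):
    """DC coefficient (orthonormal DCT-II) of every complete frame."""
    num_frames = len(audio) // frame_len
    return [sum(audio[u * frame_len:(u + 1) * frame_len]) / math.sqrt(frame_len)
            for u in range(num_frames)]

def embed(audio, watermark_bits, frame_len):
    """Return (key2, detection_key) without modifying the audio."""
    dcs = frame_dcs(audio, frame_len)
    ranked = sorted(range(len(dcs)), key=lambda u: abs(dcs[u]), reverse=True)
    key2 = ranked[:len(watermark_bits)]
    detection_key = [(1 if dcs[u] >= 0 else 0) ^ w
                     for u, w in zip(key2, watermark_bits)]
    return key2, detection_key

def extract(test_audio, key2, detection_key, frame_len):
    """Recompute sign features at the key2 positions and XOR with the stored key."""
    dcs = frame_dcs(test_audio, frame_len)
    features = [1 if dcs[u] >= 0 else 0 for u in key2]
    return [f ^ k for f, k in zip(features, detection_key)]

original = [0.5, 0.5, -1.0, -1.0, 0.1, 0.1, -0.2, -0.3]
scrambled_watermark = [1, 1, 0]
key2, detection_key = embed(original, scrambled_watermark, frame_len=2)

# Hypothetical mild attack: amplitude scaling plus a small constant offset.
attacked = [0.9 * s + 0.01 for s in original]
recovered = extract(attacked, key2, detection_key, frame_len=2)
```

The recovered bits are still in scrambled order; applying the inverse of the scrambling function with key1 would restore the binary image.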

Simulation results and analysis
In the simulation experiments of this chapter, we use StirMark for Audio V02 to apply various processing operations to the watermarked music and speech, simulating the attacks that audio undergoes in practical applications [4]. The original audio signals are mono music and voice sections with a sampling rate of 1kHz and a length of 100s; a binary image is taken as the watermark, and the length of each frame is L=20. In this section, the transparency and robustness of the algorithm are simulated and analyzed. The original audio and the original image are as follows: Fig. 1. Original audio image.

Transparency test
We adopt the signal-to-noise ratio (SNR) as the measure of the quality of the watermarked audio signal. The formula for SNR is as follows:

$$\mathrm{SNR} = 10 \log_{10} \frac{\sum_k S(k)^2}{\sum_k \left[ S(k) - S'(k) \right]^2}$$

In the formula, $S(k)$ and $S'(k)$ respectively represent the part of the original audio signal in which the watermark is to be embedded and the corresponding part of the watermarked audio signal [2].
To demonstrate the feasibility of the audio watermarking scheme, the transparency test does not use the audio mentioned above as the original audio; instead, classical music, pop music, jazz, and rock music are used as the original files [1,3].
According to the test, the waveforms show almost no difference. In terms of signal-to-noise ratio, classical music gives the largest SNR value, 32.6863dB, and rock music the lowest, 29.5657dB.
Because this algorithm uses the zero-watermark embedding method, watermark embedding does not change the sampling-point values of the original audio. As a result, the quality of the audio is almost unaffected, and the signal-to-noise ratio calculated for this algorithm is infinitely large.
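The SNR computation can be sketched as below, assuming the usual definition $\mathrm{SNR} = 10 \log_{10} \left( \sum_k S(k)^2 / \sum_k [S(k) - S'(k)]^2 \right)$. The function name `snr_db` and the sample values are hypothetical. When the two signals are identical, as with a zero watermark, the noise power is zero and the function returns infinity.

```python
import math

def snr_db(original, watermarked):
    """SNR in dB between an original signal and its watermarked version."""
    signal_power = sum(s * s for s in original)
    noise_power = sum((s - t) ** 2 for s, t in zip(original, watermarked))
    if noise_power == 0:
        # No sample was changed, so the ratio is unbounded.
        return math.inf
    return 10 * math.log10(signal_power / noise_power)

# Zero watermarking leaves the samples untouched.
untouched = snr_db([1.0, -1.0, 1.0, -1.0], [1.0, -1.0, 1.0, -1.0])

# For contrast, a conventional scheme that perturbs one sample by 0.1.
perturbed = snr_db([1.0, -1.0], [0.9, -1.0])
```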

Robustness test
In the robustness simulation experiment of this algorithm, watermark extraction is performed on each kind of attacked audio and the restored watermark image is displayed. The NC and BER under the various attacks are calculated at the same time. To highlight the robustness of the algorithm, we compare it with the algorithm presented in "DCT Domain Audio Watermark: Watermark Algorithm and Non-Perceptual Test", published in the electronics newspaper by Wen Quan, Wang Shuxun, and Nian Guijun. Many experiments show that, to achieve better robustness, the bandwidth of the embedded watermark needs to be reduced [1]. So in this experiment, music and voice sections of 100s are used as the original audio signals, and a 100-bit binary sequence is used as the watermark. The parameters of the algorithm in the reference literature are set as follows: 441 sampling points per frame, $F(1)$ is divided into 100 sequences, the length of each sequence is 100, and the embedding strength is d=20. This algorithm sets the frame length to L=20. According to the formula

$$\mathrm{Bandwidth} = \frac{M}{N} \times F_s,$$

where $M$ is the watermark length, $N$ is the number of sampling points, and $F_s$ is the sampling rate, the bandwidth of this experiment can be calculated as Bandwidth=1bps.
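The robustness metrics and the bandwidth figure can be sketched as follows. The source does not give formulas for NC and BER, so the sketch assumes the common definitions: BER as the fraction of mismatched bits, and NC as the normalized correlation between the original and extracted bit sequences. Function names are hypothetical.

```python
import math

def ber(original_bits, extracted_bits):
    """Bit error rate: fraction of positions where the two sequences differ."""
    errors = sum(1 for a, b in zip(original_bits, extracted_bits) if a != b)
    return errors / len(original_bits)

def nc(original_bits, extracted_bits):
    """Normalized correlation (assumed definition) between two bit sequences."""
    cross = sum(a * b for a, b in zip(original_bits, extracted_bits))
    norm = (math.sqrt(sum(a * a for a in original_bits))
            * math.sqrt(sum(b * b for b in extracted_bits)))
    return cross / norm if norm else 0.0

def bandwidth_bps(num_bits, num_samples, sample_rate):
    """Embedding bandwidth M * Fs / N in bits per second."""
    return num_bits * sample_rate / num_samples

# Experiment settings from the text: 100 bits in 100 s of audio sampled at 1 kHz.
experiment_bandwidth = bandwidth_bps(num_bits=100, num_samples=100 * 1000, sample_rate=1000)

# Toy comparison with one flipped bit out of four.
toy_ber = ber([1, 0, 1, 1], [1, 0, 0, 1])
toy_nc = nc([1, 0, 1, 1], [1, 0, 1, 1])
```

Since $N / F_s$ is the audio duration in seconds, the bandwidth reduces to watermark bits per second of audio, which is why 100 bits over 100 s gives 1 bps regardless of the sampling rate.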

Summary
The simulation results show that, at the same embedding bandwidth, the algorithm achieves better robustness and better transparency under most attacks. The relationship between