Comprehensive quality evaluation of Scutellariae Radix based on HPLC fingerprint and antibacterial activity

Scutellariae Radix is a traditional Chinese medicine, but studies on its main active ingredients are limited. This study aimed to investigate quality differences among Scutellariae Radix from different origins based on chemical components and biological activity. Chromatographic fingerprints of 33 batches of Scutellariae Radix from different origins were established by HPLC, and antibacterial activity was determined with the microdilution method. Orthogonal partial least squares regression, Pearson correlation analysis and grey relational analysis were then used to explore the relationship between composition and bioactivity. In addition, an origin identification model was established to comprehensively evaluate the quality of Scutellariae Radix. The results showed that Scutellariae Radix had an in vitro antibacterial effect on Staphylococcus aureus, with the strongest effects in samples from Gansu and Shandong provinces. The multivariate statistical analyses consistently showed that three components, wogonin, baicalein and oroxylin, were positively correlated with antibacterial activity. In conclusion, the quality of Scutellariae Radix varies greatly among origins, and samples from Gansu and Shandong provinces were of better quality. This work provides a general model that combines the chromatographic fingerprint with a bioactivity assay to study spectrum-effect relationships, which could be used to discover the primary active ingredients in traditional Chinese medicines.


Introduction
etc. [2]. The main medicinal part of SR is the rhizome, and baicalin, baicalein, wogonin and wogonoside are its main active ingredients and quality control indicators [3]. In recent years, wild SR resources have declined because of overexploitation, making it difficult to meet market demand [4][5]. Blind introduction of SR without regard to ecological suitability has caused the quality of SR to vary greatly among producing areas. Therefore, it is necessary to evaluate the quality of SR from different geographical origins.
Because traditional Chinese medicine (TCM) acts through multiple components on multiple targets, it is difficult to evaluate its quality comprehensively and to identify its active ingredients. From a macro perspective, the chromatographic fingerprint can reveal the overall chemical characteristics of a TCM and, to a certain extent, overcome the one-sidedness of a single quality index [6][7]. However, it cannot by itself identify the main active ingredients or help to explore their pharmacological effects. The "spectrum-effect relationship" links fingerprint peaks to specific pharmacodynamic indexes, and this link can then be used to find the main effective components of a TCM and so reflect its internal quality [8][9]. Because this approach reflects the truly active ingredients and gives more comprehensive pharmacological information, it is better suited to the quality control of TCM. Several statistical methods have been applied to find the active components in spectrum-effect studies. For example, Liu et al. explored the spectrum-effect relationship between fingerprints and three pharmacological effects through grey correlation analysis and partial least squares regression, and found the main active components corresponding to the main pharmacological effects of Farfarae Flos [10]. At present, the application of this method to the quality evaluation of SR is limited.
To this end, using samples collected in China, the microdilution method was used to compare the antibacterial activity of SR from five main producing areas, and the chemical fingerprints of the samples were determined. Based on the "spectrum-effect relationship", the main components responsible for the antibacterial activity of SR were preliminarily explored. On this basis, an orthogonal partial least squares discriminant analysis (OPLS-DA) identification model for distinguishing different origins was established to comprehensively evaluate the quality of SR collected from different origins.

Materials and reagents
Thirty-three batches of SR from five producing areas, covering Gansu (GS), Inner Mongolia (NM), Shanxi (SX), Shandong (SD) and Shaanxi (SN) provinces, were collected and authenticated by Prof. Huibin Lin (Academy of TCM, Shandong, China). Table 1 shows the origins of the 33 batches. All batches were sampled by the quartering method specified in the Chinese Pharmacopoeia (Part IV, 2020 edition), and a quarter of each sample was retained. The remaining material was crushed, passed through a 20-mesh sieve and then screened with a 65-mesh sieve. The samples were weighed in proportion, bagged and labeled separately, and stored in a dry environment at room temperature before chromatographic analysis. Staphylococcus

Preparation of sample and standard solutions
SR powder (0.57 g), taken in proportion from the 24-65 mesh and 65 mesh sieve fractions, was accurately weighed and placed in a round-bottom flask containing 50 mL of boiled distilled water. After refluxing for 40 min, the subsequent filtrate was taken as the sample solution. The internal standard solution of puerarin (30 g·mL⁻¹) was mixed with the sample solution in equal volumes and filtered through a 0.45 μm membrane. All samples were stored in a refrigerator at 4 °C until analysis.

Methodology validation of fingerprint analysis
Precision was evaluated by six consecutive injections of the same sample solution, and repeatability by six replicate preparations of a sample from the same origin. For the storage stability test, the sample solution was analyzed at 0, 2, 6, 8, 12 and 24 h within one day. ChemPattern™ software was used to extract the fingerprint information and to evaluate the validation tests. After the relative peak areas and relative retention times were calculated, the relative standard deviation (RSD) values of the 19 peaks were all less than 3.00%, which confirmed that the method was suitable for fingerprint analysis.
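The RSD criterion used in the validation can be illustrated with a short Python sketch. The function name and the example values are hypothetical and are not taken from the study; the sketch assumes the sample standard deviation (n - 1 denominator), which the text does not specify.

```python
from statistics import mean, stdev

def rsd_percent(values):
    """Relative standard deviation (%) using the sample standard deviation."""
    m = mean(values)
    if m == 0:
        raise ValueError("RSD is undefined when the mean is zero")
    return stdev(values) / abs(m) * 100.0

# Hypothetical relative peak areas of one common peak over six injections.
six_injections = [1.02, 1.01, 1.03, 1.02, 1.00, 1.02]
passes = rsd_percent(six_injections) < 3.00  # acceptance limit stated in the text
```

In practice the same check would be repeated for the relative retention time and relative peak area of each of the 19 common peaks.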

Common peak identification of fingerprints
The retention times and peak areas of the chromatographic peaks of the 33 samples were recorded within 280 min. The chemical components of the common peaks were preliminarily assigned by comparing the chromatographic information with the reference substances and with literature data.

Experiments of antibacterial effects
In this study, a blank group, a positive drug group (amoxicillin), a bacterial solution control group and an administration group were established. The microdilution method was used to determine the antibacterial rate of the aqueous extract of each sample. The 96-well plates were incubated in a constant temperature and humidity incubator at 37 °C for 18-24 h, and the optical density (OD) of each well was then measured with an xMark microplate reader.
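The text does not give the formula used to convert OD values into an antibacterial rate. A commonly used form, shown here purely as an assumption, corrects both the treated and the bacterial control wells by the blank and expresses the reduction in growth as a percentage. The function and argument names are hypothetical.

```python
def antibacterial_rate(od_treated, od_control, od_blank):
    """Percent inhibition from blank-corrected OD values.

    ASSUMED formula, not stated in the source:
        rate = (1 - (OD_treated - OD_blank) / (OD_control - OD_blank)) * 100
    """
    growth_control = od_control - od_blank
    if growth_control <= 0:
        raise ValueError("control well shows no growth above the blank")
    return (1.0 - (od_treated - od_blank) / growth_control) * 100.0

# Hypothetical readings: treated well 0.3, bacterial control 0.9, blank 0.1.
example_rate = antibacterial_rate(0.3, 0.9, 0.1)
```

With these example readings the treated well retains a quarter of the control growth, so the rate is about 75%.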

Grey relational analysis
Grey correlation analysis determines the degree of correlation between factors from the similarity of their trends [11]. It can therefore be regarded as a simple and effective method for the comprehensive evaluation of spectrum-effect relationships. In this study, grey correlation analysis was carried out in MATLAB R2017a. The correlation coefficients between the independent variables (relative peak areas) and the dependent variable (antibacterial rate) were calculated with this grey relational analysis model.

Pearson correlation analysis
The Pearson correlation coefficient (r) was used as a measure of the linear relationship between variables [12]. Coefficients were calculated between the relative peak areas (X) and the antibacterial rates (Y). The analysis was performed with IBM SPSS 20.0 software.
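For reference, the coefficient itself can be computed in a few lines of Python. This is a minimal sketch of the standard definition, not the SPSS procedure used in the study, and the example data are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical relative peak areas of one peak and the matching antibacterial rates.
peak_area = [0.8, 1.1, 1.4, 0.9, 1.6]
rate = [41.0, 52.0, 60.0, 47.0, 66.0]
r = pearson_r(peak_area, rate)
```

A positive r indicates that larger relative peak areas go with higher antibacterial rates; significance testing (the P values reported later) is left to the statistics package.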

Orthogonal partial least squares regression analysis (OPLSR)
Regression is a statistical method for analyzing the relationship between dependent and independent variables, and it can also be used to predict the mean value of a dependent variable from the independent variables [13]. Orthogonal partial least squares regression (OPLSR) is well suited to datasets that are small and whose variables are highly correlated. In this study, the OPLSR model was established with SIMCA-P+ 14.1 software from the 19 common peaks and one pharmacodynamic index. The main components responsible for the antibacterial effect were identified from the variable importance in projection (VIP) values and the regression coefficients.
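To make the VIP idea concrete, the sketch below fits a plain PLS1 model with the NIPALS algorithm and computes VIP scores from the weights. It is a simplified stand-in and an assumption of this example: it is not the OPLSR algorithm implemented in SIMCA, it uses hypothetical data, and all function names are illustrative.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def autoscale(columns):
    """Center each column and scale it to unit sample standard deviation.
    Columns with zero variance are not handled in this sketch."""
    scaled = []
    for col in columns:
        m = sum(col) / len(col)
        sd = math.sqrt(sum((v - m) ** 2 for v in col) / (len(col) - 1))
        scaled.append([(v - m) / sd for v in col])
    return scaled

def pls1_vip(columns, y, n_components=1):
    """VIP scores from a PLS1 (NIPALS) fit; `columns` holds one list per X variable."""
    X = autoscale(columns)
    n, p = len(y), len(X)
    y_mean = sum(y) / n
    resid = [v - y_mean for v in y]          # centered response, deflated each round
    weights, explained = [], []
    for _ in range(n_components):
        w = [dot(col, resid) for col in X]   # weight direction ~ covariance with y
        norm = math.sqrt(dot(w, w))
        w = [v / norm for v in w]            # unit-length weight vector
        t = [sum(w[j] * X[j][i] for j in range(p)) for i in range(n)]  # scores
        tt = dot(t, t)
        q = dot(resid, t) / tt               # y loading
        loads = [dot(col, t) / tt for col in X]
        X = [[X[j][i] - t[i] * loads[j] for i in range(n)] for j in range(p)]
        resid = [resid[i] - q * t[i] for i in range(n)]
        weights.append(w)
        explained.append(q * q * tt)         # y variance explained by this component
    total = sum(explained)
    return [math.sqrt(p * sum(s * w[j] ** 2 for s, w in zip(explained, weights)) / total)
            for j in range(p)]

# Hypothetical data: three peaks, five samples; the first peak tracks y exactly.
peaks = [[1, 2, 3, 4, 5], [5, 3, 6, 2, 4], [2, 1, 2, 1, 2]]
rates = [1, 2, 3, 4, 5]
vip = pls1_vip(peaks, rates, n_components=1)
```

Because each weight vector has unit length, the squared VIP scores sum to the number of variables, which is why a threshold of 1.0 marks variables of above-average importance.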

Geographical origin traceability based on orthogonal partial least squares discriminant analysis
OPLS-DA is a linear supervised classification method based on the orthogonal partial least squares regression algorithm. Here it was used to assess how well SR samples from different geographical origins can be distinguished from the HPLC fingerprint dataset. In this method, latent variables with maximum covariance between the content matrix (X) and the class matrix (Y) are extracted, and samples are classified according to their scores. Y = 1 means that the sample belongs to a specific class, and Y = 0 means that it does not [14]. In this study, 28 samples were randomly assigned to the training set to build the model, and the remaining 5 samples were used as the test set for external validation. The model was established with internal 7-fold cross validation. Model stability was evaluated with the root mean square error of cross validation (RMSECV) and the root mean square error of prediction (RMSEP). Smaller values of these parameters indicate a better fit [15].
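The bookkeeping around the discriminant model (dummy coding of the class matrix, the 7-fold split of the 28 training samples, and the root mean square error) can be sketched as follows. This is an illustrative sketch only: the helper names, the random seed and the example labels are assumptions, and the OPLS-DA fit itself is not reproduced.

```python
import math
import random

def one_hot(labels):
    """Dummy-code class labels: Y = 1 for the sample's own class, 0 otherwise."""
    classes = sorted(set(labels))
    return classes, [[1 if lab == c else 0 for c in classes] for lab in labels]

def kfold_indices(n_samples, k=7, seed=0):
    """Shuffle sample indices and deal them into k cross-validation folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def rmse(y_true, y_pred):
    """Root mean square error; called RMSECV on held-out folds, RMSEP on the test set."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))

classes, Y = one_hot(["GS", "SD", "GS", "NM"])   # hypothetical origin labels
folds = kfold_indices(28, k=7)                   # 28 training samples, 4 per fold
```

Each fold is held out once while the model is fitted on the other six; pooling the held-out predictions gives RMSECV, and the five external test samples give RMSEP.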

Results of HPLC fingerprints
HPLC fingerprints of the 33 batches of SR were obtained after optimization of the chromatographic conditions. Nineteen common peaks were marked as P1-P19 in order of retention time. By comparison with reference standards, the peaks were assigned as follows: P5, cynaroside; P7, 5,7,8-trihydroxyflavone; P8, baicalin; P10, chrysin-7-O-glucuronide; P11, norwogonin-7-O-glucuronide; P13, wogonoside; P16, baicalein; P17, wogonin; P18, chrysin; and P19, oroxylin. Peaks P1-P4, P6, P9, P12, P14 and P15 were not identified. The HPLC fingerprints are shown in Fig. 1. The retention times and peak areas of the chromatographic peaks within 280 min were recorded for the 33 samples. To establish a mathematical model linking the fingerprints to the antibacterial effect, the relative peak areas of the HPLC fingerprints at the bacteriostatic concentration were calculated, as shown in Table 2. Method validation showed that the relative standard deviations (RSD) for precision and repeatability, along with the storage stability of the sample solutions within 24 h, were all less than 3.00%.

Results of antibacterial experiments
The microdilution method was used to determine the antibacterial rate of the aqueous extract of each sample. The results are shown in Table 3. The strongest antibacterial effects were mostly found in samples from GS and SD provinces, whereas samples from SN province had the weakest effects. In addition, different batches from the same origin had different antibacterial effects, indicating large differences in quality.

Results of grey relational analysis
In this study, grey correlation analysis was used to determine the contribution of each component of SR to the inhibitory effect on Staphylococcus aureus according to the degree of correlation. The antibacterial rate was taken as the parent sequence, and the peak value of each component as a subsequence. Because the original data have different units or dimensions, each sequence must be made dimensionless before the correlation analysis. This was done by mean-value transformation, after which the calculation was carried out [16][17][18]. The results are shown in Table 4. According to the principle of grey correlation, a component with a higher correlation degree has a greater influence on the antibacterial effect. A correlation degree greater than 0.9 indicates that the subsequence has a significant influence on the parent sequence; a value between 0.7 and 0.8 indicates an obvious influence; and a value below 0.6 indicates a very small effect. The components with a significant influence on the antibacterial effect included component 1, wogonin (17) and oroxylin (19). This provides a theoretical basis for a quality control model based on the spectrum-effect relationship.
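The procedure described above (parent sequence, mean-value transformation, correlation degree) can be sketched in Python using Deng's grey relational formula. The distinguishing coefficient rho = 0.5 is a conventional default and an assumption here, since the text does not state the value used; the function names and data are hypothetical and this is not the MATLAB program used in the study.

```python
def mean_normalize(seq):
    """Mean-value transformation: divide every element by the sequence mean."""
    m = sum(seq) / len(seq)
    return [v / m for v in seq]

def grey_relational_grades(parent, subsequences, rho=0.5):
    """Grey relational grade of each subsequence with respect to the parent."""
    x0 = mean_normalize(parent)
    subs = [mean_normalize(s) for s in subsequences]
    deltas = [[abs(a - b) for a, b in zip(x0, s)] for s in subs]
    flat = [d for row in deltas for d in row]
    d_min, d_max = min(flat), max(flat)      # global two-level minimum and maximum
    if d_max == 0:
        return [1.0] * len(subs)             # every subsequence matches the parent
    grades = []
    for row in deltas:
        coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in row]
        grades.append(sum(coeffs) / len(coeffs))
    return grades

# Hypothetical data: antibacterial rates (parent) and two peak-area sequences.
rates = [10.0, 20.0, 30.0]
peaks = [[1.0, 2.0, 3.0],    # same trend as the parent
         [3.0, 2.0, 1.0]]    # opposite trend
grades = grey_relational_grades(rates, peaks)
```

A subsequence that follows the parent exactly after normalization receives a grade of 1, and grades fall toward 0 as the trends diverge, which is the basis for the thresholds quoted in the text.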

Results of orthogonal partial least squares regression analysis
The relative peak areas of the 19 fingerprint peaks were normalized and used as the X matrix, and the antibacterial rate of each sample was taken as the Y variable. The standardized regression coefficient of each variable with respect to the dependent variable was obtained, as shown in Fig. 2 (A). Based on the regression coefficients, the OPLSR model was established and the regression equation (2) was obtained. A larger regression coefficient indicates that the independent variable contributes more to the dependent variable, and a positive coefficient means that the component is positively correlated with the antibacterial rate. The variable importance in projection (VIP) describes the explanatory power of an independent variable for the dependent variable [19]. Wold suggested that variables with VIP greater than 1.0 can be considered to contribute substantially to the dependent variable [20]. The results of the VIP analysis are shown in Fig. 2 (B).

Results of Pearson correlation analysis
In the bivariate correlation analysis, Pearson correlation coefficients were calculated between the relative peak areas of the 19 fingerprint peaks of the 33 batches of SR and the antibacterial efficacy values. The results are shown in Table 5.
Table 5 (fragment): -0.497*, 0.604**, 0.614**, -0.388*, 0.448**. Notes: "*" marks significance at P < 0.05 and "**" marks significance at P < 0.01.
Among the relative peak areas of the 19 fingerprint peaks, component 6, component 9, chrysin-7-O-glucuronide (10), component 12, wogonoside (13), component 14 and chrysin (18) were significantly correlated with the antibacterial rate (P < 0.05). Baicalin (8), norwogonin-7-O-glucuronide (11), component 15, baicalein (16), wogonin (17) and oroxylin (19) were very significantly correlated with the antibacterial rate (P < 0.01). In general, wogonin, baicalein and oroxylin could be used as evaluation indexes reflecting a positive bacteriostatic effect. Moreover, as shown in Fig. 3, the correlations between the twelve components and the antibacterial value were basically consistent with the results of the orthogonal partial least squares regression analysis.
The discriminant results for the training set and the prediction set were further evaluated. According to Table 6, the classification accuracy of the training set was 85%. Cross-misclassification occurred among NM, SN and SX, which indicates that the original HPLC data contained redundant information that affected the classification accuracy of the model. All samples in the prediction set were classified correctly, which shows that the model performed well in predicting unknown samples. In summary, the established OPLS-DA model had good learning ability and high prediction accuracy and was suitable for the origin identification of SR. As shown in Fig. 4, the potential factors were screened using OPLS-DA, and the first three potential factors cumulatively explained 77.73% of the chromatographic information. At this point, R² was 0.5782 (R² > 0.5), indicating a good fit of the established model. The RMSECV and RMSEP of the model were 0.2190 and 0.2927, respectively, suggesting that the discriminant error was within an acceptable range. In addition, 200 permutation tests (Fig. 5 A) were used to check the model for overfitting. The Q² intercept intersected the negative half of the Y-axis and had a value of 0.4509. The R² and Q² values generated by each permutation were lower than the original R² and Q² values, indicating that the model was not overfitted. The 3D scatter plot showed that GS and SD were clearly separated by origin, whereas NM overlapped to some degree with SN and SX. This result is also supported by the confusion matrix of the OPLS-DA model (Table 6). According to the VIP screening of feature variables (Fig. 5 B), wogonin, cynaroside, baicalein, 5,7,8-trihydroxyflavone and wogonin-7-O-glucuronide could be regarded as the key indicators for geographic origin identification.

Discussion
The HPLC method was originally developed to analyze Yinhuang granules, their raw medicinal materials (Scutellariae Radix and Lonicerae Japonicae Flos) and their extracts in a single run. In this method, the components of Lonicerae Japonicae Flos are detected in the first 95 min and those of SR after 95 min. Therefore, only the latter part of the chromatogram, starting at 95 min, is shown.
Selecting a reasonable data processing method is important in pharmacodynamic studies. Using several methods that verify one another makes the results more convincing. At present, the most commonly used data processing methods include grey correlation analysis, correlation analysis, cluster analysis, principal component analysis, partial least squares regression analysis and artificial neural networks. Grey relational analysis can reveal unknown information from known information and embodies a holistic view, so it is suitable for analyzing the complex components of traditional Chinese medicine [21]. At the same time, the characteristic peaks of traditional Chinese medicine fingerprints may be multicollinear, so ordinary multiple linear regression is not applicable. OPLSR, however, not only simplifies the data structure and the regression modeling but also has particular advantages for small datasets with multiple correlations between variables [22]. Therefore, OPLSR was used in this study to reflect the relationship between each component of SR and its bacteriostatic rate.
The results showed that baicalein, wogonin, chrysin, oroxylin, 5,7,8-trihydroxyflavone (norwogonin), baicalin, norwogonin-7-O-glucuronide, cynaroside, and components 1, 3, 6 and 14 contributed significantly to the antibacterial rate, which was consistent with the results of the correlation analysis. 5,7,8-Trihydroxyflavone (7), baicalein (16), wogonin (17) and oroxylin (19) were positively correlated with antibacterial activity. The material basis of SR for inhibiting Staphylococcus aureus was thus preliminarily determined, in agreement with the research of Xing et al. [23]. The least squares support vector machine (LS-SVM) method was used to establish a mathematical model, which could be used to predict the antibacterial rate (the pharmacodynamic index) from the HPLC fingerprint of SR. The OPLS-DA model based on HPLC data can effectively identify the origin of SR and provides a reliable method for its quality control.

Conclusion
In this study, the spectrum-effect relationship of SR was explored by combining the HPLC fingerprint with antibacterial activity. The results of OPLSR, grey correlation analysis and Pearson correlation analysis showed that baicalein, wogonin and oroxylin were the main effective antibacterial components. The structures and actions of P1, P2, P3, P4, P6, P9, P12, P14 and P15 still require further confirmation, as these components may also have antibacterial effects. In addition, an origin identification model was established that could be applied to other traditional Chinese medicines. The exact pharmacological mechanism of the active ingredients in SR will be studied in future work.