Design and implementation of an Agricultural IoT based on LoRa

To build a large-scale agricultural IoT, a sensor network based on LoRa is designed in place of the traditional Zigbee network. The data collected by the sensors is transmitted to a remote server for storage through GPRS, and the environmental data is displayed in a browser for users. Because the agricultural production environment is complex, the collected data is strongly affected by noise and cannot be analyzed directly. To solve this problem, time series analysis is used to model the raw data and obtain a prediction model, which can fill in or replace missing and abnormal data and can effectively predict future data. This provides a good data source for subsequent analysis. Experiments show that the system satisfies the design requirements and operates efficiently and stably.


Introduction
Agricultural Internet of Things (IoT) refers to the application of core Internet of Things technologies to agricultural production and management [1]. Combining the Internet of Things with agricultural production enables real-time monitoring of the environmental information that affects crop growth. It can meet the precise control and precise production needs of China's agricultural production and can also help address problems such as farmland environmental pollution and resource scarcity. Therefore, it is necessary to establish a simple, practical, reliable and stable agricultural IoT system [2].
Large-scale agricultural environment monitoring has quite different application and technology needs from existing small-scale farm environment monitoring. The technical problems include:
• Zigbee, which is frequently used in wireless sensor networks, has the disadvantages that its signal is easily disturbed by noise and its transmission distance is short [3][4]. When the monitoring range increases, the number of network nodes also increases, which makes the network structure more complex. Therefore, Zigbee is not suitable for large-scale agricultural monitoring.
• The monitoring environment of the agricultural IoT is complex. Noise from human, natural, hardware and other sources strongly affects the collected data, and noise-polluted data cannot be used directly for subsequent data analysis and data mining. Therefore, we need a fast, efficient, low-cost preprocessing method that cleans the collected data and provides a better data source for subsequent data analysis.
To solve the above problems, this paper designs an agricultural IoT system based on LoRa. To address the deficiencies of existing agricultural monitoring systems, this system improves on existing technologies in the following two aspects:
• LoRa technology is used instead of Zigbee to build a large-scale wireless sensor network and realize large-scale agricultural environment monitoring.
• Time series analysis is used to model the data. The resulting model can fill in or replace abnormal or missing data and can effectively predict future data. This provides a better data source for subsequent data analysis and other work.
The paper is organized as follows. The second section introduces the overall design of the agricultural IoT system. The third section introduces the design of the LoRa wireless sensor network. The fourth section describes the design of the Web-based remote monitoring platform. The fifth section presents the data prediction model based on time series analysis. Finally, the sixth section summarizes the paper.

The design of Agricultural IoT system
The demands of the Agricultural IoT include wide monitoring range, high scalability and low total cost. According to the demands, we design the system as follows.

Selection of wireless communication technology
Many wireless communication standards are currently on the market, such as Zigbee, Bluetooth, WiFi and GPRS, which can be divided into short-, medium- and long-distance wireless communication according to transmission distance. For these technologies, communication distance and transmission rate are proportional to each other, and power consumption is also proportional to communication distance.
The LoRa technology released by Semtech addresses this trade-off between transmission distance and power consumption: it achieves long-distance communication at low power consumption. In actual tests, the communication distance in an open area reached 5 km or more. This advantage in transmission distance makes it very simple to build a long-distance sensor network [5]. The LoRa network generally uses a star topology, which can hold tens of thousands of terminals and is therefore well suited to the agricultural Internet of Things [6]. Therefore, this paper uses LoRa instead of the traditional Zigbee to build the agricultural IoT.

Structure of the system
The structure of the agricultural IoT system based on LoRa is shown in figure 1. The system can be divided into three modules: the LoRa wireless sensor network, the GPRS communication network, and the Web-based remote monitoring platform. The LoRa wireless sensor network includes multiple LoRa terminal nodes and a few sink nodes.
Across a large area of farmland, each LoRa terminal node combines the data collected by its sensors with the geographic location and time information from the GPS module, and then sends the data packet to the LoRa sink node through the LoRa wireless network. The sink node aggregates the data and sends it to the remote monitoring platform through the GPRS module. The remote server receives the data sent by the sink node, stores it in the database, and finally displays it on the Web-based remote monitoring platform.

Design of terminal node
The terminal node structure is shown in figure 2. A variety of sensors collect temperature, humidity and other farmland environmental data. The Arduino master microcontroller processes and converts the environmental data collected by the sensors together with the location and time data from the GPS module. The LoRa wireless communication module transmits the processed data through the LoRa network. The Arduino control process has 4 steps:
• Power up the terminal node; the Arduino module initializes and defines the pins.
• The LoRa module initializes and sets the sending frequency and power.
• Data acquisition starts after the specified delay time has elapsed.
• The data packet is sent to the sink node through the LoRa module.
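The paper does not specify the packet layout, so the following Python sketch is only an illustration of how one reading might be packed before it is handed to the LoRa module. The field order, the field widths and the names `PACKET_FORMAT`, `build_packet` and `parse_packet` are assumptions, and the real node firmware runs on Arduino rather than Python.

```python
import struct

# Hypothetical packet layout (an assumption, not taken from the paper):
# node id, Unix time, latitude, longitude, temperature (deg C), humidity (%).
PACKET_FORMAT = "<HIffff"  # little-endian: uint16, uint32, four float32 values


def build_packet(node_id, timestamp, lat, lon, temperature, humidity):
    """Pack one reading into a fixed-size binary payload for the LoRa link."""
    return struct.pack(PACKET_FORMAT, node_id, timestamp, lat, lon,
                       temperature, humidity)


def parse_packet(payload):
    """Inverse of build_packet; returns the decoded fields as a dict."""
    node_id, timestamp, lat, lon, temperature, humidity = struct.unpack(
        PACKET_FORMAT, payload)
    return {
        "node_id": node_id,
        "timestamp": timestamp,
        "lat": lat,
        "lon": lon,
        "temperature": temperature,
        "humidity": humidity,
    }


# Example reading with made-up values.
payload = build_packet(7, 1525939200, 30.5, 114.25, 23.5, 61.0)
decoded = parse_packet(payload)
```

A fixed-size binary layout keeps the payload short, which matters on a low-rate link; a text format would be easier to debug at the cost of longer packets.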

Design of sink node
The LoRa sink node mainly includes the Arduino main control platform, the LoRa wireless communication module, the GPRS wireless communication module and the power module. The Arduino first controls the LoRa module to receive and process the information from the terminal nodes. It then controls the GPRS wireless communication module to send the collected environmental data, location information, and time information together to the server of the remote monitoring platform.
The Arduino control process in the sink node has 3 steps:
• Power up the sink node and initialize hardware such as the Arduino modules and the LoRa wireless communication module.
• Receive the data from each terminal node and integrate it.
• Send all data to the server of the remote monitoring platform through GPRS.
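The integration step is not detailed in the paper. As a rough sketch, assuming each received packet has already been decoded into a dictionary with hypothetical `node_id` and `timestamp` keys, the sink could group readings per node and serialize them into a single body for the GPRS uplink. The JSON format and the function names are illustrative assumptions.

```python
import json


def integrate_readings(readings):
    """Group decoded readings by node id and sort each group by timestamp."""
    by_node = {}
    for reading in readings:
        by_node.setdefault(reading["node_id"], []).append(reading)
    for group in by_node.values():
        group.sort(key=lambda r: r["timestamp"])
    return by_node


def build_upload_body(sink_id, readings):
    """Serialize all integrated readings into one JSON string for upload."""
    grouped = integrate_readings(readings)
    body = {
        "sink_id": sink_id,
        # JSON object keys must be strings, so node ids are converted.
        "nodes": {str(node_id): group for node_id, group in grouped.items()},
    }
    return json.dumps(body, sort_keys=True)


# Example with made-up readings from two terminal nodes.
sample = [
    {"node_id": 7, "timestamp": 200, "temperature": 23.9},
    {"node_id": 3, "timestamp": 150, "temperature": 22.1},
    {"node_id": 7, "timestamp": 100, "temperature": 23.5},
]
upload_body = build_upload_body(1, sample)
```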

Design of Web-based remote monitoring platform
The Web-based remote monitoring platform is the remote management and control center of the entire agricultural IoT system.
The remote monitoring platform is based on the B/S (Browser/Server) architecture. Its main functions are to display the latest data and to query historical data. To make queries convenient, data such as temperature and humidity can be queried by time period, such as the current day or the past week. Users can also use the calendar plug-in provided by the platform to query data for specific dates. The query page presents data in two ways, as a list and as a line chart, and the data can also be exported to Excel. The result of a humidity query is shown in figure 3.
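The paper does not describe how the period queries are implemented. The following sketch shows one plausible server-side filter, assuming readings are held as dictionaries with a hypothetical `time` field; the period names `"today"` and `"week"` and the function name are likewise assumptions. A production system would more likely express this as a database query.

```python
from datetime import datetime, timedelta


def query_period(readings, period, now):
    """Return readings inside a named period that ends at `now`."""
    if period == "today":
        start = now.replace(hour=0, minute=0, second=0, microsecond=0)
    elif period == "week":
        start = now - timedelta(days=7)
    else:
        raise ValueError("unknown period: " + period)
    return [r for r in readings if start <= r["time"] <= now]


# Example with made-up humidity readings.
now = datetime(2018, 5, 10, 15, 0)
history = [
    {"time": datetime(2018, 5, 10, 8, 0), "humidity": 61.0},
    {"time": datetime(2018, 5, 9, 23, 0), "humidity": 64.5},
    {"time": datetime(2018, 5, 1, 12, 0), "humidity": 58.2},
]
today_rows = query_period(history, "today", now)
week_rows = query_period(history, "week", now)
```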

Data preprocessing based on time series analysis
We carried out an atmospheric temperature monitoring experiment with the designed system in greenhouses and cropland on 5 farms in Wuhan. After a period of experiments, we found various kinds of abnormalities in the collected data. For example, we collected temperature data in a greenhouse for 125 hours with the sampling interval set to 10 minutes. We should have obtained 750 temperature readings, but only 735 were actually obtained, and some of them are obviously abnormal. These results show that the agricultural monitoring environment is complex and that environmental noise from various sources strongly affects the collected data. Raw environmental data with missing and abnormal values cannot be directly analyzed and mined, so data preprocessing is necessary. As the collected temperature data is a typical time series, we use time series analysis methods to process it [7][8]. The processing steps are discussed in the following sections.
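The expected count follows from 125 hours at 6 readings per hour, which gives 750, so 15 readings are missing. A minimal sketch of this check is shown below; it assumes, for illustration only, that timestamps are in seconds and fall exactly on the 10-minute grid, which real data may not do.

```python
def expected_samples(duration_hours, interval_minutes):
    """Number of readings expected over the given duration."""
    return duration_hours * 60 // interval_minutes


def find_missing(timestamps, start, interval, count):
    """Return the expected sample times that have no received reading."""
    received = set(timestamps)
    expected = (start + i * interval for i in range(count))
    return [t for t in expected if t not in received]


expected = expected_samples(125, 10)   # 125 h at one reading per 10 min
missing_count = expected - 735         # 735 readings were actually received

# Toy example: four expected slots, the third one was never received.
gaps = find_missing([0, 600, 1800], start=0, interval=600, count=4)
```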

Time series preprocessing
First, we determine whether the collected raw time series is stationary. We use the adftest function (augmented Dickey-Fuller test) provided by the Matlab Econometrics Toolbox to test the stationarity of the 735 temperature data points. The test returned h = 0, meaning that the unit-root null hypothesis could not be rejected, so the sequence was not stationary. After first-order differencing of the sequence, the test returns h = 1, so the differenced sequence is stationary.
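First-order differencing replaces each value by its change from the previous value. The sketch below illustrates only the differencing step and its inverse (needed to map predictions on the differenced series back to temperatures); it does not implement the stationarity test itself, and the function names and sample values are illustrative.

```python
def difference(series, lag=1):
    """Return the lag-differenced series: series[t] - series[t - lag]."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]


def undifference(first_value, diffs):
    """Rebuild the original series from its first value and first differences."""
    values = [first_value]
    for d in diffs:
        values.append(values[-1] + d)
    return values


# Made-up temperature readings in degrees Celsius.
temps = [20.0, 20.5, 21.5, 21.0]
diffs = difference(temps)
restored = undifference(temps[0], diffs)
```

Note that differencing shortens the series by one value, so 735 readings yield 734 differences.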
Next, we test whether the stationary sequence is purely random (white noise). In a purely random sequence there is no correlation between the sequence values, so past behavior has no influence on the future and there is nothing to model. Using Matlab to calculate the autocorrelation coefficients of the stationary sequence, we find that the first-order coefficient is 0.47. Hence the sequence is not purely random.
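To make the judgement concrete, the sample autocorrelation at a given lag can be compared with the approximate 95% bound for white noise, 1.96/sqrt(n). The sketch below uses the common biased estimator; the paper does not state which estimator or threshold was used, so both are assumptions here, as are the function names.

```python
import math


def autocorrelation(series, lag):
    """Sample autocorrelation at the given lag (biased estimator).

    Raises ZeroDivisionError for a constant series, whose variance is zero.
    """
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    num = sum((series[t] - mean) * (series[t + lag] - mean)
              for t in range(n - lag))
    return num / denom


def white_noise_bound(n, z=1.96):
    """Approximate 95% bound on the autocorrelation of white noise."""
    return z / math.sqrt(n)


# For the 734 differenced values the bound is far below the reported 0.47.
bound = white_noise_bound(734)
is_correlated = 0.47 > bound

# Small made-up example to show the estimator itself.
r1 = autocorrelation([1.0, 2.0, 3.0, 4.0, 5.0], 1)
```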

Modeling of stationary time series
For a stationary, non-purely-random sequence, a linear model can usually be fitted to describe how the sequence evolves. This paper uses the ARMA model to model the first 600 of the 735 temperature data points and uses the remaining 135 data points for model testing.
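For reference, a general ARMA(p, q) model can be written as below. The sign convention on the moving-average terms varies between textbooks and software, so the form shown is one common choice and not necessarily the one used in this work.

```latex
x_t = \phi_0 + \sum_{i=1}^{p} \phi_i \, x_{t-i} + \varepsilon_t - \sum_{j=1}^{q} \theta_j \, \varepsilon_{t-j}
```

Here $x_t$ is the stationary sequence, $\phi_i$ are the autoregressive coefficients, $\theta_j$ are the moving-average coefficients, and $\varepsilon_t$ is a zero-mean white noise term.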
We use the autocorr and parcorr functions provided by Matlab to compute the autocorrelation function (ACF) and the partial autocorrelation function (PACF) with a confidence level of 95%. Figure 4 shows that the ACF and PACF of the sequence both tail off, which indicates that the sequence follows an ARMA(p, q) model. For a pure AR(p) or MA(q) model, the order can be read from the cut-off of the partial autocorrelation or autocorrelation plot, respectively, but p and q in an ARMA(p, q) model cannot be determined directly in this way. We therefore built ARMA models starting from (1,1) and then increased the values of p and q to obtain a series of candidate models, using the AIC criterion to choose among them. The maximum order of the fitted model is set to 12. The experimental results show that the AIC value is smallest when p = 2 and q = 12. Therefore, we take ARMA(2,12) as the model of this sequence. The next step is to identify the unknown parameters of the ARMA(2,12) model, which we estimate by least squares. The results show that the specific parameters of the ARMA(2,12) model are as follows:

Stationary Sequence Prediction
The ARMA(2,12) model is used to predict the 135 data points in the test set. We compare the predictions with the original data and calculate the error of each predicted value. The result is shown in figure 5. The prediction error E is defined as follows: $E = \frac{|Y - F|}{Y} \times 100\%$, where Y is the original data and F is the predicted data. In the experiment, the minimum error over the 135 predictions is 0.02%, the maximum error is 16.6%, and the average error is 5.7%. The error distribution is shown in figure 6.
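The error statistics can be computed as in the sketch below, which assumes that E is the absolute relative error expressed as a percentage. The function names and the sample values are illustrative and are not the experimental data.

```python
def percentage_errors(actual, predicted):
    """Absolute relative error of each prediction, in percent."""
    return [abs(y - f) / abs(y) * 100.0 for y, f in zip(actual, predicted)]


def summarize(errors):
    """Return (minimum, maximum, mean) of a list of errors."""
    return min(errors), max(errors), sum(errors) / len(errors)


# Made-up original and predicted temperatures in degrees Celsius.
original = [20.0, 25.0, 22.0]
forecast = [21.0, 24.0, 22.0]
errors = percentage_errors(original, forecast)
lowest, highest, average = summarize(errors)
```

A relative error is undefined when the original value is zero, so temperatures near 0 degrees Celsius would need a different error measure.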

Fig. 5. The result of data prediction.
Fig. 6. The error distribution.
From figure 6 we can see that, among the 135 prediction errors, nearly 50% are less than 5% and nearly 82% are less than 10%. The results show that the fitted model is effective and predicts the trend of the temperature series well.
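Shares such as these can be read off a list of per-prediction errors with a simple threshold count. The sketch below uses made-up error values, and the function name is an assumption.

```python
def share_below(errors, threshold):
    """Fraction of errors strictly below the threshold."""
    return sum(1 for e in errors if e < threshold) / len(errors)


# Made-up percentage errors, not the experimental values.
sample_errors = [1.0, 4.0, 6.0, 12.0]
under_5 = share_below(sample_errors, 5.0)
under_10 = share_below(sample_errors, 10.0)
```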

Conclusions
In this paper, an agricultural IoT system based on LoRa is designed, which can collect data over a large area of farmland all day without interruption. With LoRa technology, the system can use fewer sensor nodes to build a wide-range wireless sensor network. The collected temperature data is analyzed with time series methods, and an effective temperature prediction model is proposed. The model can predict changes in the temperature data and provides an effective method for handling abnormal data. With the development of the agricultural IoT and the support of national policies, the system has good application prospects.