An Overview of the Latest Progress and Core Challenge of Autonomous Vehicle Technologies

Mass production of autonomous vehicles is approaching, thanks to the rapid progress of autonomous driving technology, especially recent breakthroughs in LiDAR sensors, GPUs, and deep learning. Many automotive and IT companies, represented by Waymo and GM, are pushing to put their advanced autonomous vehicles on public roads as early as possible. This paper systematically reviews the latest developments and future trends of autonomous vehicle technologies, discusses the extensive application of AI in intelligent connected vehicles (ICVs), and identifies the key problems and core challenges facing the commercialization of autonomous vehicles. Based on the review, it forecasts the prospects and conditions for mass production of autonomous vehicles and points out the arduous, long-term and systematic nature of their development.


Introduction
The fully interconnected and highly intelligent ICV industry ecosystem is emerging. As one of the most important manifestations of the ICV, autonomous vehicle technology has made rapid progress in recent years. Automakers, parts and component suppliers, IT companies, and research institutions continue to increase investment in autonomous driving research, and some have proposed timetables for its commercialization. For example, Waymo was approved in January 2018 to operate as a Transportation Network Company (TNC) in Arizona and launched a self-driving ride-hailing service in December. General Motors plans to roll out its public ride-hailing service with self-driving vehicles starting in 2019. Uber also launched an autonomous taxi test service in San Francisco in March 2018 [1][2].
Meanwhile, technology is still the main bottleneck in the commercialization of autonomous vehicles. Generally speaking, autonomous driving technology involves three aspects: perception, decision, and action, whose scope and relationships are shown in Figure 1. Based on the current status of autonomous vehicle technologies, this paper analyzes their latest developments and future trends, and then discusses the extensive application of AI in the ICV area and the key problems and core challenges of full commercialization. Finally, it points out the key issues that industry and academia must tackle, to provide a reference for future R&D of autonomous vehicle products and technology.

Automotive radar
Radar is an electronic device that detects targets using electromagnetic waves. By bandwidth, it can be divided into narrow-band and wide-band radar; the wider the bandwidth, the higher the range resolution. 24 GHz and 77 GHz are the two dominant bands: the former is for short- and medium-range detection of 5-70 m, and the latter is mainly for medium- and long-range detection of 100-250 m. As the accuracy requirements of automotive radar systems increase, the higher-frequency 77 GHz band will become the mainstream. By coverage, radar can also be divided into short-range radar (SRR), medium-range radar (MRR) and long-range radar (LRR). SRR has higher range resolution and a wider field of view (FoV ±65°~±80°); it is used for collision avoidance and object detection, and its typical applications include blind spot surveillance, parking assist and ACC support. MRR (typical FoV ±40°~±50°) is used for LCA. LRR has high gain and a narrow beam for long detection range and accurate direction measurement, with a smaller FoV of around ±10° and a more relaxed range resolution requirement; it is mainly used for ACC, FCW, and AEB [3][4]. Key radar sensor manufacturers worldwide include Bosch, Delphi, Continental, Denso, Valeo, and ZF-TRW. Freescale, Infineon, ST, Texas Instruments and ADI are the key chip suppliers. To satisfy the autonomous driving requirements of high range resolution, high angular resolution, high distance/speed/angle accuracy, long detection range, wide opening angle, high ASIL, low power consumption, low cost and small size, automotive radar shows the following technology development trends.
(1) Moving to higher frequency (77~81 GHz): High-resolution millimeter-wave radar operating in the 79 GHz band is expected to replace 24 GHz SRR completely in the future. The combination of high transmit power and high bandwidth allows simultaneous long range and high range resolution, and the higher frequency brings smaller antenna size for a given bandwidth, better angular resolution, and smaller sensors that reduce volume- and weight-related costs.
(2) Chip technology roadmap: GaAs offers high output power, high gain, very low noise and good linearity, but a low integration level. SiGe offers adequate performance at mmWave bands, size and performance advantages, and high temperature tolerance. CMOS is cheaper and allows higher levels of integration and lower power consumption. Chips are expected to develop along the route from GaAs to SiGe and then to CMOS.
(3) Bistatic over monostatic: Bistatic radars have higher transmitter output power, better noise performance, and improved receiver conversion gain.
(4) Multi-mode sensors: Multi-mode sensors usually support both MRR and LRR configurations in one sensor module, which significantly reduces cost for automakers.
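As a rough illustration of why wider bandwidth yields finer range resolution, the standard radar relation ΔR = c / (2B) can be evaluated for a few bandwidths. This is a minimal sketch: the function name is hypothetical, the 200 MHz and 1 GHz values are assumed for comparison only, and 4 GHz simply corresponds to the full 77~81 GHz span mentioned in trend (1).

```python
# Theoretical radar range resolution: delta_R = c / (2 * B).
C_M_PER_S = 299_792_458.0  # speed of light in vacuum


def range_resolution_m(bandwidth_hz: float) -> float:
    """Return the theoretical range resolution in metres for a sweep bandwidth."""
    if bandwidth_hz <= 0:
        raise ValueError("bandwidth must be positive")
    return C_M_PER_S / (2.0 * bandwidth_hz)


# Assumed, illustrative bandwidths (not figures taken from this paper).
for label, bandwidth in [("200 MHz", 200e6), ("1 GHz", 1e9), ("4 GHz", 4e9)]:
    print(f"{label:>8}: {range_resolution_m(bandwidth) * 100:.1f} cm")
```

Under these assumptions, a twentyfold increase in bandwidth gives a twentyfold finer resolution, which is consistent with the move toward wide-band operation described above.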

LiDAR
LiDAR is an integrated optical detection and measurement system working in the optical frequency band. Current sensors offer a range of 100~300 m, accuracy of ±2.0~±5.0 cm, FoV of 120~360° (horizontal) and 25~120° (vertical), and angular resolution of 0.1~0.4° (horizontal) and 0.4~2.0° (vertical). Its advantages are high-resolution 3D imaging, mid-range coverage, accurate depth information and target discrimination, wide FoV, high angular resolution, independence from light conditions, and the ability to detect lane markers and street signs. However, its beam does not propagate well through rain, snow, fog or dust, and at present it cannot simultaneously meet the performance, size and cost requirements of mass-produced autonomous vehicles. Typical applications include mapping and navigation, obstacle detection and in-cabin monitoring.
LiDAR can be divided into mechanical scanning LiDAR and solid-state LiDAR, according to the presence or absence of mechanical rotating parts. Mechanical LiDAR uses rotating parts to control the laser emission angle and achieve a wide 360° HFoV, but mass production is not feasible because of insufficient durability [5], large volume and the relatively high cost of moving parts. Solid-state LiDAR relies on electronic components to control the laser emission angle. Because of its narrow HFoV, multiple sensors are required for a 360° view. However, its integrated design is more durable, compact and cost-effective. Solid-state LiDAR is therefore one of the most important trends in this field [6][7].
Solid-state LiDAR, including OPA, flash and MEMS types, is a key technology for autonomous driving. Established vendors are working to make sensors cost-effective, and startups continue to appear around the world, some of which could emerge as industry leaders given enough time and money. For example, Quanergy, which received investment from Aptiv in 2015, is developing OPA LiDAR; ASC, which was acquired by Continental in 2016, is committed to developing 3D flash LiDAR, with products expected to be available in two to three years; in addition, MEMS LiDAR will also be put into production, and Valeo is cooperating with LeddarTech on related products.
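The remark that a narrow HFoV requires multiple sensors for a 360° view can be made concrete with simple arithmetic. The sketch below is illustrative only: the function name and the optional overlap between neighbouring sensors are assumptions, not part of any vendor specification.

```python
import math


def sensors_for_full_coverage(hfov_deg: float, overlap_deg: float = 0.0) -> int:
    """Minimum number of identical sensors needed to cover 360 degrees horizontally.

    overlap_deg is an assumed angular overlap between neighbouring sensors.
    """
    effective_deg = hfov_deg - overlap_deg
    if effective_deg <= 0:
        raise ValueError("overlap must be smaller than the field of view")
    return math.ceil(360.0 / effective_deg)


print(sensors_for_full_coverage(120.0))        # no overlap between sensors
print(sensors_for_full_coverage(120.0, 10.0))  # 10 degrees of overlap per pair
```

With a 120° HFoV, three sensors suffice only if no overlap is required; any overlap margin pushes the count to four, which illustrates how the sensor count feeds back into system cost.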

Camera (Image sensor)
A camera converts an optical image into electronic signals using an image sensor, either CCD or CMOS. A CCD transfers each pixel's charge packet sequentially to a common output structure that converts charge to voltage; it offers less noise, generally higher-quality images (particularly in low light), high dynamic range, better color depth, and higher resolution and light sensitivity. In a CMOS sensor, charge-to-voltage conversion takes place in each pixel; it has the advantages of lower power consumption, high integration, smaller size, easier manufacture and lower price, and its image quality has improved since its adoption. Automotive cameras can be divided into front view, surround view, rear view and in-cabin cameras. Key tier 1 camera manufacturers include Bosch, Aptiv, Continental, Denso, and Valeo. Key image sensor suppliers include Infineon, Freescale, Texas Instruments, ST, and Panasonic. Key optics suppliers include Sony, Panasonic, Sharp, Konica Minolta, and Omron.

ICTTE 2019
The development trends of automotive cameras include range improvement, modularization, scalability and interconnection; side or surround view cameras that expand the rear view scope and replace side mirrors; in-cabin ToF cameras (or 3D flash LiDAR) for presence detection and gesture recognition; and all-weather sensing (time-gated sensors). All-weather sensing needs a night vision solution. In this regard, the IR camera is an important option: it extends the driver's seeing distance in darkness or poor weather and compensates for the camera's fatal weakness of not working properly under low illumination [8]. It is currently offered only by luxury brands because of its high cost. Autoliv is the key system supplier. IR cameras fall into two main categories: far infrared (FIR, passive, e.g. BMW Night Vision System) and near infrared (NIR, active, e.g. Mercedes Night Vision System).

Ultrasonic sensors
Ultrasonic sensors have a short detection range (0.15-6 m), excellent very-near-range 3D mapping and a repeatable linear response, with high precision and resolution because sound waves travel slowly, and they work equally well in snow, fog, rain, and dust because of the short range. They are also small and cheap. However, they are not useful for gauging speed because of the short range, and they are susceptible to temperature fluctuations, humidity, wind and EMI, so the sensors need temperature and humidity compensation and proper shielding. Their main applications include space detection, parallel parking, reverse parking, blind-spot detection, and kick-to-open liftgates. Bosch Gen 6 ultrasonic sensors are representative of this field and are available in three mechanically compatible sensor variants.
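The need for temperature compensation can be illustrated with a time-of-flight calculation. The sketch below assumes the common linear approximation for the speed of sound in dry air, c ≈ 331.3 + 0.606·T m/s with T in °C; humidity and wind are ignored, and the function names and echo time are hypothetical.

```python
def speed_of_sound_m_s(temp_c: float) -> float:
    """Approximate speed of sound in dry air (linear approximation, assumed)."""
    return 331.3 + 0.606 * temp_c


def echo_distance_m(round_trip_s: float, temp_c: float = 20.0) -> float:
    """Distance to the obstacle from the round-trip echo time.

    The pulse travels to the obstacle and back, hence the division by two.
    """
    if round_trip_s < 0:
        raise ValueError("echo time cannot be negative")
    return speed_of_sound_m_s(temp_c) * round_trip_s / 2.0


# Same 10 ms echo interpreted at two ambient temperatures (illustrative values).
for temp in (-10.0, 20.0):
    print(f"{temp:+.0f} degC: {echo_distance_m(0.010, temp):.3f} m")
```

For the same echo time, the two temperatures give distances that differ by several centimetres, which is significant for a sensor whose whole working range is a few metres.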

Mapping & Localization
Using maps built with LiDAR and other systems, together with localization technologies such as GPS/INS and the cloud, vehicles can obtain prior information beyond visual range and ensure a certain degree of redundancy.
The development of autonomous vehicles requires high-definition (HD) maps, which need more accurate vehicle coordinates, more detailed information (lane markings, street signs, traffic signals, potholes, curb height, etc.), and live updates during path planning. The functions of the HD map are to provide prior information for a level of redundancy; to reduce sensor load so that sensors can focus on processing moving objects; to assist in extreme situations when road markings wear away or disappear under snow; to support precise localization; to help identify other dynamic entities such as other cars and pedestrians; and to incorporate human psychology into the visual display [9].
HD map data is collected mainly by fleet survey or crowdsourcing. Some cartographic companies and technology giants adopt the former mode, that is, sending a fleet of probe vehicles to survey selected areas. Its advantages are high efficiency and comprehensive road information acquisition, but the data acquisition task is arduous, the cost of mapping is high and frequent updates are hard. Most enterprises adopt low-cost crowdsourcing, that is, fitting sensing systems on OEM cars and trucks to transmit their precise locations and images of the roads they travel, and synthesizing the captured data into one giant digital map with near-real-time updates [10]. At present, there is no formal industry standard for HD maps anywhere in the world, and the mapping methods and characteristics of the participants differ. For example, the Waymo map paints 3D portraits of the world with the help of LiDAR, and extracts road features both manually and with computer algorithms; the HERE map is based on satellite/aerial imagery and fleet probe data, and uses computer vision algorithms and manual labor to extract road features; the TomTom map is a detailed depth map that captures the unique pattern of roadway scenery instead of extracting features, which makes it insensitive to environmental changes; the Civil Maps map is a crowdsourced 3D semantic map (machine vision data converted to map vectors) that combines a live video feed of the road with map data using an AR display.
Localization technologies, including absolute localization, dead reckoning (relative localization) and SLAM, work together to achieve accurate results. GPS alone is not accurate enough for AVs because of variable radio wave speed, blockage by canyons and tunnels, multi-path errors, radio interference, etc. Odometry and the IMU are used for relative positioning. Using wheel encoders, odometry integrates wheel speeds over time to produce incremental vehicle motions and sums these movements to produce a position estimate. Its errors mainly include systematic errors, associated with an imperfect kinematic vehicle model and with wheelbase and wheel diameter measurement errors, and random errors, caused by slip between the wheels and the ground or by loss of wheel contact on uneven or rocky terrain. The IMU, consisting of 3 orthogonal accelerometers and 3 ring laser gyroscopes that measure 3D acceleration and the angular rate about each axis, integrates accelerations twice and angular rates once to determine incremental vehicle movements. Compared with odometry, it is not susceptible to wheel slip, but its angle measurements drift with time and errors accumulate quickly. SLAM builds and updates a map of a vehicle's surroundings while keeping the vehicle located within the map. The Bayesian filter is a framework that reduces location uncertainty through recursive state estimation: it calculates the probabilities of multiple beliefs to allow an AV to infer its position and orientation, and it allows vehicles to continuously update their most likely position within a coordinate system based on the most recently acquired sensor data [11]. SLAM approaches are mainly divided into two categories, LiDAR-based and vision-based, of which the former is more mature.
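The recursive estimation performed by a Bayesian filter can be sketched with a discrete (histogram) filter on a one-dimensional circular road. This is a toy illustration rather than the method of [11]: the grid world, landmark labels, and the noise parameters `p_correct` and `p_hit` are all assumptions chosen for readability.

```python
def predict(belief, shift, p_correct=0.8):
    """Motion update: move `shift` cells, with an assumed chance of under/overshoot."""
    n = len(belief)
    p_slip = (1.0 - p_correct) / 2.0
    new_belief = [0.0] * n
    for i, p in enumerate(belief):
        new_belief[(i + shift) % n] += p * p_correct
        new_belief[(i + shift - 1) % n] += p * p_slip
        new_belief[(i + shift + 1) % n] += p * p_slip
    return new_belief


def update(belief, world, measurement, p_hit=0.9):
    """Measurement update: weight each cell by the sensor likelihood, then normalize."""
    weighted = [
        p * (p_hit if world[i] == measurement else 1.0 - p_hit)
        for i, p in enumerate(belief)
    ]
    total = sum(weighted)
    return [w / total for w in weighted]


def localize(world, steps):
    """Run the filter from a uniform prior over (measurement, shift) pairs."""
    belief = [1.0 / len(world)] * len(world)
    for measurement, shift in steps:
        belief = update(belief, world, measurement)
        belief = predict(belief, shift)
    return belief


# Hypothetical map: one sign among lane cells; the vehicle sees the sign, then a lane.
world = ["lane", "lane", "sign", "lane", "lane"]
belief = localize(world, [("sign", 1), ("lane", 1)])
print([round(p, 3) for p in belief])
```

Each cycle alternates a measurement update, which sharpens the belief, with a motion update, which shifts and blurs it; the most probable cell tracks the vehicle as new sensor data arrives.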

V2X communications
Complementary to other sensors, V2X communications support 360° non-line-of-sight (NLOS) awareness and extend a vehicle's ability to see, hear and understand the environment down the road, at blind intersections, or in bad weather. They cover a variety of application scenarios, including V2V, V2I, V2P and V2N.
DSRC and C-V2X are the two main technical routes of V2X [12]. DSRC is a two-way, half-duplex, short-to-medium-range (<1 km) wireless communication technology with a high data rate (3 to 27 Mbps), which is critical in communications-based active safety applications and can meet the extremely low latency requirement of road safety messaging and control. DSRC was essentially finalized in 2014 after 10 years of R&D and testing, and was adopted by the US. There are related variant standards in Europe and Japan, and DSRC is likely to be mandated in the US for all light vehicles starting as early as 2021. However, as a LOS technology it faces challenges in urban conditions where buildings stand at crossroads, and it lacks a technology evolution path to meet more stringent autonomous vehicle requirements. In contrast, C-V2X is a vehicle wireless communication technology based on the evolution of cellular technologies such as 4G/5G. It works both in coverage and out of coverage (direct communication), and offers improved V2I and V2N communication, extended V2V range and better non-line-of-sight awareness. Today's 4G LTE is suited only to non-safety-related use cases, but a clear technology evolution roadmap exists. 5G will provide scalable connectivity to support a wide variation of requirements, such as extreme throughput in millimeter wave bands, edgeless connectivity, high reliability (1 ms end-to-end latency), high availability, see-through augmented reality and vulnerable road user discovery, thus becoming the key to enabling fully autonomous driving capabilities [13].
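To give a sense of why latency matters for safety messaging, the distance a vehicle covers while a message is in flight can be computed directly. The sketch below is purely illustrative: the function name, the 120 km/h speed and the 100 ms comparison value are assumptions; only the 1 ms figure comes from the 5G target quoted above.

```python
def distance_during_latency_m(speed_kmh: float, latency_ms: float) -> float:
    """Distance in metres travelled while a message is in flight."""
    return (speed_kmh / 3.6) * (latency_ms / 1000.0)


# Assumed highway speed; 100 ms is an arbitrary comparison point.
for latency_ms in (1.0, 100.0):
    metres = distance_during_latency_m(120.0, latency_ms)
    print(f"{latency_ms:>5.0f} ms -> {metres:.2f} m")
```

At the assumed speed, the vehicle moves only a few centimetres during 1 ms but several metres during 100 ms, which is the difference between a negligible and a safety-relevant delay.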

Sensor fusion
Effective multi-sensor fusion for object detection, classification and tracking, localization and V2X communication, i.e. the association, aggregation and integration of data, can achieve complementary multi-modality sensing, extended NLOS perception, and redundancy and robustness for 360° situational awareness in all conditions. It thereby ensures fast and correct decision-making and is an inevitable trend in autonomous driving [14]. Sensor fusion is divided into two categories (homogeneous and heterogeneous) and includes four processing layers (signal-level, object-level, feature-level and decision-level fusion); it can be realized by the Kalman filter, fuzzy logic, Bayesian networks, CNNs, etc.
The architectures of sensor fusion include fully distributed processing, centralized processing and hybrid processing. Fully distributed processing requires lower bandwidth and less central processing power, but it requires larger and more expensive sensor modules with application processing, and it carries higher functional safety requirements. It is also hard to synchronize signals across nodes, and context information may be lost. Today, distributed data processing is used almost exclusively for feature-level and decision-level fusion, which is not enough for L4/L5 autonomy because of its limited sense of the environment and loss of contextual information. Centralized processing is therefore needed to make complete environmental information available, with sensor modules of small size, low cost and lower functional safety requirements; however, the central processing power, speed and bandwidth requirements are high and the central unit is not scalable. Thanks to the fast development of supercomputers, a centralized architecture is now entirely possible. However, it is difficult to coordinate multiple suppliers on one chip, and there are no common standards for languages, formats, protocols, and networking. Hybrid processing combines the advantages of both: local processing in intelligent sensors reduces the bandwidth requirement, while centralized processing handles path-finding, maneuvering, and motion trajectory, making the system flexible and scalable. The development from distributed and centralized processing to hybrid processing has therefore become the main trend of information fusion systems for autonomous vehicles, with corresponding requirements for communication and computing power.
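As a minimal illustration of Kalman-filter-style fusion, two independent Gaussian estimates of the same quantity can be combined with a single scalar measurement update. The sensor roles, readings and variances below are invented for the example, and the function name is hypothetical.

```python
def fuse(mean_a, var_a, mean_b, var_b):
    """Fuse two independent Gaussian estimates of the same scalar quantity.

    Equivalent to one scalar Kalman measurement update with estimate A as the
    prior and estimate B as the measurement.
    """
    gain = var_a / (var_a + var_b)            # Kalman gain
    mean = mean_a + gain * (mean_b - mean_a)  # pulled toward the more certain source
    var = (1.0 - gain) * var_a                # never larger than either input variance
    return mean, var


# Assumed example: camera range estimate (noisy) fused with radar range (precise).
camera_range_m, camera_var = 50.0, 4.0
radar_range_m, radar_var = 52.0, 1.0
print(fuse(camera_range_m, camera_var, radar_range_m, radar_var))
```

The fused estimate lies closer to the lower-variance sensor and has a smaller variance than either input, which is the quantitative sense in which heterogeneous sensors complement each other.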

Planning, decision making and vehicle control
The process of planning and decision-making includes route planning, behavioral decision-making and motion planning. Route planning selects a route from the current position to the destination through the road network. By representing the road network as a directed graph whose edge weights correspond to the cost of traversing a road segment, route planning can be converted into the problem of finding a minimum-cost path on the road network graph. Behavioral decision-making selects the appropriate driving behavior at a specific point in time based on the perceived behavior of other traffic participants (brake lights, turn signals, etc.), road conditions and infrastructure signals. In this process, machine-learning-based techniques (such as Gaussian mixture models and Gaussian process regression) and probabilistic planning (such as MDPs and their generalizations) are required to predict the intentions and estimate the future trajectories of other traffic participants, and then to dictate driving actions. Motion planning translates the selected driving behavior into a path or trajectory that is dynamically feasible and efficient for the vehicle, safe and comfortable for passengers, and free of collisions with detected obstacles. Numerical approximation methods are widely used, such as non-linear optimization in function space, graph-search approaches for the shortest path, and incremental tree-based approaches that select the best branch.
The core function of vehicle control is path tracking (trajectory control), which executes the path or trajectory specified by the motion planning system. It selects appropriate actuator inputs to carry out the planned motion, uses a feedback controller to correct tracking errors, and ensures stability about the reference path or trajectory in the presence of modeling errors and other uncertainties [15].
In terms of E/E architecture, a distributed architecture is widely used today, in which a large number of ECUs work together to provide various functions for drivers. With the demand for higher-level automation (which requires an unprecedented increase in sensor technology, data bandwidth, processing performance, safety and security, etc.) and vehicle connectivity, the move from distributed to centralized processing has gradually become the main development trend of vehicle control, as shown in Table 1. A centralized E/E architecture will achieve higher overall system efficiency, faster system response, and reduced performance requirements for lower-level ECUs, at the price of extremely high performance requirements for the high-level ECU and greater development and evaluation difficulty. The migration path is first to add domain controllers that handle sensor fusion and decision-making, so that more core functional modules are concentrated in the domain controllers to achieve functional integration and domain fusion, and ultimately to realize a high-level vehicle controller that manages the overall autonomous strategy and passes decision information to lower-level systems.
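The minimum-cost formulation of route planning can be sketched with Dijkstra's algorithm on a small weighted directed graph. This is one standard way to solve the problem, not necessarily the one used in production planners; the road network, node names and costs below are hypothetical.

```python
import heapq


def min_cost_route(graph, start, goal):
    """Dijkstra's algorithm on a weighted directed graph.

    graph maps a node to a list of (neighbour, cost) pairs; costs must be
    non-negative. Returns (total_cost, route) or (inf, []) if unreachable.
    """
    queue = [(0.0, start, [start])]
    settled = set()
    while queue:
        cost, node, route = heapq.heappop(queue)
        if node == goal:
            return cost, route
        if node in settled:
            continue
        settled.add(node)
        for neighbour, edge_cost in graph.get(node, []):
            if neighbour not in settled:
                heapq.heappush(queue, (cost + edge_cost, neighbour, route + [neighbour]))
    return float("inf"), []


# Hypothetical road network: edge weights stand for traversal cost (e.g. minutes).
roads = {
    "A": [("B", 4.0), ("C", 2.0)],
    "B": [("D", 3.0)],
    "C": [("B", 1.0), ("D", 7.0)],
    "D": [],
}
print(min_cost_route(roads, "A", "D"))
```

In this example the cheapest route detours through C before reaching B, showing that the minimum-cost path need not be the one with the fewest road segments.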
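The role of the feedback controller can be illustrated with a kinematic bicycle model tracking a straight reference line. The sketch below uses a simple proportional state-feedback steering law, chosen here for brevity rather than taken from [15]; the gains, wheelbase, speed and steering limit are all assumed values.

```python
import math


def track_straight_line(y0, heading0, speed=10.0, wheelbase=2.7,
                        k_lat=0.1, k_head=1.0, dt=0.01, duration=10.0):
    """Simulate a kinematic bicycle model tracking the reference line y = 0.

    Returns the lateral error (m) at every time step, starting with y0.
    """
    max_steer = 0.5  # steering limit in rad (assumed)
    y, heading = y0, heading0
    errors = [y]
    for _ in range(round(duration / dt)):
        # Feedback law: steer against the lateral and heading errors.
        steer = -k_lat * y - k_head * heading
        steer = max(-max_steer, min(max_steer, steer))
        # Kinematic bicycle model integrated with forward Euler.
        y += speed * math.sin(heading) * dt
        heading += speed / wheelbase * math.tan(steer) * dt
        errors.append(y)
    return errors


errors = track_straight_line(y0=2.0, heading0=0.0)
print(f"initial error {errors[0]:.2f} m, final error {abs(errors[-1]):.4f} m")
```

Starting 2 m off the reference line, the closed loop drives the lateral error toward zero; with the feedback gains set to zero the vehicle would simply continue parallel to the line, which is why tracking requires feedback and not only a planned input.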

Application and development of artificial intelligence
AI plays a decisive role in achieving autonomous driving and improving the driving experience. AI can be divided into three levels according to the tasks it can accomplish: narrow or weak AI (ANI), general or strong AI (AGI), and super AI. There is still great controversy about whether super AI is achievable and what its implications for mankind are, so current considerations focus mainly on narrow AI and general AI. ANI refers to machine intelligence that implements functions in a limited and fully defined field. Today's AI systems are all ANI, and the automobile, where ANI realizes basic driving assistance functions, is one of its most complex application cases. AGI refers to machine intelligence equal to that of human beings across broad areas; it can perform the full range of human perception tasks and is the main goal of AI research at present. The demand for data computation increases significantly from L3 to L5, together with the demand for AI: L4 faces 5-10 times more computational load than L3 and needs to basically achieve AGI to meet its needs, while L5 has an even higher demand and needs to fully achieve AGI to cope with various complex application scenarios. Compared with traditional rule-based automated driving decision-making, end-to-end deep learning is a possible method for realizing AGI: it gradually acquires the ability to make driving decisions by learning from real driving data without manual processing, splitting and labeling. The whole system is a black box with perception as input and decisions as output. Its advantage is that the more scene data is used in training, the more reliable the system becomes; there is no need to label and annotate scene data accurately, and the system is self-learning as a whole. Its disadvantage is that the decision-making process cannot be intuitively understood, so companies are more comfortable taking the traditional rule-based approach to accumulate data and enhance capability continuously. In the future, the combination of deep learning with supercomputers will eventually bring self-driving cars to reality by training the system with large amounts of data.
Besides autonomous driving functions, AI has broad and important application prospects in vehicle safety and personalization and in convenience and virtual assistance. For example, fingerprint, iris, face and voice recognition can be performed by an in-car optical scanner, NIR or visual camera, camera, and audio transducer respectively, to realize security and personalization functions such as vehicle entry control, ignition switch control, and mirror, seat, climate and music settings. Vital sign tracking and face/head/gaze tracking are carried out by cameras and on-board or wearable monitors for driver drowsiness, distraction and emotion monitoring and for driver and passenger health monitoring. Voice- and gesture-activated HMI based on acoustic and language modeling, EEG analysis and other means, lip reading in noisy environments, and faster response through direct brain-to-vehicle interaction will also be applied in ICVs.

Key issues and core challenges

The utmost concern of self-driving vehicles - safety
Even with the great progress of sensor technology and computing power, safety is always the primary demand of the ICV and the greatest concern for autonomous vehicles.
It is necessary to build safety into the design, to test and verify strictly, and to iterate between the two to achieve safe driving. To build safety into the design, we should follow a systems engineering approach; adopt appropriate engineering tools such as FTA, FMEA and HAZOP; adopt an adaptive E/E architecture for service-oriented cloud services and OTA updates; and protect cybersecurity through individual ECU protection, domain separation, internal network protection, intrusion detection, gateways, firewalls and message authentication. Redundancy should be provided as well, including redundant critical systems (computing, braking, steering, power source, communication path, collision detection and avoidance, and diagnostic monitoring), both IMU and LiDAR for localization, and multi-modality sensors with overlapping views. System self-checks are also necessary, such as cross-checks between redundant systems. The test and validation process includes not only conventional and SOTIF validation, but also simulation, proving ground and public road testing. There are many challenges: the driver is out of the loop; the requirements are not always known, correct, complete or unambiguously specified; ISO 26262 is not fully applicable; and the behavior of probabilistic systems with non-deterministic algorithms is not repeatable even for identical initial conditions. It is also hard to verify inductive learning systems even with comprehensive training data, and complete testing is not feasible.
In addition, ensuring the safety of autonomous driving also requires technical support (manufacturing quality, cybersecurity measures, fail-operational/fail-safe design, HMI, etc.) and government legislation, which are far more complex than for traditional automobiles. Only by doing well in every aspect of this systems engineering effort can we finally win public acceptance.
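A cross-check between redundant systems can be sketched as a median voter that also flags channels disagreeing with the voted value. This is an illustrative pattern only, not a mechanism described in the paper; the function name, readings and tolerance are assumptions.

```python
from statistics import median


def cross_check(readings, tolerance):
    """Vote among redundant readings of one quantity and flag outliers.

    Returns (voted_value, suspect_indices), where suspects are channels whose
    reading deviates from the median by more than `tolerance`.
    """
    if not readings:
        raise ValueError("at least one reading is required")
    voted = median(readings)
    suspects = [i for i, r in enumerate(readings) if abs(r - voted) > tolerance]
    return voted, suspects


# Assumed example: three redundant speed channels (m/s), one of them faulty.
print(cross_check([12.1, 12.0, 17.5], tolerance=0.5))
```

With three channels, a single faulty reading cannot move the median far, so the system can keep operating on the voted value while reporting the suspect channel to diagnostics.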

The real challenge - edge case handling
An edge case is a problem or situation that rarely occurs but can arise at an extreme operating parameter of an autonomous vehicle. It is a special case that usually involves input values requiring special processing in program algorithms. To validate program behavior in such cases, unit tests are needed to test the boundary conditions of algorithms, functions or methods. Edge cases in autonomous driving involve handling unconventional events, including extreme weather conditions, unpredictable behavior by other traffic participants, emergency vehicles, special road signs, etc. Examples include how to identify pedestrians in wheelchairs on the sidewalk, or how to judge, when a football is detected, whether children are chasing it from behind. The extreme pursuit of safety makes edge cases especially challenging. Object classification and decision making rely on machine learning algorithms, whose effectiveness depends directly on the training data. However, edge cases do not happen frequently in real life, so it is difficult to gather data on them, and the specialized training they require is time-consuming. More seriously, the extreme nature of edge cases may lead to millions of combinations that the algorithms have not learned, which will inevitably create potential safety hazards [17]. Human drivers can easily handle these edge cases by sending subtle signals to each other (such as the common social interaction used in road negotiation), whereas even AGI can hardly understand and deal with these situations fully and correctly. Waymo is at the forefront of the industry in this respect: its test mileage has exceeded 20 million kilometers, so its algorithms have been better trained, which also explains why its vehicles' disengagement rate is significantly lower than that of competitors.
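The idea of unit-testing boundary conditions can be shown on a small, hypothetical helper. The time-to-collision function below and its conventions (infinity when the gap is not closing, rejection of negative gaps) are assumptions made for this sketch, not functions from any real driving stack.

```python
import math
import unittest


def time_to_collision(gap_m, closing_speed_m_s):
    """Seconds until contact; infinity when the gap is not closing."""
    if gap_m < 0:
        raise ValueError("gap cannot be negative")
    if closing_speed_m_s <= 0:
        return math.inf
    return gap_m / closing_speed_m_s


class TimeToCollisionBoundaryTests(unittest.TestCase):
    """Each test probes one boundary of the input space."""

    def test_zero_gap_means_immediate_contact(self):
        self.assertEqual(time_to_collision(0.0, 5.0), 0.0)

    def test_zero_closing_speed_never_collides(self):
        self.assertEqual(time_to_collision(30.0, 0.0), math.inf)

    def test_opening_gap_never_collides(self):
        self.assertEqual(time_to_collision(30.0, -2.0), math.inf)

    def test_negative_gap_is_rejected(self):
        with self.assertRaises(ValueError):
            time_to_collision(-1.0, 5.0)
```

Saved as a module, the tests can be run with `python -m unittest`. Boundary tests of this kind cover the edges of a well-specified function; the open-ended real-world edge cases discussed above are harder precisely because their boundaries cannot be enumerated in advance.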

Conclusion
This study systematically analyzes the latest progress and trends of autonomous driving perception, decision and action technologies, combines this with research on the application status and prospects of AI in ICVs and on the key problems and core challenges of autonomous driving, and makes a comprehensive judgment on the development of autonomous vehicle technologies. Autonomous vehicle technologies have progressed so considerably that companies are comfortable rolling out L4 self-driving vehicles with careful design considerations, including hardware and software redundancy, fail-operational/fail-safe measures, cybersecurity mechanisms, geo-fenced operation, and remote support. Meanwhile, the nature of inductive learning systems makes autonomous vehicles hard to validate even with comprehensive sets of training data, not to mention in the endless edge cases for which data cannot easily or feasibly be collected. Therefore, the first batch of L4 vehicles must be operated within their ODD, which defines the safe operating conditions including geographies, roadway types, speed range, environmental conditions, and state and local traffic laws. Looking ahead, the ODD will gradually expand as autonomous vehicle development progresses. The leap from L4 to L5, however, remains a great challenge, because the highly nonlinear advance required in core technology, the huge amount of data required for AI training and the new transportation system to match the autonomous vehicle will all take a long time to arrive. In this sense, it may take years or even decades before we see millions of self-driving cars running in unconstrained environments. The development of automobile technology is not only a matter of the automobile itself; it also requires comprehensive consideration from the perspective of future ITS and intelligent cities [18].