Engineering Autonomous Vehicles and Robots. Shaoshan Liu
level. The spatial feature layer is particularly useful in less-crowded open environments such as the countryside.
4 The fourth layer is the semantic layer, which contains lane labels, traffic light and traffic sign labels, etc. The semantic layer aids vehicles in making planning decisions such as routing.
1.4 Modular Design
Before we go into the details of the rest of this book, let us briefly go over the modular design methodology and introduce each module. Hopefully with this introduction, readers will be able to easily follow the contents of this book.
Figure 1.2 shows a DragonFly Pod [13], a low-speed autonomous passenger pod built using the modular design methodology described in this book. This vehicle consists of multiple components: an RTK GNSS module for localization, a DragonFly computer vision module for localization (using visual inertial odometry technology) and active perception, a mmWave radar and a sonar for passive perception, a planning and control module for real-time planning, and a chassis module. Figure 1.3 shows the architecture diagram of this design and how the modules interact with each other.
Figure 1.2 Modular design of a DragonFly Pod.
Figure 1.3 Modular design architecture.
1.4.1 Communication System
First, to connect different modules to form a working system, a reliable communication system is needed. The Controller Area Network (CAN) bus is the most widely used in-vehicle communication network today due to its simplicity, and it can be used to connect Electronic Control Units (ECUs), sensors, and other components to enable communication with each other. Before going into the details of other components, readers should first understand how the CAN bus works.
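As a rough illustration of what travels over a CAN bus, the sketch below models a classic CAN data frame as an 11-bit identifier plus up to eight data bytes, and shows the arbitration rule that the frame with the lowest identifier wins the bus. The class name `CanFrame`, the function `arbitrate`, and the identifiers and payloads are hypothetical and are not taken from the DragonFly design. Real frames also carry CRC, ACK, and other fields that are omitted here.

```python
from dataclasses import dataclass

MAX_STANDARD_ID = 0x7FF  # 11-bit identifier in classic CAN (standard frame format)
MAX_DATA_BYTES = 8       # classic CAN data frames carry 0 to 8 data bytes


@dataclass(frozen=True)
class CanFrame:
    """Simplified classic CAN data frame (hypothetical model for illustration)."""
    can_id: int
    data: bytes

    def __post_init__(self):
        if not 0 <= self.can_id <= MAX_STANDARD_ID:
            raise ValueError("standard CAN identifier must fit in 11 bits")
        if len(self.data) > MAX_DATA_BYTES:
            raise ValueError("classic CAN payload is at most 8 bytes")


def arbitrate(pending):
    """Return the frame that wins bus arbitration: the lowest identifier."""
    return min(pending, key=lambda frame: frame.can_id)


# Hypothetical identifiers: a stop message outranks a status message.
stop_frame = CanFrame(can_id=0x010, data=bytes([0x01]))
status_frame = CanFrame(can_id=0x300, data=bytes([0x12, 0x34]))
winner = arbitrate([status_frame, stop_frame])
```

Because priority is encoded in the identifier, a designer would typically assign low identifiers to safety-critical messages so that they are never delayed by routine traffic.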
1.4.2 Chassis
The traditional vehicle chassis utilizes mechanical control, such as mechanical cables, hydraulic pressure, and other ways of providing a driver with direct, physical control over the speed or direction of a vehicle.
However, for autonomous driving to work, we need a drive-by-wire-ready chassis such that the chassis can apply electronic controls to activate the brakes, control the steering, and operate other mechanical systems. Specifically, the chassis module provides the basic application program interfaces for the planning and control module, such that the planning and control module can perform steer, throttle, and brake actions to make sure that the vehicle travels on the planned trajectory.
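To make the idea of a chassis interface more concrete, here is a minimal sketch of the kind of command a planning and control module might send. The names `ChassisCommand` and `sanitize`, the normalized value ranges, and the rule that braking cancels throttle are all assumptions made for illustration; they do not describe the actual DragonFly chassis API.

```python
from dataclasses import dataclass


@dataclass
class ChassisCommand:
    steer: float     # normalized steering, -1.0 (full left) to 1.0 (full right)
    throttle: float  # normalized throttle, 0.0 to 1.0
    brake: float     # normalized brake, 0.0 to 1.0


def clamp(value, low, high):
    """Limit value to the closed interval [low, high]."""
    return max(low, min(high, value))


def sanitize(cmd):
    """Clamp a command to its valid ranges; any braking cancels throttle."""
    brake = clamp(cmd.brake, 0.0, 1.0)
    throttle = 0.0 if brake > 0.0 else clamp(cmd.throttle, 0.0, 1.0)
    return ChassisCommand(clamp(cmd.steer, -1.0, 1.0), throttle, brake)


# An out-of-range steering request with conflicting throttle and brake.
safe = sanitize(ChassisCommand(steer=1.5, throttle=0.4, brake=0.2))
```

Validating commands at the chassis boundary keeps a faulty planner from requesting physically meaningless actuator values.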
1.4.3 mmWave Radar and Sonar for Passive Perception
For mid-range obstacle detection, we can apply 77 GHz mmWave radar such that the planning and control module can make decisions when obstacles are detected. Similarly, sonars cover near-range obstacles and act as the very last line of defense; once sonars detect an obstacle, they directly signal the chassis to stop to minimize risks of an accident.
mmWave radar and sonar sensors can be combined and used for passive perception. By passive perception, we mean that when obstacles are detected, the raw data are not fed to the planning and control module for decision making. Instead, the raw data are directly sent to the chassis through the CAN bus for quick decision making. In this case, a simple decision module is implemented in the chassis to stop the vehicle when an obstacle is detected within a short range.
The main reason for this design is that when obstacles are detected in close range, we want to stop the vehicle as soon as possible instead of going through the complete decision pipeline. This is the best way to guarantee the safety of passengers as well as pedestrians.
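The simple decision module described above can be sketched as a single threshold check on the range readings. The function name `should_emergency_stop`, the 1.0 m threshold, and the convention that non-positive readings are invalid are assumptions for illustration only; the source does not specify them.

```python
STOP_DISTANCE_M = 1.0  # assumed threshold in meters, not taken from the source


def should_emergency_stop(ranges_m, stop_distance_m=STOP_DISTANCE_M):
    """Return True if any valid range reading lies within the stop distance.

    Non-positive readings are treated as invalid and ignored (an assumption).
    """
    return any(0.0 < r <= stop_distance_m for r in ranges_m)


# Hypothetical radar and sonar range readings in meters.
clear_path = should_emergency_stop([3.2, 2.5])
obstacle_close = should_emergency_stop([3.2, 0.6])
```

The check involves no map lookup, no tracking, and no trajectory generation, which is why it can run in the chassis with very low latency.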
1.4.4 GNSS for Localization
GNSS is a natural choice for vehicle localization. With RTK capability in particular, GNSS can achieve very high localization accuracy. GNSS provides detailed localization information such as latitude, longitude, and altitude, as well as vehicle heading. Nonetheless, GNSS accuracy suffers when buildings and trees block the view of the open sky, which causes signal blockage and multipath problems. Hence, we cannot rely solely on GNSS for localization.
1.4.5 Computer Vision for Active Perception and Localization
Computer vision can be utilized for both localization and active perception. For localization, we can rely on visual simultaneous localization and mapping (VSLAM) technologies to achieve accurate real-time vehicle locations. However, VSLAM usually suffers from cumulative errors: the longer the distance the vehicle travels, the higher the localization error. Fortunately, by fusing VSLAM and GNSS localizations, we can achieve high accuracy under different conditions, because GNSS can be used as the ground-truth data when it is not blocked, and VSLAM can provide high accuracy when GNSS is blocked.
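The fusion idea can be illustrated with a deliberately simplified sketch: when a GNSS fix is available it is taken as the reference position, which also discards any accumulated VSLAM drift, and when GNSS is blocked the estimate is advanced using the displacement reported by VSLAM. The function `fuse_position` and the sample values are hypothetical. Practical systems usually apply a probabilistic filter rather than this hard switch.

```python
def fuse_position(last_position, vslam_delta, gnss_fix=None):
    """Return the next 2D position estimate.

    last_position: (x, y) estimate from the previous step, in meters
    vslam_delta:   (dx, dy) displacement reported by VSLAM since that step
    gnss_fix:      (x, y) from GNSS, or None when the signal is blocked
    """
    if gnss_fix is not None:
        # GNSS is available: use it as the reference, which resets drift.
        return gnss_fix
    # GNSS is blocked: dead-reckon from the last estimate using VSLAM.
    return (last_position[0] + vslam_delta[0], last_position[1] + vslam_delta[1])


# Hypothetical sequence: GNSS is blocked during the second and third steps.
steps = [
    ((1.0, 0.0), (1.0, 0.1)),
    ((1.0, 0.0), None),
    ((1.0, 0.0), None),
    ((1.0, 0.0), (4.0, 0.0)),
]
position = (0.0, 0.0)
track = []
for vslam_delta, gnss_fix in steps:
    position = fuse_position(position, vslam_delta, gnss_fix)
    track.append(position)
```

In this sequence the estimate carries the 0.1 m lateral offset through the blocked interval and returns to the GNSS value as soon as a fix is available again.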
In addition, computer vision can be used for active perception as well. Using stereo vision, we can extract spatial or depth information of different objects; using deep learning techniques, we can extract semantic information of different objects. By fusing spatial and semantic information, we can detect objects of interest, such as pedestrians and cars, and estimate their distance to the current vehicle.
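For a rectified stereo pair, depth follows from the standard pinhole relation Z = f * B / d, where f is the focal length in pixels, B is the baseline between the two cameras, and d is the disparity in pixels. The sketch below pairs that relation with semantic labels to obtain labeled distances. The function names, the camera parameters, and the detector output are hypothetical values chosen for illustration.

```python
def depth_from_disparity(focal_length_px, baseline_m, disparity_px):
    """Pinhole stereo relation: depth Z = f * B / d, in meters."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_length_px * baseline_m / disparity_px


def locate_objects(detections, focal_length_px, baseline_m):
    """Pair each semantic label with a distance computed from its disparity."""
    return [
        (label, depth_from_disparity(focal_length_px, baseline_m, disparity_px))
        for label, disparity_px in detections
    ]


# Hypothetical detector output: (label, disparity in pixels).
detections = [("pedestrian", 14.0), ("car", 7.0)]
objects = locate_objects(detections, focal_length_px=700.0, baseline_m=0.12)
```

With these assumed parameters the pedestrian is roughly 6 m away and the car roughly 12 m away; halving the disparity doubles the estimated depth, which is why stereo depth becomes less precise for distant objects.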
1.4.6 Planning and Control
The planning and control module receives inputs from the perception and localization modules and generates decisions in real time. Usually, a set of behaviors is defined for the planning and control module, and one behavior is chosen depending on the current conditions.
A typical planning and control system has the following architecture. First, as the user enters the destination, the routing module checks the map for road network information and generates a route. Then the route is fed to the behavioral planning module, which checks the traffic rules to generate motion specifications. Next, the generated route, along with the motion specifications, is passed down to the motion planner, which combines real-time perception and localization information to generate trajectories. Finally, the generated trajectories are passed down to the control system, which reactively corrects errors in the execution of the planned motions.
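The four stages described above can be sketched as a chain of small functions. Everything in this example is a toy assumption: the road graph, the speed limits standing in for traffic rules, the breadth-first search used for routing, the rule of stopping at the last node before a blocked one, and the proportional controller. It is meant only to show how data flows from one stage to the next, not how any of the stages is implemented in practice.

```python
from collections import deque

# Hypothetical road network: node -> directly reachable nodes.
ROAD_GRAPH = {"A": ["B"], "B": ["C", "D"], "C": ["E"], "D": ["E"], "E": []}
# Assumed traffic rules: speed limit in m/s at each node.
SPEED_LIMIT_MPS = {"A": 5.0, "B": 5.0, "C": 3.0, "D": 5.0, "E": 2.0}


def plan_route(graph, start, goal):
    """Routing: breadth-first search for a route with the fewest segments."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in graph[path[-1]]:
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None


def plan_behavior(route, speed_limits):
    """Behavioral planning: attach a motion specification to each node."""
    return [(node, speed_limits[node]) for node in route]


def plan_motion(specs, blocked_nodes):
    """Motion planning: use perception input to stop before a blocked node."""
    trajectory = []
    for node, limit in specs:
        if node in blocked_nodes:
            break
        trajectory.append((node, limit))
    if trajectory and len(trajectory) < len(specs):
        last_node, _ = trajectory[-1]
        trajectory[-1] = (last_node, 0.0)  # come to a stop at the last free node
    return trajectory


def control_step(target_speed, measured_speed, gain=0.5):
    """Control: proportional correction of the speed error."""
    return gain * (target_speed - measured_speed)


route = plan_route(ROAD_GRAPH, "A", "E")
specs = plan_behavior(route, SPEED_LIMIT_MPS)
trajectory = plan_motion(specs, blocked_nodes={"E"})
command = control_step(target_speed=trajectory[0][1], measured_speed=4.0)
```

Each stage consumes only the output of the previous one plus its own inputs (map, rules, perception, measurements), which is what allows the stages to be developed and replaced independently.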
1.4.7 Mapping
A mapping module provides essential geographical information, such as lane configurations and static obstacle information, to the planning and control module. In order to generate real-time motion plans, the planning and control module can combine perception inputs, which detect dynamic obstacles in real time, localization inputs, which generate real-time vehicle poses, and mapping inputs, which capture road geometry and static obstacles.
Currently, fully autonomous vehicles use high-definition (HD) 3D maps. Such high-precision maps are extremely complex and can contain on the order of a trillion bytes (about one terabyte) of data, representing not only lanes and roads but also the semantic information and locations of 3D landmarks in the real world. With HD maps, autonomous vehicles are able to localize themselves and navigate in the mapped area.
1.5 The Rest of the Book
In the previous sections we have introduced the proposed modular design approach for building autonomous vehicles and robots. In the rest of the book, we will delve into these