In the world of self-driving cars, there are four technical areas needing consideration for successful navigation of public roads. Each comes with unique obstacles, and each one must work in synchronization with the others. These four areas are: sensing, localization, path planning, and execution.
While some are more solved than others (execution being the most mature of the four), a few areas still have significant barriers to overcome. For camera-based systems, sensing is still in the early stages of development. With regard to localization and path planning, there is an endless stream of unique obstacles when you consider that a system must work in every type of environment, from rural back roads in the Midwest to snow-covered downtown gridlock in Boston. Below is a more detailed look at these four areas and the types of problems and solutions that each one involves.
As the first step, an unmanned vehicle’s intelligent system must take in the surrounding environment and everything within it, such as road layouts, other vehicles, pedestrians, bicycles, traffic lights, and more. This can be accomplished with a variety of sensors whose outputs are combined in a process called sensor fusion. Below is a set of charts displaying the advantages and disadvantages of each:
Ultrasonic and radar are mature technologies that have low production costs. Many cars today employ them for tasks such as parking sensors, blind-spot monitoring, and emergency braking. They are not designed for granular tasks but instead provide a good picture of the surrounding environment and are very reliable. Ultrasonic sensors are used for close-range tasks, such as parking or tight traffic, while radar can have a significant range better suited to highway driving.
Passive visual is just another name for cameras. This is what you will find on leading consumer technologies, the most famous being Tesla’s Enhanced Autopilot software, which uses a multitude of cameras around the car to provide human-like sensory perception of the environment. The systems are not costly, but they require a large amount of processing power and deep-learning algorithms to make sense of the data. Depth perception does not come naturally to a camera and must be worked out with various algorithms, and even then it is only a best estimate. Some in the industry think everything can be solved with just cameras (Tesla, Comma.ai), while others think cameras must be combined with other, more reliable technologies, such as LIDAR.
LIDAR uses a spinning 360-degree vertical stack of lasers to paint a picture of solid reflections around the vehicle. It provides a very accurate landscape with reliable depth information, which is important for a moving vehicle. Expense is a consideration, though, with the flagship devices costing $10,000 to $20,000 each, although costs are quickly being driven down as more manufacturers enter the market. LIDAR is seen by many as a requirement for successful deployment of vehicles, with Waymo (Google) leading the way and reportedly building its own LIDAR devices to keep costs down. It is lighting agnostic, working both day and night, as well as in scenes with too much contrast for a traditional camera to handle, such as a mix of shadows and direct sunlight.
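As a rough illustration of the sensor fusion idea mentioned earlier, the sketch below combines two independent range estimates of the same object by inverse-variance weighting, so the more precise sensor counts for more. This is one simple textbook technique and not necessarily what any production system uses; the function name `fuse_ranges` and the example variances are hypothetical.

```python
def fuse_ranges(range_a, var_a, range_b, var_b):
    """Fuse two independent range estimates (meters) by inverse-variance weighting.

    var_a and var_b are the assumed measurement variances (m^2) of each sensor.
    Returns (fused_range, fused_variance).
    """
    weight_a = 1.0 / var_a
    weight_b = 1.0 / var_b
    fused_range = (weight_a * range_a + weight_b * range_b) / (weight_a + weight_b)
    fused_variance = 1.0 / (weight_a + weight_b)
    return fused_range, fused_variance


# Hypothetical readings: a noisier radar range and a more precise LIDAR range.
fused, variance = fuse_ranges(10.0, 4.0, 12.0, 1.0)
```

The fused estimate always lands between the two readings, closer to the sensor with the smaller variance, and its variance is lower than either input.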
This task involves taking in all the raw data from the sensors, creating a virtual map of the environment, and tracking the vehicle’s location within it. While there are technologies designed just for this task, such as GPS, they are not reliable or precise enough for navigating at speeds of up to 80 mph. If you are traveling down the interstate at 70 mph, you cover over 100 feet each second, so every bit of accuracy matters. This can be a problem when GPS is only accurate to within about ±5 meters.
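The arithmetic behind those figures can be checked with a few lines. This is only a unit-conversion sketch; the function names are made up for illustration.

```python
FEET_PER_MILE = 5280
METERS_PER_MILE = 1609.344
SECONDS_PER_HOUR = 3600


def feet_per_second(mph):
    """Convert a speed in miles per hour to feet per second."""
    return mph * FEET_PER_MILE / SECONDS_PER_HOUR


def error_as_travel_time(error_m, mph):
    """Seconds of driving at `mph` that a position error of `error_m` meters represents."""
    meters_per_second = mph * METERS_PER_MILE / SECONDS_PER_HOUR
    return error_m / meters_per_second


speed_fps = feet_per_second(70)           # a little over 100 ft/s
gps_slack = error_as_travel_time(5, 70)   # roughly 0.16 s of travel
```

Put another way, a 5-meter position error is wider than a typical traffic lane, which is why GPS alone cannot tell the car which lane it is in.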
Using on-board sensor fusion, you can bypass the need for external sources of information, provided you have a detailed map of your environment. If your radar detects a specific grouping of trees 30 m ahead and the car then passes a uniquely shaped intersection on its left, the system can use the map to conclude that no other coordinate would produce that specific arrangement of objects, thereby localizing the car.
Mapping is a topic that many companies are focusing on now. They create detailed models of the areas most likely to be driven through, such as urban city centers, so that vehicles can later use them to know what is around the corner before they get there. Of course, this does not account for mobile objects such as other vehicles, but it provides a strong baseline for planning a path, which is the next step in the sequence.
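A toy version of this map-matching idea is sketched below. It assumes the heading is already known, that observations are offsets from the car expressed in the map's axes, and that a short list of candidate positions is available (for example, from a coarse GPS fix). Real systems typically use probabilistic filters instead; the names here are hypothetical.

```python
import math


def localize(landmarks, observations, candidates):
    """Pick the candidate (x, y) whose view of the map best explains the observations.

    landmarks:    list of (x, y) landmark positions stored in the map
    observations: list of (dx, dy) offsets from the car to each detected object
    candidates:   list of (x, y) positions the car might be at
    """
    def mismatch(position):
        total = 0.0
        for dx, dy in observations:
            # Where the detected object would sit on the map if the car were here.
            seen_x, seen_y = position[0] + dx, position[1] + dy
            # Distance to the closest mapped landmark.
            total += min(math.hypot(seen_x - lx, seen_y - ly) for lx, ly in landmarks)
        return total

    return min(candidates, key=mismatch)


# Hypothetical map: a group of trees and an intersection corner.
map_landmarks = [(32.0, 3.0), (33.0, 5.0), (40.0, -6.0)]
# Offsets measured by a car that is actually at (2, 3).
seen = [(30.0, 0.0), (31.0, 2.0), (38.0, -9.0)]
best = localize(map_landmarks, seen, [(0.0, 0.0), (2.0, 3.0), (10.0, 3.0)])
```

Only the true position makes every observation line up with a mapped landmark, which is the sense in which a unique arrangement of objects pins down the car.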
Another tool to aid localization is the IMU (Inertial Measurement Unit), which senses the forces acting on it to discern changes in the vehicle’s motion and direction. It can be integrated with GPS to increase accuracy, such as when a car enters a tunnel and loses signal. If you are halfway through the tunnel and come upon a fork, the IMU can detect the yaw and report back to the navigation software accordingly. It is also useful when the GPS signal is working as intended, since GPS updates may not arrive quickly enough in all situations.
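The tunnel scenario amounts to dead reckoning: propagating the last known pose using measured speed and yaw rate until GPS returns. The sketch below assumes a flat 2D world and a simple Euler update; `dead_reckon` and its parameters are illustrative names, not part of any particular navigation stack.

```python
import math


def dead_reckon(x, y, heading, speed, yaw_rate, dt):
    """Advance a 2D pose by one time step.

    heading is in radians, speed in m/s, yaw_rate in rad/s (e.g. from an IMU gyro).
    """
    x_new = x + speed * math.cos(heading) * dt
    y_new = y + speed * math.sin(heading) * dt
    heading_new = heading + yaw_rate * dt
    return x_new, y_new, heading_new


# GPS is lost at the tunnel entrance; keep estimating the pose for one second.
pose = (0.0, 0.0, 0.0)
for _ in range(10):
    pose = dead_reckon(*pose, speed=20.0, yaw_rate=0.1, dt=0.1)
```

Because each step builds on the previous estimate, small sensor errors accumulate, so the estimate should be corrected by GPS or map matching whenever those are available.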
Once you know your surroundings, you can begin planning a path to complete your goal of reaching a specific location. There are many different algorithms you can use that range from brute-force computation of all possible paths (not very likely) to employing various heuristics to ease the burden. In the early days of GPS, it could take significant amounts of time to calculate a route due to the exponentially increasing number of possibilities that comes with increasing distance from A to B. Optimally, other ‘rules’ are also considered, as follows:
- Speed limit
- Maximum acceleration — 10 m/s² for typical comfort
- Maximum jerk (as a derivative of acceleration) — 10 m/s³
- Do not hit other cars
- Stay on the road
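The first three rules above can be checked numerically on a candidate speed profile using finite differences. This is a minimal sketch: it takes the 10 m/s² and 10 m/s³ limits from the list, treats the speed limit as a parameter, and leaves out collision and road-boundary checks, which need a model of the environment. The function name is hypothetical.

```python
def violates_limits(speeds, dt, speed_limit, max_accel=10.0, max_jerk=10.0):
    """Return True if a speed profile breaks the speed, acceleration, or jerk limit.

    speeds: planned speeds in m/s, sampled every `dt` seconds.
    """
    # Acceleration is the change in speed per step; jerk is the change in acceleration.
    accels = [(b - a) / dt for a, b in zip(speeds, speeds[1:])]
    jerks = [(b - a) / dt for a, b in zip(accels, accels[1:])]
    return (
        any(v > speed_limit for v in speeds)
        or any(abs(a) > max_accel for a in accels)
        or any(abs(j) > max_jerk for j in jerks)
    )


gentle = violates_limits([20.0, 20.5, 21.0, 21.5], dt=1.0, speed_limit=25.0)   # False
abrupt = violates_limits([10.0, 10.0, 19.0, 10.0], dt=1.0, speed_limit=25.0)   # True (jerk)
```

A planner could use a check like this to discard candidate trajectories before scoring the remaining ones.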
When developing an algorithm, the focus is on finding the lowest-cost route, where cost can be a combination of metrics such as time, legality, safety, comfort, and distance. You would develop a cost function for your specific needs that weights these metrics and adds them up into a total cost for each route. Below you can see a more informative collection of the considerations that go into such a model.
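As a rough illustration of the weighting idea in this paragraph, a cost function can be as simple as a weighted sum. The weights and per-route penalty values below are invented for the example; in practice they would be tuned for the application.

```python
# Hypothetical weights: higher means the planner cares more about that metric.
WEIGHTS = {"time": 1.0, "distance": 0.1, "legality": 50.0, "safety": 100.0, "comfort": 5.0}


def route_cost(metrics, weights=WEIGHTS):
    """Weighted sum of per-route penalties (each metric: larger is worse)."""
    return sum(weights[name] * metrics[name] for name in weights)


def cheapest_route(routes):
    """Return the name of the route with the lowest total cost."""
    return min(routes, key=lambda name: route_cost(routes[name]))


routes = {
    "highway":  {"time": 10.0, "distance": 12.0, "legality": 0.0, "safety": 0.1, "comfort": 0.2},
    "shortcut": {"time": 8.0,  "distance": 7.0,  "legality": 0.2, "safety": 0.3, "comfort": 0.5},
}
choice = cheapest_route(routes)
```

With these made-up numbers the shortcut is faster and shorter, but its safety and legality penalties outweigh the savings, so the highway is chosen.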
Some of the more famous pathfinding algorithms one may learn in an introductory computer science course are Dijkstra’s and A*. Though these are much too simplistic for a modern autonomous vehicle, they provide a good basis for thinking about the costs and heuristics associated with the different methods of finding paths.
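For reference, a compact A* on a small grid might look like the sketch below. It assumes four-directional movement, a cost of 1 per step, and a Manhattan-distance heuristic; replacing the heuristic with zero turns it into Dijkstra's algorithm. The grid encoding ('#' for blocked cells) is just a convention chosen for this example.

```python
import heapq


def a_star(grid, start, goal):
    """Return the number of steps on a shortest path from start to goal, or None.

    grid is a list of equal-length strings; '#' marks a blocked cell.
    start and goal are (row, col) tuples.
    """
    rows, cols = len(grid), len(grid[0])

    def heuristic(cell):
        # Manhattan distance never overestimates with 4-directional unit moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    best_cost = {start: 0}
    frontier = [(heuristic(start), 0, start)]  # (estimated total, cost so far, cell)
    while frontier:
        _, cost, cell = heapq.heappop(frontier)
        if cell == goal:
            return cost
        if cost > best_cost.get(cell, float("inf")):
            continue  # stale queue entry; a cheaper way to this cell was found
        row, col = cell
        for nxt in ((row + 1, col), (row - 1, col), (row, col + 1), (row, col - 1)):
            inside = 0 <= nxt[0] < rows and 0 <= nxt[1] < cols
            if inside and grid[nxt[0]][nxt[1]] != "#":
                new_cost = cost + 1
                if new_cost < best_cost.get(nxt, float("inf")):
                    best_cost[nxt] = new_cost
                    heapq.heappush(frontier, (new_cost + heuristic(nxt), new_cost, nxt))
    return None


steps = a_star(["....", ".##.", "...."], (0, 0), (2, 3))
```

The heuristic steers the search toward the goal, so fewer cells are expanded than with an uninformed search, while the result is still a shortest path.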
One method of improving these simplistic models is the hybrid A* approach, which takes into account the real-world limitations of a car, such as the steering angle and turn radius, factors that can make it much more difficult to navigate tight quarters compared to a human just walking around a room. The control of a standard vehicle is referred to as non-holonomic because the controllable degrees of freedom are fewer than the total degrees of freedom the vehicle exists in. While a car’s pose consists of position along the X and Y axes plus heading (orientation), it can only actuate by moving forward or backward and altering the steering angle. These limitations can be witnessed by attempting to parallel park in tight quarters. One issue with this approach is that errors can compound over time, so it is good to work with additional methods to keep an accurate baseline.
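The non-holonomic constraint can be made concrete with a kinematic bicycle model, which is one common way (assumed here, not stated by the article) to generate the successor states that a hybrid A* search expands. The wheelbase, maximum steering angle, and step length are hypothetical values.

```python
import math


def bicycle_step(x, y, heading, steer, distance, wheelbase=2.7):
    """Drive `distance` meters (negative means reverse) at a fixed steering angle.

    heading and steer are in radians; returns the new (x, y, heading).
    """
    if abs(steer) < 1e-9:
        # Straight-line motion.
        return (x + distance * math.cos(heading),
                y + distance * math.sin(heading),
                heading)
    turn_radius = wheelbase / math.tan(steer)
    delta = distance / turn_radius  # change in heading along the arc
    x_new = x + turn_radius * (math.sin(heading + delta) - math.sin(heading))
    y_new = y - turn_radius * (math.cos(heading + delta) - math.cos(heading))
    return x_new, y_new, heading + delta


def expand(state, max_steer=0.6, step=1.0):
    """Successor poses reachable from `state`: left, straight, right, forward and reverse."""
    x, y, heading = state
    return [bicycle_step(x, y, heading, steer, dist)
            for steer in (-max_steer, 0.0, max_steer)
            for dist in (step, -step)]


successors = expand((0.0, 0.0, 0.0))
```

None of the six successors is a pure sideways move, which is exactly the limitation felt when parallel parking: the car has to combine forward and reverse arcs to shift laterally.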
The final step in the system loop is to execute the decision made by the path planning model and attempt to follow the optimal path as closely as possible. Simply following a chosen path, however, comes with numerous issues, as there will always be some variance (hopefully small) due to the imprecise nature of sensors and vehicle controls. Trying to snap back onto the intended path exactly is also less than ideal. Imagine a situation in which the car realizes it is 2 feet to the right of its intended route; how should it get back on track? Immediately cranking the steering to full lock and then snapping it back to straight as soon as it returns to the path causes unnatural movement and could be unsafe on the road. Safe execution belongs to the world of motion control, where you may imagine a secondary, more short-term path in addition to the main routing path that leads to your destination.
The image below is a visual representation of the long-term planned navigation path (blue) and the short-term vehicle control path (red). The red line will take into consideration the car’s abilities, passenger comfort, and overall safety.
The distance between the car’s current location and the center of the lane (the ideal position) is called the cross-track error (CTE). It is one of the costs the red line above tries to minimize, keeping its distance from the blue line as small as possible. Other errors that deserve consideration include the heading and velocity errors; these can be summed with different weightings into an overall cost, which is then minimized subject to certain constraints. Braking force is one obvious constraint: most vehicles can theoretically apply more than 1 g of deceleration, but doing so would be very uncomfortable for the passengers and unsafe for other drivers. Therefore, when approaching a stop sign, you would ideally compute a long, smooth deceleration rather than the technically time-optimal approach of maintaining the speed limit and braking fully just before the intersection.
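A minimal sketch of these two ideas follows: a signed cross-track error relative to a straight path segment, and a weighted sum of squared errors. Squaring the errors and the specific weights are assumptions made for illustration; the text only says the errors are weighted and summed.

```python
import math


def cross_track_error(px, py, ax, ay, bx, by):
    """Signed lateral distance from point P to the path line running from A to B.

    Positive means P is to the left of the direction of travel, negative to the right.
    """
    dx, dy = bx - ax, by - ay
    return (dx * (py - ay) - dy * (px - ax)) / math.hypot(dx, dy)


def tracking_cost(cte, heading_error, speed_error,
                  w_cte=1.0, w_heading=10.0, w_speed=0.1):
    """Weighted sum of squared errors; the weights are hypothetical tuning values."""
    return w_cte * cte ** 2 + w_heading * heading_error ** 2 + w_speed * speed_error ** 2


# A car 0.6 m (about 2 feet) to the right of a path heading along +x.
cte = cross_track_error(3.0, -0.6, 0.0, 0.0, 10.0, 0.0)
cost = tracking_cost(cte, heading_error=0.0, speed_error=1.0)
```

Raising one weight relative to the others changes how the controller trades off, for example, returning to the lane center quickly against holding a steady heading.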
Another consideration is jerk, the second derivative of velocity (the rate of change of acceleration). This is the sudden change in acceleration felt when standing on a train at the moment it begins moving, or the whiplash from quick turns on a roller coaster. It can be addressed with simulated shock absorbers for the controls, where the rate of change of the steering angle or throttle/brake is limited to a specific amount. Lingering too long when changing lanes could impede the natural flow of traffic, but a car that swings about quickly does not provide a comfortable ride either. A constraint might state that the steering angle can change by at most ~30 degrees per second, so as to minimize the overall jerking sensation.
Ideally, you end up with a final control model that incorporates all constraints, computes the total cost of each candidate action, and then minimizes that cost algorithmically. A common method for this is model predictive control (MPC), a form of process control widely used in industries that require automated control of production processes, such as oil refineries or power stations.
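The steering constraint described here is essentially a rate limiter. A minimal sketch, using the ~30 degrees per second figure from the text and a hypothetical 0.1 s control interval:

```python
def rate_limit(current, target, max_rate, dt):
    """Move `current` toward `target` by at most `max_rate * dt` per call."""
    max_step = max_rate * dt
    change = max(-max_step, min(max_step, target - current))
    return current + change


# The planner suddenly asks for 20 degrees of steering; the limiter ramps up to it.
angle = 0.0
history = []
for _ in range(8):
    angle = rate_limit(angle, target=20.0, max_rate=30.0, dt=0.1)
    history.append(angle)
```

Instead of jumping straight to 20 degrees, the commanded angle climbs by about 3 degrees per step and then holds at the target. The same function works for throttle and brake commands.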
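To give a feel for the receding-horizon idea, here is a deliberately tiny sketch. It assumes a simplified lateral model (constant speed, state made of cross-track error and heading), a handful of allowed yaw rates, and brute-force enumeration of every control sequence over a short horizon. Real controllers of this kind use a numerical optimizer and a richer vehicle model; every name and number below is hypothetical.

```python
import itertools
import math


def rollout_cost(cte, heading, yaw_rates, speed, dt):
    """Simulate the simple model over a control sequence and sum the step costs."""
    cost = 0.0
    for yaw_rate in yaw_rates:
        heading += yaw_rate * dt
        cte += speed * math.sin(heading) * dt
        # Penalize distance from the path, heading error, and control effort.
        cost += cte ** 2 + 0.1 * heading ** 2 + 0.01 * yaw_rate ** 2
    return cost


def mpc_step(cte, heading, speed=10.0, dt=0.1, horizon=5, options=(-0.5, 0.0, 0.5)):
    """Return the first control of the cheapest sequence over the horizon."""
    best_sequence = min(
        itertools.product(options, repeat=horizon),
        key=lambda seq: rollout_cost(cte, heading, seq, speed, dt),
    )
    # Only the first action is applied; the plan is recomputed at the next step.
    return best_sequence[0]


command = mpc_step(cte=1.0, heading=0.0)  # car is 1 m left of the path
```

The key design choice is that the whole sequence is planned but only its first action is executed, so the controller keeps correcting itself as new measurements arrive.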
Repeat the loop
The subsystems within an autonomous vehicle work in a closed-loop fashion involving all four of the steps above. At each discrete time step, the sensors recapture the environment, the car’s position is re-estimated, and an updated path plan is created for the next steps. Below you can see the complete picture of the flow of data from beginning to end. Typically, the components of the system do not all operate at the same cadence: sensor data is analyzed as it arrives, while path planning may run at a fixed interval, for example every 10 ms, taking in the most recent data from each sensor and computing the optimal path for the next 5 seconds.
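The multi-rate behavior can be sketched with a simulated clock. The sensor names and periods below are invented for illustration, and the "reading" stored for each sensor is simply its timestamp; only the 10 ms planning interval comes from the text.

```python
def run_loop(duration_ms, sensor_periods_ms, planner_period_ms=10):
    """Simulate sensors updating at different rates while a planner runs on a fixed interval.

    Returns a list of (time_ms, snapshot), where snapshot maps each sensor name
    to the timestamp of the latest reading the planner had available.
    """
    latest = {}
    plans = []
    for now in range(duration_ms + 1):
        # Each sensor publishes on its own schedule.
        for name, period in sensor_periods_ms.items():
            if now % period == 0:
                latest[name] = now
        # The planner wakes up on its own schedule and uses whatever is newest.
        if now % planner_period_ms == 0:
            plans.append((now, dict(latest)))
    return plans


plans = run_loop(100, {"imu": 5, "camera": 33, "lidar": 100})
```

At the 40 ms planning step, for example, the planner works with a fresh IMU reading, a camera frame that is 7 ms old, and a LIDAR sweep that is 40 ms old, which is why each input needs a timestamp.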
About the Author
David Rose is a machine learning engineer at StrategyWise, where he spends most of his time developing predictive models for clients. Outside the office he explores the frontiers of deep learning in relation to image recognition and tinkers with robotic systems and controls. Despite his hopes for the coming autonomous vehicle revolution, he enjoys the personal freedoms and thrills of manual control, and spends weekends exploring the back roads on his motorcycle and racing it at the track.
Further knowledge is available from courses through Udacity, including the Self-Driving Cars, Machine Learning, and ROS programs.
The linked paper below provides a more rigorous mathematical background to many of the topics discussed above. It was published at MIT a couple years ago and has a strong focus on motion planning.
Microsoft has been developing a strong simulator for testing different models and techniques in autonomous vehicles. It is open-sourced on GitHub and can model both ground vehicles and drones.