In a clear demonstration of why AI leadership demands the best compute capabilities, NVIDIA today unveiled the world’s 22nd fastest supercomputer — DGX SuperPOD — which provides AI infrastructure that meets the massive demands of the company’s autonomous-vehicle deployment program.
The system was built in just three weeks with 96 NVIDIA DGX-2H supercomputers and Mellanox interconnect technology. Delivering 9.4 petaflops of processing capability, it has the muscle for training the vast number of deep neural networks required for safe self-driving vehicles.
Customers can buy this system in whole or in part from any DGX-2 partner based on our DGX SuperPOD design.
AI training of self-driving cars is the ultimate compute-intensive challenge.
A single data-collection vehicle generates 1 terabyte of data per hour. Multiply that by years of driving over an entire fleet, and you quickly get to petabytes of data. That data is used to train algorithms on the rules of the road — and to find potential failures in the deep neural networks operating in the vehicle, which are then re-trained in a continuous loop.
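To put that in rough numbers, a quick back-of-the-envelope calculation shows how fast the petabytes pile up. The fleet size and duty cycle below are illustrative assumptions, not NVIDIA figures:

```python
TB_PER_HOUR = 1.0            # data rate of a single collection vehicle (from the text)

# Illustrative assumptions: a 50-car fleet driving 8 hours a day for a year.
fleet_size = 50
hours_per_day = 8
days = 365

total_tb = TB_PER_HOUR * hours_per_day * days * fleet_size
total_pb = total_tb / 1024   # 1 PB = 1024 TB

print(f"{total_pb:.0f} PB collected per year")  # well over 100 petabytes
```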
“AI leadership demands leadership in compute infrastructure,” said Clement Farabet, vice president of AI infrastructure at NVIDIA. “Few AI challenges are as demanding as training autonomous vehicles, which requires retraining neural networks tens of thousands of times to meet extreme accuracy needs. There’s no substitute for massive processing capability like that of the DGX SuperPOD.”
Powered by 1,536 NVIDIA V100 Tensor Core GPUs interconnected with NVIDIA NVSwitch and Mellanox network fabric, the DGX SuperPOD can tackle data with peerless performance for a supercomputer its size.
The system is hard at work around the clock, optimizing autonomous driving software and retraining neural networks at a much faster turnaround time than previously possible.
For example, the DGX SuperPOD hardware and software platform takes less than two minutes to train ResNet-50. When this AI model came out in 2015, it took 25 days to train on the then state-of-the-art system, a single NVIDIA K80 GPU. DGX SuperPOD delivers results that are 18,000x faster.
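The 18,000x figure follows directly from the two training times quoted above:

```python
K80_DAYS = 25                      # ResNet-50 training time on a single K80 in 2015
SUPERPOD_MINUTES = 2               # upper bound quoted for DGX SuperPOD

k80_minutes = K80_DAYS * 24 * 60   # 36,000 minutes
speedup = k80_minutes / SUPERPOD_MINUTES
print(f"{speedup:,.0f}x faster")   # 18,000x faster
```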
While other TOP500 systems with similar performance levels are built from thousands of servers, DGX SuperPOD takes a fraction of the space, roughly 400x smaller than its ranked neighbors.
And NVIDIA DGX systems have already been adopted by other organizations with massive computational needs of their own — ranging from automotive companies such as BMW, Continental, Ford and Zenuity to enterprises including Facebook, Microsoft and Fujifilm, as well as research leaders like Riken and U.S. Department of Energy national labs.
Reference Architecture to Build Your Own SuperPOD
A DGX SuperPOD isn’t just lightning fast for running deep learning models.
It’s also remarkably quick to deploy due to its modular, enterprise-grade design.
While systems of this scale often take 6-9 months to deploy, the DGX SuperPOD took just three weeks, with engineers following a prescriptive, validated approach.
Building supercomputers like the DGX SuperPOD has helped NVIDIA learn how to design its systems for large-scale AI machines. It marks an important evolution in supercomputing technology, one that's bringing massive computing power out of academia and into transportation companies and other industries that want to use high performance computing to accelerate their initiatives.
Some need an illustrious career capped with a knighthood or royal wedding to climb the ranks of centuries-old English society. Last week, self-driving cars only needed a winding uphill road through the countryside to make their entrance.
NVIDIA partners set records on the racetrack and showed off new vehicles for spectators gathered at the Goodwood Festival of Speed, a global racing event that attracts 200,000 attendees.
The four-day event takes place on the grounds of the historic Goodwood House in West Sussex, England. Throughout the event, cars race along a 1.2-mile uphill course within the estate, featuring everything from classic race cars to the latest in automotive technology.
Last year, NVIDIA partner Roborace made its Goodwood debut with the first ever autonomous hillclimb. The autonomous racing startup came back to this year’s event with yet another driverless feat.
Using its latest vehicle, the DevBot 2.0, which can drive autonomously but also includes a cockpit for a human driver, Roborace completed the hillclimb with a combination of human and machine.
YouTube star Seb Delanney drove the DevBot partway up the course. In front of a large crowd of spectators, Delanney then exited the vehicle and waved it on its way as it completed the rest of the climb without a human driver.
Relying on sensors and the NVIDIA DRIVE platform, the DevBot seamlessly navigated the course’s twists and turns, traveling at speeds up to about 60 miles an hour.
(Video: Roborace's DevBot's first autonomous Goodwood hillclimb)
The world firsts didn’t stop there. From debuts to self-driving splits, this year’s Goodwood Festival of Speed was the talk of the racing town.
To kick off the week, champion drifter Vaughn Gittin Jr. wowed fans — without even leaving his desk chair.
Photo by Dave Benett/Getty Images for Samsung
Driving for teleoperation startup Designated Driver, Gittin remotely drove a Lincoln MKZ using a virtual reality headset at a separate location on the estate. Using video transmitted from the car’s sensors via a 5G connection, he was able to drive the car using a desk setup of surrounding screens.
With NVIDIA technology onboard, the car was able to process data at high speeds with a latency under 70 milliseconds, making it possible for Gittin to burn rubber on high-speed drifts and take the car up the hillclimb.
Courtesy of Samsung
All About EVs
Even with a driver at the helm, the Porsche Taycan electric vehicle turned heads on the Goodwood course. The NVIDIA partner’s first EV went through its paces with the same grace and performance as its 911 cousin, a fixture at racetracks worldwide.
(Video: Porsche Taycan world debut at the Goodwood Festival of Speed)
Autonomous delivery startup Kar-go showcased its battery-powered vehicle for Goodwood attendees. The futuristic vehicle leverages the NVIDIA DRIVE platform to process deep learning algorithms for driverless operation.
Photo by Sam Stephenson
Polestar, the performance car spinoff from Volvo Cars, kicked off the U.K. tour of its latest vehicle at Goodwood. The Polestar 2 is the company’s first all-electric vehicle, combining legendary performance with the next generation of powertrains.
Photo by Dominic James
After four days of racing and demos, the latest in automotive technology closed out this year’s Goodwood Festival of Speed with champagne dreams and caviar wishes.
Editor’s note: This is the latest post in our NVIDIA DRIVE Labs series, which takes an engineering-focused look at individual autonomous vehicle challenges and how NVIDIA DRIVE addresses them. Catch up on our earlier posts here.
Lane markings are critical guides for autonomous vehicles, providing vital context for where they are and where they’re going. That’s why detecting them with pixel-level precision is fundamentally important for self-driving cars.
To begin with, AVs need long lane detection range — which means the AV system needs to perceive lanes at long distances from the ego car, or the vehicle in which the perception algorithms are operating. Detecting more lane line pixels near the horizon in the image adds tens of meters to lane-detection range in real life.
Moreover, a lane detection solution must be robust — during autonomous lane keeping, missed or flickering lane line detections can cause the vehicle to drift out of its lane. Pixel-level redundancy helps mitigate missed or flickering detections.
Deep neural network processing has emerged as an important AI-based technique for lane detection. Using this approach, humans label high-resolution camera images of lanes and lane edges. These images are used to train a convolutional DNN model to recognize lane lines in previously unseen data.
Preserving Precision in Convolutional DNNs
However, with convolutional DNNs, an inevitable loss of image resolution occurs at the DNN output. While input images may have high resolution, much of it is lost as the convolutional DNN incrementally down-samples them during processing.
As a result, individual pixels that precisely denoted lane lines and lane edges in the high-resolution input image become blurred at the DNN output. Critical spatial information for inferring the lane line/edge with high accuracy and precision becomes lost.
NVIDIA’s high-precision LaneNet solution encodes ground truth image data in a way that preserves high-resolution information during convolutional DNN processing. The encoding is designed to create enough redundancy for rich spatial information to not be lost during the downsampling process inherent to convolutional DNNs. The key benefits of high-precision LaneNet include increased lane detection range, increased lane edge precision/recall, and increased lane detection robustness.
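To see why the downsampling matters, consider a quick sketch. The resolution and stride here are illustrative, not LaneNet's actual values:

```python
# Illustrative sketch of why convolutional downsampling loses spatial precision.
# The input resolution and stride below are assumptions, not LaneNet's real numbers.
input_w, input_h = 1920, 1208      # hypothetical camera resolution
total_stride = 16                  # cumulative downsampling of a typical conv backbone

out_w, out_h = input_w // total_stride, input_h // total_stride
print(f"output grid: {out_w} x {out_h}")

# Each output cell now covers a 16x16 patch of input pixels, so a lane edge that
# was localized to a single pixel at the input can only be localized to within
# 16 pixels at the output -- unless, as in high-precision LaneNet, the ground-truth
# encoding carries redundant sub-cell information through the network.
cell_px = total_stride
print(f"each output cell spans {cell_px}x{cell_px} input pixels")
```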
Left: frame-by-frame, pixel-level lane detections from high-precision LaneNet. Right: pixel-level detections post-processed into lane lines. A few extra pixel-level lane detections near the horizon in the image translate to several tens of meters of added lane detection range in real life.
The high-precision LaneNet approach also enables us to preserve rich information available in high-resolution images while leveraging lower-resolution image processing. This results in more efficient computation for in-car inference.
With such high-precision lane detection, autonomous vehicles can better locate themselves in space, as well as better perceive and plan a safe path forward.
Safety is what drives us. It’s why we’re working with the automotive industry around the world to ensure automated vehicles meet the highest quality standards.
Drawing on our own safety and engineering experience, NVIDIA has been tapped to lead the European Association of Automotive Suppliers (CLEPA) working group on highly connected automated vehicles. This group, which consists of European automotive suppliers, contributes to the development of new UNECE assessment methods for automated vehicles.
In addition to CLEPA, NVIDIA works with key international organizations formulating standards and regulations for automated vehicles. These include the International Organization for Standardization (ISO), the United Nations Economic Commission for Europe (UNECE), the National Highway Traffic Safety Administration (NHTSA) and the Association for Standardization of Automation and Measuring Systems (ASAM).
These organizations, which count major automakers, suppliers and startups as members, are critical in developing regulations and standards for autonomous vehicles.
NVIDIA has a rich history in simulation technologies and functional safety. Our autonomous vehicle team holds invaluable experience in automotive safety and engineering. And we're open about how we apply these learnings: much of this experience is detailed in a comprehensive report submitted to the National Highway Traffic Safety Administration.
As leaders of various working groups and technical advisories within global organizations, we are able to share this expertise with the industry and help ensure the best possible standards are put in place.
CLEPA is examining four main areas of validation: audit assessment, track testing, real world testing and virtual testing, or simulation. Simulation has become a powerful tool in automated vehicle development, and with platforms like NVIDIA DRIVE Constellation, manufacturers can put their technology through many miles of driving — including rare and hazardous scenarios — in a fraction of the time it would take to travel those distances in the real world.
This capability is especially valuable for validation and verification. Regulators can design specific tests for edge cases or other situations that are difficult to recreate in the real world without putting other road users in danger.
By leading the CLEPA working group, NVIDIA can contribute its experience in developing the cloud-based DRIVE Constellation platform to formulate comprehensive standards for this type of virtual validation.
NVIDIA is also working with ASAM, a standardization organization based in Germany whose members include experts from OEMs, Tier-1 suppliers, tool vendors, engineering service providers and research institutes, to update the language and testing standards for simulation testing.
Through this collaboration, we are leading one of the working groups defining an open standard for creating simulation scenarios, describing road topology, sensor models and world models, and specifying the criteria and key performance indicators the industry needs to advance validation methods for autonomous vehicle deployment.
Automated vehicles don’t just require new forms of validation, they also need updated standards for safety itself.
ISO 26262 is the functional safety standard for today’s vehicles. It covers the car’s hardware and low levels of software, defining specifications for parts to avoid causing failures. Automated vehicles, which rely on much higher levels of software and machine learning, require an entirely new way of thinking when it comes to functional safety.
To address this gap, the industry is developing a new standard, ISO 21448, known as Safety of the Intended Functionality (SOTIF). It seeks to avoid unreasonable risks that may occur, even if all of the vehicle components are operating correctly.
For example, if the deep neural networks operating in the vehicle misidentify a traffic sign or object in the road, it could create an unsafe situation even though the software has not malfunctioned.
Defining these conditions is complex, but through close collaboration, we can address this challenge. Representatives from NVIDIA make up two of the seven official technical experts representing the U.S. for ISO 26262 and SOTIF, including chairing the U.S. Technical Advisory Group. NVIDIA is also the international lead for ISO 26262 Part 10, which provides guidance on how to implement the standard.
By working together on these international standards, we can both share our experience and learn from others, building a strong foundation for the safe deployment of autonomous vehicles.
Simple rule: if you can't judge distances, you shouldn't drive. The problem: judging distances is anything but simple.
We humans, of course, have two high-resolution, highly synchronized visual sensors — our eyes — that let us gauge distances using stereo-vision processing in our brain.
A comparable dual-camera stereo vision system in a self-driving car, however, would be highly sensitive to synchronization. If the cameras are even slightly out of sync, the result is what's known as "timing misalignment," which produces inaccurate distance estimates.
(Video: NVIDIA DRIVE Labs: Perceiving a New Dimension)
That’s why we perform distance-to-object detection using data from a single camera. Using just one camera, however, presents its own set of challenges.
Before the advent of deep neural networks, a common way to compute distance to objects from single-camera images was to assume the ground is flat. Under this assumption, the three-dimensional world was modeled using two-dimensional information from a camera image. Optics geometry would be used to estimate the distance of an object from the reference vehicle.
That doesn’t always work in the real world, though. Going up or down a hill can cause an inaccurate result because the ground just isn’t flat.
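The flat-ground estimate can be written as a few lines of pinhole-camera geometry; the focal length and camera height below are illustrative values, not those of a production system:

```python
def flat_ground_distance(y_pixel, horizon_y, focal_px, cam_height_m):
    """Distance to the point where an object touches the ground, assuming the
    ground is a perfectly flat plane (pinhole camera, no tilt)."""
    dy = y_pixel - horizon_y          # pixels below the horizon line
    if dy <= 0:
        raise ValueError("point is at or above the horizon")
    return focal_px * cam_height_m / dy

# Illustrative parameters: 1000 px focal length, camera mounted 1.5 m high.
d = flat_ground_distance(y_pixel=650, horizon_y=600, focal_px=1000, cam_height_m=1.5)
print(f"{d:.1f} m")   # 30.0 m

# On a downhill grade the ground contact appears higher in the image (smaller dy),
# so the same object is estimated as farther away than it really is -- exactly the
# failure mode described above.
```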
Such faulty estimates can have negative consequences. Automated cruise control, lane change safety checks and lane change execution all rely on judging distances correctly.
A distance overestimate — determining that the object is further away than it is — could result in failure to engage automatic cruise control. Or even more critically, a failure to engage automatic emergency braking features.
Incorrectly determining an obstacle is closer than it is could result in other failures, too, such as engaging cruise control or emergency braking when they’re not needed.
Going the Distance with Deep Learning
To get it right, we use convolutional neural networks and data from a single front camera. The DNN is trained to predict the distance to objects using radar and lidar sensor data as ground-truth information. Engineers know this information is accurate because direct reflections of transmitted radar and lidar signals provide precise distance-to-object information, regardless of a road's topology.
By training the neural networks on radar and lidar data instead of relying on the flat ground assumption, we enable the DNN to estimate distance to objects from a single camera, even when the vehicle is going up or down hill.
Camera DNN distance-to-object detection in a highway tunnel environment. Green bounding boxes denote object detections. The number shown at the top of each box is the radial distance in meters between the center of the ego car’s rear axle and the detected object.
In our training implementation, the creation and encoding pipeline for ground truth data is automated. So the DNN can be trained with as much data as can be collected by sensors — without manual labeling resources becoming a bottleneck.
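Conceptually, the automated labeling step can be sketched as matching projected lidar returns to detected object boxes. The data structures and matching rule here are illustrative stand-ins, not NVIDIA's actual pipeline:

```python
# Minimal sketch of automated ground-truth generation: lidar points projected into
# the image plane label each detected object box with a measured distance.
def label_boxes_with_lidar(boxes, lidar_points_2d):
    """boxes: list of (x1, y1, x2, y2) detections; lidar_points_2d: list of
    (u, v, range_m) returns already projected into the camera image.
    Returns (box, distance) pairs usable as regression targets."""
    labels = []
    for (x1, y1, x2, y2) in boxes:
        hits = [r for (u, v, r) in lidar_points_2d if x1 <= u <= x2 and y1 <= v <= y2]
        if hits:
            labels.append(((x1, y1, x2, y2), min(hits)))  # nearest return in the box
    return labels

boxes = [(100, 200, 180, 260)]
points = [(150, 230, 42.7), (120, 210, 43.1), (500, 500, 9.0)]
labels = label_boxes_with_lidar(boxes, points)
print(labels)   # the box is labeled with the 42.7 m return
```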
We use DNN-based distance-to-object estimation in combination with object detection and camera-based tracking for both longitudinal, or speeding up and slowing down, and lateral control, or steering. To learn more about distance-to-object computation using DNNs, visit our DRIVE Networks page.
Volvo Group and NVIDIA are delivering autonomy to the world’s transportation industries, using AI to revolutionize how people and products move all over the world.
At its headquarters in Gothenburg, Sweden, Volvo Group announced Tuesday that it’s using the NVIDIA DRIVE end-to-end autonomous driving platform to train, test and deploy self-driving AI vehicles, targeting public transport, freight transport, refuse and recycling collection, construction, mining, forestry and more.
By injecting AI into these industries, Volvo Group and NVIDIA can create amazing new vehicles and deliver more productive services.
The two companies are co-locating engineering teams in Gothenburg and Silicon Valley. Together, they will build on the DRIVE AGX Pegasus platform for in-vehicle AI computing and utilize the full DRIVE AV software stack for 360-degree sensor processing, perception, map localization and path planning. They will also test and validate these systems using the NVIDIA DRIVE hardware-in-the-loop simulation platform.
Speaking a few hours later to 150 investors and media at the company’s annual event for the capital-markets community, Volvo Group CEO Martin Lundstedt praised its long partnership with NVIDIA, which continues to develop.
“Partnership is the new leadership,” he said, with NVIDIA founder and CEO Jensen Huang beside him on stage. “If we are to succeed in the future with the speed, quality and safety, and to gain benefits of autonomous driving, we need to partner up with the best guys. In this world of unknowns, you need a partnership built on trust.”
Huang called the Volvo Group partnership a landmark for the trucking industry — the largest, most far-reaching of its kind — which will point the way to the future of transportation infused with technology.
“Our two industries have been separate since their founding,” he said. “Today we announced one partnership to develop the future together. For the first time, we can imagine supplying AI for something that wasn’t possible before, the automation of transportation.”
The Volvo Group will introduce solutions based on NVIDIA technology that enable fully autonomous vehicles and machines.
Apply AV technology to an entire lineup of trucks operating around the world, and the potential benefits become enormous. Industries from public and freight transport to forestry and construction become more efficient, with vehicles that can work longer and travel farther.
Truckload of Demand
The demands of today’s online shopping are putting even greater stress on the world’s transport systems. Expectations for overnight or same-day deliveries create challenges that can be addressed by autonomous trucks.
Already, more than 35 million packages are delivered worldwide each day, a number growing by up to 28 percent annually. By 2040, delivery services will have to travel another 78 billion miles each year to handle goods ordered online, according to consultancy firm KPMG.
Autonomous trucks are arriving just in time to meet this demand. They can operate 24 hours a day, improving delivery times, and with increased efficiency, can bring down the annual cost of logistics in the U.S. by 45% — between $85 billion and $125 billion, according to experts at McKinsey.
From automating short, routine trips like the loading and unloading of containers on cargo ships and managing port operations, to autonomously driving on the highway, Volvo’s new generation of vehicles can dramatically streamline the shipping industry.
By combining the high-performance, end-to-end NVIDIA DRIVE solutions with the scale of the second-largest truck maker globally, NVIDIA and Volvo Group can bring the efficiencies of autonomous trucking to the world’s markets sooner.
And before those vehicles reach the road, Volvo Group will be utilizing NVIDIA DRIVE Constellation to test and validate AVs, ensuring they can handle diverse operating challenges all over the world.
By leveraging hardware-in-the-loop simulation, the companies can test the autonomous driving systems on the same hardware and software that will run in the vehicle, at a significantly greater scale.
With a partnership capable of delivering crucial autonomy at a global level, NVIDIA and Volvo Group are ready for the long haul.
Thanks to GPUs, practical vehicle design doesn’t have to be boring.
Last month, Volkswagen used the latest in simulation and compute technologies to tailor its design process for both fuel efficiency and style.
Striking the elusive balance between form and function called for a collaborative effort. Altair, a global technology company, developed a computational fluid dynamics (CFD) solver using NVIDIA GPUs running on Amazon Web Services’ cloud, allowing the automaker to test the aerodynamics of vehicle designs in simulation.
With a GPU-based system, Volkswagen could save up to 70 percent of its hardware costs, according to Altair. And it could execute design cycles in a fraction of the time it currently takes to simulate and measure aerodynamics.
This makes it possible to design for both fuel efficiency and design elements simultaneously.
Clean Lines, Cleaner Emissions
To help curb overall emissions, the auto industry is working to improve fuel efficiency across its lineups. By next year, the U.S. Environmental Protection Agency requires that each automaker's fleet-wide average fuel efficiency reach 36.9 miles per gallon or higher.
While breakthroughs in hybrid and battery electric vehicle technology have helped move the industry toward these greener targets, other strategies are helping extend the time between trips to the pump.
Primary among these is aerodynamics. By improving the way a vehicle moves through the air (reducing the coefficient of drag), automakers can save significantly on fuel usage.
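The underlying physics is the standard drag equation, F = 0.5 * rho * Cd * A * v^2. A small sketch, with illustrative vehicle figures rather than Volkswagen's, shows how much a modest drop in drag coefficient matters at highway speed:

```python
# Aerodynamic drag force: F = 0.5 * rho * Cd * A * v^2.
# The Cd and frontal-area values below are illustrative, not Volkswagen's.
def drag_force(cd, frontal_area_m2, speed_ms, air_density=1.225):
    return 0.5 * air_density * cd * frontal_area_m2 * speed_ms ** 2

v = 120 / 3.6                      # 120 km/h expressed in m/s
baseline = drag_force(cd=0.30, frontal_area_m2=2.2, speed_ms=v)
improved = drag_force(cd=0.27, frontal_area_m2=2.2, speed_ms=v)

# Drag scales linearly with Cd, so a 10% Cd reduction cuts drag 10% at any speed.
print(f"{(1 - improved / baseline):.0%} less drag at highway speed")
```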
However, an aerodynamic design doesn’t always agree with style and function. Think of a racecar, which relies on exceptional aerodynamics for speed, but may not make for the most comfortable daily commute.
Testing in the Virtual World
Simulation has become a powerful tool for automakers to test out new designs and technology before a new model hits the road. However, many traditional forms of design simulation can be slow to implement, making it harder to successfully engineer for optimized aerodynamics and style at the same time.
With Altair’s ultraFluidX CFD offering, Volkswagen solved some of the major hurdles to designing sleek new vehicles efficiently.
The automaker trialed the collaborative system on one of its most popular models, the Volkswagen Jetta. The CFD solver — which leverages NVIDIA V100 Tensor Core GPUs in the cloud — made it possible to predict aerodynamic performance in real time. This in turn accelerated the design simulation process and the ability to test out a variety of designs faster.
“We were able to run 200 car shape variants in a time frame that would normally correspond to only a few runs with our current operational tools,” said Henry Bensler, head of computer-aided engineering at Volkswagen Group Research.
With the performance of GPU technology, Volkswagen engineers were able to improve their simulation results, producing impressive robustness and efficiency, without sacrificing style.
To learn more about the vehicle design trial, read the full case study here.
Snail mail has a new set of futuristic, faster wheels.
NVIDIA DRIVE partner and autonomous trucking startup TuSimple has been hauling mail more than 1,000 miles between Phoenix and Dallas as part of a two-week pilot with the U.S. Postal Service.
Halfway through the test, the self-driving prototypes from TuSimple — which is also an NVIDIA Inception member — have been arriving at the delivery hubs earlier than expected.
“In just a week, we’ve been able to operate safe and efficient deliveries autonomously,” said Chuck Price, chief product officer at TuSimple.
The pilot consists of five round trips, each covering nearly 2,200 miles along the I-10, I-20 and I-30 corridors. The commonly traveled route typically takes human drivers about 48 hours to complete.
This trip length creates a logistical challenge for shipping companies like the USPS. Regulations limit truckers to 11 hours at a stretch and there’s a growing driver shortage. The American Trucking Association estimates the industry is short 50,000 drivers, a number that is expected to more than triple to 175,000 by 2024.
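A rough calculation using the figures quoted above shows why the route is hard to cover with a single human driver:

```python
import math

ROUND_TRIP_MILES = 2200        # length of one round trip, from the pilot description
DRIVE_HOURS = 48               # typical human driving time quoted for the route
MAX_SHIFT_HOURS = 11           # federal hours-of-service limit per driving shift

avg_speed = ROUND_TRIP_MILES / DRIVE_HOURS              # roughly 46 mph
shifts_needed = math.ceil(DRIVE_HOURS / MAX_SHIFT_HOURS)
print(f"{avg_speed:.0f} mph average, {shifts_needed} driving shifts per round trip")
```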
By incorporating autonomous driving technology into these long-haul trips, shippers can improve efficiency, ease the strain on drivers and deliver more goods faster.
Partnership for the Long Haul
While TuSimple’s trucks can operate on surface streets and highways, this project begins with highway driving only, with two human operators supervising the system. This type of geofenced autonomous driving is known as Level 4.
Achieving this level of autonomy requires high-performance, energy-efficient compute capability, which enables the vehicle to process sensor data and perceive objects in real time.
TuSimple’s trucks use NVIDIA technology to perform onboard processing as well as to train its deep learning algorithms to recognize specific objects like traffic signs and emergency vehicles.
“NVIDIA is the only company able to deliver the technology we need to achieve these milestones,” Price said.
The Roads More Traveled
The USPS pilot is just the start for TuSimple’s autonomous trucks delivering goods across state lines.
The startup already has 15 contracts with shipping companies and travels routes around Tucson, Ariz. Sixty percent of economic activity in the U.S. lies in the freight that travels along the I-10 corridor, which connects the southwestern states and makes up a significant portion of the current pilot.
With the help of NVIDIA DRIVE technology, TuSimple plans to expand its efficient autonomous trucking technology to every corner of the U.S.
“We wouldn’t be able to do it without NVIDIA,” Price said.
When it comes to autonomous vehicle sensor suites, one size does not fit all.
From Level 2+ automated driving to fully autonomous robotaxis, and from cars to shuttles to trucks, the wide range of autonomous vehicle types requires a variety of different sensors to safely operate. These can include different sensor types and models from a number of different manufacturers.
That’s why NVIDIA built DRIVE AGX as an open platform to seamlessly incorporate new sensors for more efficient autonomous driving development.
Sensors are a key component to making a vehicle driverless. Cameras, radar and lidar enable an autonomous vehicle to visualize its surroundings, detect objects and implement interior features such as driver monitoring and customized passenger experiences.
Typically, autonomous driving developers tailor algorithms to a specific sensor suite. This means any time a new sensor is added or the configuration changes, developers must rewrite the software.
DRIVE AGX helps speed up this development with the DriveWorks sensor abstraction layer (SAL), which allows developers to change or add sensors without having to make major software changes.
Keep on Plugin
The DriveWorks software development kit (SDK) runs on the DRIVE AGX AI supercomputer and, with the addition of the DriveWorks SAL, can enable plugins for sensors that aren’t already incorporated on the platform.
The plugins consist of a few lines of code that translate the new sensor’s data into the DriveWorks generalized data structures.
For example, an algorithm running on the DRIVE AGX platform will be able to use point cloud data from a variety of different lidar sensors — which use lasers to sense the environment and help build a 3D understanding of the vehicle’s surroundings — without having to rewrite the software.
To integrate new sensors into the DriveWorks SDK and DRIVE AGX platform, developers only need to implement a few functions to create a plugin. However, there is flexibility to make the plugin as complex as needed. Once the sensors are integrated with DriveWorks, developers have all the optimized algorithms and tools right at their fingertips.
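The real DriveWorks SAL is a C/C++ API, so the following Python sketch is purely conceptual: every class and method name is hypothetical, meant only to illustrate how a thin plugin isolates vendor-specific parsing from downstream algorithms:

```python
# Conceptual sketch of a sensor abstraction layer. All names here are hypothetical
# illustrations of the pattern, not the actual DriveWorks API.
class GenericPointCloud:
    """The generalized structure downstream perception algorithms consume."""
    def __init__(self, points):
        self.points = points       # list of (x, y, z, intensity) tuples

class LidarPlugin:
    """Base interface: a plugin translates raw vendor packets into the
    generalized structure, so perception code never changes per vendor."""
    def decode(self, raw_packet):
        raise NotImplementedError

class VendorXLidarPlugin(LidarPlugin):
    def decode(self, raw_packet):
        # Vendor-specific parsing lives entirely inside the plugin.
        points = [(p["x"], p["y"], p["z"], p["i"]) for p in raw_packet]
        return GenericPointCloud(points)

cloud = VendorXLidarPlugin().decode([{"x": 1.0, "y": 0.5, "z": 0.2, "i": 80}])
print(len(cloud.points))   # 1
```

Swapping in a different lidar then means writing one new `decode`, while every algorithm built on `GenericPointCloud` keeps working unchanged.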
A Growing Ecosystem
Because the DRIVE AGX platform is open, a wide ecosystem of automakers, suppliers, software startups, mapmakers, trucking companies and even sensor manufacturers can leverage it for flexible and efficient autonomous vehicle development.
To learn more about integrating new sensors into the DriveWorks SDK, register for our upcoming webinar, and read more about DRIVE AGX.
Editor’s note: This is the latest post in our NVIDIA DRIVE Labs series. With this series, we’re taking an engineering-focused look at individual autonomous vehicle challenges and how the NVIDIA DRIVE AV Software team is mastering them. Catch up on our earlier posts here.
MISSION: Predicting the Future Motion of Objects
APPROACH: Recurrent Neural Networks (RNNs)
From distracted drivers crossing over lanes to pedestrians darting out from between parked cars, driving can be unpredictable. Such unexpected maneuvers mean drivers have to plan for different futures while behind the wheel.
If we could accurately predict whether a car will move in front of ours or if a pedestrian will cross the street, we could make optimal planning decisions for our own actions.
Autonomous vehicles face the same challenge, and use computational methods and sensor data, such as a sequence of images, to figure out how an object is moving in time. Such temporal information can be used by the self-driving car to correctly anticipate future actions of surrounding traffic and adjust its trajectory as needed.
The key is to analyze temporal information in an image sequence in a way that generates accurate future motion predictions despite the presence of uncertainty and unpredictability.
To perform this analysis, we use a member of the sequential deep neural network family known as recurrent neural networks (RNNs).
What Is an RNN?
Typical convolutional neural networks (CNNs) process information in a given image frame independently of what they have learned from previous frames. However, RNN structure supports memory, such that it can leverage past insights when computing future predictions.
RNNs thus offer a natural way to take in a temporal sequence of images (that is, video) and produce state-of-the-art temporal prediction results.
With their capacity to learn from large amounts of temporal data, RNNs have important advantages. Because they don’t rely solely on local, frame-by-frame, pixel-based changes in an image, they increase prediction robustness for the motion of non-rigid objects, like pedestrians and animals.
RNNs also enable the use of contextual information, such as how a given object appears to be moving relative to its static surroundings, when predicting its future motion (that is, its future position and velocity).
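To make the memory mechanism concrete, here is a minimal Elman-style recurrence in NumPy; the dimensions and random weights are illustrative, not those of a production network:

```python
import numpy as np

# Minimal Elman-style RNN cell: the hidden state h carries memory across frames.
# Weights are random for illustration; a real network learns them from data.
rng = np.random.default_rng(0)
W_xh = rng.normal(size=(8, 4)) * 0.1   # input (4-dim per-frame feature) -> hidden
W_hh = rng.normal(size=(8, 8)) * 0.1   # hidden -> hidden (the recurrence itself)

def step(h, x):
    return np.tanh(W_xh @ x + W_hh @ h)

h = np.zeros(8)
frames = [rng.normal(size=4) for _ in range(5)]   # stand-ins for per-frame features
for x in frames:
    h = step(h, x)   # h now summarizes the whole sequence, not just the last frame
print(h.shape)       # (8,)
```

This is the structural difference from a CNN acting on one frame: remove `W_hh @ h` and the output depends only on the current frame, with no memory of the past.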
Using Cross-Sensor Data to Train RNNs
Radar and lidar sensors are very good at measuring object velocity. Consequently, in our approach, we use data from both to generate ground truth information to train the RNN to predict object velocity rather than seeking to extract this information from human-labeled camera images.
White boxes indicate current object locations predicted by the RNN, while the yellow boxes are the RNN’s predictions about where these objects will move in the future.
Specifically, we propagate the lidar and radar information into the camera domain to label camera images with velocity data. This lets us exploit cross-sensor fusion to create an automated data pipeline that generates ground truth information for RNN training.
The RNN output consists of time-to-collision (TTC), future position and future velocity predictions for each dynamic object detected in the scene (for example, cars and pedestrians). These results can provide essential input information to longitudinal control functions in an autonomous vehicle, such as automatic cruise control and automatic emergency braking.
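As a simple illustration of how TTC relates to the other two outputs, here is the constant-velocity form of the calculation. The constant closing speed is an assumption of this sketch; a real system also accounts for acceleration:

```python
# Time-to-collision under a constant-velocity assumption: the simplest relation
# between an object's predicted distance and closing speed. Values are illustrative.
def time_to_collision(distance_m, closing_speed_ms):
    """Seconds until contact if closing speed stays constant; None if the
    object is holding distance or pulling away."""
    if closing_speed_ms <= 0:
        return None
    return distance_m / closing_speed_ms

ttc = time_to_collision(40.0, 8.0)
print(ttc)   # 5.0 seconds to plan, slow down or brake
```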
With RNNs’ ability to learn from the past, we’re able to create a safer future for autonomous vehicles.