Reinforcement learning has been a cornerstone of the latest developments in artificial intelligence applications. In the past few years, researchers have leveraged reinforcement learning algorithms to build avant-garde models in robotics, gaming (AlphaGo), and self-driving vehicles.
Reinforcement learning aims to direct how machine learning models, also known as agents, should act in a given environment. The scope of its use is expanding, attracting more interest from the scientific community.
However, the primary problem with most reinforcement learning algorithms is that they can only tackle the particular task they were trained on and cannot generalize across tasks or domains. This is because most reinforcement learning agents are trained on limited data, often from a single application. As a result, these agents tend to become overly reliant on a single extrinsic reward, reducing their capacity to generalize in the real world. Hence, scientists are working on new RL models that can also provide satisfactory results in real-world scenarios. They are also working on RL models that take comparatively less time to find the solution that yields maximum rewards.
One of the most exciting opportunities for reinforcement learning research has been motion planning in self-driving vehicles. A self-driving vehicle (or an autonomous car) is a vehicle that travels between locations without the assistance of a human driver using a mix of sensors, cameras, radar, and artificial intelligence (AI). To be considered entirely autonomous, a vehicle must be able to go to a predefined location without human intervention on roads that have not been redesigned for its usage.
The most important task for a self-driving vehicle is interacting with its surroundings. The first phase is perception, in which you must assume that the vehicle is traveling in an open-context environment and train your model on as many real-world scenes and scenarios as possible. This is where a reinforcement learning agent comes in handy: it takes in environmental data and moves from one state to the next based on a set of rules to maximize rewards. These rewards can be either short-term, such as safe driving, or long-term, such as arriving at the destination early.
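To make the trade-off between short-term and long-term rewards concrete, here is a minimal sketch of a discounted return, a standard quantity in reinforcement learning. The reward values, the discount factor, and the function name discounted_return are illustrative assumptions and are not taken from any system described in this article.

```python
def discounted_return(rewards, gamma=0.9):
    """Sum the rewards of one episode, discounting each step by gamma."""
    total = 0.0
    # Work backwards so each reward is discounted once per step of delay.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# Hypothetical episodes: +1 for every safely driven step (short-term reward),
# +20 on reaching the destination (long-term reward).
fast_trip = [1.0, 1.0, 20.0]            # arrives after three steps
slow_trip = [1.0, 1.0, 1.0, 1.0, 20.0]  # arrives after five steps

print("fast trip:", discounted_return(fast_trip))
print("slow trip:", discounted_return(slow_trip))
```

With a discount factor below 1, the arrival bonus is worth less the later it comes, so under these assumed numbers the agent is nudged toward arriving early while still being rewarded for every safe step.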
According to a report published on arXiv last Wednesday by scientists at the University of California, Berkeley, the team constructed a wheeled robot that can traverse kilometers across residential terrain. The robot stays on pathways and avoids barriers it hasn’t encountered before. It is critical to note that it does not map its environment in detail, as some other systems do, including AI algorithms for autonomous driving.
Instead of a detailed map, it uses heuristics gleaned from thirty hours of footage of prior trips and some overhead landscape maps to generate an enhanced schematic of how stations along the route connect to one another. Dhruv Shah, a Ph.D. candidate, and Sergey Levine, an assistant professor at UC Berkeley, co-authored the study titled “ViKiNG: Vision-Based Kilometer-Scale Navigation with Geographic Hints.” Last year, Shah and Levine presented a predecessor method named “RECON,” which stands for “Rapid Exploration Controllers for Outcome-driven Navigation.” From the sound of the system names, it is obvious that both ViKiNG and RECON draw heavily on reinforcement learning.
Over the course of 18 months, RECON was trained by having the wheeled robot, a Clearpath Robotics Jackal autonomous ground vehicle, do “random walks” across various locations such as parking lots and fields, capturing hours of footage via mounted RGB cameras, LiDAR, and GPS. RECON learned “navigational priors” thanks to a neural network that compressed and decompressed image input through an “information bottleneck,” a signal processing method first proposed by Naftali Tishby and colleagues in 2000.
During the test phase, RECON was presented with an image of a destination, e.g., a specific building, and tasked with figuring out how to travel to that new location. RECON created an improvised map out of a graph of steps along a path to that destination. Using these tactics, the Jackal robot was able to navigate up to 80 meters toward a destination in unfamiliar settings it had never experienced before. It was able to do so even though every other method of robot navigation had failed to achieve the desired result.
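The idea of an improvised map built from a graph of steps can be illustrated with a small, generic example. The sketch below is not RECON's implementation: it assumes a hand-written graph of hypothetical waypoints with made-up distances in meters, and it uses Dijkstra's algorithm, a standard shortest-path method, to pick a route through that graph.

```python
import heapq


def shortest_path(graph, start, goal):
    """Dijkstra's algorithm over a dict-of-dicts graph; returns (cost, path)."""
    frontier = [(0.0, start, [start])]
    visited = set()
    while frontier:
        cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, edge_cost in graph.get(node, {}).items():
            if neighbor not in visited:
                heapq.heappush(
                    frontier, (cost + edge_cost, neighbor, path + [neighbor])
                )
    # The goal cannot be reached from the start.
    return float("inf"), []


# Hypothetical waypoints; edge weights stand in for travel distance in meters.
waypoints = {
    "start": {"parking_lot": 20.0, "field": 35.0},
    "parking_lot": {"sidewalk": 25.0},
    "field": {"sidewalk": 5.0, "building": 50.0},
    "sidewalk": {"building": 30.0},
}

cost, path = shortest_path(waypoints, "start", "building")
print(cost, path)
```

In a real robot the nodes would come from observations gathered while exploring, and the edge costs would be estimated rather than written by hand; the graph search itself stays the same.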
Next, the University of California, Berkeley team extended RECON in one specific way with ViKiNG: they provide Jackal’s software with either overhead satellite photos of the new landscape or overhead maps. Unlike RECON, which conducts an uninformed search, ViKiNG includes geographic hints in the form of estimated GPS locations and overhead maps, according to Shah. When exploring a new area, this allows ViKiNG to reach goals up to 25 times farther away than the farthest goal reported for RECON, and to reach them up to 15 times faster. When outfitted with ViKiNG, Jackal travels far beyond RECON’s 80 meters, traversing over 3 kilometers (nearly two miles) from start to finish.
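The difference between an uninformed search and one guided by a hint can be shown with a toy example. The sketch below is not ViKiNG's algorithm: it assumes an empty grid world, and it uses the straight-line (Manhattan) distance to the goal as a stand-in for a geographic hint such as a GPS estimate. With the hint switched off, the search is a plain uniform-cost search; with it switched on, it becomes the well-known A* search. The function name grid_search and all values are hypothetical.

```python
import heapq


def grid_search(size, start, goal, use_hint):
    """Best-first search on an open size x size grid.

    use_hint=False gives uniform-cost (uninformed) search.
    use_hint=True gives A* with Manhattan distance to the goal as the hint.
    Returns (path_length, cells_expanded).
    """
    def hint(cell):
        if not use_hint:
            return 0
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    frontier = [(hint(start), 0, start)]
    best_cost = {start: 0}
    closed = set()
    expanded = 0
    while frontier:
        _, cost, cell = heapq.heappop(frontier)
        if cell in closed:
            continue
        closed.add(cell)
        expanded += 1
        if cell == goal:
            return cost, expanded
        x, y = cell
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if 0 <= nxt[0] < size and 0 <= nxt[1] < size:
                new_cost = cost + 1
                if new_cost < best_cost.get(nxt, float("inf")):
                    best_cost[nxt] = new_cost
                    heapq.heappush(
                        frontier, (new_cost + hint(nxt), new_cost, nxt)
                    )
    return None, expanded


# Start in the middle of a 21 x 21 grid; the goal lies ten cells away.
blind_length, blind_expanded = grid_search(21, (10, 10), (10, 20), use_hint=False)
guided_length, guided_expanded = grid_search(21, (10, 10), (10, 20), use_hint=True)
print("uninformed:", blind_length, "steps,", blind_expanded, "cells expanded")
print("with hint: ", guided_length, "steps,", guided_expanded, "cells expanded")
```

Both searches find a route of the same length, but the hinted search examines far fewer cells because it is pulled toward the goal instead of spreading out evenly in every direction.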
Sources note that the ViKiNG program includes a further 12 hours of footage from “teleoperated” trips, in which a human led the Jackal along pathways such as sidewalks and hiking trails, to build on the earlier footage.
Further effort and trial-and-error testing are required to deal with a vehicle driving at high speeds and with unseen elements such as jaywalking pedestrians. The team is hopeful that the present study will lay the groundwork for full-scale autonomous cars. For now, the University of California, Berkeley team describes ViKiNG as the first step toward a “sidewalk delivery robot.” At the same time, this is a major win for the application of reinforcement learning in self-driving vehicles.