
CMU and Meta AI Researchers Propose Reinforcement Learning Approach HACMan

HACMan's first technical innovation is an object-centric, spatially grounded, and temporally abstracted action representation.

Researchers from Carnegie Mellon University and Meta AI have proposed a method for performing challenging non-prehensile manipulation tasks that generalizes across object geometries with flexible interactions. They present a reinforcement learning (RL) approach for non-prehensile manipulation based on point cloud observations, called Hybrid Actor-Critic Maps for Manipulation (HACMan).

HACMan’s first technical innovation is an object-centric, spatially grounded, and temporally abstracted action representation. The agent first decides where to make contact, then selects a set of motion parameters that govern its subsequent behavior. The contact location is chosen from the point cloud of the observed object, which grounds the action spatially. Because learning focuses on the most contact-rich portion of the interaction, the robot’s decisions are also abstracted over time.
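As a minimal sketch, the hybrid action described above might look like the following data structure, assuming a NumPy point cloud of shape (N, 3); the names and the 3D motion delta are illustrative assumptions, not taken from the HACMan codebase.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class HybridAction:
    contact_index: int         # discrete: index of a point in the observed point cloud
    motion_params: np.ndarray  # continuous: motion after contact (e.g., a 3D end-effector delta)

def contact_location(action: HybridAction, point_cloud: np.ndarray) -> np.ndarray:
    """Resolve the discrete choice into a 3D contact location on the object."""
    return point_cloud[action.contact_index]  # point_cloud has shape (N, 3)
```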

HACMan’s second technical innovation is an actor-critic RL framework that implements this action representation. Since the motion parameters are defined over a continuous action space, the resulting action representation lives in a hybrid discrete-continuous action space.


The contact location, by contrast, is chosen over a discrete action space (by selecting one point from the object point cloud). HACMan’s critic network predicts a Q-value at every point of the object point cloud, while the actor network produces continuous motion parameters for each point.
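To make this per-point design concrete, here is a hedged PyTorch sketch of such an actor and critic. It assumes a point-cloud encoder (e.g., a PointNet-style network, not shown) has already produced a feature vector for every point; the layer sizes and the bounded Tanh output are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class PerPointActor(nn.Module):
    """Maps per-point features to continuous motion parameters at every point."""
    def __init__(self, feat_dim: int, motion_dim: int):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim, 128), nn.ReLU(),
            nn.Linear(128, motion_dim), nn.Tanh(),  # bounded motion parameters
        )

    def forward(self, point_feats: torch.Tensor) -> torch.Tensor:
        # point_feats: (N, feat_dim) -> motion parameters per point: (N, motion_dim)
        return self.mlp(point_feats)

class PerPointCritic(nn.Module):
    """Predicts one Q-value per point, given that point's motion parameters."""
    def __init__(self, feat_dim: int, motion_dim: int):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim + motion_dim, 128), nn.ReLU(),
            nn.Linear(128, 1),
        )

    def forward(self, point_feats: torch.Tensor, motions: torch.Tensor) -> torch.Tensor:
        # (N, feat_dim) + (N, motion_dim) -> per-point Q-values: (N,)
        return self.mlp(torch.cat([point_feats, motions], dim=-1)).squeeze(-1)
```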

In contrast to conventional RL algorithms for continuous action spaces, the per-point Q-values serve a dual purpose: they are used both to update the actor and as scores for selecting the contact location. The authors modify the update rule of a standard off-policy RL algorithm to accommodate this hybrid action space. They evaluate HACMan on a 6D object pose alignment task with randomized initial and goal poses and varied object shapes.
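As a rough sketch of how the per-point Q-values could play this dual role, the snippet below (building on the hypothetical actor and critic above) uses them both to pick the contact point greedily at action-selection time and as the actor's training objective. This is an illustration of the idea only; the authors' actual modified off-policy update may differ in its details.

```python
import torch

def select_action(actor, critic, point_feats):
    """Greedy action selection: the per-point Q-values score contact locations."""
    with torch.no_grad():
        motions = actor(point_feats)                # (N, motion_dim)
        q_per_point = critic(point_feats, motions)  # (N,)
        contact_index = torch.argmax(q_per_point)   # discrete choice over points
    return contact_index.item(), motions[contact_index]

def actor_loss(actor, critic, point_feats):
    """Actor update: push up the Q-values of the actor's motions at every point
    (critic parameters are held fixed during this step, as in standard actor-critic)."""
    motions = actor(point_feats)
    return -critic(point_feats, motions).mean()
```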


Sahil Pawar
I am a graduate with a bachelor's degree in statistics, mathematics, and physics. I have been working as a content writer for almost 3 years and have written for a plethora of domains. Besides, I have a vested interest in fashion and music.
