Wednesday, May 29, 2024

Google Launches Two New Datasets For Building Superior Conversational AI Models

Google has launched two datasets, TimeDial and Disfl-QA, to help make NLP models more natural in conversation. Natural conversations include interruptions and temporal relationships, which makes it difficult for NLP models to hold an interactive conversation.

Interruptions here include interjections, repetitions, restarts, and corrections, while temporal relationships describe how events are ordered: whether one event precedes or follows another.

According to Google, natural language processing models struggle to deliver good results when confronted with interruptions and temporal relationships. Over the years, progress on NLP models that overcome this challenge has lagged for several reasons. One of the most prominent is the absence of datasets that represent natural conversations.


To address this challenge in conversational AI, developers and researchers can use the two datasets newly released by Google, TimeDial and Disfl-QA. TimeDial targets temporal commonsense reasoning in dialogue, while Disfl-QA focuses on contextual disfluencies.

With an annotated test set of over 1.1k dialogs, TimeDial lets you test language models’ capabilities for temporal reasoning. Even large language models like GPT-3 struggle with temporal reasoning, which remains an obstacle to building better NLP models.

“Disfl-QA is the first dataset containing contextual disfluencies in an information seeking setting, namely question answering over Wikipedia passages, with ~12k human annotated disfluent questions. These benchmark datasets are the first of their kind and show a significant gap between human performance and current state of the art NLP models,” mentioned Google’s researchers in a blog post.


Ratan Kumar
Ratan is a tech content writer who amasses inspiration from science fiction, cartoons, and psychology. Apart from writing, you can find him playing mobile games and depicting humans.

