Google Launches Two New Datasets For Building Superior Conversational AI Models

August 8, 2021

Google has released two datasets, TimeDial and Disfl-QA, to help NLP models handle natural conversation. Natural conversations contain interruptions and temporal relationships, which make it difficult for NLP models to hold an interactive conversation.

Interruptions, also called disfluencies, include interjections, repetitions, restarts, and corrections. Temporal relationships describe how events relate in time, such as whether one event precedes or follows another.

According to Google, natural language processing models struggle when confronted with interruptions and temporal relationships. Progress on this problem has lagged for several reasons, one of the most prominent being the absence of datasets that represent natural conversations.

To address this challenge in conversational AI, developers and researchers can use the two datasets newly released by Google: TimeDial and Disfl-QA. TimeDial targets temporal commonsense reasoning in dialogue, while Disfl-QA focuses on contextual disfluencies.

With an annotated test set of over 1.1k dialogs, TimeDial lets you test language models’ capabilities for temporal reasoning. Even large language models like GPT-3 struggle with temporal reasoning, which makes it difficult to build better conversational models.
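To make the idea of testing temporal reasoning concrete, here is a minimal sketch of scoring a model on fill-in-the-blank dialog items. Everything in it is an assumption made for illustration: the `TemporalItem` layout, the `<MASK>` placeholder, the two toy dialogs, and the `always_first` baseline are hypothetical and are not taken from the TimeDial release or from Google's evaluation code.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Sequence


@dataclass
class TemporalItem:
    """Hypothetical test item: a dialog with a blank and candidate fillers."""
    dialog: str
    options: Sequence[str]
    correct: FrozenSet[int]  # indices of the acceptable options


def accuracy(items: Sequence[TemporalItem],
             choose: Callable[[str, Sequence[str]], int]) -> float:
    """Return the fraction of items where the chosen option is acceptable."""
    if not items:
        return 0.0
    hits = sum(1 for item in items
               if choose(item.dialog, item.options) in item.correct)
    return hits / len(items)


def always_first(dialog: str, options: Sequence[str]) -> int:
    """Trivial stand-in for a model: always picks the first option."""
    return 0


# Toy items invented for this sketch, not drawn from any dataset.
items = [
    TemporalItem(
        "A: The meeting starts at 10. B: It is 9:30 now, so we have <MASK>.",
        ["half an hour", "three days"],
        frozenset({0}),
    ),
    TemporalItem(
        "A: I fell asleep at midnight and woke at 6 am. B: So you slept <MASK>.",
        ["two weeks", "six hours"],
        frozenset({1}),
    ),
]

print(accuracy(items, always_first))
```

Replacing `always_first` with a function that calls a real language model would give that model's accuracy on the same items, which is one simple way to compare models on this kind of test.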

“Disfl-QA is the first dataset containing contextual disfluencies in an information seeking setting, namely question answering over Wikipedia passages, with ~12k human annotated disfluent questions. These benchmark datasets are the first of their kind and show a significant gap between human performance and current state of the art NLP models,” mentioned Google’s researchers in a blog post.
