Google has unveiled a new tool called “Google-Extended,” giving website publishers the option to prevent their data from being used to train Google’s AI models while still remaining accessible through Google Search. This initiative aims to strike a balance between data accessibility and privacy concerns.
With Google-Extended, websites can continue to be crawled and indexed by web crawlers like Googlebot while keeping their data out of the ongoing development of AI models. This tool gives publishers control over how their content contributes to enhancing Google’s AI capabilities.
Google emphasizes that Google-Extended enables publishers to “manage whether their sites help improve Bard and Vertex AI generative APIs,” describing it as a simple toggle publishers can use to control access to their site’s content.
Google had previously confirmed its practice of training its AI chatbot, Bard, using publicly available data scraped from the web. The introduction of Google-Extended aligns with the company’s commitment to balancing data usage for AI development and respecting publishers’ preferences.
Google-Extended operates through robots.txt, the file that informs web crawlers about site access permissions. Google also indicates its intention to explore additional machine-readable methods for granting choice and control to web publishers as AI applications continue to expand. Further details on these approaches will be shared in the near future.
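In practice, opting out means adding a rule for the Google-Extended user agent to a site’s robots.txt file. A minimal sketch of such an entry, assuming a publisher wants to block AI training access site-wide while leaving ordinary Googlebot crawling untouched, might look like:

```
# Block Google's AI-training crawler token site-wide
User-agent: Google-Extended
Disallow: /

# Regular search indexing by Googlebot is unaffected
User-agent: Googlebot
Allow: /
```

Because Google-Extended is a standalone token rather than a separate crawler, existing Googlebot rules continue to govern search indexing independently of this directive.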