Google announced today that it is launching a public discussion on developing new protocols and guidelines for how AI systems access and use website content.
In a blog post, Google says it wants to explore “technical and ethical standards to enable web publisher choice and control for emerging AI and research use cases.”
The announcement follows Google’s recent I/O conference, where the company discussed new AI products and its AI Principles, which aim to ensure that AI systems are fair, transparent and responsible.
Google’s blog post says:
“We believe everyone benefits from a vibrant content ecosystem. The key is for web publishers to have meaningful choice and control over their content, and opportunities to gain value from participating in the web ecosystem.”
Google acknowledges that technical standards like robots.txt were created almost 30 years ago, long before modern AI technologies capable of analyzing web data at scale.
Robots.txt allows publishers to specify which parts of their sites search engine crawlers may access. However, it lacks mechanisms to address how AI systems can use data to train algorithms or develop new products.
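For context, robots.txt is a plain-text file at a site's root that names crawlers by user-agent and lists paths they may not fetch. A minimal sketch is below; note that the AI-specific user-agent shown is hypothetical, since no standardized directive for AI-training crawlers existed at the time of this announcement:

```text
# Allow all crawlers, but keep a private directory off-limits
User-agent: *
Disallow: /private/

# Hypothetical example: block a dedicated AI-training crawler site-wide
# (no standard AI user-agent token existed when this was written)
User-agent: ExampleAIBot
Disallow: /
```

This illustrates the gap Google describes: robots.txt can only grant or deny crawl access per user-agent and path; it has no vocabulary for saying "you may index this page for search, but not use it to train a model."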
Google invites members of the web and AI communities, including web publishers, academics, civil society groups, and its partners, to join a public discussion on the development of new ethical protocols and guidelines.
Google claims:
“We want this to be an open process and hope a wide range of stakeholders will engage to discuss how to balance AI progress with privacy, agency and data control.”
The discussion reflects a growing recognition that AI technologies can leverage web data in new ways that raise ethical challenges around data use, privacy, and bias.
By starting an open process, Google aims for a collaborative solution that addresses the interests of technology companies and content creators.
The outcome of these discussions could shape how AI systems interact with and use website data for years to come.
“The web has enabled a lot of progress, and AI has the potential to build on that progress,” says Google. “But we have to do it right.”
Criticism of Google’s data collection methods
Google’s announcement comes as it faces criticism over the amount of data it has already collected from around the web to train its AI systems and language models.
These data collection practices are outlined in an update to Google’s privacy policy.
Some in the SEO community argue that Google’s effort is too little, too late.
Barry Adams mocked the announcement on Twitter, saying:
“Now that we’ve trained our LLMs on all of your proprietary and copyrighted content, we’re finally going to start thinking about giving you a way to opt out of any of your future content being used to make us rich.”
Others argue that Google needs to do more to collect feedback in this process.
Travel blogger Nate Hake tweeted:
“‘Starting a discussion’ really requires letting the other party SAY something. This is just an email capture form. There’s no field to give feedback. Not even a confirmation message.”
AI is data-driven, but how much is too much?
AI systems need large amounts of data to function, improve and benefit society. However, the more data AI has access to, the greater the risks to personal privacy.
There are difficult trade-offs between enabling AI progress and protecting people’s information.
There is debate over whether people should be able to opt out of AI using their public social media data. Some say that individuals should control their data, while others say that this holds back the advancement of AI.
Both sides present valid arguments, and we are far from a consensus on the correct policy approach.
Looking ahead
Google’s call for discussion is a step in the right direction, but the company needs to move forward with implementing the feedback it receives.
Google is not alone in facing these challenges. All tech companies developing AI rely on data collected from the web. The discussion should involve the entire tech industry, not just Google.
Featured image: JDres/Shutterstock