Google won’t comment on a potentially massive leak of its search algorithm documentation

Google’s search algorithm is perhaps the most consequential system on the Internet, dictating which sites live and die and what web content looks like. But how exactly Google ranks websites has long been a mystery, pieced together by journalists, researchers and people working in search engine optimization.

Now, an explosive leak that purports to show thousands of pages of internal documents appears to offer an unprecedented look under the hood of how search works, and suggests that Google hasn’t been entirely truthful about it for years. Google has so far not responded to multiple requests for comment on the legitimacy of the documents.

Rand Fishkin, who worked in SEO for more than a decade, says a source shared 2,500 pages of documents with him in hopes that reporting the leak would counter “lies” Google employees had shared about how the search algorithm works. The documents describe Google’s search API and break down what information is available to employees, according to Fishkin.

The details Fishkin shares are dense and technical, probably more readable for developers and SEO experts than for laypeople. The content of the leak is also not necessarily proof that Google uses the specific data and signals it mentions for search rankings. Rather, the leak describes what data Google collects from web pages, sites and searchers and offers indirect hints to SEO experts about what Google seems to care about, as SEO expert Mike King wrote in his overview of the documents.

The leaked documents touch on topics such as what kind of data Google collects and uses, which sites Google elevates for sensitive topics like elections, how Google handles small websites, and more. According to Fishkin and King, some information in the documents appears to conflict with public statements by Google representatives.

“‘Lied’ is harsh, but it’s the only accurate word to use here,” King writes. “While I don’t necessarily want to fault Google’s public representatives for protecting their proprietary information, I do take issue with their efforts to actively discredit people in the worlds of marketing, technology, and journalism who have come up with reproducible discoveries.”

Google did not respond to The Verge’s requests for comment on the documents, including a direct request to refute their legitimacy. Fishkin told The Verge in an email that the company has not disputed the veracity of the leak, but that an employee asked him to change some language in the post about how an event was characterized.

Google’s secretive search algorithm has spawned an entire marketing industry that closely follows Google’s public guidance and executes it for millions of businesses around the world. The widespread, often annoying, tactics have led to a general narrative that Google Search results are getting worse, full of junk that website operators feel compelled to produce to get their sites seen. In response to The Verge’s previous reports on SEO-driven tactics, Google representatives often resort to a familiar defense: that’s not what Google’s guidelines say.

But some details in the leaked documents call into question the accuracy of Google’s public statements about how Search works.

One example cited by Fishkin and King is whether Google Chrome data is used in ranking. Google representatives have repeatedly indicated that the company doesn’t use data from Chrome to rank pages, but Chrome is specifically mentioned in the sections about how websites appear in Search. In the screenshot below, which I captured as an example, the links below the main vogue.com URL may be built in part with data from Chrome, according to the docs.

Chrome is mentioned in a section on how additional links are created. Image: Google

Another question that arises is what role, if any, EEAT plays in ranking. EEAT stands for experience, expertise, authoritativeness and trustworthiness, a Google metric used to evaluate the quality of results. Google representatives have previously said that EEAT is not a ranking factor. Fishkin notes that he hasn’t found much in the documents that mentions EEAT by name.

King, however, detailed how Google appears to collect author data from a page and has a field indicating whether an entity on the page is the author. Part of the documents shared by King says the field was “primarily developed and tuned for news articles … but is also populated for other content (e.g. scientific articles).” While this doesn’t confirm that bylines are an explicit ranking metric, it does show that Google at least tracks this attribute. Google representatives have previously insisted that author bylines are something website owners should add for readers, not for Google, because they don’t affect rankings.

While the documents aren’t exactly a smoking gun, they do offer a deep, unfiltered look at a well-protected black box system. The US government’s antitrust case against Google, which revolves around search, has also made internal documents public, offering more insight into how the company’s core product works.

Google’s general secrecy about how search works has led websites to look the same, as SEO marketers try to outsmart Google based on the hints the company offers. Fishkin also calls out publications that credulously present Google’s public claims as truth without much further analysis.

“Historically, some of the search industry’s loudest voices and most prolific publishers have been happy to uncritically repeat Google’s public statements. They write headlines like ‘Google says XYZ is true’ instead of ‘Google states XYZ; the evidence suggests otherwise,’” Fishkin writes. “Please do better. If this leak and the DOJ trial can only create one change, I hope this is it.”


About the Author: Ted Simmons

I follow and report the current news trends on Google news.
