Google’s latest Search Off The Record podcast discussed examples of disruptive incidents that can affect crawling and indexing and looked at the criteria for deciding whether or not to disclose the details of what happened.
Complicating the issue of making a statement is that there are times when SEOs and publishers report that search is broken when, from Google’s point of view, it works the way it’s supposed to.
Google Search has high uptime
The interesting part of the podcast started with the observation that Google Search (the home page with the search box) has an “extremely” high uptime and rarely goes down or becomes inaccessible. Most of the reported problems were due to routing issues on the internet itself rather than a failure of Google’s infrastructure.
Gary Illyes commented:
“Yes. The service that hosts the home page is the same service that hosts the status dashboard, the Google Search Status Dashboard, and it has an insane uptime number. … the number is like 99.999 whatever.”
John Mueller jokingly responded with the word “nein” (pronounced like the number nine), which means “no” in German:
“Nein. It’s never down. Nein.”
The Googlers acknowledged that the rest of Google Search on the backend does experience outages, and they explained how those are handled.
Google crawling and indexing incidents
Google’s ability to crawl and index web pages is critical to SEO and earnings. Disruption can have catastrophic consequences, especially for time-sensitive content such as ads, news, and sales events (to name a few).
Gary Illyes explained that there is a team at Google called Site Reliability Engineering (SRE) that is responsible for making sure public-facing systems run smoothly. There is an entire Google subdomain dedicated to site reliability, where they explain that they approach the task of keeping systems operational much as they would a software problem. They monitor services like Google Search, Ads, Gmail and YouTube.
The SRE page describes the scope of their mission as ranging from the very granular (fixing individual things) to solving large-scale problems that affect “continental service capacity” for billions of users.
Gary Illyes explained (at minute 3:18):
“The Site Reliability Engineering organization publishes their handbook on how they handle incidents. And many of the incidents get caught, incidents being issues with whatever systems. They catch them with automated processes, meaning that there are probes, for example, or there are certain rules that are set in the monitoring software that looks at the numbers.
And then if the number exceeds whatever value, an alert is triggered which is then captured by software like incident management software.”
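The rule-based alerting Gary describes can be illustrated with a minimal Python sketch. It is not Google's monitoring software. The metric names, limits, and readings below are hypothetical and chosen only to show the idea of a rule that fires when a number exceeds a set value.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Rule:
    """A monitoring rule: alert when a metric goes above a limit."""
    metric: str
    limit: float


@dataclass
class Alert:
    """An alert that would be handed to incident management software."""
    metric: str
    value: float
    limit: float


def evaluate(rules: List[Rule], readings: Dict[str, float]) -> List[Alert]:
    """Compare each reading against its rule and collect alerts."""
    alerts = []
    for rule in rules:
        value = readings.get(rule.metric)
        # Skip metrics with no reading; only a value above the limit alerts.
        if value is not None and value > rule.limit:
            alerts.append(Alert(rule.metric, value, rule.limit))
    return alerts


# Hypothetical rules and readings, for illustration only.
rules = [Rule("indexing_delay_minutes", 30.0), Rule("crawl_error_rate", 0.05)]
readings = {"indexing_delay_minutes": 95.0, "crawl_error_rate": 0.01}

alerts = evaluate(rules, readings)
for alert in alerts:
    print(f"ALERT: {alert.metric}={alert.value} exceeds {alert.limit}")
```

In this sketch only the indexing delay crosses its limit, so a single alert is produced. Whether that alert is a real problem or a false positive is a separate, human decision.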
February 2024 indexing problem
Gary then cited the February 2024 indexing issue as an example of how Google monitors and responds to incidents that could affect users in search. Part of the response is figuring out whether the problem is real or a false positive.
He explains:
“That’s also what happened on February 1. Basically, some number went haywire, and then that automatically opened an incident internally. We then have to decide if this is a false positive or something we really need to look at, as in we, the SRE people.
And, in this case, they decided that yes, this is valid. And then they raised the priority of the incident one step higher than whatever it was.
I think it was a minor incident at first and then they escalated it to a medium one. And then when it becomes medium, it ends up in our inbox. So we have a threshold for medium or higher. Yes.”
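The escalation path in this quote (minor, then medium, with a notification cutoff at medium or higher) can be sketched in Python as follows. This is an illustration under stated assumptions, not Google's incident tooling: the `MAJOR` level, the function names, and the one-step escalation helper are hypothetical. Only "minor", "medium", and the "medium or higher" threshold come from the quote.

```python
from enum import IntEnum


class Severity(IntEnum):
    MINOR = 1
    MEDIUM = 2
    MAJOR = 3  # Assumed top level; the podcast names only minor and medium.


# Cutoff taken from the "medium or higher" remark in the quote.
NOTIFY_THRESHOLD = Severity.MEDIUM


def escalate(severity: Severity) -> Severity:
    """Raise severity by one step, capping at the highest level."""
    return Severity(min(severity + 1, Severity.MAJOR))


def reaches_inbox(severity: Severity) -> bool:
    """True when the incident is severe enough to notify the other team."""
    return severity >= NOTIFY_THRESHOLD


severity = Severity.MINOR
print(severity.name, reaches_inbox(severity))  # MINOR False

severity = escalate(severity)
print(severity.name, reaches_inbox(severity))  # MEDIUM True
```

Using an ordered enum keeps the "medium or higher" check to a single comparison, which mirrors how Gary describes the threshold.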
Minor incidents are not publicly announced
Gary Illyes then explained that they don’t report every little incident that happens because most of the time users won’t even notice it. The most important consideration is whether the incident affects users.
An interesting detail about how Google decides what’s important is that issues that affect users are almost automatically escalated to a higher priority level. Gary said he doesn’t work in SRE, so he couldn’t comment on the exact number of users that must be affected before Google decides to make a public announcement.
Gary explained:
“SRE would investigate everything. If they get a polling alert, for example, or an alert based on any number, they’ll look at it and try to explain it to themselves.
And if it’s something that’s affecting users, that almost automatically means they need to increase the priority because users are actually affected.”
Disappearing images incident
Gary shared another example of an incident, this time involving images not being displayed to users. It was decided that although the user experience was affected, it did not prevent users from finding what they were looking for. The experience was degraded, but not to the point that Google became unusable. So priority escalation depends not just on whether users are affected by an incident, but on how badly the user experience is affected.
The missing images incident was a case where they decided not to make a public statement because users could still find the information they needed. Although Gary didn’t mention it, it sounds like a problem recipe bloggers have run into in the past, where their images stopped showing.
Gary explained:
“Like, for example, recently there was an incident where some images were missing. If I remember correctly, I chimed in and said, ‘This is stupid and we shouldn’t externalize it because the user impact isn’t actually bad,’ right? Users will literally just not get the images. It’s not like anything is broken. They just won’t see certain images on search results pages.
And, to me, that’s just, well, back to 1990 or back to 2008 or something. It’s like it’s still usable and everything is still dandy except for some images.”
Are publishers and SEOs considered?
Google’s John Mueller asked Gary whether the threshold for making a public announcement is based only on a degraded user experience, or whether the experience of publishers and SEOs is also considered.
Gary replied (around the 8 minute mark):
“So it’s Search Relations, not Site Owner Relations, from a search perspective.
But by extension, site owners would also care about their users. So if we care about their users, it’s the same group of people, right? Or is that too positive?”
Gary apparently sees his role as Search Relations in a general sense, with the focus on search users. This may surprise many in the SEO community because Google’s documentation for its Search Off The Record podcast describes the role of the Search Relations team differently:
“As the Google Search Relations team, we’re here to help site owners succeed with their websites on Google Search.”
Listening to the entire podcast, it’s clear that Googlers John Mueller and Lizzi Sassman are very focused on engaging with the search community. So perhaps a language issue caused his comment to be interpreted differently than he intended?
What does Search Relations mean?
Google explained that they have a process for deciding what to disclose about outages in search, and it’s a 100% sensible approach. But one thing to keep in mind is that the definition of “relations” is a connection between two or more parties.
Search is a relationship. It is an ecosystem with two partners: creators (SEOs and site owners), who create content, and Google, which makes it available to its users.
Featured image by Shutterstock/Khosro