How to control Googlebot’s interaction with your website

Google's Search Relations team answered several questions about web page indexing on the latest episode of the “Search Off The Record” podcast.

Topics included how to block Googlebot from crawling specific sections of a page and how to prevent Googlebot from accessing a site entirely.

Google’s John Mueller and Gary Illyes answered the questions examined in this article.

Blocking Googlebot from specific sections of the web page

Mueller says it's impossible when asked how to prevent Googlebot from crawling specific sections of web pages, such as “also purchased” areas on product pages.

“The short version is that you can’t block a specific section of an HTML page from being crawled,” Mueller said.

He went on to offer two potential strategies to address the problem, neither of which, he stressed, is an ideal solution.

Mueller suggested using the data-nosnippet HTML attribute to prevent the text from appearing in a search snippet.
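As a rough illustration of where the attribute goes, the sketch below marks a hypothetical “also purchased” block with data-nosnippet and then collects the text that sits outside it. The page fragment, class names, and the snippet_eligible_text helper are assumptions made for this example, and the parser only approximates the idea; it does not reproduce how Google actually builds snippets. Note that the marked block is still crawled; the attribute only affects what may be shown in a snippet.

```python
from html.parser import HTMLParser

# Hypothetical product page fragment. The "also purchased" block carries
# the data-nosnippet attribute, which Google supports on span, div, and
# section elements.
PAGE = """
<div class="product">
  <p>Handmade oak desk with two drawers.</p>
  <div class="also-purchased" data-nosnippet>
    <p>Customers also purchased: desk lamp, chair mat.</p>
  </div>
</div>
"""

# Elements without a closing tag; they must not affect depth tracking.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class SnippetTextCollector(HTMLParser):
    """Collects text that sits outside any data-nosnippet element."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0  # > 0 while inside a data-nosnippet subtree
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        marked = any(name == "data-nosnippet" for name, _ in attrs)
        if self.skip_depth or marked:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self.skip_depth:
            self.chunks.append(text)


def snippet_eligible_text(html: str) -> str:
    """Return the text outside data-nosnippet blocks (illustrative only)."""
    collector = SnippetTextCollector()
    collector.feed(html)
    return " ".join(collector.chunks)


print(snippet_eligible_text(PAGE))
```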

Alternatively, you can use an iframe or JavaScript with the source blocked by robots.txt, although he warned that this is not a good idea.

“Using a robotted iframe or JavaScript file can cause crawling and indexing issues that are difficult to diagnose and resolve,” Mueller said.

He assured everyone listening that if the content in question is being reused on multiple pages, it’s not a problem that needs to be fixed.

“There is no need to prevent Googlebot from seeing this type of duplication,” he added.

Blocking Googlebot from accessing a website

In response to a question about how to prevent Googlebot from accessing any part of a site, Illyes provided an easy-to-follow solution.

“The easiest way is robots.txt: if you add a disallow: / for the Googlebot user agent, Googlebot will leave your site alone as long as you maintain this rule,” Illyes explained.
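A quick way to sanity-check such a rule before deploying it is Python's standard urllib.robotparser module. The sketch below assumes a robots.txt containing only a Disallow: / rule for the Googlebot user agent; the example.com URL, the OtherBot name, and the is_allowed helper are placeholders for illustration. It shows how one standard parser reads the rule, not a guarantee of how Googlebot itself behaves.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt contents: block Googlebot from the whole site and
# leave every other crawler unaffected.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /
"""


def is_allowed(user_agent: str, url: str) -> bool:
    """Check whether the rules above let the given crawler token fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


# example.com is a placeholder domain.
print(is_allowed("Googlebot", "https://example.com/products/desk"))
print(is_allowed("OtherBot", "https://example.com/products/desk"))
```

Because the rule names only Googlebot, crawlers with other user agent tokens are not affected by it.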

For those looking for a more robust solution, Illyes offered another method:

“If you want to block even network access, you have to create firewall rules that load our IP ranges into a deny rule,” he said.

See Google’s official documentation to get a list of Googlebot IP addresses.
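The logic of such a deny rule can be sketched with Python's standard ipaddress module. The ranges below are placeholders taken from the reserved documentation blocks (192.0.2.0/24 and 2001:db8::/32), not Googlebot's real addresses; an actual rule would load the ranges Google publishes and would normally live in the firewall itself rather than in application code. The DENY_RANGES list and is_denied helper are names invented for this example.

```python
import ipaddress

# Placeholder ranges from the reserved documentation blocks. These are NOT
# Googlebot's addresses; substitute the ranges from Google's published list.
DENY_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]


def is_denied(client_ip: str) -> bool:
    """Return True if client_ip falls inside any range on the deny list."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in DENY_RANGES)


print(is_denied("192.0.2.15"))    # inside the first placeholder range
print(is_denied("198.51.100.7"))  # outside both placeholder ranges
```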

To sum up

While it’s impossible to prevent Googlebot from crawling specific sections of an HTML page, methods such as the data-nosnippet attribute provide some control over what appears in search snippets.

When considering blocking Googlebot from your site entirely, a simple disallow rule in your robots.txt file will do the trick. However, more extreme measures are also available, such as creating specific firewall rules.

Featured image generated by the author via Midjourney.

Source: Google, Search Off The Record

About the Author: Ted Simmons

I follow and report the current news trends on Google news.
