In a LinkedIn post, Google analyst Gary Illyes reiterated long-standing guidance for website owners: use robots.txt to prevent web crawlers from accessing URLs that trigger actions such as adding items to carts or wish lists.
Illyes highlighted a common complaint: servers overloaded by unnecessary crawler traffic, which often stems from search engine bots crawling URLs intended for user actions.
He wrote:
“Looking at what we’re crawling from complaint sites, too often these are action URLs like ‘add to cart’ and ‘add to wishlist’. They’re useless to crawlers, and you probably don’t want them crawled.”
To avoid this wasted server load, Illyes advised using robots.txt to block crawler access to URLs with parameters like “?add_to_cart” or “?add_to_wishlist.”
As an example, he suggested:
“If you have URLs like:
https://example.com/product/scented-candle-v1?add_to_cart
and
https://example.com/product/scented-candle-v1?add_to_wishlist
You should probably add a disallow rule for them in your robots.txt file.”
While using the HTTP POST method can also prevent these URLs from being crawled, Illyes noted that crawlers can still make POST requests, so robots.txt remains the recommended safeguard.
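For illustration, a minimal robots.txt sketch along the lines of Illyes’ example could look like this. The exact Disallow patterns are an assumption based on the URLs quoted above, and the * wildcard relies on Google’s documented support for wildcard matching in robots.txt rules:

```
User-agent: *
# Keep crawlers away from action URLs that only add items to carts or wishlists
Disallow: /*?add_to_cart
Disallow: /*?add_to_wishlist
```

If the action parameter can also appear after other query parameters (for example, ?variant=2&add_to_cart), a broader pattern such as Disallow: /*add_to_cart would be needed, so any rule is worth testing before it goes live.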
Reinforcing decades of best practices
Thread contributor Alan Perkins noted that this guidance echoes web standards introduced in the 1990s for the same reasons.
Perkins quoted the 1994 document “A Standard for Robot Exclusion”:
“In 1993 and 1994 there have been occasions when robots have visited WWW servers where they were not welcome for various reasons … robots traversed parts of WWW servers that were not suitable, for example, very deep virtual trees, duplicated information, temporary information or cgi-scripts with side effects (such as voting).”
The robots.txt standard, which proposes rules that well-behaved crawlers are expected to follow, emerged as a “consensus” solution among web stakeholders in 1994.
Obedience and exceptions
Illyes stated that Google’s crawlers fully obey robots.txt rules, with rare, well-documented exceptions for scenarios involving “contractual or user-triggered fetches.”
This adherence to the robots.txt protocol has been a mainstay of Google’s web crawling policies.
Why SEJ cares
While the advice may seem rudimentary, the resurgence of this decades-old best practice underscores its relevance.
By leveraging the robots.txt standard, sites can stop bandwidth-hungry crawlers from wasting resources on unproductive requests.
How this can help you
Whether you have a small blog or a major e-commerce platform, following Google’s advice on leveraging robots.txt to block crawler access to action URLs can help in several ways:
Reduced server load: You can reduce unnecessary server requests and bandwidth usage by preventing crawlers from reaching URLs that invoke actions such as adding items to carts or wishlists.
Improved crawler efficiency: Giving more explicit rules in the robots.txt file about which URLs crawlers should avoid can result in more efficient crawling of the pages/content you want to index and rank.
Better user experience: With server resources focused on actual user actions instead of wasted crawler visits, end users are likely to experience faster load times and smoother functionality.
Stay aligned with standards: Implementing this guidance keeps your site aligned with the widely adopted robots.txt protocol, which has been an industry best practice for decades.
Revising robots.txt directives could be a simple but impactful step for websites looking to exert more control over crawler activity.
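One way to test revised directives before deploying them is a short offline check. The sketch below is an illustration rather than part of Illyes’ advice: it assumes the third-party Protego library (a Python robots.txt parser that supports the wildcard syntax used above) and a hypothetical crawler name, and it confirms that the example action URLs are blocked while the plain product page stays crawlable:

```python
# Sketch: verify that the assumed disallow rules block action URLs but not product pages.
# Requires the third-party Protego parser: pip install protego
from protego import Protego

ROBOTS_TXT = """
User-agent: *
Disallow: /*?add_to_cart
Disallow: /*?add_to_wishlist
"""

rules = Protego.parse(ROBOTS_TXT)

# The example action URLs should be disallowed for crawlers ...
print(rules.can_fetch("https://example.com/product/scented-candle-v1?add_to_cart", "ExampleBot"))      # expected: False
print(rules.can_fetch("https://example.com/product/scented-candle-v1?add_to_wishlist", "ExampleBot"))  # expected: False

# ... while the product page itself should remain crawlable.
print(rules.can_fetch("https://example.com/product/scented-candle-v1", "ExampleBot"))                  # expected: True
```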
Illyes’ messaging indicates that the old robots.txt rules are still relevant in our modern web environment.
Featured Image: BestForBest/Shutterstock