3 reasons not to block GPTBot from crawling your site


The next phase of ChatGPT’s meteoric rise is the adoption of GPTBot. This new iteration of OpenAI’s technology crawls webpages to add depth to the output ChatGPT can provide.

An improvement to the AI seems like a positive, but it’s not that clear-cut. Legal and ethical issues surround the technology.

The arrival of GPTBot has highlighted these concerns as many big brands are blocking it instead of realizing its potential.

But I really think there’s a lot more to gain than to lose by fully (and responsibly) adopting GPTBot.

Why do AI bots like GPTBot crawl websites?

Understanding why bots like GPTBot do what they do is the first step to embracing this technology and harnessing its potential.

Simply put, bots like GPTBot are crawling websites to gather information. The main difference is that instead of an AI platform being passively fed data to learn from (the “training set,” if you will), a bot can actively search the web for information by crawling multiple pages.

Large language models (LLMs) scour these websites to try to understand the world around us. Google’s C4 dataset makes up a large portion (15.7 million sites) of the training corpus for these LLMs. They also crawl other informative and authoritative sites like Wikipedia and Reddit.

The more sites these bots can crawl, the more they learn and the better they can become. So why are companies blocking GPTBot from crawling?
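Blocking or allowing a crawler usually comes down to a few lines in a site’s robots.txt file. As a rough sketch, assuming GPTBot identifies itself with the user-agent token `GPTBot` and honors robots.txt rules, Python’s standard-library `urllib.robotparser` can show how a given rule set would be read. The helper name `crawler_allowed`, the two sample rule sets and the example.com URL are made up for illustration.

```python
from urllib.robotparser import RobotFileParser


def crawler_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rule set a brand might use to block GPTBot site-wide.
BLOCKING_RULES = """\
User-agent: GPTBot
Disallow: /
"""

# Hypothetical rule set that explicitly lets GPTBot crawl everything.
ALLOWING_RULES = """\
User-agent: GPTBot
Allow: /
"""

PAGE = "https://www.example.com/blog/post"  # placeholder URL

print(crawler_allowed(BLOCKING_RULES, "GPTBot", PAGE))  # blocked
print(crawler_allowed(ALLOWING_RULES, "GPTBot", PAGE))  # allowed
```

Running the same check against your own robots.txt text is a quick way to confirm whether a rule you inherited is keeping GPTBot out. Rules that name only `GPTBot` do not affect other crawlers such as Googlebot.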

Do brands blocking GPTBot have valid fears?

When I first read about companies blocking GPTBot from crawling their websites, I was confused and shocked.

It seemed incredibly short-sighted to me. But I figured there had to be a lot to consider that I hadn’t thought about deeply enough.

After researching and speaking with agency professionals who have legal training, I found the most important reasons.

Lack of compensation for your proprietary training data

Many brands prevent GPTBot from crawling their site because they don’t want their data used to train OpenAI’s models without compensation. Although I can understand wanting a piece of the billion-dollar pie, I think this is a short-sighted view.

ChatGPT, like Google and YouTube, is an answer engine for the world. Preventing your content from being crawled by GPTBot may limit your brand’s reach to a smaller set of internet users in the future.

Security concerns

Another reason behind anti-GPTBot sentiment is security. While it’s more valid than greedily hoarding data, it’s still a largely unfounded concern in my view.

[Figure: Top reasons why organizations ban ChatGPT]

By now, all websites should be very secure. Not to mention, the content that GPTBot is trying to access is public and non-sensitive. It’s the same content that Google, Bing and other search engines crawl on a daily basis.

What caches of confidential information do CEOs and other company leaders think GPTBot will access during its crawl? And with the right security measures in place, shouldn’t this be a non-issue?

From a legal standpoint, the argument is that any crawling on a brand’s site should be covered by its privacy disclaimer. All websites should have a privacy disclaimer that states how they use the data collected by their services. Advocates say that language must also state that the data collected could be crawled by a third-party generative AI platform.

If not, any personally identifiable information (PII) or customer data could end up “public” and expose brands to an FTC Section 5 claim for unfair and deceptive trade practices.

I understand this concern to some extent. If you’re the legal department of a big-name brand, one of your main goals is to keep your company out of hot water. But this legal concern applies more to what is entered into ChatGPT than to what GPTBot crawls.

Anything entered into OpenAI’s platform becomes part of its data bank and has the potential to be shared with other users, causing a data leak. However, this would likely only happen if users asked questions about the stored information.

This is another unwarranted concern for me because everything can be solved through responsible internet use. The same data principles we’ve used since the dawn of the web still ring true: Don’t enter any information you don’t want shared.

A push to save humanity from the advance of AI

I can’t help but think that the leaders of some of these brands blocking GPTBot have a bias against the advancement of AI technology.

We often fear what we don’t understand, and some panic at the idea of artificial intelligence acquiring too much knowledge and becoming too powerful.

While AI is rapidly evolving and beginning to “think” more deeply, humans are still largely in control. Furthermore, the legislation that will govern AI will grow along with the technology.

When we finally reach a world of “autonomous” AI platforms, their functionality will be guided by years of human innovation and legislation.


3 reasons not to block ChatGPT’s GPTBot

So why should you allow GPTBot to crawl your site? Let’s look at the bright side with these top three benefits of embracing OpenAI’s bot technology.

1. 100 million people use ChatGPT every week

By not allowing GPTBot to crawl your site, you’re missing out on an audience of 100 million people and a chance to maximize brand visibility.

Sharing access to your website content can help ensure that your brand is represented accurately and positively to ChatGPT users.

This means that there is a higher chance that ChatGPT will recommend your brand, which will lead to more traffic and leads.

Some brands report getting 5% of their total leads, or $100,000 in monthly subscription revenue, from ChatGPT. I know our agency has also gotten some leads from ChatGPT.

Another way to think of it is as a positive digital public relations (DPR) play. You should leverage DPR strategies like brand mention campaigns in today’s landscape.

Allowing GPTBot to crawl your site only adds to these efforts by allowing ChatGPT to access your brand information directly from the source and distribute it positively to 100 million users.

2. Generative engine optimization (GEO)

Whether or not you fear AI, we can all agree that it is changing the marketing landscape. Like all new technologies and trends in our industry, those slow to embrace AI as a channel for new business and brand exposure will miss the proverbial boat.

GEO is gaining traction as a sub-practice of SEO. If you don’t direct some of your marketing efforts at this channel, you’re missing a major opportunity, one your competitors can scoop up after you let it slip through the cracks.

We know it’s easy for brands to get left behind in today’s fragmented and ever-growing marketing landscape. If your competitors spend years working on GEO, maximizing their visibility in LLMs and developing skills and knowledge in this area, they will be years ahead of you.

For now, GEO reporting capabilities have yet to mature, which means ROI will be difficult to measure. That doesn’t mean GEO is something to ignore and fall behind on.

Brands and marketers need to start embracing LLMs like ChatGPT as an emerging acquisition channel that shouldn’t be ignored.

3. OpenAI’s commitment to minimize harm

A healthy distrust of AI technologies is important for their legal and ethical growth. But we also need to be open-minded and realize that we cannot be effective as marketers if we resist and choose not to grow and innovate in the direction things are heading.

OpenAI clearly states “minimize harm” as one of the guiding principles of its platform. They also have policies to respect copyright and intellectual property and have stated that GPTBot filters out sources that violate their policies.

By allowing GPTBot to crawl the content of your site, you are contributing to the clean and accurate training data that OpenAI uses to improve the accuracy of its information.

As AI technology advances, it can be easy to get caught up in skepticism, fear, and noise. Those who struggle to embrace and maximize it will be left behind.

The views expressed in this article are those of the guest author and not necessarily those of Search Engine Land.

About the Author: Ted Simmons

I follow and report the current news trends on Google news.
