Brave announces AI search engine


Brave has announced a new privacy-focused AI search engine, Answer with AI, which is powered by its own search index of billions of web pages. Brave Search already serves 10 billion search queries per year, which makes the AI-powered version one of the largest AI search engines online.

Many in the search marketing and ecommerce communities have expressed anxiety about the future of the web due to AI search engines. Brave’s AI search engine still shows links and, most importantly, it doesn’t answer commercial or transactional queries with AI by default, which should be good news for SEOs and online businesses. Brave says it values the web ecosystem and will monitor website visitation patterns.

Search Engine Journal spoke with Josep M. Pujol, head of search at Brave, who answered questions about the search index and how it works with AI and, most importantly, shared what SEOs and business owners need to know to improve their rankings.

Answer with AI is powered by Brave

Unlike other AI search solutions, Brave’s AI search engine is powered entirely by its own search index of crawled and ranked websites. All the underlying technology, from the search index to the large language models (LLMs) and even the Retrieval Augmented Generation (RAG) technology, is developed by Brave. This is especially good from a privacy perspective and also makes Brave search results unique, further distinguishing them from other me-too search engine alternatives.

Search technology

The search engine itself is built in-house. According to Josep M. Pujol, head of search at Brave:

“We have query-time access to all of our index, over 20 billion pages, which means we’re pulling arbitrary information in real time (schemas, tables, snippets, descriptions, etc.). In addition, we are very granular about what data to use, from paragraphs or entire texts on a page to individual sentences or rows in a table.

Given that we have an entire search engine at our disposal, the focus is not on retrieval, but on selection and classification. Also, for the pages in our index, we have access to the same information used to rank them, such as ratings, popularity, etc. This is vital to help select which sources are most relevant.”

Retrieval Augmented Generation (RAG)

The search engine combines a search index with large language models, with Retrieval Augmented Generation (RAG) technology in between that keeps answers fresh and fact-based. I asked about RAG, and Josep confirmed that this is how it works.

He answered:

“You are right that our new feature uses RAG. In fact, we have already been using this technique in our previous summary feature, released in March 2023. However, in this new feature, we are expanding both the quantity and quality of the data used in the content of the prompt.”
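To make the RAG flow described above concrete, here is a minimal Python sketch of the general pattern: retrieve pages from an index, place them in the prompt, then hand the prompt to a language model. It is not Brave's implementation. The toy `INDEX`, the word-overlap scoring, the prompt wording, and the `echo_llm` stub are all assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Page:
    url: str
    text: str


# Toy stand-in for a search index (assumption); a real index holds billions of pages.
INDEX = [
    Page("https://example.com/a", "Mixtral 8x7B is a sparse mixture-of-experts language model."),
    Page("https://example.com/b", "A burger is a sandwich with a ground meat patty."),
    Page("https://example.com/c", "Mistral 7B is a 7 billion parameter language model."),
]


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of words without punctuation."""
    return {word.strip(".,?!").lower() for word in text.split()}


def retrieve(query: str, index: list[Page], k: int = 2) -> list[Page]:
    """Rank pages by word overlap with the query; keep the top k that match at all."""
    q = tokenize(query)
    scored = [(len(q & tokenize(p.text)), p) for p in index]
    scored = [(s, p) for s, p in scored if s > 0]
    scored.sort(key=lambda sp: sp[0], reverse=True)
    return [p for _, p in scored[:k]]


def build_prompt(query: str, pages: list[Page]) -> str:
    """Put the retrieved sources into the prompt so the model can ground its answer."""
    sources = "\n".join(f"[{i}] {p.url}: {p.text}" for i, p in enumerate(pages, 1))
    return (
        "Answer using only the sources below and cite them.\n\n"
        f"Sources:\n{sources}\n\nQuestion: {query}\nAnswer:"
    )


def answer(query: str, index: list[Page], llm) -> tuple[str, list[str]]:
    """Retrieve, build the prompt, generate; return the answer and the cited URLs."""
    pages = retrieve(query, index)
    return llm(build_prompt(query, pages)), [p.url for p in pages]


def echo_llm(prompt: str) -> str:
    """Stub model (assumption): reports the prompt size instead of generating text."""
    return f"(stub answer for a {len(prompt)}-character prompt)"


text, urls = answer("Which language model has 7 billion parameters?", INDEX, echo_llm)
print(text)
print(urls)
```

Because the sources travel inside the prompt, the answer can reflect whatever the index currently holds, and the returned URLs give the reader links to follow.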

Large language models used

I asked about the language models used in the new AI search engine and how they are deployed.

“Models are deployed to AWS p4 instances with vLLM.

We use a combination of Mixtral 8x7B and Mistral 7B as our main LLM model.

However, we also run several custom transformer models for ancillary tasks such as semantic matching and question answering. These models are much smaller due to strict latency requirements (10-20ms).

These auxiliary tasks are crucial to our feature, as they are the ones that select the data that will end up in the final LLM prompt; this data can be query-dependent text snippets, schemas, tabular data, or internal structured data from our rich snippets. It’s not about being able to retrieve a lot of data, it’s about selecting the candidates that will be added to the prompt context.

For example, the query “presidents of France by party” processes 220 KB of raw data, including 462 rows selected from 47 tables and 7 schemas. The prompt size is about 6,500 tokens and the final response is only 876 bytes.

In short, you could say that with ‘Answer with AI’ we go from 20 billion pages to a few thousand tokens.”
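The selection step Josep describes, where scored candidates are narrowed down until they fit in the prompt, can be illustrated with a short Python sketch. This is a simple greedy selection under a token budget, not Brave's method. The relevance scores, the four-characters-per-token estimate, and the budget value are all assumptions for illustration.

```python
def estimate_tokens(text: str) -> int:
    """Crude token estimate (assumption): roughly four characters per token."""
    return max(1, len(text) // 4)


def select_for_prompt(candidates, budget_tokens):
    """Greedily keep the highest-scoring candidates that still fit in the budget.

    candidates: list of (score, text) pairs; the scores stand in for the output
    of some upstream relevance model (hypothetical here).
    """
    chosen, used = [], 0
    for score, text in sorted(candidates, key=lambda c: c[0], reverse=True):
        cost = estimate_tokens(text)
        if used + cost <= budget_tokens:
            chosen.append(text)
            used += cost
    return chosen, used


# Candidate table rows and snippets for "presidents of France by party".
# The scores are invented for the example.
candidates = [
    (0.92, "Emmanuel Macron | Renaissance"),
    (0.90, "François Hollande | Socialist Party"),
    (0.15, "Paris is the capital of France and its largest city."),
    (0.88, "Nicolas Sarkozy | UMP"),
]

rows, used = select_for_prompt(candidates, budget_tokens=25)
print(rows)
print(used)
```

With this budget the three table rows fit and the low-scoring snippet about Paris is left out, which mirrors the idea that what matters is choosing what goes into the prompt rather than how much can be retrieved.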

How AI works with local search results

I then asked how the new search engine will surface local search results. I asked Josep if he could share some example scenarios and queries where the AI answer engine would surface local businesses. For example, if I search for the best burgers in San Francisco, will the AI answer engine provide an answer and link to the businesses? Will it be useful for people making business or vacation travel plans?

Josep answered:

“The Brave Search index has over a billion location-based schemas, from which we can pull over 100 million businesses and other points of interest.

Answer with AI is a general term for search + LLM + multiple specialized machine learning models and services to retrieve, classify, clean, combine and represent information. We mention this because LLMs do not make all the decisions. As of now, we mainly use them to synthesize unstructured and structured information, which happens both in offline operations and at query time.

Sometimes the final result feels heavily influenced by the LLM (this is the case when we think the answer to the user’s question is a single point of interest, for example “checkin faro cuisine”), and other times their work is more subtle (e.g., “best burgers sf”), generating a description of the business across different web references or consolidating a category for the business into a coherent taxonomy.”

Tips for ranking well

I then asked whether Schema.org structured data helps a site rank better on Brave and whether he had any other tips for SEOs and online businesses.

He answered:

“We definitely pay special attention to schema.org structured data when creating the LLM prompt context. It’s best to have structured data about your business (schema.org standard schemas). The more comprehensive these schemas are, the more precise the answer will be.

That said, our Answer with AI will also be able to display data about the business that is not in these schemas, but it is always advisable to repeat information in different formats.

Some businesses rely solely on aggregators (Yelp, Tripadvisor, Yellow Pages) for their business information. There are benefits to adding schemas to your business website, even if it’s just for crawlers.”
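As an illustration of the kind of schema.org markup Josep is referring to, the Python sketch below builds a JSON-LD block for a business and wraps it in the script tag that goes in a page's HTML. The business and every value in it are placeholders, and the helper name `to_json_ld_script` is hypothetical. The property names (`name`, `address`, `telephone`, `servesCuisine`, `openingHours`) are standard schema.org vocabulary, but which of them Brave actually reads is not stated in the interview.

```python
import json

# Hypothetical business; every value below is a placeholder.
business = {
    "@context": "https://schema.org",
    "@type": "Restaurant",
    "name": "Example Burger Bar",
    "url": "https://www.example.com/",
    "telephone": "+1-415-555-0100",
    "servesCuisine": "Burgers",
    "address": {
        "@type": "PostalAddress",
        "streetAddress": "123 Example Street",
        "addressLocality": "San Francisco",
        "addressRegion": "CA",
        "postalCode": "94103",
        "addressCountry": "US",
    },
    "openingHours": "Mo-Su 11:00-22:00",
}


def to_json_ld_script(data: dict) -> str:
    """Serialize the data as JSON-LD inside an HTML script tag."""
    body = json.dumps(data, indent=2, ensure_ascii=False)
    return f'<script type="application/ld+json">\n{body}\n</script>'


snippet = to_json_ld_script(business)
print(snippet)
```

The printed snippet can be pasted into the page's HTML. The same details should still appear in the visible page text, in line with the advice to repeat information in different formats.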

Plans for AI Search in Brave Browser

Brave shared that at some point in the near future they will integrate the new AI search functionality directly into the Brave browser.

Josep explained:

“We plan to integrate the AI answer engine with Brave Leo (the AI assistant built into the Brave browser) very soon. Users will have the option to send the answer to Leo and continue the session there.”

Other facts

Brave’s announcement also shared these facts about the new search engine:

“Brave Search’s generative answers are not just text. The deep integration between the index and the model allows us to combine inline, contextual, and named entity enrichment (a process that adds more context to a person, place, or thing) as the answer is generated. This means that answers combine generative text with other types of media, including flashcards and images.

The Brave Search answer engine can even combine index data and geolocal results to provide rich information about points of interest. To date, the Brave Search index has over a billion location-based schemas, from which we can extract over 100 million businesses and other points of interest. These lists, larger than any public dataset, mean the answer engine can provide rich, instant results for hotspots around the world.”

Try the new AI Search a




About the Author: Ted Simmons

I follow and report the current news trends on Google news.
