1 August 2024

Reddit has been blocking search engines, including Microsoft’s Bing, from accessing the site’s posts and comments unless they sign commercial agreements, following a Reddit policy change that took effect last month.

“We can no longer be completely open because we have to be very considerate of where our data ends up and what it’s used for,” chief executive officer Steve Huffman said in an interview.

“Any crawler that we don’t have a formal agreement with, we’re now blocking.”

Last month, the social media company changed its policy to prevent companies and individuals from crawling the Reddit site without authorization.

The change requires companies to sign agreements with Reddit in order to use the site’s data, including to surface Reddit posts and comments in web searches.

As a result of the change, Alphabet’s Google is currently the only major search engine that can access Reddit content.

The search giant signed a $60 million (R1.1 billion) deal with Reddit in February. Reddit’s partnership with Google isn’t exclusive, and doesn’t prevent Reddit content from appearing in rival search engines, the company said.

Google and other search engines have long been important sources of traffic for Reddit.

People often search Google for Reddit posts and comments by adding “r/Reddit” to the end of their searches.

Over the past two years Google has accounted for as much as 40% to 50% of Reddit’s traffic in a single day, Huffman said.

Reddit used to allow search engines to access its site for free, because of how many people they sent back to Reddit.

“When it was used for simple search, to create simple links that would send us traffic from search engines, that was fine,” Huffman said.

“But now folks are using Reddit data for training, they’re reselling it, they’re doing search summaries instead of linking to us.”

Reddit has been in talks to enter commercial agreements with other search engines, including Bing, and AI companies Anthropic and Perplexity, but said those companies were unwilling to comply with the site’s content policies.

Microsoft said the company honours “the directions provided by websites that do not want content on their pages to be used with our generative AI models.”

Bing stopped crawling Reddit on 1 July after Reddit implemented its updated robots.txt file, which prohibits all crawling of the site, a Microsoft spokesperson said.
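For readers unfamiliar with robots.txt, the following is a minimal sketch of how a well-behaved crawler might interpret a file that disallows all crawling. The rules shown are a hypothetical example, not a copy of Reddit’s actual file, and the `ExampleBot` user agent and `is_allowed` helper are made up for illustration. The check uses Python’s standard `urllib.robotparser` module.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical disallow-all rules (an assumption for illustration,
# not Reddit's actual robots.txt).
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's contents as an iterable of lines.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# "ExampleBot" is a made-up crawler name.
print(is_allowed(ROBOTS_TXT, "ExampleBot", "https://www.reddit.com/r/example/"))
# prints False
```

A robots.txt file is a request rather than a technical barrier: it only stops crawlers that choose to check it and comply, which is why the statements from Microsoft and Anthropic are framed in terms of honouring or respecting the site’s directions.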

Perplexity, which said it doesn’t license content for the purpose of training AI, uses information from news sites and other web pages to answer users’ questions.

The company currently has partnerships with TIME, Fortune, WordPress.com and other sites. “Perplexity previously offered Reddit the opportunity to join our Publishers’ Program and the invitation is still open,” a Perplexity spokesperson said.

Anthropic said the company respects Reddit’s signal for blocking web crawling.

“Reddit has been on our block list for web crawling since mid-May and we haven’t added any URLs from Reddit to our crawler since then,” said an Anthropic spokesperson.

Microsoft-backed OpenAI did sign a partnership with Reddit in May, which allows Reddit results to appear in the chatbot ChatGPT.

As of May, Reddit had signed data licensing deals worth $203 million (R3.7 billion) in total over the next two to three years.