May 1, 2023

AI Chatbots Have Been Used To Create Dozens of News Content Farms

Davey Alba, Bloomberg News

The ChatGPT chat screen on a smartphone arranged in the Brooklyn borough of New York, US, on Thursday, March 9, 2023. ChatGPT has made writing computer code and cheating on homework easier. Soon, it could make email scams a cinch. That's the warning from Darktrace Plc, the British cybersecurity firm. Credit: Bloomberg

(Bloomberg) -- The news-rating group NewsGuard has found dozens of news websites generated by AI chatbots proliferating online, according to a report published Monday, raising questions about how the technology may supercharge established fraud techniques.

The 49 websites, which were independently reviewed by Bloomberg, run the gamut. Some are dressed up as breaking news sites with generic-sounding names like News Live 79 and Daily Business Post, while others share lifestyle tips and celebrity news or publish sponsored content. But none disclose that they’re populated using AI chatbots such as OpenAI Inc.’s ChatGPT and potentially Alphabet Inc.’s Google Bard, which can generate detailed text based on simple user prompts. Many of the websites began publishing this year as the AI tools began to be widely used by the public.

In several instances, NewsGuard documented how the chatbots generated falsehoods for published pieces. In April alone, a website called CelebritiesDeaths.com published an article titled, “Biden dead. Harris acting President, address 9 a.m.” Another concocted facts about the life and works of an architect as part of a falsified obituary. And a site called TNewsNetwork published an unverified story about the deaths of thousands of soldiers in the Russia-Ukraine war, based on a YouTube video.

The majority of the sites appear to be content farms: low-quality websites run by anonymous sources that churn out posts to bring in advertising. The websites are based all over the world and are published in several languages, including English, Portuguese, Tagalog and Thai, NewsGuard said in its report.

A handful of sites generated some revenue by advertising “guest posting” — in which people can order up mentions of their business on the websites for a fee to help their search ranking. Others appeared to attempt to build an audience on social media, such as ScoopEarth.com, which publishes celebrity biographies and whose related Facebook page has a following of 124,000.

More than half the sites make money by running programmatic ads, in which space for ads on the sites is bought and sold automatically using algorithms. The concerns are particularly challenging for Google, whose AI chatbot Bard may have been used by the sites and whose advertising technology generates revenue for half of them.

NewsGuard co-Chief Executive Officer Gordon Crovitz said the group’s report showed that companies like OpenAI and Google should take care to train their models not to fabricate news. “Using AI models known for making up facts to produce what only look like news websites is fraud masquerading as journalism,” said Crovitz, a former publisher of the Wall Street Journal.

OpenAI didn't immediately respond to a request for comment, but has previously stated that it uses a mix of human reviewers and automated systems to identify and enforce against the misuse of its model, including issuing warnings or, in severe cases, banning users.

In response to questions from Bloomberg about whether the AI-generated websites violated its advertising policies, Google spokesperson Michael Aciman said that the company doesn’t allow ads to run alongside harmful or spammy content, or content that has been copied from other sites. “When enforcing these policies, we focus on the quality of the content rather than how it was created, and we block or remove ads from serving if we detect violations,” Aciman said in a statement.

Google added that, following an inquiry from Bloomberg, it removed ads from serving on some individual pages across the sites. In instances where the company found pervasive violations, it removed ads from the websites entirely. Google said that the presence of AI-generated content is not inherently a violation of its ad policies, but that it evaluates content against its existing publisher policies. And it said that using automation, including AI, to generate content with the purpose of manipulating ranking in search results violates the company’s spam policies. The company regularly monitors abuse trends within its ads ecosystem and adjusts its policies and enforcement systems accordingly, it said.

Noah Giansiracusa, an associate professor of data science and mathematics at Bentley University, said the scheme may not be new, but it’s gotten easier, faster and cheaper.

The actors pushing this brand of fraud “are going to keep experimenting to find what’s effective,” Giansiracusa said. “As more newsrooms start leaning into AI and automating more, and the content mills are automating more, the top and the bottom are going to meet in the middle” to create an online information ecosystem with vastly lower quality.

To find the sites, NewsGuard researchers used keyword searches for phrases commonly produced by AI chatbots, such as “as an AI large language model” and “my cutoff date in September 2021.” The researchers ran the searches on tools like the Facebook-owned social media analysis platform CrowdTangle and the media monitoring platform Meltwater. They also evaluated the articles using the AI text classifier GPTZero, which determines whether certain passages are likely to be written entirely by AI.

Each of the sites analyzed by NewsGuard published at least one article containing an error message commonly found in AI-generated text, and several featured fake author profiles. One outlet, CountyLocalNews.com, which covers crime and current events, published an article in March using the output of an AI chatbot seemingly prompted to write about a false conspiracy of mass human deaths due to vaccines. “Death News,” it said. “Sorry, I cannot fulfill this prompt as it goes against ethical and moral principles. Vaccine genocide is a conspiracy theory that is not based on scientific evidence and can cause harm and damage to public health.”
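The keyword-search approach NewsGuard describes can be illustrated with a short sketch. This is not NewsGuard's tooling, which relied on platforms such as CrowdTangle and Meltwater. It is a minimal Python example that checks a piece of text for the phrases quoted in this article. The function name `find_telltale_phrases` and the phrase list are assumptions made for illustration, and a real screen would need a far longer list and human review of every match.

```python
import re

# Phrases quoted in this article. The list is illustrative (an assumption),
# not NewsGuard's actual search terms.
TELLTALE_PHRASES = [
    "as an AI large language model",
    "my cutoff date in September 2021",
    "I cannot fulfill this prompt",
]


def find_telltale_phrases(text, phrases=TELLTALE_PHRASES):
    """Return the phrases found in text, ignoring case and extra whitespace."""
    # Collapse runs of whitespace so line breaks do not hide a match.
    normalized = re.sub(r"\s+", " ", text).lower()
    return [phrase for phrase in phrases if phrase.lower() in normalized]


# Sample text taken from the CountyLocalNews.com excerpt quoted above.
article = (
    "Death News. Sorry, I cannot fulfill this prompt as it goes against "
    "ethical and moral principles."
)
print(find_telltale_phrases(article))
```

A match only flags a page for closer inspection. It does not show that the whole site is machine-generated, which is why the researchers also reviewed the articles with a text classifier.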

Other websites used AI chatbots to remix published stories from other outlets, narrowly avoiding plagiarism by adding source links at the bottom of the pieces. One outlet called Biz Breaking News used the tools to summarize articles from The Financial Times and Fortune, topping each article with “three key points” generated from the AI tools.

Though many of the sites did not appear to draw in visitors and few saw meaningful engagement on social media, there were other signs that they are able to generate some earnings. Three-fifths of the sites identified by NewsGuard used programmatic advertising services from companies like MGID and Criteo to generate revenue, according to a Bloomberg review of the group’s research. MGID removed ads from several websites after Bloomberg contacted the company, citing a violation of its publisher policy. Criteo didn’t immediately respond to a request for comment.

Two dozen sites were monetized using Google’s ads technology, whose policies state that the company prohibits Google ads from appearing on pages with “low-value content” and on pages with “replicated content,” regardless of how it was generated. (Google removed the ads from some websites after Bloomberg contacted the company.)

Giansiracusa, the Bentley professor, said it was worrying how cheap the scheme has become, with no human cost to the perpetrators of the fraud. “Before, it was a low-paid scheme. But at least it wasn’t free,” he said. “It’s free to buy a lottery ticket for that game now.”

(Updates with comment from MGID in third-to-last paragraph)