New AI model allows copyright holders to retain control of their data

10/07/2025 | WIRED

Data owners may soon gain greater control over how their data is used to train artificial intelligence (AI) models. Researchers at the Allen Institute for AI (Ai2) have developed FlexOlmo, a novel large language model (LLM) that enables data owners to control how their training data is used even after a model has been built. The approach challenges the industry norm in which vast amounts of data are ingested, often with little regard for ownership. Under that norm, removing data from a trained model afterwards is akin to "trying to recover the eggs from a finished cake."

Ai2's CEO, Ali Farhadi, explains that FlexOlmo lets data owners contribute by copying a public "anchor" model, training a second model on their own data, and then merging the result with the anchor. With this method the raw data is never handed over, and the owner's contribution can be extracted from the merged model later if necessary, for example in a legal dispute or if the owner objects to how the model is used. The result is closer to being able to "have your cake, and get your eggs back, too."
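
The copy, train, merge and remove workflow described above can be illustrated with a deliberately tiny sketch. This is not FlexOlmo's implementation: the `ToyModel` class, the `train_expert` function, the averaging rule in `predict` and the owner name `publisher_a` are all hypothetical, chosen only to show how a contribution that is kept as a separate module can later be taken out again.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ToyModel:
    """Hypothetical stand-in for a merged model: one weight vector per module."""

    anchor: list[float]
    experts: dict[str, list[float]] = field(default_factory=dict)

    def merge(self, owner: str, expert_weights: list[float]) -> None:
        # Only trained weights are handed over, never the owner's raw data.
        self.experts[owner] = list(expert_weights)

    def remove(self, owner: str) -> None:
        # Opting out is a simple deletion because the module was kept separate.
        self.experts.pop(owner, None)

    def predict(self, x: list[float]) -> float:
        # Assumption: average the score of the anchor and every merged module.
        modules = [self.anchor, *self.experts.values()]
        scores = [sum(w * xi for w, xi in zip(m, x)) for m in modules]
        return sum(scores) / len(scores)


def train_expert(anchor: list[float], private_data: list[list[float]],
                 lr: float = 0.1) -> list[float]:
    """Copy the anchor, then nudge the copy towards each private example."""
    expert = list(anchor)  # the public anchor itself is left untouched
    for example in private_data:
        expert = [w + lr * (xi - w) for w, xi in zip(expert, example)]
    return expert


anchor = [1.0, 0.0]
model = ToyModel(anchor=anchor)
query = [1.0, 1.0]

baseline = model.predict(query)

# The owner trains locally and shares only the resulting weights.
expert = train_expert(anchor, private_data=[[0.0, 2.0]])
model.merge("publisher_a", expert)
merged = model.predict(query)

# Later the owner objects, so the contribution is taken out again.
model.remove("publisher_a")
after_removal = model.predict(query)
```

In this toy setting `after_removal` equals `baseline`, which is the "get your eggs back" property in miniature; a real system would need a far more careful way of combining and separating modules.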

£ - This article requires a subscription.

Artificial intelligence, AI training data