Following the holiday headlines late last month that The New York Times would be suing OpenAI and Microsoft for copyright infringement, the AI organization is pushing back, stating in a blog response yesterday that the lawsuit “is without merit” and that it views this “as an opportunity to clarify our business, our intent, and how we build our technology.”
The NYT lawsuit claims that millions of its published articles were “used to train automated chatbots that now compete with the news outlet as a source of reliable information.” It argues the defendants should be held responsible for billions of dollars in damages and forced to destroy any models and training data that incorporate its copyrighted material.
Needless to say, a lot is on the line.
A Response to the NYT Lawsuit — “OpenAI And Journalism”
OpenAI’s response essentially took the form of a four-part position:
- OpenAI collaborates with news organizations to “explore opportunities, discuss their concerns, and provide solutions”
- Training is fair use, but providing an opt-out option (which NYT exercised in August 2023) is “the right thing to do”
- Memorization and regurgitation are both rare problems that OpenAI is “working to drive to zero”
- “The New York Times is not telling the full story”
The first three points are arguments and talking points we’ve all heard before, but the fourth reveals behind-the-scenes details that speak to OpenAI’s “surprise and disappointment” with the lawsuit.
OpenAI says the discussions and negotiations it had with NYT well into December “focused on a high-value partnership around real-time display with attribution in ChatGPT.” A similar partnership with news publisher Axel Springer had already been announced, though a report last week claims OpenAI is only offering media companies “as little as between $1 million and $5 million annually to license their news articles,” which, small publisher or not, could “make it difficult for OpenAI to strike deals.”
During the NYT discussions, OpenAI says the publisher “mentioned seeing some regurgitation of their content but repeatedly refused to share any examples.” OpenAI cited an example of a ChatGPT feature that had been taken down in the past once the company was made aware of its “unintended” ability to reproduce real-time content.
OpenAI claims the regurgitation examples that NYT did induce “appear to be from years-old articles that have proliferated on multiple third-party websites,” and says they were produced using “intentionally manipulated prompts.”
“Even when using such prompts, our models don’t typically behave the way The New York Times insinuates,” OpenAI says, “which suggests they either instructed the model to regurgitate or cherry-picked their examples from many attempts.”
Already, NYT (or, at least, NYT’s lead counsel, Ian Crosby) has responded to OpenAI’s explanation, saying in a statement that “the blog concedes that OpenAI used The Times’s work, along with the work of many others, to build ChatGPT.”
“As The Times’s complaint states, ‘Through Microsoft’s Bing Chat (recently rebranded as ‘Copilot’) and OpenAI’s ChatGPT, Defendants seek to free-ride on The Times’s massive investment in its journalism by using it to build substitutive products without permission or payment.’ That’s not fair use by any measure.”
This obviously isn’t affecting everyday AI users yet. (After all, a number of companies, including Microsoft and Adobe, have “promised to pay the legal bills for any users facing copyright lawsuits,” The Neuron reminded its newsletter subscribers on January 2. “Hakuna Matata!”)
But the NYT lawsuit could be — or, as Alberto Romero says, “will be” — the “straw that broke the camel’s back.”
In his Substack, The Algorithmic Bridge, Romero wrote last week that the core argument from OpenAI and those defending AI is that it doesn’t matter whether NYT is legally or technically right, and that we “should not put barriers to the progress that AI companies are making on behalf of all of humanity.” But Romero ultimately lands on a middle ground where “just as it’s a mistake to oppose all technology by default, it’s also a mistake to embrace it blindly at all costs.”
“We like to say that AI is coming, that it’s inevitable,” Romero writes. “And it may very well be (I think so), but it matters how we make the transition to that inevitable future. We’re not walking a fixed path. As humanity, we’re making decisions all the time. Decisions about which we have control. Decisions that shape the future we’re heading to. … To me, it makes no moral sense to yearn so much for progress in some aspects if in others, just as important, we regress instead.”
Not to preach to the choir here, but who knows that better than publishers?