Millions of Copyrighted Songs Were Fed to AI Music Generators – Now There’s Proof

The Atlantic’s databases name 21 million tracks fed to Suno and rivals as Sony, UMG, and Warner seek up to $150,000 per song in damages

By Al Landes

Key Takeaways

  • Searchable databases show that roughly 21 million copyrighted songs were used to train AI music generators.
  • Sony, UMG, and Warner lawsuits seek up to $150,000 per song from Suno and Udio.
  • HarmonyCloak tool lets artists protect songs by adding inaudible AI-blocking audio perturbations.

Millions of copyrighted songs — including chart-topping hits — verifiably trained AI music generators, and now there are searchable databases to prove it. The Atlantic, through an investigation by staff writer Alex Reisner, published four catalogs documenting exactly which music fed these models:

  • The largest contains roughly 12 million tracks
  • The second holds about 9 million
  • Two smaller sets clock in around 100,000 each

These aren’t obscure SoundCloud demos. Taylor Swift is in there. Bad Bunny is in there. The catalog of modern popular music, scraped and swallowed whole.

Suno, one of the most prominent AI music generators, acknowledged in court filings that it trained on “tens of millions” of recordings and later admitted that unlicensed copyrighted material was included, according to Heavy Lifting.

The legal picture that emerges is striking. Sony, UMG, and Warner have filed lawsuits against Suno and Udio seeking up to $150,000 per song in statutory damages. A parallel book-industry case framed mass scraping as piracy rather than simple copyright infringement and reached an initial $1.5 billion settlement figure, according to Engadget. Meanwhile, the U.S. Copyright Office stated in January 2025 that AI-generated music often cannot itself be copyrighted without sufficient human authorship — meaning these tools can potentially infringe existing works while producing outputs that carry no protection of their own.
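The arithmetic behind those figures can be sketched in a few lines of Python. This is a rough illustration, not a damages model: the catalog sizes are the approximate counts reported above, `works_in_suit` is a hypothetical parameter (the article does not say how many recordings the lawsuits actually name), and the ceiling assumes every work draws the full $150,000 the labels are seeking.

```python
# Approximate catalog sizes as reported in the article (track counts).
CATALOG_SIZES = [12_000_000, 9_000_000, 100_000, 100_000]

# Upper figure the labels are seeking per song, per the article.
MAX_STATUTORY_PER_WORK = 150_000  # US dollars


def total_tracks(sizes):
    """Sum the per-catalog track counts."""
    return sum(sizes)


def max_exposure(works_in_suit, per_work=MAX_STATUTORY_PER_WORK):
    """Ceiling on damages if every work drew the maximum per-work award."""
    if works_in_suit < 0:
        raise ValueError("works_in_suit must be non-negative")
    return works_in_suit * per_work


if __name__ == "__main__":
    print(f"Tracks across the four catalogs: {total_tracks(CATALOG_SIZES):,}")
    # 1,000 is a made-up count for illustration, not a figure from any filing.
    print(f"Ceiling for 1,000 works: ${max_exposure(1_000):,}")
```

The four catalogs add up to about 21.2 million tracks, which is where the headline figure of roughly 21 million comes from. Even a hypothetical 1,000 works at the maximum would put the ceiling at $150 million, which helps explain why the per-song number matters so much to both sides.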

The AI companies call it fair use. They argue models learn abstract patterns, not specific songs. Labels call it piracy with a pitch deck. Courts are still deciding who’s right.

“Trained on copyrighted recordings without permission” — that’s how label plaintiffs have characterized the practice in filings, as summarized in industry commentary.

Researchers at the University of Tennessee developed HarmonyCloak, a tool that adds inaudible audio perturbations to recordings, making songs effectively unlearnable by AI models while sounding identical to human ears — a rare artist-controlled option in a landscape where most protections remain theoretical.
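To make the idea of an "inaudible perturbation" concrete, here is a toy Python sketch. It is not HarmonyCloak's method, which the article does not describe: plain random noise is swapped in purely to show what it means to add a signal far quieter than the music, and random noise on its own would not be expected to stop a model from learning. The function names, the 40 dB level, and the test tone are all assumptions made for illustration.

```python
import math
import random


def sine_wave(freq_hz, seconds, sample_rate=8000, amplitude=0.5):
    """Generate a plain test tone as a list of float samples."""
    n = int(seconds * sample_rate)
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / sample_rate)
            for i in range(n)]


def rms(samples):
    """Root-mean-square level of a sample list."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def add_quiet_perturbation(samples, target_snr_db=40.0, seed=0):
    """Add random noise scaled to sit target_snr_db below the signal level.

    Stand-in only: a real protection tool would compute its perturbation
    deliberately rather than draw it at random.
    """
    rng = random.Random(seed)
    noise = [rng.uniform(-1.0, 1.0) for _ in samples]
    target_noise_rms = rms(samples) / (10 ** (target_snr_db / 20))
    scale = target_noise_rms / rms(noise)
    return [s + scale * n for s, n in zip(samples, noise)]


def measure_snr_db(clean, perturbed):
    """Signal-to-noise ratio, in dB, of the perturbed audio vs. the original."""
    noise = [p - c for c, p in zip(clean, perturbed)]
    return 20 * math.log10(rms(clean) / rms(noise))


if __name__ == "__main__":
    clean = sine_wave(440, 0.1)
    cloaked = add_quiet_perturbation(clean, target_snr_db=40.0)
    print(f"Measured SNR: {measure_snr_db(clean, cloaked):.1f} dB")
```

At 40 dB the added signal has about one hundredth of the music's amplitude. Whether a perturbation is truly inaudible, and whether it actually disrupts training, depends on how it is shaped, which is the part the researchers' tool handles and this sketch does not.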

What Comes Next

The scrape-everything era may be ending as labels, lawmakers, and researchers build workable alternatives.

The fight is already shifting from courtrooms to contracts. Universal Music Group has reportedly struck a deal with Udio, and Warner Music Group with both Udio and Suno, moving toward licensed AI music models that actually compensate rightsholders. Tennessee passed a law protecting musical artists’ voices from unauthorized AI cloning. Streaming platforms are deploying AI-detection tools to flag and limit generative imitations, though results have been mixed, according to Engadget, with AI-generated copycats continuing to slip through and monetize.

Whether you’re an indie artist wondering if your EP got scraped, or someone who generated a birthday jingle on Suno last week, The Atlantic’s databases aren’t just journalism. They’re evidence. The Napster debate is back, this time wearing a licensing agreement and filing for fair use, and it just got a searchable answer. The mass scraping these databases document ranks among the most consequential tech scandals to emerge from the AI era, reshaping how creators think about ownership in the digital age.
