The Data Pipeline You Never Consented To
Unmissable Patterns #1. The first issue of the series: on the mass surveillance you never opted into, and how the apps on your phone opted in for you.
“Can you do facial?”
One ICE agent to another, standing over a 16-year-old on a bike. Masked agents had jumped out of an SUV and surrounded him on a sidewalk near his high school. They asked for his citizenship. American. They asked for ID. He didn’t have one. He’s sixteen, on his bike.
The second agent pointed a phone at the kid’s face.
That video went viral. The coverage focused on the agents, as it should have. But the tech that made that moment possible wasn’t built by the government. It was built by family safety apps selling location data to brokers. By fertility trackers sharing pregnancy information with Facebook. By prescription discount apps piping your medication searches to advertisers. By companies that sold a week of abortion clinic visitor data for less than a grocery run.
These weren’t accidents. They weren’t “unintended consequences.” They were business decisions made by people who could have chosen differently and didn’t.
And it was shaped, in part, by you and me.
I've built products that use location data to find nearby pet shelters, parks to explore, Goodwills in your area. I've designed experiences that ask about your mood, goals, interests and preferences to serve you something actually useful. Used well, this data makes products better. I know what that looks like. I also know what it looks like when the same infrastructure gets pointed somewhere else—when a profile stops being something you build and becomes something companies build about you, without you, and sell to whoever's buying.
The Collection Layer: Thousands of Apps Harvesting Data Because the Business Model Rewarded It
The pipeline starts on your phone. In your car. On your wrist. In your living room.
Life360 markets itself to parents as a safety app. A way to know where your kids are. What the marketing doesn’t mention: the company was selling precise location coordinates, updated every few minutes, to a dozen data brokers. When journalists at The Markup exposed this in 2021, the CEO acknowledged the company couldn’t control what happened to the data once it was sold. As of August 2025, they’re still at it. A Capitol Forum investigation found Life360 selling user data through LiveRamp’s marketplace, segmented by age, gender, household income, and parental status.
The product that promises to protect your family is the same product that sells your family's movements to whoever's buying.
Your car does the same thing. In January 2025, the FTC took action against General Motors for collecting precise location data from OnStar users every three seconds and selling it to LexisNexis and Verisk, consumer reporting agencies that feed insurance pricing models. Drivers didn’t know until their premiums spiked. One customer told GM support: “When I signed up for this, it was so OnStar could track me. They said nothing about reporting it to a third party. Nothing. You guys are affecting our bottom line.”
GM is banned from sharing this data with consumer reporting agencies for five years. The data they already sold is still out there.
Flo, the period tracker with over 280 million registered users, promised users their health data would stay private. It didn’t. The app was sharing information about users’ menstrual cycles, pregnancy intentions, and symptoms with Facebook and Google through embedded analytics tools. The FTC settlement required Flo to notify affected users and instruct third parties to destroy the data. But here’s the thing about data: once it’s out, it doesn’t come back.
GoodRx, the prescription discount site and app with 55 million visitors since 2017, made the same promise. Fifty-five million users searching for antidepressants, heart medication, birth control, all of it shared with Facebook, Google, and Criteo through tracking pixels. The FTC’s first-ever enforcement under its Health Breach Notification Rule hit GoodRx with a $1.5 million penalty in 2023. A class action settlement followed: $32 million, finalized in late 2025. The company says it fixed the issue years ago. But the infrastructure that made it possible has become industry standard: tracking pixels that stay invisible to users while logging their IP address, location, page views, clicks, and conversions.
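To make the mechanics concrete: a tracking pixel is nothing more than an image request to a third-party server, and everything listed above rides along with it. Here’s a minimal sketch using only Python’s standard library; the endpoint, query parameters, and logged fields are illustrative, not any specific vendor’s implementation.

```python
# A minimal sketch of a third-party "tracking pixel" endpoint (illustrative,
# not any real vendor's API). The browser thinks it is loading a 1x1 image;
# the server logs everything that arrives with the request.
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlparse, parse_qs
import json, time

# The canonical 43-byte transparent GIF the endpoint returns.
PIXEL = (b"GIF89a\x01\x00\x01\x00\x80\x00\x00\x00\x00\x00\x00\x00\x00!"
         b"\xf9\x04\x01\x00\x00\x00\x00,\x00\x00\x00\x00\x01\x00\x01\x00"
         b"\x00\x02\x02D\x01\x00;")

class PixelHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        qs = parse_qs(urlparse(self.path).query)
        event = {
            "ts": time.time(),
            "ip": self.client_address[0],            # network location
            "page": self.headers.get("Referer"),     # the page being viewed
            "ua": self.headers.get("User-Agent"),    # device fingerprint input
            "event": qs.get("ev", ["pageview"])[0],  # click, conversion, etc.
        }
        print(json.dumps(event))  # in production: appended to a user profile
        self.send_response(200)
        self.send_header("Content-Type", "image/gif")
        self.end_headers()
        self.wfile.write(PIXEL)

if __name__ == "__main__":
    HTTPServer(("", 8000), PixelHandler).serve_forever()
```

Embed that image on enough pages and the server sees a person’s browsing history without a single form being filled out.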
SafeGraph sold location data on visits to Planned Parenthood clinics. For $160, Vice’s Motherboard bought a week of data covering more than 600 locations, detailed enough to show where visitors came from and where they went afterward. This was 2022, weeks before Roe fell. SafeGraph eventually stopped selling this specific data after the story broke. But the infrastructure that made the sale possible is still running. The business model hasn’t changed.
Strava, the fitness app, inadvertently revealed the locations of secret military bases because soldiers were tracking their runs. The heatmap the company published in 2017 lit up U.S. outposts in Syria, Afghanistan, and Somalia. The Pentagon launched a review. The lesson should have been obvious: data collected for one purpose doesn’t stay there.
This is the collection layer. Thousands of apps, devices, and services harvesting data because the business model rewarded it. The harm was foreseeable. No one in charge chose to stop it.
The Aggregation Layer: Where Your Data Becomes a Product
Individual data points are cheap. A single location ping is worth a fraction of a cent. But link millions of data points to a persistent identity, and the economics change completely.
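A toy example shows why. Individually, the rows below look like noise; grouped under a persistent advertising ID, they yield a probable home address. The data and the heuristic are invented, but the shape mirrors how commercial location datasets get analyzed.

```python
# Toy illustration: raw location pings look anonymous, but keyed to a
# persistent mobile advertising ID (MAID) they reveal a home location.
# All rows, IDs, and coordinates here are fabricated.
from collections import Counter, defaultdict
from datetime import datetime

pings = [
    # (advertising_id, timestamp, lat, lon): the shape many location SDKs sell
    ("maid-7f3a", "2025-03-01T02:14:00", 39.7294, -104.8319),
    ("maid-7f3a", "2025-03-01T03:40:00", 39.7294, -104.8319),
    ("maid-7f3a", "2025-03-01T13:05:00", 39.7420, -104.8500),
    ("maid-7f3a", "2025-03-02T01:55:00", 39.7294, -104.8319),
]

by_id = defaultdict(list)
for maid, ts, lat, lon in pings:
    by_id[maid].append((datetime.fromisoformat(ts), round(lat, 3), round(lon, 3)))

for maid, rows in by_id.items():
    # Crude "home" heuristic: most frequent coordinate between 10pm and 6am.
    night = Counter((lat, lon) for t, lat, lon in rows if t.hour >= 22 or t.hour < 6)
    home, nights = night.most_common(1)[0]
    print(f"{maid}: probable home near {home} ({nights} nighttime pings)")
```

One ping is a coordinate. Four pings and a clock are an address.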
This is where data brokers operate. There are at least 750 data broker companies registered in the U.S., and that only counts the four states that require registration. Estimates put the global number closer to 5,000. The biggest ones are names you might recognize: Experian, Equifax, Acxiom, Oracle. Experian alone holds data on 300 million Americans. Epsilon claims to have records on “every marketable U.S. household.” They’re billion-dollar companies operating in plain sight.
LexisNexis holds 282 million identity profiles. Not individual data points. Profiles. Each one can include Social Security numbers, criminal records, credit history, property records, vehicle registrations, employment history, and location patterns reconstructed from app data. The company has a $22 million contract with ICE, running through 2028. In the first seven months of that contract, ICE searched the LexisNexis database over 1.2 million times.
Thomson Reuters runs a similar product called CLEAR. It’s been used by law enforcement to build what officials describe as “patterns of life” analyses. The British Columbia General Employees’ Union, which holds stock in Thomson Reuters, has been pushing the company to conduct human rights due diligence on how these tools get used. So far, the pressure hasn’t stopped the contracts.
The data broker market is worth $270 billion annually. Information about individual Americans sells for a tenth of a cent to five dollars per person, depending on how sensitive it is. Data brokers don't collect data. They buy it, merge it, enrich it, and resell it.
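The mechanics of that buy-merge-enrich loop are mundane. A hashed email address is a common join key; the sketch below, with invented feeds and fields, shows how three cheap purchases become one expensive profile.

```python
# Sketch of broker-style record linkage: buy records from several sources,
# join them on a shared key (a hashed email is common), resell the merged row.
# Feeds, fields, and the person are all invented.
import hashlib

def key(email: str) -> str:
    return hashlib.sha256(email.lower().encode()).hexdigest()

purchased = {
    "retail_feed":   {key("ana@example.com"): {"loyalty_spend": "high"}},
    "location_feed": {key("ana@example.com"): {"home_zip": "80011"}},
    "health_feed":   {key("ana@example.com"): {"rx_interest": "antidepressants"}},
}

profiles: dict[str, dict] = {}
for feed in purchased.values():
    for k, attrs in feed.items():
        profiles.setdefault(k, {}).update(attrs)  # merge and enrich

# The resale product: one row per person, many sensitive attributes.
print(profiles[key("ana@example.com")])
```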
The product is a searchable version of your life, available to anyone who can pay.
And the buyers aren't just advertisers. Congressional investigations and a Federal Trade Commission report show that data brokers sell highly sensitive consumer segments, including political and protest-related interests, financial vulnerability categories such as “subprime households,” health and fertility indicators, and immigration-related inferences.
The Inference Layer: Predicting What You Haven't Told Anyone
The AI that’s reshaping surveillance doesn’t write essays or generate images; it makes predictions. And it’s disturbingly good at predicting things you’ve never explicitly shared.
In 2013, researchers at Cambridge published a study showing that Facebook likes alone could predict a user’s sexual orientation with 88% accuracy for men, and distinguish Democrats from Republicans 85% of the time. That was more than a decade ago. The models have only gotten better, and the data they’re trained on has only gotten richer.
Today, inference engines can estimate your health conditions, your pregnancy status, your creditworthiness, your political leanings, your emotional state. None of this requires you to disclose anything. It gets inferred from the patterns in your behavior, from the digital exhaust you leave behind without thinking about it.
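To see how little the inference step requires, here is a sketch in the spirit of that 2013 study: a plain logistic regression trained on synthetic “like” vectors, predicting a trait no user ever disclosed. Every number is fabricated; the structure is the point.

```python
# Trait inference from behavioral signals, sketched on synthetic data.
# Nothing asks the user for the trait; it is predicted from patterns.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_users, n_pages = 1_000, 50

# Rows are users; columns are "liked page X" flags. In the wild: apps
# installed, sites visited, products bought.
likes = rng.integers(0, 2, size=(n_users, n_pages))

# Pretend an undisclosed trait correlates with a handful of pages.
signal = likes[:, :5].sum(axis=1)
trait = (signal + rng.normal(0, 1, n_users) > 2.5).astype(int)

model = LogisticRegression(max_iter=1_000).fit(likes[:800], trait[:800])
accuracy = model.score(likes[800:], trait[800:])
print(f"held-out accuracy on a never-disclosed trait: {accuracy:.0%}")
```

Swap the synthetic likes for real purchase histories or location traces, and the same twenty lines become a health-status or pregnancy predictor.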
Insurance companies use data from driving apps and wearables to score your risk profile. Customer service platforms analyze call transcripts and assign emotional scores. Meta’s tracking pixel follows you across millions of websites, building a behavioral model that predicts what you’ll respond to before you know yourself.
The NSA and CISA issued joint guidance acknowledging the problem, warning about “insecure data pipelines” and recommending organizations treat AI training data as high-risk infrastructure. The message: AI inherits the risks of whatever data it’s trained on.
Palantir’s ELITE tool doesn’t just aggregate data; it scores it. Each potential target gets an “address confidence score” based on how certain the system is about where they live. The tool uses what Palantir calls “advanced analytics” to prioritize targets. An ICE agent testified that they use the density of pins on the map to decide where to send teams: “You’re going to go to a more dense population rather than... if there’s one pin at a house and the likelihood of them actually living there is like 10 percent... you’re not going to go there.”
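Palantir hasn’t published how ELITE computes that score, so treat the following as a deliberately naive stand-in that matches the agent’s testimony: more corroborating records at an address, weighted by source and recency, mean higher confidence and a denser pin. Every weight, source name, and field here is an assumption.

```python
# Hypothetical address-confidence scoring. NOT Palantir's actual method;
# a naive reconstruction of the behavior described in testimony.
from datetime import date

def address_confidence(records: list[dict], today: date) -> float:
    """Score 0-100 from corroborating records tied to one address."""
    score = 0.0
    for rec in records:
        age_years = (today - rec["seen"]).days / 365
        recency = max(0.0, 1.0 - age_years / 5)        # assumed 5-year decay
        weight = {"utility": 30, "dmv": 25, "ping": 10}.get(rec["source"], 5)
        score += weight * recency
    return min(score, 100.0)

records = [
    {"source": "utility", "seen": date(2025, 11, 1)},   # recent utility bill
    {"source": "dmv",     "seen": date(2023, 6, 15)},   # older license record
    {"source": "ping",    "seen": date(2025, 12, 20)},  # app location ping
]
print(f"{address_confidence(records, date(2026, 1, 15)):.0f} / 100")
```

Whatever the real formula looks like, the inputs are the commercial datasets described above, and the output decides which door gets knocked on.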
The simplest version: AI safety is data safety. You cannot build responsible AI on top of irresponsibly collected and sold data. The surveillance problem and the AI problem are part of the same problem.
The Enforcement Layer: No Warrant Required
In 2022, Georgetown Law’s Center on Privacy and Technology published a report called “American Dragnet.” The findings were stark. ICE had scanned the driver’s license photos of one in three American adults. ICE had access to the driver’s license data of three in four adults. ICE could locate three in four adults through their utility records alone.
That was 2022. Since then, the infrastructure has expanded.
As of January 2026, ICE is conducting the largest enforcement operation in its history. Agents have been deployed to Minneapolis, Maine, Houston. Two people have been killed. And the data tools powering these operations have only grown more sophisticated.
ICE has contracted with Palantir to build an investigative case management system that combines license plate scans, utility records, property data, biometrics, and social media activity into what officials call a “searchable portrait of a person’s life.” They’ve expanded contracts with LexisNexis and Thomson Reuters. They’ve revived a previously frozen contract with Paragon Solutions for spyware that can infiltrate encrypted messaging apps like WhatsApp and Signal. They’ve published a request for information seeking contractors to conduct 24/7 social media monitoring across Facebook, Instagram, TikTok, Reddit, and LinkedIn.
The 24/7 monitoring program would have contractors scrape public posts, correlate them with commercial datasets, and produce dossiers for field offices. For high-priority cases, the turnaround time is 30 minutes.
And it’s not just commercial data anymore. In January 2026, 404 Media reported on a Palantir tool called ELITE—Enhanced Leads Identification and Targeting for Enforcement. The tool pulls data from the Department of Health and Human Services, including Medicaid records covering 80 million patients, and maps potential deportation targets. Each person gets a dossier. Each address gets a confidence score out of 100. ICE agents use it to decide which neighborhoods to raid.
ICE signed a data-sharing agreement with the Centers for Medicare and Medicaid Services in 2025. The agreement was first reported by the Associated Press and confirmed through documents released in a lawsuit brought by 404 Media and the Freedom of the Press Foundation. Health data that Americans assumed was protected is now being used to generate leads for enforcement actions.
Almost none of this requires a warrant.
The Fourth Amendment protects Americans from unreasonable government searches. But the government found a workaround: if a private company collects the data and sells it, agencies can just buy it. No warrant. No subpoena. No court oversight.
Senator Ron Wyden called it “a backdoor to throw the Fourth Amendment in the trash can.”
And it’s not just federal agencies. In California, law enforcement agencies in Los Angeles, San Diego, Orange County, and Riverside County were caught sharing license plate reader data with ICE and Border Patrol, violating state law more than 100 times in a single month. The data came from Flock Safety, a company whose cameras are deployed in over 5,000 communities nationwide. Local police were running searches on behalf of federal immigration agents, giving ICE access to a surveillance network it doesn’t officially have a contract with.
The pipeline connects your grocery store loyalty card to a fusion center. Your period tracker to a prosecutor. The parks app on your phone to an enforcement action you’ll never see coming. The distance is shorter than you think, maybe three or four handoffs. None of them require your permission.
The Feedback Loop: Why "Delete Your Data" Doesn't Work
Here’s what makes this so hard to address: no single actor is responsible.
App developers collect data because it makes products better and because that’s how free products get monetized. Data brokers aggregate it because there’s a market. AI companies train on it because prediction is valuable. Law enforcement buys it because it’s available and it works.
Each player points to the others. Each one is just using what already exists.
“Everyone’s just doing their job.” The pipeline grows anyway.
Consumers are told to protect themselves. Use a VPN. Read the terms & conditions. Opt out of tracking. But academic research shows that most “delete your data” services don’t actually work. They miss partner brokers, 40% of whom fail to respond to deletion requests. They scrub records only partially. They create a false sense of control over a system designed to be untraceable and uncontrollable.
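A back-of-envelope simulation shows the arithmetic. Deletion requests reach only the brokers a service knows about, some of those ignore the request, and the remaining holders keep reselling copies in the meantime. Every number below is invented except the 40% non-response rate cited above.

```python
# Why one-shot deletion underperforms: coverage is partial, compliance is
# partial, and propagation continues. Toy numbers, single cycle.
import random

random.seed(1)
brokers = set(range(200))                        # brokers holding your record
known = set(random.sample(sorted(brokers), 60))  # ones a service can find

deleted = {b for b in known if random.random() > 0.40}  # 40% never respond

# Meanwhile the remaining holders keep reselling to new partners.
for holder in list(brokers - deleted):
    if random.random() < 0.10:                   # assumed 10% resale rate
        brokers.add(max(brokers) + 1)

print(f"requests sent: {len(known)}, honored: {len(deleted)}, "
      f"copies remaining: {len(brokers - deleted)} (started with 200)")
```

Run it for a few more cycles and the count goes up, not down. That is the false sense of control, in one loop.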
Meanwhile, the regulatory landscape is fractured. Twenty U.S. states have comprehensive privacy laws. No federal standard. The EU's GDPR has teeth on paper, but fines amount to rounding errors for the companies paying them. By the time California finalizes its rules on automated decision-making, the system will have moved on.
The data pipeline doesn’t wait for regulators.
In December 2024, the FTC banned Gravy Analytics from selling Americans’ location data without consent. The company had been tracking visits to health clinics, places of worship, and domestic abuse shelters, then selling that data to advertisers and government contractors. Three weeks later, Gravy Analytics got hacked. Seventeen terabytes of location data stolen. Thirty million records from 3,400 apps, including Tinder, Grindr, MyFitnessPal, transit apps, and games, exposed on a Russian cybercrime forum.
The FTC acted. It didn’t matter. The data was already out there, collected over years, and now it belongs to whoever downloaded it before the forum post came down. That’s the problem with enforcement: it’s always retrospective, and there’s no pulling the data back. Somewhere in those 17 terabytes is someone’s visit to a domestic abuse shelter. That data is never coming back.
What Good Looks Like: Proof That Different Choices Exist
I refuse to end on despair, because that won’t help us know how to act. We are not helpless. People are building alternatives. You and I can choose differently.
Project Liberty recently published a report on data cooperatives, models where the people who generate data actually own and benefit from it collectively. The premise is simple: individual consent frameworks were never going to work because data’s real value comes from aggregation. If aggregation is inevitable, the question is who controls it. Cooperatives are one answer.
Decentralized identity systems are maturing. Instead of your data living on a company's server, you hold it in a wallet you control. You share only what's necessary, verify it cryptographically, and expose nothing else. Microsoft and several governments are investing seriously.
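The core idea is selective disclosure: the issuer signs each claim separately, so you can prove “over 18” without handing over a birthdate or address. The sketch below uses an HMAC just to stay dependency-free; real systems (W3C Verifiable Credentials, BBS+ signatures) use asymmetric cryptography so the verifier never holds a secret. Names and claims are invented.

```python
# Toy selective-disclosure sketch. Real decentralized-identity systems use
# public-key signatures; HMAC here only keeps the example self-contained.
import hmac, hashlib

ISSUER_KEY = b"demo-issuer-secret"  # stand-in for an issuer's signing key

def sign(claim: str) -> str:
    return hmac.new(ISSUER_KEY, claim.encode(), hashlib.sha256).hexdigest()

# Issuance: the wallet stores each claim with its own signature.
wallet = {c: sign(c) for c in ["name=Jordan Rivera",
                               "age_over_18=true",
                               "address=123 Elm St"]}

# Presentation: the holder reveals ONE claim plus its proof, nothing else.
claim, proof = "age_over_18=true", wallet["age_over_18=true"]

# Verification: the check succeeds without the other claims ever leaving
# the wallet. No central server, no profile accumulating elsewhere.
assert hmac.compare_digest(proof, sign(claim))
print("verified:", claim)
```

The architectural inversion is the point: the data lives with you, and each verifier learns exactly one fact.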
Shareholder activism is putting pressure on data brokers. The BC General Employees’ Union has been using its stake in Thomson Reuters to demand human rights assessments of ICE contracts. They haven’t won yet. But the fact that a union pension fund is forcing these conversations at a corporate board level is something.
And inside some companies, teams are writing explicit principles about what they will and won’t do with data. Not just privacy policies. Actual commitments — lines they won’t cross, uses they’ll walk away from, even when there’s money on the table.
These aren’t solutions. They’re experiments. But they point toward a different way of thinking about the problem. One where how we handle data is a design problem, not just a checkbox.
The Question We Have to Keep Asking
The 16-year-old in Aurora was a U.S. citizen. He had every right to be on that sidewalk, riding his bike near his school. The system that put a phone in his face didn’t care about his rights. It cared about closing cases, faster, cheaper, at scale.
That system was built by people making decisions that seemed reasonable at the time. Collect this data, it’ll improve the product. Sell access to this dataset, it’ll fund the business. Buy this tool, it’ll help close cases faster. Each decision made sense in isolation. Together, they built something no one voted for.
I keep returning to the lesson in the NSA’s own guidance: AI safety is data safety. The futures we’re building with AI are downstream of the data we collect today. Every dataset, every integration, every feature that collects a little more than it needs. It all flows somewhere.
I can't control where it flows. But I can control what I contribute. I can ask harder questions in the rooms I'm in. I can say no to collection that doesn't have a defensible purpose. I can stop telling myself my piece of the pipeline is fine just because I can't see the end of it.
That’s not enough to fix the system. I know that. But it’s what I have.
What do you have? What are you willing to say no to?
I’d love to hear what this surfaced for you. Reply to this email or drop a comment below.
Resources
On the Aurora incident and ICE surveillance expansion:
Fast Company: ICE surveilling social media 24/7 threatens privacy and activism
404 Media: Here is the Agreement Giving ICE Medicaid Patients’ Data
Fortune: ICE alleged to use Palantir-developed tool that uses Medicaid data to track arrest targets
On the Gravy Analytics breach:
TechCrunch: Gravy Analytics breach threatens privacy of millions
FTC: Order prohibiting Gravy Analytics from selling sensitive location data