Saturday, April 30, 2011

Data Privacy, Put to the Test

Big Oil. Big Food. Big Pharma.

By Natasha Singer, NYTimes, April 30, 2011

To the catalog of corporate "bigs" that worry a lot of us little people, add this: Big Data. It was not a good week for those who guard their privacy. First, we learned that Apple and Google have been using our smartphones to collect location data. Then Sony acknowledged that its PlayStation network had been hacked — the latest in a string of troubling data breaches. You'd have to be living off the grid not to realize that just about everything there is to know about you — what you buy, where you go — is worth something to someone. And the more we live online, the more companies learn about us.

But to what extent do others have a right to share and sell that information? That is the crux of a data-mining case that had arguments last Tuesday before the Supreme Court. The case, Sorrell v. IMS Health, is ostensibly about medical privacy: Vermont passed a law in 2007 that lets each doctor decide whether pharmacies can, for marketing purposes, sell prescription records linking him or her by name to the kinds and amounts of drugs prescribed. State legislators passed the law after the Vermont Medical Society said that such marketing intruded on doctors and could exert too much influence on prescriptions.

But three health information firms, including IMS Health and Verispan, along with a pharmaceutical industry trade group, challenged the law, saying it restricted commercial free speech. Access to prescription records, IMS Health says, helps pharmaceutical companies market efficiently to doctors whose patients would most benefit from specific drugs. Now the justices are to decide whether the Vermont law is constitutional.

But with the recent headlines about privacy invasion — the PlayStation hack followed a recent breach at the online marketing company Epsilon that exposed e-mail addresses of customers of Citibank, Walgreens, Target and other companies — the Vermont case is tapping into a much broader conversation about consumer protection and informed consent.

The case raises questions about who is collecting, managing, storing, sharing and selling all that data. Just as important, privacy advocates say, it raises questions about whether data brokers are adequately safeguarding it.

People generally don't have much control over who collects and sells information about them. Moreover, says Christopher Calabrese, a legislative counsel at the American Civil Liberties Union, they also don't even know the names of the data brokers who compile those electronic profiles. And, so, consumer advocates are setting their sights on Big Data.

"Without government intervention, we may soon find the Internet has been transformed from a library and playground to a fishbowl," Mr. Calabrese testified in March during a Senate hearing on consumer privacy, "and that we have unwittingly ceded core values of privacy and autonomy."

There are a few laws, like the Video Privacy Protection Act, that prohibit businesses from releasing personally identifiable records, like video rental histories, without customer consent. The Digital Advertising Alliance, a coalition of online marketing groups, introduced a program last year that notifies consumers about online tracking and allows them to opt out of advertising tailored to them. The Vermont law amounts to a kind of do-not-call option for doctors who may welcome visits from pharmaceutical sales reps but don't want drug marketing based on their own prescription records.

That marketing practice is possible because pharmacies, which are required by law to collect detailed information about prescriptions they fill, can sell doctor-specific prescription records to data brokers. (According to federal privacy regulations, personal information about patients, like names and addresses, must be removed before the records can be sold for marketing.) Firms like IMS Health then combine the records, and pharmaceutical reps often use them to tailor presentations to individual doctors.

The central concern is privacy — of both doctors and their patients. While pharmacies remove the names of patients before selling the records, those names are replaced with unique codes that track patients over time from doctor to doctor, according to the Vermont complaint. That means data firms could create a profile that includes a person's prescriptions as well as the names of the pharmacies and dates at which the person picked up the medications, says Latanya Sweeney, a visiting professor of computer science at Harvard.

"It ends up building a detailed prescription profile of individuals," says Professor Sweeney, whose research on data re-identification was cited by several briefs in the case. "Those extended profiles tend to be very unique."

The concern, she says, particularly in a small state like Vermont, is that a nameless prescription record could theoretically be enough to identify someone who might not want others to know that he takes, say, anti-depressants. Moreover, Professor Sweeney argues, data miners could collate those files with public information, like voter registration and hospital discharge records, to link prescriptions to specific people.
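To make the linkage-attack idea concrete, here is a toy sketch in Python. The records, names, and fields are all hypothetical; this illustrates the general technique of joining a "de-identified" table to a public record on shared quasi-identifiers, not Professor Sweeney's actual method or data:

```python
# Toy illustration of a linkage (re-identification) attack: a "de-identified"
# prescription record still carries quasi-identifiers (ZIP code, birth year,
# sex) that also appear in public records like voter rolls.
# All names, records, and field layouts here are invented.

deidentified_rx = [
    {"patient_code": "X91F", "zip": "05602", "birth_year": 1958,
     "sex": "M", "drug": "antidepressant"},
]

public_voter_roll = [
    {"name": "J. Doe", "zip": "05602", "birth_year": 1958, "sex": "M"},
    {"name": "A. Smith", "zip": "05401", "birth_year": 1972, "sex": "F"},
]

def reidentify(rx_rows, public_rows):
    """Link prescription rows to named people via shared quasi-identifiers."""
    matches = []
    for rx in rx_rows:
        candidates = [p for p in public_rows
                      if (p["zip"], p["birth_year"], p["sex"])
                      == (rx["zip"], rx["birth_year"], rx["sex"])]
        if len(candidates) == 1:  # a unique match means re-identification
            matches.append((candidates[0]["name"], rx["drug"]))
    return matches

print(reidentify(deidentified_rx, public_voter_roll))
```

In a small state, a ZIP code, birth year, and sex can be enough to make that candidate list shrink to one, which is exactly the worry raised in the briefs.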

Federal health privacy regulation, she says, does not protect patient records once they have been de-identified. Nor does the law prohibit re-identification. But IMS Health says it isn't aware of any case of re-identifying patients whose prescription records were de-identified in accordance with federal rules. The company says it doubly encrypts each patient's identity and gives the encryption keys to several third parties — meaning that no single entity can decode a file by itself, says Kimberly Gray, chief privacy officer at IMS Health.
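The split-key idea can be sketched roughly as a chain of keyed hashes, with each key held by a different party, so that no single party can map the final code back to (or forward from) a name. This is a hypothetical illustration of the general principle only; IMS Health's actual encryption scheme is not public, and the key names below are made up:

```python
import hmac
import hashlib

# Hypothetical sketch of split-key pseudonymization. The identity passes
# through two keyed hashes; each key is held by a different organization,
# so neither alone can reverse or reproduce the mapping.
KEY_PARTY_A = b"held-only-by-the-pharmacy-processor"   # invented key/name
KEY_PARTY_B = b"held-only-by-a-separate-escrow-agent"  # invented key/name

def pseudonym(identity: str) -> str:
    """Two-layer keyed hash: a stable code with no single point of decoding."""
    first = hmac.new(KEY_PARTY_A, identity.encode(), hashlib.sha256).hexdigest()
    return hmac.new(KEY_PARTY_B, first.encode(), hashlib.sha256).hexdigest()

# The same patient always maps to the same code, which is what permits
# longitudinal tracking across pharmacies without exposing the name.
assert pseudonym("Jane Q. Patient") == pseudonym("Jane Q. Patient")
assert pseudonym("Jane Q. Patient") != pseudonym("John Q. Public")
```

Note the tension the sketch makes visible: the very stability that protects the name is what lets the code accumulate a detailed prescription history over time.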

The company typically sells combined reports that show how many patients received a certain drug from a certain doctor, but not the specific drugstores those patients frequent, Ms. Gray says. IMS never uses public information or outside data sets to try to re-identify patients, she says, and when it does provide encoded patient histories to others for research purposes, it prohibits those third parties from making such attempts. "We would never want to re-identify someone," Ms. Gray says. "No good can come from that."

Still, it is hard to prevent people from trying to re-identify patients, says Lee Tien, a staff lawyer at the Electronic Frontier Foundation, a digital civil liberties group that filed a brief in support of Vermont. It would be easier, he says, if Congress passed a law that went further than Vermont's, giving people the right to consent before their encrypted prescription records were sold for marketing purposes. "In Vermont, the doctor can decide," Mr. Tien says. "But we'd prefer it if the patient were able to say, 'Don't sell my data.' " 

Friday, April 29, 2011

Why the Online Identity & Data Ownership Debate Matters

There has been quite a bit of media attention this past week around the news that iPhones and iPads record and store location data in an unencrypted manner. Apple replied that it’s not tracking iPhone location; it’s maintaining a database of surrounding Wi-Fi hotspots and cell towers so the iPhone can calculate its location when requested. Anyway, this little window of raised awareness and interest in data mining and privacy compelled me to write a bit about it.

I’ve been exploring many angles over the past few years of how humanity and our technologies are co-evolving – how social media tools are offering us new ways to collaborate, to see ourselves through different lenses, to intentionally evolve our consciousness, and to explore new forms of value exchange.

I was invited to participate in the Internet Identity Workshop in Silicon Valley next week, and the Privacy Identity Innovation conference later in May, so my new learning objective has been to get a grasp on online identity and personal data ownership.  It’s really quite fascinating, and there is a real sense of urgency for awareness to be raised around what’s happening and what it means.

The Big Picture
We’re aware that the data we generate is “owned” (or at least maintained) by someone else – the government issues us our identification, the doctor’s office has our health records, the credit agencies know our financial history. We assume our information is private and secure.
But now, with so much activity happening online and increasingly on mobile devices, we’re generating a digital representation of ourselves that expresses not only our interests, desires, needs, purchasing behaviors, and range of social connections and relationships, but also the contextual information of our location in physical space and time. This is important because we’re generating a detailed profile of ourselves that reveals much more about us than we may realize.

What is Revealed: Macro Level
A recent article in the Wall Street Journal, The Really Smart Phone, discusses research conducted by scientists and the interesting patterns of human behavior they were able to extract from data collected from smartphones. For example, by analyzing people’s movement records, they were able to predict someone’s future whereabouts with 93.6% accuracy. They’re able to notice symptoms of mental illness, predict stock market fluctuations, and even chart the spread of ideas throughout society, revealing a “god’s-eye view of human behavior.” With billions of people on the planet now carrying a mobile device, we’re able to access data about human complexity that was simply not possible before.

What is Revealed: Micro Level
In a New York Times piece from the other day, Show Us the Data. (It’s Ours, After All.), professor of economics and behavioral science Richard Thaler writes about the vast amount of personal data that is being aggregated about us and sold to third parties.

In terms of consumption, this data is useful for companies in order to target you with highly personalized recommendations, advertising and offers. On a personally empowering level, it could potentially offer us a wealth of information about ourselves to assist us with intelligent decision-making. For example, by looking at medical records and family history, we might receive tailored recommendations for exercise plans or food choices. The problem is – we often don’t have access to this data.

What’s at Stake
There’s a lot of talk about “privacy” on the web right now, and I’m still not completely sure I understand the extent of the argument. If by privacy we mean security, and wanting protection of sensitive data like financial records or social security numbers, I completely agree. But if privacy concerns are around the fear of someone finding out about that bizarre fetish we have or the flavor of porn we prefer, I wonder how much that matters. While that information may be taboo in some circles, it’s actually infinitely less interesting than the data we reveal about ourselves publicly that’s being mined and sold online every day.

(Check out this tongue-in-cheek video by The Onion – “CIA’s Facebook Program Dramatically Cut Agency’s Costs.”)

Most of our activity online, from browsing websites to chatting with friends, is being recorded by someone. Your “private” conversations on Facebook are mined, as are your shopping habits on Amazon and your preferences and personal connections on any number of services.

The issue, more so than the fact that these things are happening, is that we don’t have access to the data we generate. Challenging this unfortunate reality was the big thrust that led to the formation of the Personal Data Ecosystem Consortium, a coalition of individuals and organizations who realize what’s at stake if we don’t reclaim the data that is ours.

Essentially, with third parties locking our “digital self” into each of their services, we are losing massive collective intelligence opportunities for innovation, value creation, knowledge building, and citizen engagement as a global society.

We have multiple accounts and multiple levels of relationships within and across those social networks. When we click around on sites we leave a trail of ‘digital exhaust’, defining our habits, preferences, curiosities, and explorations. We don’t have control, access, or ownership of this data, but third parties do. Each of these pieces, and all the contextual information around it, is INCREDIBLY VALUABLE, but currently fragmented, fractured, and scattered. Shouldn’t we have access to it ALL, so we can connect the dots and make effective and meaningful choices?

Why can’t I just export my data, activity, and relationships from each service, and be in control of who gets to see it, which parts they get to access, and how they use it once I give them permission?

Why isn’t there an easy way for me to have an overview of everything about me, and be able to selectively share information about myself, my interests, my capacities, my needs, or my resources?

The Future We Deserve
At the moment, commercial entities know more about our preferences and behaviors online than we do. With all the services out there that facilitate social interaction, there is still no easy way to connect with people with whom we share affinities, and then to effectively exchange information with them or collaborate in a meaningful way.

Our online identity and data *should* be our right to control, so that we are empowered to make better decisions about our lives and well-being, find potential collaborators or kindred spirits, or generally create more meaningful and valuable relationships. It’s worth asking:

What would a people-centric web look like?

What if it felt more like walking through a town commons and less like walking through a shopping mall?

How could identity and trust be built into the architecture of the internet?

To keep the length down here, I’ll flesh out some ideas about all this in an upcoming post – “A Framework for Building Online Intelligence.”

In the meantime, I’d love to hear your thoughts about identity and personal data ownership.
see also:

Personal leverage for personal data – Doc Searls
Databuse: Digital Privacy and the Mosaic

Wednesday, April 27, 2011

"Data trading is the new information economy"

Welcome to the age of data
By: Molly Wood, CNET, April 25, 2011
 In Daniel Suarez's book "Freedom," he describes a world in which members of a revolutionary "darknet" use glasses with heads-up displays to literally visualize the publicly available information about every person on earth.

It floats above them as a callout: Social Security numbers, bank balances, cell phone numbers, addresses, purchasing history, baby pictures, social network posts. That data is visible to anyone with the means to harvest it, and it can be manipulated at will by malicious hackers (like Loki, the Suarez character who puts a "data curse" on someone who annoys him), by governments, and by companies.

Hopefully, you've all realized that Suarez's vision is hardly one of the future: it's a vision of the present. Welcome to the age of data. It's time to get control of your assets.

Yeah, dude. They're watching you.
This week's iPhone location tracking scandal is just the latest glaring spotlight on how much of your personal information is gushing out the door, whether unprotected on your own devices and ripe for the picking, or into corporate and botnet servers worldwide. And despite reports of a Steve Jobs e-mail declaring that Apple doesn't track anyone, Apple's general counsel told a congressional inquiry in June 2010 that "(t)o provide the high-quality products and services that its customers demand, Apple must have access to the comprehensive location-based information."
Apple is hardly alone in demanding this level of comprehensive personal information. The iOS location-tracking revelations come on the heels of a federal investigation into mobile application data sharing. Investigators charge that seemingly harmless apps like Pandora, while streaming you highly customized media, are also sending "age, gender, location and phone identifiers to various ad networks," according to the Wall Street Journal. The Journal report found that the majority of the 101 apps it tested sent some personal information to a third-party data broker, largely without your knowledge.

Subsequent investigations found that most Android phones transmit some user information, including location data, back to the mother ship, as well, with Google saying only that the data wasn't "traceable to a specific user." (The merits of that argument are up for debate, to say the least.) Even Microsoft is gathering location data on Windows phones.

Sadly, this informational espionage should hardly come as a surprise.
The iPhone 4: talk about a Trojan Horse.
(Credit: EMMANUEL DUNAND/AFP/Getty Images)
The new cost of "free"
Personal information is the currency of the post-technological age, and the cost of "free" has never been higher. Your data, on an increasingly minute and personal level, powers every Web or network-based company, from start-up to monolith.
Google maintains literally acres of servers dedicated to storing your communications--from e-mail to texts to the transcripts of your voice mail; your browsing and shopping habits; your blog posts; your photos; your calendar appointments; and of course, your intensely personal search histories. If you're logged in to a Google service, that information is all tied to your IP address. Only the thinnest of artificial technical barriers--a sort of loose privacy honor system--keeps Google from combining the data into a scarily accurate digital version of you (like the first digital Cylon, if you will).
But pity poor Google, which must gather all this information by increasingly intrusive means, like the DoubleClick ad cookie that tracks your browsing all across the Web, surreptitious Wi-Fi sniffing, and sending location information about you back to its data centers even when you're not running location apps.

On the other side of the aisle lies Facebook, which has cleverly cajoled 500 million users (and growing) into giving up virtually all the same information for free. Profiles, Places, Deals, and of course, the ever-present Like button, which lets you easily record your preferences for everything from opinions to shoes to celebrities. You can almost imagine Facebook whispering a little "thank you" every time you click that little blue button.

Want to understand why Google is so desperate to get into social that it's tied part of every employee's bonus to the success or failure of that strategy in 2011? It has nothing to do with helping you share your photos and restaurant check-ins, and everything to do with data collection--and data connections.
Connected, we stand.
(Credit: Google)
The real magic of the new world of data collection is far more than just hoovering up reams of anonymous or semi-anonymous information. The real magic is in using that data to draw connections between action and reaction, consideration and purchase, brand and affinity, and to sip from the holiest of all commerce grails: recommendation.

The Web as real-time recommendation engine is the ultimate goal of initiatives ranging from the Amazon recommendation queue, to Netflix's $1 million prize for the team that improved its recommendation algorithm by 10 percent or more, to Facebook's original Beacon program.
Foursquare is working hard to integrate recommendations into its check-in service; Yahoo just spent a reported $20 million to $30 million on a TV check-in and recommendation service called IntoNow that's just 12 weeks old. It's a pretty simple equation: if they can figure out what you like, they can sell you more of what you like.

And the key to recommendation is scale. You can't do the math until you aggregate as many likes, dislikes, check-ins, and one-, two-, and four-star ratings as possible. All of these services depend, first and foremost, on you providing the data for them to crunch. And thanks to your life online, and, increasingly, the phone in your pocket, that data is as ever-present as the air we breathe.

Who's buying?
See, but Google, Facebook, and Apple are the companies we "trust," the way I trusted that Pandora was just delivering great '80s tunes on my now-dusty Bon Jovi station. So, where's all our information going? To a silent but deadly collection of data brokers, marketers, and data aggregation services.

These ranks include Epsilon, recently the subject of what Computerworld called "the hack of the century." No one knows how many e-mail addresses were exposed in the Epsilon breach, or the full scope of what else may have been revealed, but it has more than 2,000 clients and handles 40 billion e-mails a year. Its database of active shoppers (which included those who opted out but were retained in the database, if not actively emailed) was a gold mine for hackers and spear phishers, and there are 25 more companies where they came from--and that's just email marketing.
Merlin Information Services: for all your massive personal information database needs
What should you do?
What can you do? The short answer is, not a lot. Sure, you can go opt out of every data broker on the list, you can stay off the grid, you can give false names and live on cash. But the real question is: do you need to? Or, should we accept that we're in the age of data and embrace--nay, demand--that the data transparency go both ways?

Take the time to own your own data and clear the Web of any information you'd rather not be out there--you can at least try to opt out of sites like Spokeo and other aggregators, if only to protect the most sensitive information. And you don't have to trust the cloud. Ironically, despite its aggregation of information at a scale that would make Skynet envious, Google has engineers in-house who've created the Data Liberation Front, which lets you freely export your own information from the big G. Facebook lets you download everything you've ever posted (surprising, right?).

If you just want to back up and retain your data, services like Backupify index your cloud data and back it up, while Greplin lets you index the cloud services and search them, too (yes, I'm aware that both sites may engage in the same kind of ad targeting or data brokering I'm complaining about: read your terms of service, folks!).

And hey, as long as start-ups are making money brokering data, I'd like to see one that lets you see, say, your Data Score. If Greplin or Backupify can index your cloud information, why can't a company index it and parse it? A Data Score could tell you how risky your overshares are: do they make you unemployable, or just questionable? It could tell you what data is unintentionally public, like the cell phone number you thought you were hiding behind Facebook's byzantine wall of privacy settings. It could even perform a TurboTax-like audit and warn you when publicly available information about you might lead to easy identity theft or obvious phishing attempts.

The best disaster mitigation is preparedness. At some point we'll accept that data trading is the new information economy; our privacy expectations will adjust accordingly, and yes, there are benefits. But we shouldn't stumble blindly into it--we ought to at least be willing and informed partners in managing our digital identities. Then we can click the ad for those perfect nude pumps in relative peace. After all, they do go with everything.

Tuesday, April 26, 2011

What Does Your Phone Know About You? More Than You Think

Figuring that I've got nothing to hide or steal, I'd always privileged convenience over any privacy and security protocols. Not anymore.

I plugged my phone into my computer and opened an application called Lantern, a forensics program for investigating iPhones and iPads. Ten minutes later, I'm staring at everything my iPhone knows about me: 14,000 text messages, 1,350 words in my personal dictionary, 1,450 Facebook contacts, tens of thousands of location pings, every website I've ever visited, what locations I've mapped, my emails going back a month, my photos with geolocation data attached, and how many times I checked my email on March 24, or any day for that matter. Want to reconstruct a night? Lantern has a timeline that combines all my communications and photos in one neat interface. While most of it is invisible during normal operations, there is a record of every single thing I've done with this phone, which also happens to form a pretty good record of my life.

Immediately after trying out Lantern, I enabled the iPhone's passcode and set it to erase all data on the phone after 10 failed attempts. This thing remembers more about where I've been and what I've said than I do, and I'm damn sure I don't want it falling into anyone's hands.

Last week, two separate news items highlighted the importance of what your phone knows. First, the American Civil Liberties Union in Michigan went public with its Freedom of Information Act request for data on how the state police are using a hardware system called Cellebrite UFED. The ACLU suggested that state troopers were using the UFED during routine traffic stops. While the $4,000-8,000 price tag of the systems would suggest it's unlikely that many cops have the systems in their cars, even the possibility of such a practice has got to set Fourth Amendment alarm bells ringing from here to 1791. Here's a word of advice: if a law enforcement official ever asks for your phone, just say no.

In a June 2008 article, Cellebrite bragged that it had sold 3,500 Cellebrite devices in the eleven months the UFED had been on the market. Throw in other common devices from companies like Cellebrite, Paraben, Micro Systemation and Katana Forensics, makers of Lantern, and you can begin to see the scale of mobile phone data extraction that must be occurring across the nation's law enforcement landscape.

I don't say that to suggest that the police are doing anything wrong. Like computers, phones certainly seem like fair game for investigators. They're scrambling like the rest of us to keep up with a rapidly changing mobile technology landscape that's forcing strange ethical choices onto them. Let's say someone was texting while driving, which may be against the law in your state. Police might want that evidence, so they extract the data from the phone, and when they look at it, lo and behold, there are several time-tagged photos of the person getting high earlier that day. Suddenly, a minor ticket turns into a DUI.

We're not sure how the courts are going to decide whether evidence like this is admissible because it's complicated. Doctrines like 'plain view' -- that cops can seize evidence without a warrant if they can see it -- require informational friction and human embodiment to make sense. With a searchable stash of a phone's data, what is in plain view? What isn't? It's just so easy to find out more than you asked.

The other big mobile data news last week came out of O'Reilly's Where 2.0 conference during which two researchers showed in dramatic fashion that the iPhone keeps a location log of where the phone has been, a fact which Apple had declined to tell anyone and which had first been discovered by the same guy who created the Lantern software that opened up my phone for inspection.

Alex Levinson built Lantern from his living room in Rochester, New York. He's still a student at Rensselaer Polytechnic Institute, but he tells me that his room is "basically an information security and forensic laboratory." He ticks off the equipment at his disposal: 4 MacBooks, a couple other laptops, two desktop boxes running different operating systems, two iPhones, a couple Droids, a Blackberry, all kinds of wireless and networking equipment and terabytes of storage. He may also know Apple's iOS as well as anyone in the world. A mere 48 hours after Apple released the iPhone 4, Levinson had patched Lantern to support the upgrade. He waited in line for ten hours and spent the next two days poking around the file system that sits underneath the ultraslick user experience.

That's one reason he was the first to notice that Apple had begun storing its location data in the new, more easily accessible way. But all that time spent rummaging around under the iPhone's hood also led him to develop an actual philosophy about the difference between mobile and computer forensics. In mobile, he said, no one directly interacts with the file system. You don't pull up documents and save and delete them the way we do with a computer.

"How are those new interactions producing evidence that would be relevant to what I'm doing?" Levinson asked rhetorically.  For him, that means knowing every single thing a phone can output for him. "Take a basic phone, maybe a Razr," he said. "I would map out every single data point within the phone. We've got text messages. We've got pictures. We've also got picture messaging, which could be a subset. We've got call logs. We might have baseband logs." Then, he'd start to correlate one thing with another. If there are timestamps and locations, every message or photo can be fixed in space and time.

"You're beginning to create a forensic model of the human use of the device," he said. "The software's goal is to recreate a rich forensic timeline of how this device was used so the analyst can put their shoes in place of the user and see what happened with this device."
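The timeline reconstruction Levinson describes boils down to merging timestamped records from every data store on the phone into one chronology. Here is a minimal sketch with made-up records and field layouts (real tools parse the phone's own databases directly):

```python
from datetime import datetime

# Hypothetical records pulled from different stores on a phone.
# Each record: (timestamp, kind, detail).
texts = [("2011-04-13 07:04", "text", "Happy birthday! Thanks, man")]
photos = [("2011-04-13 14:10", "photo", "tour_bus.jpg @ 38.897,-77.006")]
searches = [("2011-04-13 19:45", "search", "hotel address, New York")]

def build_timeline(*sources):
    """Merge timestamped records from every source into one chronology."""
    merged = [rec for source in sources for rec in source]
    return sorted(merged,
                  key=lambda rec: datetime.strptime(rec[0], "%Y-%m-%d %H:%M"))

for ts, kind, detail in build_timeline(texts, photos, searches):
    print(ts, kind, detail)
```

The forensic power comes from the correlation step: once every message, photo, and search shares one time axis, the analyst reads the device's history as a single narrative.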
Indeed, using Lantern, it's remarkably easy to reconstruct what happened to me on, say, April 13, my birthday, and the next day, when I celebrated the release of my book at an Atlantic party.

I missed a call from my best friend at 12:30 a.m. wishing me a happy birthday. I got up at 7:04 a.m., which I know because I sent him back a text message. I got several more birthday greetings and phone calls. Then I had a meeting with Richard Florida and some other Atlantic people, during which I Googled several things related to the meeting. Then I went on a radio show in Colorado, which I know both because my calendar shows it and because I searched for the radio station. Then I took a cab to Union Station (I texted, "On my way to Union Station") and snapped a picture of a tour bus we passed that claimed to be "American-Owned & Operated." I got to New York around 7:45 p.m., when I Googled my hotel's address. The next morning, I went to WNYC at 160 Varick Street to be interviewed by Brian Lehrer, all of which is obvious from my Internet history, text messages and photos. Then I met with a prospective job candidate at Le Pain Quotidien, according to my calendar, and spent an hour researching. Then I went to my book party at a private home and took some photos, which Lantern pinpointed perfectly.

You could export most of this sequence to a Google Earth layer and look at it plotted with a time slider. Without trying to, I'd left a trail spelling out exactly what I did for 48 hours. Mobile forensics and mobile privacy don't have to sit in opposition, but what you can find with the former should inform our views about the latter. And you can suddenly find a ton with relatively simple tools.
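In principle, the Google Earth export is simple: each located event becomes a KML Placemark with a TimeStamp, and Earth's time slider animates them. A hedged sketch with invented events and coordinates:

```python
# Minimal KML generation for time-stamped placemarks. The events and
# coordinates below are invented for illustration.
events = [
    ("2011-04-13T14:10:00Z", "Photo: tour bus", 38.8977, -77.0065),
    ("2011-04-13T19:45:00Z", "Arrived in New York", 40.7506, -73.9935),
]

def to_kml(events):
    """Render (when, name, lat, lon) tuples as a KML document."""
    placemarks = []
    for when, name, lat, lon in events:
        placemarks.append(
            "<Placemark><name>{}</name>"
            "<TimeStamp><when>{}</when></TimeStamp>"
            "<Point><coordinates>{},{}</coordinates></Point>"
            "</Placemark>".format(name, when, lon, lat))  # KML wants lon,lat
    return ('<?xml version="1.0" encoding="UTF-8"?>'
            '<kml xmlns="http://www.opengis.net/kml/2.2"><Document>'
            + "".join(placemarks) + "</Document></kml>")

print(to_kml(events))
```

Open the resulting file in Google Earth and the time slider walks through the day's movements, which is exactly the "trail spelling out what I did" effect described above.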

The big deal about location data isn't the data itself; rather, the location data makes all the other information that can be extracted exponentially more useful. That's why mobile forensics is different, and why our devices may be where the bubbling privacy concerns of the last decade come to a head.

If our phones have become our outboard brains, we've actually put ourselves in a very difficult privacy position. Even searching a suspect's house could never yield a full inventory of that person's friends and acquaintances, the entire record of their voice and text communications -- and all the web pages they'd ever looked at. Now, law enforcement or a government official can have all of that with two minutes and physical access to one's cell phone.

Or as Cellebrite USA's CEO Aviad Ofrat excitedly told a trade magazine a couple years ago, "mobile device forensics is the future. With the wealth of data even a casual user has stored in his or her cellphone, smartphone, or PDA, it is quickly becoming THE one piece of evidence that is interrogated immediately."

Because where we go, so go our phones.

I'll be publishing part two of this series tomorrow after I visit the National Institute of Standards and Technology's mobile forensic tool testing lab in Gaithersburg, Maryland.

Wednesday, April 20, 2011

CRS Report: "Privacy Protections for Personal Information Online"


There is no comprehensive federal privacy statute that protects personal information. Instead, a patchwork of federal laws and regulations governs the collection and disclosure of personal information, which Congress has addressed on a sector-by-sector basis.

 Federal laws and regulations extend protection to consumer credit reports, electronic communications, federal agency records, education records, bank records, cable subscriber information, video rental records, motor vehicle records, health information, telecommunications subscriber information, children’s online information, and customer financial information. Some contend that this patchwork of laws and regulations is insufficient to meet the demands of today’s technology.

Congress, the Obama Administration, businesses, public interest groups, and citizens are all involved in the discussion of privacy solutions. This report examines some of those efforts with respect to the protection of personal information. This report provides a brief overview of selected recent developments in the area of federal privacy law. This report does not cover workplace privacy laws or state privacy laws.

Available at

National Identity Strategy Envisions a More Trustworthy

Guest blog post by Leslie Harris, President and CEO of the Center for Democracy & Technology

Today the Administration released an ambitious, long-term strategy document called the National Strategy for Trusted Identities in Cyberspace (NSTIC). The Strategy puts forth a vision where individuals can choose to use a smaller number of secure, privacy-preserving, and convenient online identities. This would be a shift away from today’s norm of numerous usernames, passwords, and online accounts scattered across the Web.

Importantly, the Administration has turned to the private sector to make this vision a reality. The Strategy is not a national ID program—in fact, it’s not an ID “program” at all. It is a call for leadership and innovation from private companies. The government’s role must now be to advocate for its citizens and to support the development of a fair and useful system.

Why should the American people care about a “strategy” for Internet identity?

First, a growing number of our Internet transactions require an identity. We’re continually prompted to create new accounts to participate in online social networking, shopping, banking, and forums. Most of us have no idea how our identifying information will be used or shared. It certainly doesn’t help that we have to offer a fresh set of information to every new service that comes along. Without a new approach, this trend will continue. We deserve better control over our identity and stronger assurances that it will not be misused. Innovation isn’t slowing down; we have to catch up.

Secondly, services that will make our lives easier and more convenient—sometimes involving highly sensitive information—are still waiting to come fully online.  Health care and government services are slowly staking out an Internet presence, but they will remain at the starting line until a reliable and trustworthy platform for establishing and confirming user identity exists.

We’re pleased to see the Strategy has made individuals its first priority. The Administration must remain firmly dedicated to an identity ecosystem that is voluntary, protective of privacy, affords users a wide variety of choices for whether and how they will convey their identity online, and compliant with a full set of Fair Information Practice Principles. This effort must also be built on the foundation of comprehensive privacy legislation. We encourage the Administration to integrate its existing support for baseline privacy legislation into the Strategy’s implementation.

Finally, the Strategy recognizes that anonymity and pseudonymity—crucial elements of our privacy and First Amendment rights—are and must remain vital characteristics of the Internet alongside any new identity ecosystem.

The Strategy is the beginning of a long journey through complicated technology standards and policy rules. If its vision is realized, consumers, businesses, and governments all have a lot to gain. It will only succeed, however, with meaningful engagement from all stakeholders. We are eager to see the Strategy’s implementation plan and hope the Strategy leads to a productive partnership.

Monday, April 18, 2011

What's next for privacy on the Hill?

By Hayley Tsukayama  Wash Post   4/15/2011

New bills and discussion about the recent Epsilon data breach have made privacy a popular talking point on the Hill. But a lot of politics stands between the talk and actual movement on legislation.

Four major privacy proposals have been floated on the Hill this session. In February, Reps. Bobby Rush (D-Ill.) and Jackie Speier (D-Calif.) each introduced privacy legislation. Earlier this week, Sens. John Kerry (D-Mass.) and John McCain (R-Ariz.) and Reps. Cliff Stearns (R-Fla.) and Jim Matheson (D-Utah) offered privacy bills for each chamber.

The privacy bills have some key differences. Stearns’s bill promotes industry self-regulation and requires companies to notify consumers about privacy policies and data use. The bill from Kerry and McCain encourages self-regulation but also requires an opt-in measure to share sensitive personal information, or information that could harm a person if released, depending on the situation.

Rush’s reintroduced bill requires companies to provide an opt-out option before they can share data with other companies. Speier’s privacy package includes the only proposed legislation with a do-not-track measure; the package’s other bill is aimed at protecting financial information.

Having bipartisan bills in the House and Senate is a key step forward, said Justin Brookman, a privacy expert from the Center for Democracy and Technology. That at least gives this week’s bills a chance to move along, he said.

Even that, though, might not be enough. “Both bills have an overwhelming amount of momentum, but my enthusiasm is tempered by the calendar, given the looming election season,” said Amy Mushahwar, a lawyer and privacy expert at Reed Smith law firm.

Staffers for Kerry and Rush have said both offices are trying to schedule privacy hearings. A person in Kerry’s office said that they are trying to schedule a hearing on the consumer privacy act as soon as possible, likely after the April recess.

“I think we’ll see action in the Senate sooner,” Brookman said, as the House’s new Republican majority hasn’t had as much time to work on privacy issues.

There are a lot of players in this debate. In the Senate, privacy issues have traditionally been the jurisdiction of the Senate Commerce, Science and Transportation committee. But in February, Sen. Al Franken (D-Minn.) was tapped to lead the chamber’s new Judiciary subcommittee on privacy, adding more voices to the mix.

And with two bills already proposed in the House, Rep. Mary Bono Mack (R-Calif.), who chairs the subcommittee with jurisdiction over consumer privacy issues, has also highlighted privacy as a main concern.

Stearns has said he will work closely with Bono Mack, but that the Kerry/McCain bill should not be viewed as a companion to his bill.

“I believe that our approach of greater consumer notice and choice balances the needs of privacy and innovation,” Stearns said in a statement. “Our bill provides the necessary flexibility and avoids one size fits all regulations and unnecessary government intervention.”

With all the high-minded, conceptual talk about privacy, Mushahwar said that it’s also important to concentrate on basic definitions in the bills, and not lose sight of how companies can actually apply the language to their own business practices.

“These bills have to be implemented by data centers and require a practical mindset,” she said.

Wednesday, April 13, 2011

Brookings Paper on Privacy

Databuse: Digital Privacy and the Mosaic
by Benjamin Wittes, Senior Fellow, Governance Studies, The Brookings Institution  •  April 2011


The question of privacy lies at, or just beneath, the surface of a huge range of contemporary policy disputes. It binds together the American debates over such disparate issues as counter-terrorism and surveillance, online pornography, abortion, and targeted advertising. It captures something deep that a free society necessarily values in our individual relations with the state, with companies, and with one another. And yet we see a strange frustration emerging in our debates over privacy, one in which we fret simultaneously that we have too much of it and too little.

This tendency is most pronounced in the counter-terrorism arena, where we routinely—with no apparent irony—both demand that authorities do a better job of “connecting the dots” and worry about the privacy impact of the data-mining and collection programs designed to connect those dots.

The New Republic on its cover recently declared 2010 “The Year We Were Exposed” and published an article by Jeffrey Rosen subtitled “Why Privacy Always Loses.”[1] By contrast, in a book published earlier in 2010, former Department of Homeland Security policy chief Stewart Baker described privacy concerns as debilitating counter-terrorism efforts across a range of areas:

Even after 9/11, privacy campaigners tried to rebuild the wall [between intelligence and law enforcement] and to keep DHS from using [airline] reservation data effectively. They failed; too much blood had been spilled. But in the fields where disaster has not yet struck—computer security and biotechnology—privacy groups have blocked the government from taking even modest steps to head off danger.[2]

Both of these theses cannot be true. Privacy cannot at once be always losing—a value so at risk that it requires, as Rosen contends, “a genuinely independent [government] institution” dedicated to its protection—and simultaneously be impeding the government from taking even “modest steps” to prevent catastrophes.

Unless, that is, our concept of privacy is so muddled, so situational, and so in flux, that we are not quite sure any more what it is or how much of it we really want.

In this paper, I explore the possibility that technology’s advance and the proliferation of personal data in the hands of third parties has left us with a conceptually outmoded debate, whose reliance on the concept of privacy does not usefully guide the public policy questions we face. And I propose a different vocabulary for that debate—a concept I call “databuse.” When I say here that privacy has become obsolete, to be clear, I do not mean this in the crude sense that we have as a society abandoned privacy in the way that, say, we have abandoned once-held moral anxieties about lending money for interest. Nor do I mean that we have moved beyond privacy in the sense that we moved beyond the need for a constitutional protection against the peacetime quartering of soldiers in private houses without the owner’s consent.[3] Privacy still represents a deep value in our society and in any society committed to liberalism.

Rather, I mean to propose something more precise, and more subtle: that the concept of privacy as we have traditionally understood it in law no longer describes well or completely the actual value at stake in the set of issues we continue to argue in privacy’s name. The notion of privacy was always vague and hard to pin down as an operational matter in law. But this problem has grown dramatically worse as a result of the proliferation of data about all of us and the ability to analyze and cross-reference that data systematically and instantly. To put the matter bluntly, the concept of privacy will no longer bear the weight we are placing upon it. And because the term covers such a huge range of ground, its imprecision with respect to these new problems creates great indeterminacy as to what the value we are trying to protect really is, whether it is gaining or losing ground, and whether that is a good thing or a bad.

In this paper, I examine privacy’s conceptual obsolescence with respect only to a single area, albeit one that is by itself hopelessly sprawling: data about individuals held in the hands of third parties. Our lives, as I have elsewhere argued, are described by a mosaic of such data—an ever-widening array of digital fingerprints reflecting nearly all of life’s many aspects. Our mosaics record our transactions, our media consumption, our locations and travel, our communications, and our relationships. They are, quite simply, a detailed portrait of our lives—vastly more revealing than the contents of our underwear drawers yet protected by a weird and incoherent patchwork of laws that reflect no coherent value system.[4]

We tend to discuss policy issues concerning control over our mosaics in the language of privacy for the simple reason that privacy represents the closest value liberalism has yet articulated to the one we instinctively wish in this context both to protect and to balance against other goods—goods such as commerce, security, and the free exchange of information. And there is no doubt an intuitive logic to the use of the term in this context. If one imagines, for example, the malicious deployment of all of the government’s authorities to collect the components of a person’s mosaic and then the use of those components against that person, one is imagining a police state no less than if one imagines an unrestricted power to raid people’s homes. If one imagines the unrestricted commerce in personal information about people’s habits, tastes, and behaviors—innocent and deviant alike—one is imagining an invasion of personal space as destructive of a person's privacy as the breaking into that person's home and the selling of all the personal information one can pilfer there.

Yet the construction of these issues as principally implicating privacy is not inevitable; indeed, privacy itself is not inevitable as a legal matter. It was, as I shall argue, created in response to the obsolescence of previous legal constructions designed to shield individuals from government and one another, and it was created because technological developments made those earlier constructions inadequate to describe the violations people were feeling. Ironically, today it is privacy itself that no longer adequately describes the violations people are feeling with respect to the mosaic—and it describes those violations less and less well as time goes on. Much of the material that makes up the mosaic, after all, involves records of events that take place in public, not in private; driving through a toll booth or shopping at a store, for example, are not exactly private acts.

Most mosaic data is sensitive only in aggregation; it is often trivial in and of itself—and we consequently think little of giving it, or the rights to use it, away. Indeed, mosaic data by its nature is material we have disclosed to others, often in exchange for some benefit, and often with the understanding, implicit or explicit, that it would be aggregated and mined for what it might say about us. It takes a feat of intellectual jujitsu to construct a cognizable and actionable set of privacy interests out of the amalgamation of public activities which one transacted knowingly with a stranger in exchange for a benefit. The term privacy has become a crutch—a description of many different values of quite-different weights—that does not usefully describe the harms we fear.

The more sophisticated privacy scholars and advocates appreciate this. In his exhaustive effort to create a “Taxonomy of Privacy,” Daniel Solove argues up front that “The concept of ‘privacy’ is far too vague to guide adjudication and lawmaking”[5] and that “it is too complicated a concept to be boiled down to a single essence.” Rather, he treats privacy as “an umbrella term, referring to a wide and disparate group of related things.”[6] Just how wide becomes clear over the course of his 84-page article. His taxonomy contains four principal parts, each consisting of multiple subparts—creating, all in all, a 16-part typology that ranges from blackmail to data “aggregation” and “decisional interference.” And he concedes in the end that although all of the privacy harms he identifies “are related in some way, they are not related in the same way—there is no common denominator that links them all.”[7] Solove’s heroic effort to salvage privacy’s coherence through comprehensive cataloguing has the unintended effect of revealing its unsalvagability.

My purpose here is to propose a different vocabulary for discussing the mosaic—in some ways a simpler, cruder one, but one that both more accurately describes than privacy our behavior with respect to the mosaic and that offers more useful guidance than the concept of privacy does as to what activities we should and should not tolerate. The relevant concept is not, in my judgment, protecting some elusive positive right of user privacy but, rather, protecting a negative right—a right against the unjustified deployment of user data in a fashion adverse to the user's interests, a right, we might say, against databuse.

The databuse conception of the user’s equity in the mosaic is more modest than privacy. It doesn’t ask to be “let alone.” It asks, rather, for a certain protection against tangible harms as a result of a user’s having entrusted elements of his or her mosaic to a third party. Sometimes, to be sure, these tangible harms will implicate privacy as traditionally understood, but sometimes, as I will explain, they will not. Think of it as a right to not have your data rise up and attack you.

Thinking about the mosaic questions we currently debate in the language of privacy in terms of databuse instead has a clarifying effect on a number of contemporary public policy disputes. In some cases, it will tend to suggest policy outcomes roughly congruent with those suggested by a more conventional privacy analysis. In other cases, by contrast, it suggests both more and less aggressive policy interventions and market developments on behalf of users. In some areas, it argues for a complacent attitude towards data uses and acquisitions that have traditionally drawn the skeptical eye of privacy activists. Yet it also suggests more intense focus on a subset of privacy issues that are currently under-emphasized in privacy debates—specifically, issues that genuinely implicate personal security.

Full paper at

Tragedy of the Data Commons

By Jane Yakowitz
Brooklyn Law School, March 18, 2011


Accurate data is vital to enlightened research and policymaking, particularly publicly available data redacted to protect the identity of individuals. Legal academics, however, are campaigning against data anonymization as a means to protect privacy, contending that the wealth of information available on the Internet enables malfeasors to reverse-engineer the data and identify the individuals within it.

Privacy scholars advocate for new legal restrictions on the collection and dissemination of research data. This Article challenges the dominant wisdom, arguing that properly de-identified data is not only safe, but of extraordinary social utility. It makes three core claims.

First, legal scholars have misinterpreted the relevant literature from computer science and statistics, and thus have significantly overstated the futility of anonymizing data.

Second, the available evidence demonstrates that the risks from anonymized data are theoretical - they rarely, if ever, materialize.

Finally, anonymized data is crucial to beneficial social research, and constitutes a public resource - a commons - under threat of depletion. The Article concludes with a radical proposal: since current privacy policies overtax valuable research without reducing any realistic risks, law should provide a safe harbor for the dissemination of research data.
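The de-identification the article defends typically works by coarsening quasi-identifiers before release. As a minimal sketch of the idea (the records, field choices, and k=2 threshold below are invented for illustration, not drawn from the paper), one can generalize ZIP codes and ages and then check k-anonymity: that every released record shares its quasi-identifier combination with at least k-1 others.

```python
# Sketch of simple de-identification: generalize quasi-identifiers
# (ZIP code, age), then verify the released table is k-anonymous.
from collections import Counter

# Invented sample records; "diagnosis" is the sensitive attribute.
records = [
    {"zip": "05401", "age": 34, "diagnosis": "flu"},
    {"zip": "05403", "age": 37, "diagnosis": "asthma"},
    {"zip": "05446", "age": 52, "diagnosis": "flu"},
    {"zip": "05452", "age": 58, "diagnosis": "diabetes"},
]

def generalize(rec):
    """Coarsen quasi-identifiers: 3-digit ZIP prefix, 10-year age band."""
    lo = rec["age"] // 10 * 10
    return {"zip": rec["zip"][:3] + "**",
            "age": f"{lo}-{lo + 9}",
            "diagnosis": rec["diagnosis"]}

def is_k_anonymous(recs, k):
    """True if every (zip, age) combination appears in at least k records."""
    groups = Counter((r["zip"], r["age"]) for r in recs)
    return all(count >= k for count in groups.values())

released = [generalize(r) for r in records]
print(released)
print(is_k_anonymous(released, k=2))  # True: each quasi-ID group has >= 2 records
```

The policy dispute the article joins is precisely over how much such generalization actually buys: critics argue outside data can still re-identify records, while the article contends such attacks rarely materialize in practice.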
Paper at

Tuesday, April 12, 2011

Paying For Privacy

Buying back your own privacy. The high cost of keeping your personal information personal.

For most people, every five-minute stroll on the web is a five-minute undressing: websites snatching, saving, and selling information on every click you make, every bit of personal data they can grab.

Web companies say it’s in the interest of consumer convenience and personalization. Privacy advocates say it’s out of control.

Back in the day, we presumed privacy until we saw it “invaded.” Today, many people presume a kind of nakedness on the web. These days, you may have to pay to buy your own privacy back.

This hour On Point: the new frontiers of privacy. Listen to this show at
- Tom Ashbrook
Nick Bilton, technology reporter and lead writer for the Bits blog. He is the author of “I Live in the Future & Here’s How It Works.”
Esther Dyson, angel investor in internet startups, regular commentator on emerging digital technology and the former chairman of the Electronic Frontier Foundation.
Latanya Sweeney, visiting professor at Harvard University’s Center for Research on Computation and Society. She is also the founder and director of the Data Privacy Lab, which seeks to shape the evolving relationship between technology and the right to privacy.
Michael Fertik, CEO of and author of “Wild West 2.0.”

Commercial Privacy Bill of Rights

On April 12, 2011, Senator Kerry and Senator McCain introduced a Commercial Privacy Bill of Rights to establish a baseline code of conduct for how personally identifiable information and information that can uniquely identify an individual or networked device are used, stored, and distributed. 

This legislation would go a long way toward increasing consumer trust in the market (and generating additional commercial activity as a result) while protecting people from unscrupulous actors by creating a set of basic rights to which all Americans are entitled.

These privacy rights include:

·       The right to security and accountability: Collectors of information must implement security measures to protect the information they collect and maintain.

·       The right to notice, consent, access, and correction of information:  Collectors of information must provide clear notice to individuals about their collection practices and the purpose of such collection.  Additionally, the collector must give individuals the ability to opt out of any information collection that is unauthorized by the Act, and must obtain affirmative consent (opt-in) for the collection of sensitive personally identifiable information.  Respecting companies’ existing relationships with customers, and their ability to develop relationships with potential customers, the bill would require robust and clear notice to an individual of his or her ability to opt out of the collection of information for the purpose of transferring it to third parties for behavioral advertising.  It would also require collectors to provide individuals either the ability to access and correct their information, or to request cessation of its use and distribution.

·       The right to data minimization, constraints on distribution, and data integrity:

Collectors of information would be required to collect only as much information as necessary to process or enforce a transaction or deliver a service, though they could collect and use information for research and development to improve the transaction or service, and could retain it for only a reasonable period of time.  Collectors must bind third parties by contract to ensure that any individual information transferred to the third party will be used or maintained only in accordance with the bill’s requirements.  The bill also requires the collector to attempt to establish and maintain reasonable procedures to ensure that information is accurate.

Other key elements of the Kerry-McCain Commercial Privacy Bill of Rights include:

·       Enforcement:  The bill would direct State Attorneys General and the Federal Trade Commission (FTC) to enforce the bill’s provisions, but not allow simultaneous enforcement by both a State Attorney General and the FTC.  Additionally, the bill would prevent private rights of action. 

·       Voluntary Safe Harbor Programs:  The bill allows the FTC to approve nongovernmental organizations to oversee safe harbor programs that would be voluntary for participants to join but would have to achieve protections as rigorous as, or more rigorous than, those enumerated in the bill.  The incentive for enrolling in a safe harbor program is that a participant could design or customize its compliance procedures and be exempt from some requirements of the bill.

·       Role of Department of Commerce:  The Act directs the Department of Commerce to convene stakeholders for the development of applications for safe harbor programs to be submitted to the FTC.  It would also have a research component for privacy enhancement as well as improved information sharing. 

The text of the legislation, a short summary of the bill, a section-by-section summary, the text of Senator Kerry's introduction of the legislation, and the press release announcing the bill are all available for download.