In the Western world, we tend to get very possessive about our identities and talk incessantly at conferences like these about how we want control over our own identities. That's a useful starting point for discussing the preservation of privacy, but it has ignored something key along the way. Your identity--in terms of how you represent yourself in communities, in commerce, or in public forums--is not your own. It's an index into some institution that you share with the people to whom you're giving your identity.
Because this idea is so abstract, consider it with a non-virtual example. Suppose I come to your small town and say, "I'm the mayor's brother; will you put me up for the night?" You're not likely to accept my claim at face value; I have not established my identity.
The situation is completely different if the mayor, or your friends, or some other authority in the town, says, "He's the mayor's brother; will you put him up for the night?" Your response will depend now on how much you like the mayor, or whether he approved your recent application for a construction permit--things that the identity researchers like to call attributes of the mayor. My own identity is settled, however, because someone you invest with authority has verified it.
This principle turns up in the most everyday online activities. The identification andyo means nothing, but email@example.com is useful because the mail server at oreilly.com knows how to send mail to me using the andyo label. And this authority extends far beyond a single mail server at oreilly.com, even though it's the only server that needs to know me; successful mail delivery requires a whole network of servers connected by internet protocols.
The fundamentals of computing respect the principle of authority over identities. Suppose I go to an online store, and I'm their 73,267th customer. Their database may automatically assign a unique identifier of 73,267 to me. And that identifier may never be seen outside of the code accessing the database, but it is my identity as far as the store is concerned. Or, as Brands said at the conference, "the notion of a record [in a database at some agency or company] is what we call identity."
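The store example can be sketched in a few lines of Python with the standard library's sqlite3 module. This is only an illustration: the table, its columns, and the register and lookup helpers are hypothetical names, not anything a particular store is known to use. The point is that the identifier is minted by the database, not by the customer.

```python
import sqlite3

# Hypothetical store database; the schema is an assumption for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE customers (
        id    INTEGER PRIMARY KEY,  -- SQLite assigns the next integer itself
        name  TEXT NOT NULL,
        email TEXT NOT NULL
    )
    """
)


def register(name, email):
    """Insert a customer and return the identifier the database assigned."""
    with conn:  # commits the insert when the block exits cleanly
        cur = conn.execute(
            "INSERT INTO customers (name, email) VALUES (?, ?)",
            (name, email),
        )
    return cur.lastrowid


def lookup(customer_id):
    """Return the (name, email) record behind an identifier, or None."""
    return conn.execute(
        "SELECT name, email FROM customers WHERE id = ?",
        (customer_id,),
    ).fetchone()


first_id = register("Alice", "alice@example.com")
second_id = register("Bob", "bob@example.com")
print(first_id, second_id, lookup(second_id))
```

Neither customer chose an identifier or ever needs to see it, yet every later interaction with the store resolves to that row.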
We can layer all sorts of powerful features and describe all manner of personal attributes in an identity, and develop ever more sophisticated protocols for exchanging the data securely, but all identities come down ultimately to the authorities we entrust with them. This means that identity management is not really the management of individual identities, but the management of institutions we trust.
As you tussle out the policy issues around online identity, keep one idea in mind: your identity is an entry in the database of the authority that authenticates you. Feel better? Whether you do or not, at least you will be guided down the right policy-making paths.
As I will explain, the identity development community has come together around some strategies to move power from the authorities to the individual applicants for identity; this is called user-centric identity.
Once we recognize that managing identity means managing authority, we can understand the source of many policy debates. Some of the themes in this section build on a pair of earlier articles of mine: "From P2P to Web Services: Addressing and Coordination" and "From P2P to Web Services: Trust."
In the financial world, there are controversies over the accuracy and fairness of credit ratings that credit companies check. Online, we raise similar complaints when sites collect personal information as a prerequisite for signing us up. Some researchers suggest that large numbers of users submit false information as an underhanded protest.
These controversies raise a cluster of questions about authority: what do authorities have the right to demand of the people to whom they give identities? Just because we have all granted the right to maintain identity to a particular authority, should that authority be 100% in control? Should the authority make all the rules at its sole discretion?
In the 1990s, several large computer companies--including Microsoft, IBM, and Sun--sought the Holy Grail of single sign-on. This technology allows a user to enter a password just once and then surf seamlessly from one site to another without the annoyance of re-entering the password. Single sign-on required a federated security model (letting one site validate a user and provide information about that user to another site), and therefore heightened the need for trust.
Single sign-on is more than a convenience to encourage users to visit more sites. It could provide an important boost to online security, because it uses forms of digital signatures that are more secure than simple password verification. It could also enforce good security practices, eliminating two common problems that plague password systems: users choosing passwords that are easy to uncover and using the same password for multiple sites.
Single sign-on was technically feasible, but ran into real-world problems around trust and liability. If I let your site validate my users, how can I make sure that you maintain at least as good security as I do--that you keep your internet hosts well-patched, validate user-submitted information for fraud, screen your employees for criminal backgrounds, and so forth?
The next step, therefore, after the heady stage of creating federated standards, was investing an enormous amount of corporate time to set up standards for security and verification. Then, of course, institutions had to administer those standards. Then came more protocols (on top of the federated security protocols just released) to transfer information about trust and liability.
In short, the computer industry dealt with the issue of authority by trying to formalize authority in standards, institutions, and protocols. The system has made very limited headway.
Our credit ratings are a function of the companies that maintain the ratings; were the companies to go out of business and lose the expertise needed to maintain their databases, we'd lose our credit ratings. The same goes for online identities; they persist only as long as the institutions that offer them.
I don't really believe Equifax will go away (without some other responsible authority taking over its databases), so a more pertinent worry is that the government will take ownership of data that companies have promised--or at least, users have assumed--would be confidential. This fear reflects the reality that our online identities are owned by the authorities that grant them. We also fear that companies will mine our data and use it for purposes we haven't authorized.
Earlier, I said that somebody always has a choke point over my identity because a record of me has to be stored somewhere. Technical measures can minimize the control exercised by this choke point, but only by introducing yet another choke point.
The identity, privacy, and reputation communities have built enormously complex technical systems to support their goals, but the roots of the systems can be fairly simply described. They go back to the invention of public-key cryptography and digital signatures.
Public-key cryptography is a historic 1970s-era mathematical breakthrough that lets me encrypt data with one key and publicize a second key for others to decrypt the data. Because no one else knows the first key, my use of the key is like a signature, identifying me as the data's owner. But a signature proves only that the signer holds that key, not who the signer is. Just as you can't trust someone who phones you and says, "I'm calling from your bank," you can't trust a signed email from me unless you have independent confirmation that the signature belongs to the editor from O'Reilly.
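The signing idea can be shown with a toy version of RSA in Python. This is a sketch for illustration only and is not secure: the two primes are publicly known Mersenne primes, there is no padding, and all names (sign, verify, the constants) are hypothetical. Real systems use a vetted cryptographic library. It assumes Python 3.8 or later for the modular inverse.

```python
import hashlib

# Toy key pair. P and Q are well-known Mersenne primes, so this key
# offers no real security; it only demonstrates the mechanism.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                           # public modulus
E = 65537                           # public exponent (published)
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent (kept secret)


def digest(message: bytes) -> int:
    """Hash the message and reduce it into the range of the modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Transform the digest with the private key; only the key holder can."""
    return pow(digest(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Anyone with the public key (N, E) can undo the transform and compare."""
    return pow(signature, E, N) == digest(message)


note = b"Please put me up for the night."
signature = sign(note)
print(verify(note, signature))                   # signature matches the note
print(verify(b"Please wire money.", signature))  # altered text fails to match
```

Note what verification does and does not establish: it shows that whoever signed the note holds the private key paired with (N, E). It says nothing about who that key holder is, which is why some authority still has to vouch for the public key.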
There are several ways, varying in security and convenience, to prove who I am. I could publish my public key on a well-known website such as oreilly.com, which one hopes would not get compromised. I could share my key with friends and with friends of friends, in the hope that eventually one would be friends with you (the web of trust). The security community prefers a system where you and I go through some major organization that we both know. This organization, like a bank that gives me checks and where you can take a check to cash, is a trusted third party.
Federated security systems can be wonderfully flexible and fine-grained. I may be able to submit a message to a bank with several parts, one signed by a trusted third party to prove I'm really me (that's an identity broker), and another signed by another trusted third party to prove I've paid up my mortgage. All these trusted third parties are choke points that raise the same privacy, trust, and persistence questions as the authorities I discussed earlier.
People have dignity and sometimes even good sense. While they are sometimes wantonly careless about their privacy, they seem to have a feeling for the risks of releasing control over their identities online. They've voted with their mouse buttons and rejected services, such as Microsoft's notorious Passport, that seem to want a finger in every transaction.
We are thus at a propitious moment where the major vendors have come together with the researchers around a seemingly viable plan that would preserve users' control over data. They can't violate the basic physical facts about authorities and their dominant role in identities that I laid out in the previous section. There are two roads to user-centric identity:
- Contractual: the authority promises the user not to misuse the data stored with the authority.
- Technical: the software used to store and transmit user data encrypts and digitally signs it in such a way that it is hidden from everyone except the sites at the endpoints with a need to know.
Technical experts tend to like technical solutions, because they consider them harder to overpower. But both roads are necessary. Contracts are necessary so that the two sides can agree on what the technical measures accomplish, and as a fallback when technical measures fail.
The idea of hiding data may seem complex, but imagine going to a conference where you are entitled to certain goods--a copy of the proceedings, say, or a couple of drinks at the reception. When you check in, somebody gives you one red ticket that entitles you to a copy of the proceedings, and two blue tickets that entitle you to free drinks. The person giving you the proceedings or drinks doesn't have to know your name or credentials. Nobody ever has to find out whether you picked up the proceedings or had any drinks.
The analogy is a loose one (and there are still ways in the ticket system for data about you to leak), but online software systems are much more strict and mathematically rigorous in protecting privacy.
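The conference tickets can be sketched as bearer tokens in Python. This is a rough illustration of the analogy, not of the rigorous systems themselves: the class and method names are hypothetical, and the sketch shares the analogy's weakness, since a desk that chose to record which tokens it handed to whom could still link them to a person. Stronger systems rely on cryptography to prevent that linkage.

```python
import secrets


class CheckInDesk:
    """Issues anonymous tickets and honors each one exactly once."""

    def __init__(self):
        # Maps an unspent token to what it is good for. No names are stored.
        self._unspent = {}

    def check_in(self):
        """Hand out one proceedings ticket and two drink tickets."""
        tickets = []
        for entitlement in ("proceedings", "drink", "drink"):
            token = secrets.token_hex(16)  # random, carries no identity
            self._unspent[token] = entitlement
            tickets.append((entitlement, token))
        return tickets

    def redeem(self, token, entitlement):
        """Accept a ticket if it is unspent and of the right kind."""
        if self._unspent.get(token) != entitlement:
            return False
        del self._unspent[token]  # a ticket cannot be used twice
        return True


desk = CheckInDesk()
tickets = desk.check_in()
drink_tokens = [token for kind, token in tickets if kind == "drink"]
print(desk.redeem(drink_tokens[0], "drink"))  # first use is honored
print(desk.redeem(drink_tokens[0], "drink"))  # second use is refused
```

The person pouring the drink checks only the token, never a name or a badge, which is the property the ticket analogy is meant to convey.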
At the other extreme from the technical view I'm presenting here, identity is also viewable as a function of the community you're in. It's other people who accept or reject you, and who determine your reputation. Law professor Kim Taipale said, "Identity is a duality." Because other people demand to know things about you so they can decide how much to trust you, he called identity "a mechanism for managing risk," and continued, "We could claim that society owns your reputation more than you do." Professor Beth Noveck has spun out identity's source in community in legal and historical detail.