ABCs of PII

by Alexander C. Mueller

What is personally identifiable information, abbreviated P.I.I. or PII, and why is it important?

It’s easiest to break down backwards. First, it is Information, and typically the information so discussed is held by a large corporation or a government agency. Second, it Identifies some individual Person apart from the others. The term PII can sometimes refer by law to specific types of data, but it is also used more loosely for a broad category of data about everyday people that large organizations commonly end up storing.

Your name is the ultimate everyday example of PII. If you are standing next to someone else, a person who wanted your attention would say your name and not theirs - they’ve just used a small piece of information (your name) to identify you as one person apart from another.

Phone numbers are a bit more interesting. They do have a practical purpose, but they are also a good way to keep two people with the same name from getting confused in your database. Often, a business that collects this information on you is doing it for this sort of reason and not to actually try to call you. A phone number is thus another example of PII: information used to identify one person apart from another.
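A minimal sketch of that idea in Python follows. The customer records, field names, and helper functions are all invented for illustration. It only shows how a name alone can match two people while a name plus a phone number picks out one.

```python
# Hypothetical customer records; every name and number here is made up.
customers = [
    {"name": "Maria Garcia", "phone": "555-0101", "balance": 120.00},
    {"name": "Maria Garcia", "phone": "555-0199", "balance": 45.50},
    {"name": "John Smith", "phone": "555-0142", "balance": 0.00},
]

def find_by_name(records, name):
    # A name alone may match more than one person.
    return [r for r in records if r["name"] == name]

def find_by_name_and_phone(records, name, phone):
    # Adding the phone number narrows the match to a single person here.
    return [r for r in records if r["name"] == name and r["phone"] == phone]

ambiguous = find_by_name(customers, "Maria Garcia")
unique = find_by_name_and_phone(customers, "Maria Garcia", "555-0199")

print(len(ambiguous), "records share the name")
print(len(unique), "record matches name and phone")
```

The business never has to dial the number for it to be useful. It serves purely as a second key.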

Thinking about data in this way is valuable because there are many white collar crimes and other misdeeds for which this sort of information is absolutely necessary to get started. Identity theft is the obvious and familiar example. However, there are many more scams you can only begin after you have enough information to target specific individuals and not groups of people. Imagine you are a foreign spy agency looking to recruit informants. Which is more helpful to you: 1) knowing that there are indebted people living in a particular city, or 2) having a list of names, addresses, and phone numbers of indebted people in that city?

Zero-knowledge proofs and the total-knowledge status quo

There are also all sorts of processes out there that are really about proof but are rarely stated this way. When you call the bank and verify your identity with your mother's maiden name, they are not interested in the name per se but in the proof that you know it. Record linkage processes behind the scenes essentially operate on proof, done in the CPU of a computer, that a collection of records all refer to the same person in real life - what a person's name actually is doesn't matter, only how it corresponds to the names in other records. It's not a term in wide circulation, but you might call these total-knowledge proofs, in that the names themselves are exposed in the process.
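To make the "total-knowledge" point concrete, here is a small Python sketch of name-based record linkage. The datasets, IDs, and function names are hypothetical, and real linkage systems use far more forgiving matching than this. The thing to notice is that the linking code only needs to know whether two names correspond, yet it sees every name in the clear.

```python
def normalize(name):
    # Lowercase and collapse whitespace so trivial formatting
    # differences do not block a match.
    return " ".join(name.lower().split())

def link_records(left, right):
    """Return (left_id, right_id) pairs whose normalized names are equal."""
    index = {}
    for rec in right:
        index.setdefault(normalize(rec["name"]), []).append(rec["id"])
    pairs = []
    for rec in left:
        for right_id in index.get(normalize(rec["name"]), []):
            pairs.append((rec["id"], right_id))
    return pairs

# Invented example data from two hypothetical organizations.
bank = [
    {"id": "B1", "name": "Dana  Whitfield"},
    {"id": "B2", "name": "Omar Reyes"},
]
clinic = [
    {"id": "C7", "name": "dana whitfield"},
    {"id": "C9", "name": "Priya Nair"},
]

links = link_records(bank, clinic)
print(links)
```

The output of the process is just the list of ID pairs, but every plaintext name had to be handed over to produce it.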

There is a cryptographic technique called a zero-knowledge proof that allows these linkages and verifications to be performed without revealing anything about the data in question beyond the fact being proved. Zero-knowledge proofs are a natural fit for the P.I.I. (personally identifiable information) held by businesses about consumers, as this information is rarely of interest in its own right but is instead used for the sort of matching and identification mentioned above. Capnion's position is that these zero-knowledge methods should replace their total-knowledge counterparts throughout the economy, eliminating the need for many businesses to ever hold unencrypted data on consumers.
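As a rough illustration of the idea, the Python sketch below walks through one round of the Schnorr identification protocol, a classic textbook proof of knowledge that is zero-knowledge against an honest verifier. It is not a description of Capnion's methods. The group parameters are toy-sized and completely insecure, and the helper that turns a security answer into a secret number is a hypothetical convenience. In practice a guessable answer could still be attacked by trying candidates against the public value.

```python
import hashlib
import secrets

# Toy parameters, far too small for real use: G has prime order Q modulo P.
P, Q, G = 23, 11, 2

def secret_from_answer(answer):
    # Hypothetical helper: map a security answer to an exponent in [1, Q - 1].
    digest = hashlib.sha256(answer.strip().lower().encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (Q - 1) + 1

def public_value(x):
    # What the verifier stores instead of the secret itself.
    return pow(G, x, P)

def commit():
    # Prover picks a fresh random nonce r and sends t = G^r mod P.
    r = secrets.randbelow(Q)
    return r, pow(G, r, P)

def challenge():
    # Verifier replies with a random challenge c.
    return secrets.randbelow(Q)

def respond(r, c, x):
    # Prover's response; the random nonce r masks the secret x.
    return (r + c * x) % Q

def verify(y, t, c, s):
    # Accept if G^s == t * y^c (mod P), which holds when s = r + c*x.
    return pow(G, s, P) == (t * pow(y, c, P)) % P

x = secret_from_answer("Whitfield")   # known only to the prover
y = public_value(x)                   # held by the verifier
r, t = commit()
c = challenge()
s = respond(r, c, x)
print("accepted:", verify(y, t, c, s))
```

The verifier ends up convinced that the caller knows the secret behind `y`, but it only ever sees `t`, `c`, and `s`, never the answer itself.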

What is P.I.I. (a.k.a. PII or personally identifiable information)?

An important category of data is personally identifiable information, often referred to as P.I.I. or PII, and its name accurately suggests what it is: information that can be used to identify an individual person. There are many very familiar examples, like name, social security number, and address. Some more arcane examples are the sorts of things one needs to supply as a secondary verification of identity at the bank, such as mother's maiden name. P.I.I. is often an explicitly spelled-out regulatory category, but there are a number of pieces of information that are considered P.I.I. across jurisdictions, and the philosophy defining P.I.I. is consistent even when the level of inclusiveness is not. (Is the name of your first pet personally identifiable information?)

P.I.I. is important because privacy is not just about what information is available, but about what information can be tied back to an individual. Medical records provide a great example. If someone steals your medical records, this is perhaps not so bad if they lack the information to tie these records back to you - from their perspective, they don't have your medical records but only some unknown person's medical records.
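One way to picture this is to store the identifying fields and the medical fields separately, joined only by a random token. The Python sketch below does that with an invented record. The field names and the `split_record` helper are assumptions made for illustration. Note also that removing direct identifiers is not always enough, since combinations of remaining details can sometimes point back to a person.

```python
import secrets

# Fields treated as P.I.I. in this sketch; the choice is an assumption.
PII_FIELDS = ("name", "phone", "address")

def split_record(record, pii_fields=PII_FIELDS):
    """Split one record into an identity part and a medical part.

    The two parts share only a random token, so the medical part alone
    does not say whose record it is.
    """
    token = secrets.token_hex(8)
    identity = {"token": token}
    medical = {"token": token}
    for key, value in record.items():
        if key in pii_fields:
            identity[key] = value
        else:
            medical[key] = value
    return identity, medical

# Invented example record.
record = {
    "name": "Dana Whitfield",
    "phone": "555-0101",
    "address": "12 Elm Street",
    "diagnosis": "seasonal allergies",
    "prescription": "antihistamine",
}

identity, medical = split_record(record)
print(medical)
```

Someone who steals only the `medical` part holds a diagnosis attached to a meaningless token. Tying it back to a person requires the `identity` part as well.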