What is a great collaborator?

What makes a great collaborator? Is it different than being great on your own? Does one come at the expense of the other? I have a couple examples from popular music I think about frequently...

Eric Clapton is about as celebrated as anyone in modern popular music, but many of his most famous songs are covers or collaborations with other luminaries. He recorded "Layla" with the immortal Duane Allman, several of his hits were covers of songs by the excellent but more obscure J.J. Cale, and even his early group Cream was so named because it was a supergroup.

Run the Jewels provides a different sort of case study because both its members have maintained vigorous parallel solo careers. This is only my opinion, but I think the two members are much better together than alone: Killer Mike is prone to bogging down in southern rap idiom while El-P gets lost in grumpiness and misanthropy. These vices seem to vanish when they are together.

What do you think? Is there such a thing as a great collaborator?

Moving Parts Unknown

A major challenge for businesses of all sizes, and one that will only loom larger, is that information technology is increasingly complex, essential, and opaque. One can read almost every day about a firm that got both more and less than it bargained for from an IT contractor. These stories must be only the tip of the iceberg, as you read about them in the media only when they boil over into lawsuits (like Hertz v. Accenture recently) or when they are intrinsically public (as in the case of Healthcare.gov).

A recent example involving Siemens, an independent contractor, and some subsequent criminal trouble is a great case study in these challenges.


The short story is that a contractor (allegedly) hid a bit of sabotage in their own code in hopes of generating more demand for follow-on work. Siemens noticed they had a problem but didn’t have a real great time figuring out what it was and they were greatly displeased when they did. This is all out in public only because of the ensuing criminal complaint against the contractor.

The idea of hiring a contractor for certain purposes, at least in spirit, is that you need some standard functionality and you don’t want to distract everyone in your organization with the details of how it is getting done under the hood. This presents some danger and requires some trust, however, as it leaves room for malicious action that will be quite difficult to detect - the metaphor “under the hood” only takes us so far and many of us are better equipped to recognize an extra widget bolted to our car engine than we are to sniff out malicious surplus code.

This is an introduction to a subtle, structural challenge in cybersecurity: there are administrative and economic pressures driving decentralization in how code is generated, yet the end product can be very opaque and difficult to audit. And every indication is that these are trends which will continue for a while…

The "why" of our acronyms: PII vs. PHI

by Alexander C. Mueller

You might have a medical diagnosis you find embarrassing or just plain don’t want to talk about. If someone had your medical records, they would certainly find out… but if they had just some, any medical records, without reference to you or any other particular person, then your privacy is secure.

This silly fact highlights the difference between PII and PHI, and why PII is important. PHI, or protected health information, might be described as medical records with enough data to tie them to people in the real world. This connecting data is then PII, or personally identifiable information, pretty much by definition. With your PII, a chart becomes your medical record and fits the definition of PHI; otherwise it is just some medical record, and this extra level of privacy motivates interest in deidentifying datasets.

PHI and PII overlap in usage a ton because PII is a key part of what makes PHI a privacy concern.


by Alexander C. Mueller

What is personally identifiable information, abbreviated P.I.I. or PII, and why is it important?

It’s easiest to break down backwards. First, it is Information, and typically the information so discussed is held by a large corporation or a government agency. Second, it Identifies some individual Person apart from the others. The term PII can sometimes refer by law to specific types of data, but it is used more generally for a broad category of data about everyday people that large organizations commonly end up storing.

Your name is the ultimate everyday example of PII. If you are standing next to someone else, a person who wanted your attention would say your name and not theirs - they’ve just used a small piece of information (your name) to identify you as one person apart from another.

Phone numbers are a bit more interesting. They do have a practical purpose, but they are also a good way to keep two people with the same name from getting confused in your database. Often, a business that collects this information on you is doing it for this sort of reason and not to actually try and call you. Phone number is thus another example of PII, information used to identify one person apart from another.

Thinking about data in this way is valuable because there are many white collar crimes and other misdeeds for which this sort of information is absolutely necessary to get started. Identity theft is the obvious and familiar example. However, there are many more scams you can only begin after you have enough information to target specific individuals and not groups of people. Imagine you are a foreign spy agency looking to recruit informants. Which is more helpful to you: 1) knowing that there are indebted people living in a particular city 2) a list of names, addresses, and phone numbers of indebted people in a particular city?

A Tale of Two Breaches

by Alexander Mueller

Much of our public conversation around cybersecurity and data loss in particular imagines one organization, usually a business, trying to defend its castle full of goodies from the barbarian hackers outside. The reality is that data gets passed around quite a bit, and in 2019 it is lost more often because of mistakes and bad practices around how it was circulated. The public has limited visibility into this circulation, and differences in regulation create drastic differences in who hears about what breach, what firms can be held liable for, and then inevitably in their information security practices and level of care.

On one end of the spectrum, industries without any regulation of their data are almost certainly breached more often than is public and more often than they know themselves. The damage of a breach is typically to consumers and not directly to the company breached, so there is a perverse incentive to avoid discovering breaches if you believe no one else will discover them either. This dynamic is particularly egregious around data collaboration with business partners - in principle, if I give my data to you and you lose it doing something stupid then I am liable as well, but in practice why would anyone want to maintain a bunch of records about who has what just so they can be a liability in court?

This may sound a bit jaded and conspiratorial, but the reality is that for many breaches no one can even say where the data came from originally. These breaches are also lightly publicized because there isn’t much constructive to say about them. There is this illicitly traded database of information on 200 million Americans with no clear provenance - many believe Experian lost this data originally, but this is disputed and to the knowledge of the author Experian has not been proven liable or held accountable in any way. Databases with huge amounts of personal information are often found derelict in the cloud (often with no password!) by security researchers, and invariably it is impossible to find owners for them. This database of medical information found unprotected is one of many examples.

At the other end of the spectrum, firms holding regulated data are in a really painful position because of the data they must share for unavoidable business needs and the difficulty of ensuring that 100% of their data partners are responsible. Good regulations often require firms to maintain records on to whom data is given (HIPAA requires this for example). It is becoming increasingly burdensome for many firms to find enough responsible partners - the nature of your business requires you to share data with partners and if someone else loses it, you are still liable. Cybersecurity in one organization is hard enough!

A great example from just the past few weeks was the data breach at Quest Diagnostics, or perhaps we should say the breach at AMCA. The first breach of the affair to be announced was at the laboratory testing company Quest, but unmentioned or buried deep in many articles was that the breach had actually occurred at AMCA, a collections agency Quest employed. Days later, with considerably less publicity, a larger story emerged about the many firms caught up in the breach that centered on AMCA. Yet it will still be true going forward that Quest will get a big share of attention related to the incident as they are the largest firm involved, the most visible, and the one who originally collected the data from consumers.

At one end of the market, more regulation is sorely needed. At the other end, we must confront the unique and subtle challenges of securing data not just in a firm but across an ecosystem of many firms that must share data as an essential part of their operations. At Capnion we believe that emerging technologies like homomorphic encryption and zero-knowledge proofs are a hand-in-glove solution for helping this latter group of firms collaborate - don’t share more than you need, don’t share anything in the clear, set up a system with just enough information in it for your business process and nothing else.

Save the Deal!

by John Senay

The modern business development manager’s greatest frustration? The inability to share data with a customer.

You spent months looking for a new prospect that could benefit from your company’s product.

Your company has used internal resources at great cost to design and provide the ultimate solution for the prospect to turn them into a high margin customer.

Both your company and your soon-to-be high margin customer see the value of the relationship and need to move forward.

Let’s get the deal done!!

To get the deal done a great amount of data needs to be shared, exchanged, and tracked between your company and the new customer.  To complicate the work flow, the high-margin customer has stated that for the deal to work, certain data has to be shared with 3 different partners in the supply chain with accompanying security and compliance issues.

The above scenario is all too familiar to the business development manager.  In this day and age, business to business sales are complicated by the requirements of sharing of data.  What information do your partners have to have access to?  And who is going to control what, where, and how the 3 different partners use the data?

This situation is becoming the norm for contract acceptance and completion.   

To get the contract signed, someone has to find a way to provide the data needed for the contract terms.

Is there a way to provide the required data in a safe, secure manner for all parties involved in the contract that all the companies’ IT groups can agree upon?

Yes there is!!

Capnion has a suite of cutting-edge encrypted data-in-use tools that allows specific, agreed upon data to be exchanged with all parties involved.   Using our specially generated Answer Keys the appropriate parties can verify or analyze specific data without any need of decrypting it.  At no time does the data ever need to be in plaintext!

Please contact sales@capnion.com for more information on how to meet contract clauses for data sharing obligations.

Get that deal signed today!!

Thanks for reading.

The importance of being Random

by John Senay

What is all the fuss about Random Numbers and how they are generated? What do Random Numbers provide anyway? I know that’s how they pick the winning Lotto numbers.

Cryptography and Encryption use Random Numbers as their most basic building block. Without Random Numbers, Encryption would not be possible. Ghost PII would not be possible, and that would be very bad!

How can you get Random Numbers for the cryptography used in Ghost PII?

Well…you could use an algorithm to create “random” numbers, but research has shown that in certain instances, algorithms can be attacked and cracked. If the researchers have done it once, you know that the researchers will do it again, so Capnion does not use algorithms alone to generate Random Numbers!

What about hardware Random Number generators that utilize the “white noise” a PC produces while running? That is a possibility, but there are humans involved. We have all heard about the attempted and successful backdoors put into hardware by various unscrupulous parties. Not good enough for Ghost PII!!

Then how can you create true Random Numbers? One way is to use the white noise created by the earth’s atmosphere. That’s a great idea! Turn on the radio and feed the static (white noise) into the sound card and create Random Numbers. Maybe, but the earth is so finite. Not good enough for Ghost PII.

All of my life, I have wondered what is out “THERE.” You go out on a clear night, look up in any direction, and you are looking at infinity. Look through a telescope and what you see is a piece of the infinite infinity. Wow! To this day I still cannot fully comprehend the infinity of the Universe. It’s big, but it’s home!

How to use the infinite infinity to generate Random Numbers?

Capnion goes out to a Top Secret location and takes high resolution pictures of the night sky. Capnion calls this process Starlight.

The night sky is always changing due to changes in the atmosphere and even in the light that is arriving from the stars in the sky. The furthest star you can see with the naked eye is V762 Cas in Cassiopeia at 16,308 light-years away! When you look at V762, some of that twinkle you see is 16,308 years old!!

The high resolution pictures are digitized and the Random Numbers that Ghost PII uses for Encryption are generated from the tiny changes in this data.

Using Starlight to generate Random Numbers for Ghost PII is out of this world. It’s truly COSMIC!!

Look for some photos on the website for a Starlight event.

Thanks for reading!!

Protect your data everywhere: at rest, in transit, in use, even in use by 3rd parties

by John Senay

Some good news for everybody: Capnion is proud to announce the private BETA release of Ghost PII. The goal of Ghost PII is to protect YOUR data while in use.

In the next few posts, I will give a high-level overview on how Ghost PII works and present a few application scenarios.

…So let’s get started. Ghost PII actually uses a two-step process to secure plaintext data. The first process used is called a One-Time Pad, or OTP for short. OTP was invented in 1882 by Frank Miller.  That’s right - Ghost PII is built on a process that is 137 years old! Why? In 1949 (70 years ago!) Claude Shannon proved mathematically that OTP is unbreakable when truly random numbers are used to generate the key (and the key is as long as the message and never reused). Why is this important? Quantum computing is on the horizon. QC will provide unparalleled computational power that could break many existing encryption methods. These computers can run unique algorithms and their speed is increasing, with QC firm D-Wave recently announcing they had doubled the power of their previous generation of hardware.  No matter how much computational power is used, with truly random numbers, OTP is unbreakable.
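To make the One-Time Pad idea concrete, here is a minimal sketch in Python. It is not Ghost PII's implementation, only an illustration of the general technique: each byte of the message is XORed with one byte of a key that is as long as the message and used once. The function names are hypothetical, and `secrets.token_bytes` draws from the operating system's random source, which is only a stand-in for the truly random numbers the post talks about.

```python
import secrets
from typing import Tuple


def otp_encrypt(plaintext: bytes) -> Tuple[bytes, bytes]:
    """Encrypt with a fresh key that is exactly as long as the message."""
    # Assumption: the OS random source stands in for a true random source.
    key = secrets.token_bytes(len(plaintext))
    ciphertext = bytes(p ^ k for p, k in zip(plaintext, key))
    return key, ciphertext


def otp_decrypt(key: bytes, ciphertext: bytes) -> bytes:
    """XOR with the same key undoes the encryption."""
    if len(key) != len(ciphertext):
        raise ValueError("key and ciphertext must be the same length")
    return bytes(c ^ k for c, k in zip(ciphertext, key))


message = b"account 1234"
otp_key, otp_ciphertext = otp_encrypt(message)
otp_recovered = otp_decrypt(otp_key, otp_ciphertext)
```

The security argument depends entirely on the key: it must be random, as long as the message, kept secret, and never used for a second message.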

How does Ghost PII generate truly random numbers? And how does Ghost PII make your encrypted data easy to work with?  (Hint: it includes an emerging technology called homomorphic encryption). Sounds like a great pair of lead-ins for another post.

To learn more, contact sales@capnion.com.

Whales and Proof-of-Stake

by Alexander Mueller

Proof-of-stake consensus algorithms are touted as the next big thing in blockchain and not for no reason - concerns about massive electricity usage in existing proof-of-work architecture have taken on macro-scale environmental relevance with the success of Bitcoin and others. Proof-of-stake does not require significant real world resources on the margin, so there are in fact much better prospects for limiting energy utilization. However, moving costs into the virtual sphere presents new dangers, particularly to the extent that this makes it easier for “whales” (big-time owners of huge amounts of cryptocurrency) to manipulate markets.

First, let’s rewind a bit: what is proof-of-stake and how does it differ from the original proof-of-work approach? What is commonly called mining is really participation in maintaining the shared, distributed ledger, and all cryptocurrencies work by creating formidable incentives to participate in this maintenance honestly. In the older proof-of-work model, reaping the rewards of mining requires solving a computationally difficult makework problem. An attempt to participate in mining maliciously might not work, but it would still require the expense of solving this makework problem. This system has worked well, but many have become justifiably uncomfortable directing so much energy to open makework.

Proof-of-stake, on the other hand, requires a “stake” of cryptocurrency rather than a solution to a makework problem. If you behave yourself, you get it back, while you lose it as punishment for bad behavior. It is obvious but important that this system is explicitly oriented towards those who already hold cryptocurrency.
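The stake-and-punish rule described above can be sketched in a few lines of Python. This is a toy model under loud assumptions, not any real network's protocol: the `Validator` class, its method names, and the rule that the whole stake is forfeited on bad behavior are all hypothetical simplifications.

```python
from dataclasses import dataclass


@dataclass
class Validator:
    name: str
    balance: float
    staked: float = 0.0

    def lock_stake(self, amount: float) -> None:
        """Move funds out of the spendable balance and into the stake."""
        if amount <= 0 or amount > self.balance:
            raise ValueError("stake must be positive and covered by balance")
        self.balance -= amount
        self.staked += amount

    def settle(self, behaved_honestly: bool, reward: float = 0.0) -> None:
        """Return the stake (plus a reward) for good behavior, else forfeit it."""
        if behaved_honestly:
            self.balance += self.staked + reward
        # Assumption: misbehavior forfeits the entire stake.
        self.staked = 0.0


honest = Validator("honest", balance=100.0)
honest.lock_stake(40.0)
honest.settle(behaved_honestly=True, reward=1.0)

cheater = Validator("cheater", balance=100.0)
cheater.lock_stake(40.0)
cheater.settle(behaved_honestly=False)
```

Note that only someone who already holds currency can call `lock_stake` at all, which is the orientation towards existing holders that the paragraph above points out.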

There are concerns already that ownership in cryptocurrency is so concentrated - for example, 1,000 addresses control 40% of all Bitcoin, and many smaller currencies are even more dominated by these “whale” crypto one-percenters. (There is also reason to believe that single whales might control multiple addresses and thus more power than is apparent.) These big players have considerable ability to manipulate the price of these assets using the same techniques you would use to manipulate the value of any other asset, and there is some evidence that they are willfully doing so. Unfortunately, there is potential for proof-of-stake to make this situation worse.

The key issue is the role played by transaction volume in evaluating the health and viability of a cryptocurrency. Networks that are processing lots of transactions, so the conventional wisdom goes, are really being put to work by someone out there and are thus likely to stick around. The danger here is that someone might find a way to submit a macro-scale volume of transactions to and from accounts they control. One can potentially create an appearance of traction and relevance that is not there, and typically this would mean an increase in price.

If I am both a big-time holder and a big-time miner, the fees attached to a transaction may be fees I end up paying to myself. Proof-of-work has an answer to this problem, as I am still on the hook for the electricity bill whether or not my manipulation scheme is successful. Proof-of-stake might not have an answer - the whole point is to eliminate real-world, physical costs, but these costs were also a principal barrier to manipulative self-dealing. Whale miners could very well find themselves in a position where they have every incentive to pay themselves to process transactions from themselves to themselves. On the outside, it looks like blossoming commerce, but the reality is that it is accounting manipulation on the books of a single large whale.

The environmental concerns attached to proof-of-work are real, but equally real are the concerns about how proof-of-stake might make a network more vulnerable to manipulation.

Introducing the Lay Person's PII Encryption Blog

by John Senay

In my forty plus years in the computer/network/Internet industry, I have always strived to deliver enabling solutions for both business and the individual, starting with Visicalc and ending up with CPASS, with lots of interesting technologies in between. In the back of my mind I have always wanted to help the whole world. Grandioso, sure, but I am determined to do it.

This is my last rodeo before I ride off into the sunset. What to do to help the world? I think I have found it. No, I know I have found it.

There is a huge problem that threatens the very core of humanity. Almost every person on earth has their Personally Identifiable Information (PII) in one or more databases, in plain text, exposed to the Internet. Not a day goes by that one does not hear about a database breach somewhere in the world. Clearly this IS the opportunity to help almost every human being on earth.

I have found a company that shares my vision, Capnion. I have joined Capnion as COO (Chief Old Officer). Capnion has devised a method to protect PII in every database in the world. The short description is that Capnion can keep the PII encrypted in the database at all times, but allow companies to perform the needed updates and analysis, with the PII in the encrypted state. PII is NEVER in plain text.

In the coming weeks I will discuss how Capnion accomplishes keeping PII encrypted.

Thanks for reading.

John J. Senay




Humble Bundle: What counts as a breach?

A recent breach at Humble Bundle, marketer of discount computer games, exposes some interesting subtleties about what “data breach” should mean. You can get full details at the link below.


In this case, an attacker entered Humble Bundle’s system but was not able to carry off information wholesale. They did, however, exploit a flaw in Humble Bundle’s code that allowed them to answer a number of yes-or-no questions about Humble Bundle’s customers. The attackers essentially worked their way down a list of emails extracting information for each on whether that email was attached to an active subscription.

This provides a good illustration of a general principle: the more information the bad guys have, the more they can get. Absent other information, the hacker’s exploit of Humble Bundle is pretty useless… given a good guess at all the emails that might have a Humble Bundle subscription, the hacker’s exploit is as good as making off with the full list.
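The principle can be sketched in Python. Everything here is hypothetical: the email addresses, the function names, and the in-memory set standing in for Humble Bundle's private data. The point is only that a flaw answering one yes-or-no question per email becomes a full list once the attacker brings a good list of guesses.

```python
# Hypothetical private data the attacker cannot read directly.
ACTIVE_SUBSCRIBERS = {"alice@example.com", "carol@example.com"}


def has_active_subscription(email: str) -> bool:
    """Stand-in for the flawed code path: answers one yes-or-no question."""
    return email in ACTIVE_SUBSCRIBERS


def enumerate_subscribers(candidate_emails):
    """Work down a list of guesses, keeping every email that gets a 'yes'."""
    return [email for email in candidate_emails if has_active_subscription(email)]


# With no candidate list the exploit yields nothing...
nothing_learned = enumerate_subscribers([])

# ...but with a good guess list (say, from an earlier breach elsewhere)
# it recovers the subscriber list one question at a time.
candidates = [
    "alice@example.com",
    "bob@example.com",
    "carol@example.com",
    "dave@example.com",
]
leaked = enumerate_subscribers(candidates)
```

Rate limiting and identical responses for known and unknown emails are the usual defenses against this kind of enumeration.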

Every little bit of information that can be kept out of the hands of criminals is meaningful.

What is(n't) blockchain to infosec? Part Four: A Data Breach Liability

Blockchain is hyped as a security panacea but blockchains really create as many problems as they solve. Blockchain does nothing in particular to keep criminals from stealing sensitive data and to the extent that a blockchain duplicates data across multiple servers (which any blockchain by its nature must to some degree) it is really just creating more surface area and more breach risk. None of these problems are insoluble, but none of the solutions are uniquely blockchain solutions. They are solutions for the servers that make up the blockchain and work as they would be applied on a conventional centralized server architecture.

Bitcoin, as the original blockchain, is a natural case study. Anyone can download the software to set up a mining node on whatever computer they like - that computer is now both part of the blockchain and the same old-school centralized computer it was before. All of the data of the Bitcoin blockchain will be available on the file system of that machine as other files are, and if this were sensitive information it could be stolen by hackers just like any other information. The computer on which the node was set up is not instantly more secure for having this data on it and could be riddled with malware, worms, and you-name-it other security liabilities. It is actually pretty easy to see the consequences of this situation, as blockchain transactions are really quite public, Bitcoin is stolen all the time via other software lurking and interfering with users’ wallets, and so on.

Because a blockchain inevitably duplicates data across nodes, a blockchain is only as safe from breach as its least secure node - a theft from one node is as good as a theft from any other. The nodes are just centralized servers themselves, so there is nothing qualitatively different about blockchain breach security either. To make a blockchain secure from data breach, one needs to make its nodes secure from data breach.

It is here that subtle security liabilities become apparent. If you are considering moving data from a purely centralized system to a blockchain, you can only expect to maintain whatever level of breach security you had before, and you can only achieve this level of security by implementing your old security uniformly on each node. This sounds easy enough, but it will take you away from decentralization - a centrally mandated list of security practices for each node, if successfully implemented, is well down the road to putting the blockchain under centralized governance, and we saw earlier in this series that some of blockchain’s desirable properties actually depend on some level of distrust or at least non-collusion between nodes.

Blockchain does have some interesting security properties, notably that it provides strong tamper-resistance via immutability as discussed in Part One. However, these emerge from distrust between nodes and are weakened if all the nodes are within one organization and subject to centralized governance.

Coming up in Part Five, we’ll continue to explore this same theme, that a blockchain is a group of centralized servers and thus retains limitations of centralized servers, as we examine various (bogus) computing performance claims around blockchain. Also, be on the lookout for an upcoming post about Capnion’s approach to these breach liability problems and how they can be solved in blockchain and non-blockchain contexts.

What is(n't) blockchain to infosec? Part Three: Not quantum at all

Blockchain is often presented as a solution to the problems that quantum computing will pose for cryptography and these claims are false.  Blockchain has no particular relationship to quantum computing whatsoever.  Some very brief backstory: encryption methods are invariably built on a  math problem that is believed to be difficult and quantum computers are a sophisticated new technology that promises to make some of these math problems less difficult.  Although there is considerable propaganda out there to the contrary, blockchain is built on top of the same cryptographic techniques as everything else, it will be compromised by quantum computing like everything else, and if it gets patched up it will likely be in the same way as everything else.

As we discussed briefly in Part One, blockchain is not an innovation in cryptography per se but a distributed program built with heavy use of existing cryptography.  Notably, blockchain makes heavy use of cryptographic hash functions and asymmetric (public and private) key encryption.  Any frequent user of a cryptocurrency is implicitly familiar with the latter as it is important to keep track of one's private keys, which essentially give ownership of accounts with currency, while the corresponding public keys represent one's identity on the ledger.  The public vs. private key algorithms used in most blockchains are not unique at all but are well-tested algorithms, even down to the level of particular implementations, that are applied many other places.

It is the public vs. private key algorithms that are threatened by quantum computing, and as blockchain is using the same algorithms it is also threatened.  It is well beyond the scope of this post to give the details of how quantum computing works, but for those readers who want to do their own Googling we will hit some of the high points.  Asymmetric key cryptography is typically built around some form of the discrete logarithm problem, a type of math problem involving certain computations that are efficient in one direction but very difficult to invert.  The security of the cipher is dependent on this difficulty, and the danger of quantum computing is that it allows new algorithms that make this inversion much easier.
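The one-way flavor of the discrete logarithm problem can be seen in a small Python sketch. The prime, base, and secret exponent below are toy values chosen for illustration (real systems use numbers hundreds of digits long, or elliptic curves), and the function name is hypothetical. The forward direction is a single call to Python's built-in three-argument `pow`; the inverse here is plain trial and error.

```python
def discrete_log_brute_force(g: int, h: int, p: int):
    """Find some x with g**x % p == h by trying exponents one at a time."""
    value = 1  # g**0
    for x in range(p - 1):
        if value == h:
            return x
        value = (value * g) % p
    return None


p = 1_000_003       # a small prime; toy-sized on purpose
g = 2               # assumed base, not necessarily a generator
secret = 123_456    # the "private key"

# Forward direction: fast even for enormous numbers.
public = pow(g, secret, p)

# Inverse direction: up to about a million multiplications even at this
# toy size, and the work grows with p rather than with the digits of p.
found = discrete_log_brute_force(g, public, p)
```

The exponent found may differ from `secret` if the base is not a generator, but it always satisfies `pow(g, found, p) == public`, which is all an attacker needs. Better classical algorithms than brute force exist, but none known are efficient at real key sizes; the quantum threat is that Shor's algorithm would be.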

For the moment, though, there is no danger.  The engineering difficulty around building a practical quantum computer is vast and existing (extraordinarily expensive) prototypes contain just a few quantum-analog logic gates.  Even as they improve, there will be a long period where very few actors have real access to them - they will be a sort of cryptographic nuclear weapon.  There are also new cryptographic techniques, notably lattice-based cryptography, that may prove more resistant to quantum attacks.  Blockchains could easily be fixed by swapping in these new parts, but it would be the new components resisting quantum attacks and not any aspect of the blockchain algorithm itself.

To recap, blockchain has nothing to do with quantum computing and won't do anything on its own to protect you from quantum attacks.  Next in Part Four, I will talk about how blockchain doesn't do anything itself to protect your data from theft and rather presents significant new liabilities.

What is(n't) blockchain to infosec? Part Two: Consensus, but maybe not what you wanted.

An important part of a blockchain is the process by which the nodes come to an agreement about what to add to the ledger - you might call this a consensus process.  This consensus process has implicitly gotten a lot of attention, both in the abstract context of the "Byzantine generals" problem and for its application in business settings.  Unfortunately, much of this attention is not justified by the reality of the algorithm.

The core of the consensus algorithm, the part that actually makes a decision per se, is a simple, familiar majority vote.  If 51% of the nodes agree that the next block in the chain should look a particular way, that is the consensus verdict.  The 51% number gives its name to the "51% attack" where a sufficiently large (51% or more) group of nodes ("miners" in the cryptocurrency context) can make the blockchain ledger anything they want, and this attack is thus often discussed in the context of centralization in cryptocurrency mining.  That there is danger in too much friendliness between nodes is a point we will visit again.
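The majority vote described above, and why a colluding majority can dictate the ledger, can be sketched in Python. This is a simplification that follows the post's framing of one node, one vote; the function name and the block labels are hypothetical.

```python
from collections import Counter


def consensus_block(votes):
    """Return the proposal backed by a strict majority of nodes, else None."""
    if not votes:
        return None
    block, count = Counter(votes).most_common(1)[0]
    return block if count * 2 > len(votes) else None


# Nine honest nodes and two dishonest ones: the honest block wins.
healthy = consensus_block(["honest_block"] * 9 + ["forged_block"] * 2)

# Six colluding nodes out of eleven: the forged block is now "the truth".
attacked = consensus_block(["honest_block"] * 5 + ["forged_block"] * 6)

# An even split produces no verdict at all.
deadlocked = consensus_block(["honest_block"] * 5 + ["forged_block"] * 5)
```

Nothing in the vote itself distinguishes an honest majority from a colluding one, which is why the independence of the nodes matters so much.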

The original blockchain architecture hardens its consensus process by making suggesting a new block expensive, originally via the "proof-of-work" concept.  To submit a new block, a node must also solve a computationally expensive (and notably, totally useless otherwise) cryptographic problem.  This deters bad guys who might set up malicious nodes, as running nodes is expensive and running enough nodes to approach the 51% number is extremely expensive.
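A minimal Python sketch of the proof-of-work idea follows. It is in the spirit of Bitcoin's scheme but is not its actual block format: the function names, the 8-byte nonce encoding, and the difficulty of 16 leading zero bits are assumptions chosen so the example runs in well under a second.

```python
import hashlib


def meets_target(block_data: bytes, nonce: int, difficulty_bits: int) -> bool:
    """Check that SHA-256(block_data + nonce) starts with enough zero bits."""
    digest = hashlib.sha256(block_data + nonce.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big") >> (256 - difficulty_bits) == 0


def mine(block_data: bytes, difficulty_bits: int) -> int:
    """Try nonces one by one until the hash meets the target (the expensive part)."""
    nonce = 0
    while not meets_target(block_data, nonce, difficulty_bits):
        nonce += 1
    return nonce


block = b"alice pays bob 5"
winning_nonce = mine(block, difficulty_bits=16)

# Checking someone else's work takes a single hash.
verified = meets_target(block, winning_nonce, difficulty_bits=16)
```

The asymmetry is the point: finding the nonce takes tens of thousands of hashes on average at this difficulty, and each extra bit doubles the work, while any other node can verify the result with one hash.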

An important subtext in these last two paragraphs is that we hope our nodes distrust each other.  We hope they can't collude and that they are checking each other's "proof-of-work" homework.  Blockchain is not quite the "trustless" innovation advertised, but something powered by distrust.

Elsewhere, you can find a lot of skeptical commentary on "private" blockchains and it all relates to the distrust issue discussed above.  If you are trying to run a blockchain inside a single business, it is a liability that your nodes might be run by people who all work for the same people and drink together after work.  You might be running an architecture that is expensive from a computational standpoint and really is just a simple majority vote without the distrust among nodes that makes it work.  Much of the interest in blockchain from large businesses revolves around its functionality as a consensus process, but it is not a very good consensus process if implemented inside a single business.

Blockchain is also not the comprehensive solution to the "Byzantine generals' problem" (a landmark type of problem in network communication) that it is sometimes proclaimed to be.  We have seen above that it requires a specific sort of human context to work appropriately.  It also has more subtle problems, too subtle to discuss here but embodied in debacles like the $70 million DAO hack.  The short story is that this hack exploited confusion about just which nodes had what information and when, and this is the essence of the problem facing the Byzantine generals.

Thus, blockchain is in part an interesting and novel consensus process, but this consensus process depends on human context and has technical limitations.  Commentary on blockchain is often hagiographic and careless with both of these. 

In Part One, we talked about what blockchain definitely offers: immutability.  Here in Part Two, we talked about what blockchain is and isn't on its own: a consensus protocol.  In the following installments, we'll start to examine the things blockchain definitely is not, beginning with a discussion of how blockchain does not offer any unique safe harbor from the security problems posed by quantum computing.

What is(n't) blockchain to infosec? Part One: Blockchain is Immutability

The central, novel property of blockchain is immutability.  This means that records cannot be changed once they are accepted, and implicit here also is that records receive a timestamp that cannot be changed once it is agreed upon.  Immutability was key to the success of the Bitcoin network as it guaranteed there would be no tampering with the older sections of the ledger.  Immutability is also the property by which blockchain can offer something genuinely new to information security, but to see clearly how this works we should first examine some basic concepts in cryptography and blockchain architecture.

The mysticism around blockchain imagines it as being everywhere and nowhere, but a blockchain is tangibly a group of databases on a number of different computers, commonly called nodes, communicating constantly via a cryptographic protocol in order to make sure they are all keeping records in the same way.  This protocol isn't an innovation in cryptography in the pure sense, but really a bunch of old ingredients linked together, including a very common ingredient called a cryptographic hash function.  Informally, the important properties of these hash functions are that 1) they are easy to compute, but very difficult to invert, 2) their output depends chaotically on every little bit of input, and 3) they (hopefully almost) never produce the same output given two different inputs.  If I give you an output from a hash function, you are going to have a hell of a time finding an input that produces this output unless I give you mine.
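Property 2) is easy to see for yourself.  The sketch below uses SHA-256 from Python's standard hashlib module as a representative hash function; the helper name sha256_hex and the two sample records are invented for the example.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Hash a string with SHA-256 and return the 64-character hex digest."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# Two records that differ by a single character.
first = sha256_hex("pay Alice 10")
second = sha256_hex("pay Alice 11")

print(first)
print(second)

# Count the positions where the two digests disagree.
differing = sum(a != b for a, b in zip(first, second))
print(differing, "of", len(first), "hex digits differ")
```

Going the other way, from a digest back to "pay Alice 10", has no known shortcut beyond guessing inputs, which is property 1) at work.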

So thus far we have databases sharing cryptographic information to try to stay on the same page.  This could be a big headache if we are working with a lot of data, and this is where the blocks, their arrangement in a chain, and our hash function get put to work.  All the data goes into blocks as we get it, and we put the data of the (n-1)-th block into a hash function and include this output in the n-th block.  Because of property 2) listed above, this means that any change in a block will radically change all the hashes appearing in all later blocks, while properties 1) and 3) make it extremely difficult to cheat and come up with new, fake blocks that prevent these radical changes.  Thus, the efforts of our many databases to stay on the same page need focus only on the most recent block, as any violation of prior consensus will upset the hash appearing in this most recent block.  This resistance to changes in prior consensus, which we previously called immutability, is what makes the whole enterprise workable, both computationally and on the level of human trust.
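The chaining described above fits in a short Python sketch.  Everything here is a simplification for illustration: the dictionary layout, the all-zero placeholder hash for the first block, and the function names build_chain, head_hash and is_valid are assumptions, not the format of any real blockchain.

```python
import hashlib

GENESIS = "0" * 64  # placeholder "previous hash" for the very first block


def block_hash(prev_hash: str, data: str) -> str:
    """Hash a block's full contents: its data plus the hash it carries."""
    return hashlib.sha256(f"{prev_hash}|{data}".encode("utf-8")).hexdigest()


def build_chain(records):
    """Each block stores the hash of the block before it."""
    chain, prev = [], GENESIS
    for data in records:
        chain.append({"prev_hash": prev, "data": data})
        prev = block_hash(prev, data)
    return chain


def head_hash(chain):
    """Hash of the most recent block (chain must be non-empty)."""
    last = chain[-1]
    return block_hash(last["prev_hash"], last["data"])


def is_valid(chain):
    """Recompute every link and compare with the stored hashes."""
    prev = GENESIS
    for block in chain:
        if block["prev_hash"] != prev:
            return False
        prev = block_hash(block["prev_hash"], block["data"])
    return True


chain = build_chain(["alice pays bob 5", "bob pays carol 2", "carol pays dan 1"])
agreed_head = head_hash(chain)
print("valid before tampering:", is_valid(chain))

# Quietly rewrite the oldest record in place.
chain[0]["data"] = "alice pays bob 500"
print("valid after tampering:", is_valid(chain))

# Rebuilding every later block hides the broken link, but the head hash moves.
forged = build_chain(["alice pays bob 500", "bob pays carol 2", "carol pays dan 1"])
print("forged head matches agreed head:", head_hash(forged) == agreed_head)
```

Nodes that agree on the head hash have therefore implicitly agreed on every block behind it, which is why attention can stay on the most recent block.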

Immutability is useful in information security because it guarantees tamper resistance.  We might want to ensure that malicious actors are not doctoring our records towards their own ends, and the transmission of hashes from one block to the next ensures that this is very difficult or impossible.  But in the blockchain context, immutability depends on the collaboration of the nodes of our network and their consensus process.  The role played by this consensus process, and how the consensus process determines what blockchain can and can't do for information security, will be the topic of the next post in this series.

Singapore, Healthcare Consolidation, and Data Security

Singapore Health Services was recently hit by a massive breach with 1.5 million records lost.  Although 1.5 million would still be an eye-popping number in the United States, in Singapore this breach affects one in four citizens - comparable to a breach affecting 80+ million Americans.  This 80 million number seems hard to imagine, but it is becoming more and more plausible as ongoing consolidation in healthcare drives ever greater centralization in data storage.

In just the past few months, Cigna has purchased Express Scripts for $67 bn and CVS has bought Aetna for $69 bn, while the widespread expectation is that the approval of the AT&T and Time Warner merger can only prompt more consolidation.  The time is coming when breaches at single firms, including healthcare firms holding medical information, will compromise big percentages or even majorities of American consumers.