Threshold Cryptography

ThresholdCryptography is the art of chopping a secret into little bits.

Only by possessing more than a threshold number of bits of the secret can the secret be determined.

For example if I had four copies of a key and cut each key into three pieces, distributing one piece each to twelve people then it would probably take about five people to use the key. The minimum threshold in this case would be three, and the maximum number of servers to contact would be 9. That's with random distribution. With algorithmic, non-random distribution based on name (e.g. the four lowest in the alphabet get piece1, the next four get piece2, etc.), this can be reduced to exactly three successful contacts necessary to find the whole key.

Algorithms exist to break any secret up such that at least and exactly M out of N holders of pieces of the secret must give approval (and their partial secret or key) in order to compute the total secret (e.g. 3 of 5, 3 of 12, 5 of 12, etc.). Removing probability has a cost, though... a secret must be broken into C(N,M-1) pieces and each holder carries (N-M+1)/N parts of the whole key... so '3 of 12' is more expensive per-node than '5 of 12'. (These numbers come from the pigeonhole principle and constraints: any piece 'pK' must be found on N-M+1 holders so that access to a full M secret-holders guarantees 'pK' will be known, whilst access to M-1 computers must guarantee that there is at least one piece not found, so 'pK' must NOT be with the other M-1 computers. The minimum number of component 'pK' elements to do this is (N) Choose (M-1). Individual pK elements can be made artificially large in order to subvert guessing of one or two missing pieces; the combinator function needn't be straightforward append. However, The computation and storage cost of this approach is high, and it may do well to combine it with some straightforward split-and-distribute as listed above; e.g. splitting the 'require 5 of 7 pieces' to 'more than 7' people is the natural extension to splitting 'require 3 of 3 pieces' to '12 people'. The combined effect can avoid the massive costs of splitting and storing, say '5 of 20' parts (C(20,4) unique parts, every node holding ~16/20ths of total secret, vs. C(7,4) parts with each node holding 3/7ths of total secret). The main advantage of mixing in this algorithmic division approach is in achieving better guarantees as to redundancy and survivability while simultaneously increasing the number of users one must access to possess the whole secret. E.g. for the other approach, to require 5 users would require splitting the key into 5 pieces and divvying that up among, say, 15 people; it would take access to 5 people to gain the secret, and the secret could be lost by losing 3 people. Splitting it to 5 of 7 first, then dividing the 7 chunks among 14 people results in 2 different people having a copy of any given chunk, and the secret won't be lost before losing 6 people (losing three whole chunks).

As a security measure, ThresholdCryptography requires that many systems must be compromised prior to taking control of a secret, inherently including resistance to snooping or abuse by any superusers of the computation resource (who would have the ability to do so if the secret were wholly on one system). It also provides inherent redundancy of the secret... e.g. if you can guarantee that it takes at least and at most 5 of 12 secret-holders to build the secret, you can guarantee that a failure of up to 7 systems is tolerable without failure. With a probabilistic split, you can easily calculate a percentage chance that the data is unavailable for each loss of node... and, with intelligent split of components, you can guarantee that at least some count of nodes must be lost before the data has any chance of being lost.

In the case of authorization to access a different system (e.g. to control a power plant), security can be increased further by demanding that a few parts of the approval come from -particular- people that are known to still be accessible... and by changing these people at regular intervals. This makes it much more difficult to gain access even by compromising the systems... because you can't easily know which particular systems ought to be compromised. E.g. in the split requiring 3 approvals from 12 people, one could conceivably gain access by compromising just 3 systems, and the larger number of systems would conceivably mean that there is less oversight and security on each one. However, if the final guard to the system demands that at least one piece come with signature from 'person X', with a different X at different times (e.g. changing once each 8-hour shift), would mean that compromising just 3 systems might not include person X at least 9/12ths of the time. By jumping upwards to a combination and in number of secret-holders, one reduces the window in a combinatorial manner (e.g. requiring 3 particular people of 14 would have C(14,3) possible combinations and thus compromising exactly 3 systems would only offer a window open 1/364ths of the time. Also, if some systems are known to be especially well guarded against compromise, one could require approval from one of those systems as part of the whole package. The strength of this defense is orthoganal to the number of parts into which the secret is split, but it can only help in cases of where the collective authorization-secret is examined ultimately by yet another secure agent (e.g. each person is giving approval and a partial key to access a system that will reject access until you have both the key and approval from -particular- people).

The strongest reason for using this mechanism over straightforward encryption is that a secret might need to be available to users that can only provide a -certificate- authorizing access to a file or service, and the primary encryption isn't against any key with which individuals share long-term access (there is no shared key). E.g. one can use ThresholdCryptography to encrypt files or split keys requiring, say, either 'Secret' clearance with 'Power Grid' specialization, or 'Top Secret' clearance, represented as a certificate signed by a government master key not in expiration, and any individual that can prove to M of N systems that he or she possesses the necessary clearances will be provided the capability to actually perform the task. Key distribution is a difficult problem, doubly so when you won't trust that any one key distribution server hasn't been compromised; ThresholdCryptography is one of the more elegant answers to that particular problem.

ThresholdCryptography can also be used in PolicingOnlineGames... e.g. to ensure that so many pieces of a puzzle are found before it is even -possible- to put them together to form something meaningful (no cheating!).

http://theory.lcs.mit.edu/~cis/cis-threshold.html
http://www.acsac.org/2001/papers/42.pdf
- doing it with Java
http://crypto.stanford.edu/~dabo/ITTC/
- IntrusionTolerance?
http://www.cs.cornell.edu/Courses/cs513/2000SP/SecretSharingCA.html
- calculating partial signatures on different servers then combining, key parts never in direct contact.
http://munitions2.xs4all.nl/dolphin.cgi?action=render&category=0108 {BrokenLink}
- software for splitting files requiring n people to reassemble
http://www.anr.mcnc.org/projects/Yalta/DC-PI-27Jul2001.pdf
- slides for Yalta a distributed PKI using Java RMI over SSL