Shamir's Secret Sharing

Shamir's Secret Sharing, formulated by Adi Shamir, is one of the first secret sharing schemes in cryptography. It is based on polynomial interpolation over finite fields.

High-level explanation

Shamir's Secret Sharing (SSS) is used to secure a secret in a distributed way, most often to secure other encryption keys. The secret is split into multiple parts, called shares. These shares are used to reconstruct the original secret.

To unlock the secret via Shamir's secret sharing, a minimum number of shares is needed. This number is called the threshold. An adversary who discovers fewer shares than the threshold gains no additional information about the secret; this is called perfect secrecy. In this sense, SSS is a generalisation of the one-time pad (which is effectively SSS with a two-share threshold and two shares in total).

Let us walk through an example:

Problem: Company XYZ needs to secure their vault's passcode. They could use something standard, such as AES, but what if the holder of the key is unavailable or dies? What if the key is compromised via a malicious hacker or the holder of the key turns rogue, and uses their power over the vault to their benefit?

This is where SSS comes in. It can be used to encrypt the vault's passcode and generate a number of shares, some of which are allocated to each executive within Company XYZ. Only by pooling their shares can the executives unlock the vault. The threshold can be set appropriately for the number of executives, so that the vault can always be accessed by the authorized individuals. Should a share or two fall into the wrong hands, the holder could not recover the passcode unless the other executives cooperated.

Mathematical formulation

Shamir's Secret Sharing is an ideal and perfect $\left(k,n\right)$ -threshold scheme. In such a scheme, the aim is to divide a secret $S$ (for example, the combination to a safe) into $n$ pieces of data $S_{1},\ldots ,S_{n}$ (known as shares) in such a way that:

Knowledge of any $k$ or more $S_{i}$ pieces makes $S$ easily computable. That is, the complete secret $S$ can be reconstructed from any combination of $k$ pieces of data.
Knowledge of any $k-1$ or fewer $S_{i}$ pieces leaves $S$ completely undetermined, in the sense that the possible values for $S$ seem as likely as with knowledge of $0$ pieces. The secret $S$ cannot be reconstructed with fewer than $k$ pieces.

If $n=k$ , then every piece of the original secret $S$ is required to reconstruct the secret.

Figure: One can draw an infinite number of polynomials of degree 2 through 2 points. 3 points are required to define a unique polynomial of degree 2. This image is for illustration purposes only; Shamir's scheme uses polynomials over a finite field, which cannot be represented on a 2-dimensional plane.

The essential idea of the scheme is based on the Lagrange interpolation theorem, specifically that $k$ points are enough to uniquely determine a polynomial of degree less than or equal to $k-1$ . For instance, 2 points are sufficient to define a line, 3 points are sufficient to define a parabola, 4 points to define a cubic curve and so forth. We assume our secret $S$ can be represented as an element $a_{0}$ of a finite field $GF(q)$ . We choose at random $k-1$ elements, $a_{1},\cdots ,a_{k-1}$ , from $GF(q)$ and construct the polynomial $f\left(x\right)=a_{0}+a_{1}x+a_{2}x^{2}+a_{3}x^{3}+\cdots +a_{k-1}x^{k-1}$ . Let us construct any $n$ points out of it, for instance set $i=1,\ldots ,n$ to retrieve $\left(i,f\left(i\right)\right)$ . Every participant is given a point (a non-zero integer input to the polynomial, and the corresponding integer output). Given any subset of $k$ of these pairs, we can obtain $a_{0}$ using interpolation, with one possible formulation as below:

$f(0)=\sum _{j=0}^{k-1}y_{j}\prod _{\begin{smallmatrix}m\,=\,0\\m\,\neq \,j\end{smallmatrix}}^{k-1}{\frac {x_{m}}{x_{m}-x_{j}}}$ .
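The formula above follows from evaluating the Lagrange form of $f$ at $x=0$. A short derivation, using the same symbols as the text; the name $\ell_{j}$ for the Lagrange basis polynomials is introduced here for convenience, and every division is understood as multiplication by the inverse in $GF(q)$ (the $x_{j}$ are distinct, so no denominator is zero):

```latex
\begin{aligned}
f(x) &= \sum_{j=0}^{k-1} y_{j}\,\ell_{j}(x),
\qquad
\ell_{j}(x) = \prod_{\substack{m=0\\ m\neq j}}^{k-1} \frac{x - x_{m}}{x_{j} - x_{m}},\\[6pt]
f(0) &= \sum_{j=0}^{k-1} y_{j} \prod_{\substack{m=0\\ m\neq j}}^{k-1} \frac{0 - x_{m}}{x_{j} - x_{m}}
      = \sum_{j=0}^{k-1} y_{j} \prod_{\substack{m=0\\ m\neq j}}^{k-1} \frac{x_{m}}{x_{m} - x_{j}}.
\end{aligned}
```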

Usage

Example

The following example illustrates the basic idea. Note, however, that calculations in the example are done using integer arithmetic rather than finite field arithmetic. Therefore the example below does not provide perfect secrecy and is not a true example of Shamir's scheme. The problem is explained afterwards, together with the right way to implement the scheme (using finite field arithmetic).

Preparation

Suppose that our secret is 1234 $(S=1234)\,\!$ .

We wish to divide the secret into 6 parts $(n=6)\,\!$ , where any subset of 3 parts $(k=3)\,\!$ is sufficient to reconstruct the secret. At random we obtain $k-1$ numbers: 166 and 94.

$(a_{0}=1234;a_{1}=166;a_{2}=94)$ , where $a_{0}$ is the secret.

Our polynomial to produce secret shares (points) is therefore:

$f(x)=1234+166x+94x^{2}$

We construct six points $D_{x-1}=(x,f(x))$ from the polynomial:

$D_{0}=(1,1494);D_{1}=(2,1942);D_{2}=(3,2578);D_{3}=(4,3402);D_{4}=(5,4414);D_{5}=(6,5614)$

We give each participant a different single point (both $x\,\!$ and $f(x)\,\!$ ). Because we use $D_{x-1}$ instead of $D_{x}$ the points start from $(1,f(1))$ and not $(0,f(0))$ . This is necessary because $f(0)$ is the secret.
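The share construction can be reproduced with a few lines of Python. This is a minimal sketch that uses plain integer arithmetic, exactly as this simplified example does; the names `f` and `points` are chosen here for illustration.

```python
def f(x):
    """Example polynomial f(x) = 1234 + 166x + 94x^2 over the integers."""
    return 1234 + 166 * x + 94 * x ** 2

# Participants receive the points for x = 1..6.
# x = 0 is never handed out, because f(0) is the secret itself.
points = [(x, f(x)) for x in range(1, 7)]
print(points)
# [(1, 1494), (2, 1942), (3, 2578), (4, 3402), (5, 4414), (6, 5614)]
```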

Reconstruction

Any 3 points are enough to reconstruct the secret.

Consider $\left(x_{0},y_{0}\right)=\left(2,1942\right);\left(x_{1},y_{1}\right)=\left(4,3402\right);\left(x_{2},y_{2}\right)=\left(5,4414\right)\,\!$ .

We will compute Lagrange basis polynomials:

$\ell _{0}(x)={\frac {x-x_{1}}{x_{0}-x_{1}}}\cdot {\frac {x-x_{2}}{x_{0}-x_{2}}}={\frac {x-4}{2-4}}\cdot {\frac {x-5}{2-5}}={\frac {1}{6}}x^{2}-{\frac {3}{2}}x+{\frac {10}{3}}$

$\ell _{1}(x)={\frac {x-x_{0}}{x_{1}-x_{0}}}\cdot {\frac {x-x_{2}}{x_{1}-x_{2}}}={\frac {x-2}{4-2}}\cdot {\frac {x-5}{4-5}}=-{\frac {1}{2}}x^{2}+{\frac {7}{2}}x-5$

$\ell _{2}(x)={\frac {x-x_{0}}{x_{2}-x_{0}}}\cdot {\frac {x-x_{1}}{x_{2}-x_{1}}}={\frac {x-2}{5-2}}\cdot {\frac {x-4}{5-4}}={\frac {1}{3}}x^{2}-2x+{\frac {8}{3}}$

Therefore

${\begin{aligned}f(x)&=\sum _{j=0}^{2}y_{j}\cdot \ell _{j}(x)\\[6pt]&=y_{0}\ell _{0}(x)+y_{1}\ell _{1}(x)+y_{2}\ell _{2}(x)\\[6pt]&=1942\left({\frac {1}{6}}x^{2}-{\frac {3}{2}}x+{\frac {10}{3}}\right)+3402\left(-{\frac {1}{2}}x^{2}+{\frac {7}{2}}x-5\right)+4414\left({\frac {1}{3}}x^{2}-2x+{\frac {8}{3}}\right)\\[6pt]&=1234+166x+94x^{2}\end{aligned}}$

Recall that the secret is the free coefficient, which means that $S=1234\,\!$ , and we are done.
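The reconstruction can be checked mechanically. The sketch below rebuilds all coefficients of the interpolating polynomial from the three chosen points, using exact rational arithmetic from Python's `fractions` module so that no rounding occurs. The helper names `poly_mul` and `lagrange_polynomial` are illustrative and not part of any library.

```python
from fractions import Fraction

def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def lagrange_polynomial(points):
    """Return the interpolating polynomial's coefficients, lowest degree first."""
    coeffs = [Fraction(0)] * len(points)
    for j, (xj, yj) in enumerate(points):
        basis = [Fraction(1)]  # becomes ell_j(x), built factor by factor
        for m, (xm, _) in enumerate(points):
            if m != j:
                # Multiply by the linear factor (x - xm) / (xj - xm).
                basis = poly_mul(basis, [Fraction(-xm, xj - xm), Fraction(1, xj - xm)])
        # Add y_j * ell_j(x) to the running sum.
        for i, c in enumerate(basis):
            coeffs[i] += yj * c
    return coeffs

coeffs = lagrange_polynomial([(2, 1942), (4, 3402), (5, 4414)])
print([int(c) for c in coeffs])  # [1234, 166, 94]
```

The first coefficient is the free term, that is, the secret.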

Computationally efficient approach

Since the goal of the polynomial interpolation is only to find the constant term of the source polynomial, $S=f(0)$ , using the Lagrange polynomials "as is" is not efficient, because unused coefficients are also calculated.

An optimized approach to use Lagrange polynomials to find $f(0)$ is defined as follows:

$f(0)=\sum _{j=0}^{k-1}y_{j}\prod _{\begin{smallmatrix}m\,=\,0\\m\,\neq \,j\end{smallmatrix}}^{k-1}{\frac {x_{m}}{x_{m}-x_{j}}}$
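A direct transcription of this formula, as a sketch: it evaluates only $f(0)$ and never forms the other coefficients. Exact rationals are used because the worked example is over the integers; the function name `secret_at_zero` is made up for this illustration.

```python
from fractions import Fraction

def secret_at_zero(points):
    """Evaluate the interpolating polynomial at x = 0 only."""
    total = Fraction(0)
    for j, (xj, yj) in enumerate(points):
        term = Fraction(yj)
        for m, (xm, _) in enumerate(points):
            if m != j:
                term *= Fraction(xm, xm - xj)  # factor x_m / (x_m - x_j)
        total += term
    return total

print(secret_at_zero([(2, 1942), (4, 3402), (5, 4414)]))  # 1234
```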

Problem

Although the simplified version of the method demonstrated above, which uses integer arithmetic rather than finite field arithmetic, works fine, there is a security problem: Eve gains a lot of information about $S$ with every $D_{i}$ that she finds.

Suppose that she finds the 2 points $D_{0}=(1,1494)$ and $D_{1}=(2,1942)$ . She still doesn't have $k=3$ points, so in theory she shouldn't have gained any information about $S$ . But she combines the information from the 2 points with the public information: $n=6,k=3,f(x)=a_{0}+a_{1}x+\cdots +a_{k-1}x^{k-1},a_{0}=S,a_{i}\in \mathbb {N}$ , and she:

(i) fills the $f(x)$ -formula with $S$ and the value of $k$ : $f(x)=S+a_{1}x+\cdots +a_{3-1}x^{3-1}\Rightarrow f(x)=S+a_{1}x+a_{2}x^{2}$
(ii) fills (i) with the values of $D_{0}$ 's $x$ and $f(x)$ : $1494=S+a_{1}\cdot 1+a_{2}\cdot 1^{2}\Rightarrow 1494=S+a_{1}+a_{2}$
(iii) fills (i) with the values of $D_{1}$ 's $x$ and $f(x)$ : $1942=S+a_{1}\cdot 2+a_{2}\cdot 2^{2}\Rightarrow 1942=S+2a_{1}+4a_{2}$
(iv) does (iii)-(ii): $(1942-1494)=(S-S)+(2a_{1}-a_{1})+(4a_{2}-a_{2})\Rightarrow 448=a_{1}+3a_{2}$ and rewrites this as $a_{1}=448-3a_{2}$
(v) knows that $a_{2}\in \mathbb {N}$ , so she starts replacing $a_{2}$ in (iv) with 0, 1, 2, 3, ... to find all possible values for $a_{1}$ :
- $a_{2}=0\rightarrow a_{1}=448-3\times 0=448$
- $a_{2}=1\rightarrow a_{1}=448-3\times 1=445$
- $a_{2}=2\rightarrow a_{1}=448-3\times 2=442$
- $\vdots$
- $a_{2}=148\rightarrow a_{1}=448-3\times 148=4$
- $a_{2}=149\rightarrow a_{1}=448-3\times 149=1$
After $a_{2}=149$ she stops, because continuing would give negative values for $a_{1}$ (which is impossible because $a_{1}\in \mathbb {N}$ ). She can now conclude $a_{2}\in [0,1,\dots ,148,149]$
(vi) replaces $a_{1}$ by (iv) in (ii): $1494=S+(448-3a_{2})+a_{2}\Rightarrow S=1046+2a_{2}$
(vii) replaces in (vi) $a_{2}$ by the values found in (v), so she gets $S\in [1046+2\times 0,1046+2\times 1,\dots ,1046+2\times 148,1046+2\times 149]$ , which leads her to the information:

$S\in [1046,1048,\dots ,1342,1344]$ .

She now only has 150 numbers to guess from instead of an infinite number of natural numbers.
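Eve's reasoning is easy to automate. The sketch below enumerates every natural-number value of $a_{2}$ that keeps $a_{1}$ non-negative and collects the corresponding candidate secrets. The constant 3 comes from $2^{2}-1^{2}$, so the loop is specific to these two points; the variable names are illustrative.

```python
# Eve's two known points from the integer-arithmetic example.
(x1, y1), (x2, y2) = (1, 1494), (2, 1942)

candidates = []
a2 = 0
while True:
    a1 = (y2 - y1) - 3 * a2          # a1 = 448 - 3*a2
    if a1 < 0:                       # a1 must be a natural number
        break
    candidates.append(y1 - a1 - a2)  # S = 1494 - a1 - a2 = 1046 + 2*a2
    a2 += 1

print(len(candidates), candidates[0], candidates[-1])  # 150 1046 1344
```

The true secret 1234 is among the candidates, so two shares have reduced Eve's search to 150 values.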

Solution

Figure: A polynomial curve over a finite field. The order of the polynomial now has seemingly little to do with the shape of the graph.

Geometrically, this attack exploits the fact that we know the order of the polynomial and so gain insight into the paths it may take between known points. This reduces the possible values of unknown points, since they must lie on a smooth curve.

This problem can be fixed by using finite field arithmetic. A field of size $p\in \mathbb {P} :p>S,p>n$ is used. The graph shows a polynomial curve over a finite field; in contrast to the usual smooth curve, it appears very disorganised and disjointed.

In practice this is only a small change: we choose a prime $p$ that is bigger than the number of participants and every $a_{i}$ (including $a_{0}=S$ ), and we calculate the points as $(x,f(x){\bmod {p}})$ instead of $(x,f(x))$ .

Since everyone who receives a point also has to know the value of $p$ , it may be considered to be publicly known. Therefore, one should select a value for $p$ that is not too low.

For this example we choose $p=1613$ , so our polynomial becomes $f(x)=1234+166x+94x^{2}{\bmod {1613}}$ which gives the points: $(1,1494);(2,329);(3,965);(4,176);(5,1188);(6,775)$
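The reduced points can be recomputed as follows. This is a small sketch with the example's coefficients hard-coded; `P` and `f_mod` are names chosen here.

```python
P = 1613  # the prime chosen in the text

def f_mod(x, p=P):
    """f(x) = 1234 + 166x + 94x^2, reduced modulo p."""
    return (1234 + 166 * x + 94 * x * x) % p

points = [(x, f_mod(x)) for x in range(1, 7)]
print(points)
# [(1, 1494), (2, 329), (3, 965), (4, 176), (5, 1188), (6, 775)]
```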

This time Eve doesn't gain any information when she finds a $D_{x}$ (until she has $k$ points).

Suppose again that Eve finds $D_{0}=\left(1,1494\right)$ and $D_{1}=\left(2,329\right)$ , this time the public info is: $n=6,k=3,p=1613,f(x)=a_{0}+a_{1}x+\dots +a_{k-1}x^{k-1}\mod {p},a_{0}=S,a_{i}\in \mathbb {N}$ so she:

(i) fills the $f(x)$ -formula with $S$ and the values of $k$ and $p$ : $f(x)=S+a_{1}x+\dots +a_{3-1}x^{3-1}{\bmod {1613}}\Rightarrow f(x)=S+a_{1}x+a_{2}x^{2}-1613m_{x}:m_{x}\in \mathbb {N}$
(ii) fills (i) with the values of $D_{0}$ 's $x$ and $f(x)$ : $1494=S+a_{1}\cdot 1+a_{2}\cdot 1^{2}-1613m_{1}\Rightarrow 1494=S+a_{1}+a_{2}-1613m_{1}$
(iii) fills (i) with the values of $D_{1}$ 's $x$ and $f(x)$ : $329=S+a_{1}\cdot 2+a_{2}\cdot 2^{2}-1613m_{2}\Rightarrow 329=S+2a_{1}+4a_{2}-1613m_{2}$
(iv) does (iii)-(ii): $(329-1494)=(S-S)+(2a_{1}-a_{1})+(4a_{2}-a_{2})+(1613m_{1}-1613m_{2})\Rightarrow -1165=a_{1}+3a_{2}+1613(m_{1}-m_{2})$ and rewrites this as $a_{1}=-1165-3a_{2}-1613(m_{1}-m_{2})$
(v) knows that $a_{2}\in \mathbb {N}$ , so she starts replacing $a_{2}$ in (iv) with 0, 1, 2, 3, ... to find all possible values for $a_{1}$ :
- $a_{2}=0\rightarrow a_{1}=-1165-3\times 0-1613(m_{1}-m_{2})=-1165-1613(m_{1}-m_{2})$
- $a_{2}=1\rightarrow a_{1}=-1165-3\times 1-1613(m_{1}-m_{2})=-1168-1613(m_{1}-m_{2})$
- $a_{2}=2\rightarrow a_{1}=-1165-3\times 2-1613(m_{1}-m_{2})=-1171-1613(m_{1}-m_{2})$
- $\vdots$

This time she can't stop, because $(m_{1}-m_{2})$ could be any integer (even negative if $m_{2}>m_{1}$ ), so there is an infinite number of possible values for $a_{1}$ . She knows that $[-1165,-1168,-1171,\ldots ]$ always decreases by 3, so if $1613$ were divisible by $3$ she could at least conclude which remainder $a_{1}$ leaves modulo $3$ ; but because $1613$ is prime she can't conclude even that, and so she gains no information.
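The same conclusion can be checked by brute force. The sketch below assumes Eve holds the shares $(1,1494)$ and $(2,329)$ and tries every candidate secret in $0,\ldots ,1612$. For each one it solves the two share equations modulo 1613 for $(a_{1},a_{2})$. The system always has exactly one solution, because its determinant, 2, is invertible modulo a prime greater than 2. Every candidate is therefore consistent with what Eve knows. Variable names are illustrative.

```python
P = 1613
known = [(1, 1494), (2, 329)]   # the two shares Eve holds
INV2 = pow(2, P - 2, P)         # inverse of 2 mod P (Fermat's little theorem)

consistent = 0
for s in range(P):
    # Solve  a1 + a2 = 1494 - s  and  2*a1 + 4*a2 = 329 - s  (mod P).
    r1 = (known[0][1] - s) % P
    r2 = (known[1][1] - s) % P
    a2 = ((r2 - 2 * r1) * INV2) % P
    a1 = (r1 - a2) % P
    # Confirm that (s, a1, a2) reproduces both known shares.
    if all((s + a1 * x + a2 * x * x) % P == y for x, y in known):
        consistent += 1

print(consistent)  # 1613: no candidate secret is ruled out
```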

Python example

"""
The following Python implementation of Shamir's Secret Sharing is
released into the Public Domain under the terms of CC0 and OWFa:
https://creativecommons.org/publicdomain/zero/1.0/
http://www.openwebfoundation.org/legal/the-owf-1-0-agreements/owfa-1-0

See the bottom few lines for usage. Tested on Python 2 and 3.
"""

from __future__ import division
from __future__ import print_function

import random
import functools

# 12th Mersenne Prime
# (for this application we want a known prime number as close as
# possible to our security level; e.g.  desired security level of 128
# bits -- too large and all the ciphertext is large; too small and
# security is compromised)
_PRIME = 2 ** 127 - 1
# 13th Mersenne Prime is 2**521 - 1

_RINT = functools.partial(random.SystemRandom().randint, 0)

def _eval_at(poly, x, prime):
    """Evaluates polynomial (coefficient tuple) at x, used to generate a
    shamir pool in make_random_shares below.
    """
    accum = 0
    for coeff in reversed(poly):
        accum *= x
        accum += coeff
        accum %= prime
    return accum

def make_random_shares(secret, minimum, shares, prime=_PRIME):
    """
    Generates a random shamir pool for a given secret, returns share points.
    """
    if minimum > shares:
        raise ValueError("Pool secret would be irrecoverable.")
    poly = [secret] + [_RINT(prime - 1) for i in range(minimum - 1)]
    points = [(i, _eval_at(poly, i, prime))
              for i in range(1, shares + 1)]
    return points

def _extended_gcd(a, b):
    """
    Division in integers modulus p means finding the inverse of the
    denominator modulo p and then multiplying the numerator by this
    inverse (Note: inverse of A is B such that A*B % p == 1) this can
    be computed via extended Euclidean algorithm
    http://en.wikipedia.org/wiki/Modular_multiplicative_inverse#Computation
    """
    x = 0
    last_x = 1
    y = 1
    last_y = 0
    while b != 0:
        quot = a // b
        a, b = b, a % b
        x, last_x = last_x - quot * x, x
        y, last_y = last_y - quot * y, y
    return last_x, last_y

def _divmod(num, den, p):
    """Compute num / den modulo prime p

    To explain what this means, the return value will be such that
    the following is true: den * _divmod(num, den, p) % p == num
    """
    inv, _ = _extended_gcd(den, p)
    return num * inv

def _lagrange_interpolate(x, x_s, y_s, p):
    """
    Find the y-value for the given x, given k (x, y) points;
    k points define a polynomial of degree at most k - 1.
    """
    k = len(x_s)
    assert k == len(set(x_s)), "points must be distinct"
    def PI(vals):  # upper-case PI -- product of inputs
        accum = 1
        for v in vals:
            accum *= v
        return accum
    nums = []  # avoid inexact division
    dens = []
    for i in range(k):
        others = list(x_s)
        cur = others.pop(i)
        nums.append(PI(x - o for o in others))
        dens.append(PI(cur - o for o in others))
    den = PI(dens)
    num = sum([_divmod(nums[i] * den * y_s[i] % p, dens[i], p)
               for i in range(k)])
    return (_divmod(num, den, p) + p) % p

def recover_secret(shares, prime=_PRIME):
    """
    Recover the secret from share points
    (x, y points on the polynomial).
    """
    # The threshold is not stored in the shares; 3 matches minimum=3 in main().
    if len(shares) < 3:
        raise ValueError("need at least three shares")
    x_s, y_s = zip(*shares)
    return _lagrange_interpolate(0, x_s, y_s, prime)

def main():
    """Main function"""
    secret = 1234
    shares = make_random_shares(secret, minimum=3, shares=6)

    print('Secret:                                                     ',
          secret)
    print('Shares:')
    if shares:
        for share in shares:
            print('  ', share)

    print('Secret recovered from minimum subset of shares:             ',
          recover_secret(shares[:3]))
    print('Secret recovered from a different minimum subset of shares: ',
          recover_secret(shares[-3:]))

if __name__ == '__main__':
    main()

Properties

Some of the useful properties of Shamir's $\left(k,n\right)\,\!$ threshold scheme are:

Secure: The scheme has information-theoretic security.
Minimal: The size of each piece does not exceed the size of the original data.
Extensible: When $k$ is kept fixed, $D_{i}$ pieces can be dynamically added or deleted without affecting the other pieces.
Dynamic: Security can be easily enhanced without changing the secret, by occasionally changing the polynomial (keeping the same free term) and constructing new shares for the participants.
Flexible: In organizations where hierarchy is important, each participant can be given a different number of pieces according to their importance inside the organization. For instance, the president can unlock the safe alone, whereas 3 secretaries are required together to unlock it.
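The Extensible and Dynamic properties can be illustrated with the small field used earlier ($p=1613$, $S=1234$). This is a sketch only: `eval_poly` and `recover` are helper names invented here, and a real deployment would use a much larger prime.

```python
import random

P = 1613
SECRET = 1234

def eval_poly(coeffs, x, p=P):
    """Evaluate a polynomial (lowest degree first) at x modulo p (Horner's rule)."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % p
    return acc

def recover(points, p=P):
    """Lagrange interpolation at x = 0 modulo the prime p."""
    total = 0
    for j, (xj, yj) in enumerate(points):
        num, den = 1, 1
        for m, (xm, _) in enumerate(points):
            if m != j:
                num = (num * xm) % p
                den = (den * (xm - xj)) % p
        # Division by den is multiplication by its modular inverse.
        total = (total + yj * num * pow(den, p - 2, p)) % p
    return total

# Extensible: a seventh share is one more evaluation of the same polynomial.
coeffs = [SECRET, 166, 94]
shares = [(x, eval_poly(coeffs, x)) for x in range(1, 7)]
new_share = (7, eval_poly(coeffs, 7))
assert recover([shares[0], shares[3], new_share]) == SECRET

# Dynamic: keep the free term, draw fresh higher coefficients, reissue all shares.
rng = random.SystemRandom()
fresh = [SECRET] + [rng.randrange(P) for _ in range(2)]
fresh_shares = [(x, eval_poly(fresh, x)) for x in range(1, 7)]
assert recover(fresh_shares[:3]) == SECRET
```

Old and refreshed shares belong to different polynomials and should not be mixed when reconstructing.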

A known issue in Shamir's Secret Sharing scheme is verifying the correctness of the retrieved shares during the reconstruction process; schemes that address this are known as verifiable secret sharing. Verifiable secret sharing aims to verify that shareholders are honest and are not submitting fake shares.
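To see why verification matters, consider what plain reconstruction does with a forged share. In the sketch below (same small field as the earlier example; `recover` is an illustrative helper), changing a single $y$-value by one silently produces a different "secret", and nothing in the reconstruction signals the error.

```python
P = 1613

def recover(points, p=P):
    """Lagrange interpolation at x = 0 modulo the prime p."""
    total = 0
    for j, (xj, yj) in enumerate(points):
        num, den = 1, 1
        for m, (xm, _) in enumerate(points):
            if m != j:
                num = (num * xm) % p
                den = (den * (xm - xj)) % p
        total = (total + yj * num * pow(den, p - 2, p)) % p
    return total

honest = [(2, 329), (4, 176), (5, 1188)]
tampered = [(2, 329), (4, 176), (5, 1189)]  # last y-value altered by one

print(recover(honest))    # 1234
print(recover(tampered))  # a different value, with no error raised
```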

References

Shamir, Adi (1979), "How to share a secret", Communications of the ACM, 22 (11): 612–613, doi:10.1145/359168.359176, S2CID 16321225.

Liu, C. L. (1968), Introduction to Combinatorial Mathematics, New York: McGraw-Hill.
Dawson, E.; Donovan, D. (1994), "The breadth of Shamir's secret-sharing scheme", Computers & Security, 13: 69–78, doi:10.1016/0167-4048(94)90097-3.
Knuth, D. E. (1997), The Art of Computer Programming, II: Seminumerical Algorithms (3rd ed.), Addison-Wesley, p. 505.

Benzekki, K. (2017), "A Verifiable Secret Sharing Approach for Secure MultiCloud Storage", In Ubiquitous Networking, Lecture Notes in Computer Science, Casablanca: Springer, 10542: 225–234, doi:10.1007/978-3-319-68179-5_20, ISBN 978-3-319-68178-8.

External links

Shamir's Secret Sharing in the Crypto++ library
Shamir's Secret Sharing Scheme (ssss) – a GNU GPL implementation
sharedsecret – implementation in Go
s4 - online shamir's secret sharing tool utilizing HashiCorp's shamir secret sharing algorithm