4.5.5 Shamir's algorithm

In 1982, A. Shamir published an algorithm that broke the basic Merkle-Hellman cryptosystem [SH82]. The algorithm attempts to find a trapdoor pair (U, M). With (U, M) and the public key B, a superincreasing sequence A can be constructed, and every message p encrypted with B can then be deciphered.

Recall how a Merkle-Hellman key pair is generated:
Choose a superincreasing sequence A = (a₁, ..., a_n), a modulus M with M > a₁ + ... + a_n and a multiplier W with 1 <= W < M. Additionally, W must be relatively prime to M, i.e. gcd(M, W) = 1. This choice of W guarantees an inverse element U such that UW = 1 (mod M). M, W and the sequence A constitute the private key, whereas the public key is the sequence of integers B = (b₁, ..., b_n) with b_i = a_iW mod M.

Provided that M and W are known, the components a_i can be computed by a_i = b_iW^-1 mod M = b_iU mod M.

Consider the three private keys k_priv1, k_priv2 and k_priv3 with

key	A = (a₁, a₂, a₃)	M	W	U = W^-1 (mod M)
k_priv1	2, 4, 8	23	3	8 (8*3 = 1 mod 23)
k_priv2	1, 2, 6	35	6	6 (6*6 = 1 mod 35)
k_priv3	2, 4, 7	40	23	7 (7*23 = 1 mod 40)

Get the corresponding public keys by computing b_i = a_iW mod M. The public keys are

key	b₁ = a₁W mod M	b₂ = a₂W mod M	b₃ = a₃W mod M
k_pub1	6 = 2*3 mod 23	12 = 4*3 mod 23	1 = 8*3 mod 23
k_pub2	6 = 1*6 mod 35	12 = 2*6 mod 35	1 = 6*6 mod 35
k_pub3	6 = 2*23 mod 40	12 = 4*23 mod 40	1 = 7*23 mod 40

The keys k_pub1, k_pub2 and k_pub3 are identical, although three different private keys were used. Obviously, there are many private keys producing the same public key. Indeed, as will turn out later, there are infinitely many private keys corresponding to a public key.
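The two tables can be checked with a few lines of Python. This is only an illustrative sketch; the function name public_key is an assumption and not part of the original text.

```python
def public_key(A, M, W):
    """Merkle-Hellman public key: b_i = a_i * W mod M."""
    return [a * W % M for a in A]

# The three private keys (A, M, W) from the table above.
private_keys = {
    "k_priv1": ([2, 4, 8], 23, 3),
    "k_priv2": ([1, 2, 6], 35, 6),
    "k_priv3": ([2, 4, 7], 40, 23),
}

public_keys = {name: public_key(A, M, W)
               for name, (A, M, W) in private_keys.items()}

for name, B in public_keys.items():
    print(name, "->", B)          # each line ends with [6, 12, 1]
```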

Every pair (U, M) can be used to get the (secret) set A, provided that

A = (a₁, ...,a_n) with a_i = b_iU mod M is a superincreasing sequence and
a₁ + ... + a_n < M
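The two conditions can be tested mechanically. The following sketch assumes a helper named recover_sequence (a hypothetical name) that returns A when (U, M) is a trapdoor pair for B and None otherwise. It is applied to the public key (6, 12, 1) from the example above.

```python
def recover_sequence(B, U, M):
    """Return A = (b_i * U mod M) if it is superincreasing with sum < M, else None."""
    A = [b * U % M for b in B]
    total = 0
    for a in A:
        if a <= total:            # each element must exceed the sum of its predecessors
            return None
        total += a
    return A if total < M else None

B = [6, 12, 1]
print(recover_sequence(B, 8, 23))   # [2, 4, 8]
print(recover_sequence(B, 6, 35))   # [1, 2, 6]
print(recover_sequence(B, 7, 40))   # [2, 4, 7]
print(recover_sequence(B, 5, 23))   # None: (7, 14, 5) is not superincreasing
```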

The aim of the presented algorithm is to compute a trapdoor pair, such that the resulting set A = {a_i = b_iU mod M}_i=1...n is a superincreasing one. The algorithm is divided into two parts:

find a few small intervals in [0, 1[; the ratio U/M must lie in one of them
carry out a finer analysis: find a subinterval such that U/M lies in this subinterval and (U, M) produces a superincreasing sequence.

To make the analysis easier, the following values are used:

a₁ is a (dn - n) bit number
a_i is a (dn - n + i - 1) bit number
a_n is a (dn - 1) bit number
M₀ is a dn bit number

The number d is the proportionality constant, which expresses the ratio of ciphertext size to plaintext size.
The components b_i are expected to be numbers equal in size to M, so they are dn bit numbers, as is M.

Setting n = 100 and d = 2, the components of A increase in size from 100 bits for a₁ up to 199 bits for a₁₀₀.
The modulus M then is a 200 bit number.

Let a public key B = (b₁, ..., b_n) be given. Let M₀ be the (unknown) modulus and W₀ the (unknown) multiplier used to construct the public key B. The inverse element of W₀ modulo M₀ is denoted by U₀.

Part one: Get a few small intervals in [0, 1[

For each b_i of the public key, define a function f_i with the variable U:

f_i(U) = b_iU mod M₀

Let U range over arbitrary non-negative real values. As U is less than M₀, the functions are considered on [0, M₀[. For U = U₀, function i produces exactly a_i:

a_i = f_i(U₀) = b_iU₀ mod M₀

As the public key consists of n components, n functions are defined.

The graph of function i is as follows:

[Figure: graph of f_i(U) = b_iU mod M₀ on [0, M₀[]

Properties of function i:

f_i has a sawtooth form
The minima of f_i are located at U = 0, M₀/b_i, 2M₀/b_i, 3M₀/b_i, ... (b_i -1)M₀/b_i
f_i has a total of b_i minima in [0, M₀[
the distance between two successive minima is M₀/b_i
the slope is b_i
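The listed properties can be verified numerically. The sketch below uses exact rational arithmetic and a toy key that is an assumption made for illustration only (it does not appear in the original text and ignores the bit-size conventions above): A = (2, 3, 7, 14), M₀ = 61, W₀ = 17, U₀ = 18, which gives b₁ = 34.

```python
from fractions import Fraction

def f(b, U, M0):
    """Sawtooth function f_i(U) = b_i * U mod M0 for rational U."""
    return (b * Fraction(U)) % M0

# Toy values (assumption, not from the text): b1 = 2 * 17 mod 61 = 34.
b1, M0, U0 = 34, 61, 18

# The b1 minima of f_1 in [0, M0[ are located at k * M0 / b1.
minima = [Fraction(k * M0, b1) for k in range(b1)]

print(f(b1, U0, M0))                                   # 2, which is a_1
print(all(f(b1, m, M0) == 0 for m in minima))          # True
print(minima[1] - minima[0] == Fraction(M0, b1))       # True
```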

Now analyze f₁(U) = b₁U mod M₀:
The modulus M₀ is greater than the sum of all a_i. So M₀ is much bigger than a₁.
The integer a₁ is a dn-n bit number, M₀ a dn bit number, as is b₁. So the unknown U₀ transforms a dn bit number into a dn-n bit number. The following image illustrates that the unknown U₀ must be located near a minimum of the first function. Because f₁ has slope b₁ and f₁(U₀) = a₁, the distance from U₀ to the nearest minimum on its left is a₁/b₁, which is less than about 2^{-n}.

Doing the same analysis with function f₂ gives the same result: the unknown U₀ is again near a minimum of the second function. As a₂ is a dn-n+1 bit number, U₀ transforms a dn bit number into a dn-n+1 bit number. For that reason, U₀ is located at most 2^{-n+1} to the right of a minimum of f₂.

Since U₀ is located near a minimum of f₁ as well as of f₂, there are minima of f₁ and f₂ that are very close to one another. The minimum of f₂ is at most 2^{-n+1} to the left and at most 2^{-n} to the right of the minimum of f₁.
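A small numerical check makes the closeness argument concrete. The sketch assumes the same toy private key as an illustration (A = (2, 3, 7, 14), M₀ = 61, W₀ = 17, U₀ = 18; not from the original text, and far too small to follow the bit-size conventions). It shows that the gap between U₀ and the nearest minimum of f_i to its left is exactly a_i/b_i.

```python
from fractions import Fraction
from math import floor

# Toy private key (assumption for illustration only).
A, M0, W0, U0 = [2, 3, 7, 14], 61, 17, 18
B = [a * W0 % M0 for a in A]                 # public key: [34, 51, 58, 55]

gaps = []
for a, b in zip(A, B):
    k = floor(Fraction(b * U0, M0))          # index of the minimum left of U0
    left_min = Fraction(k * M0, b)           # its location k * M0 / b_i
    gaps.append(U0 - left_min)
    print(b, k, gaps[-1], gaps[-1] == Fraction(a, b))
```

Every line printed ends with True: since b_iU₀ = kM₀ + a_i, the gap U₀ - kM₀/b_i equals a_i/b_i.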

The same analysis holds for f₃, f₄, ...
U₀ is located near a minimum of all these functions, so the minima are very close to each other. The problem of finding U₀ itself can be reformulated as the problem of finding accumulation points of minima of the various functions.

Let k be the number of functions used to find the accumulation points. Consider the p-th minimum of f₁, located at U = pM₀/b₁. The minimum of f_i nearest to pM₀/b₁ lies in the interval [pM₀/b₁ - M₀/(2b_i), pM₀/b₁ + M₀/(2b_i)], because the distance between two successive minima of f_i is M₀/b_i. Assume now that the positions of these nearest minima are random variables, uniformly distributed over that interval. Then the probability that the minima of f₂, ..., f_k are all sufficiently close to the p-th minimum of f₁ can be estimated by 2^{-n+1} × 2^{-n+2} × ... × 2^{-n+k-1}, which is about 2^{-kn+n+k²/2}. As there are b₁ different minima of f₁ and b₁ is about 2^{dn}, the expected number of accumulation points is b₁ × 2^{-n+1} × 2^{-n+2} × ... × 2^{-n+k-1}, which is about 2^{dn-kn+n+k²/2}. This expression is less than 1 if (k-d-1)n > k²/2. For large n, this inequality holds whenever k > d+1. With d = 2, only 4 functions have to be analyzed to keep the expected number of accumulation points small. Here, d is again the proportionality constant.
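The estimate is easy to evaluate for concrete parameters. The sketch below simply codes the two expressions from the paragraph above; the function names are hypothetical.

```python
def expected_log2(n, d, k):
    """log2 of the estimated number of accumulation points: dn - kn + n + k^2/2."""
    return d * n - k * n + n + k * k / 2

def functions_needed(n, d):
    """Smallest k with (k - d - 1) * n > k^2 / 2, or None if no k <= n works."""
    for k in range(2, n + 1):
        if (k - d - 1) * n > k * k / 2:
            return k
    return None

print(functions_needed(100, 2))     # 4
print(expected_log2(100, 2, 4))     # -92.0, i.e. about 2^-92 expected points
print(expected_log2(100, 2, 3))     # 4.5, so three functions are not enough
```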

But before the accumulation points can be computed, one problem remains: the functions are defined modulo M₀, and this number is unknown. The locations of the minima of the i-th function depend on the slope, which is b_i. By dividing both coordinates of the i-th function by M₀, new functions f_i are defined as follows: f_i(V) = b_iV mod 1, with the new variable V = U/M₀ in [0, 1[. The slope is still b_i and the number of minima remains b_i. The minima are now located at V = 0, 1/b_i, 2/b_i, ..., (b_i-1)/b_i, so the distance between two successive minima is 1/b_i. All horizontal distances shrink by the factor M₀; because M₀ is a dn bit number, the former distance 2^{-n} is reduced to about 2^{-dn-n}.

The unknown value V₀ = U₀/M₀ now lies at most 2^{-dn-n} to the right of a minimum of function f₁.
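The effect of the normalisation can be checked with exact fractions. The sketch again assumes the illustrative toy key A = (2, 3, 7, 14), M₀ = 61, U₀ = 18 with public key B = (34, 51, 58, 55), which is not taken from the original text. At V₀ = U₀/M₀ the normalised functions take the values a_i/M₀, and the gap to the minimum on the left is a_i/(b_iM₀).

```python
from fractions import Fraction
from math import floor

def f_norm(b, V):
    """Normalised sawtooth f_i(V) = b_i * V mod 1."""
    return (b * Fraction(V)) % 1

# Toy key (assumption for illustration only).
A, M0, U0 = [2, 3, 7, 14], 61, 18
B = [34, 51, 58, 55]

V0 = Fraction(U0, M0)
values = [f_norm(b, V0) for b in B]
gaps = [V0 - Fraction(floor(b * V0), b) for b in B]   # distance to left minimum

print(values == [Fraction(a, M0) for a in A])                     # True
print(gaps == [Fraction(a, b * M0) for a, b in zip(A, B)])        # True
```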

To get the accumulation points, consider the p-th minimum of f₁, located at p/b₁. To be an accumulation point, there have to be minima of f₂, f₃ and f₄ near p/b₁. The q-th minimum of f₂ is at V = q/b₂, the r-th minimum of f₃ at V = r/b₃ and the s-th minimum of f₄ is at V = s/b₄. So all these minima must be close to p/b₁. Expressed as a formula, this means that

| p/b₁ - q/b₂ | < d₁

| p/b₁ - r/b₃ | < d₂

| p/b₁ - s/b₄ | < d₃

with d₁, d₂ and d₃ the allowable deviations to the left and right of p/b₁. Because f_i has a total of b_i minima in [0, 1[, the integers p, q, r and s satisfy 1 <= p < b₁, 1 <= q < b₂, 1 <= r < b₃ and 1 <= s < b₄.

By multiplying with the denominators, the following equivalent system of inequalities is obtained:

| pb₂ - qb₁ | < e₁
| pb₃ - rb₁ | < e₂
| pb₄ - sb₁ | < e₃

This is an integer programming problem. Lenstra has shown that such a problem can be solved in time polynomial in the size of the coefficients if the number of variables is fixed.

By solving this system of inequalities, numbers p are computed for which

p/b₁ is the p-th minimum of f₁ and
near p/b₁ there are also minima of f₂, f₃ and f₄
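For toy-sized keys the system of inequalities can simply be searched exhaustively. The sketch below is not Lenstra's algorithm; it is a brute-force stand-in that is only feasible when b₁ is tiny. The public key B = (34, 51, 58, 55) and the tolerances e₁ = 1, e₂ = 3, e₃ = 7 are assumptions chosen for illustration (the key comes from the hypothetical private key A = (2, 3, 7, 14), M₀ = 61, W₀ = 17).

```python
def accumulation_points(B, tolerances):
    """Brute-force search for p with |p*b_i - q_i*b_1| < e_i for i = 2..k.

    For each p, q_i is taken as the integer nearest to p*b_i/b_1,
    which minimises |p*b_i - q_i*b_1|.
    """
    b1 = B[0]
    hits = []
    for p in range(1, b1):
        ok = True
        for b, e in zip(B[1:], tolerances):
            q = (p * b + b1 // 2) // b1        # integer nearest to p*b/b1
            if abs(p * b - q * b1) >= e:
                ok = False
                break
        if ok:
            hits.append(p)
    return hits

B = [34, 51, 58, 55]                           # toy public key (assumption)
candidates = accumulation_points(B, tolerances=[1, 3, 7])
print(candidates)
```

For this toy key, U₀/M₀ = 18/61 lies just to the right of 10/34, so p = 10 is among the candidates returned. Other values of p may pass the test as well, which is why the second part of the algorithm is needed.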

Part two: carry out a finer analysis

At the end of part one, all accumulation points satisfying the inequalities have been found. For each of these accumulation points, a finer analysis is performed that considers all n functions. Only four functions were used to obtain the accumulation points. For that reason, some of the points found may have minima of f₂, f₃ and f₄ close by, but not minima of all the other functions.

Now let p be an accumulation point. Consider the interval [p/b₁, (p+1)/b₁[, which is the interval between two successive minima of function f₁.

For this interval, get all the minima of all functions and sort them in increasing order by the V-coordinate.
Let {v₁, v₂, ..., v_s} be the list of all minima in [p/b₁, (p+1)/b₁[. The only minimum in [v_j, v_{j+1}[ is located at v_j, so every function f_i may be described as a linear segment:

f_i(V) = b_iV - T_{i, j}, v_j <= V < v_{j+1}

The value T_{i, j} is the number of minima of function i in the interval ]0, v_j].

The following image illustrates that between two minima, function i can be written as a linear segment. Within ]0, 1/b_i[, no minima exist, so f_i = b_iV there. In the next interval, f_i has been reduced modulo 1 exactly once, so the T-value increases to 1. The line segment in [1/b_i, 2/b_i[ can be written as f_i = b_iV - 1, V in [1/b_i, 2/b_i[. Turning to [2/b_i, 3/b_i[, the segment is now f_i = b_iV - 2, because a further minimum exists at V = 2/b_i. So between two neighbouring minima, f_i can be written as a line segment.
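The T-value of a segment is just the integer part of b_iV. A minimal sketch, assuming the hypothetical helper name segment_offset and the arbitrary slope b = 5:

```python
from fractions import Fraction
from math import floor

def segment_offset(b, v):
    """Return T such that f_i(V) = b*V - T on the linear segment containing v."""
    return floor(b * Fraction(v))

b = 5                                          # arbitrary small slope for illustration
samples = [Fraction(1, 10), Fraction(3, 10), Fraction(1, 2)]
offsets = [segment_offset(b, v) for v in samples]
print(offsets)                                 # [0, 1, 2]

# On each segment, b*V - T agrees with b*V mod 1.
print(all(b * v - T == (b * v) % 1 for v, T in zip(samples, offsets)))   # True
```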

The task is now to scan every interval [v_j, v_{j+1}[ for a subinterval on which the superincreasing property holds and the sum of all functions is less than one. It is important to restrict V to [v_j, v_{j+1}[, because only within such an interval can all functions be written in the form f_i = b_iV - T_{i, j}.
Because there are n functions, the superincreasing property consists of n-1 inequalities, all of which have to be satisfied. For example, to satisfy the condition that f₂ is greater than f₁, V is limited by

V > (T_{2, j} - T_{1, j})/(b₂ - b₁), if b₂ > b₁
V < (T_{2, j} - T_{1, j})/(b₂ - b₁), if b₂ < b₁

Together with the condition that V be in [v_j, v_{j+1}[, and writing c = (T_{2, j} - T_{1, j})/(b₂ - b₁) for the bound, the resulting subinterval is

]MAX(c, v_j), v_{j+1}[, if V > c is required
]v_j, MIN(c, v_{j+1})[, if V < c is required

The resulting subinterval may be empty. If so, skip to the next interval [v_{j+1}, v_{j+2}[. Otherwise evaluate the next condition f₃ > f₁ + f₂ and again obtain a condition for V.
If all n-1 inequalities are satisfied and the resulting subinterval is nonempty, the condition that the sum of all functions is less than 1 has to be checked. This inequality reads

V < (1 + T_{1, j} + ... + T_{n, j}) / (b₁ + ... + b_n)

Again, V is limited to be in [v_j, v_{j+1}[.

As stated above, if one inequality is not satisfiable in [v_j, v_{j+1}[, turn to the next interval [v_{j+1}, v_{j+2}[ and try again. Unless several functions share a minimum at v_{j+1}, exactly one T-value changes, namely that of the function having a minimum at v_{j+1}. This value is increased by one. All other values remain unchanged.
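The scan described above can be sketched compactly with exact fractions. This is an illustrative implementation under assumptions, not the original applet: the function name find_interval is hypothetical, the T-values are recomputed for every interval instead of being updated incrementally, and the toy public key B = (34, 51, 58, 55) with accumulation point p = 10 is assumed (it stems from the hypothetical private key A = (2, 3, 7, 14), M₀ = 61, U₀ = 18).

```python
from fractions import Fraction
from math import ceil, floor

def find_interval(B, p):
    """Search [p/b1, (p+1)/b1[ for an open interval (lo, hi) on which the
    f_i(V) = b_i*V mod 1 are superincreasing and sum to less than 1."""
    b1 = B[0]
    start, end = Fraction(p, b1), Fraction(p + 1, b1)
    points = {start}
    for b in B:                                # collect every minimum k/b_i
        k = ceil(start * b)
        while Fraction(k, b) < end:
            points.add(Fraction(k, b))
            k += 1
    points = sorted(points) + [end]
    for vj, vnext in zip(points, points[1:]):
        T = [floor(b * vj) for b in B]         # f_i(V) = b_i*V - T_i on [vj, vnext[
        lo, hi = vj, vnext
        sum_b, sum_T = B[0], T[0]
        for b, t in zip(B[1:], T[1:]):
            c, r = b - sum_b, t - sum_T        # f_i > f_1 + ... + f_{i-1}  <=>  c*V > r
            if c > 0:
                lo = max(lo, Fraction(r, c))
            elif c < 0:
                hi = min(hi, Fraction(r, c))
            elif r >= 0:
                lo = hi                        # 0 > r fails: interval is empty
            sum_b, sum_T = sum_b + b, sum_T + t
        hi = min(hi, Fraction(1 + sum_T, sum_b))   # sum of all f_i below 1
        if lo < hi:
            return lo, hi
    return None

B = [34, 51, 58, 55]                           # toy public key (assumption)
print(find_interval(B, 10))   # (Fraction(5, 17), Fraction(13, 44))
```

For the toy key the true ratio U₀/M₀ = 18/61 lies inside the interval found, as the analysis predicts.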

Now suppose, the resulting interval is nonempty. Let ]u_l, u_r[ be that interval. For every V in ]u_l, u_r[, all functions f_i satisfy

f_i(V) > f₁(V) + ... + f_{i-1}(V) and
f₁(V) + ... + f_n(V) < 1

The functions form a superincreasing sequence in ]u_l, u_r[; additionally they sum up to a value less than one.

Let p/q be an arbitrary rational number in ]u_l, u_r[. The following holds true:

f_i(p/q) > f₁(p/q) + ... + f_{i-1}(p/q)
f₁(p/q) + ... + f_n(p/q) < 1

Multiplying by the denominator q and writing f_i(p) for q·f_i(p/q) = b_ip mod q, it follows that

f_i(p) > f₁(p) + ... + f_{i-1}(p)
f₁(p) + ... + f_n(p) < q

Now a trapdoor pair (U, M) is found. Define M and U as M = q and U = p. Let a_i be f_i(p) = b_iU mod M. The sequence {a_i}_i=1...n is superincreasing, and its sum is less than q. The tuple (A, M, U) therefore forms a valid private key for the public key B, and every message encrypted with B is now readable. As there are infinitely many rational numbers in a nonempty interval, there are infinitely many trapdoor pairs (U, M) and thus infinitely many private keys yielding the same public key.
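The final step can be illustrated end to end. The sketch assumes the toy public key B = (34, 51, 58, 55) and takes the interval ]5/17, 13/44[ as given for this key; both are assumptions made for illustration and do not come from the original text. The helper names encrypt and decrypt are likewise hypothetical. The midpoint of the interval is used as p/q, but any rational number strictly inside would do.

```python
from fractions import Fraction

B = [34, 51, 58, 55]                            # toy public key (assumption)
lo, hi = Fraction(5, 17), Fraction(13, 44)      # interval assumed from part two

V = (lo + hi) / 2                               # a rational p/q inside the interval
U, M = V.numerator, V.denominator               # trapdoor pair: U = p, M = q
A = [b * U % M for b in B]                      # recovered superincreasing sequence
print(U, M, A)                                  # 441 1496 [34, 51, 146, 319]

def encrypt(bits, B):
    """Knapsack encryption: sum of the b_i selected by the message bits."""
    return sum(x * b for x, b in zip(bits, B))

def decrypt(c, A, U, M):
    """Transform the ciphertext with (U, M), then solve the easy knapsack greedily."""
    rest = c * U % M
    bits = []
    for a in reversed(A):
        bits.append(1 if rest >= a else 0)
        rest -= a * bits[-1]
    return bits[::-1]

message = [1, 0, 1, 1]
c = encrypt(message, B)
print(c, decrypt(c, A, U, M))                   # 147 [1, 0, 1, 1]
```

The recovered key differs from the key that generated B (A = (2, 3, 7, 14), M = 61), yet it decrypts every message correctly, which matches the observation that many private keys belong to one public key.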
