4.5.4 Low density knapsacks

One of the more efficient algorithms for solving the general subset sum problem is the meet-in-the-middle algorithm.
For knapsacks having low density there is still another method namely to reduce the subset sum problem to the problem of finding a vector with small length.

Definition
Given a knapsack vector A = (a1, ..., an). Let amax be the largest element in A: amax = maxi ai, i = 1 ... n. Then the density dA of A is defined by

dA = n / log2(amax)

In Cryptosystems, based on the subset sum problem, the density of a knapsack is almost always less than one. If not so, several subsets may exist summing up to the same s and hence no clear decryption may be performed.

Definition
A vector v in Rn is an orderd tuple with elements vi in R:

v = (v1, v2, ..., vn), vi in R, i = 1 ... n

Definition
Let v = (v1, v2, ..., vn) be a vector in Rn. The euclidian length of v is

||v|| = (v12 + ... + vn2)1/2

Definition
Suppose {vi}i=1...m is a set of m vectors in Rn with m <= n. Then {vi} is linearly dependant if and only if there are ki in R with

k1v1 + k2v2 + ... + kmvm = 0, not all ki = 0

Otherwise the set {vi}i=1...m is called linearly independant.

Definition
Let {bi}i=1...n be a set of n linearly independant vectors in Rn. Then {bi}i=1...n forms a base for Rn. All vectors v in Rn may be written as linear combinations of {bi}i=1...n:

v = k1b1 + k2b2 + ... +knbn, ki in R, i = 1 ... n.

Definition
Given m linear independant vectors {bi}i=1...m, m <= n. Then the lattice L, spanned by {bi}i=1...m, is the set of all integral linear combinations of {bi}, that is

L = {v | v = z1b1 + z2b2 + ... + zmbm, with zi in Z}

The set {bi}i=1...m forms a base of L. A lattice may have several bases.

Closely related to a lattice is the problem of finding the shortest nonzero vector of this lattice (SVP: shortest vector problem).
It is an open question whether or not this problem is NP-hard using the euclidean norm. Using the supremum norm, it is shown, that the problem is NP-hard. The supremum norm || ||8 of a vector v is defined by

||v||8 = max |vi|i=1...n

As stated above a lattice may have several bases. A "good" base consists of vectors with relatively small length. Such a base is called a reduced base. The theory of lattice-base-reduction deals with the problem of finding reduced bases, given a lattice.

Algorithms transforming a given base B into a reduced base B', are called lattice reduction algorithms. A popular one is the algorithm invented by Lenstra, Lenstra and Lovász, L3 for short. A L3-reduced base has ,besides others, the property that its first vector, b'1 has a length of at most an exponential factor greater than the smallest nonzero vector of the lattice, spanned by the vectors of B:

||b'1|| <= 2(n-1)/2||v||, v in L.


This paragraph shows as to solve a subset sum problem with a knapsack having low density. Because cryptographic knapsacks almost always have a low density, this method applies to them as well.

Given
  • a public key B = (b1,b2, ..., bn)
  • a cipher c = pB = b1p1 + ... + bnpn
Wanted
  • the plaintext p

To get the original message back, successively perform the following steps

  1. Construct the (n+1)×(n+1) matrix M
    M =
    1 0 0 ... 0 tb1
    0 1 0 ... 0 tb2
    0 0 1 ... 0 tb3
    ... ... ... ... ... ...
    0 0 0 ... 1 tbn
    0.5 0.5 0.5 ... 0.5 tc
    with t an integer > (n/4)1/2

    Now define the vector mi to be the i-th row of M, i = 1...(n+1). Let L be the lattice spanned by {mi}i=1...(n+1).

    Now suppose p = (p1, p2, ..., pn) is the plaintext corresponding to c.
    Regard the vector y = p1m1 + p2m2+ ... + pnmn - mn+1. Either pi = 0 or pi = 1, hence y is an integral linear combination of {mi}i=1...(n+1) and the lattice L contains this vector y.
    Vector addition is perfomed by adding the elements componentwise. Therefore the components of y are
     

    y1 =  p1 + 0 + ... + 0 - 0.5 =  p1 - 0.5
    y2 =  0 + p2 + 0 + ... + 0 - 0.5 = p2 - 0.5
    ...
    yi = 0 + ... + 0 + pi + 0 + ... +0 - 0.5 = pi - 0.5
    ...
    yn = 0 + ... + 0 + pn - 0.5 = pn - 0.5
    yn+1 = t(b1p1 + b2p2+ ... + bnpn) - tc = 0

    With pi in {0, 1}, y is a vector having all its components in {±0.5} except the last one which is zero. Therefore the euclidian length of y is ||y|| = (n/4)1/2 and hence y is a small vector.
    Now, if the knapsack B is of low density, most of the vectors arising from linear combinations of {mi}i=1...(n+1) are expected to have a large length.

  2. The next step is to reduce the base {mi}i=1...(n+1) by applying a reduction algorithm, e.g. the L3 algorithm.
  3. Let {m'i}i=1...(n+1) be the new reduced base.
    Now scan the reduced base for a vector with the properties that
    • mi, n+1 = 0 and
    • mi, j = ±0.5, j = 1 ... n
    If no such vector exists, either there is no solution to (B, c) or the L3 algorithm has failed to find a proper reduced base.

    Now assume that there exists a m'k with m'k = (±0.5, ±0.5, ..., ±0.5, 0). Perform the following steps

    • define the vector X = (x1, x2, ..., xn) by xi = 0.5 + m'k, i , i = 1 ... n
    • compute s = XB = x1b1 + ... + xnbn
    • If s equals c, a solution to (B, c) is found and the plaintext is the vector X.

    Otherwise, if s != c then
    • define X = (x1, x2, ..., xn) by xi = 0.5 - m'k, i , i = 1 ... n
    • again compute s = XB = x1b1 + ... + xnbn.
    • if s = c, then X is the plaintext

In [CO91] it was proven, that for knapsacks having a density d of less than 0.9408, the subset sum problem can almost always be solved.
Assuming that there exists an oracle constructing ALWAYS a reduced base with the shortest nonzero vector, the probability that the shortest nonzero vector of the lattice is not of the form {±0.5, ±0.5, ..., ±0.5, 0} goes to zero as the dimension of the knapsack goes to infinity as well as the density of the knapsack is less than 0.9408.

The assumption of finding ALWAYS the shortest nonzero vector of a lattice is not given when using the L3 algorithm. The algorithm is only guaranteed to find a vector b with length of at most ||b|| <= 2(n-1)/2||v|| with v the shortest nonzero vector of the lattice. Despite this, the algorithm performs much better than is expected to be.


Start applet

 
Shamirs algorithm