
Select $K$ binary vectors $Y_i$ of length $m$ uniformly at random.

Define the collection of random variables $X_{i,j}=w(Y_i \oplus Y_j)$, where $w(\cdot)$ denotes the Hamming weight of a binary vector, i.e., the number of nonzero coordinates of its argument. Define $D_{min}(Y_1,\ldots,Y_K)$ as the smallest of the $X_{i,j}$ for $i \neq j.$
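For concreteness, this setup is easy to simulate directly. Here is a minimal sketch (the function name `d_min` and the parameters below are my own illustrative choices, not part of the question):

```python
import random

def d_min(m, K, seed=0):
    """Draw K uniform random binary vectors of length m (as integers)
    and return D_min, the minimum pairwise Hamming distance."""
    rng = random.Random(seed)
    ys = [rng.getrandbits(m) for _ in range(K)]
    return min(
        bin(ys[i] ^ ys[j]).count("1")   # X_{i,j} = w(Y_i xor Y_j)
        for i in range(K)
        for j in range(i + 1, K)
    )

print(d_min(32, 8))  # small illustrative parameters
```

With $K > 2^m$ a collision is forced by the pigeonhole principle, so `d_min(2, 16)` is always $0$.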

Thus we have $n=C(K,2)=K(K-1)/2$ non-independent random variables $X_{i,j}$ with support $\{0,1,\ldots,m\}$ and individual distribution $\mathrm{Bin}(m,1/2)$. It seems to me that the random variables $X_{i,j}$ will be $s$-wise negatively correlated (for $s$ large enough): if the distances between pairs chosen from a subcollection $Y_{i_1},Y_{i_2},\ldots,Y_{i_v}$ (where $v < K$) increase, then the distances between the $Y_{i_j}$ and the remaining vectors will tend to decrease. Take $s=v+1.$

It is possible to get a bound on the following quantity. Fix an integer $w$ less than $m/2.$ The Hamming sphere of radius $w$ has "volume", i.e., contains $V_w(m)=\sum_{s=0}^w C(m,s)$ vectors, and to first order in the exponent we have $$ V_w(m) = 2^{m H(w/m)} $$ where $H(\cdot)$ is the binary entropy function. Then, for a uniformly random choice of the $Y_i$, $i=1,2,\ldots,K$, the minimum distance is at least $2w+1$ exactly when the Hamming spheres of radius $w$ centred at these vectors are disjoint; since the $j$-th vector must then land outside the $j-1$ disjoint spheres already placed, which occupy $(j-1)V$ points, we get $Pr[D_{min} \geq 2 w+1] \leq \frac{(2^m-V)}{2^m}\frac{ (2^m-2 V) }{2^m} \cdots\frac{ (2^m - (K-1)V)}{ 2^{m}}$
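The sphere volume and its entropy approximation are easy to check numerically. A small sketch (the helper names and the parameters $m=100$, $w=10$ are illustrative choices of mine):

```python
from math import comb, log2

def sphere_volume(m, w):
    """V_w(m) = sum_{s=0}^{w} C(m, s): vectors in a Hamming sphere of radius w."""
    return sum(comb(m, s) for s in range(w + 1))

def H(p):
    """Binary entropy function (in bits)."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)

m, w = 100, 10
V = sphere_volume(m, w)
# To first order in the exponent, log2 V_w(m) is close to m * H(w/m).
print(log2(V), m * H(w / m))
```

For these parameters the two exponents agree to within a few bits, as the first-order approximation predicts.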

where $V=V_w(m).$ This means that, by replacing each fraction of the form $(1-x)$ by $\exp(-x)$ (a valid upper bound, since $1-x \leq e^{-x}$ for $x > 0$), we get the approximate upper bound $Pr[ D_{min} \geq 2w+1] \leq \exp\left[-K(K-1)V/2^{m+1} \right],$ which expresses the bound in terms of the entropy function, which is nice. Unfortunately this upper bound is quite loose.
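The product bound and its exponential relaxation can be compared numerically. A sketch under the same definitions (function names and the parameters $m=40$, $K=100$, $w=5$ are my own):

```python
from math import comb, exp

def sphere_volume(m, w):
    """V = V_w(m) = sum_{s=0}^{w} C(m, s)."""
    return sum(comb(m, s) for s in range(w + 1))

def product_bound(m, K, w):
    """prod_{j=1}^{K-1} (2^m - j*V) / 2^m, the sphere-packing product."""
    V = sphere_volume(m, w)
    p = 1.0
    for j in range(1, K):
        p *= (2**m - j * V) / 2**m
    return p

def exp_bound(m, K, w):
    """exp[-K(K-1) V / 2^{m+1}], obtained termwise via 1 - x <= e^{-x}."""
    V = sphere_volume(m, w)
    return exp(-K * (K - 1) * V / 2**(m + 1))

# The exponential form always dominates the product, confirming it is the looser bound.
print(product_bound(40, 100, 5), exp_bound(40, 100, 5))
```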

I will be happy with any pointers to literature or any other suggestions.

kodlu
  • Hi Leonid, you make a good point.

    I am happy to choose them independently. In practice, if $K$ is about $2^{m/2}$ or more, you will typically get collisions and the minimum distance will be zero. I am interested in the case where $K$ is much smaller but still of size $2^{a m}$ for some $a \in (0,1/2)$, which will not yield collisions.

    – kodlu Feb 25 '10 at 04:16
  • I think you mean the minimum distance is at least 2w+1, rather than 2w or 2w-1. – Douglas Zare Feb 25 '10 at 05:07
  • You're right, I fixed the equation. – kodlu Feb 25 '10 at 06:01
  • @serdar: Just to be sure, are you really interested in Pr(D_min > d) for a random code, or do you actually want to use random code arguments to eventually lower/upper bound the minimum distance of a length-m binary code of K codewords, d(m,K)? If the latter is indeed the case, do you need pointers to known (asymptotic) bounds on d(m,K)? – user2734 Feb 25 '10 at 06:50
  • @serdar: Adversary? Ah, you’re probably working on cryptography. I was thinking about constructing good error-correcting codes, and indeed had the relative distance vs. rate bounds for linear codes (Gilbert--Varshamov lower bound, McEliece--Rodemich--Rumsey--Welch upper bound). – user2734 Mar 07 '10 at 14:29

1 Answer


Here is a direct application of Theorem 21 from Gabor Lugosi's concentration of measure notes. Your $Y_i$ corresponds to his $X_{i,1}^m$ and your $X_{i,j}$ to his $d(X_{i,1}^m, X_{j,1}^m)$. Take his $A$ to be your $\{X_{i,j}\}_{i \neq j}$. The birthday problem gives the probability that any two of the $Y_i$ are exactly the same. That is: $$\mathbb{P}(0^m \in A) = \mathbb{P}\left(\left\{X_{i,j} = 0 \text{ for some } i \neq j\right\}\right) = \mathrm{(omitted\ for\ simplicity)} $$ Now your $D_{min}$ corresponds to his $d(0^m,A)$. By the Theorem, for any $t > 0$, $$\mathbb{P}\left(D_{min} \geq t + \sqrt{\frac{m}{2} \log\frac{1}{\mathbb{P}(0^m \in A)}}\right) \leq e^{-2t^2/m}.$$ This bound may be OK for your needs. If it isn't, see Lugosi's discussion of Talagrand's convex distance inequality, which is a big improvement.
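As a rough numerical sanity check of this bound, one can evaluate the shift and the tail term. In this sketch the helper names and parameters are mine, and $\mathbb{P}(0^m \in A)$ is approximated by the standard birthday product, as suggested above:

```python
from math import exp, log, sqrt

def birthday_collision_prob(m, K):
    """P(some two of K uniform m-bit vectors coincide)
    = 1 - prod_{j=1}^{K-1} (1 - j/2^m)."""
    p_distinct = 1.0
    for j in range(1, K):
        p_distinct *= 1.0 - j / 2**m
    return 1.0 - p_distinct

def lugosi_tail(m, K, t):
    """Return (shift, tail) for the bound
    P(D_min >= t + sqrt((m/2) * log(1/p))) <= exp(-2 t^2 / m),
    where p is the birthday collision probability (assumed positive)."""
    p = birthday_collision_prob(m, K)
    shift = sqrt((m / 2) * log(1 / p))
    return shift, exp(-2 * t**2 / m)

# Illustrative regime near the birthday threshold: K ~ 2^{m/2}.
shift, tail = lugosi_tail(32, 2**17, 4)
print(shift, tail)
```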