Distributional Johnson-Lindenstrauss Lemma

Let πš·βˆˆβ„kΓ—d\mathbf{Ξ } ∈ \mathbb{R}^{kΓ—d} be chosen so that each entry equals 1k𝒩(0,1)\frac{1}{\sqrt{k}} \mathcal{N}(0, 1), where 𝒩(0,1)\mathcal{N}(0, 1) denotes a standard Gaussian random variable. If we choose k=O(log(1/Ξ΄)Ο΅2)k = O ( \frac{\log(1/Ξ΄)}{\epsilon^2}), then for any vector 𝐱\mathbf{x}, with probability (1βˆ’Ξ΄)(1 βˆ’ Ξ΄):

(1βˆ’Ο΅)||𝐱||22≀||𝚷𝐱||22≀(1+Ο΅)||𝐱||22(1-\epsilon)||\mathbf{x}||_2^2 \leq ||\mathbf{\Pi x}||_2^2 \leq (1+\epsilon)||\mathbf{x}||_2^2

May be used to prove Johnson-Lindenstrauss Lemma (1984).

Corollary

For any fixed 𝐱\mathbf{x},

#incomplete (TODO: see lecture 12 )

Proof

Want to argue that, with probability (1βˆ’Ξ΄)(1-\delta), (1βˆ’Ο΅)||𝐱||22≀||𝚷𝐱||22≀(1+Ο΅)||𝐱||22(1-\epsilon)||\mathbf{x}||_2^2 \leq ||\mathbf{\Pi x}||_2^2 \leq (1+\epsilon)||\mathbf{x}||_2^2 Claim: 𝔼||𝚷𝐱||22=||𝐱||22\mathbb{E}||\mathbf{\Pi x}||_2^2 = ||\mathbf{x}||_2^2

Intermediate claim: 𝔼[||𝚷𝐱||22]=𝔼[(βŸ¨π›‘i,𝐱⟩)2]\mathbb{E}\left[||\mathbf{\Pi x}||_2^2\right] = \mathbb{E}\left[(\langle \mathbf{\pi}_i, \mathbf{x}\rangle)^2\right]

βŸ¨π›‘,𝐱⟩=Z1⋅𝐱[1]+Z2⋅𝐱[2]+Zd⋅𝐱[d]\langle \mathbf{\pi}, \mathbf{x}\rangle = Z_1 \cdot \mathbf{x}[1] + Z_2 \cdot \mathbf{x}[2] + Z_d \cdot \mathbf{x}[d] where each Z1,…,ZdZ_1, … , Z_d is a standard normal 𝒩(0,1)\mathcal{N}(0,1) random variable.

We have that Zi⋅𝐱(i)Z_i \cdot \mathbf{x}(i) is a normal 𝒩(0,𝐱(i)2)\mathcal{N}(0,\mathbf{x}(i)^2) random variable.

What type of random variable is βŸ¨π›‘i,𝐱⟩\langle \mathbf{\pi}_i, \mathbf{x}\rangle? Use Stability of Gaussian random variables.

βŸ¨π›‘i,𝐱⟩=𝒩(0,𝐱[1]2+𝒩(0,𝐱[2]2+...+𝒩(0,𝐱(d)2)=𝒩(0,||𝐱||22)\langle \mathbf{\pi}_i, \mathbf{x}\rangle = \mathcal{N}(0,\mathbf{x}[1]^2+\mathcal{N}(0,\mathbf{x}[2]^2+...+\mathcal{N}(0,\mathbf{x}(d)^2) = \mathcal{N}(0,||\mathbf{x}||_2^2)

So 𝔼||𝚷𝐱||22=𝔼[(βŸ¨π›‘i,𝐱⟩)2]=𝔼[𝒩(0,||𝐱||22)2]=||𝐱||22\mathbb{E}||\mathbf{\Pi x}||_2^2 =\mathbb{E}\left[(\langle \mathbf{\pi}_i, \mathbf{x}\rangle)^2\right]= \mathbb{E}[\mathcal{N}(0,||\mathbf{x}||_2^2)^2] = ||\mathbf{x}||_2^2, as desired.

$||\mathbf{x}||_2^2 = \mathrm{Var}(\mathcal{N}(0,||\mathbf{x}||_2^2)) = \mathbb{E}[\mathcal{N}(0,||\mathbf{x}||_2^2)^2] - \cancelto{0}{\mathbb{E}[\mathcal{N}(0,||\mathbf{x}||_2^2)]^2}$

Need to use concentration bound ||𝚷𝐱||22=1kβˆ‘i=1k(βŸ¨π›‘i,𝐱⟩)2=1kβˆ‘i=1k𝒩(0,||𝐱||22)||\mathbf{\Pi x}||_2^2=\frac{1}{k}\sum_{i=1}^k(\langle \mathbf{\pi}_i, \mathbf{x}\rangle)^2 = \frac{1}{k}\sum_{i=1}^k\mathcal{N}(0,||\mathbf{x}||_2^2)

β€œchi-squared random variable with kk degrees of freedom”


See Gaussian concentration

See also: for proof of 𝔼[||𝚷𝐱||22]=||𝐱||22\mathbb{E}[||\mathbf{\Pi x}||_2^2]=||\mathbf{x}||_2^2