Adi Ben-Israel speaks to the Experimental Mathematics Seminar.

Abstract:

1. The unreliability of the Euclidean distance in high dimensions, which makes a proximity query meaningless and unstable because there is poor discrimination between the nearest and farthest neighbor [3]; see also [4].
2. The uniform probability distribution on the n-dimensional unit sphere S_n, and some non-intuitive results for large n. For example, if x is any point in S_n, taken as the "north pole", then most of the area of S_n is concentrated near the "equator".
3. The advantage of the ℓ1 distance, which is less sensitive to high dimensionality and has been shown to "provide the best discrimination in high-dimensional data spaces" [1, p. 427].
4. Clustering high-dimensional data using the ℓ1 distance [2].

References

[1] C.C. Aggarwal et al., On the surprising behavior of distance metrics in high dimensional space, Lecture Notes in Computer Science, vol. 1973 (2001), Springer, https://doi.org/10.1007/3-540-44503-X_27
[2] T. Asamov and A. Ben-Israel, A probabilistic ℓ1 method for clustering high-dimensional data, Probability in the Engineering and Informational Sciences, 2021, 1-16.
[3] K. Beyer et al., When is "nearest neighbor" meaningful?, Lecture Notes in Computer Science, vol. 1540 (1999), Springer, https://doi.org/10.1007/3-540-49257-7_15
[4] J.M. Hammersley, The distribution of distance in a hypersphere, The Annals of Mathematical Statistics 21 (1950), 447-452.
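
The first three topics of the abstract can be explored numerically. The sketch below is an illustration only, not the speaker's method or the procedure used in [1] or [3]. The function names relative_contrast and equator_fraction, the sample sizes, the band width 0.1, and the choice of uniform data in the unit cube are all assumptions made for this example. Here dim is the number of coordinates of each point, which may differ by one from the index n in S_n depending on convention.

```python
import math
import random


def relative_contrast(dim, n_points=200, p=2, seed=0):
    """Return (d_max - d_min) / d_min for l_p distances from one random
    query to n_points random points, all uniform in the cube [0, 1]^dim.

    Small values mean the nearest and farthest neighbors are hard to
    tell apart. (Hypothetical helper, written for this illustration.)
    """
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        total = sum(abs(a - b) ** p for a, b in zip(query, point))
        dists.append(total ** (1.0 / p))
    return (max(dists) - min(dists)) / min(dists)


def equator_fraction(dim, band=0.1, n_points=500, seed=0):
    """Estimate the fraction of the unit sphere in R^dim whose first
    coordinate lies in [-band, band], taking the first axis as the
    direction of the "north pole".

    Uniform points on the sphere are sampled by normalizing vectors of
    independent standard Gaussians. (Hypothetical helper.)
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_points):
        vec = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(v * v for v in vec))
        if abs(vec[0] / norm) <= band:
            hits += 1
    return hits / n_points


if __name__ == "__main__":
    for dim in (2, 10, 100, 1000):
        c1 = relative_contrast(dim, p=1)
        c2 = relative_contrast(dim, p=2)
        eq = equator_fraction(max(dim, 3))
        print(f"dim={dim:5d}  l1 contrast={c1:.3f}  "
              f"l2 contrast={c2:.3f}  equator fraction={eq:.3f}")
```

Running the script prints one row per dimension. Under these assumptions the contrast should shrink as dim grows for both distances, and the equatorial band should capture nearly all sampled points once dim is large. Whether the ℓ1 contrast exceeds the Euclidean one in a given run depends on the sample, so averaging over several seeds gives a fairer comparison.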