Distance Metric: Measures the dissimilarity between two data points
![Distance Metric](https://slidetodoc.com/presentation_image/1b8eb1950b12999a8c33f481c3a6484f/image-1.jpg)
Distance Metric

Measures the dissimilarity between two data points. A metric is a function, $d$, of two points $X$ and $Y$, such that:

- $d(X, Y)$ is positive definite: if $X \ne Y$, $d(X, Y) > 0$; if $X = Y$, $d(X, Y) = 0$
- $d(X, Y)$ is symmetric: $d(X, Y) = d(Y, X)$
- $d(X, Y)$ satisfies the triangle inequality: $d(X, Y) + d(Y, Z) \ge d(X, Z)$
![Standard Distance Metrics](http://slidetodoc.com/presentation_image/1b8eb1950b12999a8c33f481c3a6484f/image-2.jpg)
Standard Distance Metrics

Minkowski distance, or $L_p$ distance: $d_p(X, Y) = \left( \sum_{i=1}^{n} |x_i - y_i|^p \right)^{1/p}$

- Manhattan distance ($p = 1$): $d_1(X, Y) = \sum_{i=1}^{n} |x_i - y_i|$
- Euclidean distance ($p = 2$): $d_2(X, Y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}$
- Max distance ($p = \infty$): $d_\infty(X, Y) = \max_{i=1}^{n} |x_i - y_i|$
![An Example](http://slidetodoc.com/presentation_image/1b8eb1950b12999a8c33f481c3a6484f/image-3.jpg)
An Example

A two-dimensional space with $X = (2, 1)$ and $Y = (6, 4)$. $Z = (6, 1)$ is the corner of the right triangle $XZY$, so $XZ = 4$ and $ZY = 3$.

- Manhattan: $d_1(X, Y) = XZ + ZY = 4 + 3 = 7$
- Euclidean: $d_2(X, Y) = XY = 5$
- Max: $d_\infty(X, Y) = \max(XZ, ZY) = XZ = 4$

$d_1 \ge d_2 \ge d_\infty$. For any positive integer $p$, …
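The three distances in this example can be recomputed with a few lines of Python. This is a minimal sketch: the function name `minkowski` and the use of `math.inf` to select the max distance are illustrative choices, not something the slides prescribe.

```python
import math

def minkowski(x, y, p):
    """L_p distance between two equal-length points; p = math.inf gives the max distance."""
    diffs = [abs(a - b) for a, b in zip(x, y)]
    if math.isinf(p):
        return max(diffs)
    return sum(d ** p for d in diffs) ** (1.0 / p)

# Points from the example slide
X, Y = (2, 1), (6, 4)
d1 = minkowski(X, Y, 1)            # Manhattan
d2 = minkowski(X, Y, 2)            # Euclidean
dmax = minkowski(X, Y, math.inf)   # Max
print(d1, d2, dmax)
```

The results match the slide's values of 7, 5 and 4, and they illustrate the ordering $d_1 \ge d_2 \ge d_\infty$.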
![HOBbit Similarity](http://slidetodoc.com/presentation_image/1b8eb1950b12999a8c33f481c3a6484f/image-4.jpg)
HOBbit Similarity

These notes contain NDSU confidential & proprietary material. Patents pending on bSQ, Ptree technology.

Higher Order Bit (HOBbit) similarity is the number of leading bits, counted from the left, on which two values agree:

$\mathrm{HOBbitS}(A, B) = \max \{ s : 0 \le s \le m,\ a_i = b_i \text{ for all } 1 \le i \le s \}$

- $A$, $B$: two scalars (integers)
- $a_i$, $b_i$: $i$th bit of $A$ and $B$ (left to right)
- $m$: number of bits

Example:

- Bit position: 1 2 3 4 5 6 7 8
- x1: 0 1 1 0 0 1
- y1: 0 1 1 1 0 1
- x2: 0 1 1 1 0 1
- y2: 0 1 0 0
- HOBbitS(x1, y1) = 3
- HOBbitS(x2, y2) = 4
![HOBbit Distance](http://slidetodoc.com/presentation_image/1b8eb1950b12999a8c33f481c3a6484f/image-5.jpg)
HOBbit Distance (High Order Bifurcation bit)

HOBbit distance between two scalar values $A$ and $B$: $d_v(A, B) = m - \mathrm{HOBbitS}(A, B)$

Example:

- Bit position: 1 2 3 4 5 6 7 8
- x1: 0 1 1 0 0 1
- y1: 0 1 1 1 0 1
- x2: 0 1 1 1 0 1
- y2: 0 1 0 0
- HOBbitS(x1, y1) = 3, so $d_v(x_1, y_1) = 8 - 3 = 5$
- HOBbitS(x2, y2) = 4, so $d_v(x_2, y_2) = 8 - 4 = 4$

HOBbit distance for $n$-dimensional points $X$ and $Y$: $d_h(X, Y) = \max_{i=1}^{n} d_v(x_i, y_i)$

In our example (considering 2-dimensional data): $d_h(X, Y) = \max(5, 4) = 5$
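HOBbit similarity and distance can be sketched in Python with an XOR and a bit-length call. The function names and the 8-bit operands below are illustrative assumptions: the operands were picked to give similarities of 3 and 4 and are not taken from the slides. The sketch also assumes $0 \le A, B < 2^m$.

```python
def hobbit_similarity(a, b, m=8):
    """Number of leading bits (out of m) on which a and b agree.

    a ^ b has its highest set bit at the first position where a and b differ,
    so its bit length is the number of bits from that position to the right end.
    """
    return m - (a ^ b).bit_length()

def hobbit_distance_scalar(a, b, m=8):
    """d_v(A, B) = m - HOBbitS(A, B)."""
    return m - hobbit_similarity(a, b, m)

def hobbit_distance(x, y, m=8):
    """d_h(X, Y): the maximum per-dimension scalar HOBbit distance."""
    return max(hobbit_distance_scalar(a, b, m) for a, b in zip(x, y))

# Illustrative (hypothetical) 8-bit values
x1, y1 = 0b01101001, 0b01111101   # agree on the first 3 bits
x2, y2 = 0b01011101, 0b01010000   # agree on the first 4 bits

s1 = hobbit_similarity(x1, y1)
s2 = hobbit_similarity(x2, y2)
dh = hobbit_distance((x1, x2), (y1, y2))
print(s1, s2, dh)
```

Identical values give similarity $m$ and distance 0, because the XOR is zero and its bit length is 0.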
![HOBbit Distance Is a Metric](http://slidetodoc.com/presentation_image/1b8eb1950b12999a8c33f481c3a6484f/image-6.jpg)
HOBbit Distance Is a Metric

- HOBbit distance is positive definite: if $X = Y$, $d_h(X, Y) = 0$; if $X \ne Y$, $d_h(X, Y) > 0$
- HOBbit distance is symmetric
- HOBbit distance satisfies the triangle inequality
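The three properties can be spot-checked by brute force on a small bit width. The sketch below assumes 4-bit scalars and uses the XOR bit length as the scalar HOBbit distance (which equals $m$ minus the number of matching leading bits). It is a numerical check on one bit width, not a proof.

```python
from itertools import product

def hobbit_distance_scalar(a, b):
    # Bit length of a ^ b = m - (number of matching leading bits), for a, b < 2**m
    return (a ^ b).bit_length()

M = 4                      # assumed bit width for the exhaustive check
values = range(2 ** M)

violations = []
for a, b, c in product(values, repeat=3):
    d_ab = hobbit_distance_scalar(a, b)
    d_bc = hobbit_distance_scalar(b, c)
    d_ac = hobbit_distance_scalar(a, c)
    if (d_ab == 0) != (a == b):            # positive definiteness
        violations.append(("definite", a, b))
    if d_ab != hobbit_distance_scalar(b, a):   # symmetry
        violations.append(("symmetric", a, b))
    if d_ab + d_bc < d_ac:                 # triangle inequality
        violations.append(("triangle", a, b, c))

print(len(violations))
```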
![Neighborhood of a Point](http://slidetodoc.com/presentation_image/1b8eb1950b12999a8c33f481c3a6484f/image-7.jpg)
Neighborhood of a Point

The neighborhood of a target point, $T$, is a set of points, $S$, such that $X \in S$ if and only if $d(T, X) \le r$. If $X$ is a point on the boundary, $d(T, X) = r$.

Figure: neighborhoods of $T$, each $2r$ across, under the Manhattan, Euclidean, Max and HOBbit distances.
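Membership in a neighborhood is just the test $d(T, X) \le r$, so the same point can fall inside one metric's neighborhood and outside another's. A minimal sketch with hypothetical helper names and an arbitrarily chosen test point:

```python
import math

def manhattan(t, x):
    return sum(abs(a - b) for a, b in zip(t, x))

def euclidean(t, x):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(t, x)))

def max_distance(t, x):
    return max(abs(a - b) for a, b in zip(t, x))

def in_neighborhood(t, x, r, dist):
    """True if x lies in the radius-r neighborhood of t under the metric dist."""
    return dist(t, x) <= r

T, X, r = (0.0, 0.0), (0.6, 0.6), 1.0   # illustrative values
results = {
    "manhattan": in_neighborhood(T, X, r, manhattan),
    "euclidean": in_neighborhood(T, X, r, euclidean),
    "max": in_neighborhood(T, X, r, max_distance),
}
print(results)
```

Here the Manhattan distance is 1.2, the Euclidean distance is about 0.85, and the max distance is 0.6, so only the Manhattan neighborhood excludes the point.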
![Decision Boundary](http://slidetodoc.com/presentation_image/1b8eb1950b12999a8c33f481c3a6484f/image-8.jpg)
Decision Boundary

A decision boundary between points $A$ and $B$ is the locus of the points $X$ satisfying $d(A, X) = d(B, X)$.

Figure: points $A$ and $B$, a boundary point $X$ with $d(A, X)$ and $d(B, X)$ marked, and the regions $R_1$ and $R_2$ on either side of the boundary.

The decision boundary for HOBbit distance is perpendicular to the axis that makes the max distance.

Figure: decision boundaries for Manhattan, Euclidean and max distance, in panels labeled > 45 and < 45.
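To see that the boundary depends on the metric, one can classify a single point under several distances. The sketch below uses hypothetical names and hand-picked coordinates. It reports which of $A$ or $B$ is closer, or `"boundary"` when the two distances are equal.

```python
import math

def manhattan(p, q):
    return sum(abs(a - b) for a, b in zip(p, q))

def euclidean(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def max_distance(p, q):
    return max(abs(a - b) for a, b in zip(p, q))

def side(a, b, x, dist):
    """Return 'A', 'B', or 'boundary' depending on which of a, b is closer to x."""
    da, db = dist(a, x), dist(b, x)
    if math.isclose(da, db):
        return "boundary"
    return "A" if da < db else "B"

A, B, X = (0, 0), (4, 2), (3, 0)   # illustrative points
sides = {name: side(A, B, X, d) for name, d in
         [("manhattan", manhattan), ("euclidean", euclidean), ("max", max_distance)]}
print(sides)
```

For these points $X$ lies exactly on the Manhattan boundary (both distances are 3) but on $B$'s side under the Euclidean and max distances.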
![Minkowski Metrics](http://slidetodoc.com/presentation_image/1b8eb1950b12999a8c33f481c3a6484f/image-9.jpg)
Minkowski Metrics

$L_p$-metrics (aka Minkowski metrics): $d_p(X, Y) = \left( \sum_{i=1}^{n} w_i |x_i - y_i|^p \right)^{1/p}$ (weights $w_i$ assumed $= 1$)

Figure: unit disk boundaries for $p = 1$ (Manhattan), $p = 2$ (Euclidean), $p = 3, 4, \ldots$, and $p = \infty$ (chessboard). $p = \tfrac{1}{2}, \tfrac{1}{3}, \tfrac{1}{4}, \ldots$ ?

$d_{\max} \equiv \max_i |x_i - y_i|$ and $d_\infty \equiv \lim_{p \to \infty} d_p(X, Y)$.

Proof (sort of) that the two agree: we want $\lim_{p \to \infty} \left\{ \sum_{i=1}^{n} a_i^p \right\}^{1/p} = \max(a_i) \equiv b$. For $p$ large enough, the other $a_i^p \ll b^p$, since $y = x^p$ is increasingly convex, so $\sum_{i=1}^{n} a_i^p \approx k \, b^p$ ($k$ = multiplicity of $b$ in the sum). Hence $\left\{ \sum_{i=1}^{n} a_i^p \right\}^{1/p} \approx k^{1/p} \, b$, and $k^{1/p} \to 1$.
![P>1 Lp metrics](http://slidetodoc.com/presentation_image/1b8eb1950b12999a8c33f481c3a6484f/image-10.jpg)
P>1 Lp metrics

$L_q$ distance from $x = (x_1, x_2)$ to $y = (0, 0)$ for increasing $q$:

| x = (x1, x2) | q | Lq distance x to y |
| --- | --- | --- |
| (.5, .5) | 2 | .7071067812 |
| (.5, .5) | 4 | .5946035575 |
| (.5, .5) | 9 | .5400298694 |
| (.5, .5) | 100 | .503477775 |
| (.5, .5) | MAX | .5 |
| (.71, .71) | 2 | 1.0 |
| (.71, .71) | 3 | .8908987181 |
| (.71, .71) | 7 | .7807091822 |
| (.71, .71) | 100 | .7120250978 |
| (.71, .71) | MAX | .7071067812 |
| (.99, .99) | 2 | 1.4000714267 |
| (.99, .99) | 8 | 1.0796026553 |
| (.99, .99) | 100 | .9968859946 |
| (.99, .99) | 1000 | .9906864536 |
| (.99, .99) | MAX | .99 |
| (1, 1) | 2 | 1.4142135624 |
| (1, 1) | 9 | 1.0800597389 |
| (1, 1) | 100 | 1.0069555501 |
| (1, 1) | 1000 | 1.0006933875 |
| (1, 1) | MAX | 1 |
| (.9, .1) | 2 | .9055385138 |
| (.9, .1) | 9 | .900003 |
| (.9, .1) | 100 | .9 |
| (.9, .1) | 1000 | .9 |
| (.9, .1) | MAX | .9 |
| (3, 3) | 2 | 4.2426406871 |
| (3, 3) | 3 | 3.7797631497 |
| (3, 3) | 8 | 3.271523198 |
| (3, 3) | 100 | 3.0208666502 |
| (3, 3) | MAX | 3 |
| (90, 45) | 6 | 90.232863532 |
| (90, 45) | 9 | 90.019514317 |
| (90, 45) | 100 | 90 |
| (90, 45) | MAX | 90 |
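The convergence of the $L_q$ distance to the max distance can be reproduced numerically. The sketch below factors out the largest coordinate difference $b$ before raising to the power $q$, mirroring the $k^{1/q} \, b$ argument and avoiding overflow for large $q$. The function name is an illustrative choice.

```python
import math

def lq_distance(x, y, q):
    """L_q distance computed as b * (sum((a_i / b) ** q)) ** (1 / q), with b = max a_i."""
    diffs = [abs(a - b) for a, b in zip(x, y)]
    b = max(diffs)
    if b == 0 or math.isinf(q):
        return b
    return b * sum((d / b) ** q for d in diffs) ** (1.0 / q)

# x = (3, 3), y = (0, 0): the distance falls from 3 * sqrt(2) toward 3 as q grows
for q in (2, 3, 8, 100, math.inf):
    print(q, lq_distance((3, 3), (0, 0), q))
```

With both coordinates equal, $k = 2$ and the distance is exactly $3 \cdot 2^{1/q}$, which shows the factor $k^{1/q}$ shrinking to 1.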
![P<1 Lp metrics](http://slidetodoc.com/presentation_image/1b8eb1950b12999a8c33f481c3a6484f/image-11.jpg)
P<1 Lp metrics

$L_q$ distance from $x$ to $y = (0, 0)$ for decreasing $q$:

| q | x = (.1, .1) | x = (.5, .5) | x = (.9, .1) |
| --- | --- | --- | --- |
| 1 | | | |
| .8 | .238 | 1.19 | 1.098 |
| .4 | .566 | 2.83 | 2.1445 |
| .2 | 3.2 | 16 | 10.82 |
| .1 | 102 | 512 | 326.27 |
| .04 | 3355443 | 16777216 | 10312196.962 |
| .02 | 112589990684263 | 5.63E+14 | 341871052443154 |
| .01 | 1.2676E+29 | 6.34E+29 | 3.8E+29 |
| 2 | .141421356 | .7071 | .906 |

$d_{1/p}(X, Y) = \left( \sum_{i=1}^{n} |x_i - y_i|^{1/p} \right)^{p}$

For $p = 0$ (limit as $p \to 0$), the distance doesn't exist (it does not converge).
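The rapid growth for exponents below 1 is easy to reproduce. A minimal sketch, with an illustrative function name, that evaluates the same formula for fractional $q$:

```python
def lq_distance(x, y, q):
    """(sum |x_i - y_i| ** q) ** (1 / q), here used with 0 < q < 1."""
    return sum(abs(a - b) ** q for a, b in zip(x, y)) ** (1.0 / q)

origin = (0.0, 0.0)
for q in (0.8, 0.4, 0.2, 0.1, 0.04, 0.02, 0.01):
    print(q, lq_distance((0.5, 0.5), origin, q))
```

For a point with two equal coordinates $c$, the value is $c \cdot 2^{1/q}$, so it doubles each time $1/q$ increases by one. This is consistent with the limit as $q \to 0$ not existing.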
![Min dissimilarity function](http://slidetodoc.com/presentation_image/1b8eb1950b12999a8c33f481c3a6484f/image-12.jpg)
Min dissimilarity function

The $d_{\min}$ function, $d_{\min}(X, Y) = \min_{i=1}^{n} |x_i - y_i|$, is strange. It is not even a pseudo-metric.

Figure: the unit disk of $d_{\min}$.

And the neighborhood of the blue point relative to the red point (the set of points closer to the blue than the red) is strangely shaped!

http://www.cs.ndsu.nodak.edu/~serazi/research/Distance.html
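A pseudo-metric may assign distance 0 to distinct points, but it must still satisfy the triangle inequality. A small counterexample, with points chosen here for illustration, shows $d_{\min}$ failing it:

```python
def d_min(x, y):
    """Smallest coordinate-wise absolute difference."""
    return min(abs(a - b) for a, b in zip(x, y))

X, Y, Z = (0, 0), (0, 1), (1, 1)   # illustrative points

d_xy = d_min(X, Y)   # 0: X and Y share their first coordinate
d_yz = d_min(Y, Z)   # 0: Y and Z share their second coordinate
d_xz = d_min(X, Z)   # 1: X and Z differ by 1 in both coordinates

triangle_holds = d_xy + d_yz >= d_xz
print(d_xy, d_yz, d_xz, triangle_holds)
```

Going through $Y$ costs nothing, yet the direct distance from $X$ to $Z$ is 1, so the triangle inequality is violated.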
![Other Interesting Metrics](http://slidetodoc.com/presentation_image/1b8eb1950b12999a8c33f481c3a6484f/image-13.jpg)
Other Interesting Metrics

- Canberra metric: $d_c(X, Y) = \sum_{i=1}^{n} |x_i - y_i| / (x_i + y_i)$ (normalized Manhattan distance)
- Squared Chord metric: $d_{sc}(X, Y) = \sum_{i=1}^{n} (\sqrt{x_i} - \sqrt{y_i})^2$ (already discussed as $L_p$ with $p = 1/2$)
- Squared Chi-squared metric: $d_{chi}(X, Y) = \sum_{i=1}^{n} (x_i - y_i)^2 / (x_i + y_i)$
- Scalar Product metric: $d(X, Y) = X \cdot Y = \sum_{i=1}^{n} x_i \, y_i$
- Hyperbolic metrics (which map infinite space 1-1 onto a sphere)

Which are rotationally invariant? Translation invariant? Other?

Some notes on distance functions can be found at http://www.cs.ndsu.NoDak.edu/~datasurg/distance_similarity.pdf
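The formulas above translate directly into code. The following sketch assumes strictly positive coordinates, so that the denominators $x_i + y_i$ and the square roots are well defined. The function names and sample vectors are illustrative.

```python
import math

def canberra(x, y):
    return sum(abs(a - b) / (a + b) for a, b in zip(x, y))

def squared_chord(x, y):
    return sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(x, y))

def squared_chi_squared(x, y):
    return sum((a - b) ** 2 / (a + b) for a, b in zip(x, y))

def scalar_product(x, y):
    return sum(a * b for a, b in zip(x, y))

X, Y = (1.0, 2.0, 3.0), (2.0, 2.0, 1.0)   # illustrative positive vectors
print(canberra(X, Y), squared_chord(X, Y), squared_chi_squared(X, Y), scalar_product(X, Y))
```

Note that the scalar product grows as vectors become more aligned, so it behaves as a similarity rather than a dissimilarity.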