Space Filling Curves and Functional Contours Database analysis

Database analysis can be broken down into 2 areas, Querying and Data Mining. Data Mining can be broken down into 2 areas, Machine Learning and Association Rule Mining. Machine Learning can be broken down into 2 areas, Clustering and Classification. Clustering can be broken down into 2 areas, Isotropic (round clusters) and Density-based.
Machine Learning usually begins by identifying Near Neighbor Set(s), NNS. In Isotropic Clustering, one identifies round sets (disk-shaped NNSs about a center). In Density Clustering, one identifies cores (dense round NNSs) and then pieces them together. In any Classification based on continuity, we classify a sample based on its NNS class histogram (aka kNN), or we identify isotropic NNSs of centroids (k-means), or we build decision trees with training leaf sets and use them to classify samples that fall to that leaf, or we find class boundaries (e.g., SVM) which distinguish NNSs in one class from NNSs in another.
The basic definition of continuity from elementary calculus shows that NNSs are fundamental: $\forall \varepsilon>0\ \exists \delta>0 : d(x,a)<\delta \Rightarrow d(f(x),f(a))<\varepsilon$, or: for every NNS about f(a), there exists a NNS about a that maps inside it. So NNS Search is a fundamental problem to be solved.
We discuss NNS Search from a vertical data point of view. With vertically structured data, the only neighborhoods that are easily determined are the cubic or Max neighborhoods ($L_\infty$ disks), yet usually we want Euclidean disks. We develop techniques to circumscribe Euclidean disks using the intersections of contour sets, the main ones being coordinate projection contours, the intersections of which form $L_\infty$ disks.
First we review the standard "space filling curves", Peano (Z) and Hilbert. In both cases, as the gridding gets finer and finer, each point on the curve converges to a point in the square, and those points densely fill the square (no empty spaces of any size), which is why they are called "space filling curves".
Compared with the raster orderings of a gridding of the square, these two methods may have advantages: Peano ordering preserves distance better than a raster ordering (not as many massive jumps), and Hilbert ordering preserves distance even better than Peano (it always moves to a neighbor). Recall that choosing a pixel or voxel ordering is the first step in creating vertical P-tree spatial data. In any Geospatial Analysis, some ordering of the pixels (voxels in 3-D) is required. Which one is best may depend upon what the definition of best is and what data area is being analyzed.
After a brief look at space filling curves, we treat functional contours. How are functional contours related to space filling curves? With space filling curves, we try to cover a "space" (2-D space) with a 1-D curve (up to the pixelization of the space). That is, we try to "fill in" the space with a function from the real line. In functional contouring, we consider roughly the opposite, namely what gets mapped to a single point by a function into the real line (that is, what the preimage of a point looks like). Familiar examples include isobars (which points get mapped to the same pressure value), isotherms, etc.
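The locality claims above can be checked on a small grid. The following is a minimal sketch, not taken from the slides: it assumes an 8×8 grid, measures each step of a walk by Manhattan distance, and uses the standard iterative Hilbert index-to-cell conversion. All function names are illustrative.

```python
def raster_xy(d, n):
    """Row-by-row walk: cell visited at step d of an n x n grid."""
    return d % n, d // n

def peano_z_xy(d, bits):
    """Z-order (called Peano in these slides): de-interleave the bits of d."""
    x = y = 0
    for k in range(bits):
        x |= ((d >> (2 * k)) & 1) << k
        y |= ((d >> (2 * k + 1)) & 1) << k
    return x, y

def hilbert_xy(d, n):
    """Standard iterative Hilbert index-to-cell conversion (n a power of 2)."""
    x = y = 0
    t, s = d, 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:                      # rotate/flip the current quadrant
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y

def step_sizes(walk):
    """Manhattan length of every step between consecutive cells."""
    return [abs(x1 - x0) + abs(y1 - y0)
            for (x0, y0), (x1, y1) in zip(walk, walk[1:])]

def summarize(n=8):
    """Per ordering: (largest step, number of steps of length >= n)."""
    bits = n.bit_length() - 1
    walks = {
        "raster": [raster_xy(d, n) for d in range(n * n)],
        "peano_z": [peano_z_xy(d, bits) for d in range(n * n)],
        "hilbert": [hilbert_xy(d, n) for d in range(n * n)],
    }
    out = {}
    for name, walk in walks.items():
        steps = step_sizes(walk)
        out[name] = (max(steps), sum(1 for s in steps if s >= n))
    return out

if __name__ == "__main__":
    for name, (largest, big_jumps) in summarize(8).items():
        print(name, "largest step:", largest, "jumps >= n:", big_jumps)
```

Under this particular measure the Z-order walk has fewer full-width jumps than raster, and the Hilbert walk has none; other locality measures may rank the orderings differently.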

Hilbert Ordering?
• In 2 dimensions, Peano ordering is 2×2-recursive z-ordering (raster ordering)
• Hilbert ordering is 4×4-recursive tuning-fork ordering (H-trees have fanout=16)

[Figure: tuning forks oriented down, right, left, and up; each visits its 16 cells in the order 0123456789ABCDEF.]
Coordinates of a tuning fork (upper-left) depend on ancestry. (x, y) = (ggrrbb, ggrrbb), i.e., each level contributes 2 bits to x and 2 bits to y. If your parent points Down and you are the H node in your tuning fork, your 2-bit contribution is given by:
H   row(x)  col(y)
0   00      00
1   00      01
2   01      01
3   01      00
4   10      00
5   11      00
6   11      01
7   10      01
8   10      10
9   11      10
A   11      11
B   10      11
C   01      11
D   01      10
E   00      10
F   00      11
Lookup tables for Up, Left, and Right parents are similar.
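The Down-parent lookup table can be held as a small dictionary. The sketch below only encodes that one table (the Up, Left, and Right tables are not given in the slides and are not guessed here) and checks the property that makes it a Hilbert-style walk: consecutive H nodes are grid neighbors and all 16 cells are visited once. The names `DOWN_PARENT`, `contribution`, and `is_neighbor_walk` are illustrative.

```python
# H node (hex digit) -> (row bits x, column bits y), from the Down-parent table.
DOWN_PARENT = {
    "0": (0b00, 0b00), "1": (0b00, 0b01), "2": (0b01, 0b01), "3": (0b01, 0b00),
    "4": (0b10, 0b00), "5": (0b11, 0b00), "6": (0b11, 0b01), "7": (0b10, 0b01),
    "8": (0b10, 0b10), "9": (0b11, 0b10), "A": (0b11, 0b11), "B": (0b10, 0b11),
    "C": (0b01, 0b11), "D": (0b01, 0b10), "E": (0b00, 0b10), "F": (0b00, 0b11),
}
ORDER = "0123456789ABCDEF"

def contribution(h):
    """2-bit (row, col) contribution of H node h under a Down parent."""
    return DOWN_PARENT[h]

def is_neighbor_walk():
    """True if every step 0->1->...->F moves to an adjacent grid cell."""
    cells = [contribution(h) for h in ORDER]
    return all(abs(r1 - r0) + abs(c1 - c0) == 1
               for (r0, c0), (r1, c1) in zip(cells, cells[1:]))

def layout():
    """4x4 grid (rows = x, columns = y) showing where each H node lands."""
    grid = [["?"] * 4 for _ in range(4)]
    for h in ORDER:
        r, c = contribution(h)
        grid[r][c] = h
    return grid

if __name__ == "__main__":
    for row in layout():
        print(" ".join(row))
    print("neighbor walk:", is_neighbor_walk())
```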

FUNCTIONAL CONTOURS: Given $f: R(A_1..A_n) \to Y$ (any range, but usually the Reals) and $S \subseteq Y$ (any subset of the range, but usually 1 real), define $contour(f,S) \equiv f^{-1}(S)$.
$graph(f) = \{ (a_1,\dots,a_n, f(a_1..a_n)) \mid (a_1..a_n) \in R \}$
[Figure: graph(f) over the $A_1..A_n$ space, with the subset S marked on the Y axis and contour(f, S) marked in the $A_1..A_n$ space.]
There is a DUALITY between functions, $f: R(A_1..A_n) \to Y$, and derived attributes, $A_f$ of R, given by $x.A_f \equiv f(x)$, where $Dom(A_f)=Y$.
[Figure: the table $R(A_1..A_n)$ mapped by f to Y, next to the extended table $R^*(A_1..A_n, A_f)$ whose extra column holds f(x).]
$Contour(A_f, S)$ = SELECT $A_1..A_n$ FROM $R^*$ WHERE $R^*.A_f \in S$.
If $S=\{a\}$, $f^{-1}(\{a\})$ is $Isobar(f, a)$.
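As an illustration of the definition and of the duality, here is a minimal sketch in which R is a list of tuples, f is any Python function, and S is given as a membership predicate (so that intervals are easy to express). The names `contour`, `isobar`, `extend`, and `select_contour` are illustrative, not from the slides.

```python
def contour(R, f, in_S):
    """contour(f, S) = f^{-1}(S): the tuples of R whose f-value lies in S."""
    return [x for x in R if in_S(f(x))]

def isobar(R, f, a):
    """Isobar(f, a) = f^{-1}({a})."""
    return contour(R, f, lambda y: y == a)

def extend(R, f):
    """R*: R with the derived attribute A_f appended as a last column."""
    return [x + (f(x),) for x in R]

def select_contour(R_star, in_S):
    """SELECT A1..An FROM R* WHERE R*.Af IN S."""
    return [row[:-1] for row in R_star if in_S(row[-1])]

if __name__ == "__main__":
    R = [(x, y) for x in range(4) for y in range(4)]
    f = lambda p: p[0] + p[1]
    print(isobar(R, f, 3))
    # Both routes to an interval contour give the same tuples.
    in_S = lambda v: 2 <= v <= 3
    print(contour(R, f, in_S) == select_contour(extend(R, f), in_S))
```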

Given a similarity, $s: R \times R \to$ Reals (e.g., $s(x,y)=s(y,x)$ and $s(x,x) \ge s(x,y)\ \forall x,y \in R$), an extension to disjoint subsets of R (e.g., single/complete/average link...), and $C \subseteq R$, a k-disk of C is:
$disk(C,k) \supseteq C$ : $|disk(C,k) \cap C'| = k$ and $s(x,C) \ge s(y,C)\ \forall x \in disk(C,k),\ y \notin disk(C,k)$.
Define its $skin(C,k) \equiv disk(C,k) - C$ (skin stands for the s k immediate neighbors; it is a kNNS of C),
$cskin(C,k) \equiv \bigcup$ of all skin(C,k)s (the closed skin), and $ring(C,k) = cskin(C,k) - cskin(C,k-1)$.
$disk(C,r) \equiv \{x \in R \mid s(x,C) \ge r\}$, $skin(C,r) \equiv disk(C,r) - C$,
$ring(C,r_2,r_1) \equiv disk(C,r_2) - disk(C,r_1) = skin(C,r_2) - skin(C,r_1)$.
Given a [pseudo] distance, d, rather than a similarity, just reverse all inequalities.
[Figure: for C = {a}, two concentric disks about a with radii $r_1$ and $r_2$; the ring is the region between them.]
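A small sketch of the radius-based definitions may help. It assumes a distance rather than a similarity (so the inequalities are reversed, as noted above), extends the distance to a set C by single link (minimum over members), and uses 1-D integer points purely for illustration.

```python
def dist_to_set(x, C, d):
    """Single-link extension of a point distance d to a set C."""
    return min(d(x, c) for c in C)

def disk(R, C, r, d):
    """disk(C, r) = {x in R | d(x, C) <= r} (distance form of the definition)."""
    return {x for x in R if dist_to_set(x, C, d) <= r}

def skin(R, C, r, d):
    """skin(C, r) = disk(C, r) - C."""
    return disk(R, C, r, d) - set(C)

def ring(R, C, r2, r1, d):
    """ring(C, r2, r1) = disk(C, r2) - disk(C, r1), with r2 >= r1."""
    return disk(R, C, r2, d) - disk(R, C, r1, d)

if __name__ == "__main__":
    R = set(range(11))
    C = {5}
    d = lambda x, y: abs(x - y)
    print(sorted(disk(R, C, 2, d)))      # points within distance 2 of C
    print(sorted(skin(R, C, 2, d)))      # the same, without C itself
    print(sorted(ring(R, C, 2, 1, d)))   # points at distance in (1, 2]
```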

A definition of Predicate trees (P-trees) based on functionals? (generalizes, but does not alter, previous definitions)
Given $f: R(A_1..A_n) \to Y$ and $S \subseteq Y$, define the uncompressed Functional-P-tree as $P_{f,S}$, a bit map given by $P_{f,S}(x)=1$ iff $f(x) \in S$. The predicate for the uncompressed $P_{f,S}$ is the set containment predicate, $f(x) \in S$. $P_{f,S}$ is a contour bit map (it bitmaps, rather than lists, the contour points).
If f is a local density (à la OPTICS) and $\{S_k\}$ is a partition of Y, then $\{f^{-1}(S_k)\}$ is a clustering! What partition $\{S_k\}$ of Y should be used? E.g., a binary partition (given by a threshold value)? In OPTICS, the $S_k$ are the intervals between crossing points of graph(f) and a threshold line; points below the threshold line are agglomerated into 1 noise cluster. Weather reporters use equi-width interval partitions (of barometric pressure or temperature).

Compressed Functional-P-trees
$^{(ls)}P_{f,S}$ (with equi-width leaf size, ls) is a compression of $P_{f,S}$ obtained by doing the following:
1. order or walk R (converts the bit map to a bit vector)
2. equi-width partition R into segments of size ls (ls = leaf size; the last one can be short)
3. eliminate and mask to 0 all pure-zero segments (via a Leaf Mask, or LM)
4. eliminate and mask to 1 all pure-one segments (via a Pure1 Mask, or PM)
Notes:
1. LM is an existential aggregation of R (1 iff that leaf has a 1-bit). Others? (default=existential)
2. There are partitionings other than equi-width (but that will be the default).
Doubly Compressed Functional-P-trees with equi-width leaf sizes $(ls_1, ls_2)$
Each leaf of $^{(ls_1)}P_{f,S}$ is an uncompressed bit vector and can be compressed the same way, giving $^{(ls_1,ls_2)}P_{f,S}$ ($ls_2$ is the 2nd equi-width segment size and $ls_2 \ll ls_1$). Recursive compression can continue ad infinitum: $^{(ls_1,ls_2,ls_3)}P_{f,S}$, $^{(ls_1,ls_2,ls_3,ls_4)}P_{f,S}$, ...
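The four compression steps can be sketched as follows. This is a single-level illustration under simple assumptions: the walk is just the order of the input list, bit vectors are Python lists of 0/1, and mixed leaves are stored verbatim. The function names and the storage layout are illustrative, not the P-tree implementation itself.

```python
def functional_bitmap(walk, f, in_S):
    """Uncompressed P_{f,S} along a walk: bit is 1 iff f(x) is in S."""
    return [1 if in_S(f(x)) else 0 for x in walk]

def compress(bits, ls):
    """One level of equi-width compression with leaf size ls.

    Returns (LM, PM, leaves): LM[i] = 1 iff leaf i has some 1-bit,
    PM[i] = 1 iff leaf i is all 1s; only mixed leaves are stored.
    """
    LM, PM, leaves = [], [], []
    for start in range(0, len(bits), ls):
        seg = bits[start:start + ls]     # the last segment may be short
        lm, pm = int(any(seg)), int(all(seg))
        LM.append(lm)
        PM.append(pm)
        if lm and not pm:
            leaves.append(seg)
    return LM, PM, leaves

def decompress(LM, PM, leaves, n, ls):
    """Rebuild the n-bit vector from the masks and the stored mixed leaves."""
    out, stored = [], iter(leaves)
    for k, (lm, pm) in enumerate(zip(LM, PM)):
        size = min(ls, n - k * ls)
        if not lm:
            out += [0] * size
        elif pm:
            out += [1] * size
        else:
            out += next(stored)
    return out

if __name__ == "__main__":
    bits = [0] * 8 + [1] * 8 + [0, 1, 1, 0, 0, 0, 0, 1] + [1, 1, 1]
    LM, PM, leaves = compress(bits, 8)
    print(LM, PM, leaves)
    print(decompress(LM, PM, leaves, len(bits), 8) == bits)
```

Applying `compress` again to each stored leaf with a smaller leaf size would give the doubly compressed form.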

BASIC P-trees
For $A_i$ Real and $f_{i,j}(x) \equiv$ the jth bit of the ith component, $x_i$: $\{ {}^{(*)}P_{f_{i,j},\{1\}} \equiv {}^{(*)}P_{i,j} \}_{j=b..0}$ are the basic (*)P-trees of $A_i$ (* = $ls_1, \dots, ls_k$, k=0...).
For $A_i$ Categorical and $f_{i,a}(x)=1$ if $x_i=a \in R[A_i]$, else 0: $\{ {}^{(*)}P_{f_{i,a},\{1\}} \equiv {}^{(*)}P_{i,a} \}_{a \in R[A_i]}$ are the basic (*)P-trees of $A_i$.
For $A_i$ real, the basic P-trees result from binary encoding of individual real numbers (categories). Encodings can be used for any attribute. Note that it is the binary encoding of real attributes which turns an n-tuple scan into a $\log_2(n)$-column AND (making P-tree technology scalable).
Next, we consider various contour functionals that are useful in Machine Learning, starting with Total Variation, TV.
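To make the "scan becomes a column AND" remark concrete, here is a minimal sketch for one non-negative integer attribute. It assumes uncompressed bit slices held as Python integers (bit t of a slice belongs to tuple t); the names `bit_slices`, `root_count`, and `mask_equal` are illustrative.

```python
def bit_slices(values, width):
    """Basic bit slices of one column: slices[j] has bit t set iff
    bit j of values[t] is 1."""
    return [sum(((v >> j) & 1) << t for t, v in enumerate(values))
            for j in range(width)]

def root_count(p):
    """Number of 1-bits in a bit vector stored as an int."""
    return bin(p).count("1")

def mask_equal(slices, v, n):
    """Mask of the tuples whose value equals v, using one AND per bit
    position (with the slice or its complement) instead of a tuple scan."""
    full = (1 << n) - 1
    mask = full
    for j, p in enumerate(slices):
        mask &= p if (v >> j) & 1 else (~p & full)
    return mask

if __name__ == "__main__":
    values = [5, 3, 5, 7, 0, 5]
    P = bit_slices(values, 3)
    m = mask_equal(P, 5, len(values))
    print(bin(m), root_count(m))
```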

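Total Variation of $R(A_1..A_n)$ about a. We use d as an index variable over the dimensions and i, j, k as bit-slice indexes.
\[
\begin{aligned}
TV(a) &= \sum_{x\in R}(x-a)\circ(x-a) = \sum_{x\in R}\sum_{d=1}^{n}\left(x_d^2 - 2a_dx_d + a_d^2\right) &(1)\\
&= \sum_{x\in R}\sum_{d=1}^{n}\Big(\sum_k 2^k x_{dk}\Big)^2 - 2\sum_{x\in R}\sum_{d=1}^{n} a_d\Big(\sum_k 2^k x_{dk}\Big) + |R||a|^2 &(2)\\
&= \sum_{x,d}\Big(\sum_i 2^i x_{di}\Big)\Big(\sum_j 2^j x_{dj}\Big) - 2\sum_{x\in R}\sum_{d=1}^{n} a_d\Big(\sum_k 2^k x_{dk}\Big) + |R||a|^2 &(3)\\
&= \sum_{x,d,i,j} 2^{i+j}x_{di}x_{dj} - 2\sum_{x,d,k}2^k a_d x_{dk} + |R||a|^2 &(4)\\
&= \sum_{x,d,i,j} 2^{i+j}x_{di}x_{dj} - 2\sum_d a_d \sum_{x,k}2^k x_{dk} + |R||a|^2 &(5)\\
&= \sum_{x,d,i,j} 2^{i+j}x_{di}x_{dj} - 2|R|\sum_d a_d\mu_d + |R|\sum_d a_da_d &(6)\\
&= \sum_{x,d,i,j} 2^{i+j}x_{di}x_{dj} + |R|\Big(-2\sum_d a_d\mu_d + \sum_d a_da_d\Big) &(7)
\end{aligned}
\]
Step (6) uses $\sum_{x,k}2^k x_{dk} = \sum_x x_d = |R|\mu_d$, where $\mu$ is the mean of R. In terms of root counts ($|P|$ denotes the number of 1-bits of P):
\[
TV(a) = \sum_{i,j,d} 2^{i+j}\,|P_{di}\wedge P_{dj}| \;-\; \sum_k 2^{k+1}\sum_d a_d\,|P_{dk}| \;+\; |R||a|^2
\]
Collecting the $|P_{dk}|$ terms (the i=j terms reduce to $2^{2k}|P_{dk}|$ because $x_{dk}^2=x_{dk}$):
\[
TV(a) = \sum_{i>j,\,d} 2^{i+j+1}\,|P_{di}\wedge P_{dj}| \;+\; \sum_{k,d}\left(2^{2k} - 2^{k+1}a_d\right)|P_{dk}| \;+\; |R|\left(a_1^2+\dots+a_n^2\right)
\]
Note that the first term (the only one involving dual bit-slice predicates) does not depend upon a at all! So it can be subtracted from TV(a), giving a simpler derived attribute with identical contours (just a lowered graph), which can be calculated simply from the basic P-tree root counts themselves (no preprocessing). Subtracting $TV(\mu)$ instead ($\mu$ = mean of R) also gives a function with identical contours (a High-Dimension-ready TV).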
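The root-count form of TV(a) can be checked numerically against the direct definition. This is a minimal sketch assuming non-negative integer attributes of a fixed bit width, with bit slices stored as Python integers; the helper names are illustrative.

```python
def bit_slices(column, width):
    """slices[k] has bit t set iff bit k of column[t] is 1."""
    return [sum(((v >> k) & 1) << t for t, v in enumerate(column))
            for k in range(width)]

def root_count(p):
    return bin(p).count("1")

def tv_direct(R, a):
    """TV(a) = sum over x in R of (x - a) o (x - a)."""
    return sum((xd - ad) ** 2 for x in R for xd, ad in zip(x, a))

def tv_rootcounts(R, a, width):
    """TV(a) from root counts only:
    sum_{i,j,d} 2^(i+j) |P_di ^ P_dj| - sum_{k,d} 2^(k+1) a_d |P_dk| + |R||a|^2
    """
    total = len(R) * sum(ad * ad for ad in a)
    for d, ad in enumerate(a):
        P = bit_slices([x[d] for x in R], width)
        for i in range(width):
            total -= (2 ** (i + 1)) * ad * root_count(P[i])
            for j in range(width):
                total += (2 ** (i + j)) * root_count(P[i] & P[j])
    return total

if __name__ == "__main__":
    R = [(1, 2), (3, 0), (2, 2), (7, 5)]
    a = (2, 1)
    print(tv_direct(R, a), tv_rootcounts(R, a, width=3))
```

Only the middle and last terms depend on a, so in practice the double-slice term would be computed once and reused.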

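From equation 7,
\[
TV(a) = \sum_{x,d,i,j} 2^{i+j}x_{di}x_{dj} + |R|\Big(-2\sum_d a_d\mu_d + \sum_d a_da_d\Big)
\]
\[
f(a) = TV(a)-TV(\mu) = |R|\Big(-2\sum_d (a_d\mu_d - \mu_d\mu_d) + \sum_d (a_da_d - \mu_d\mu_d)\Big) = |R|\Big(\sum_d a_d^2 - 2\sum_d \mu_d a_d + \sum_d \mu_d^2\Big) = |R|\,|a-\mu|^2
\]
So $f(\mu)=0$. Letting $g(a) \equiv HDTV(a) = \ln(f(a)) = \ln|R| + \ln|a-\mu|^2$ and taking partial derivatives,
\[
\frac{\partial g}{\partial a_d}(a) = \frac{2(a-\mu)_d}{|a-\mu|^2}, \qquad \nabla g(a) = \frac{2}{|a-\mu|^2}\,(a-\mu)
\]
The gradient of f vanishes iff $a=\mu$ (g itself is undefined there, since $f(\mu)=0$). The gradient length of g, $2/|a-\mu|$, depends only on the length of $a-\mu$, so isobars are hyper-circles centered at $\mu$. The gradient has the form $h(r) = 2/r$ along any ray from $\mu$. Integrating, we get that g(a) has the form $2\ln|a-\mu|$ (plus a constant) along any coordinate direction (in fact, any radial direction from $\mu$), so the shape of graph(g) is a funnel.
What interval endpoints give an exact $\varepsilon$-contour in feature space? The way to get an exact $\varepsilon$-contour is to move in and out along $a-\mu$ by $\varepsilon$ to the inner point, $b=\mu+(1-\varepsilon/|a-\mu|)(a-\mu)$, and the outer point, $c=\mu+(1+\varepsilon/|a-\mu|)(a-\mu)$. Then take f(b) and f(c) as the lower and upper endpoints of the vertical interval (shown in red on the slide), and use EIN formulas on that interval to get a mask of the exact contour.
[Figure: funnel-shaped graph of g with the interval [f(b), f(c)] marked on the vertical axis and the $\varepsilon$-contour (radius $\varepsilon$ about a) marked in feature space, with b, a, c along the ray from $\mu$.]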
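The interval endpoints can be sketched in code. The example below computes f(b) and f(c) from a, $\mu$, and $\varepsilon$ and masks the points whose f-value falls in that interval. Two things are assumptions of this sketch, not statements from the slides: the mask is built by a plain scan (the EIN formulas are not reproduced here), and the inner radius is clamped at 0 when $\varepsilon > |a-\mu|$.

```python
import math

def mean(R):
    n = len(R[0])
    return tuple(sum(x[d] for x in R) / len(R) for d in range(n))

def f(x, mu, size):
    """f(x) = TV(x) - TV(mu) = |R| * |x - mu|^2."""
    return size * sum((xd - md) ** 2 for xd, md in zip(x, mu))

def ring_bounds(a, mu, size, eps):
    """(f(b), f(c)) for the inner point b and outer point c on the ray
    from mu through a; the inner radius is clamped at 0 (assumption)."""
    r = math.dist(a, mu)
    return size * max(r - eps, 0.0) ** 2, size * (r + eps) ** 2

def ring_mask(R, a, eps):
    """1 for points whose f-value lies in [f(b), f(c)], else 0."""
    mu, size = mean(R), len(R)
    lo, hi = ring_bounds(a, mu, size, eps)
    tol = 1e-9 * max(hi, 1.0)            # slack for floating-point rounding
    return [1 if lo - tol <= f(x, mu, size) <= hi + tol else 0 for x in R]

if __name__ == "__main__":
    R = [(x, y) for x in range(8) for y in range(8)]
    a, eps = (6, 5), 1.5
    mask = ring_mask(R, a, eps)
    print("candidates kept:", sum(mask), "of", len(R))
```

By the triangle inequality, every point within $\varepsilon$ of a has $|x-\mu|$ within $\varepsilon$ of $|a-\mu|$, so the mask never prunes a true $\varepsilon$-neighbor of a; it only prunes points that cannot be one.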

Finally, we note that the very same vertical pruning procedure can be used with any functional. It is most efficient when the functional requires no additional preprocessing (e.g., the dimension projection functionals already have all their basic P-trees generated for us, since their basic P-trees are precisely the basic P-trees of that dimension), and it can still be used when preprocessing is required (i.e., additional ANDing and root counting just to generate the derived attribute values). The procedure is always as shown in the previous slide. To classify a:
1. Calculate basic P-trees for the derived attribute column of each training point.
2. Calculate b and c (these depend upon a and the chosen $\varepsilon$).
3. Mask the feature space for those points with derived attribute value in the EIN ring [f(b), f(c)] (that is the precise $\varepsilon$-contour set).
4. Use that mask to prune.
5. If the root count of the candidate set is scan-able, proceed to scan and assign votes; else look for another pruning functional (note that the combination of HDTV and all dimension projections will always suffice).
[Figure: the interval [f(b), f(c)] and the $\varepsilon$-contour (radius $\varepsilon$ about a), with b, a, c along the ray from $\mu$.]

Graph of TV, TV-TV($\mu$), and HDTV
[Figure: surfaces over the X-Y plane (axes 1..5), with marked values TV(x15), TV($\mu$)=TV(x33), and TV(x15)-TV($\mu$).]

Parameters for Vertical Structuring and Smoothing (zooming) of $R(A_1..A_n)$
The parameters defining the conversion of horizontal tables to P-trees are:
1. method of ordering R (walking R) (e.g., $(i_1..i_n)$-Raster, $(i_1..i_n)$-Peano, $(i_1..i_n)$-Hilbert, etc.)
2. leaf sizes (e.g., choice of the number of levels, k, and a leaf size for each level, $(ls_1, \dots, ls_k)$)
Note: How to store these P-trees on disk is an important implementation parameter, but not a theoretical solution space parameter.
Given the Basic P-tree set, BPT $\equiv \{ {}^{(ls_1,\dots,ls_k)}P_{i,j} \mid$ i = column, j = bit position or category $\}$, a P-tree smoothing taxonomy requires two more solution space parameters:
3. smoothing level = sl (# of low-order bits)
4. rollup or aggregation method (the predicate) (e.g., count, existential, universal, rank, etc.)
So the vertical smoothing solution space has four dimensions.
Note: Smoothing is clustering (with a particular goal), and choosing good initial partition-clustering centroid sets can be done by smoothing (and then choosing a representative point in each smoothing component or cluster, e.g., the mean).
[Figure: the four solution-space axes: ordering method of R (walk of R); leaf size sequence $(ls_1, \dots, ls_k)$; smoothing level, sl; rollup or aggregation method.]

What is the goal of smoothing $R(A_1..A_n)$? This is the first question to be answered. One answer is that smoothing can increase the speed of DM algorithm processing and solve the curse of cardinality (essentially as a better alternative to random sub-sampling). In this direction, we think of smoothing as pre-clustering rather than random selection, to reduce the cardinality of the table being mined, hopefully without hiding exceptional data (as random sampling almost always does). So this application of smoothing requires that the smoothing algorithm be fast (or be amortizable) and also, if possible, be sensitive to exceptional data; else why do it? A related direction for smoothing is that it can be a method of pruning to reduce the computational complexity of an algorithm, i.e., to produce only the strong preclusters. Then the points outside these cores can be individually scanned, e.g., to find exceptions or to be processed in some other way. In this direction, the smoothing is used to isolate those dense core neighborhoods that can be treated as "one unit" and therefore vastly increase the processing speed (over examining each individual point in each core). Other goals of Smoothing?

Note that the walk order issue is easily described using functions as well. A walk of R can be thought of as an ordering of the tuples of R together with a "step" numbering of those tuples in that order (i.e., assigning a step number to each tuple in the walk: 1, 2, 3, ...). A walk, $w: R \to \{1, 2, 3, \dots\}$, is itself a function on R and defines contours. Since it is a candidate key (uniqueness property), every isobar, $w^{-1}(n)$, is a singleton, {x} (where x is the nth step of the walk). Interval contours are sets of consecutive steps in the walk. For a walk in which every step moves to a grid neighbor (e.g., a Hilbert walk), the # of steps from x to y is always an upper bound on the Manhattan distance (if x and y are close in steps, they are close in Manhattan distance); walks with jumps, such as raster or Peano walks, do not guarantee this.
[Figure: the same labeled point set traversed by three walks: Mixed walk, Mw; y-first Hilbert walk, yHw; x-first Peano walk, xPw.]
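A walk as a function, its interval contours, and the step-count bound can be sketched as follows. The snake (boustrophedon) walk used here is a hypothetical stand-in for any walk that always steps to a grid neighbor; it is not one of the walks on the slides. The raster walk is included to show a walk with jumps.

```python
def snake_walk(n):
    """Left-to-right on even rows, right-to-left on odd rows, so that
    every step moves to an adjacent cell (hypothetical example walk)."""
    walk = []
    for y in range(n):
        xs = range(n) if y % 2 == 0 else range(n - 1, -1, -1)
        walk += [(x, y) for x in xs]
    return walk

def raster_walk(n):
    return [(x, y) for y in range(n) for x in range(n)]

def step_number(walk):
    """The walk as a function w: R -> {1, 2, 3, ...}."""
    return {p: k for k, p in enumerate(walk, start=1)}

def interval_contour(walk, lo, hi):
    """w^{-1}([lo, hi]): the tuples visited at steps lo..hi inclusive."""
    return walk[lo - 1:hi]

def steps_bound_manhattan(walk):
    """True iff |steps from p to q| >= Manhattan distance for all pairs."""
    return all(abs(q[0] - p[0]) + abs(q[1] - p[1]) <= j - i
               for i, p in enumerate(walk)
               for j, q in enumerate(walk) if j > i)

if __name__ == "__main__":
    print(interval_contour(snake_walk(4), 4, 6))
    print("snake:", steps_bound_manhattan(snake_walk(4)))
    print("raster:", steps_bound_manhattan(raster_walk(4)))
```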

[Figure: the same labeled point set traversed by four more walks: x-first Hilbert walk, xHw; y-first Raster walk, yRw; x-first Raster walk, xRw; y-first Peano order walk.]

Mixed Walk, Uncompressed, 2-bit Count Smoothing (Mw()2C)
This is smoothing using uncompressed P-trees with count aggregation on 2-lo-grid cells.
A j-hi grid is a grid of cells resulting from using the j high-order bits to identify cells and the rest to walk the interior of each cell. j-lo uses the j low-order bits to walk cell interiors and the rest to identify cells. j-hi gives a square pattern of cells and j-lo gives square cells. When (and only when) the space is square (n×n space) are they equal (j-lo = (b-j)-hi, where b = bitwidth(n)).
Mw()2C creates a 2-lo-grid count histogram and is order independent, but requires a 56-tuple multi-scan (or use root counts of each value P-tree?).
[Figure: the 56 labeled points with their key and bit-slice table, the 2-lo grid with per-cell counts, and a histogram of counts (vertical axis 0 to 16) over the grid cells (0,0), (0,1), (1,1), (3,0), (2,1), (0,2), (1,2), (0,3), (1,3), (2,3), (3,3).]
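Count smoothing on a j-lo grid can be sketched directly on horizontal data. This minimal example assumes 2-D points with non-negative integer coordinates and identifies each cell by dropping the j low-order bits of every coordinate; the points and the name `lo_grid_counts` are illustrative, not the slide's data set.

```python
from collections import Counter

def lo_grid_counts(points, j):
    """Count histogram over j-lo grid cells: the j low-order bits walk the
    cell interior, the remaining high-order bits identify the cell."""
    return Counter((x >> j, y >> j) for x, y in points)

if __name__ == "__main__":
    points = [(0, 0), (1, 3), (2, 2), (5, 1), (4, 0), (7, 7)]
    hist = lo_grid_counts(points, 2)
    for cell, count in sorted(hist.items()):
        print(cell, count)
    # The result does not depend on the order in which points are walked.
    print(lo_grid_counts(list(reversed(points)), 2) == hist)
```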

Mw()1C produces very accurate smoothing, but involves (expensive?) multiple bit-column scan processing. Even calculating root counts of 1-lo cells may be expensive? Trade-off? Give up accuracy for speed: use LMs instead of uncompressed bit slices. See next slide.
[Figure: key and bit-slice table for the 56 labeled points, and the point layout in the grid.]

Mw(8)2C scans the 1st-level LM vectors instead of the full uncompressed bit slices. It depends on the order, but requires only a multi-scan of |LM| = 7 bits (not the entire uncompressed bit slice of 56 bits).
[Figure: key and bit-slice table, the point layout with per-cell counts, and the level-1 LM vectors with their leaves for the basic P-trees P13..P10 and P23..P20.]

Mw(8,2)1C using 2 levels of Leaf Maps (leaf sizes 8 and 2 respectively: the black LMs and the red LMs). (When there is no red LM shown, the leaf is pure, and one can tell which type of purity from the black/blue LM/PMs.)
[Figure: the point layout with per-cell counts on the 1-lo grid, and the two-level LM/PM vectors for the basic P-trees P13..P10 and P23..P20.]

Mw(8,2)1E (existential aggregation)
Note that this also requires a scan of the same LM set, so it has the same expense as count smoothing but gives up much information (the only advantage is that the result may be simpler to express: one predicate tree over the 1-lo grid cells).
[Figure: the point layout on the 1-lo grid and the two-level LM/PM vectors for the basic P-trees.]

Changing the walk order to y-first Hilbert and reconstructing the LM(8,2) P-trees. Note the compression.
[Figure: key and bit-slice table (P13, P12, P11, P10, P23, P22, P21, P20) for the points in y-first Hilbert order, the point layout, and the resulting two-level LM vectors.]

y-first Hilbert (8,2) 1-lo Count Vertical Smoothing, yH(8,2)1C
On these Hilbert-ordered basic P-trees, smoothing with count aggregation uses both levels of LMs (black and red).
[Figure: the point layout with per-cell counts on the 1-lo grid and the two-level LM vectors for the Hilbert-ordered basic P-trees.]

Change the walk order to Peano (x-first, or Z-ordered) and reconstruct the 2-level P-trees.
[Figure: key and bit-slice table (P13, P12, P11, P10, P23, P22, P21, P20) for the points in x-first Peano order, the point layout, and the resulting two-level LM vectors.]

xP(8,2)1C
Now, on these x-first Peano-ordered basic P-trees, smoothing with count aggregation uses both levels of LMs.
[Figure: the point layout with per-cell counts on the 1-lo grid and the two-level LM vectors for the Peano-ordered basic P-trees.]

Change the walk order to raster (y-first) and reconstruct the 2-level P-trees.
[Figure: key and bit-slice table (P13, P12, P11, P10, P23, P22, P21, P20) for the points in y-first raster order, the point layout, and the resulting two-level LM vectors.]

yR(8,2)1C
Now, on these y-first raster-ordered basic P-trees, smoothing with count aggregation uses both levels of LMs.
[Figure: the point layout with per-cell counts on the 1-lo grid and the two-level LM vectors for the raster-ordered basic P-trees.]

Comparing orderings of (8,2) 1-low-bit Count Smoothing
[Figure: four panels showing the per-cell counts on the 1-lo grid for yR(8,2)1C, xP(8,2)1C, M(8,2)1C, and yH(8,2)1C.]

yH(8)2C
On these Hilbert-ordered basic P-trees, smoothing with count aggregation uses the highest-level LMs only.
[Figure: the point layout with per-cell counts on the 2-lo grid and the LM vectors for the Hilbert-ordered basic P-trees.]

xP(8,2)2C
On these Peano-ordered basic P-trees, smoothing with count aggregation uses the highest-level LMs only.
[Figure: the point layout with per-cell counts on the 2-lo grid and the LM vectors for the Peano-ordered basic P-trees.]

y.R(8,2)2C
Smoothing with count aggregation on these y-first raster-ordered basic P-trees, using the highest-level LMs only.
[Figure: point grid and P-tree bit diagrams for slices 13..10 and 23..20; not recoverable from the extraction.]

Comparing orderings of (8,2) 2-lo-bit Count Smoothing
[Figure: four point-grid panels, y.R(8,2)2C, y.H(8,2)2C, x.P(8,2)2C and M(8,2)2C; the plots are not recoverable from the extraction.]

M(8,2)1C
[Figure: point grid and P-tree bit diagrams for slices 13..10 and 23..20; not recoverable from the extraction.]

y.H(8,2)1C
[Figure: point grid and P-tree bit diagrams for slices 13..10 and 23..20; not recoverable from the extraction.]

x.P(8,2)1C
Smoothing with count aggregation on these Peano-ordered basic P-trees, using the highest-level LMs only.
[Figure: point grid and P-tree bit diagrams for slices 13..10 and 23..20; not recoverable from the extraction.]

y.R(8,2)1C
Smoothing with count aggregation on these x-major raster-ordered basic P-trees, using the highest-level LMs only.
[Figure: point grid and P-tree bit diagrams for slices 13..10 and 23..20; not recoverable from the extraction.]

Comparing orderings of (8,2) 1-lo-bit Count Smoothing
[Figure: four point-grid panels, y.R(8,2)1C, y.H(8,2)1C, M(8,2)1C and x.P(8,2)1C; the plots are not recoverable from the extraction.]
As far as using this information to create a good initial cluster centroid set, I like Hilbert because the centroid at (3, 3) is strong and would attract I, so the initial clustering is very good (it doesn't necessarily need improvement).

Comments so far
Someone should look at y-first Peano, y.P (N-ordering). It might be a bit better, since it moves immediately from the lower left octant to the one above it?
How about y-first Raster (x-major sorting order)?
How about x-first Hilbert?
What about other aggregations?
What about universal? (Note that finding good initial centroids may work better using universal, since it identifies only very dense areas. But maybe too dense? That is, too few centroid areas?)
What about majority aggregation (1 iff the majority of the bits are 1-bits)? Note that one cannot use the LMs or PMs for this, but must recompute these bit vectors of size |LM| by examining each not-pure-zero leaf.
What about other rank aggregations (e.g., 3/4ths, i.e., 1 iff at least 3/4ths of the bits are 1-bits)? Of course, any rank aggregation takes a lot of additional processing, whereas existential and universal use the LM and PM vectors that are already computed and immediately available.
[Figure: sample point grid; not recoverable from the extraction.]
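The aggregation choices raised in these comments (count, existential, universal, majority, rank) can be compared on a toy bit vector. The following is a minimal sketch, assuming a P-tree level is represented as a flat Python list of bits cut into fixed-size leaves; the names `leaves` and `aggregate` and the `leaf_size` parameter are illustrative assumptions, not part of the original P-tree implementation.

```python
from fractions import Fraction

def leaves(bits, leaf_size):
    """Split a flat bit list into consecutive leaves of leaf_size bits (assumed layout)."""
    return [bits[i:i + leaf_size] for i in range(0, len(bits), leaf_size)]

def aggregate(bits, leaf_size, mode="count", rank=Fraction(3, 4)):
    """Aggregate each leaf to one value according to the chosen mode."""
    out = []
    for leaf in leaves(bits, leaf_size):
        ones = sum(leaf)
        if mode == "count":          # number of 1-bits in the leaf
            out.append(ones)
        elif mode == "existential":  # 1 iff the leaf is not pure-zero (the LM bit)
            out.append(int(ones > 0))
        elif mode == "universal":    # 1 iff the leaf is pure-one (the PM bit)
            out.append(int(ones == len(leaf)))
        elif mode == "majority":     # 1 iff more than half of the bits are 1
            out.append(int(2 * ones > len(leaf)))
        elif mode == "rank":         # 1 iff at least `rank` of the bits are 1
            out.append(int(ones >= rank * len(leaf)))
        else:
            raise ValueError("unknown mode: " + mode)
    return out

example = [0, 0, 0, 0,  1, 0, 1, 1,  1, 1, 1, 1,  0, 1, 0, 0]
for mode in ("count", "existential", "universal", "majority", "rank"):
    print(mode, aggregate(example, 4, mode))
```

Existential and universal only need to know whether a leaf is pure-zero or pure-one, which is why they can reuse the stored LM and PM vectors; majority and rank need the actual 1-count of every mixed leaf.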

Comments continued
x-first Hilbert ordering is as shown here. Below is x-first Raster and, below at right, y-first Raster.
[Figure: three point-grid panels (x-first Hilbert, x-first Raster, y-first Raster); not recoverable from the extraction.]

[Figure: basic P-trees of the 8 bit slices under four orderings: M(8) (m13..m10, m23..m20), y.H(8) (H13..H10, H23..H20), x.P(8) (P13..P10, P23..P20) and y.R(8) (X13..X10, X23..X20); the bit diagrams are not recoverable from the extraction.]
We have a wealth of classification and clustering tools now (also ARM). What methods leap to mind?

y.H key
[Table: the 56 points in y-first Hilbert order (1 5 6 2 3 4 8 7 b c g f e a 9 d h j n l o m k i U S P p Q O R T w v r s u t q D B A z N H F M L K J I G E C y x) with their basic P-tree bits P13, P12, P11, P10, P23, P22, P21, P20 and Hilbert step Hs. Only the first two columns survive the extraction: P13 = 1 for {w, v, r, s, u, t, q}, and P12 = 1 for {U, S, P, p, Q, O, R, T, w, r, s, u, t, D, B, A, z, N, H, F, M, L, K}.]
[Figure: y.H(8) basic P-trees H13..H10 and H23..H20 and the point grid; not recoverable from the extraction.]

ANDing algorithm: $\mathrm{res.LM} = \bigwedge_{T} \mathrm{LM} \wedge \bigwedge_{T' \wedge P} \mathrm{PM}'$. A result leaf exists iff res.LM = 1, and then $\mathrm{res.Leaf} = \bigwedge_{T} \mathrm{Leaf} \wedge \bigwedge_{T' \wedge P} \mathrm{Leaf}'$.

A fast isotropic clustering algorithm:
0. Remove noise using H-step gap analysis.
1. Use y.H(8) 2-lo cells as initial clusters (with strengths).
2. Expand the strongest cluster by 1 bit (but only if it does not collide with an existing cluster), e.g., expand (01 11) to (0 1). Revise the strengths and repeat step 2.
This gives us 3 noise points {q, v, w} and 5 clusters. They are the right ones, except that it doesn't separate out a tiny embedded cluster in octant (01, 01), but that is to be expected since the diameter of that embedded cluster is smaller than the 2-hi cell diameter.

x.P(8) and x.P(8,2)
[Figure: Peano-ordered basic P-trees P13..P10 and P23..P20 next to their (8,2) smoothed versions; the bit diagrams are not recoverable from the extraction.]

y.R(8)
[Figure: raster-ordered basic P-trees X13..X10 and X23..X20 next to their smoothed versions; the bit diagrams are not recoverable from the extraction.]

Implementation Specification
$R(A_1..A_n)$ has basic P-trees $(ls)P_{i,j}$, $i = 1..n$, where $j = 1..m_i$ if $A_i$ is real with bitwidth $m_i$ or if $A_i$ is categorical with categories $\{a_1..a_{m_i}\}$. Let $m = \sum_{i=1..n} m_i$. Sort $\{(ls)P_{i,j}\}$ by $i$ first, then $j$. Alias each P-tree by $P_k$, where $k$ is its sort position, $k = 1..m$.
Develop a simple transportable AND utility (assembler, C, C++ ...) that takes as input two m-bit vectors, P and T, and a 2-bit output-switch, S. P (Pattern) specifies which P-trees are to be involved (by a 1-bit). T (Truth) has a 1-bit iff P=1 and the P-tree itself (uncomplemented) is the operand. For those with P=1 and T=0, their complements are the operands.
Note: If a simple P-tree complement is called for (no ANDing), just set that P-bit to 1 and leave that T-bit at 0.
Let M be a state variable specifying the number of P-trees in the set (M must be at least m).
For the output-switch: if the first bit is 1, the result P-tree is to be stored as $(ls)P_{M+1}$, and if the second bit is 1, the root count is to be returned.
[Diagram: inputs P, T, S; outputs rc and $(ls)P_{M+1}$.]
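The interface described in this specification can be sketched on uncompressed bit vectors. This is only an illustration under stated assumptions: each P-tree is a plain Python list of bits (no tree compression), the P-tree set is a list whose length plays the role of the state variable M, and the function name `and_utility` is hypothetical.

```python
def and_utility(ptrees, P, T, S):
    """AND the P-trees selected by pattern P, complementing those with T = 0.

    ptrees: list of equal-length bit lists in sort order (its length stands in for M).
    P, T:   bit lists; P[k] = 1 selects ptrees[k]; T[k] = 1 uses it uncomplemented,
            T[k] = 0 (with P[k] = 1) uses its complement.
    S:      (store_bit, count_bit), the 2-bit output switch.
    Returns the root count if count_bit is 1, else None.
    """
    n = len(ptrees[0])
    result = [1] * n  # with no operands the result is the pure-1 tree
    for p, t, tree in zip(P, T, ptrees):
        if not p:
            continue
        operand = tree if t else [1 - b for b in tree]
        result = [r & b for r, b in zip(result, operand)]
    store_bit, count_bit = S
    if store_bit:
        ptrees.append(result)  # stored as the (M+1)-th P-tree
    return sum(result) if count_bit else None

trees = [[1, 1, 0, 0], [1, 0, 1, 0]]
print(and_utility(trees, P=[1, 1], T=[1, 0], S=(1, 1)))  # P1 AND P2'
print(trees[-1])
```

A simple complement follows the note in the specification: set the P-bit to 1 and leave the T-bit at 0, for example `and_utility(trees, [1, 0], [0, 0], (0, 1))`.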

[Table: the 56 sample points (keys 1..9, a..z, A..U) with their basic P-tree bit columns 13, 12, 11, 10, 23, 22, 21, 20; not recoverable from the extraction.]
Note that PM(B') = LM'(B) and LM(B') = PM'(B).
P = AND-input-pattern (the vertical slices involved in the AND).
T = AND-truth-pattern (the truth values of those involved).
ANDing Algorithm: $\mathrm{res.LM} = \bigwedge_{T} \mathrm{LM} \wedge \bigwedge_{T' \wedge P} \mathrm{PM}'$. A result leaf exists iff res.LM = 1, and then $\mathrm{res.Leaf} = \bigwedge_{T} \mathrm{Leaf} \wedge \bigwedge_{T' \wedge P} \mathrm{Leaf}'$ (if there are no operands, install pure 1 or create a PM).
E.g., 13^12: P = 1100 0000, T = 1100 0000, res.LM = LM13 ^ LM12 = 0001000. res.Leaf(3): same P and T, so res.LM = 1 ^ 1 = 1. (On the slide, the LMs are shown in red and the PMs in blue.) The PMs show that the 2 middle leaves are pure 1 (rc = 4 already) and that the last leaf of 13 is pure 1, so just retrieve the last leaf of 12 (01) and accumulate its 1-count into rc (= 5). ANDing the first leaves gives 01 ^ 10 = 00, so rc = 5.
[Figure: P-tree diagrams for bit slices 13..10 and 23..20 with their LMs and PMs; not recoverable from the extraction.]
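The LM/PM pruning used in this worked example can be sketched on a one-level toy structure. The representation below is an assumption made for illustration: each P-tree is a list of leaves, each leaf a tuple of bits, with LM (leaf not pure-zero) and PM (leaf pure-one) derived per leaf; `leaf_maps` and `and_root_count` are hypothetical names.

```python
def leaf_maps(tree):
    """Return (LM, PM): LM[i] = 1 iff leaf i has some 1-bit, PM[i] = 1 iff leaf i is all 1s."""
    lm = [int(any(leaf)) for leaf in tree]
    pm = [int(all(leaf)) for leaf in tree]
    return lm, pm

def and_root_count(trees, truth):
    """Root count of the AND of the trees; truth[k] False means tree k is complemented."""
    maps = [leaf_maps(t) for t in trees]
    rc = 0
    for i in range(len(trees[0])):
        # res.LM: AND of LM over uncomplemented operands and of PM' over complemented ones
        res_lm = all((lm[i] if t else 1 - pm[i]) for (lm, pm), t in zip(maps, truth))
        if not res_lm:
            continue  # the result leaf is pure-zero; no leaf needs to be touched
        leaf = [1] * len(trees[0][i])
        for tree, t in zip(trees, truth):
            leaf = [r & (b if t else 1 - b) for r, b in zip(leaf, tree[i])]
        rc += sum(leaf)  # a surviving leaf can still AND to 00, as 01 ^ 10 does
    return rc

A = [(0, 0), (0, 1), (1, 1), (1, 1)]
B = [(0, 0), (1, 0), (1, 1), (0, 1)]
print(and_root_count([A, B], [True, True]))   # rc(A ^ B)
print(and_root_count([A, B], [False, True]))  # rc(A' ^ B)
```

The sketch always re-ANDs the surviving leaves; the slide's example goes one step further and uses the PMs to skip retrieval when an operand leaf is known to be pure 1.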

[Table: the 56 sample points (keys 1..9, a..z, A..U) with their basic P-tree bit columns 13, 12, 11, 10, 23, 22, 21, 20 and the operand columns 13', 12, 20; not recoverable from the extraction.]
$\mathrm{res.LM} = \bigwedge_{T} \mathrm{LM} \wedge \bigwedge_{T' \wedge P} \mathrm{PM}'$. res.PM is unnecessary (it would have to be constructed). A result leaf exists iff res.LM = 1, and then $\mathrm{res.Leaf} = \bigwedge_{T} \mathrm{Leaf} \wedge \bigwedge_{T' \wedge P} \mathrm{Leaf}'$.
E.g., 13'^12^20: P = 1100 0001, T = 0100 0001, res.LM = LM12 ^ LM20 ^ PM'13 = 0001111. res.Leaf(3456): 10 ^ 10 = 10, rc = 12.
Note that PM(B') = LM'(B) and LM(B') = PM'(B).
P = AND-input-pattern (the vertical slices involved in the AND).
T = AND-truth-pattern (the truth values of those involved).
[Figure: P-tree diagrams for bit slices 13..10 and 23..20 with their LMs and PMs; not recoverable from the extraction.]

[Table and figure: the 56 sample points in x-first Peano (Z-ordering?) order with their basic P-tree bit columns 13..10 and 23..20, and the point grid; not recoverable from the extraction.]
Given $R(A_1..A_n)$ (vector space),
$$a = (a_1, \dots, a_n) = \Big(\sum_{k=b..0} a_{1,k}2^k, \;\dots,\; \sum_{k=b..0} a_{n,k}2^k\Big) \in \prod_{i=1..n} \mathrm{Dom}(A_i).$$
The $L_p$ r-Ring about $a$, $L_p\mathrm{Ring}(a,r)$, is $\{x \mid L_p(x,a)^p = r^p\}$, where $L_p(x,a)^p = \sum_{i=1..n} |x_i - a_i|^p$.
First we treat p=1 (Manhattan distance), and we consider only the polytant where all $a_i \le x_i$ (other polytants are handled similarly with the appropriate signs), so that all $|x_i - a_i| = x_i - a_i \ge 0$ and thus $L_1\mathrm{Ring}(a,r) = \{x \mid \sum_{i=1..n}(x_i - a_i) = r\}$, or all $x$ such that
$$\sum_{i=1..n}\sum_{k=b..0}(x_{i,k} - a_{i,k})2^k = \sum_{k=b..0} r_k 2^k$$
or
$$\sum_{i=1..n}\Big(\sum_{k=b..0} x_{i,k}2^k - \sum_{k=b..0} a_{i,k}2^k\Big) = \sum_{k=b..0} r_k 2^k$$
or
$$\sum_{i=1..n}\sum_{k=b..0} x_{i,k}2^k - \sum_{i=1..n}\sum_{k=b..0} a_{i,k}2^k = \sum_{k=b..0} r_k 2^k$$
or
$$\sum_{i=1..n}\sum_{k=b..0} x_{i,k}2^k = \sum_{k=b..0} r_k 2^k + \sum_{i=1..n}\sum_{k=b..0} a_{i,k}2^k$$
or
$$\sum_{i=1..n}\sum_{k=b..0} x_{i,k}2^k = \sum_{k=b..0}\Big(r_k + \sum_{i=1..n} a_{i,k}\Big)2^k$$
or
$$\sum_{k=b..0}\Big(\sum_{i=1..n} x_{i,k}\Big)2^k = \sum_{k=b..0}\Big(r_k + \sum_{i=1..n} a_{i,k}\Big)2^k.$$
Forming a P-tree mask for the set of x's that solve this equation seems difficult, because increasing one dimension requires decreasing another, etc.

[Table and figure: the 56 sample points in x-first Peano (Z-ordering?) order with their basic P-tree bit columns 13..10 and 23..20, and the point grid; not recoverable from the extraction.]
Given $R(A_1..A_n)$ (vector space),
$$a = (a_1, \dots, a_n) = \Big(\sum_{k=b..0} a_{1,k}2^k, \;\dots,\; \sum_{k=b..0} a_{n,k}2^k\Big) \in \prod_{i=1..n} \mathrm{Dom}(A_i).$$
The $L_p$ r-Ring about $a$, $L_p\mathrm{Ring}(a,r)$, is $\{x \mid L_p(x,a)^p = r^p\}$, where $L_p(x,a)^p = \sum_{i=1..n} |x_i - a_i|^p$.
Next we treat p=2 (squared Euclidean distance), where $L_2\mathrm{Ring}(a,r) = \{x \mid \sum_{i=1..n}(x_i - a_i)^2 = r^2\}$, or all $x$ such that
$$\sum_{i=1..n}\Big(\sum_{k=b..0}(x_{i,k} - a_{i,k})2^k\Big)^2 = \Big(\sum_{k=b..0} r_k 2^k\Big)^2.$$
The left side can be multiplied out and one can, again, seek a P-tree mask for the set of solutions, but it presents the same "trade off" problem, right?

[Table and figure: the 56 sample points in x-first Peano order with their Peano step values (ps), the ps bit slices ps7..ps0 and the basic P-tree bit columns 13..10 and 23..20, plus the point grid; not recoverable from the extraction.]
Dr. Scott is looking for a formula for the P-tree mask of the solution set based on the tree positions of a and x. Is there a closed formula for the P-tree mask of the L1 or L2 ring of radius r about a? Note that the walk of any quadrant is the same at a given level. That suggests that an approach based on j-lo cells is the way to do it (the mask of any such cell is trivial). The only concern here is when a is near the boundary of its cell (so in order to get a superset mask of its r-disk, one has to consider some neighboring cells). That suggests just using the EIN-disk about a?

Summary slide:
The last few slides have been an attempt to develop a formula to mask the Manhattan ring of a point at a given radius. This work builds on a discussion with Dr. Kirk Scott and his CATA-06 paper. It seems to come down to solving the equation (creating a P-tree mask for the solutions of):
$$\sum_{k=b..0}\Big(\sum_{i=1..n} x_{i,k}\Big)2^k = \sum_{k=b..0}\Big(r_k + \sum_{i=1..n} a_{i,k}\Big)2^k$$
This involves trading off among the dimensions (give from one, take from another). Can we form P-tree masks in this case?

One can use j-lo grid cells as Euclidean r-disk supersets. However, when the center, a, is not in the middle of the cell, these may not give small enough supersets to be pruned using scans. One could always take the j-lo cell and a selection of its bordering j-lo cells to make sure the Euclidean r-disk is completely super-setted. This would involve determining the subset of dimensions in which ai is close to zero (pushing a too close to those "low-side" cell borders) and close to 1 (pushing a too close to those "hi-side" cell borders).

The EIN r-disk about a (the $L_\infty$ r-disk about a) is the best cube-shaped superset, of course. But is its P-tree mask easily computed, i.e., is there preprocessing that makes it a matter of plugging the ai and r values into one formula with no additional ANDing or Root-Counting? That is, can all the vertical processing be preprocessing and therefore amortized over all a and r?

One additional note regarding EIN-disk super-setting of the Euclidean r-disk about a: that is what Taufik is now doing. First, he takes the r-disk-superscribing TV-contour of a (the thinnest TV-contour that contains the Euclidean r-disk about a). Then he prunes out a sufficient number of the "far away or halo points" by intersecting it with (ANDing masks) the r-disk-superscribing X i-ai-contours of a (either one i at a time or taking the cluster of all large i-ai values, if there is one). We note that the intersection of all Euclidean-r-disk-superscribing Xei-contours IS the EIN-disk of radius r about a.

Another approach, suggested by Dr. Scott, is to develop formulas for an approximation of the Euclidean disk (or ring) about an arbitrary center point based on where it sits in its j-lo cell. Once these formulas are developed for one cell, they are the same for the others (just change the hi-bits that are used to address the cell). (This is similar to the process described in the j-lo grid cell paragraph above.)
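The superset-then-prune idea in this summary can be illustrated with horizontal data. The sketch below is an assumption-laden stand-in: it builds one interval mask per coordinate, ANDs them to obtain the EIN ($L_\infty$) r-disk about a, and then scans only those candidates for the Euclidean r-disk. It does not use P-trees, and the function names are hypothetical.

```python
def coordinate_masks(points, a, r):
    """One bit mask per dimension: 1 iff |x_i - a_i| <= r (a coordinate-projection contour)."""
    return [[int(abs(p[i] - a[i]) <= r) for p in points] for i in range(len(a))]

def ein_disk_mask(points, a, r):
    """AND of the per-dimension masks: the L-infinity r-disk about a."""
    mask = [1] * len(points)
    for dim_mask in coordinate_masks(points, a, r):
        mask = [m & d for m, d in zip(mask, dim_mask)]
    return mask

def euclidean_disk(points, a, r):
    """Scan only the EIN-disk candidates and keep those within Euclidean distance r of a."""
    candidates = [p for p, m in zip(points, ein_disk_mask(points, a, r)) if m]
    return [p for p in candidates
            if sum((x - c) ** 2 for x, c in zip(p, a)) <= r * r]

pts = [(0, 0), (3, 3), (5, 5), (5, 2), (2, 6), (4, 4), (3, 5)]
print(ein_disk_mask(pts, (3, 3), 2))
print(euclidean_disk(pts, (3, 3), 2))
```

The points that survive the mask but fail the scan (the cube's corners) are the "halo" that the contour intersections are meant to thin out before scanning.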

[Table and figure: the 56 sample points (keys 1..9, a..z, A..U) with their basic P-tree bit columns 13..10 and 23..20 and the complement columns 13', 23', plus the point grid; not recoverable from the extraction.]
rc(13^23) = 5 < 32
rc(13^23') = rc(13) - rc(13^23) = 7 - 5 = 2 < 32
rc(13'^23) = rc(23) - rc(13^23) = 22 - 5 = 17 < 32
rc(13'^23') = 56 - 17 - 2 - 5 = 32 ≥ 32, so 3-lo cell (0, 0) is core.
j-lo core cell mining (assume a cell is core iff it is at least 50% full): What we note is that we need all patterns precomputed (and root-counted). Can we AND and RootCount, e.g., 13^23, 13'^23, 13^23', 13'^23' in 1 step (e.g., by concatenating, flipping and shifting before ANDing)? The next slide attempts to use the PeanoStep derived attribute in combination with this approach, since it walks by j-lo cells.
There are no 2-lo core cells in 3-lo cells (1, 1) or (1, 0), since there are fewer than 8 points in them (5 and 2, respectively). Are there 2-lo core cells in 3-lo cell (0, 1)? rc(13'^23^12^22) = 7 < 8 ...
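The four cell counts on this slide need only one AND: the other three follow from the two single root counts and the total. A minimal sketch, assuming the two high-order bit slices are plain bit lists and using the hypothetical names `cell_counts` and `core_cells`:

```python
def cell_counts(b1, b2):
    """Root counts of the four cells keyed by (bit of attribute 1, bit of attribute 2)."""
    n = len(b1)
    rc1, rc2 = sum(b1), sum(b2)
    rc11 = sum(x & y for x, y in zip(b1, b2))  # the only AND that is actually computed
    return {
        (1, 1): rc11,
        (1, 0): rc1 - rc11,            # rc(b1 ^ b2') = rc(b1) - rc(b1 ^ b2)
        (0, 1): rc2 - rc11,            # rc(b1' ^ b2) = rc(b2) - rc(b1 ^ b2)
        (0, 0): n - rc1 - rc2 + rc11,  # rc(b1' ^ b2') by inclusion-exclusion
    }

def core_cells(counts, threshold):
    """Cells whose count reaches the threshold (e.g., 50% of the cell's capacity)."""
    return sorted(cell for cell, c in counts.items() if c >= threshold)

# Synthetic slices reproducing the slide's counts: rc(13) = 7, rc(23) = 22, rc(13^23) = 5, N = 56.
b13 = [1] * 5 + [1] * 2 + [0] * 17 + [0] * 32
b23 = [1] * 5 + [0] * 2 + [1] * 17 + [0] * 32
counts = cell_counts(b13, b23)
print(counts)
print(core_cells(counts, 32))
```

The bit vectors here are synthetic (only their counts match the slide), since the original columns are not recoverable.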

[Table and figure: the 56 sample points in x-first Peano order with their Peano step values (ps), the ps bit slices ps7..ps0 and the basic P-tree bit columns p13..p10 and p23..p20, plus the point grid; not recoverable from the extraction.]
NOTE!! ps7 = p23, ps6 = p13, ps5 = p22, ps4 = p12, ps3 = p21, ps2 = p11, ps1 = p20, ps0 = p10.
So there is nothing in the basic P-tree set of the Peano Step Count derived attribute that we didn't already have in the basic P-tree set of the table itself!
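The NOTE on this slide is the bit-interleaving property of the Z-order walk, which a few lines of code make concrete. This sketch assumes two 4-bit attributes, with attribute 2 supplying the higher bit of each pair (matching ps7 = p23, ps6 = p13); `peano_step` is a hypothetical helper name.

```python
def peano_step(a1, a2, bits=4):
    """Interleave the bits of a1 and a2: ps(2k) = bit k of a1, ps(2k+1) = bit k of a2."""
    step = 0
    for k in range(bits):
        step |= ((a1 >> k) & 1) << (2 * k)
        step |= ((a2 >> k) & 1) << (2 * k + 1)
    return step

# Every bit slice of the step value is literally one of the original coordinate bit slices.
for a1 in range(16):
    for a2 in range(16):
        s = peano_step(a1, a2)
        for k in range(4):
            assert (s >> (2 * k)) & 1 == (a1 >> k) & 1      # ps(2k)   = p1k
            assert (s >> (2 * k + 1)) & 1 == (a2 >> k) & 1  # ps(2k+1) = p2k

print(peano_step(3, 0), peano_step(0, 3), peano_step(15, 15))
```

Because the step is only a re-arrangement of existing bits, its basic P-trees coincide with those of the table, which is the slide's conclusion.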

[Table: the 56 sample points in x-first Peano order with their basic P-tree bit columns 13..10 and 23..20; not recoverable from the extraction.]
What's next? What do we get for free once we have computed all of the basic P-tree pairs?
The basic P-tree set is $\{P_{i,j} \mid j = b_i..0 \text{ for each } i = 1..n\}$ (there are $b = \sum_{i=1..n}(b_i+1)$ of them).
Taufik precomputes $\{rc(P_{i,j} \wedge P_{i,k}) \mid i = 1..n, \text{ all } j, k\}$ (there are $\sum_{i=1..n}(b_i+1)^2$ of them).
If we were to pre-compute all $\{rc(P_{i,j} \wedge P_{h,k}) \mid P_{i,j} \text{ and } P_{h,k} \text{ basic P-trees}\}$ ($b^2$ of them), we could get the rcs of any equi-width partition of TV-contours out of it for free (just using Taufik's precomputation). We could also get the rcs of all 2-hi grid cells out of it. We could also get the rcs of all intersections of equi-width TV-contours with 2-hi cells. That might be good enough to always yield a very good Euclidean disk superset? By very good, I mean, given a point, a, a superset of Disk(a, r) which has few enough points that it can be scanned for the Disk(a, r) points (or it can be fitted with a Gaussian Radial Basis Vote Function for NN Classification).
If we were to pre-compute $\{rc(P_{i,j} \wedge P_{h,k} \wedge P_{l,m})\}$ ($b^3$ of them), we could get the rcs of any equi-width partition of TV-contours and also the rcs of all 3-hi grid cells, etc.
Clearly, if we had the rcs of all basic P-tree combinations, we could do anything! Is there a parallel (or pipelined) way to compute all of them?
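The sizes of these precomputations can be checked on a toy table. The following is a minimal sketch, assuming horizontal rows of small non-negative integers and uncompressed bit slices; attributes are indexed from 0 in the code (the slides index from 1), and `basic_slices` and `pair_root_counts` are hypothetical names.

```python
def basic_slices(rows, widths):
    """Vertical bit slices: {(i, j): bit list}, where (i, j) is bit j of attribute i."""
    return {(i, j): [(row[i] >> j) & 1 for row in rows]
            for i, w in enumerate(widths) for j in range(w)}

def pair_root_counts(slices, within_attribute_only=False):
    """Root counts rc(P_ij ^ P_hk) for all ordered pairs of basic slices."""
    table = {}
    for (i, j), u in slices.items():
        for (h, k), v in slices.items():
            if within_attribute_only and i != h:
                continue
            table[(i, j), (h, k)] = sum(x & y for x, y in zip(u, v))
    return table

rows = [(5, 3), (7, 1), (0, 6)]   # three rows, two 3-bit attributes (toy data)
slices = basic_slices(rows, widths=(3, 3))
within = pair_root_counts(slices, within_attribute_only=True)
full = pair_root_counts(slices)
print(len(slices), len(within), len(full))  # b, sum of (b_i+1)^2, b^2
print(full[(0, 2), (1, 0)])                 # rc of (attribute 0, bit 2) AND (attribute 1, bit 0)
```

The table is symmetric and its diagonal holds the single root counts, so roughly half of the entries need to be computed; complemented combinations then follow by subtraction without further ANDs.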

[Table and figure: the 56 sample points in Hilbert order (5 6 2 3 4 8 7 b c g f e a 9 d h j n l o m k i U S P p Q O R T w v r s u t q D B A z N H F M L K J I G E C y x) with their Hilbert step values (hs), the hs bit slices hs7..hs0 and the bit columns h13..h10 and h23..h20, plus the point grid; the bit columns are not recoverable from the extraction.]

Pre-processing costs? Pairs within attributes first (what Taufik does).

Pairs with a3, where rc(a3) = 7 (no ^ required):
rc(a3^a2') = rc(a3) - rc(a3^a2) = 7 - 5 = 2
rc(a3^a1') = rc(a3) - rc(a3^a1) = 7 - 7 = 0
rc(a3^a0') = rc(a3) - rc(a3^a0) = 7 - 3 = 4

Pairs with a3' (^ required, but just count the black-0 red-1 combos):
rc(a3'^a2) = 18
rc(a3'^a1) = 27
rc(a3'^a0) = 19
(a3^a2 and a3'^a2 in 1 instruction, or in parallel?)

Complement pairs with a3', where rc(a3') = 49 (no ^ required):
rc(a3'^a2') = rc(a3') - rc(a3'^a2) = 49 - 18 = 31
rc(a3'^a1') = rc(a3') - rc(a3'^a1) = 49 - 27 = 22
rc(a3'^a0') = rc(a3') - rc(a3'^a0) = 49 - 19 = 30
(For the first pair, a3 and a2: 4 rc's out of 2 ANDs.)

[Figure: the bit columns of a and b and their complements, with the rootcount table being filled in.]
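The "4 rc's out of 2 ANDs" bookkeeping can be sketched on bit columns as below. This is a minimal illustration, assuming columns are stored as Python ints with one bit per row; `popcount` and `four_rcs_from_two_ands` are hypothetical names, and the example columns are made up rather than taken from the slide.

```python
def popcount(column):
    """Number of 1-bits in a bit column stored as an int."""
    return bin(column).count("1")

def four_rcs_from_two_ands(x, y, n_rows):
    """Return (rc(x^y), rc(x^y'), rc(x'^y), rc(x'^y')) using two data ANDs."""
    all_rows = (1 << n_rows) - 1
    x_c = ~x & all_rows                  # complement column x', limited to n_rows bits
    rc_x_y = popcount(x & y)             # AND number 1
    rc_xc_y = popcount(x_c & y)          # AND number 2
    rc_x_yc = popcount(x) - rc_x_y       # no AND: rc(x) - rc(x^y)
    rc_xc_yc = popcount(x_c) - rc_xc_y   # no AND: rc(x') - rc(x'^y)
    return rc_x_y, rc_x_yc, rc_xc_y, rc_xc_yc

# Made-up 6-row columns (row 0 is the least significant bit).
x = 0b110100
y = 0b100110
counts = four_rcs_from_two_ands(x, y, 6)
```

The four counts always sum to the number of rows, which gives a cheap sanity check on a filled-in table.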

Pre-processing costs?

Pairs with a2, where rc(a2) = 23 (no ^ required):
rc(a2^a1') = rc(a2) - rc(a2^a1) = 23 - 13 = 10
rc(a2^a0') = rc(a2) - rc(a2^a0) = 23 - 7 = 16

Pairs with a2' (^ required, but just count the black-0 red-1 combos):
rc(a2'^a1) = 21
rc(a2'^a0) = 15

Complement pairs with a2', where rc(a2') = 33 (no ^ required):
rc(a2'^a1') = rc(a2') - rc(a2'^a1) = 33 - 21 = 12
rc(a2'^a0') = rc(a2') - rc(a2'^a0) = 33 - 15 = 18

[Figure: the rootcount table with the a2 pairs filled in.]

Pre-processing costs?

rc(a1^a0') (no ^ required) = rc(a1) - rc(a1^a0) = 34 - 12 = 22
rc(a1'^a0) (^ required, but just count the black-0 red-1 combos) = 10
rc(a1'^a0') (no ^ required) = rc(a1') - rc(a1'^a0) = 22 - 10 = 12
(Total of 12 ANDs so far.)

[Figure: the rootcount table with all pairs within attribute a filled in.]

Pre-processing costs?

[Figure: the rootcount table, with the first pairs within attribute b being filled in.]

Pre-processing costs?

[Figure: the rootcount table, with further pairs within attribute b filled in.]

Pre-processing costs?

rc(a3^b3') = rc(a3) - rc(a3^b3) = 7 - 5 = 2
rc(a3'^b3) = rc(b3) - rc(a3^b3) = 22 - 5 = 17
rc(a3'^b3') = total - rc(a3^b3) - rc(a3^b3') - rc(a3'^b3) = 56 - 5 - 2 - 17 = 32

Similarly:
rc(a3^b2) = 6, rc(a3^b1) = 6, rc(a3^b0) = 4
rc(a3^b2') = 1, rc(a3^b1') = 1, rc(a3^b0') = 3
rc(a3'^b2) = 28, rc(a3'^b1) = 29, rc(a3'^b0) = 29
rc(a3'^b2') = 21, rc(a3'^b1') = 20, rc(a3'^b0') = 20

[Figure: the rootcount table with the a3, b pairs filled in.]
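The subtraction rules used on this slide work on counts alone, so they can be checked without any bit columns. The sketch below plugs in the slide's own numbers; `complete_pair` is a hypothetical helper name.

```python
def complete_pair(rc_x, rc_y, rc_xy, total):
    """Derive rc(x^y'), rc(x'^y), rc(x'^y') from rc(x), rc(y), rc(x^y), total."""
    rc_x_yc = rc_x - rc_xy                          # rc(x^y')  = rc(x) - rc(x^y)
    rc_xc_y = rc_y - rc_xy                          # rc(x'^y)  = rc(y) - rc(x^y)
    rc_xc_yc = total - rc_xy - rc_x_yc - rc_xc_y    # rc(x'^y') = what is left
    return rc_x_yc, rc_xc_y, rc_xc_yc

TOTAL = 56  # number of points on the slide

# rc(a3) = 7, rc(b3) = 22, rc(a3^b3) = 5
a3_b3 = complete_pair(7, 22, 5, TOTAL)   # (2, 17, 32)

# rc(a3) = 7, rc(b2) = 34 (= 6 + 28 from the slide's values), rc(a3^b2) = 6
a3_b2 = complete_pair(7, 34, 6, TOTAL)   # (1, 28, 21)
```

Both results agree with the values listed on the slide, so under this reading each mixed pair costs one AND and three subtractions.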

Pre-processing costs?

Similarly:
rc(a2^b3) = 14, rc(a2^b2) = 19, rc(a2^b1) = 16, rc(a2^b0) = 15
rc(a2^b3') = 9, rc(a2^b2') = 4, rc(a2^b1') = 7, rc(a2^b0') = 8
rc(a2'^b3) = 8, rc(a2'^b2) = 15, rc(a2'^b1) = 19, rc(a2'^b0) = 18
rc(a2'^b3') = 25, rc(a2'^b2') = 18, rc(a2'^b1') = 14, rc(a2'^b0') = 15

[Figure: the rootcount table with the a2, b pairs filled in.]

Pre-processing costs?

Similarly, for a1:
rc(a1^b3) = 17, rc(a1^b2) = 22, rc(a1^b1) = 22, rc(a1^b0) = 20
rc(a1^b3') = 17, rc(a1^b2') = 12, rc(a1^b1') = 12, rc(a1^b0') = 14
rc(a1'^b3) = 5, rc(a1'^b2) = 12, rc(a1'^b1) = 13, rc(a1'^b0) = 13
rc(a1'^b3') = 17, rc(a1'^b2') = 10, rc(a1'^b1') = 9, rc(a1'^b0') = 9

and for a0:
rc(a0^b3) = 7, rc(a0^b2) = 12, rc(a0^b1) = 15, rc(a0^b0) = 9
rc(a0^b3') = 15, rc(a0^b2') = 10, rc(a0^b1') = 7, rc(a0^b0') = 13
rc(a0'^b3) = 15, rc(a0'^b2) = 22, rc(a0'^b1) = 20, rc(a0'^b0) = 24
rc(a0'^b3') = 19, rc(a0'^b2') = 12, rc(a0'^b1') = 14, rc(a0'^b0') = 10

(16 additional AND operations were required for the mixed-attribute pairs. The total was 28 ANDs.)

[Figure: the completed rootcount table.]
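The 16 and 28 quoted above can be reproduced by a small counting sketch. It assumes one AND per unordered pair of distinct basic P-trees, with every complemented combination obtained by subtraction; `and_counts` is a hypothetical helper, not part of any P-tree library.

```python
from math import comb

def and_counts(widths):
    """widths[i] = number of bit slices of attribute i.

    Returns (intra_attribute_pairs, mixed_attribute_pairs, all_pairs).
    """
    intra = sum(comb(w, 2) for w in widths)   # pairs inside one attribute
    total = comb(sum(widths), 2)              # pairs among all basic P-trees
    return intra, total - intra, total

two_by_four = and_counts([4, 4])     # two 4-bit attributes, as on the slides
three_by_eight = and_counts([8, 8, 8])
```

For two 4-bit attributes this gives 12 intra-attribute pairs, 16 mixed pairs, and 28 in all. The same function shows how quickly the count grows with wider or more numerous attributes.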

[Figure: the completed rootcount table.]

With 4 ai^bj ANDs (i, j = 3, 2) plus the 12 intra-attribute ANDs required for TV-contouring preprocessing, TV-contours plus 2-hi cell masks can be created by just plugging the right selection of the resulting rootcounts into a formula. For 3-hi cell masks, how many would be needed (in addition to the 12 for TV-contouring)?

Attributes and Complements

We can think of the preprocessing as filling Rolodex cards. (Note that this Rolodex is built to fit TV-analysis, i.e., 2-D cards, with the primary one containing the dual-AND P-tree rootcounts needed for TV analysis: the 40 red and blue rcs in the figure.)

[Figure: the dual rc card over the attributes and their complements (A+A'; the 104 black values), with example triple rc slice cards labeled a3 and a3 b2.]