Basic Algorithms — Spring 2020 — Problem Set 8 Due: Wed, May 6, 11am 1. Half-priced hash. In class, we studied a family of hash functions based on taking inner products. That family was a family of...

Basic Algorithms Problem Set

Basic Algorithms — Spring 2020 — Problem Set 8 Due: Wed, May 6, 11am 1. Half-priced hash. In class, we studied a family of hash functions based on taking inner products. That family was a family of hash functions from keys Ztm to slots Zm indexed by seeds Ztm, where m is prime. For each seed λ = (λ1, . . . , λt) ∈ Ztm, and each key a = (a1, . . . , at) ∈ Ztm, the hash function was defined as hλ(a) := ∑t i aiλi, which requires t multiplications to evaluate. In this problem, you are to analyze a variant hash which cuts the number of multiplications in half. In most computing environments, multiplications are much more expensive than addition, so this can speed up the hash function evaluation significantly. Assume t is even, so t = 2s. Hash function seeds and keys have the same structure as above, but the hash function is defined as follows: h′λ(a) := s∑ i=1 (a2i−1 + λ2i−1)(a2i + λ2i). So, for example, for t = 4, we have h′λ(a) = (a1 + λ1)(a2 + λ2) + (a3 + λ3)(a4 + λ4). Your task is to show that the family of hash functions {h′λ}λ∈Ztm is a universal family. Hint: Your proof should mimic the one given in class for the inner-product based family. Namely, consider two distinct keys a = (a1, . . . , at) and b = (b1, . . . , bt), and show that the number of seeds (λ1, . . . , λt) which satisfy s∑ i=1 (a2i−1 + λ2i−1)(a2i + λ2i) = s∑ i=1 (b2i−1 + λ2i−1)(b2i + λ2i). is at most mt−1. To keep the notation simple, you may first want to do the calculation for the case t = 4. 2. Mod-free hash. In class, most of the hash functions we looked at required arithmetic mod m, where m was a prime. This exercise looks a family of hash functions where this is not necessary, which can result in a significantly more efficient implementation. Let k, `, and t be positive integers. Let U := [0 . . 2k)t, V := [0 . . 2`), and Λ := [0 . . 2k+`)t. We define a family of hash functions {hλ}λ∈Λ from U to V as follows. For a = (a1, . . . , at) ∈ U and λ = (λ1, . . . , λt) ∈ Λ, we define hλ(a) := ⌊( (a1λ1 + · · ·+ atλt) mod 2k+` ) / 2k ⌋ . That is, if we write ∑ i aiλi = L2 k+` + M2k + R, where 0 ≤ R < 2k="" and="" 0="" ≤="" m="">< 2`,="" then="" hλ(a)="M" .="" in="" other="" words,="" hλ(a)="" consists="" of="" the="" “middle="" bits”,="" i.e.,="" bits="" k="" through="" k="" +="" `−="" 1,="" of="" the="" inner="" product="" ∑="" i="" aiλi.="" to="" see="" why="" this="" can="" be="" efficiently="" implemented,="" suppose="" that="" k="10" and="" `="20." then="" using="" 32-bit="" unsigned="" arithmetic,="" we="" can="" compute="" sum="" ←="" ∑="" i="" aiλi="" using="" t="" integer="" multiplications="" and="" additions,="" and="" we="" can="" then="" compute="" the="" hash="" as="" hash="" ←="" (sum="" &="" (230="" −="" 1))="">> 10. Make sure you understand why this works before proceeding. Your goal is to show that that {hλ}λ∈Λ is a 2−`+1-universal family of hash functions. Here is an outline you should follow: (a) Suppose hλ(a) = hλ(b), where a = (a1, . . . , at), b = (b1, . . . , bt), and λ = (λ1, . . . , λt). Show that we must have c1λ1 + · · ·+ ctλt ≡ d (mod 2k+`), (∗) where ci := ai − bi for i = 1, . . . , t, and d is an integer with |d| < 2k.="" (b)="" further="" suppose="" that="" a="" 6="b," so="" that="" ci′="" 6="0" for="" some="" i′="" ∈="" [1="" .="" .="" t].="" let="" 2j="" be="" the="" largest="" power="" of="" two="" that="" divides="" all="" of="" the="" ci’s.="" show="" that="" j="">< k and that 2 j | d. note: from (a) and (b), it follows that d ∈ s := {0,±1 · 2j ,±2 · 2j , . . . ,±(2k−j − 1) · 2j}. moreover, |s| = (2k−j − 1) + (2k−j − 1) + 1 = 2k−j+1 − 1. 1 (c) without loss of generality, we may assume that c1 = 2 jf , where f is odd (we just need to re-arrange indices if necessary). argue that for every choice of d ∈ s, and every choice of λ2, . . . , λt ∈ [0 . . 2k+`), there are exactly 2j choices of λ1 ∈ [0 . . 2k+`) that satisfy the congruence (∗). hint: use theorem 2.8 in the number theory primer. (d) conclude that {hλ}λ∈λ is a 2−`+1-universal family of hash functions. hint: first, argue that for every possible d, there are 2(k+`)(t−1)+j choices for the λi’s that satsify (∗); from this, argue that the total number of choices for that λi’s that can satisfy (∗) for any d is at most 2(k+`)(t−1)+k+1. 3. massive pile-up. suppose we hash n keys into a hash table using a randomly chosen seed for a universal family of hash functions. let m be the maximum number of keys that get hashed to any one slot. in class, we called m the “max load”, and we showed that e[m ] = o( √ n), assuming that the load factor (# keys / # slots) was o(1). the goal of this exercise is to show that this upper bound cannot be significantly improved. let v = {1, . . . , n}. we define a family of hash functions from v to v as follows. let k ∈ [1 . . n] be a parameter (determined below). for each subset i ⊆ v of carnality k, we define the hash function hi : v → v by the following property: hi maps each element of i onto 1, and hi maps v\i injectively (i.e., in a one-to-one fashion) into v \ {1}. note that hi is not uniquely determined by this property, but we can always choose one hi satisfying this property (you should verify this). define h := { hi } i⊆v,|i|=k . argue that h is universal if k ≤ √ n− 1. hints: how many seeds i are there, as a function of n and k? for fixed a, b ∈ v with a 6= b, how many seeds i are there with hi(a) = hi(b), again, as a function of n and k? discussion. observe that if we hash all n elements in v, then no matter which hash function hi we choose, k elements get hashed to slot 1. thus, for this family of universal hash functions with k = b √ n− 1c, we get a lower bound of ω( √ n) on the expected max load, which matches the o( √ n) upper bound obtained in class. even though this family of hash functions is rather artificial, similar lower bounds hold for more natural universal families of hash functions. for example, for the “inner product hash” discussed in class, there is a lower bound of ω(n1/3) on the expected max load [alon, et al, “is linear hashing good”, stoc (1997)]. that paper also gives ω( √ n) lower bounds for other natural families of universal hash functions. 4. 2d hash. in class, we presented a (t − 1)/m-universal hash family based on polynomial evaluation. this exercise develops a two-dimensional variant. the universe of keys u consists of all t × t matrices over zm, where m is prime. we write such a matrix a ∈ u as a = (aij), where the indices i and j run from 0 to t− 1. the set of seeds λ consists of pairs (λ1, λ2) ∈ zm × zm. for λ = (λ1, λ2) ∈ λ and a = (aij) ∈ u , define hλ(a) = t−1∑ i=0 t−1∑ j=0 aijλ i 1λ j 2 ∈ zm. (1) for example, if t = 3, we have hλ(a) = a00 + a01λ2 + a02λ 2 2 + a10λ1 + a11λ1λ2 + a12λ1λ 2 2 + a20λ 2 1 + a21λ 2 1λ2 + a22λ 2 1λ 2 2 . show that {hλ}λ∈λ is �-universal for � := 2(t− 1)/m. hint: suppose hλ(a) = hλ(b), then hλ(c) = 0, where c := a−b. if c = (cij), re-write hλ(c) as hλ(c) = t−1∑ i=0 λi1 ( t−1∑ j=0 cijλ j 2 ) . for t = 3, this looks like this: hλ(c) = ( c00 + c01λ2 + c02λ 2 2 ) + λ1( c10 + c11λ2 + c12λ 2 2 ) + λ21( c20 + c21λ2 + c22λ 2 2 ) . 2 you want to show that if a 6= b, then the number of pairs λ = (λ1, λ2) such that hλ(c) = 0 is at most �m2, where � = 2(t− 1)/m. to do this, you should make use of fact that any polynomial of degree at most t − 1 over zm has at most t− 1 roots. you will need to use this fact twice: once with f(y ) = t−1∑ j=0 ci0jy j for some i0 ∈ [0 . . t), and once with gλ2(x) = t−1∑ i=0 xi ( t−1∑ j=0 cijλ j 2 ) for some λ2 ∈ zm. 5. 2d pattern matching. in the 2d pattern matching problem, you are given an n × n array a and a t × t array b, where t ≤ n, and you want to determine if b appears as a subarray within a. for example, the array b = ( 1 2 3 4 ) appears as a subarray of a =  0 1 1 0 0 1 2 3 2 3 4 0 0 1 0 1  . (a) show how to compute the 2d hash hλ1,λ2 of the previous exercise on all of the t × t subarrays of a in time o(n2). hint: you will have to somehow adapt the “rolling hash” idea of karp/rabin to the 2d hash. assume a = (aij), where the row indices i and column indices j both run from 1 to n. for i = 1, . k="" and="" that="" 2="" j="" |="" d.="" note:="" from="" (a)="" and="" (b),="" it="" follows="" that="" d="" ∈="" s="" :="{0,±1" ·="" 2j="" ,±2="" ·="" 2j="" ,="" .="" .="" .="" ,±(2k−j="" −="" 1)="" ·="" 2j}.="" moreover,="" |s|="(2k−j" −="" 1)="" +="" (2k−j="" −="" 1)="" +="" 1="2k−j+1" −="" 1.="" 1="" (c)="" without="" loss="" of="" generality,="" we="" may="" assume="" that="" c1="2" jf="" ,="" where="" f="" is="" odd="" (we="" just="" need="" to="" re-arrange="" indices="" if="" necessary).="" argue="" that="" for="" every="" choice="" of="" d="" ∈="" s,="" and="" every="" choice="" of="" λ2,="" .="" .="" .="" ,="" λt="" ∈="" [0="" .="" .="" 2k+`),="" there="" are="" exactly="" 2j="" choices="" of="" λ1="" ∈="" [0="" .="" .="" 2k+`)="" that="" satisfy="" the="" congruence="" (∗).="" hint:="" use="" theorem="" 2.8="" in="" the="" number="" theory="" primer.="" (d)="" conclude="" that="" {hλ}λ∈λ="" is="" a="" 2−`+1-universal="" family="" of="" hash="" functions.="" hint:="" first,="" argue="" that="" for="" every="" possible="" d,="" there="" are="" 2(k+`)(t−1)+j="" choices="" for="" the="" λi’s="" that="" satsify="" (∗);="" from="" this,="" argue="" that="" the="" total="" number="" of="" choices="" for="" that="" λi’s="" that="" can="" satisfy="" (∗)="" for="" any="" d="" is="" at="" most="" 2(k+`)(t−1)+k+1.="" 3.="" massive="" pile-up.="" suppose="" we="" hash="" n="" keys="" into="" a="" hash="" table="" using="" a="" randomly="" chosen="" seed="" for="" a="" universal="" family="" of="" hash="" functions.="" let="" m="" be="" the="" maximum="" number="" of="" keys="" that="" get="" hashed="" to="" any="" one="" slot.="" in="" class,="" we="" called="" m="" the="" “max="" load”,="" and="" we="" showed="" that="" e[m="" ]="O(" √="" n),="" assuming="" that="" the="" load="" factor="" (#="" keys="" #="" slots)="" was="" o(1).="" the="" goal="" of="" this="" exercise="" is="" to="" show="" that="" this="" upper="" bound="" cannot="" be="" significantly="" improved.="" let="" v="{1," .="" .="" .="" ,="" n}.="" we="" define="" a="" family="" of="" hash="" functions="" from="" v="" to="" v="" as="" follows.="" let="" k="" ∈="" [1="" .="" .="" n]="" be="" a="" parameter="" (determined="" below).="" for="" each="" subset="" i="" ⊆="" v="" of="" carnality="" k,="" we="" define="" the="" hash="" function="" hi="" :="" v="" →="" v="" by="" the="" following="" property:="" hi="" maps="" each="" element="" of="" i="" onto="" 1,="" and="" hi="" maps="" v\i="" injectively="" (i.e.,="" in="" a="" one-to-one="" fashion)="" into="" v="" \="" {1}.="" note="" that="" hi="" is="" not="" uniquely="" determined="" by="" this="" property,="" but="" we="" can="" always="" choose="" one="" hi="" satisfying="" this="" property="" (you="" should="" verify="" this).="" define="" h="" :="{" hi="" }="" i⊆v,|i|="k" .="" argue="" that="" h="" is="" universal="" if="" k="" ≤="" √="" n−="" 1.="" hints:="" how="" many="" seeds="" i="" are="" there,="" as="" a="" function="" of="" n="" and="" k?="" for="" fixed="" a,="" b="" ∈="" v="" with="" a="" 6="b," how="" many="" seeds="" i="" are="" there="" with="" hi(a)="hI(b)," again,="" as="" a="" function="" of="" n="" and="" k?="" discussion.="" observe="" that="" if="" we="" hash="" all="" n="" elements="" in="" v,="" then="" no="" matter="" which="" hash="" function="" hi="" we="" choose,="" k="" elements="" get="" hashed="" to="" slot="" 1.="" thus,="" for="" this="" family="" of="" universal="" hash="" functions="" with="" k="b" √="" n−="" 1c,="" we="" get="" a="" lower="" bound="" of="" ω(="" √="" n)="" on="" the="" expected="" max="" load,="" which="" matches="" the="" o(="" √="" n)="" upper="" bound="" obtained="" in="" class.="" even="" though="" this="" family="" of="" hash="" functions="" is="" rather="" artificial,="" similar="" lower="" bounds="" hold="" for="" more="" natural="" universal="" families="" of="" hash="" functions.="" for="" example,="" for="" the="" “inner="" product="" hash”="" discussed="" in="" class,="" there="" is="" a="" lower="" bound="" of="" ω(n1/3)="" on="" the="" expected="" max="" load="" [alon,="" et="" al,="" “is="" linear="" hashing="" good”,="" stoc="" (1997)].="" that="" paper="" also="" gives="" ω(="" √="" n)="" lower="" bounds="" for="" other="" natural="" families="" of="" universal="" hash="" functions.="" 4.="" 2d="" hash.="" in="" class,="" we="" presented="" a="" (t="" −="" 1)/m-universal="" hash="" family="" based="" on="" polynomial="" evaluation.="" this="" exercise="" develops="" a="" two-dimensional="" variant.="" the="" universe="" of="" keys="" u="" consists="" of="" all="" t="" ×="" t="" matrices="" over="" zm,="" where="" m="" is="" prime.="" we="" write="" such="" a="" matrix="" a="" ∈="" u="" as="" a="(aij)," where="" the="" indices="" i="" and="" j="" run="" from="" 0="" to="" t−="" 1.="" the="" set="" of="" seeds="" λ="" consists="" of="" pairs="" (λ1,="" λ2)="" ∈="" zm="" ×="" zm.="" for="" λ="(λ1," λ2)="" ∈="" λ="" and="" a="(aij)" ∈="" u="" ,="" define="" hλ(a)="t−1∑" i="0" t−1∑="" j="0" aijλ="" i="" 1λ="" j="" 2="" ∈="" zm.="" (1)="" for="" example,="" if="" t="3," we="" have="" hλ(a)="a00" +="" a01λ2="" +="" a02λ="" 2="" 2="" +="" a10λ1="" +="" a11λ1λ2="" +="" a12λ1λ="" 2="" 2="" +="" a20λ="" 2="" 1="" +="" a21λ="" 2="" 1λ2="" +="" a22λ="" 2="" 1λ="" 2="" 2="" .="" show="" that="" {hλ}λ∈λ="" is="" �-universal="" for="" �="" :="2(t−" 1)/m.="" hint:="" suppose="" hλ(a)="hλ(B)," then="" hλ(c)="0," where="" c="" :="A−B." if="" c="(cij)," re-write="" hλ(c)="" as="" hλ(c)="t−1∑" i="0" λi1="" (="" t−1∑="" j="0" cijλ="" j="" 2="" )="" .="" for="" t="3," this="" looks="" like="" this:="" hλ(c)="(" c00="" +="" c01λ2="" +="" c02λ="" 2="" 2="" )="" +="" λ1(="" c10="" +="" c11λ2="" +="" c12λ="" 2="" 2="" )="" +="" λ21(="" c20="" +="" c21λ2="" +="" c22λ="" 2="" 2="" )="" .="" 2="" you="" want="" to="" show="" that="" if="" a="" 6="B," then="" the="" number="" of="" pairs="" λ="(λ1," λ2)="" such="" that="" hλ(c)="0" is="" at="" most="" �m2,="" where="" �="2(t−" 1)/m.="" to="" do="" this,="" you="" should="" make="" use="" of="" fact="" that="" any="" polynomial="" of="" degree="" at="" most="" t="" −="" 1="" over="" zm="" has="" at="" most="" t−="" 1="" roots.="" you="" will="" need="" to="" use="" this="" fact="" twice:="" once="" with="" f(y="" )="t−1∑" j="0" ci0jy="" j="" for="" some="" i0="" ∈="" [0="" .="" .="" t),="" and="" once="" with="" gλ2(x)="t−1∑" i="0" xi="" (="" t−1∑="" j="0" cijλ="" j="" 2="" )="" for="" some="" λ2="" ∈="" zm.="" 5.="" 2d="" pattern="" matching.="" in="" the="" 2d="" pattern="" matching="" problem,="" you="" are="" given="" an="" n="" ×="" n="" array="" a="" and="" a="" t="" ×="" t="" array="" b,="" where="" t="" ≤="" n,="" and="" you="" want="" to="" determine="" if="" b="" appears="" as="" a="" subarray="" within="" a.="" for="" example,="" the="" array="" b="(" 1="" 2="" 3="" 4="" )="" appears="" as="" a="" subarray="" of="" a="" 0="" 1="" 1="" 0="" 0="" 1="" 2="" 3="" 2="" 3="" 4="" 0="" 0="" 1="" 0="" 1="" ="" .="" (a)="" show="" how="" to="" compute="" the="" 2d="" hash="" hλ1,λ2="" of="" the="" previous="" exercise="" on="" all="" of="" the="" t="" ×="" t="" subarrays="" of="" a="" in="" time="" o(n2).="" hint:="" you="" will="" have="" to="" somehow="" adapt="" the="" “rolling="" hash”="" idea="" of="" karp/rabin="" to="" the="" 2d="" hash.="" assume="" a="(aij)," where="" the="" row="" indices="" i="" and="" column="" indices="" j="" both="" run="" from="" 1="" to="" n.="" for="" i="1,">

ps8-gnlz035g.pdf

May 06, 2021

SOLUTION.PDF

Basic Algorithms — Spring 2020 — Problem Set 8 Due: Wed, May 6, 11am 1. Half-priced hash. In class, we studied a family of hash functions based on taking inner products. That family was a family of...

Get Answer To This Question

Related Questions & Answers

Submit New Assignment