Z-algorithm walk-through

Pseudo-code:

Z[0] = 0      // The z array
maxZ = 0   // The furthest we've reached into the string (the largest j + Z[j] value seen so far)
j = 0 // the location where maxZ occurs
for (i = 1; i < n; i++)
  if (maxZ < i) // case 0: must do the hard work
    l = 0 // index within currently matched region
    Z[i] = 0
    while (T[i + l] == T[l]) // naive match - while characters match increment Z[i]
      Z[i]++; l++
  end while
  maxZ = i + Z[i] - 1// this is the farthest we've gotten in the string
  j = i // and this is where it happened
else // now we have the 3 cases from the lecture notes
k = i – j // this is how far we are in the Z box of j
if (Z[k] + i < maxZ) // case 1
Z[i] = Z[k]
else if (Z[k] + i > maxZ) // case 2
Z[i] = maxZ – i + 1
else // case 3 Z[k] + i = maxZ
Z[i] = maxZ – i + 1 // I know I've gotten this far
l = Z[i] + 1
while (T[i + l] == T[l]) // all the following characters must be checked
Z[i]++; l++
end while
maxZ = i + Z[i] - 1
j = i
end if
end if
end for

 

Now going through the algorithm

T = A G A G A G A C C A G C A G A G A G A G A C G T

i = 1; maxZ is less than 1, hence we are in case 0 - brute force

the first while loop ends immediately as T[l] != T[i + l] (T[0] != T[1])

Z[1] is set to 0; maxZ remains 0, j remains 0

i=2; maxZ is less than 1, hence we are in case 0 - brute force

the first while loop:
  l=0,   T[0] == T[2] =>  Z[2] = 1
  l=1,  T[1] == T[3] => Z[2] = 2
  l=2,  T[2] == T[4] => Z[2] = 3
...
  l = 5, T[5] == T[7] => Z[2] = 5
  l =6, T[6]  != T[8] => Z[2] = 5
  loop ends
  j = 2; maxZ = 6

i=3; maxZ > 3, hence we look at the second part of the if statement
  k = i - j = 1
  Z[k] = 0
  since Z[k] + i = 3 < maxZ we are in case 1, and can directly set Z[3] = Z[1] = 0;

i=4; maxZ > 3  hence we look at the second part of the if statement
    k = i - j = 2
    Z[k] = 5
    since Z[k] + i = 9 > maxZ we are in case 2 and can directly set Z[4] = maxZ - i + 1 = 3

...

i=14; maxZ = 18 , j=12 (pen and paper will help figure this out)
   k = i -j = 2
   Z[k] = 5
   since Z[k] + i = 18 = maxZ we are in case 3

   we start the loop counting from l = maxZ - i + 1= 5; and setting Z[i] = 4
   we match GAC and mismatch at T[8] != T[22], making Z[i] = 4 + 3 = 7
    maxZ = 21, j = 14

and so on...