Consolidate Page Number info

I have an array of pairs… LEFT=a keyword RIGHT=page number
the array is sorted by keyword then by page number gving something like this

word:001
word:002
word:004
word:005
word:006
word:009

I need to end up with something like this (and I cannot get my head around it)

word : 001-002,004-006,009

where consecutive page numbers are collapsed into a “range”

anyone have any insight

Untested, of course

dim word, index as string
dim bpage, epage as integer
dim i, pg as integer

word = wordarray(0).left
index = word+ ": "
bpage = wordarray(0).right
epage= pbage
for i = 1 to ubound(wordarray)
   pg = wordarray(i).right
   if word = wordarray(i).left then
      if pg = epage+1 then
         epage = pg
      else
         if bpage = epage then
            index = index + str(bpage)
         else
            index = index + str(bpage) + "-" + str(epage)
         end
         index = index + ","
         bpage = pg
         epage = pg
      end
   else
      index = left(index, len(index)-1) // trim trailing comma
      // store index somewhere
      word = wordarray(i).left
      index = word + ": "
      bpage = pg
      epage = pg
   end
next
if bpage = epage then
   index = index + str(bpage)
else
   index = index + str(bpage) + "-" + str(epage)
end
// store index somewhere

This assumes they are in fact in ascending page order
Drag a listbox onto a windows & stick this in the open event of the window

EDIT - added more pairs just to show it really does work :stuck_out_tongue:

  dim words() as pair = Array ( new pair("word",1), new pair("word",002),_
  new pair("word",004), new pair("word",005) , new Pair("word",006), new Pair("word",009),_
  new pair("something", 3 ), new pair ("something", 4), new pair("other", 1) )
  
  
  
  dim d as new dictionary
  
  for each p as Pair in words
    dim counters() as Pair
    
    redim counters(-1)
    
    if not d.HasKey( p.Left) then 
      counters.append new pair (p.right, p.right)
      d.Value(p.left) = counters
    else
      counters = d.Value(p.left)
      if counters(ubound(counters)).right + 1 = p.right then
        dim cp as pair = Pair(counters(ubound(counters)))
        dim tmp as new pair( cp.left, cp.right + 1)
        counters(ubound(counters)) = tmp
      else
        counters.append new pair(p.right, p.right)
      end if
      
    end if
    
  next
  
  for each k as string in d.keys()
    dim counters() as Pair = d.value(k)
    
    for each c as Pair in counters
      listbox1.addrow k + " from  " + str(c.left) + " to " + str(c.right)
    next
  next

Sorry Norman :slight_smile: I already took Tims code (fixed it, sorry there was a significant omission)… and implemented it…

  word=index_data(0).Left
  index=""
  bpage=index_data(0).Right
  epage= bpage
  For i=1 To ubound(index_data)
    pg=index_data(i).Right
    If word=index_data(i).Left Then
      if pg=epage then continue
      If pg=epage+1 Then
        epage=pg
      Else
        If bpage=epage Then
          index=index + Str(bpage)
        Else
          index=index + Str(bpage) + "-" + Str(epage)
        End if
        index=index + ","
        bpage=pg
        epage=pg
      End if
    Else
      If bpage=epage Then
        index=index + Str(bpage)
      Else
        index=index + Str(bpage) + "-" + Str(epage)
      End if
      out_index.append word:index
      word=index_data(i).Left
      index=""
      bpage=pg
      epage=pg
    End if
  Next
  If bpage=epage Then
    index=index + Str(bpage)
  Else
    index=index + Str(bpage) + "-" + Str(epage)
  End if
  out_index.append word:index

Your original code did not take into consideration the LAST page or range for a word
But THANKS… I was trying to do it all in the same array… and not having any luck

Don’t quite follow ?
The display was very simplistic - never checked for the range being one page (ie/ if from & to were the same page it still wrote “from X to X” which is a minor thing - trivial in fact)

  
  dim words() as pair = Array ( new pair("word",1), new pair("word",002),_
  new pair("word",004), new pair("word",005) , new Pair("word",006), new Pair("word",009),_
  new pair("something", 3 ), new pair ("something", 4), new pair("other", 1) )
  
  
  
  dim d as new dictionary
  
  for each p as Pair in words
    dim counters() as Pair
    
    redim counters(-1)
    
    if not d.HasKey( p.Left) then
      counters.append new pair (p.right, p.right)
      d.Value(p.left) = counters
    else
      counters = d.Value(p.left)
      if counters(ubound(counters)).right + 1 = p.right then
        dim cp as pair = Pair(counters(ubound(counters)))
        dim tmp as new pair( cp.left, cp.right + 1)
        counters(ubound(counters)) = tmp
      else
        counters.append new pair(p.right, p.right)
      end if
      
    end if
    
  next
  
  for each k as string in d.keys()
    dim counters() as Pair = d.value(k)
    
    dim row as string 
    for each c as Pair in counters
      if row <> "" then row = row + ","
      if c.left = c.right then
        row = row + str(c.left)
      else
        row = row + str(c.left) + "-" + str(c.right)
      end if
    next
    listbox1.addrow k + " " + row
  next

produces
word 1-2,4-6,9
something 3-4
other 1

which seems to match your stated goal precisely (ohh not 3 digits but again thats dead simple using str or format with an actual mask)
there’s as much code to count the items (18 lines) as there is to display them (13 lines) almost :slight_smile:

Sorry Norm… .didn’t mean you code… but Tims

Glad it nudged you in the right direction. You’re not one who needs to be spoon-fed code. Just something illustrative of a solution.

Oh ………

Been messing with my code to make it NOT depend on pages being in ascending order and that makes for a whole set of fun issues
Basically have to go through and decompress all the ranges repeatedly

Say you have
word:2
word:4
word:3
word:6
word:5
word:1
my revised code

  dim words() as pair = Array (  new pair("word",002), new pair("word",004), new pair("word",003) , _
  new Pair("word",006), new Pair("word",005), new pair("word",1) )
  
  dim d as new dictionary
  
  for each p as Pair in words
    dim counters() as Pair
    
    redim counters(-1)
    
    if not d.HasKey( p.Left) then
      counters.append new pair (p.right, p.right)
      d.Value(p.left) = counters
    else
      
      counters = d.Value(p.left)
      
      dim found as boolean = false
      
      for i as integer = 0 to ubound(counters)
        if counters(i).left-1 = p.right then
          dim tmp as new pair( counters(i).left-1, counters(i).right)
          counters(i) = tmp
          found = true
          exit
        elseif counters(i).right + 1 = p.right then
          dim tmp as new pair( counters(i).left, counters(i).right + 1)
          counters(i) = tmp
          found = true
          exit
        end if
      next
      if not found then 
        counters.append new pair(p.right, p.right)
      end if
      
    end if
    
    
  next
  
  for each k as string in d.keys()
    dim counters() as Pair = d.value(k)
    
    dim row as string
    for each c as Pair in counters
      if row <> "" then row = row + ","
      if c.left = c.right then
        row = row + str(c.left)
      else
        row = row + str(c.left) + "-" + str(c.right)
      end if
    next
    listbox1.addrow k + " " + row
  next

currently gives

word 1-3,4-5,6

but it SHOULD give you

word 1-6

fixed
should not matter what order words are listed
and handles the case where a word may occur more than once on a page too

    dim words() as pair = Array ( new pair ("something", 4), new pair("word",2), new pair("word", 2), new pair("word",6),_
  new pair("word",4), new pair("word",5) , new Pair("word",3), new Pair("word",1),_
  new pair("something", 3 ), new pair("word", 1) , new pair("other", 1) )
  
  
  dim d as new dictionary
  
  for each p as Pair in words
    dim counters() as Pair
    
    redim counters(-1)
    
    if not d.HasKey( p.Left) then
      counters.append new pair (p.right, p.right)
      d.Value(p.left) = counters
    else
      
      counters = d.Value(p.left)
      
      dim found as boolean = false
      
      for i as integer = 0 to ubound(counters)
        if counters(i).left = p.right or counters(i).right = p.right then
          found = true
        elseif counters(i).left-1 = p.right then
          dim tmp as new pair( counters(i).left-1, counters(i).right)
          counters(i) = tmp
          found = true
          exit
        elseif counters(i).right + 1 = p.right then
          dim tmp as new pair( counters(i).left, counters(i).right + 1)
          counters(i) = tmp
          found = true
          exit
        end if
      next
      if not found then 
        counters.append new pair(p.right, p.right)
      end if
      
    end if
    
  next
  
  for each k as string in d.keys()
    dim counters() as Pair = d.value(k)
    dim lows() as integer
    
    for each c as Pair in counters
      lows.append c.left
    next
    lows.SortWith counters
    for i as integer = ubound(counters) downto 1
      if counters(i).left = counters(i-1).right+1 then
        dim tmp as new Pair(counters(i-1).Left, counters(i).right)
        counters(i-1) = tmp
        counters.remove i
      end
    next
    d.value(k) = counters
  next
  
  
  for each k as string in d.keys()
    dim counters() as Pair = d.value(k)
    
    dim row as string
    for each c as Pair in counters
      if row <> "" then row = row + ","
      if c.left = c.right then
        row = row + str(c.left)
      else
        row = row + str(c.left) + "-" + str(c.right)
      end if
    next
    listbox1.addrow k + " " + row
  next