Improving cache algorithm

DrSlump drslump at drslump.biz
Sat Oct 8 23:09:28 CEST 2005


Hi again,

I had some spare time this evening so I decided to check the idea I 
posted previously. The result is the following code, to try it just 
replace the _list_headers() function in /program/include/rcube_imap.inc 
with this one.

It implements a somewhat better caching algorithm by taking advantage of 
the spec's incremental sequences for messages. If a mailbox is cached 
and there is a new message received, when loading the mailbox again we 
will just get two headers: the last one already in the cache and the new 
one. If the last one's UID is the same as the cached one then nothing 
else is done, otherwise (perhaps a message was deleted using another 
client) an intelligent cache revalidation is performed.
The cache revalidation fetches from older to newer all the messages in 
blocks of N headers. It keeps fetching blocks until we find a header 
with same ID and UID in the cache and in the block. So it gets much 
faster than the current code, specially for huge mailboxes.

The original code has a 'bug' in which it uses the cache if the number 
of messages is the same in the mailbox and in the cache. In the case 
that a message is deleted and a new one stored in the mailbox, this 
comparison will show up the cache which is not up to date.

ciao,
ivan

----

  // private method for listing message header
  function _list_headers($mailbox='', $page=NULL, $sort_field='date', 
$sort_order='DESC')
    {
    $a_out = array();
    $cached_count = 0;
   
    if (!strlen($mailbox))
      return $a_out;
 
    $mbox_count = $this->_messagecount($mailbox /*, 'ALL', TRUE*/);

    $revalidate = false;
    if ($mbox_count) {
        // get cached headers
        $a_out = $this->get_cache($mailbox.'.msg');       
        $a_out = is_array($a_out)?$a_out:array(); //make sure we get an 
array
        $cached_count = count($a_out);
       
        $a_new = array();
        $revalidate = true; //revalidate by default
       
        //if the cache count is greater then there have been changes for 
sure
        if ($cached_count <= $mbox_count) {
            $from = $cached_count?$cached_count:1;
           
            //get new headers (at least one is returned)
            $a_temp = iil_C_FetchHeaders($this->conn, $mailbox, $from . 
':' . $mbox_count);
           
            $duplicated = $cached_count?true:false;
            foreach ($a_temp as $hdr) {
                //skip the first one if duplicated
                if ($duplicated) {
                    //check for changes using the UID
                    if ($hdr->uid === end($a_out)->uid)
                        $revalidate = false;
                       
                    $duplicated = false;
                    continue;
                }
                //skip deleted ones
                if (! $hdr->deleted)
                    $a_new[ $hdr->uid ] = $hdr;
                                       
            }   
        }

       
        //revalidate cache if needed
        $to = $mbox_count - count($a_new);
        if ($revalidate && $to !== 0) {           
            //we'll needt o reindex the array so we have to make a copy
            $a_dirty = $a_out;
            $a_out = array();
            $a_buffers = array();

            //fetch chunks of 20 headers
            $step = 20;
           
            $found = false;
           
            //fetch headers in blocks starting from new to old
            do {
                $from = $to-$step;
                if ($from < 1) $from = 1;
               
                //store the block in a temporal buffer
                $a_buffers[$from] = iil_C_FetchHeaders( $this->conn, 
$mailbox, $from . ':' . $to);
               
                //compare the fetched headers with the ones in the cache
                $idx = 0;         
                foreach ($a_buffers[$from] as $k=>$hdr) {
                    //if it's different the comparison ends
                    if (!isset($a_dirty[$hdr->uid]) || 
$a_dirty[$hdr->uid]->id !== $hdr->id)
                        break;

                    //if we arrive here then we know that the older 
messages in cache are ok
                    $found = $hdr->id;
                    $idx++;
                }
               
                //remove from the buffer the headers which are already 
cached
                if ($found)
                    $a_buffers[$from] = array_splice($a_buffers[$from], 
0, $idx );
               
                $to = $from-1;
            } while ($found===false && $from > 1);
           
            //just keep the headers we are certain that didn't change in 
the cache
            if ($found !== false) {
                foreach ($a_dirty as $hdr) {
                    if ($hdr->id > $found) break;
                    $a_out[$hdr->uid] = $hdr;
                }
            }
           
            //we builded the block buffers from new to older, we process 
them in reverse order
            ksort($a_buffers, SORT_NUMERIC);
            foreach ($a_buffers as $a_buff) {
                foreach ($a_buff as $hdr) {
                    if (! $hdr->deleted)
                        $a_out[$hdr->uid] = $hdr;
                }
            }
        }
       
        //array_merge() would reindex the keys, so we use this 'hack'
        $a_out += $a_new;
    }
   

   
    //write headers list to cache if needed
    if ($revalidate || count($a_out)!=$cached_count) {
        $this->update_cache($mailbox.'.msg', $a_out);
    }

    //sort headers by a specific col
    $a_out = iil_SortHeaders( $a_out, $sort_field, $sort_order );
   
   
    // return complete list of messages
    if (strtolower($page)=='all')
      return $a_out;

    $start_msg = ($this->list_page-1) * $this->page_size;
    return array_slice($a_out, $start_msg, $this->page_size);
    }




More information about the Dev mailing list