Understanding database [3] : Buffer algorithms

Understanding database 3 : Buffer algorithms

As we use buffers to boost up our performance, it is also necessary for us to choose a suitable algorithm for our buffer to follow. With formal expression, the access pattern is the key of ensuring the number of I/Os to be controlled, which has a massive impact on the performance of execution.

We introduce the formula for read efficiency here, noted as
(number of page requests - physical I/Os) /(number of page requests)
And we should also aim for above 0 .8 or 80% for an adequate efficiency.

We have 4 basic policies to decide for replacing frames within a buffer, including

First in first out (FIFO)
->oldest leaves
Least recently used (LRU)          <--------------------------- most common
->frame with oldest access leaves
Most recently used (MRU)
->frame with freshest access leaves
Least frequently used (LFU)
->frame with least access count leaves

Most database management systems would use a variant of LRU called the CLOCK, where the CLOCK would keep a circular list (consider the pages are the times on the clock), and also a hand pointing to the oldest page on the clock. Each of the frames would have a bit variable that is initially 0, but will be set to 1 on each page request to the frame. When it comes to the time when the clock needs to kick out a page, the first page with a 0 for their bit will be chosen. If the frame we are iterating through has a bit of 1, we will set its bit to 0 after sweeping through the frame.

An elaboration on the CLOCK method is the generalized CLOCK, namely the GCLOCK, where instead of setting an upper limit of 0 for the frame, each request will increment the bit by 1 and sweep of hand will decrement the bit. Similarly, for a frame to be replaced, the bit has to be 0.

If the requests can be predicted, for instance, a sequential scan will have a confined pattern, pages may be pre-fetched by many at the same time

Short topic : OS File System

The reason why DBMS takes over instead of OS for the disk space and buffer management is that OS may have significant portability issues. 


Popular Posts