On mySimon: NEW Video games of the Half Blood Prince
BNET Business Network:
BNET
TechRepublic
ZDNet

Talkback

Add your opinion
Database partitioning

Sponsored: Databases are tripling in size every two years. Willie Hardie, vice president of database product marketing at Oracle, explains how database partitioning will not only improve performance, but increase availability. The content for this video was sponsored and provided by Oracle.

Hello everyone. My name is Willie Hardie. I am Vice President of Database Product Marketing at Oracle Corporation, and today we are going to talk about database partitioning. Database partitioning gives us the ability to take our very large tables and divide, or partition, them up into smaller, more manageable pieces.

The reason we need to think about database partitioning is that databases that support our transaction processing and our data warehousing systems are tripling in size, on average, every two years. This means a 500 gigabytes or half a terabyte database back in 2001 is going to one and a half terabytes by 2003, four and a half terabytes by 2005, and will be over thirteen and a half terabytes by the year 2007. This gives us a big challenge in terms of managing our user's performance and availability expectations.

Let us say we have a large sales order table that is spread across four individual disks. This table typically contains millions of rows. The challenge we have with large tables like this is that all of our operations take place at the table level. We do queries at the table level. We do index builds at the table level. We do data loads at the table level. Which means that as we add more rows into this large table, performance and availability can tail off. If we were to lose one of the disks that support this table, we would lose the entire table. Our systems down while we try and repair that failure.

A better solution is to take our sales order table and, by issuing a simple alter table command, we can partition our large table. We can partition it based on a range of values such as date or time. We can partition it on a list of values such as country, region or product. Or we could partition it based on a hashing algorithm on a partition key.

In our example we are going to take our sales order table and we are going to partition it up into four quarters. So any rows that were in January, for example, are going to appear in quarter one. Any rows that appeared in say the month of April are going to appear in quarter two. We are going to similarly insert rows into quarters three and four. Now, what we are able to do by partitioning this large table is instead of having the large table spread across those four disks, we've now got individual partitions that have each got their own disks.

The advantage we have got in terms of query performance is that if we need to retrieve rows from quarter one, we do not do this full table scan or full index scan anymore. We go straight into the partition that contains the data we are after. Now, in terms of providing higher availability, if we were to lose one of the disks that supports one of our partitions, all we actually lose is access to that partition, in this case quarter two. We can continue to do inserts, updates and deletes to all our other partitions.

By issuing a simple alter table command to partition our large transactional tables, we can improve the performance of our systems, we can increase the availability of our systems, and we can ease the complexity of managing very large databases.