Thursday, April 7, 2011

Estimating & Projecting the size of a table in Oracle

There are various methods of estimating and projecting the size of a table. The best way to estimate database size is to load some sample data into our tables and then gather statistics on those tables. Query DBA_TABLES for AVG_ROW_LEN to get an idea of the average number of bytes per row. We can use Tom Kyte's show_space code to help us with the block-count evaluation, or just use the plain, simple technique shown below. Many people just use the average row length (the AVG_ROW_LEN column) to ascertain the size after doing a CTAS (Create Table AS Select)… however, that is not accurate, as we will show below:

Example:
SQL> create table test as select * from all_objects ;
Table created.
SQL> exec dbms_stats.gather_table_stats(user, 'TEST') ;
SQL> select num_rows, blocks, empty_blocks, avg_row_len from user_tables where table_name = 'TEST' ;

  NUM_ROWS     BLOCKS EMPTY_BLOCKS AVG_ROW_LEN
---------- ---------- ------------ -----------
    183546       2694            0          98

So the average row length is being reported as 98 bytes.
The “right” (and the only sure-shot) way of calculating the size is to follow these steps:

1.) Check the existing schemas for tables filled with representative data (say the expected volume for a large table in one of the schemas is 1 million rows; we check the number of blocks for n rows).
2.) Gather statistics with DBMS_STATS.
3.) Check the number of blocks.
4.) Multiply by the multiplying factor, i.e. if we estimated from, say, 1% of the actual requirement, multiply by 100.
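The steps above boil down to a simple calculation. Here is a small sketch in Python (the function name and the sample figures are illustrative, not part of Oracle; the block counts would come from DBMS_STATS / USER_EXTENTS as shown in the examples in this post):

```python
# Sketch of the block-based sizing procedure described above.
# Inputs are assumed to come from the data dictionary (USER_EXTENTS /
# USER_TABLES after gathering stats); the numbers below are made up.

def project_table_size_kb(sample_blocks, block_size_kb, sample_rows, target_rows):
    """Project total table size (in KB) from a representative row sample."""
    sample_kb = sample_blocks * block_size_kb   # space the sample occupies
    factor = target_rows / sample_rows          # the multiplying factor
    return sample_kb * factor

# e.g. a 1% sample (10,000 of 1 million rows) occupying 270 blocks of 8k:
estimate = project_table_size_kb(sample_blocks=270, block_size_kb=8,
                                 sample_rows=10_000, target_rows=1_000_000)
print(f"projected size: {estimate:.0f} KB")  # 270 * 8 * 100 = 216000 KB
```

Note that this projects from blocks actually allocated, not from AVG_ROW_LEN, which is the whole point of the technique.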
Example (same table as above):
SQL> select extent_id, bytes, blocks from user_extents
     where segment_name = 'TEST' and segment_type = 'TABLE' ;

 EXTENT_ID      BYTES     BLOCKS
---------- ---------- ----------
       330      65536          8
       331      65536          8
       332      65536          8
       333      65536          8
       334      65536          8
...
       667      65536          8
                      ----------
Total (sum of blocks)       2696
SQL> select blocks, empty_blocks, avg_space, num_freelist_blocks from user_tables
     where table_name = 'TEST' ;

    BLOCKS EMPTY_BLOCKS  AVG_SPACE NUM_FREELIST_BLOCKS
---------- ------------ ---------- -------------------
      2694            0          0                   0
So:
1.) We have 2696 blocks allocated to the table TEST.
2.) 0 blocks are empty (of course, in this example that is bound to happen; not in reality, though).
3.) 2694 blocks contain data (the other 2 are used by the system).
4.) An average of 0k is free on each used block.

In other words:
1.) Our table TEST consumes 2694 blocks of storage in total for 183546 records.
2.) Of this, 2694 * 8k block size - (2694 * 0k free) = 21552k is used for our data.
The calculation from the average row length would have yielded: 183546 * 98 bytes = ~17566k (see the difference?).
Also, now that we have the calculation, if the actual table TEST needs to be sized for, say, 10 million records, we use the multiplying factor:

183546 records take —> 21552k
10 million records will take —> (21552k * 10 million) / 183546 = ~1174202k (roughly 1.1 GB)
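Running the same arithmetic in a quick script, with the figures taken from the TEST example above, confirms the numbers:

```python
# Figures from the TEST example above.
blocks_used = 2694          # blocks containing data
block_size_kb = 8           # 8k block size
free_per_block_kb = 0       # AVG_SPACE was 0
num_rows = 183_546
avg_row_len = 98            # bytes, as reported by USER_TABLES

# Actual space used, from allocated blocks:
used_kb = blocks_used * block_size_kb - blocks_used * free_per_block_kb
print(used_kb)              # 21552 KB

# The naive AVG_ROW_LEN estimate undershoots:
naive_kb = num_rows * avg_row_len / 1024
print(round(naive_kb))      # 17566 KB

# Projection for 10 million rows via the multiplying factor:
projected_kb = used_kb * 10_000_000 / num_rows
print(round(projected_kb))  # ~1174202 KB, roughly 1.1 GB
```

The gap between 21552k (block-based) and ~17566k (row-length-based) is exactly why projecting from allocated blocks is the safer approach.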

That way, we can be assured that the data calculations are correct. Likewise for the indexes. Oracle guru Tom Kyte has a lot of very good examples on his site that we should read before embarking on sizing calculations for an Oracle schema.

Enjoy    :-)
