Monday, September 15, 2008

Minimising Physical I/Os in Oracle Applications

This article focuses on how to minimize the physical (disk) I/O in any Oracle application
system. It mainly talks for large-scale OLTP systems and describes pointers to areas that
can be looked upon for performance bottlenecks and improvements.

Physical I/Os vs. Logical I/Os
-----------------------------------
Logical I/O can be defined as the amount of Input and Output Oracle performs on the lowest
level of it’s storage i.e. the Oracle Blocks. So the logical I/O is calculated on the basis of the
number of blocks read, number of times the block(s) is changed during a particular DML
operation.
Logical I/Os are calculated as the sum of Consistent Gets and DB Block Gets which are further
explained as below:

Consistent Gets – the number of buffers gotten in consistent mode. During the course of a read operation on a table, there is a possibility that some other transaction may change the underlying matching records for the query. During the read operation, any information for the row changes is first checked from the Block Header. This avoids the need to check for each row in the block, if the corresponding row has been changed or not. The buffers as read in this mode are referred to as Consistent Gets.
DB Block Gets – these are the number of buffers gotten normally during update/insert
operations and where the information on the segment header block needs to be updated.

Logical I/O = Consistent Gets + Db Block Gets.

Physical I/Os are part of Logical I/Os. This means that for a typical read operation on a table,
the resulting operation can be a mix of information gotten from the Buffer Cache and the
physical disk (for whatever information not available in the cache). The information read from
the disk forms physical I/O for the operation, while the total information forms Logical I/Os
for the operation.

Logical I/O can be relatively straightforward (albeit not simple) to calculate, and is mostly done
by breaking down the application in terms of tables being accessed against the transactions in the
system. However, the estimates depend heavily on various factors like data volumes, data
distribution, probable access methods, frequency of access, frequency of change, and so forth.
There are, however, other simple rules to calculate estimates for the logical I/O. The resulting
physical I/O is much more subject to the specific nature of the system

Why look at physical I/Os
------------------------------
Physical I/Os tend to be expensive in terms of desired performance from a typical OLTP
system. Physical disk I/O is by far the slowest of the three core components found in all
computing devices (CPU, memory, and disk). For that reason, minimising the volume of physical
I/O becomes a major step in system performance tuning process.

While investigating physical I/O for a system, it should, however, be kept in mind that physical
I/O is not inherently a problem, and is in fact necessary for the database to function at all.
However, unnecessary physical I/O can become major bottlenecks in the system performance.


We can convert peak Logical I/Os per second to the amount of CPU required for the
application, and we can convert peak Physical I/Os per second to the number of disks. This
leads us to the direction of tuning the process of disk access to enable better performance.
However, it must be kept be kept in mind, that steps for tuning physical I/O still start from the
very first step of tuning the application first, moving on to the database, and finally attacking the
operating system and hardware.

Following sections describe pointers to areas to tune w.r.t. The Application, Database, and the
System:

Pointers for Application Tuning
--------------------------------------
1) Minimising block accesses – choosing best retrieval mechanism – Indexes vs. Table Scans
Typically, we always try to tune the application in order to avoid Full Table Scans for queries
being executed. Here is a choice between Full Table Scans and putting indexes in force.
Most often, in a long running and continually growing OLTP system, index look -ups tend to
experience less physical I/O in comparison with table scans. However, the choice lies with
the optimiser unless we force it to use indexes with hints (Setting Optimiser mode to Rule
is not advised as its going to get de-supported in later versions of Oracle). Table scans
will sometimes consume fewer resources in terms of I/O than the equivalent index look -up.
This will prominently be true when the selection criteria points to a large proportion of the
table data. Apart from this, this will also depend on many other factors like data distribution,
optimiser statistics are up to date, selectivity of the index, depth of the index, init file
parameter – DB_FILE_MULTIBLOCK_READ_COUNT etc.).
So properly analysing the factors influencing cost-based optimiser decisions to choose
between table scans or index scans, will help minimise the physical I/O for the system.
We can help cost-based optimiser chose between index and table scans by doing following
basic things:

a) Keeping optimiser statistics up to date
b) Specifying correct optimiser goal – thumb rule is to set FIRST_ROWS_n (n is 1, 10,
100..) for best response times, and ALL_ROWS for best possible throughput.
c) Using hints, wherever necessary
d) Creating histograms, wherever necessary

2) Avoiding Disk sorts
Disk sorts occur when there is not sufficient memory available for any sort operation to
complete; in which case, Oracle allocates a temporary segment within the temporary
tablespace. This causes additional I/O overheads for the read and write operations from the
temporary tablespace.
We try to avoid / minimise disk sorts by:
a) Properly setting the value for configuration parameter – SORT_AREA_SIZE. The value
of this parameter should not exceed the memory capacity of the system.
b) Properly setting the value for configuration parameter –
SORT_AREA_RETAINED_SIZE. One should be very careful in setting the value for
this parameter. Setting the value too high can cause memory contentions resulting in
heavy paging and swapping activities.
c) Think of striping the temporary tablespace across disks
d) With optimiser mode set to FIRST_ROWS, the optimiser will favour avoiding sorts if
other alternatives are available.
e) Avoid DISTINCT clause, if possible in queries. The DISTINCT clause will usually
perform sorts.
f) Using UNION ALL instead of UNION, if possible. This will avoid sorts required for
duplicate row elimination when UNION is used.
g) Sorts can be avoided if the indexes are being used. For e.g. if an index exists on some or
all the columns in the ORDER BY clause, the already sorted index may get used.
h) Replacing sort -merge join operations with hash joins.

3) Avoiding Parse Operations
-----------------------------------
Excessive parsing operations cause recursive SQL’s to be frequently generated. Excessive
recursive SQL’s may sometimes result in physical disk I/O if the information desired is not
available in the dictionary cache.
Avoid parse overhead by:
a) Using bind variables
b) Reusing cursors available for later operations in a transaction
c) Properly setting the value of configuration parameter –
SESSION_CACHED_CURSORS
d) Properly setting the value of configuration parameter – CURSOR_SHARING
(especially when it is difficult to re -write application or using bind variables)

Pointers for Database Tuning
----------------------------------
1) Proper Space Management
Proper space management plays a major role in enhancing system’s performance. Form an
I/O perspective, wastage of space means Oracle has to perform more I/O in order to satisfy
a request, or getting a job done. Therefore, it becomes imperative to take steps to avoid
wastage of space in order to improve performance as desired.
A proper value of the block size along -with other space parameters like PCTFREE,
PCTUSED, INITTRANS, MXTRANS (as explained below) can help utilize the space
effectively thereby reducing I/O.
It all depends how a block is packed efficiently so that minimum I/Os is required to retrieve
a particular amount of data.
a) INITTRANS and MAXTRANS
These two parameters affect the way the transactional section of the block is allowed to
grow for the transactions applied to the block. INITTRANS specifies the initial number
of slots reserved for the transactions to occur. MAXTRANS specifies the maximum
slots that can be created in the block (defaults to 255). Setting too high a value of
MAXTRANS means allowing Oracle to push the transaction table to a size that is
unnecessarily large for most of the life cycle of the table/index

b) PCTFREE and PCTUSED
PCTFREE tells Oracle how much of he remainder of the block is available for inserting
new data. PCTUSED tells to what percentage the used space be dropped to allow Oracle
to insert new data in the same block. Setting a high value for PCTFREE could mean
more empty space in the block. This, in turn, means more I/Os are to be performed for
retrieving the amount of data required. An improper setting of this parameter can also
lead to row migrations thus leaving a lot of waste space in the original block (this is not
desirable as an extra I/O will still be required on the original block to read the rowid of
the migrated row).

c) Choosing appropriate block size
Traditionally speaking, smaller block sizes are often quoted to be good for OLTP
systems and larger blocks are better for DSS systems. Bigger blocks mean fewer buffers,
and less degree of buffering. In a typical OLTP system – where the data is accessed
mostly through indexes, there is a higher probability that the some other query may flush
data from the previously buffered block.
The above problem, however, can be addressed well by proper use of various buffer
pools available – KEEP, RECYCLE, and DEFAULT
For a typical OLTP system averaging 1-1.5 Terabytes of data for one year into
production, a block size of 16K can work well to deliver good performance.

d) Extent size
Never set the extent size smaller than the physical maximum read size allowed by the
operating system. This is important because if any object were to be scanned, it would
mean extra I/O if Oracle has to issue more than one read request in order to get its
single maximum read.

e) Tablespace
Splitting database across multiple Tablespaces is a common practice and offers great deal
of benefits in terms of backup, recovery, and the administration tasks getting easier to
handle. Identifying and correcting I/O bottlenecks becomes easy, as the I/O costs can
be isolated across sub-sets of databases.

2) Physical I/O Distribution across physical devices
Balancing physical I/O is an important step to execute for maximising Oracle’s system
performance. This step involves stripping total I/O across data-files that are spread evenly
across physical disk drives.
Following points below indicate areas to tune for achieving this goal:

a) Identifying Hot-spots
Query on dictionary view – V$FILESTAT and V$DBFILE – can get the following
information to read for identifying hot files / Tablespaces



b) Balancing Physical I/O
Hot data files i.e. files with the most I/O can be identified by sorting the above
information by either Physical Reads or Physical Writes. If a small number of data-files
are experiencing a large percentage of load (around 80% of total load), and the overall
volume (Blks Rd/Min + Blks Wrt/Min) approaches the capacity of the disk, then it’s an
indication of spreading I/O across multiple disks. If there is just one hot file on a disk
that is responsible for heavy I/O, moving it on to another disk will not help. In this case,
considering the option of moving heavily accessed segments to separate Tablespaces can
help reduce I/O bottleneck here.
It’s a common and a useful practice to have small sized files, instead of one large file for
a tablespace, and spread them across multiple disks (if available). For e.g. spreading 10
100MB files across 10 disks for a 1GB tablespace required (instead of a single 1GB datafile)
c) Placing Redo Logs on separate disk
For large databases involving heavy update and insert activities, performance can be
improved by placing the redo log files on a separate disk that supports no other activity.

d) Separate Indexes from tables
Having indexes and tables assigned to different Tablespaces respectively can help
improve the I/O performance. Performance can further be expected to improve if the
data-files under individual Tablespaces are spread across different disks

3) Database Fragmentation
Extent / Tablespace Fragmentation: Database segments may be composed of many non -
contiguous extents of disk space. The I/O times will increase due to non-sequential disk
reads and/or because of split I/Os. The extent fragmentation further results in Tablespace
fragmentation (the Oracle Tablespaces may be composed of many files, and the Oracle
segments within a Tablespace may be composed of many individual extents; thus resulting in
Tablespace fragmentation.
Tablespace fragmentation prevents Oracle to take advantage of its multi-block read capability
along-with resulting in wastage of space.
Steps to eliminate Tablespace fragmentation:
a) Recreate Tablespace – this can just be a temporary solution and might not prevent it
from recurring
b) Appropriately sizing segment storage parameters

4) Adjusting Database buffers
Increasing the number of database buffers to an optimum value (check to avid excessive
paging here) will also help improve the I/O efficiency. A checkpoint can be established for
the Buffer it Ratio. Check for the Buffer Hit Ratio – if it is consistently less than 90%, it
becomes necessary to increase DB_BLOCK_BUFFERS
Note: If the dictionary cache hit Ratio is less than 95%, consider increasing the shared Pool
size to an optimum value.

5) Considering Multiple DBWRs
For a large scale Oracle application, multiple database writers should be considered. The
configurable parameter – DB_WRITERS – can be set in initdb.ora to reflect number of
DBWRs. This needs to be adjusted in an iterative manner i.e. increasing conservatively and
monitoring the results for I/O improvements.

------------------------------------------------

HAPPY LEARNING !


No comments:

Post a Comment

Thanks for you valuable comments !