Tuesday, August 5, 2008

Using Onstat commands to tune Informix Database


We are going to look at the top 10 onstat commands used to tune an Informix database.

NOTE: Not all of the points below hold in every situation. The purpose of this document is to help identify likely problems at a broad level. There are no fixed rules for interpreting this data; it all depends on how your application behaves.

1. onstat -F : Page Flushers (cleaners)

The onstat -F command provides information on what types of buffer pool flushing have occurred, as well as the status of each of the system page cleaner threads.

NOTES:
  • onstat -F doesn’t print out as part of the onstat -a command.
  • FG writes may occur periodically from a load and do not necessarily indicate a need for additional page cleaners.

In general, foreground writes (FG) should always be zero. A foreground write occurs when a session needs a page from disk placed into the buffer pool, but no clean/free buffers are available. In this case, the sqlexec thread picks a page off the least recently used side of an LRU queue, writes it out, and marks the page as clean. This process is time consuming for a session and should be avoided.
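As a quick sketch, foreground writes can be checked from a captured onstat -F snippet. The two-line layout and the figures below are illustrative, not from a real server; adjust the awk field positions to match your version's output.

```shell
# Flag nonzero foreground writes in captured onstat -F output.
# Sample is illustrative; real output has more columns and header lines.
sample='Fg Writes     LRU Writes    Chunk Writes
12            2763          5214'

fg=$(printf '%s\n' "$sample" | awk 'NR==2 {print $1}')
if [ "$fg" -gt 0 ]; then
    echo "WARNING: $fg foreground writes - check CLEANERS and buffer pool size"
fi
```

In practice you would feed the real output in with `onstat -F | awk ...` instead of the sample variable.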

Whether buffer flushing occurs as part of an LRU write or a Chunk Write should be based on the nature of the system. LRU writes do not block the system, but are random in nature. Chunk Writes block the system, but are sequential in nature and tend to be more efficient for large writes. It is therefore preferable to have LRU writes occurring in a production system for normal operations and Chunk Writes for loads.


If you are having long checkpoints, onstat -F may provide some insight. While a checkpoint is occurring, execute onstat -F -r 2. This will show what chunks the page flushers are writing out to disk every two seconds. By monitoring the data column, it is possible to determine which chunks are taking the longest to write. If one or more chunks continually have long chunk writes, relative to the rest of the system, the tables within them might be good candidates for redistribution.

Under normal circumstances, there will be one flusher assigned to each chunk during a checkpoint. LRU Writes and Chunk Writes will vary depending on the type of system you have. An OLTP system should maximize LRU Writes (a good target is a 1:2 ratio of Chunk Writes to LRU Writes). There will always be some Chunk Writes, but LRU Writes will shorten checkpoint duration. In a DSS system, Chunk Writes should be maximized. Some LRU Writes may still occur in an effort to eliminate foreground writes (Fg Writes).

Also monitor page cleaners (flushers) at checkpoint time. Make sure they are all busy doing Chunk Writes.


2. onstat -l : Logging

The onstat -l command provides a two-part output. The first part contains information on the physical log buffer and the physical log. The second part contains information on the Logical Log buffer and the Logical Logs.

Key Notes:

  • Don't confuse the Physical Log Buffer (in memory) with the Physical Log (on disk). Writes are made first to the Physical Log Buffer, which is then flushed to the Physical Log on disk.
  • The Physical Log Buffer is flushed to disk when it hits 75% full, during a checkpoint, or when a flush of the Logical Log Buffer would place a transaction in the Logical Log without its corresponding Physical Log image existing on disk.
  • The Physical Log on disk signals that a checkpoint is required when it hits 75% full.
  • There are two Physical Log buffers and three Logical Log buffers.
  • The number of Physical Log buffers and Logical Log buffers cannot be modified.
  • An archive must be performed before a newly added Logical Log can become free and be used.
  • You may not create more than 32768 logical logs; the unique ID field, however, has a maximum value of the size of an integer.

The Physical Log buffer should be tuned relative to how often you want it to flush to disk. Keep in mind that the larger it is, the more you could lose should a memory failure occur. The default is 32K and should not generally be tuned below this value.


The Physical Log on disk should be tuned based on two items: how often checkpoints should occur and how long fast recovery should take. Checkpoints are not driven just by the value of CKPTINTVL, but when the Physical Log hits 75% full. If checkpoints are occurring too frequently (i.e., in less than CKPTINTVL intervals), then the Physical Log may be too small. Oversizing the Physical Log won't hurt anything, but it will waste disk space. The Physical Log also drives the length of Fast Recovery. When the system is shut down normally, the Physical Log is logically marked as empty. If the system comes down abnormally, pages will remain in the Physical Log. During Fast Recovery, Informix Dynamic Server restores these pages. These pages must be re-written through the buffer pool and back to disk to synchronize the DBSpaces with the transaction logs. The smaller the Physical Log, the shorter the recovery time.

There are two items to address when sizing the Logical Logs: the size of each log and the number of logs. Since a Logical Log will not be backed up until it is full, the amount of transactional loss, should a disk failure occur, is directly affected by the size of each Logical Log. As an example, if each Logical Log is 100 Mbytes and a disk failure occurs while the third log is in use, only logs 1 and 2 will have been backed up, even if the third log is 99% full. Therefore, a single logical log should be sized to the amount of information you are willing to lose should a complete disk failure occur. Keep in mind that during a restore, Informix Dynamic Server will attempt to save any logical logs that still exist and have not been backed up.
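The worst-case exposure in the example above can be sketched as simple arithmetic (the figures are the illustrative ones from the paragraph, not measurements):

```shell
# Worst-case transactional loss when a disk fails mid-log.
# Assumption: only completely full logs have been backed up, so the
# in-use log is lost along with everything written to it.
log_size_mb=100        # size of each logical log
current_log_pct=99     # how full the in-use (unbackedup) log is

loss_mb=$(( log_size_mb * current_log_pct / 100 ))
echo "Potential loss: up to ${loss_mb} MB of logged transactions"
```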

The number of Logical Logs required is a function of how frequently you wish to back up your logs. If you wish to back up your logs only once every 8 hours, then you should have sufficient log space to allow for transactions over an 8-hour period. This must be judged by monitoring the onstat -l output over a period of normal system behavior.


The length of transactions also affects the total size of all logs combined. If a single transaction spans more than the maximum percentage of log space allowed, given by LTXHWM (long transaction high water mark), it will be automatically rolled back. The transaction itself might not be large; however, if it is held open too long relative to other operations, it could easily span too many logs. The default value for LTXHWM is 50%. It is not recommended to increase this value, since the rollback operation for a long transaction also consumes Logical Log space.
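The LTXHWM check can be sketched as follows. All the numbers here are hypothetical, purely to show the arithmetic:

```shell
# Estimate whether a transaction risks hitting LTXHWM (hypothetical values).
total_log_space_mb=1000    # sum of all logical logs
ltxhwm_pct=50              # default LTXHWM
txn_log_usage_mb=520       # log space spanned by one open transaction

threshold_mb=$(( total_log_space_mb * ltxhwm_pct / 100 ))
if [ "$txn_log_usage_mb" -gt "$threshold_mb" ]; then
    echo "Transaction spans more than ${threshold_mb} MB - it will be rolled back"
fi
```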


3. onstat -m : Message Log


The onstat -m command displays the last 20 lines from the Informix Dynamic Server log, which is specified by the MSGPATH parameter in the configuration file.


Key Notes:

  • Checkpoint entries only appear in the log if pages were actually flushed to disk.
  • To handle server level errors, use the ALARMPROGRAM parameter in the configuration file.

The message log should be monitored periodically for potential system problems. Should a session abort with an assertion failure, the message log will contain important information including a stack trace of the session.


For checkpoints, the online log identifies the interval (time between checkpoints) and the duration (the length of the checkpoint). The interval should be no less than the value of CKPTINTVL unless you are using the size of your physical log to control checkpoints (checkpoints occur when the physical log is 75% full). The duration of the checkpoint will vary depending on many factors (e.g., BUFFERS , LRU_MIN_DIRTY, LRU_MAX_DIRTY, CKPTINTVL, CLEANERS).
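Checkpoint durations can be pulled out of a captured message log with a little awk. The line format below is typical for IDS message logs, but adjust the pattern to your server's exact wording; the sample lines are illustrative:

```shell
# Extract the longest checkpoint duration from sample message-log lines.
sample='16:04:12  Checkpoint Completed:  duration was 2 seconds.
16:09:15  Checkpoint Completed:  duration was 11 seconds.'

max_dur=$(printf '%s\n' "$sample" | awk '
    /Checkpoint Completed/ {
        for (i = 1; i <= NF; i++) if ($i == "was") d = $(i+1) + 0
        if (d > max) max = d
    }
    END { print max }')
echo "Longest checkpoint: ${max_dur}s"
```

Against a live system you would run the same awk over the file named by MSGPATH.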


Other factors can cause checkpoints besides CKPTINTVL. Booting or shutting down the system, creating a DBSpace, or performing an archive are a few.


The message log should be cleaned out from time to time as it can become rather large on a highly active system.


4. onstat -p : Profile statistics

The onstat -p command provides statistical information that has accumulated since either boot time or the last onstat -z command.


Key Notes:

  • The values for ovtbls and ovuserthread are leftovers from previous versions of OnLine. Starting with 7.x, the values for these fields are dynamically allocated and therefore the overflows should always be zero.
  • The number of flushes is not driven entirely by the number of checkpoints.
  • The value for commits and rollbacks is not necessarily reflective of actual transaction requests by sessions. Internally, Informix performs its own commits and rollbacks as well.
  • Actual deadlocks (two or more threads waiting on one another's locks) can't actually happen within a single Informix Dynamic Server instance. The thread that would create the deadlock fails and receives an error.
  • Page compression occurs when rows have been deleted from a page. The compression is actually the repositioning of the remaining rows on the page such that they are contiguous.


Cached reads and cached writes will usually be above 90%. If they are under 85% in an OLTP system, this may be an indication of a shortage of buffers. Although the %cached writes should be above 85%, the actual value can vary greatly. Low %cached writes may simply indicate that not many actual writes have occurred. This can be determined by examining the number of writes, rewrites, and deletes relative to the number of reads.
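The cached-read percentage reported by onstat -p is derived from the buffer and disk read counters. As a sketch with illustrative figures:

```shell
# Read-cache percentage, computed the way onstat -p reports it:
# %cached = 100 * (bufreads - dskreads) / bufreads
# The counter values below are illustrative.
bufreads=1500000
dskreads=90000

cached=$(( 100 * (bufreads - dskreads) / bufreads ))
echo "%cached reads: ${cached}"
```

A value under 85 on an OLTP system would prompt a look at BUFFERS.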


Bufwaits shows the number of times a thread had to wait to acquire a latch on a buffer. If the readahead values (RA_PAGES and RA_THRESHOLD) are too high, bufwaits may grow excessively. High bufwaits may also be caused by too small a buffer pool relative to the amount of work being performed (see %cached values).


Lock waits should normally be less than 1% of lock requests, unless you have applications that hold locks for an inordinate amount of time. If lock waits are excessive, examine the isolation levels used by SQL statements plus the overall design of transaction management.
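The 1% rule of thumb is easy to check from the lock counters. The figures below are hypothetical:

```shell
# Lock waits as a percentage of lock requests (hypothetical counters
# from onstat -p).
lockreqs=2400000
lockwts=3600

pct=$(awk -v w="$lockwts" -v r="$lockreqs" 'BEGIN { printf "%.2f", 100 * w / r }')
echo "lock waits: ${pct}% of lock requests"
```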


Checkpoint waits can and do occur in a high volume system. Checkpoints are prevented from occurring if a userthread is in the middle of a critical section (known as a critical write). If a checkpoint must wait, it is usually only for a brief moment.


The sequential scans field should be monitored in an OLTP system, since most OLTP systems operate most efficiently using index reads.


Read aheads allow the system to perform more efficient I/O when a need arises to scan multiple sequential pages from disk. This capability is highly desired in DSS systems where large table and index scans are normal and in certain types of OLTP processing. In an ideal situation, ixda-RA + idx-RA + da-RA should equal RA-pgsused. However, if the values of RA_PAGES and RA_THRESHOLD are too high, the system may end up reading in more pages for a read ahead than it should. This will result in high bufwaits since the pages being read in will force existing buffer pool pages to be overwritten or, in the case of dirty pages, flushed out to disk.
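The read-ahead relationship above can be checked numerically. The counter values here are illustrative, not from a real server:

```shell
# Read-ahead utilization: ideally RA-pgsused equals
# ixda-RA + idx-RA + da-RA (all counters from onstat -p).
ixda_ra=42000
idx_ra=3100
da_ra=15500
ra_pgsused=55000

ra_total=$(( ixda_ra + idx_ra + da_ra ))
util=$(( 100 * ra_pgsused / ra_total ))
echo "read-ahead pages actually used: ${util}%"
```

A utilization well below 100% suggests RA_PAGES / RA_THRESHOLD are set too aggressively.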


5. onstat -R : LRU Queues


The onstat -R command displays the current status of the LRU queues. The buffer pool is broken into a series of queues to improve accessibility and reduce buffer contention, with each LRU queue being further broken down into a free/clean side (FLRU) and a modified side (MLRU).


Key Notes:

  • The onstat -R command is not included as part of the onstat -a output.
  • A free page is one that has not been used. A clean page is a free page that has had a page written to it, but not modified. A dirty page is a clean page that has had modifications made to it.
  • Pages can migrate amongst the LRU queues. For this reason, the number queued and the total number of buffers may not always correspond. In addition, the number of buffer pages per LRU queue may become slightly unbalanced.
  • Each LRU queue is divided up into four separate priority queues: LOW, MED_LOW, MED_HIGH and HIGH.
  • Pages with a priority of HIGH are considered last for page replacement; pages with a priority of LOW are considered first for page replacement. Page replacement priorities are utilized when no free pages exist in the buffer pool.
  • Priorities are part of the Dynamic Buffer Management API incorporated into the server.

Upon startup, OnLine will divide the number of pages in the buffer pool among the LRU queues as evenly as possible. The LRU queues are designed to help reduce contention among active threads within the same OnLine instance. In addition, they assist in reducing the processing load required for a checkpoint. When the LRU_MAX_DIRTY percentage of pages for an LRU is exceeded, a page cleaner is assigned to the LRU to begin flushing dirty pages out to disk until only the LRU_MIN_DIRTY percentage of pages remains in the queue.


Setting the value for LRU_MIN_DIRTY and LRU_MAX_DIRTY should be driven by the length of the system checkpoints. If the checkpoint duration is too long, it can possibly be reduced by lowering the values for LRU_MIN_DIRTY and LRU_MAX_DIRTY. This will reduce the overall number of dirty pages remaining in the buffer pool.
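The amount of work each cleaner does when an LRU queue trips LRU_MAX_DIRTY can be sketched with illustrative settings:

```shell
# Pages a cleaner must flush from one LRU queue when LRU_MAX_DIRTY is
# exceeded, down to LRU_MIN_DIRTY. All values are illustrative.
buffers=200000
lrus=8
lru_max_dirty=60
lru_min_dirty=50

per_queue=$(( buffers / lrus ))
to_flush=$(( per_queue * (lru_max_dirty - lru_min_dirty) / 100 ))
echo "Each cleaner flushes ~${to_flush} of ${per_queue} pages per queue"
```

Lowering both thresholds shrinks the dirty-page backlog a checkpoint must absorb, at the cost of more background LRU writes.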


Tuning the value for LRUS, which defaults to 8, depends on whether the tuning is being performed to reduce buffer contention or reduce checkpoint duration.


6. onstat -g glo : MT Global Information


The onstat -g glo command provides information on the status of the virtual processors in the system.


You may need to increase the number of connections each poll thread can handle (set via the NETTYPE parameter in the onconfig file) if the total number of sessions approaches the number of poll threads multiplied by the number of connections each poll thread can handle.
Generally, keep the number of poll threads equal to NUMCPUVPS, and set NUMCPUVPS to the number of physical CPUs minus 1.
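The capacity check can be sketched as follows. The NETTYPE values and session count are hypothetical:

```shell
# Poll-thread connection capacity check (illustrative values).
# e.g. NETTYPE ipcshm,3,50,CPU => 3 poll threads x 50 connections each.
poll_threads=3
conns_per_thread=50
active_sessions=152

capacity=$(( poll_threads * conns_per_thread ))
if [ "$active_sessions" -ge "$capacity" ]; then
    echo "At poll-thread capacity ($capacity) - raise the NETTYPE connection count"
fi
```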


A high lngspins value indicates contention for resources (threads spinning while waiting on latches).


Also check the usage of each VP.


7. onstat -g iof : Disk IO Statistics by Chunk/File


The onstat -g iof command displays the statistics for disk I/O by chunk/file.


Examination of the values in each of the operations columns (totalops, dskread, dskwrite) can identify heavy I/O against a particular device or chunk. This information can be used to determine where hotspots exist across the system, and to evaluate how well table fragmentation is spreading the I/O load.


8. onstat -g ioq : Disk IO Statistics by Queue


The onstat -g ioq command provides statistics on the disk I/O by queue.

  • One gfd queue name exists for each chunk in the system.


The information from the onstat -g ioq command can assist in identifying bottlenecks with the processing of I/O through the queues. If the maxlen column for the AIO queue tends to exceed 25, or the len column tends to exceed 10, additional AIO VPs should be started.
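Those thresholds can be applied mechanically to captured output. The sample lines and column positions below follow typical IDS onstat -g ioq layout (len in column 3, maxlen in column 4), but verify the layout on your version:

```shell
# Flag overloaded AIO queues in captured onstat -g ioq output.
# Sample rows are illustrative.
sample=' aio  0      12     31    64882     53912    10970        0
 gfd  3       0      4     8211      6100     2111        0'

flag=$(printf '%s\n' "$sample" |
    awk '$1 == "aio" && ($3 + 0 > 10 || $4 + 0 > 25) { print "add AIO VPs" }')
echo "${flag:-queues OK}"
```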


The onstat -g ioq output for the global file descriptors (gfd) or chunks may also identify a potential I/O problem with specific disks in the system if certain gfd queues begin to back up. It may also point out a chunk that contains a table that needs better partitioning (fragmentation).
If the AIO queue backlog is building badly, in most cases I prefer setting NUMAIOVPS equal to the number of chunks.


9. onstat -g iov : Disk IO Statistics by VP


The onstat -g iov command displays the statistics for disk I/O by virtual processor.


Although the column is called wakeups, it is actually incremented every time the AIO VP goes idle.

The greater the number of I/Os per wakeup, the busier the AIO VP is being kept. This can be used as an indicator for increasing or decreasing the number of VPs for AIO.
Generally, io/wup should be less than 10.
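The io/wup ratio and the less-than-10 rule of thumb can be sketched with hypothetical counters:

```shell
# I/Os per wakeup for an AIO VP (hypothetical counters from onstat -g iov).
io_ops=720000
wakeups=60000

io_per_wup=$(( io_ops / wakeups ))
if [ "$io_per_wup" -ge 10 ]; then
    echo "io/wup=${io_per_wup}: AIO VPs are saturated - consider adding VPs"
fi
```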


10. onstat -g rea : Ready Threads


The onstat -g rea command displays the threads that are on the ready queue and are awaiting a VP on which to run.


Key Notes:

  • A thread priority can range from 1 (lowest) to 4 (highest) with a default priority of 2.
  • Threads are pulled off the ready queue based on their priority. Within a priority, threads are taken on a first-in-first-out basis (FIFO).
  • Ready threads are displayed from highest priority to lowest priority within each class of virtual processor (VP).

A consistent number of entries on the ready queue may be an indication that additional CPU VPs are required. The number of CPU VPs should never exceed the number of physical CPUs. If there is already a CPU VP for every CPU and threads are still backing up on the ready queue, the box may not have enough processors, or fast enough processors, for the required workload. A consistently long ready queue can also point to buffer constraints.
