Wednesday, December 13, 2006

Merry Xmas and Happy New Year


I am off to Goa for my annual Xmas vacation.

Here's wishing u all a Merry Christmas and a Happy New Year

See you all in the New Year

Regards
-Fairlie

Monday, December 11, 2006

Analysing systemstate dumps

In my last article I discussed how to dump the systemstate using various methods. These files are quite huge and are sometimes difficult to decipher. You can parse the file using an awk script in
LTOM
provided by Oracle’s center of expertise. LTOM is a lite onboard monitor which is a diagnostic tool.

Once you download and install LTOM you can find a script called ass109.awk in tom_base/tom/src which parses the systemstate into a more readable format

buffalo>pwd
/usr/opt/oracle/bin/tom_base/tom/src
buffalo>ls -al ass*
-rwxr-x--- 1 oracle dba 34549 Nov 25 2003 ass109.awk


We will attempt to analyse a systemstate dump generated by the following error

>>> WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK! pid=42
System State dumped to trace file /u00/app/oracle/admin/F8900DEV/udump/f8900dev1_ora_12057.trc
Mon Dec 11 09:58:05 2006

buffalo >awk -f ass109.awk f8900dev1_ora_12057.trc > rowcache.out

If you look at rowcache.out
42: last wait for 'ksdxexeotherwait' [Rcache object=0x483a70e4,]
The holder of 0x483a70e4 is
Rcache object=0x483a70e4, 48: 48: is waiting for Latch 5213eeb4

If you look at rowcache.out in conjunction with the raw trace file it can be concluded that process 42 has requested row cache lock on dc_profiles which is held by process 48.
Process 48 is in turn waiting on the “Child object queue header” latch which is held by process 44
Process 44 which is a DataPump Worker is in turn waiting on 'SGA: allocation forcing component growth' which is seen when using ASMM (sga_target)

Sunday, December 10, 2006

Systemstate dumps

A system state dump is trace file containing a snapshot of all processes running in a database. It is used to diagnose hang issues and has more information compared to a hanganalyze dump.There are various ways to dump the systemstate.


If you can connect to the database you can dump it using

SQL> oradebug setmypid
Statement processed.
SQL> oradebug unlimit  this is important so that the file is not truncated.
Statement processed.
SQL> oradebug dump systemstate 10
Statement processed.
SQL> oradebug tracefile_name
c:\oracle\product\10.2.0\admin\usyd\udump\usyd_ora_17452.trc

A complete file should have the following string at the end
END OF SYSTEM STATE

If you want only the short stack of each process in the instance
you need to use

SQL> oradebug dump systemstate 256
Statement processed.

Alternatively you can dump the system state using

SQL> ALTER SESSION SET EVENTS 'IMMEDIATE TRACE NAME SYSTEMSTATE LEVEL 10';

Session altered.

If you cannot connect to the database instance you can use a session less connection in 10g using
sqlplus -prelim / as sysdba
oradebug setmypid
oradebug unlimit;
oradebug dump systemstate 10

Prior to 10g you can use a unix debugger like gdb, adb or dbx to attach to a shadow process and dump the system state using the function ksudss which is the subroutine in the Oracle source code which does the same.

eiger-> ps -ef | grep HRDEV | grep j00 | grep –v grep


oracle 14022 1 0 Dec10 ? 00:00:21 ora_j000_HRDEV

eiger-> gdb
GNU gdb Red Hat Linux (6.3.0.0-1.132.EL4rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.Type "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
(gdb) attach
Argument required (process-id to attach).
(gdb) attach 14022
Attaching to process 14022
Reading symbols
…… (output deleted for readability purposes)
0x000000324f7c7c0c in semtimedop () from /lib64/tls/libc.so.6
(gdb) print ksudss(10)
[Switching to Thread 182922098368 (LWP 14022)]
$1 = -1073747016
(gdb) detach
Detaching from program: /usr/opt/oracle/u00/app/oracle/product/10.2.0/bin/oracle, process 14022
(gdb) quit

Since in the above case we have attached to a job process the trace file containing the dump will be in bdump else the trace file would be in udump.

For RAC databases if you need to dump the systemstate on all instances

SQL> oradebug setmypid
Statement processed.
SQL> oradebug unlimit
Statement processed.
SQL> oradebug -g all dump systemstate 266
Statement processed.
SQL> oradebug tracefile_name
/u00/app/oracle/admin/F8902UAT/udump/f8902uat1_ora_2513.trc

The trace file will have only the following contents

PORADEBUG REMOTE COMMAND BEGIN SERIAL:0 MYPID:2513
PORADEBUG REMOTE COMMAND END SERIAL:0 MYPID:2513 [SUCCESS]

The systemstate dump will be written to by the DIAG process on each node.


Please use these commands with caution. On busy systems with a large number of processes these commands can take quite a while. Also a systemstate dump need not always portray the correct information of a system especially when the time to
dump all the processes is long.

Sunday, December 03, 2006

Adding a voting disk online

In 10.2 RAC it is now possible to add a voting disk online in addition to specifying more than one voting disk during the ClusterWare install.In 10G R1 there was no software level mirroring of the voting disk.

I tried this out for the first time today and things didn't go as I expected.

[root@dbrac1 ~]# /home/oracle/oracle/product/10.2.0/crs/bin/crsctl add css votedisk /dev/raw/raw4
Cluster is not in a ready state for online disk addition


You have to use the force option in the above command due to
Bug 4898020 ADDING VOTING DISK ONLINE CRASH THE CRS
which is fixed in the 10.2.0.4 patchset. This is again another case where clearly documented functionality in 10.2 does not work.

Before using the force option make sure the CRS stack is not up on any of the nodes.

[root@dbrac1 init.d]# ps -ef | grep d.bin | grep -v grep
should retrun no rows.


[root@dbrac1 init.d]# /home/oracle/oracle/product/10.2.0/crs/bin/crsctl add css votedisk /dev/raw/raw4 -force
Now formatting voting disk: /dev/raw/raw4
successful addition of votedisk /dev/raw/raw4.
[root@dbrac1 init.d]#

Check that the newly added disk can be seen from both nodes.

[root@dbrac1 init.d]# /home/oracle/oracle/product/10.2.0/crs/bin/crsctl query css votedisk
0. 0 /dev/raw/raw2
1. 0 /dev/raw/raw4

located 2 votedisk(s).


[root@dbrac2 init.d]# /home/oracle/oracle/product/10.2.0/crs/bin/crsctl query css votedisk
0. 0 /dev/raw/raw2
1. 0 /dev/raw/raw4

located 2 votedisk(s).


The same error is encountered should you wish to delete the newly added disk

[root@dbrac1 bin]# ./crsctl delete css votedisk /dev/raw/raw4
Cluster is not in a ready state for online disk removal
[root@dbrac1 bin]# ./crsctl delete css votedisk /dev/raw/raw4 -force
successful deletion of votedisk /dev/raw/raw4.

Again please note that the force option should be used only after the CRS stack
is shutdown on all nodes. Failure to do so could result in OCR corruption.