Tuesday, October 24, 2006

Recovering a Standby database from a missing archivelog

A physical standby database relies on continuous application of archivelogs from its primary database to stay in sync with it. In Oracle Database releases prior to 10g, if an archivelog needed by the standby went missing or was corrupted, you had to rebuild the standby database from scratch.

In 10g you can instead take an incremental backup of the primary, starting from the standby's current SCN, and use it to roll the standby forward, compensating for the missing archivelogs, as shown below.

In the example below, archivelogs with sequence numbers 137 and 138, which are required on the standby, are deleted to simulate this problem.
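
To confirm which sequences are missing, you can query v$archive_gap on the standby before you start (a minimal sketch; in this scenario it should report thread 1, sequences 137 to 138):

STDBY>>select thread#, low_sequence#, high_sequence# from v$archive_gap;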

Step 1: On the standby database, check the current SCN.

STDBY>>select current_scn from v$database;
CURRENT_SCN
-----------
4793543

Step 2: On the primary database, create the needed incremental backup starting from the above SCN.

C:\Documents and Settings\frego>rman target /
Recovery Manager: Release 10.2.0.2.0 - Production on Wed Oct 25 13:39:04 2006

Copyright (c) 1982, 2005, Oracle. All rights reserved.

connected to target database: USYD (DBID=768471617)

RMAN> BACKUP DEVICE TYPE DISK INCREMENTAL FROM SCN 4793543 DATABASE
FORMAT 'C:\temp\bkup_%U';


Starting backup at 25/OCT/06
using channel ORA_DISK_1
channel ORA_DISK_1: starting full datafile backupset
channel ORA_DISK_1: specifying datafile(s) in backupset
input datafile fno=00001 name=C:\ORACLE\PRODUCT\10.2.0\ORADATA\USYD\SYSTEM01.DBF
input datafile fno=00002 name=C:\ORACLE\PRODUCT\10.2.0\ORADATA\USYD\UNDOTBS01.DBF
input datafile fno=00003 name=C:\ORACLE\PRODUCT\10.2.0\ORADATA\USYD\SYSAUX01.DBF
input datafile fno=00004 name=C:\ORACLE\PRODUCT\10.2.0\ORADATA\USYD\USERS01.DBF
channel ORA_DISK_1: starting piece 1 at 25/OCT/06
channel ORA_DISK_1: finished piece 1 at 25/OCT/06
piece handle=C:\TEMP\BKUP_02I0NSNE_1_1 tag=TAG20061025T134102 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:35
channel ORA_DISK_1: starting full datafile backupset
channel ORA_DISK_1: specifying datafile(s) in backupset
including current control file in backupset
including current SPFILE in backupset
channel ORA_DISK_1: starting piece 1 at 25/OCT/06
channel ORA_DISK_1: finished piece 1 at 25/OCT/06
piece handle=C:\TEMP\BKUP_03I0NSOI_1_1 tag=TAG20061025T134102 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:02
Finished backup at 25/OCT/06

Step 3: Cancel managed recovery at the standby database

STDBY>>recover managed standby database cancel;
Media recovery complete.

Move the backup files to a new folder called INCR_BACKUP so that they are the only files in that folder.
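
On Windows this can be done as follows (a sketch, assuming the backup pieces were written to C:\temp with the bkup_ prefix used above):

C:\>mkdir C:\temp\INCR_BACKUP
C:\>move C:\temp\bkup_* C:\temp\INCR_BACKUP\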

Step 4: Catalog the Incremental Backup Files at the Standby Database

C:\oracle\product\10.2.0\db_1\database>rman target /

Recovery Manager: Release 10.2.0.2.0 - Production on Wed Oct 25 14:54:27 2006

Copyright (c) 1982, 2005, Oracle. All rights reserved.

connected to target database: USYD (DBID=768471617, not open)

RMAN> CATALOG START WITH 'C:\temp\INCR_BACKUP';
searching for all files that match the pattern C:\temp\INCR_BACKUP

List of Files Unknown to the Database
=====================================
File Name: C:\TEMP\INCR_BACKUP\BKUP_02I0NSNE_1_1
File Name: C:\TEMP\INCR_BACKUP\BKUP_03I0NSOI_1_1

Do you really want to catalog the above files (enter YES or NO)? YES
cataloging files...
cataloging done

List of Cataloged Files
=======================
File Name: C:\TEMP\INCR_BACKUP\BKUP_02I0NSNE_1_1
File Name: C:\TEMP\INCR_BACKUP\BKUP_03I0NSOI_1_1

Step 5: Apply the Incremental Backup to the Standby Database

RMAN> RECOVER DATABASE NOREDO;

Starting recover at 25/OCT/06
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=151 devtype=DISK
channel ORA_DISK_1: starting incremental datafile backupset restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
destination for restore of datafile 00001: C:\ORACLE\PRODUCT\10.2.0\ORADATA\STDBY\SYSTEM01.DBF
destination for restore of datafile 00002: C:\ORACLE\PRODUCT\10.2.0\ORADATA\STDBY\UNDOTBS01.DBF
destination for restore of datafile 00003: C:\ORACLE\PRODUCT\10.2.0\ORADATA\STDBY\SYSAUX01.DBF
destination for restore of datafile 00004: C:\ORACLE\PRODUCT\10.2.0\ORADATA\STDBY\USERS01.DBF
channel ORA_DISK_1: reading from backup piece C:\TEMP\INCR_BACKUP\BKUP_02I0NSNE_1_1
channel ORA_DISK_1: restored backup piece 1
piece handle=C:\TEMP\INCR_BACKUP\BKUP_02I0NSNE_1_1 tag=TAG20061025T134102
channel ORA_DISK_1: restore complete, elapsed time: 00:00:03
Finished recover at 25/OCT/06

Step 6: Put the standby database back into managed recovery mode.

STDBY>> recover managed standby database nodelay disconnect;
Media recovery complete.

From the alert.log you will notice that the standby database is still looking for the missing archivelogs:

*************************************************
FAL[client]: Failed to request gap sequence
GAP - thread 1 sequence 137-137
DBID 768471617 branch 600609988
**************************************************

This is because the standby controlfile has not been updated. Hence the standby controlfile has to be recreated.

On the primary:

PRIM>>alter database create standby controlfile as 'C:\temp\standby01.ctl';
Database altered.

Copy the standby control file to the standby site and restart the standby database in managed recovery mode...
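
A minimal sketch of those steps, assuming the new control file has already been transferred to C:\temp on the standby host and that the standby control files live under C:\ORACLE\PRODUCT\10.2.0\ORADATA\STDBY (check the control_files parameter on the standby for the actual locations, and overwrite every copy listed there):

STDBY>>shutdown immediate;
STDBY>>host copy C:\temp\standby01.ctl C:\ORACLE\PRODUCT\10.2.0\ORADATA\STDBY\CONTROL01.CTL
STDBY>>startup mount;
STDBY>>recover managed standby database nodelay disconnect;

Once the standby is back in managed recovery mode, the alert.log shows: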

Managed Standby Recovery not using Real Time Apply
parallel recovery started with 2 processes
Media Recovery Waiting for thread 1 sequence 139

As you can see from the above output, the standby is now looking for the next archivelog (sequence 139).
If you have the luxury of time at your disposal, you can also open the standby database read only and take an export of it.
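
A sketch of how you would do that: take the standby out of managed recovery, open it read only, run the export, and then resume recovery.

STDBY>>recover managed standby database cancel;
STDBY>>alter database open read only;

Run the export against the read-only standby, then restart managed recovery:

STDBY>>recover managed standby database nodelay disconnect;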

Sunday, October 15, 2006

Case Study on How to diagnose a node reboot

The following article looks at how to diagnose a node reboot in a 10G RAC cluster.

Configuration:
2 node RAC on 10.2.0.1 on RH ES 4.0 (kernel 2.6.9-34.0.2.ELsmp)

To diagnose such issues it is very important to establish an accurate timeline of the reboot. It is recommended practice to run the NTP daemon on all nodes in the cluster so that their wall clocks stay in sync.

hippo >ps -ef | grep ntp | grep -v 'grep'
ntp 3349 1 0 Oct15 ? 00:00:00 ntpd -u ntp:ntp -p /var/run/ntpd.pid -g

1. Check the Linux log file (/var/log/messages) to determine the time of the reboot

In this case the reboot occurred at 02:09 hours on the node hippo:

Oct 12 09:24:34 hippo sshd(pam_unix)[5885]: session opened for user frego by frego (uid=0)
Oct 15 02:09:17 hippo syslogd 1.4.1: restart.
Oct 15 02:09:17 hippo syslog: syslogd startup succeeded
Oct 15 02:09:17 hippo kernel: klogd 1.4.1, log source = /proc/kmsg started.

Check the Linux log file on the other node (elephant) at the same point in time.

Oct 14 17:42:01 elephant su(pam_unix)[18041]: session closed for user root
Oct 15 02:09:12 elephant syslogd 1.4.1: restart.
Oct 15 02:09:12 elephant syslog: syslogd startup succeeded
Oct 15 02:09:12 elephant kernel: klogd 1.4.1, log source = /proc/kmsg started.

From the above output it is clear that both the nodes rebooted at approximately the same time.

2. Check which process is responsible for the reboot.

Review the log file of the CSS daemon. The log file resides in $ORA_CRS_HOME/log/<hostname>/cssd.

hippo >pwd
/u00/app/oracle/product/10.2.0/crs/log/hippo

hippo >cd cssd

hippo >pwd
/u00/app/oracle/product/10.2.0/crs/log/hippo/cssd

hippo >ls -ltr
total 9376
drwxr-xr-x 2 oracle dba 4096 Sep 8 18:08 oclsmon
-rw-r----- 1 oracle dba 105 Oct 15 02:02 ocssd.trc
-rw-r--r-- 1 oracle dba 5 Oct 15 02:11 hippo.pid
-rw-r--r-- 1 oracle dba 17 Oct 15 02:11 cssdOUT.log
-rw-r--r-- 1 oracle dba 9567227 Oct 16 15:13 ocssd.log

From the ocssd.log file

[ CSSD]2006-10-15 02:02:48.697 [50609072] >WARNING: clssnmDiskPMT: long disk latency (54990 ms) to voting disk (0//crs/voting.dsk)
[ CSSD]2006-10-15 02:02:50.719 [50609072] >ERROR: clssnmDiskPMT: 1 of 1 voting disks unavailable (0/0/1)
[ CSSD]2006-10-15 02:02:50.734 [50609072] >TRACE: clssscctx: dump of 0x0x810d500, len 3528


Hence the reboot was caused by latency in accessing the voting disk. The specific thread that monitors this is the DiskPing Monitor Thread (clssnmDiskPMT in the log above), which measures the length of each single I/O to the voting disk and marks the voting disk as unavailable if the I/O takes too long. With its only voting disk unavailable, the CSS daemon failed.
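
To cross-check which voting disk the long latency was reported against, you can list the configured voting disks; in this configuration it should return the /crs/voting.dsk path seen in the ocssd.log above.

hippo >crsctl query css votedisk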

As seen from the following entry in /etc/inittab,
h2:35:respawn:/etc/init.d/init.cssd fatal > /dev/null 2>&1 < /dev/null

a failure exit of the CSS daemon causes the machine to reboot; in other words, the node is rebooted whenever the CSS daemon fails.

This is also logged in alertelephant.log under $ORA_CRS_HOME/log/elephant/, which is the alert.log of the CRS stack.

2006-10-15 02:02:50.740
[cssd(8533)]CRS-1606:CSSD Insufficient voting files available [0 of 1]. Details in /u00/app/oracle/product/10.2.0/crs/log/elephant/cssd/ocssd.log.


Check the ocssd.log file on the other node (elephant) too:

[ CSSD]2006-10-15 02:01:40.255 [50617264] >TRACE: clssgmClientConnectMsg: Connect from con(0xb7c3f580) proc(0xb7c67918) pid() proto(10:2:1:1)
[ CSSD]2006-10-15 02:02:50.724 [112925616] >ERROR: clssnmDiskPMT: 1 of 1 voting disks unavailable (0/0/1)
[ CSSD]2006-10-15 02:02:50.777 [112925616] >TRACE: clssscctx: dump of 0x0x810d500, len 3528


A couple of additional CSS-related parameters were introduced in the 10.2.0.2 patchset to accommodate the long I/O times of storage from vendors such as EMC and NetApp:

• reboottime: (default 3 seconds)
The amount of time allowed for a node to complete a reboot after the CSS daemon has been evicted.
This parameter can be set via the command
crsctl set css reboottime R [-force] (R is seconds)

• disktimeout (default 200 seconds)
The maximum amount of time allowed for a voting file I/O to complete; if this time is exceeded, the voting disk is marked as unavailable.
This parameter can be set via the command
crsctl set css disktimeout D [-force] (D is seconds)

These commands must be run as root on a node where the CRSD is up, unless '-force' is specified, in which case the CRS stack should not be up on any other node.
After the commands are run, all CSS daemons must be restarted as soon as possible for the settings to take effect on all nodes.
You can verify the settings by running ocrdump and checking the values in the resulting OCRDUMPFILE.
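
For example, to raise disktimeout to 300 seconds and then verify it (a sketch; run as root on a node where CRS is up, and note that ocrdump writes a file named OCRDUMPFILE into the current directory):

hippo # $ORA_CRS_HOME/bin/crsctl set css disktimeout 300
hippo # cd /tmp
hippo # $ORA_CRS_HOME/bin/ocrdump
hippo # grep -i -A 2 disktimeout OCRDUMPFILE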

Tuesday, October 03, 2006

Clearing UNKNOWN status of CRS resources

If you get the following errors whilst starting up nodeapps on a 10G RAC cluster:

bart-> srvctl start nodeapps -n bart
CRS-1028: Dependency analysis failed because of:
CRS-0223: Resource 'ora.bart.gsd' has placement error.
CRS-1028: Dependency analysis failed because of:
CRS-0223: Resource 'ora.bart.ons' has placement error.

Check the status of these resources.

bart-> ./ShowCRS | grep UNK
ora.bart.gsd ONLINE UNKNOWN on bart
ora.bart.ons ONLINE UNKNOWN on bart

To clear the state of these resources you have to use the force option (-f) of the crs_stop command:

bart-> crs_stop ora.bart.gsd -f
Attempting to stop `ora.bart.gsd` on member `bart`
Stop of `ora.bart.gsd` on member `bart` succeeded.


bart-> crs_stop ora.bart.ons -f
Attempting to stop `ora.bart.ons` on member `bart`
Stop of `ora.bart.ons` on member `bart` succeeded.

bart-> srvctl start nodeapps -n bart
bart-> ./ShowCRS | grep UNK

bart-> ./ShowCRS | grep bart
ora.bart.gsd ONLINE ONLINE on bart
ora.bart.ons ONLINE ONLINE on bart
ora.bart.vip ONLINE ONLINE on bart

ShowCRS is a shell script based on Metalink Note 259301.1.
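
For reference, here is a minimal sketch of what such a script can look like, based on the crs_stat parsing approach in that note (treat it as illustrative rather than an exact copy):

#!/usr/bin/ksh
# Query the state of 10g CRS resources via crs_stat and print one line per
# resource: name, target state and current state.
RSC_KEY=$1          # optional filter on the resource name
QSTAT=-u
AWK=/usr/bin/awk

# Print the header
$AWK \
  'BEGIN {printf "%-45s %-10s %-18s\n", "HA Resource", "Target", "State";
          printf "%-45s %-10s %-18s\n", "-----------", "------", "-----";}'

# crs_stat -u emits NAME=, TARGET= and STATE= lines for every resource;
# collect the three values and print them on a single line.
$ORA_CRS_HOME/bin/crs_stat $QSTAT | $AWK \
 'BEGIN { FS="="; state = 0; }
  $1~/NAME/ && $2~/'$RSC_KEY'/ {appname = $2; state=1};
  state == 0 {next;}
  $1~/TARGET/ && state == 1 {apptarget = $2; state=2;}
  $1~/STATE/ && state == 2 {appstate = $2; state=3;}
  state == 3 {printf "%-45s %-10s %-18s\n", appname, apptarget, appstate; state=0;}'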