Wednesday, December 13, 2006
Merry Xmas and Happy New Year
I am off to Goa for my annual Xmas vacation.
Here's wishing you all a Merry Christmas and a Happy New Year.
See you all in the New Year
Regards
-Fairlie
Monday, December 11, 2006
Analysing systemstate dumps
In my last article I discussed how to dump the systemstate using various methods. These trace files are quite large and can be difficult to decipher. You can parse the file using an awk script shipped with LTOM (the Lite Onboard Monitor), a diagnostic tool provided by Oracle's Center of Expertise. Once you download and install LTOM you will find a script called ass109.awk in tom_base/tom/src which parses the systemstate into a more readable format.
buffalo>pwd
/usr/opt/oracle/bin/tom_base/tom/src
buffalo>ls -al ass*
-rwxr-x--- 1 oracle dba 34549 Nov 25 2003 ass109.awk
We will attempt to analyse a systemstate dump generated by the following error
>>> WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK! pid=42
System State dumped to trace file /u00/app/oracle/admin/F8900DEV/udump/f8900dev1_ora_12057.trc
Mon Dec 11 09:58:05 2006
buffalo >awk -f ass109.awk f8900dev1_ora_12057.trc > rowcache.out
If you look at rowcache.out
42: last wait for 'ksdxexeotherwait' [Rcache object=0x483a70e4,]
The holder of 0x483a70e4 is
Rcache object=0x483a70e4, 48: 48: is waiting for Latch 5213eeb4
Reading rowcache.out in conjunction with the raw trace file, it can be concluded that process 42 has requested a row cache lock on dc_profiles, which is held by process 48. Process 48 is in turn waiting on the "Child object queue header" latch, which is held by process 44. Process 44, a Data Pump worker, is in turn waiting on 'SGA: allocation forcing component growth', a wait seen when using ASMM (sga_target).
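The blocking chain is pieced together by eyeballing rowcache.out. A quick way to pull out the candidate lines is a grep over the parsed file; the fragment below is a hypothetical sample modelled on the ass109.awk output quoted above (the real file is much larger and its format may vary between versions):

```shell
# Hypothetical fragment modelled on the ass109.awk output quoted above;
# the real file is much larger and its format may differ by version.
cat > /tmp/rowcache.out <<'EOF'
42: last wait for 'ksdxexeotherwait' [Rcache object=0x483a70e4,]
48: is waiting for Latch 5213eeb4
44: waiting for 'SGA: allocation forcing component growth'
EOF

# Pull out every line mentioning a wait, prefixed with its line number,
# to make the blocker/waiter chain easier to trace.
grep -n 'wait' /tmp/rowcache.out
```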
Sunday, December 10, 2006
Systemstate dumps
A system state dump is a trace file containing a snapshot of all processes running in a database instance. It is used to diagnose hang issues and contains more information than a hanganalyze dump. There are various ways to dump the systemstate.
If you can connect to the database you can dump it using
SQL> oradebug setmypid
Statement processed.
SQL> oradebug unlimit
Statement processed.
(The unlimit step is important so that the trace file is not truncated.)
SQL> oradebug dump systemstate 10
Statement processed.
SQL> oradebug tracefile_name
c:\oracle\product\10.2.0\admin\usyd\udump\usyd_ora_17452.trc
A complete file should have the following string at the end
END OF SYSTEM STATE
If you want only the short stack of each process in the instance, use
SQL> oradebug dump systemstate 256
Statement processed.
Alternatively you can dump the system state using
SQL> ALTER SESSION SET EVENTS 'IMMEDIATE TRACE NAME SYSTEMSTATE LEVEL 10';
Session altered.
If you cannot connect to the database instance, in 10g you can use a sessionless (preliminary) connection:
sqlplus -prelim / as sysdba
oradebug setmypid
oradebug unlimit
oradebug dump systemstate 10
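The preliminary-connection steps above lend themselves to a small script. The sketch below only writes the command file and prints it (the path /tmp/systemstate.cmd is arbitrary); on a real system you would feed it to sqlplus -prelim / as sysdba:

```shell
# Write the oradebug commands from the post to a file. On a real system you
# would run:  sqlplus -prelim "/ as sysdba" @/tmp/systemstate.cmd
# Here we only create and display the file.
cat > /tmp/systemstate.cmd <<'EOF'
oradebug setmypid
oradebug unlimit
oradebug dump systemstate 10
exit
EOF
cat /tmp/systemstate.cmd
```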
Prior to 10g you can use a Unix debugger such as gdb, adb or dbx to attach to a shadow process and invoke ksudss, the subroutine in the Oracle source code that dumps the system state.
eiger-> ps -ef | grep HRDEV | grep j00 | grep -v grep
oracle 14022 1 0 Dec10 ? 00:00:21 ora_j000_HRDEV
eiger-> gdb
GNU gdb Red Hat Linux (6.3.0.0-1.132.EL4rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
(gdb) attach
Argument required (process-id to attach).
(gdb) attach 14022
Attaching to process 14022
Reading symbols
…… (output deleted for readability purposes)
0x000000324f7c7c0c in semtimedop () from /lib64/tls/libc.so.6
(gdb) print ksudss(10)
[Switching to Thread 182922098368 (LWP 14022)]
$1 = -1073747016
(gdb) detach
Detaching from program: /usr/opt/oracle/u00/app/oracle/product/10.2.0/bin/oracle, process 14022
(gdb) quit
Since in the above case we attached to a job (background) process, the trace file containing the dump will be in bdump; had we attached to a shadow (foreground) process it would be in udump.
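The interactive gdb session above can be collapsed into a single batch invocation. This is a sketch only: the PID is hypothetical, and since attaching a debugger to a live Oracle process is risky, the command is built as a string here rather than executed:

```shell
# Hypothetical PID of the target process (taken from ps -ef, as above).
PID=14022
# Batch-mode equivalent of the interactive session: attach, call ksudss(10),
# detach. Built as a string; run it with  eval "$GDB_CMD"  on a test system
# only -- never casually on production.
GDB_CMD="gdb --batch -p $PID -ex 'print ksudss(10)' -ex detach"
echo "$GDB_CMD"
```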
For RAC databases, if you need to dump the systemstate on all instances:
SQL> oradebug setmypid
Statement processed.
SQL> oradebug unlimit
Statement processed.
SQL> oradebug -g all dump systemstate 266
Statement processed.
SQL> oradebug tracefile_name
/u00/app/oracle/admin/F8902UAT/udump/f8902uat1_ora_2513.trc
The trace file will have only the following contents
PORADEBUG REMOTE COMMAND BEGIN SERIAL:0 MYPID:2513
PORADEBUG REMOTE COMMAND END SERIAL:0 MYPID:2513 [SUCCESS]
The systemstate dumps themselves are written by the DIAG process on each node.
Please use these commands with caution. On busy systems with a large number of processes they can take quite a while to complete. A systemstate dump may also not portray a consistent picture of the system, especially when dumping all the processes takes a long time.
Sunday, December 03, 2006
Adding a voting disk online
In 10.2 RAC it is now possible to add a voting disk online, in addition to specifying more than one voting disk during the Clusterware install. In 10g R1 there was no software-level mirroring of the voting disk.
I tried this out for the first time today and things didn't go as I expected.
[root@dbrac1 ~]# /home/oracle/oracle/product/10.2.0/crs/bin/crsctl add css votedisk /dev/raw/raw4
Cluster is not in a ready state for online disk addition
You have to use the force option in the above command due to
Bug 4898020 ADDING VOTING DISK ONLINE CRASH THE CRS
which is fixed in the 10.2.0.4 patchset. This is yet another case where clearly documented functionality in 10.2 does not work.
Before using the force option make sure the CRS stack is not up on any of the nodes.
[root@dbrac1 init.d]# ps -ef | grep d.bin | grep -v grep
should return no rows.
[root@dbrac1 init.d]# /home/oracle/oracle/product/10.2.0/crs/bin/crsctl add css votedisk /dev/raw/raw4 -force
Now formatting voting disk: /dev/raw/raw4
successful addition of votedisk /dev/raw/raw4.
[root@dbrac1 init.d]#
Check that the newly added disk can be seen from both nodes.
[root@dbrac1 init.d]# /home/oracle/oracle/product/10.2.0/crs/bin/crsctl query css votedisk
0. 0 /dev/raw/raw2
1. 0 /dev/raw/raw4
located 2 votedisk(s).
[root@dbrac2 init.d]# /home/oracle/oracle/product/10.2.0/crs/bin/crsctl query css votedisk
0. 0 /dev/raw/raw2
1. 0 /dev/raw/raw4
located 2 votedisk(s).
The same error is encountered should you wish to delete the newly added disk
[root@dbrac1 bin]# ./crsctl delete css votedisk /dev/raw/raw4
Cluster is not in a ready state for online disk removal
[root@dbrac1 bin]# ./crsctl delete css votedisk /dev/raw/raw4 -force
successful deletion of votedisk /dev/raw/raw4.
Again, please note that the force option should be used only after the CRS stack is shut down on all nodes. Failure to do so could result in OCR corruption.
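A defensive check before any -force step can be scripted. The function below is a sketch that assumes, as the post does, that any running CRS daemon shows up in ps with a name matching d.bin (crsd.bin, ocssd.bin, evmd.bin):

```shell
# Return success only when no CRS daemon (*d.bin) is running on this node.
# grep -v grep excludes the grep process itself from the match.
crs_stack_down() {
  ! ps -ef | grep 'd\.bin' | grep -v grep > /dev/null
}

if crs_stack_down; then
  echo "CRS stack is down - safe to consider crsctl ... -force"
else
  echo "CRS stack still running - do NOT use -force" >&2
fi
```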
Wednesday, November 29, 2006
Size of ASM files
Whilst testing compression of archivelogs on ASM I was looking for ways to find the size of an archivelog stored in ASM.
You can:
• Use the bytes column in v$asm_file.
• Use the asmcmd command-line utility and its du command.
• Use an API called dbms_diskgroup, as shown below.
SQL> select thread#,sequence#,blocks,compressed from v$archived_log where thread#=2 and sequence#=19;
THREAD# SEQUENCE# BLOCKS COM
---------- ---------- ---------- ---
2 19 3794 YES
[oracle@dbrac2 ~]$ asmcmd
ASMCMD> cd ARCHIVELOG
ASMCMD> ls
2006_11_28/
ASMCMD> cd 2006_11_28
ASMCMD> ls
thread_1_seq_10.289.607731765
thread_1_seq_11.293.607732151
thread_1_seq_12.296.607732301
thread_1_seq_8.286.607731393
thread_1_seq_9.287.607731413
thread_2_seq_13.288.607731413
thread_2_seq_14.290.607731771
thread_2_seq_15.291.607732073
thread_2_seq_16.292.607732151
thread_2_seq_17.294.607732277
thread_2_seq_18.295.607732301
thread_2_seq_19.297.607732423
set serveroutput on
declare
vfilename varchar2(4000);
vfilesize number;
vlbks number;
vfiletype number;
begin
dbms_output.enable(5000);
vfilename := '&file_name';
dbms_diskgroup.getfileattr(vfilename,vfiletype,vfilesize,vlbks);
dbms_output.put_line('File: '||vfilename); dbms_output.new_line;
dbms_output.put_line('Size (Logical Block Size): '||vfilesize); dbms_output.new_line;
dbms_output.put_line('Logical Block Size: '||vlbks); dbms_output.new_line;
dbms_output.put_line('File type: '||vfiletype); dbms_output.new_line;
end;
/
Enter value for file_name: +ORADATA1/TEST/ARCHIVELOG/2006_11_28/thread_2_seq_19.297.607732423
old 9: vfilename := '&file_name';
new 9: vfilename := '+ORADATA1/TEST/ARCHIVELOG/2006_11_28/thread_2_seq_19.297.607732423';
File: +ORADATA1/TEST/ARCHIVELOG/2006_11_28/thread_2_seq_19.297.607732423
Size (Logical Block Size): 3794
Logical Block Size: 512
File type: 4
PL/SQL procedure successfully completed.
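Note that dbms_diskgroup.getfileattr reports the size in logical blocks rather than bytes, so a final conversion is needed. Using the figures from the output above (3794 blocks of 512 bytes):

```shell
# Figures from the dbms_diskgroup output above: 3794 logical blocks of
# 512 bytes each for archived log thread 2, sequence 19.
BLOCKS=3794
BLKSZ=512
BYTES=$(( BLOCKS * BLKSZ ))
echo "${BYTES} bytes (~$(( BYTES / 1024 / 1024 )) MB)"   # 1942528 bytes
```

The bytes column in v$asm_file should agree with this, give or take file-header overhead.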
Tuesday, November 28, 2006
Recovery using compressed archivelogs
A couple of days ago I mentioned an undocumented feature whereby it is possible to generate compressed archivelogs.
I have since tested a couple of scenarios in which I managed to recover my database with a mixture of compressed and non-compressed archivelogs.
I guess this begs the question of why this feature is not documented.
Sunday, November 26, 2006
Archivelog compression?
Apparently there is a command that generates compressed archivelogs, which goes something like this:
SQL> ALTER DATABASE ARCHIVELOG COMPRESS enable;
Database altered.
I have skimmed through the documentation but can't find a reference to this command. The view v$archived_log does, however, have a column called COMPRESSED, which the docs describe as:
COMPRESSED VARCHAR2(3) This column is reserved for internal use only
I am assuming this is related to the underscore parameter _log_archive_compress_enable, the description of which is below. Changing the value of this parameter, however, did not prevent the command from working.
SQL> REM Formatting columns
SQL> set lines 9999
SQL> col indx format 9999
SQL> col inst_id heading "INST" format 9999
SQL> col ksppinm heading "NAME" format a30
SQL> col ksppdesc heading "DESC" format a60
SQL> col ksppstvl heading "CURR VAL" format a15
SQL> col ksppstdf heading "DEFAULT VAL" format a15
SQL> select v.indx,v.inst_id,ksppinm,ksppstvl,ksppstdf,ksppdesc from x$ksppi i ,
x$ksppcv v where i.indx=v.indx and ksppinm like '_log_archive_compress_enable';
INDX INST NAME CURR VAL DEFAULT VAL DESC
----- ----- ------------------------------ --------------- --------------- -------------------------
473 1 _log_archive_compress_enable FALSE TRUE Bypass database rules for enabling archiv
1 row selected.
So by default archivelogs are not compressed.
SQL> select distinct compressed from v$archived_log;
COM
---
NO
So I used this command, generated a few archivelogs and got the following message in one of the trace files.
Archivelog compression complete.
Input: 3139072 bytes Output: 810830 bytes
Compression Performance: 74.17 percent or 2.07 bits per byte
Let's verify this.
SQL> alter database archivelog compress disable;
Database altered.
SQL> alter system switch logfile;
System altered.
SQL> create table scott.obj$ as select * from dba_objects;
Table created.
SQL> alter system switch logfile;
System altered.
SQL> alter database archivelog compress enable;
Database altered.
SQL> create table frego.obj$ as select * from dba_objects;
Table created.
SQL> alter system switch logfile;
System altered.
SQL> select sequence#,blocks,compressed from v$archived_log where sequence#>225;
SEQUENCE# BLOCKS COM
---------- ---------- ---
226 3820 NO
227 1 NO
228 11579 NO
229 83 YES
230 8 YES
231 1911 YES
6 rows selected.
If you compare the two logs of interest (228 and 231; there were other commands in the interim) you find a compression performance similar to the one indicated above.
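The "compression performance" figure can be roughly sanity-checked from the block counts above. Comparing sequence 228 (uncompressed) with sequence 231 (compressed) is only indicative, since they do not contain the same redo, but the saving lands in the same ballpark as the trace-file figure:

```shell
# Block counts from v$archived_log above: seq 228 (uncompressed, 11579
# blocks) vs seq 231 (compressed, 1911 blocks). Not the same redo, so
# treat this as a rough check only.
awk 'BEGIN { printf "%.1f percent saved\n", (1 - 1911/11579) * 100 }'
```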
It would be wise to check if I can recover using compressed archivelogs though.
P.S. This is an undocumented feature (AFAIK) and hence I do not advise anyone to use it in a production environment.
Block Change Tracking
If anybody is using the new 10g feature of incremental backups with a block change tracking file, or is thinking of doing so, I sincerely recommend reading Alex Gorbachev's presentation.
Tuesday, November 21, 2006
Converting database at target
There was a valid response to yesterday's post from The Human Fly, who suggested that this method is not feasible for large databases because of the need for a staging area to hold the converted files.
Well, according to Oracle, disk space is cheap :-) and I know of a lot of bugs that have been closed on this premise (where the workaround meant additional disk space). But seriously, you can get around this:
• You can take a backup of the source database to tape and then overwrite the datafile location (assuming you don’t need the source database anymore)
CONVERT DATABASE NEW DATABASE 'USYD'
transport script 'C:\temp\newtransportscript'
to platform 'Linux 64-bit for AMD'
db_file_name_convert 'C:\ORACLE\PRODUCT\10.2.0\ORADATA\USYD\' 'C:\ORACLE\PRODUCT\10.2.0\ORADATA\USYD\'
;
It is also possible to convert the database at the target, to free up CPU on the source. You need to run the following on the source database:
CONVERT DATABASE ON TARGET PLATFORM
CONVERT SCRIPT 'C:\temp\convertscript.rman'
TRANSPORT SCRIPT 'C:\temp\transport_at_target.sql'
new database 'USYD'
FORMAT 'C:\temp\target\%U';
This will create two scripts:
• convertscript.rman, which contains convert commands for each datafile of the database to be converted.
• transport_at_target.sql, which contains the syntax to create the controlfile and start the database at the target.
LINUX64>>startup nomount pfile='/u01/oradata/USYD/initUSYD.ora'
ORACLE instance started.
Total System Global Area 171966464 bytes
Fixed Size 2069712 bytes
Variable Size 113249072 bytes
Database Buffers 50331648 bytes
Redo Buffers 6316032 bytes
LINUX64>>CREATE CONTROLFILE REUSE SET DATABASE "USYD" RESETLOGS ARCHIVELOG
2 MAXLOGFILES 16
3 MAXLOGMEMBERS 3
4 MAXDATAFILES 100
5 MAXINSTANCES 8
6 MAXLOGHISTORY 292
7 LOGFILE
8 GROUP 1
9 '/u01/oradata/USYD/ARCH_D-USYD_ID-768471617_S-212_T-1_A-600609988_00I2I4PR' SIZE 50M,
GROUP 2
10 11 '/u01/oradata/USYD/ARCH_D-USYD_ID-768471617_S-213_T-1_A-600609988_00I2I4PR' SIZE 50M,
GROUP 3
12 13 '/u01/oradata/USYD/ARCH_D-USYD_ID-768471617_S-211_T-1_A-600609988_00I2I4PR' SIZE 50M
14 DATAFILE
15 '/u01/oradata/USYD/SYSTEM01.DBF',
16 '/u01/oradata/USYD/UNDOTBS01.DBF',
17 '/u01/oradata/USYD/SYSAUX01.DBF',
18 '/u01/oradata/USYD/USERS01.DBF'
19 CHARACTER SET WE8MSWIN1252;
Control file created.
Now convert the files using the script convertscript.rman
eiger-> rman target /
Recovery Manager: Release 10.2.0.2.0 - Production on Wed Nov 22 14:32:33 2006
Copyright (c) 1982, 2005, Oracle. All rights reserved.
connected to target database: USYD (DBID=768471617, not open)
RUN {
CONVERT DATAFILE '/u01/oradata/USYD/SYSTEM01.DBF'
FROM PLATFORM 'Microsoft Windows IA (32-bit)'
FORMAT '/u01/oradata/USYD/NEW/SYSTEM01.DBF';
CONVERT DATAFILE '/u01/oradata/USYD/SYSAUX01.DBF'
FROM PLATFORM 'Microsoft Windows IA (32-bit)'
FORMAT '/u01/oradata/USYD/NEW/SYSAUX01.DBF';
CONVERT DATAFILE '/u01/oradata/USYD/UNDOTBS01.DBF'
FROM PLATFORM 'Microsoft Windows IA (32-bit)'
FORMAT '/u01/oradata/USYD/NEW/UNDOTBS01.DBF';
CONVERT DATAFILE '/u01/oradata/USYD/USERS01.DBF'
FROM PLATFORM 'Microsoft Windows IA (32-bit)'
FORMAT '/u01/oradata/USYD/NEW/USERS01.DBF';
}
Starting backup at 22-NOV-06
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
using channel ORA_DISK_1
channel ORA_DISK_1: starting datafile conversion
input filename=/u01/oradata/USYD/SYSTEM01.DBF
converted datafile=/u01/oradata/USYD/NEW/SYSTEM01.DBF
channel ORA_DISK_1: datafile conversion complete, elapsed time: 00:00:46
Finished backup at 22-NOV-06
Starting backup at 22-NOV-06
channel ORA_DISK_1: sid=39 devtype=DISK
channel ORA_DISK_1: starting datafile conversion
input filename=/u01/oradata/USYD/SYSAUX01.DBF
converted datafile=/u01/oradata/USYD/NEW/SYSAUX01.DBF
channel ORA_DISK_1: datafile conversion complete, elapsed time: 00:02:17
Finished backup at 22-NOV-06
Starting backup at 22-NOV-06
using channel ORA_DISK_1
channel ORA_DISK_1: starting datafile conversion
input filename=/u01/oradata/USYD/UNDOTBS01.DBF
converted datafile=/u01/oradata/USYD/NEW/UNDOTBS01.DBF
channel ORA_DISK_1: datafile conversion complete, elapsed time: 00:02:48
Finished backup at 22-NOV-06
Starting backup at 22-NOV-06
using channel ORA_DISK_1
channel ORA_DISK_1: starting datafile conversion
input filename=/u01/oradata/USYD/USERS01.DBF
converted datafile=/u01/oradata/USYD/NEW/USERS01.DBF
channel ORA_DISK_1: datafile conversion complete, elapsed time: 00:00:07
Finished backup at 22-NOV-06
Unfortunately it appears that the file cannot be overwritten in place, and hence a staging area is required; this is what happened when I tried:
RMAN> CONVERT DATAFILE '/u01/oradata/USYD/SYSTEM01.DBF'
FROM PLATFORM 'Microsoft Windows IA (32-bit)'
FORMAT '/u01/oradata/USYD/SYSTEM01.DBF';
Starting backup at 22-NOV-06
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=36 devtype=DISK
channel ORA_DISK_1: starting datafile conversion
input filename=/u01/oradata/USYD/SYSTEM01.DBF
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of backup command on ORA_DISK_1 channel at 11/22/2006 14:35:00
ORA-19635: input and output filenames are identical: /u01/oradata/USYD/SYSTEM01.DBF
You then need to move the files out of the staging area and open the database with RESETLOGS:
SQL> shutdown immediate
ORA-01109: database not open
Database dismounted.
ORACLE instance shut down.
eiger-> cd USYD/N*
eiger-> ls
SYSAUX01.DBF SYSTEM01.DBF UNDOTBS01.DBF USERS01.DBF
eiger-> pwd
/u01/oradata/USYD/NEW
eiger-> mv * ../
SQL> startup mount pfile='/u01/oradata/USYD/initUSYD.ora'
ORACLE instance started.
Total System Global Area 171966464 bytes
Fixed Size 2069712 bytes
Variable Size 113249072 bytes
Database Buffers 50331648 bytes
Redo Buffers 6316032 bytes
Database mounted.
SQL> recover database using BACKUP CONTROLFILE until cancel;
ORA-00279: change 6936865 generated at 11/22/2006 10:13:25 needed for thread 1
ORA-00289: suggestion :
/u00/app/oracle/product/10.2.0/dbs/arch1_219_600609988.dbf
ORA-00280: change 6936865 for thread 1 is in sequence #219
Specify log: {=suggested | filename | AUTO | CANCEL}
cancel
Media recovery cancelled.
SQL> alter database open resetlogs;
Database altered.
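Given the ORA-19635 finding above, the staging area must hold a full copy of every converted datafile, so a rough space budget is simply the sum of the datafile sizes. The sketch below demonstrates the idea on throwaway files so it is self-contained; on a real system you would point DATADIR at the actual datafile directory (e.g. /u01/oradata/USYD):

```shell
# Throwaway sample files stand in for real datafiles here; set DATADIR to
# the actual datafile directory on a real system.
DATADIR=$(mktemp -d)
dd if=/dev/urandom of="$DATADIR/system01.dbf" bs=1024 count=100 2>/dev/null
dd if=/dev/urandom of="$DATADIR/users01.dbf"  bs=1024 count=50  2>/dev/null

# Staging space needed is at least the total size of the files to convert.
NEEDED_KB=$(du -ck "$DATADIR"/*.dbf | awk 'END { print $1 }')
echo "staging area needs at least ${NEEDED_KB} KB"
rm -rf "$DATADIR"
```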
Well according to Oracle disk space is cheap :-) and I know a lot of bugs that have been closed on this premise (where the workaround has meant additional disk space)
But seriously you can get around this
• You can take a backup of the source database to tape and then overwrite the datafile location (assuming you don’t need the source database anymore)
CONVERT DATABASE NEW DATABASE 'USYD'
transport script 'C:\temp\newtransportscript'
to platform 'Linux 64-bit for AMD'
db_file_name_convert 'C:\ORACLE\PRODUCT\10.2.0\ORADATA\USYD\' 'C:\ORACLE\PRODUCT\10.2.0\ORADATA\USYD\'
;
It is also possible to convert the database at the target to free up CPU on the source. You need to run the following on the source database
CONVERT DATABASE ON TARGET PLATFORM
CONVERT SCRIPT 'C:\temp\convertscript.rman'
TRANSPORT SCRIPT 'C:\temp\transport_at_target.sql'
new database 'USYD'
FORMAT 'C:\temp\target\%U';
This will create 2 scripts
• convertscript.rman which will have convert commands for each datafile of the database to be converted.
• transport_at_target.sql contains the syntax to create the controlfile and start the database at the target
LINUX64>>startup nomount pfile='/u01/oradata/USYD/initUSYD.ora'
ORACLE instance started.
Total System Global Area 171966464 bytes
Fixed Size 2069712 bytes
Variable Size 113249072 bytes
Database Buffers 50331648 bytes
Redo Buffers 6316032 bytes
LINUX64>>CREATE CONTROLFILE REUSE SET DATABASE "USYD" RESETLOGS ARCHIVELOG
2 MAXLOGFILES 16
3 MAXLOGMEMBERS 3
4 MAXDATAFILES 100
5 MAXINSTANCES 8
6 MAXLOGHISTORY 292
7 LOGFILE
8 GROUP 1
9 '/u01/oradata/USYD/ARCH_D-USYD_ID-768471617_S-212_T-1_A-600609988_00I2I4PR' SIZE 50M,
GROUP 2
10 11 '/u01/oradata/USYD/ARCH_D-USYD_ID-768471617_S-213_T-1_A-600609988_00I2I4PR' SIZE 50M,
GROUP 3
12 13 '/u01/oradata/USYD/ARCH_D-USYD_ID-768471617_S-211_T-1_A-600609988_00I2I4PR' SIZE 50M
14 DATAFILE
15 '/u01/oradata/USYD/SYSTEM01.DBF',
16 '/u01/oradata/USYD/UNDOTBS01.DBF',
17 '/u01/oradata/USYD/SYSAUX01.DBF',
18 '/u01/oradata/USYD/USERS01.DBF'
19 CHARACTER SET WE8MSWIN1252;
Control file created.
Now convert the files using the script convertscript.rman
eiger-> rman target /
Recovery Manager: Release 10.2.0.2.0 - Production on Wed Nov 22 14:32:33 2006
Copyright (c) 1982, 2005, Oracle. All rights reserved.
connected to target database: USYD (DBID=768471617, not open)
RUN {
CONVERT DATAFILE '/u01/oradata/USYD/SYSTEM01.DBF'
FROM PLATFORM 'Microsoft Windows IA (32-bit)'
FORMAT '/u01/oradata/USYD/NEW/SYSTEM01.DBF';
CONVERT DATAFILE '/u01/oradata/USYD/SYSAUX01.DBF'
FROM PLATFORM 'Microsoft Windows IA (32-bit)'
FORMAT '/u01/oradata/USYD/NEW/SYSAUX01.DBF';
CONVERT DATAFILE '/u01/oradata/USYD/UNDOTBS01.DBF'
FROM PLATFORM 'Microsoft Windows IA (32-bit)'
FORMAT '/u01/oradata/USYD/NEW/UNDOTBS01.DBF';
CONVERT DATAFILE '/u01/oradata/USYD/USERS01.DBF'
FROM PLATFORM 'Microsoft Windows IA (32-bit)'
FORMAT '/u01/oradata/USYD/NEW/USERS01.DBF';
}
Starting backup at 22-NOV-06
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
using channel ORA_DISK_1
channel ORA_DISK_1: starting datafile conversion
input filename=/u01/oradata/USYD/SYSTEM01.DBF
converted datafile=/u01/oradata/USYD/NEW/SYSTEM01.DBF
channel ORA_DISK_1: datafile conversion complete, elapsed time: 00:00:46
Finished backup at 22-NOV-06
Starting backup at 22-NOV-06
channel ORA_DISK_1: sid=39 devtype=DISK
channel ORA_DISK_1: starting datafile conversion
input filename=/u01/oradata/USYD/SYSAUX01.DBF
converted datafile=/u01/oradata/USYD/NEW/SYSAUX01.DBF
channel ORA_DISK_1: datafile conversion complete, elapsed time: 00:02:17
Finished backup at 22-NOV-06
Starting backup at 22-NOV-06
using channel ORA_DISK_1
channel ORA_DISK_1: starting datafile conversion
input filename=/u01/oradata/USYD/UNDOTBS01.DBF
converted datafile=/u01/oradata/USYD/NEW/UNDOTBS01.DBF
channel ORA_DISK_1: datafile conversion complete, elapsed time: 00:02:48
Finished backup at 22-NOV-06
Starting backup at 22-NOV-06
using channel ORA_DISK_1
channel ORA_DISK_1: starting datafile conversion
input filename=/u01/oradata/USYD/USERS01.DBF
converted datafile=/u01/oradata/USYD/NEW/USERS01.DBF
channel ORA_DISK_1: datafile conversion complete, elapsed time: 00:00:07
Finished backup at 22-NOV-06
Unfortunately a datafile cannot be converted in place (the input and output names must differ), hence a staging area is required. This is what happened when I tried to overwrite the original:
RMAN> CONVERT DATAFILE '/u01/oradata/USYD/SYSTEM01.DBF'
FROM PLATFORM 'Microsoft Windows IA (32-bit)'
FORMAT '/u01/oradata/USYD/SYSTEM01.DBF';
Starting backup at 22-NOV-06
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=36 devtype=DISK
channel ORA_DISK_1: starting datafile conversion
input filename=/u01/oradata/USYD/SYSTEM01.DBF
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of backup command on ORA_DISK_1 channel at 11/22/2006 14:35:00
ORA-19635: input and output filenames are identical: /u01/oradata/USYD/SYSTEM01.DBF
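Because of this restriction, the staging workflow boils down to: convert into a scratch directory, then move the converted files over the originals. A minimal shell sketch, using throwaway directories in place of /u01/oradata/USYD and `touch` standing in for RMAN's converted output:

```shell
#!/bin/sh
# Sketch of the staging-area workflow. In the real case DB_DIR is
# /u01/oradata/USYD and RMAN CONVERT writes its output into $STAGE.
DB_DIR=$(mktemp -d)      # stands in for /u01/oradata/USYD
STAGE="$DB_DIR/NEW"      # staging area for converted datafiles

mkdir -p "$STAGE"
# Pretend RMAN CONVERT produced these files in the staging area:
touch "$STAGE/SYSTEM01.DBF" "$STAGE/SYSAUX01.DBF"

# Once conversion completes, replace the originals:
mv "$STAGE"/*.DBF "$DB_DIR"/
ls "$DB_DIR"
```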
You then need to move the converted files out of the staging area and open the database with RESETLOGS:
SQL> shutdown immediate
ORA-01109: database not open
Database dismounted.
ORACLE instance shut down.
eiger-> cd USYD/N*
eiger-> ls
SYSAUX01.DBF SYSTEM01.DBF UNDOTBS01.DBF USERS01.DBF
eiger-> pwd
/u01/oradata/USYD/NEW
eiger-> mv * ../
SQL> startup mount pfile='/u01/oradata/USYD/initUSYD.ora'
ORACLE instance started.
Total System Global Area 171966464 bytes
Fixed Size 2069712 bytes
Variable Size 113249072 bytes
Database Buffers 50331648 bytes
Redo Buffers 6316032 bytes
Database mounted.
SQL> recover database using BACKUP CONTROLFILE until cancel;
ORA-00279: change 6936865 generated at 11/22/2006 10:13:25 needed for thread 1
ORA-00289: suggestion :
/u00/app/oracle/product/10.2.0/dbs/arch1_219_600609988.dbf
ORA-00280: change 6936865 for thread 1 is in sequence #219
Specify log: {
cancel
Media recovery cancelled.
SQL> alter database open resetlogs;
Database altered.
Monday, November 20, 2006
Performing Cross-Platform Database Transport
Starting with 10g R2 it is now possible to transport an entire database (not just a few tablespaces) using the RMAN CONVERT DATABASE command.
In the scenario below we are converting from a Windows 32-bit system to a Linux x86-64 system.
First we check whether both platforms (source and destination) have the same endian format, using the dbms_tdb package.
The database needs to be open in read-only mode for the following procedure to work.
WINDOWS>>set serveroutput on
WINDOWS>>declare
2 db_ready boolean;
3 begin
4 db_ready := dbms_tdb.check_db('Linux 64-bit for AMD',dbms_tdb.skip_readonly);
5 end;
6 /
PL/SQL procedure successfully completed.
If the platforms did not have the same endian format we would get a message like the one shown below:
WINDOWS>>declare
2 db_ready boolean;
3 begin
4 db_ready := dbms_tdb.check_db('Solaris[tm] OE (64-bit)',dbms_tdb.skip_readonly);
5 end;
6 /
The specified target platform name 'Solaris[tm] OE (64-bit)' is invalid
or the target platform is not transportable.
PL/SQL procedure successfully completed.
Next we need to identify any external objects which will not be transported.
These include external tables, directories, and BFILEs.
WINDOWS>>declare
2 external boolean;
3 begin
4 /* value of external is ignored, but with SERVEROUTPUT set to ON
5  * dbms_tdb.check_external displays report of external objects
6  * on console */
7 external := dbms_tdb.check_external;
8 end;
9
10 /
The following directories exist in the database:
SYS.WORK_DIR, SYS.ADMIN_DIR, SYS.DATA_PUMP_DIR
RMAN> CONVERT DATABASE NEW DATABASE 'USYD'
2> transport script 'C:\temp\transportscript'
3> to platform 'Linux 64-bit for AMD'
4> db_file_name_convert 'C:\ORACLE\PRODUCT\10.2.0\ORADATA\USYD\' 'C:\temp\USYD\'
5> ;
In the above command, C:\ORACLE\PRODUCT\10.2.0\ORADATA\USYD is the source location of the datafiles and C:\temp\USYD\ is the folder in which the files will be placed after they are converted. You can open the source database in read-write mode again once the conversion is complete.
Starting convert at 16/NOV/06
using channel ORA_DISK_1
Directory SYS.WORK_DIR found in the database
Directory SYS.ADMIN_DIR found in the database
Directory SYS.DATA_PUMP_DIR found in the database
User SYS with SYSDBA and SYSOPER privilege found in password file
channel ORA_DISK_1: starting datafile conversion
input datafile fno=00001
name=C:\ORACLE\PRODUCT\10.2.0\ORADATA\USYD\SYSTEM01.DBF
converted datafile=C:\TEMP\USYD\SYSTEM01.DBF
channel ORA_DISK_1: datafile conversion complete, elapsed time:
00:00:25
channel ORA_DISK_1: starting datafile conversion
input datafile fno=00003
name=C:\ORACLE\PRODUCT\10.2.0\ORADATA\USYD\SYSAUX01.DBF
converted datafile=C:\TEMP\USYD\SYSAUX01.DBF
channel ORA_DISK_1: datafile conversion complete, elapsed time:
00:00:27
channel ORA_DISK_1: starting datafile conversion
input datafile fno=00002
name=C:\ORACLE\PRODUCT\10.2.0\ORADATA\USYD\UNDOTBS01.DBF
converted datafile=C:\TEMP\USYD\UNDOTBS01.DBF
channel ORA_DISK_1: datafile conversion complete, elapsed time:
00:00:25
channel ORA_DISK_1: starting datafile conversion
input datafile fno=00004
name=C:\ORACLE\PRODUCT\10.2.0\ORADATA\USYD\USERS01.DBF
converted datafile=C:\TEMP\USYD\USERS01.DBF
channel ORA_DISK_1: datafile conversion complete, elapsed time:
00:00:08
Run SQL script C:\TEMP\TRANSPORTSCRIPT on the target platform to create
database
Edit init.ora file C:\ORACLE\PRODUCT\10.2.0\DB_1\DATABASE\INIT_00I2I4PR_1_0.ORA.
This pfile will be used to create the database on the target platform
To recompile all PL/SQL modules, run utlirp.sql and utlrp.sql on the
target platform
To change the internal database identifier, use DBNEWID Utility
Finished backup at 16/NOV/06
Since the source is a Windows machine and the destination is Linux x86-64, you have to change the location of the files in the transport script that was created.
Create an init.ora file called initUSYD.ora with the following parameters:
control_files = "/u01/oradata/USYD/cf_D-USYD_id-1778429277_00gb9u2s"
instance_name = "USYD"
plsql_native_library_dir = "/tmp/convertdb/plsqlnld1"
db_name = "USYD"
sga_max_size =5000000
compatible='10.2.0'
eiger-> sqlplus
SQL*Plus: Release 10.2.0.2.0 - Production on Mon Nov 20 17:10:28 2006
Copyright (c) 1982, 2005, Oracle. All Rights Reserved.
Enter user-name: / as sysdba
Connected to an idle instance.
SQL> startup nomount pfile='/u01/oradata/USYD/initUSYD.ora'
ORACLE instance started.
Total System Global Area 171966464 bytes
Fixed Size 2069712 bytes
Variable Size 113249072 bytes
Database Buffers 50331648 bytes
Redo Buffers 6316032 bytes
Cut and paste the commands from the transportscript created.
SQL> CREATE CONTROLFILE REUSE SET DATABASE "USYD" RESETLOGS ARCHIVELOG
2 MAXLOGFILES 16
3 MAXLOGMEMBERS 3
4 MAXDATAFILES 100
5 MAXINSTANCES 8
6 MAXLOGHISTORY 292
7 LOGFILE
8 GROUP 1
9 '/u01/oradata/USYD/ARCH_D-USYD_ID-768471617_S-212_T-1_A-600609988_00I2I4PR' SIZE 50M,
10 GROUP 2
11 '/u01/oradata/USYD/ARCH_D-USYD_ID-768471617_S-213_T-1_A-600609988_00I2I4PR' SIZE 50M,
12 GROUP 3
13 '/u01/oradata/USYD/ARCH_D-USYD_ID-768471617_S-211_T-1_A-600609988_00I2I4PR' SIZE 50M
14 DATAFILE
15 '/u01/oradata/USYD/SYSTEM01.DBF',
16 '/u01/oradata/USYD/UNDOTBS01.DBF',
17 '/u01/oradata/USYD/SYSAUX01.DBF',
18 '/u01/oradata/USYD/USERS01.DBF'
19 CHARACTER SET WE8MSWIN1252
20 ;
Control file created.
SQL> set numwidth 20
SQL> select distinct checkpoint_change# from v$datafile;
CHECKPOINT_CHANGE#
--------------------
6631021
SQL> select distinct checkpoint_change# from v$datafile_header;
CHECKPOINT_CHANGE#
--------------------
6631021
SQL> ALTER DATABASE OPEN RESETLOGS;
Database altered.
SQL> ALTER TABLESPACE TEMP ADD TEMPFILE '/u01/oradata/USYD/DATA_D-USYD_I-768471617_TS-TEMP_FNO-1_00I2I4PR' SIZE 27262976 AUTOEXTEND ON NEXT 655360 MAXSIZE 32767M;
Tablespace altered.
Tuesday, November 14, 2006
Searching for X$ tables
You can query the names of X$ tables from X$KQFTA.
I was looking for the X$ tables related to Dynamic Remastering and found them from
SQL> select KQFTANAM from x$kqfta where KQFTANAM like 'X$KJDR%';
KQFTANAM
------------------------------
X$KJDRHV
X$KJDRPCMHV
X$KJDRPCMPF
X$KJDRMAFNSTATS
X$KJDRMHVSTATS
X$KJDRMREQ
6 rows selected.
Similarly, to look for dictionary tables I query the public synonym DICT, which is based on SYS.DICTIONARY.
For example:
SQL> select table_name from dict where table_name like '%FILEST%';
TABLE_NAME
------------------------------
DBA_HIST_FILESTATXS
GV$FILESTAT
V$FILESTAT
Monday, November 13, 2006
Diagnosing unshared SQL in 10g
One of the ways to increase the scalability and throughput of an application is to ensure that SQL statements are shared, reducing the number of child cursors per statement. In versions prior to 10g the view V$SQL_SHARED_CURSOR would aid in diagnosing why SQL statements were not being shared. In 10g there is a new event called CURSORTRACE which aids in the same task.
Let us take the following simple sql statement.
SESSION1>> select count(*) from dept;
COUNT(*)
----------
4
SESSION1>> select sql_text, hash_value,address from v$sqlarea where sql_text like 'select count(*) from dept%';
SQL_TEXT HASH_VALUE ADDRESS
----------------------------------- ---------- --------
select count(*) from dept 761178024 19156098
This is the first child of the above SQL statement, with all columns set to 'N':
SESSION1>> select * from v$sql_shared_cursor where address = '19156098';
SQL_ID ADDRESS CHILD_AD CHILD_NUMBER U S O O S L S E B P I S T A B D L T
------------- -------- -------- ------------ - - - - - - - - - - - - - - - - - -
R I I R L I O S M U T N F A I T D L D B P C S R P T M B M R O P M F L
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
1jrz3ucqpx9x8 19156098 1CEFB584 0 N N N N N N N N N N N N N N N N N N
N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N
Now let us open a new session and issue the above SQL statement, but ensure that the cursor is not shared:
SESSION2>>alter session set optimizer_mode='RULE';
Session altered.
SESSION2>>select count(*) from dept;
COUNT(*)
----------
4
Check from Session 1 how many children have been generated
SESSION1>>select * from v$sql_shared_cursor where address = '19156098';
SQL_ID ADDRESS CHILD_AD CHILD_NUMBER U S O O S L S E B P I S T A B D L T
------------- -------- -------- ------------ - - - - - - - - - - - - - - - - - -
R I I R L I O S M U T N F A I T D L D B P C S R P T M B M R O P M F L
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
1jrz3ucqpx9x8 19156098 1CEFB584 0 N N N N N N N N N N N N N N N N N N
N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N
1jrz3ucqpx9x8 19156098 1BA90164 1 N N N N N N N N N N N N N N N N N N
N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N Y N N N N
As expected we have one more child, caused by the OPTIMIZER_MODE_MISMATCH which we forced.
SESSION2>>alter session set events 'immediate trace name cursortrace level 612, address 761178024';
(761178024 is the hash value of the SQL statement.)
Session altered.
SESSION2>>select count(*) from dept;
COUNT(*)
----------
4
SESSION2>>alter session set optimizer_mode='FIRST_ROWS_1';
Session altered.
SESSION2>>select count(*) from dept;
COUNT(*)
----------
4
The resulting trace file generated in user_dump_dest throws more light on why the SQL statement is not shared.
Here is a snippet from the trace file:
*************************************************************************
PARSING SQLTEXT=select count(*) from dept
SQLHASH=2d5ea7a8
Checking for already pinned child.
No valid child pinned
Parent 1EA853E8(19156098) ready for search
kksSearchChildList outside while loop
kksCheckCursor: next child is #1
kksCheckCursor: pinning child #1 in shared mode 1eb62650 1ba90164
Compilation environment difference Failed sharing : 0
optimizer_mode = first_rows_1 rule
sqlstat_enabled = false true
SQL pgadep:0 pgapls:0 user
Failed sharing : Compilation environment mismatch
kksUnlockChild: releasing child
*************************************************************************
Sunday, November 05, 2006
To use DRM or not to use DRM
10g Real Application Clusters introduced a concept of resource remastering via Dynamic Resource Mastering (DRM). With DRM a resource can be re-mastered on another node in the cluster if it is found that the cache resource is accessed more frequently from that node.
In 10g R1 remastering was file based, whereas in 10g R2 it is object based.
In 10g R1, due to a few bugs (many related to high CPU usage during the DRM freeze window), most customers disabled DRM by setting the following parameters:
_gc_affinity_time=0
_gc_undo_affinity=FALSE
_gc_affinity_time defines the frequency in minutes to check whether remastering is needed.
_gc_affinity_limit defines the number of times a node must access an object for it to become a DRM candidate.
_gc_affinity_minimum defines the minimum number of times an object is accessed per minute before affinity kicks in.
The performance problems may manifest themselves as a DRM-related wait event like 'gcs drm freeze in enter server mode'.
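To see whether an instance is accumulating such waits, a query along the following lines can be used (a sketch: it reads the standard V$SYSTEM_EVENT view, and the 'gcs drm%' pattern is an assumption about how the relevant events are named):

SQL> select event, total_waits, time_waited
  2  from v$system_event
  3  where event like 'gcs drm%';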
In 10G R2 this feature appears to be more stable.
You can also manually remaster an object onto a node different from the one on which it is currently mastered, as shown below.
SQL> select object_id,current_master, previous_master ,remaster_cnt from V$GCSPFMASTER_INFO where object_id = 144615
OBJECT_ID CURRENT_MASTER PREVIOUS_MASTER REMASTER_CNT
---------- -------------- --------------- ------------
144615 0 2 0
The object 144615 is currently mastered on node 0.
To remaster the object onto node 2, connect to instance 2 as SYSDBA:
NODE2> oradebug setmypid
Statement processed.
NODE2> oradebug lkdebug -m pkey 144615
Statement processed.
NODE2> select object_id,current_master, previous_master ,remaster_cnt from V$GCSPFMASTER_INFO where object_id = 144615
OBJECT_ID CURRENT_MASTER PREVIOUS_MASTER REMASTER_CNT
---------- -------------- --------------- ------------
144615 2 0 0
Note: In V$GCSPFMASTER_INFO you will also see resources with object IDs in the 4G range (e.g. 4294950913); these are for undo segments.
To dissolve remastering of this object on this instance
SQL> oradebug lkdebug -m dpkey 144615
Statement processed.
SQL> select object_id,current_master, previous_master ,remaster_cnt from V$GCSPFMASTER_INFO where object_id = 144615;
no rows selected
The remaster_cnt appears to be 0 for all objects. I have had Oracle log bug 5649377 on this issue.
SQL> select distinct remaster_cnt from V$GCSPFMASTER_INFO ;
REMASTER_CNT
------------
0
DRM statistics are available in X$KJDRMAFNSTATS
SQL> select * from X$KJDRMAFNSTATS
2 /
ADDR INDX INST_ID DRMS AVG_DRM_TIME OBJECTS_PER_DRM QUISCE_T FRZ_T CLEANUP_T REPLAY_T FIXWRITE_T SYNC_T
-------- ---------- ---------- ---------- ------------ --------------- ---------- ---------- ---------- ---------- ---------- ----------
RES_CLEANED REPLAY_S REPLAY_R MY_OBJECTS
----------- ---------- ---------- ----------
200089CC 0 1 32 214 1 3 14 0 0 0 99
0 2441 6952 30
The column MY_OBJECTS denotes the number of objects mastered on that node.
This should match the following count:
SQL> select count(*) from V$GCSPFMASTER_INFO where current_master=0
2 /
COUNT(*)
----------
30
Tuesday, October 24, 2006
Recovering a Standby database from a missing archivelog
A physical standby database relies on continuous application of archivelogs from the primary database to stay in sync with it. In Oracle Database versions prior to 10g, if an archivelog went missing or was corrupted you had to rebuild the standby database from scratch.
In 10g you can take an incremental backup on the primary and recover the standby with it, compensating for the missing archivelogs, as shown below.
In the case below, archivelogs with sequence numbers 137 and 138, which are required on the standby, are deleted to simulate this problem.
Step 1: On the standby database, check the current SCN.
STDBY>>select current_scn from v$database;
CURRENT_SCN
-----------
4793543
Step 2: On the primary database, create the needed incremental backup starting from the above SCN.
C:\Documents and Settings\frego>rman target /
Recovery Manager: Release 10.2.0.2.0 - Production on Wed Oct 25 13:39:04 2006
Copyright (c) 1982, 2005, Oracle. All rights reserved.
connected to target database: USYD (DBID=768471617)
RMAN> BACKUP DEVICE TYPE DISK INCREMENTAL FROM SCN 4793543 DATABASE
FORMAT 'C:\temp\bkup_%U';
Starting backup at 25/OCT/06
using channel ORA_DISK_1
channel ORA_DISK_1: starting full datafile backupset
channel ORA_DISK_1: specifying datafile(s) in backupset
input datafile fno=00001 name=C:\ORACLE\PRODUCT\10.2.0\ORADATA\USYD\SYSTEM01.DBF
input datafile fno=00002 name=C:\ORACLE\PRODUCT\10.2.0\ORADATA\USYD\UNDOTBS01.DBF
input datafile fno=00003 name=C:\ORACLE\PRODUCT\10.2.0\ORADATA\USYD\SYSAUX01.DBF
input datafile fno=00004 name=C:\ORACLE\PRODUCT\10.2.0\ORADATA\USYD\USERS01.DBF
channel ORA_DISK_1: starting piece 1 at 25/OCT/06
channel ORA_DISK_1: finished piece 1 at 25/OCT/06
piece handle=C:\TEMP\BKUP_02I0NSNE_1_1 tag=TAG20061025T134102 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:35
channel ORA_DISK_1: starting full datafile backupset
channel ORA_DISK_1: specifying datafile(s) in backupset
including current control file in backupset
including current SPFILE in backupset
channel ORA_DISK_1: starting piece 1 at 25/OCT/06
channel ORA_DISK_1: finished piece 1 at 25/OCT/06
piece handle=C:\TEMP\BKUP_03I0NSOI_1_1 tag=TAG20061025T134102 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:02
Finished backup at 25/OCT/06
Step 3: Cancel managed recovery at the standby database
STDBY>>recover managed standby database cancel;
Media recovery complete.
Move the backup files to a new folder called INCR_BACKUP so that they are the only files in that folder.
Step 4: Catalog the Incremental Backup Files at the Standby Database
C:\oracle\product\10.2.0\db_1\database>rman target /
Recovery Manager: Release 10.2.0.2.0 - Production on Wed Oct 25 14:54:27 2006
Copyright (c) 1982, 2005, Oracle. All rights reserved.
connected to target database: USYD (DBID=768471617, not open)
RMAN> CATALOG START WITH 'C:\temp\INCR_BACKUP';
searching for all files that match the pattern C:\temp\INCR_BACKUP
List of Files Unknown to the Database
=====================================
File Name: C:\TEMP\INCR_BACKUP\BKUP_02I0NSNE_1_1
File Name: C:\TEMP\INCR_BACKUP\BKUP_03I0NSOI_1_1
Do you really want to catalog the above files (enter YES or NO)? YES
cataloging files...
cataloging done
List of Cataloged Files
=======================
File Name: C:\TEMP\INCR_BACKUP\BKUP_02I0NSNE_1_1
File Name: C:\TEMP\INCR_BACKUP\BKUP_03I0NSOI_1_1
Step 5: Apply the Incremental Backup to the Standby Database
RMAN> RECOVER DATABASE NOREDO;
Starting recover at 25/OCT/06
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=151 devtype=DISK
channel ORA_DISK_1: starting incremental datafile backupset restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
destination for restore of datafile 00001: C:\ORACLE\PRODUCT\10.2.0\ORADATA\STDB
Y\SYSTEM01.DBF
destination for restore of datafile 00002: C:\ORACLE\PRODUCT\10.2.0\ORADATA\STDB
Y\UNDOTBS01.DBF
destination for restore of datafile 00003: C:\ORACLE\PRODUCT\10.2.0\ORADATA\STDB
Y\SYSAUX01.DBF
destination for restore of datafile 00004: C:\ORACLE\PRODUCT\10.2.0\ORADATA\STDB
Y\USERS01.DBF
channel ORA_DISK_1: reading from backup piece C:\TEMP\INCR_BACKUP\BKUP_02I0NSNE_
1_1
channel ORA_DISK_1: restored backup piece 1
piece handle=C:\TEMP\INCR_BACKUP\BKUP_02I0NSNE_1_1 tag=TAG20061025T134102
channel ORA_DISK_1: restore complete, elapsed time: 00:00:03
Finished recover at 25/OCT/06
Step 6: Put the standby database back to managed recovery mode.
STDBY>> recover managed standby database nodelay disconnect;
Media recovery complete.
From the alert.log you will notice that the standby database is still looking for the old log files:
*************************************************
FAL[client]: Failed to request gap sequence
GAP - thread 1 sequence 137-137
DBID 768471617 branch 600609988
**************************************************
This is because the standby control file has not been updated and still records the old log sequence.
Hence the standby control file has to be recreated.
On the primary
PRIM>>alter database create standby controlfile as 'C:\temp\standby01.ctl';
Database altered.
Copy the standby control file to the standby site and restart the standby database in managed recovery mode...
Managed Standby Recovery not using Real Time Apply
parallel recovery started with 2 processes
Media Recovery Waiting for thread 1 sequence 139
As you can see from the above output the standby is now looking for the next archivelog.
If you have the luxury of time at your disposal you can do an export of the standby database in read only mode.
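The whole roll-forward can be condensed into a short checklist (the SCN and paths are the ones used in this example; adjust for your environment):

```sql
-- Step 1 (standby):  select current_scn from v$database;           -- 4793543
-- Step 2 (primary):  RMAN> backup incremental from scn 4793543 database
--                          format 'C:\temp\bkup_%U';
-- Step 3 (standby):  recover managed standby database cancel;
-- Step 4 (standby):  RMAN> catalog start with 'C:\temp\INCR_BACKUP';
-- Step 5 (standby):  RMAN> recover database noredo;
-- Step 6 (standby):  recover managed standby database nodelay disconnect;
-- If the FAL gap messages persist, recreate the standby control file
-- from the primary as described above.
```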
Sunday, October 15, 2006
Case Study on How to diagnose a node reboot
The following article looks at how to diagnose a node reboot in a 10G RAC cluster.
Configuration:
2 node RAC on 10.2.0.1 on RH ES 4.0 (kernel 2.6.9-34.0.2.ELsmp)
To diagnose such issues it is very important to establish an accurate timeline of the reboots. It is recommended practice to run the NTP daemon on all nodes in the cluster so that their wall clocks are in sync.
hippo >ps -ef | grep ntp | grep -v 'grep'
ntp 3349 1 0 Oct15 ? 00:00:00 ntpd -u ntp:ntp -p /var/run/ntpd.pid -g
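If ntpd is running you can also confirm that it has actually selected a peer and is synchronised; `ntpq -p` lists the configured peers, with an asterisk marking the peer currently used as the synchronisation source:

```shell
# Peers and offsets as seen by the local ntpd; a '*' in the first
# column marks the currently selected synchronisation source.
ntpq -p
```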
1. Check the Linux log file (/var/log/messages) to determine the time of the reboot
In this case the reboot occurred at 02:09 hours on the node hippo
Oct 12 09:24:34 hippo sshd(pam_unix)[5885]: session opened for user frego by frego (uid=0)
Oct 15 02:09:17 hippo syslogd 1.4.1: restart.
Oct 15 02:09:17 hippo syslog: syslogd startup succeeded
Oct 15 02:09:17 hippo kernel: klogd 1.4.1, log source = /proc/kmsg started.
Check the Linux log file on the other node (elephant) at the same point in time.
Oct 14 17:42:01 elephant su(pam_unix)[18041]: session closed for user root
Oct 15 02:09:12 elephant syslogd 1.4.1: restart.
Oct 15 02:09:12 elephant syslog: syslogd startup succeeded
Oct 15 02:09:12 elephant kernel: klogd 1.4.1, log source = /proc/kmsg started.
From the above output it is clear that both the nodes rebooted at approximately the same time.
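The reboot markers shown above can be extracted from any copy of the log with a small filter, for example:

```shell
# Print the timestamp of each syslogd restart line (logged at boot).
# Point it at /var/log/messages, or pipe any saved copy of the log in.
awk '/syslogd .*: restart/ {print $1, $2, $3}' /var/log/messages
```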
2. Check which process is responsible for the reboot.
Review the log file of the CSS daemon. The log file resides in $ORA_CRS_HOME/log/
hippo >pwd
/u00/app/oracle/product/10.2.0/crs/log/hippo
hippo >cd cssd
hippo >pwd
/u00/app/oracle/product/10.2.0/crs/log/hippo/cssd
hippo >ls -ltr
total 9376
drwxr-xr-x 2 oracle dba 4096 Sep 8 18:08 oclsmon
-rw-r----- 1 oracle dba 105 Oct 15 02:02 ocssd.trc
-rw-r--r-- 1 oracle dba 5 Oct 15 02:11 hippo.pid
-rw-r--r-- 1 oracle dba 17 Oct 15 02:11 cssdOUT.log
-rw-r--r-- 1 oracle dba 9567227 Oct 16 15:13 ocssd.log
From the ocssd.log file
[ CSSD]2006-10-15 02:02:48.697 [50609072] >WARNING: clssnmDiskPMT: long disk latency (54990 ms) to voting disk (0//crs/voting.dsk)
[ CSSD]2006-10-15 02:02:50.719 [50609072] >ERROR: clssnmDiskPMT: 1 of 1 voting disks unavailable (0/0/1)
[ CSSD]2006-10-15 02:02:50.734 [50609072] >TRACE: clssscctx: dump of 0x0x810d500, len 3528
Hence the reboot was caused by latency issues in accessing the voting disk. The specific function which monitors this is the DiskPing Monitor Thread, which measures the length of a single I/O to the voting disk and marks the voting disk as unavailable when the I/O takes too long. This caused the CSS daemon to fail.
As seen from the /etc/inittab
h2:35:respawn:/etc/init.d/init.cssd fatal > /dev/null 2>&1 < /dev/null
a failure exit of the CSS daemon causes the node to reboot.
This is also logged in alertelephant.log under $ORA_CRS_HOME/log/, which is the alert.log of the CRS stack.
2006-10-15 02:02:50.740
[cssd(8533)]CRS-1606:CSSD Insufficient voting files available [0 of 1]. Details in /u00/app/oracle/product/10.2.0/crs/log/elephant/cssd/ocssd.log.
Check the ocssd.log file on the other node (elephant) too
[ CSSD]2006-10-15 02:01:40.255 [50617264] >TRACE: clssgmClientConnectMsg: Connect from con(0xb7c3f580) proc(0xb7c67918) pid() proto(10:2:1:1)
[ CSSD]2006-10-15 02:02:50.724 [112925616] >ERROR: clssnmDiskPMT: 1 of 1 voting disks unavailable (0/0/1)
[ CSSD]2006-10-15 02:02:50.777 [112925616] >TRACE: clssscctx: dump of 0x0x810d500, len 3528
There have been a couple of additional CSS related parameters introduced in the 10.2.0.2 patchset to address long I/O requirements of storage vendors such as EMC and NetApp.
• reboottime: (default 3 seconds)
The amount of time allowed for a node to complete a reboot after the CSS daemon has been evicted.
This parameter can be set via the command
crsctl set css reboottime R [-force] (R is seconds)
• disktimeout (default 200 seconds)
The maximum amount of time allowed for a voting file I/O to complete; if this time is exceeded the voting disk will be marked as unavailable
This parameter can be set via the command
crsctl set css disktimeout D [-force] (D is seconds)
These commands must be run as root on a node where the CRS daemon is up, unless '-force' is specified, in which case the CRS stack should not be up on any other node.
After the commands are run, all CSS daemons must be restarted for the settings to take effect on all nodes.
You can verify the settings by doing an ocrdump and checking the values in the OCRDUMPFILE
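For example, to raise disktimeout to a hypothetical 300 seconds and confirm the change afterwards (the grep key is how the setting appears in an OCR dump and may vary slightly between versions):

```shell
# Run as root on a node where the CRS stack is up (or use -force otherwise).
crsctl set css disktimeout 300
# Dump the OCR to a file and check the stored value.
ocrdump /tmp/OCRDUMPFILE
grep -i disktimeout /tmp/OCRDUMPFILE
```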
Tuesday, October 03, 2006
Clearing UNKNOWN status of CRS resources
If you get the following errors whilst starting up nodeapps on a 10G RAC cluster:
bart-> srvctl start nodeapps -n bart
CRS-1028: Dependency analysis failed because of:
CRS-0223: Resource 'ora.bart.gsd' has placement error.
CRS-1028: Dependency analysis failed because of:
CRS-0223: Resource 'ora.bart.ons' has placement error.
Check the status of these resources.
bart-> ./ShowCRS | grep UNK
ora.bart.gsd ONLINE UNKNOWN on bart
ora.bart.ons ONLINE UNKNOWN on bart
To clear the state of the resource you have to use the force option
in the crs_stop command
bart-> crs_stop ora.bart.gsd -f
Attempting to stop `ora.bart.gsd` on member `bart`
Stop of `ora.bart.gsd` on member `bart` succeeded.
bart-> crs_stop ora.bart.ons -f
Attempting to stop `ora.bart.ons` on member `bart`
Stop of `ora.bart.ons` on member `bart` succeeded.
bart-> srvctl start nodeapps -n bart
bart-> ./ShowCRS | grep UNK
bart-> ./ShowCRS | grep bart
ora.bart.gsd ONLINE ONLINE on bart
ora.bart.ons ONLINE ONLINE on bart
ora.bart.vip ONLINE ONLINE on bart
ShowCRS is a shell script which is based on Metalink Note 259301.1
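ShowCRS itself is not reproduced here, but the general approach in that note is to reformat the key=value output of crs_stat into one line per resource. A minimal sketch (not the actual script from the note), reading crs_stat output on stdin so it can be tried anywhere:

```shell
# showcrs.sh - format `crs_stat` key=value output, one line per resource.
# Usage on a cluster node:  $ORA_CRS_HOME/bin/crs_stat -u | sh showcrs.sh
awk 'BEGIN {FS="="}
     $1=="NAME"   {name=$2}
     $1=="TARGET" {target=$2}
     $1=="STATE"  {printf "%-30s %-10s %-20s\n", name, target, $2}'
```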
Wednesday, September 06, 2006
How to find the location of your voting disk
Below is a quick way to find the location of your voting disk in a 10G RAC cluster
bart-> ocrdump -stdout -keyname SYSTEM.css.diskfile
09/07/2006 16:51:53
ocrdump -stdout -keyname SYSTEM.css.diskfile
[SYSTEM.css.diskfile]
ORATEXT : /crs/voting.dsk
SECURITY : {USER_PERMISSION : PROCR_ALL_ACCESS, GROUP_PERMISSION : PROCR_READ, OTHER_PERMISSION : PROCR_READ, USER_NAME : root, GROUP_NAME : root}
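From 10g R2 onwards the same information can also be obtained directly from crsctl, without parsing an OCR dump (assuming the votedisk query is available in your version):

```shell
crsctl query css votedisk
```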
Wednesday, August 30, 2006
Automatic Shared Memory Management
Starting with Oracle 10G R1 there is a new parameter, sga_target, which determines whether Oracle manages the sizes of the various pools within the SGA or whether you manage the memory allocated to Oracle manually.
To use this feature the value of STATISTICS_LEVEL should either be set to TYPICAL or ALL.
But this feature is not without its fair share of issues especially on large and intensive databases.
The reason for this is two fold.
The following bugs exist
4587117 (P) LARGE_POOL_SIZE AND JAVA_POOL_SIZE DO NOT SHRINK IN ASMM Fixed in 11.0
4466399 (U)
4472338 (U)
(duplicate of the above 4466399)
4507532 (U)
(turned out to be a bad diagnosis, but the bug is real)
4920199 (U)
Secondly, by default, if the shared pool is stressed and requires more memory, that memory is taken from the buffer cache, but the reverse does not happen. This default behaviour can be changed by setting _memory_broker_shrink_heaps=30 (the default value of this parameter is 0).
The shrinking of the Java pool depends on the value of _memory_broker_shrink_java_heaps. The default value is 900.
You can set the following event to trace memory management. The MMON process trace file will then record statistics for the different pools every 5 minutes.
SQL> alter system set "_memory_management_tracing"=31;
System altered.
The default value for the above parameter is 0
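Short of enabling the tracing event, the documented way to watch what ASMM is doing is to query the dynamic component views, for example:

```sql
-- Current auto-tuned sizes of the SGA components
select component, current_size/1024/1024 as mb
from   v$sga_dynamic_components;

-- History of grow/shrink operations performed by the memory broker
select component, oper_type, initial_size, final_size, status
from   v$sga_resize_ops
order  by start_time;
```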
Wednesday, August 23, 2006
Shrinking segments
Shrinking of a segment makes unused space available to other segments in the tablespace and may improve the performance of queries and DML operations.
The function VERIFY_SHRINK_CANDIDATE in the DBMS_SPACE package is an undocumented function which tells us the size down to which a segment can be shrunk.
declare
res boolean;
begin
res := dbms_space.verify_shrink_candidate('SCOTT','EMP','TABLE',1032192);
if (res)
then dbms_output.put_line('Can shrink to this size');
end if;
end;
/
Can shrink to this size
PL/SQL procedure successfully completed.
declare
res boolean;
begin
res := dbms_space.verify_shrink_candidate('SCOTT','EMP','TABLE',1032191);
if (res)
then dbms_output.put_line('Can shrink to this size');
end if;
end;
/
PL/SQL procedure successfully completed.
In the above example the segment can be shrunk to 1032192 bytes, but not to 1032191 bytes.
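Once a target size has been verified, the shrink itself is done with the documented ALTER TABLE ... SHRINK SPACE syntax (the segment must reside in an ASSM tablespace and row movement must be enabled):

```sql
alter table scott.emp enable row movement;
-- Compact the rows without moving the high water mark (can be done online)
alter table scott.emp shrink space compact;
-- Move the high water mark and release the space to the tablespace
alter table scott.emp shrink space;
```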
Friday, July 14, 2006
OCR backups
In Oracle 10G Release 1 there is no provision to mirror the Oracle cluster registry (OCR) at the software level. In 10G R2 you can specify a mirror location.
Oracle automatically backs up the OCR every 4 hours. If your Oracle Home is not sitting on a shared mountpoint the backups go to
$ORA_CRS_HOME/cdata on the master node. This presents a single point of failure.
Oracle recommends that the CRS backups are put on shared storage. This can be done using the ocrconfig tool.
The syntax for the same is
ocrconfig -backuploc "dir"
Please note that the above command has to be run as root. If you run it as the oracle user it does not take effect: although the command returns successfully, you will see the following error in the ocrconfig.log file
ocrconfig starts...
Warning!! cluster checking is disabled
Failure in creating key [SYSTEM.OCR.BACKUP.LOC] OCR error [PROC-5: User does not have permission to perform a cluster registry operation on this key.]
Use the ocrconfig -showbackup command to check if the destination has changed
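For example, as root (the shared mountpoint name below is hypothetical):

```shell
# Point the automatic OCR backups at shared storage, then verify.
ocrconfig -backuploc /shared/crs_backup
ocrconfig -showbackup
```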
You can also take a logical backup using the export option in the ocrconfig command.
The command needs to be run as root
# ocrconfig -export ocr.tmp -s online
As you can see from the output below, there is a difference in size between a backup taken with the export option and the automatic 4-hourly OCR backup.
/tmp> ls -ltr
-rw-r----- 1 root root 26951680 Aug 26 09:08 backup00.ocr
-rw-r--r-- 1 root other 690062 Aug 26 09:27 ocr.tmp
This is because the export is a logical backup and is to be restored using the import option. The OCR backups which happen every 4 hours are binary block-by-block copies of the OCR.
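For completeness, the corresponding restore commands are as follows (run as root with the CRS stack down; the file names are the ones from this example):

```shell
# Restore a physical (automatic) OCR backup
ocrconfig -restore backup00.ocr
# Import a logical export taken with ocrconfig -export
ocrconfig -import ocr.tmp
```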
As with all backups the only way to test their integrity is to restore them.
But you can use the following command to check the CRS stack and OCR integrity
bart:TEST1:/tmp/bin> ./cluvfy stage -post crsinst -n all -verbose
Performing post-checks for cluster services setup
Checking node reachability...
Check: Node reachability from node "bart"
Destination Node Reachable?
------------------------------------ ------------------------
homer yes
bart yes
Result: Node reachability check passed from node "bart".
Checking user equivalence...
Check: User equivalence for user "oracle"
Node Name Comment
------------------------------------ ------------------------
homer passed
bart passed
Result: User equivalence check passed for user "oracle".
Checking Cluster manager integrity...
Checking CSS daemon...
Node Name Status
------------------------------------ ------------------------
homer running
bart running
Result: Daemon status check passed for "CSS daemon".
Cluster manager integrity check passed.
Checking cluster integrity...
Node Name
------------------------------------
bart
homer
Cluster integrity check passed
Checking OCR integrity...
Checking the absence of a non-clustered configuration...
All nodes free of non-clustered, local-only configurations.
Checking the version of OCR...
OCR of correct Version "2" exists.
Checking data integrity of OCR...
Data integrity check for OCR passed.
OCR integrity check passed.
Checking CRS integrity...
Checking daemon liveness...
Check: Liveness for "CRS daemon"
Node Name Running
------------------------------------ ------------------------
homer yes
bart yes
Result: Liveness check passed for "CRS daemon".
Checking daemon liveness...
Check: Liveness for "CSS daemon"
Node Name Running
------------------------------------ ------------------------
homer yes
bart yes
Result: Liveness check passed for "CSS daemon".
Checking daemon liveness...
Check: Liveness for "EVM daemon"
Node Name Running
------------------------------------ ------------------------
homer yes
bart yes
Result: Liveness check passed for "EVM daemon".
Liveness of all the daemons
Node Name CRS daemon CSS daemon EVM daemon
------------ ------------------------ ------------------------ ----------
homer yes yes yes
bart yes yes yes
Checking CRS health...
Check: Health of CRS
Node Name CRS OK?
------------------------------------ ------------------------
homer yes
bart yes
Result: CRS health check passed.
CRS integrity check passed.
Post-check for cluster services setup was successful.
bart:TEST1:/tmp/bin> ./cluvfy comp ocr -n all -verbose
Verifying OCR integrity
Checking OCR integrity...
Checking the absence of a non-clustered configuration...
All nodes free of non-clustered, local-only configurations.
Checking the version of OCR...
OCR of correct Version "2" exists.
Checking data integrity of OCR...
Data integrity check for OCR passed.
OCR integrity check passed.
Verification of OCR integrity was successful.
bart:TEST1:/tmp/bin> ocrcheck
Status of Oracle Cluster Registry is as follows :
Version : 2
Total space (kbytes) : 262144
Used space (kbytes) : 16256
Available space (kbytes) : 245888
Cluster registry integrity check succeeded
You can also compare the properties of different nodes in the cluster
using cluvfy comp peer -n all
Oracle automatically backs up the OCR every 4 hours. If your Oracle Home is not sitting on a shared mountpoint the backups go to
$ORA_CRS_HOME/cdata on the master node. This presents a single point of failure.
Oracle recommends that the CRS backups are put on shared storage. This can be done using the ocrconfig tool.
The syntax for the same is
ocrconfig -backuploc "dir"
Please note that the above command has to be run as root. If you run the command as the oracle user it does not take effect although the command returns successfully and you will see the following error in the ocrconfig.log file
ocrconfig starts...
Warning!! cluster checking is disabled
Failure in creating key [SYSTEM.OCR.BACKUP.LOC] OCR error [PROC-5: User does not have permission to perform a cluster registry operation on this key.]
Use the ocrconfig -showbackup command to check if the destination has changed
You can also take a logical backup using the export option in the ocrconfig command.
The command needs to be run as root
# ocrconfig -export ocr.tmp -s online
As you can see from the below output there is a difference in size between a backup taken by the export option and the OCR backup which is automatically taken every 4 hours.
/tmp> ls -ltr
-rw-r----- 1 root root 26951680 Aug 26 09:08 backup00.ocr
-rw-r--r-- 1 root other 690062 Aug 26 09:27 ocr.tmp
This is because the export is a logical backup and is to be restored using the import option. The OCR backups which happen every 4 hours are binary block by block
copies of the OCR.
As with all backups the only way to test their integrity is to restore them.
But you can use the following command to check the CRS stack and OCR integrity
bart:TEST1:/tmp/bin> ./cluvfy stage -post crsinst -n all -verbose
Performing post-checks for cluster services setup
Checking node reachability...
Check: Node reachability from node "bart"
Destination Node Reachable?
------------------------------------ ------------------------
homer yes
bart yes
Result: Node reachability check passed from node "bart".
Checking user equivalence...
Check: User equivalence for user "oracle"
Node Name Comment
------------------------------------ ------------------------
homer passed
bart passed
Result: User equivalence check passed for user "oracle".
Checking Cluster manager integrity...
Checking CSS daemon...
Node Name Status
------------------------------------ ------------------------
homer running
bart running
Result: Daemon status check passed for "CSS daemon".
Cluster manager integrity check passed.
Checking cluster integrity...
Node Name
------------------------------------
bart
homer
Cluster integrity check passed
Checking OCR integrity...
Checking the absence of a non-clustered configuration...
All nodes free of non-clustered, local-only configurations.
Checking the version of OCR...
OCR of correct Version "2" exists.
Checking data integrity of OCR...
Data integrity check for OCR passed.
OCR integrity check passed.
Checking CRS integrity...
Checking daemon liveness...
Check: Liveness for "CRS daemon"
Node Name Running
------------------------------------ ------------------------
homer yes
bart yes
Result: Liveness check passed for "CRS daemon".
Checking daemon liveness...
Check: Liveness for "CSS daemon"
Node Name Running
------------------------------------ ------------------------
homer yes
bart yes
Result: Liveness check passed for "CSS daemon".
Checking daemon liveness...
Check: Liveness for "EVM daemon"
Node Name Running
------------------------------------ ------------------------
homer yes
bart yes
Result: Liveness check passed for "EVM daemon".
Liveness of all the daemons
Node Name CRS daemon CSS daemon EVM daemon
------------ ------------------------ ------------------------ ----------
homer yes yes yes
bart yes yes yes
Checking CRS health...
Check: Health of CRS
Node Name CRS OK?
------------------------------------ ------------------------
homer yes
bart yes
Result: CRS health check passed.
CRS integrity check passed.
Post-check for cluster services setup was successful.
bart:TEST1:/tmp/bin> ./cluvfy comp ocr -n all -verbose
Verifying OCR integrity
Checking OCR integrity...
Checking the absence of a non-clustered configuration...
All nodes free of non-clustered, local-only configurations.
Checking the version of OCR...
OCR of correct Version "2" exists.
Checking data integrity of OCR...
Data integrity check for OCR passed.
OCR integrity check passed.
Verification of OCR integrity was successful.
bart:TEST1:/tmp/bin> ocrcheck
Status of Oracle Cluster Registry is as follows :
Version : 2
Total space (kbytes) : 262144
Used space (kbytes) : 16256
Available space (kbytes) : 245888
Cluster registry integrity check succeeded
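As a quick sanity check, the space figures reported by ocrcheck should add up: used plus available should equal the configured total. A trivial sketch using the numbers from the output above:

```python
# Figures taken from the ocrcheck output above (kbytes)
total_kb = 262144
used_kb = 16256
available_kb = 245888

# Used plus available should account for the whole registry
assert used_kb + available_kb == total_kb
print(f"OCR usage: {used_kb / total_kb:.1%}")
```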
You can also compare the properties of different nodes in the cluster
using cluvfy comp peer -n all
Friday, May 26, 2006
How to check the category of a Stored Outline
Stored outlines are a feature within Oracle that helps preserve execution plan stability.
Unfortunately the parameter use_stored_outlines is not an initialisation parameter and hence cannot be set in the init.ora or spfile. The only way to determine whether outlines are being used, and their category, is the oradebug utility.
SQL> oradebug setmypid
Statement processed.
SQL> oradebug dumpvar sga sgauso
qolprm sgauso_ [3C4C708, 3C4C72C) = 00000000 00000000 00000000 00000000 00000000 00000000 00000000 0
The above output indicates that no outlines are being used
SQL> alter system set use_stored_outlines = true;
System altered.
SQL> oradebug dumpvar sga sgauso
qolprm sgauso_ [3C4C708, 3C4C72C) = 00000001 45440007 4C554146 00000054 00000000 00000000 00000000 0
The above output indicates that stored outlines are being used and that the category
is DEFAULT
(the ASCII equivalent of 44 45 46 41 55 4C 54).
The 7 is the length of the category name.
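Decoding the dump by hand is error-prone, so a short script can make the byte layout explicit. The sketch below is a minimal illustration; the layout (a usage flag word, a two-byte length, then the category name in ASCII) is inferred from this example and is not documented by Oracle:

```python
import struct

# Words from "oradebug dumpvar sga sgauso" with outlines enabled (see above)
words = ["00000001", "45440007", "4C554146", "00000054"]

# Each word is a little-endian 32-bit value; recover the raw byte stream
raw = b"".join(struct.pack("<I", int(w, 16)) for w in words)

in_use = struct.unpack_from("<I", raw, 0)[0]    # 1 => outlines in use
length = struct.unpack_from("<H", raw, 4)[0]    # 7 => category name length
category = raw[6:6 + length].decode("ascii")    # "DEFAULT"
print(in_use, length, category)                 # prints: 1 7 DEFAULT
```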
If you set a value at the session level you need to check the UGA
SQL> alter session set use_stored_outlines = false;
Session altered.
SQL> oradebug dumpvar uga ugauso
qolprm ugauso_p [8EE7A90, 8EE7AB4) = 00000004 00000000 00000000 00000000 00000000 00000000 00000000
SQL> alter session set use_stored_outlines = true;
Session altered.
SQL> oradebug dumpvar uga ugauso
qolprm ugauso_p [8EE7A90, 8EE7AB4) = 00000004 45440007 4C554146 00000054 00000000 00000000 00000000
Tuesday, May 23, 2006
Block Recovery using RMAN
The purpose of this article is to simulate a block level corruption and recover from the same using RMAN.
In this situation the datafile remains online throughout the recovery operation and hence other segments within the tablespace remain accessible.
Corrupt a block
Let us take the EMP table and find out the segment header of the same.
SQL> select header_file,header_block from dba_segments where segment_name='EMP';
HEADER_FILE HEADER_BLOCK
----------- ------------
4 24163
We now use the bbed utility to corrupt a block below the segment header, one that contains actual data belonging to the EMP table.
In the exercise below we corrupt block 24165 of file 4.
C:\>bbed parfile=bbed.par
Password:
BBED: Release 2.0.0.0.0 - Limited Production on Sat Feb 19 16:25:09 2005
Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.
************* !!! For Oracle Internal Use only !!! ***************
BBED> set dba 4,10
DBA 0x0100000a (16777226 4,10)
BBED> copy to block 24166
Warning: contents of previous BIFILE will be lost. Proceed? (Y/N) Y
File: E:\ORACLE\ORADATA\TENG\USERS01.DBF (4)
Block: 24166 Offsets: 0 to 511 Dba:0x01005e66
------------------------------------------------------------------------
1e820000 09000001 33280000 00000104 2fe10000 04000000 11002d00 00000000
00000000 00780000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
<32 bytes per line>
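The hex DBAs that BBED prints encode the file and block numbers in a single word: the relative file number sits in the top 10 bits and the block number in the low 22 bits. A small sketch, checked against the BBED output above:

```python
# Pack/unpack an Oracle data block address (DBA):
# relative file number in the top 10 bits, block number in the low 22 bits.
def to_dba(file_no: int, block_no: int) -> int:
    return (file_no << 22) | block_no

def from_dba(dba: int) -> tuple[int, int]:
    return dba >> 22, dba & ((1 << 22) - 1)

# BBED reported DBA 0x0100000a for file 4, block 10 ...
assert to_dba(4, 10) == 0x0100000A
# ... and Dba:0x01005e66 for the copied-to block 24166 in file 4
assert from_dba(0x01005E66) == (4, 24166)
print(hex(to_dba(4, 24165)))  # the corrupted block's DBA; prints 0x1005e65
```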
SQL> select * from emp;
select * from emp
*
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 4, block # 24165)
ORA-01110: data file 4: 'E:\ORACLE\ORADATA\TENG\USERS01.DBF'
Recover the block
We now run a BACKUP VALIDATE DATABASE command, which
- Checks datafiles for physical and logical corruption
- Confirms that all database files exist and are in the correct locations
RMAN> run {BACKUP VALIDATE DATABASE;}
Starting backup at 19-FEB-05
using target database controlfile instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=91 devtype=DISK
channel ORA_DISK_1: starting full datafile backupset
channel ORA_DISK_1: specifying datafile(s) in backupset
input datafile fno=00004 name=E:\ORACLE\ORADATA\TENG\USERS01.DBF
input datafile fno=00001 name=E:\ORACLE\ORADATA\TENG\SYSTEM01.DBF
input datafile fno=00002 name=E:\ORACLE\ORADATA\TENG\UNDOTBS01.DBF
input datafile fno=00003 name=E:\ORACLE\ORADATA\TENG\SYSAUX01.DBF
channel ORA_DISK_1: backup set complete, elapsed time: 00:01:35
channel ORA_DISK_1: starting full datafile backupset
channel ORA_DISK_1: specifying datafile(s) in backupset
including current controlfile in backupset
including current SPFILE in backupset
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:11
Finished backup at 19-FEB-05
SQL> select * from V$backup_corruption;
RECID STAMP SET_STAMP SET_COUNT PIECE# FILE# BLOCK# BLOCKS CORRUPTION_CHANGE# MAR CO
---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- --------------------------
1 550688483 550688389 46 1 4 24165 1 0 YES CORRUPT
The above output confirms that block 24165 in file 4 is indeed corrupt!
We can recover the same using the following command.
RMAN> run {blockrecover datafile 4 block 24165;}
Starting blockrecover at 19-FEB-05
using channel ORA_DISK_1
channel ORA_DISK_1: restoring block(s)
channel ORA_DISK_1: specifying block(s) to restore from backup set
restoring blocks of datafile 00004
channel ORA_DISK_1: restored block(s) from backup piece 1
piece handle=E:\ORACLE\FLASH_RECOVERY_AREA\TENG\BACKUPSET\2005_02_19\O1_MF_NNNDF_TAG20050219T164615_11FNO9BQ_.BKP tag=TAG20050219T164615
channel ORA_DISK_1: block restore complete
starting media recovery
media recovery complete
Finished blockrecover at 19-FEB-05
SQL> select * from emp;
EMPNO ENAME JOB MGR HIREDATE SAL COMM DEPTNO
---------- ---------- --------- ---------- --------- ---------- ---------- ----------
7369 SMITH CLERK 7902 17-DEC-80 800 20
7499 ALLEN SALESMAN 7698 20-FEB-81 1600 300 30
7521 WARD SALESMAN 7698 22-FEB-81 1250 500 30
7566 JONES MANAGER 7839 02-APR-81 2975 20
7654 MARTIN SALESMAN 7698 28-SEP-81 1250 1400 30
7698 BLAKE MANAGER 7839 01-MAY-81 2850 30
7782 CLARK MANAGER 7839 09-JUN-81 2450 10
7788 SCOTT ANALYST 7566 19-APR-87 3000 20
7839 KING PRESIDENT 17-NOV-81 5000 10
7844 TURNER SALESMAN 7698 08-SEP-81 1500 0 30
7876 ADAMS CLERK 7788 23-MAY-87 1100 20
7900 JAMES CLERK 7698 03-DEC-81 950 30
7902 FORD ANALYST 7566 03-DEC-81 3000 20
7934 MILLER CLERK 7782 23-JAN-82 1300 10
14 rows selected.
Migrating an Oracle database from non-ASM to ASM
Migrating an Oracle database from non-ASM to ASM is a piece of cake with the new functionality available in the 10g R2 Database Console.
Details are available in my article on
OTN