One of the oft-mentioned RAC best practices is that the LMS processes, which ship blocks across the interconnect, should run at a higher priority so that they do not compete for CPU cycles with other processes.
Starting with 10.2, LMS is supposed to run in a real-time class. This is new functionality, governed by the underscore parameter _os_sched_high_priority.
However, in 10.2.0.1 LMS still runs in the time-sharing TS class (SCHED_OTHER, standard time-sharing), apparently because the oradism executable is an empty, zero-byte file:
buffalo >ls -al oradism
-r-sr-s--- 1 root dba 0 Jul 1 2005 oradism
buffalo >ps -efc|grep lms
oracle 27263 1 TS 24 2006 ? 00:12:38 ora_lms0_F8900DMO1
oracle 27273 1 TS 24 2006 ? 00:04:30 ora_lms1_F8900DMO1
When you apply the 10.2.0.3 patchset, the oradism executable is properly generated (it is no longer zero bytes) and LMS runs in the RR class (SCHED_RR, round robin):
elephant-> ls -al oradism
-r-sr-s--- 1 root dba 14456 Nov 15 12:52 oradism
elephant-> ps -efc | grep lms
oracle 11554 1 RR 90 15:09 ? 00:00:01 ora_lms0_F8902PRD1
oracle 11558 1 RR 90 15:09 ? 00:00:01 ora_lms1_F8902PRD1
whereas other background processes like PMON still run in TS:
elephant-> ps -efc | grep pmon | grep PRD
oracle 11544 1 TS 23 Feb14 ? 00:00:00 ora_pmon_F8902PRD1
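To compare the classes of all the background processes at once, something like the following could be used. It is only a sketch: it assumes the `ps -efc` column layout shown above (class in the fourth field, priority in the fifth, command name last), and the function name sched_summary is made up for illustration.

```shell
# Hypothetical helper: summarise the scheduling class of Oracle/ASM
# background processes.
# Reads `ps -efc` style lines on stdin, assuming the layout
#   UID PID PPID CLS PRI STIME TTY TIME CMD
# and prints "command class priority" for each ora_* or asm_* process.
sched_summary() {
  awk '$NF ~ /^(ora|asm)_/ { print $NF, $4, $5 }'
}

# Usage: ps -efc | sched_summary
```

Taking the command name from the last field rather than a fixed position keeps the sketch working when the start-time column is printed differently.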
I am not sure that running LMS in RR is ideal on a box with a low number of CPUs, or where cache fusion traffic is not a major concern. If you want LMS to run in the same class as other processes, you would expect to set _os_sched_high_priority back to 0 from its default value of 1.
But doing this does not seem to change the class to TS:
SQL> alter system set "_os_sched_high_priority"=0 scope=spfile;
System altered.
SQL> exit
Disconnected from Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - Production
With the Partitioning, Real Application Clusters, OLAP and Data Mining options
elephant-> srvctl stop database -d F8902PRD
elephant-> srvctl start database -d F8902PRD
elephant-> ps -efc | grep lms | grep PRD
oracle 17097 1 RR 90 09:54 ? 00:00:00 ora_lms0_F8902PRD1
oracle 17101 1 RR 90 09:54 ? 00:00:00 ora_lms1_F8902PRD1
The parameter itself is set to 0, as the following query shows:
SQL> select a.ksppinm "Parameter",
            b.ksppstvl "Session Value",
            c.ksppstvl "Instance Value"
       from x$ksppi a, x$ksppcv b, x$ksppsv c
      where a.indx = b.indx and a.indx = c.indx
        and a.ksppinm like '%os_sched%';

Parameter                 Session Value   Instance Value
------------------------- --------------- ---------------
_os_sched_high_priority   0               0
From bug 5635098 it appears there is another parameter, _high_priority_processes, which needs to be set to null for this to work.
SQL> alter system set "_high_priority_processes"='' scope=spfile;
System altered.
elephant-> srvctl stop database -d F8902PRD
elephant-> srvctl start database -d F8902PRD
elephant-> ps -efc | grep lms | grep -v grep
oracle 31654 1 TS 24 00:58 ? 00:00:01 ora_lms0_F8902PRD1
oracle 31656 1 TS 24 00:58 ? 00:00:00 ora_lms1_F8902PRD1
As you can see, LMS is now running in the TS class.
On Solaris SPARC 64-bit, be aware of bug 5258549, which causes boxes with a low number of CPUs to freeze.
16 comments:
blimey, I just checked my 10.2.0.2 and I have the following:
oracle 7438 1 FF 41 2006 ? 00:00:04 asm_lms0_+ASM1
oracle 1194 1 FF 41 2006 ? 18:32:03 ora_lms0_NOM1
oracle 1198 1 FF 41 2006 ? 15:18:40 ora_lms1_NOM1
so we have 3 different scheduler classes for 3 different 10gR2 versions!
Oracle really could not make up their mind here.
jason.
Jason,
Which operating system are you on?
-Fairlie
Unfortunately, the bug report for Solaris is not very descriptive. It mentions Solaris 10 + CRS, but gives no clue about other common Solaris/Oracle combinations like Sun Cluster + Solaris 9. It could be a very specific combination of software components or a "general issue".
:(
Odd this did not get into the 10.2.0.3 Known Issues Note.
In any case, we have 3 Sun clusters running 9iR2 that I do not think are going to see 10gR2 for a while. It seems too risky to go down that path right now...
We will need to wait for 10.2.0.4, apparently... And that reminds me of an old saying from an Oracle rep (in the Twentieth Century) about the difference in quality between the even and odd releases (9208 vs 9207 or 715 vs 716): the even release numbers were always better :)
Nilo.
Nilo,
It is very strange that you mention the quality of odd and even numbered patchsets, given that I had mentioned the same thing on http://www.freelists.org/archives/oracle-l/09-2006/msg00591.html
-Fairlie
Hi Fairlie,
I'm on RedHat AS update 2, 2.6.9-22.ELsmp. That gives FF at 10.2.0.2, and the same o/s gives RR at 10.2.0.3.
Thanks Jason. If you talk to a salesperson he might say the "feature" is evolving, but there is a code bug on this in the current 11g beta release.
Hi!
We have the same problem with Oracle 10.2.0.3 and Solaris 2.8 on an FSC box with two CPUs.
Is it a good idea to change _os_sched_high_priority to NULL?
Regards
- Eric
If you are on 10.2.0.3 you should set _high_priority_processes to null.
But you should take advice from Oracle Support before using such undocumented parameters in a production system.
Setting _os_sched_high_priority to "0" changed the class from FF to TS again. What is FF anyway?
Santhosh
FF means SCHED_FIFO.
From "man ps":
cls   CLS   scheduling class of the process (alias policy, class).
            Field's possible values are:
            -    not reported
            TS   SCHED_OTHER
            FF   SCHED_FIFO
            RR   SCHED_RR
            ?    unknown value
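For scripting, the table above can be turned into a small lookup. This is just a sketch covering the values listed in that excerpt; the function name cls_name is hypothetical, and any other class code is reported as unknown.

```shell
# Hypothetical helper: translate a ps CLS code into the policy name,
# using only the values listed in the "man ps" excerpt above.
cls_name() {
  case "$1" in
    TS) echo "SCHED_OTHER" ;;
    FF) echo "SCHED_FIFO" ;;
    RR) echo "SCHED_RR" ;;
    -)  echo "not reported" ;;
    *)  echo "unknown value" ;;
  esac
}

# Usage: cls_name FF
```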
Thanks, very useful
Thank You, this discussion is very useful.
Does anyone have instructions on how to simulate this RR-scheduled LMS CPU race? (Naturally it would be used only in a test environment, not in a production environment.)
Hi,
How about the ASM instance?
Should we also alter the priority of LMS on the ASM instance?
Thanks for the info,
Dean