RAC - RAC on VM (ESX)
CRS eating CPU on VMware
http://www.oracloid.com/2007/10/crs-eating-cpu-on-vmware/#more-61
October 30th, 2007, Alex Gorbachev
Some time ago (yeah… shame on me) I mentioned having trouble running CRS on virtual machines under VMware Server. I found a solution a while ago and, since I promised to share anything I found, now is the time.
First of all, I’m happy to admit that my observation that Windows-hosted VMs ran better than Linux-hosted ones was wrong. Indeed, how can Windows run faster than Linux?!
I used VMware Server 1.0.3. The host OS was either 64-bit Ubuntu or 32-bit Windows. The guest OS was 32-bit Oracle Enterprise Linux 4 (a la Larry Hat 4). As you will see later, I also tried VMware Workstation 6.0 without any visible improvement. For shared storage I used either NFS exports from the host OS (when using Ubuntu) or Openfiler when using Windows (even more CPU saturation).
To recall the problem… The virtual machine started to eat CPU like crazy as soon as I started CRS inside it. Even without an Oracle database, just starting CRS was enough. I could see the vmware-vmx process consuming about 60% of one CPU core (AMD Athlon64 3800 X2). Inside the virtual machine, I could only see init.cssd in top from time to time, and average CPU consumption jumped between 10% and 90% without any process in top accounting for it. I tried strace on the vmware-vmx processes in my host OS, but could only see that most of the time was spent in the poll system call.
It seemed there were some short-lived processes consuming CPU that I couldn’t track. I remember that on HP-UX I could easily catch them with glance, but Linux is as usual behind (or is it me “behind”?). After some time spent inside the init.cssd, init.crsd and init.evmd scripts, I tried increasing the sleep time in a couple of places and, I couldn’t believe it, CPU was relieved: below 10% of a single core used per virtual machine.
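One rough way to look for such short-lived processes on Linux is to poll /proc much faster than top refreshes and count which commands keep appearing and disappearing. The sketch below is only an illustration and was not part of the original investigation. The function names and the polling interval are made up for the example, and any process that lives for less than one interval will still be missed.

```python
import os
import time
from collections import Counter


def command_name(stat_text):
    """Extract the command name from the text of /proc/<pid>/stat."""
    # The name is wrapped in parentheses and may itself contain ")" or
    # spaces, so take everything between the first "(" and the last ")".
    return stat_text[stat_text.index("(") + 1:stat_text.rindex(")")]


def snapshot(proc_root="/proc"):
    """Return {pid: command name} for the processes visible right now."""
    seen = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                seen[int(entry)] = command_name(fh.read())
        except (OSError, ValueError):
            # The process exited while we were reading it; skip it.
            continue
    return seen


def count_exited(samples):
    """Count, per command, pids present in one snapshot and gone in the next."""
    exited = Counter()
    for before, after in zip(samples, samples[1:]):
        for pid, name in before.items():
            if pid not in after:
                exited[name] += 1
    return exited


def watch(duration=10.0, interval=0.05):
    """Poll /proc for `duration` seconds and summarise vanished processes."""
    samples = []
    deadline = time.monotonic() + duration
    while time.monotonic() < deadline:
        samples.append(snapshot())
        time.sleep(interval)
    return count_exited(samples)
```

Running `watch().most_common(10)` inside the guest while CRS is up would list the commands that were spawned and exited most often during the sampling window.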
What I did was increase some sleep times in the /etc/init.d/init.cssd file. In 10.2.0.3 I had to do it in two places:
- line 1132: $SLEEP 1 -> $SLEEP 30
- line 1249: $SLEEP 5 -> $SLEEP 35
Actually, the first one should probably be enough.
After that, CRS started to run very smoothly. Of course, this should never be done in your production environment, because it reduces the frequency of checks for critical processes. It’s fine in my test/research environment but should never be done in other cases. I warned you. Anyway, you should be sane enough to avoid running RAC inside virtual machines.
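For a test box, the sleep change can be scripted instead of edited by hand. The following is a minimal sketch, not an official procedure. The helper name `patch_sleeps` is hypothetical, and the line numbers used with it are the ones quoted above for 10.2.0.3, so they will differ in other versions. The helper refuses to touch a line unless it finds the expected `$SLEEP` value there.

```python
import re


def patch_sleeps(script_text, changes):
    """Return script_text with selected `$SLEEP <n>` values replaced.

    `changes` maps a 1-based line number to (old_seconds, new_seconds).
    A ValueError is raised if a line does not contain the expected value,
    so a script from a different version is not silently mangled.
    """
    lines = script_text.split("\n")
    for line_no, (old, new) in changes.items():
        if not 1 <= line_no <= len(lines):
            raise ValueError("line %d is outside the script" % line_no)
        current = lines[line_no - 1]
        # \b keeps "$SLEEP 1" from matching inside "$SLEEP 10".
        pattern = r"(\$SLEEP\s+)%d\b" % old
        patched, count = re.subn(
            pattern, lambda m: m.group(1) + str(new), current
        )
        if count != 1:
            raise ValueError(
                "line %d does not contain '$SLEEP %d': %r"
                % (line_no, old, current)
            )
        lines[line_no - 1] = patched
    return "\n".join(lines)
```

To use it, read /etc/init.d/init.cssd into a string, keep a backup copy of the original, call `patch_sleeps(text, {1132: (1, 30), 1249: (5, 35)})` and write the result back. As above, this is for test environments only.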
I might mention that before I came up with the solution, I switched to VMware Workstation 6.0 (trial) and tried a few settings that can be set in the .vmx file. One of the most useful options is the one that sets which processors the virtual machine can use. I set them so that one VM runs on CPU0 and the other on CPU1. I still use this configuration.
LH1.vmx:
processor0.use = TRUE
processor1.use = FALSE
LH2.vmx:
processor0.use = FALSE
processor1.use = TRUE
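With more than two VMs or cores, these lines are easy to generate. The sketch below is only an illustration. `affinity_lines` is a hypothetical helper, and it writes the `processorN.use` keys in the same form as shown above.

```python
def affinity_lines(core, total_cores):
    """Return .vmx lines that allow a VM to use only the given host core."""
    if not 0 <= core < total_cores:
        raise ValueError("core must be between 0 and total_cores - 1")
    return [
        "processor%d.use = %s" % (i, "TRUE" if i == core else "FALSE")
        for i in range(total_cores)
    ]


# One VM per core on a dual-core host, as in the setup described above.
for vm_name, core in (("LH1", 0), ("LH2", 1)):
    print(vm_name + ".vmx:")
    print("\n".join(affinity_lines(core, total_cores=2)))
```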
A few other settings I tried:
MemTrimRate = "0"
sched.mem.pshare.enable = "FALSE"
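If you try settings like these on several VMs, a small helper that updates a key when it already exists and appends it otherwise avoids duplicate entries in the .vmx file. This is a rough sketch under some assumptions: `apply_vmx_settings` is a hypothetical name, keys are matched exactly as written, and values are written in the quoted `key = "value"` form used above. Edit the file only while the VM is powered off.

```python
def apply_vmx_settings(vmx_text, settings):
    """Return vmx_text with each key in `settings` set to the given value.

    Existing `key = ...` lines are rewritten in place; keys that are not
    present yet are appended at the end. Other lines are left untouched.
    """
    remaining = dict(settings)
    out = []
    for line in vmx_text.splitlines():
        key = line.split("=", 1)[0].strip() if "=" in line else None
        if key in remaining:
            # Replace the old value with the requested one.
            out.append('%s = "%s"' % (key, remaining.pop(key)))
        else:
            out.append(line)
    # Append whatever was not already in the file.
    for key, value in remaining.items():
        out.append('%s = "%s"' % (key, value))
    return "\n".join(out) + "\n"
```

For example, `apply_vmx_settings(text, {"MemTrimRate": "0", "sched.mem.pshare.enable": "FALSE"})` returns the file content with both settings from above applied.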
Here is some info about memory trimming and page sharing from http://www.virtualization.info/2005/11/how-to-improve-disk-io-performances.html: