Oracle Database 10g RAC on Linux and Unix Platforms

This paper discusses and demonstrates some of the basic commands you might use to manage your Oracle 10g RAC components. The material applies to most, if not all, Linux and Unix platforms; it does not cover RAC on the Microsoft Windows operating system. We will begin with the basics of checking a RAC system to identify whether the appropriate services and resources are running. Then we will go through the basic startup and shutdown commands for all services and resources. Lastly, we will start individual resources one at a time, checking the status after each step. Individual resources can be shut down by reversing the startup order.

About our Environment

The RAC environment that we will reference throughout this paper is composed of two nodes running Red Hat Enterprise Linux 4 ES with a shared disk, and has two ASM instances named ASM1 and ASM2, two database instances named orcl1 and orcl2, and a service named RAC that is used for transparent application failover (TAF) and load balancing.

Overview of Basic RAC Management Commands

The commands we will use are listed below. Remember that this document is a quick reference, and not an exhaustive list of all commands for managing your RAC environment.

Cluster Related Commands
crs_stat -t            Shows HA resource status (hard to read)
crsstat                Output of crs_stat -t formatted nicely
ps -ef|grep d.bin      Shows crsd.bin, evmd.bin, and ocssd.bin
crsctl check crs       Reports whether CSS, CRS, and EVM appear healthy
crsctl stop crs        Stops CRS and all other services
crsctl disable crs*    Prevents CRS from starting on reboot
crsctl enable crs*     Enables CRS start on reboot
crs_stop -all          Stops all registered resources
crs_start -all         Starts all registered resources

* These commands update the file /etc/oracle/scls_scr//root/crsstart, which contains the string "enable" or "disable" as appropriate.

Database Related Commands
srvctl start instance -d (database) -i (instance)     Starts an instance
srvctl start database -d (database)                   Starts all instances
srvctl stop database -d (database)                    Stops all instances, closes database
srvctl stop instance -d (database) -i (instance)      Stops an instance
srvctl start service -d (database) -s (service)       Starts a service
srvctl stop service -d (database) -s (service)        Stops a service
srvctl status service -d (database)                   Checks status of a service
srvctl status instance -d (database) -i (instance)    Checks an individual instance
srvctl status database -d (database)                  Checks status of all instances
srvctl start nodeapps -n (node)                       Starts gsd, vip, listener, and ons
srvctl stop nodeapps -n (node)                        Stops gsd, vip, listener, and ons

Keep in mind that some resources will not start unless other resources are already online. We will now look at the general dependency list in greater detail.

There are three main background processes you can see when doing a ps -ef|grep d.bin. They are normally started by init during the operating system boot process. They can be started and stopped manually by issuing the command /etc/init.d/init.crs {start|stop|enable|disable}

/etc/rc.d/init.d/init.evmd
/etc/rc.d/init.d/init.cssd
/etc/rc.d/init.d/init.crsd
Once the above processes are running, they will automatically start the following services in the following order if they are enabled. This list assumes you are using ASM and have a service set up for TAF/load balancing.

The nodeapps (gsd, VIP, ons, listener) are brought online.
The ASM instances are brought online.
The database instances are brought online.
Any defined services are brought online.
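
The startup order above can be sketched as a small shell function. This is only a minimal sketch and not part of the procedure itself: the names start_rac_stack, run, and DRY_RUN are hypothetical, and the node, database, instance, and service names are the ones from our example environment. With DRY_RUN=1 (the default here), the function only prints the srvctl commands it would run, in dependency order.

```shell
#!/bin/sh
# Hypothetical helper: start RAC resources in dependency order.
# DRY_RUN=1 (default) prints each command instead of running it.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: start_rac_stack DATABASE SERVICE NODE:INSTANCE [NODE:INSTANCE ...]
start_rac_stack() {
    db=$1; svc=$2; shift 2
    # 1. nodeapps, 2. ASM, 3. database instances, 4. the service
    for pair in "$@"; do run srvctl start nodeapps -n "${pair%%:*}"; done
    for pair in "$@"; do run srvctl start asm -n "${pair%%:*}"; done
    for pair in "$@"; do run srvctl start instance -d "$db" -i "${pair##*:}"; done
    run srvctl start service -d "$db" -s "$svc"
}

# Names taken from the example environment in this paper
start_rac_stack orcl RAC green:orcl1 red:orcl2
```

Printing the commands first is a cheap way to review the order before anything is actually started; setting DRY_RUN=0 would run them for real.
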
Basic RAC Management Commands

Now that we know the dependency tree and have some commands at our disposal, let's have a look at them one at a time, starting with the cluster commands and processes.

crs_stat -t

This command shows us the status of each registered resource in the cluster. I generally avoid this command because its output is hard to read: the resource names are truncated, as you can see in the sample output below. You can download a helpful script called crsstat from http://www.dbspecialists.com/specialists/specialist2007-05.html to make it easy on your eyes.

[[email protected] ~]$ crs_stat -t

Name Type Target State Host

ora….SM1.asm application ONLINE ONLINE green
ora….EN.lsnr application ONLINE ONLINE green
ora.green.gsd application ONLINE ONLINE green
ora.green.ons application ONLINE ONLINE green
ora.green.vip application ONLINE ONLINE green
ora…..RAC.cs application ONLINE ONLINE red
ora….cl1.srv application ONLINE ONLINE green
ora….cl2.srv application ONLINE ONLINE red
ora.orcl.db application ONLINE ONLINE red
ora….l1.inst application ONLINE ONLINE green
ora….l2.inst application ONLINE ONLINE red
ora….SM2.asm application ONLINE ONLINE red
ora….ED.lsnr application ONLINE ONLINE red
ora.red.gsd application ONLINE ONLINE red
ora.red.ons application ONLINE ONLINE red
ora.red.vip application ONLINE ONLINE red
[[email protected] ~]$
crsstat

The output of this script is much better. You can learn more about this script and download it at http://www.dbspecialists.com/specialists/specialist2007-05.html.

[[email protected] ~]$ crsstat
HA Resource Target State
----------- ------ -----
ora.green.ASM1.asm ONLINE ONLINE on green
ora.green.LISTENER_GREEN.lsnr ONLINE ONLINE on green
ora.green.gsd ONLINE ONLINE on green
ora.green.ons ONLINE ONLINE on green
ora.green.vip ONLINE ONLINE on green
ora.orcl.RAC.cs ONLINE ONLINE on red
ora.orcl.RAC.orcl1.srv ONLINE ONLINE on green
ora.orcl.RAC.orcl2.srv ONLINE ONLINE on red
ora.orcl.db ONLINE ONLINE on red
ora.orcl.orcl1.inst ONLINE ONLINE on green
ora.orcl.orcl2.inst ONLINE ONLINE on red
ora.red.ASM2.asm ONLINE ONLINE on red
ora.red.LISTENER_RED.lsnr ONLINE ONLINE on red
ora.red.gsd ONLINE ONLINE on red
ora.red.ons ONLINE ONLINE on red
ora.red.vip ONLINE ONLINE on red
[[email protected] ~]$
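
To give an idea of how a formatter of this kind can work, here is a minimal sketch. It is not the crsstat script from the link above. It assumes that crs_stat, run without -t, prints one NAME=, TYPE=, TARGET=, and STATE= line per resource; the function name format_crs_stat and the column widths are hypothetical, and the two sample records are built from resource names in our example environment.

```shell
#!/bin/sh
# Hypothetical formatter: reads NAME=/TARGET=/STATE= records on stdin
# (the assumed long-form crs_stat output) and prints one line per
# resource with the full, untruncated resource name.
format_crs_stat() {
    awk -F= '
        BEGIN { printf "%-32s %-8s %s\n", "HA Resource", "Target", "State" }
        $1 == "NAME"   { name = $2 }
        $1 == "TARGET" { target = $2 }
        $1 == "STATE"  { printf "%-32s %-8s %s\n", name, target, $2 }
    '
}

# Demo on sample records; on a cluster node: crs_stat | format_crs_stat
format_crs_stat <<'EOF'
NAME=ora.green.ASM1.asm
TYPE=application
TARGET=ONLINE
STATE=ONLINE on green

NAME=ora.orcl.RAC.orcl2.srv
TYPE=application
TARGET=ONLINE
STATE=ONLINE on red
EOF
```
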
ps -ef|grep d.bin

We can use this command to verify that the CRS background processes are actually running. It is implicit that they are running if the crs_stat command and crsstat script work. If they do not work, you will want to verify the background processes are really running.

[[email protected] ~]# ps -ef|grep d.bin
oracle 5335 3525 0 Jul11 ? 00:00:05 /u01/app/oracle/product/10.2.0/crs/bin/evmd.bin
root 5487 3817 0 Jul11 ? 00:00:00 /u01/app/oracle/product/10.2.0/crs/bin/crsd.bin reboot
oracle 5932 5392 0 Jul11 ? 00:00:00 /u01/app/oracle/product/10.2.0/crs/bin/ocssd.bin
root 30486 30177 0 18:23 pts/1 00:00:00 grep d.bin
[[email protected] ~]#
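
If you want to check for the three daemons from a script rather than by eye, something like the following sketch could be used. The function name missing_crs_daemons is hypothetical. It reads ps -ef output on standard input and prints the name of each daemon it cannot find; matching on the leading slash of the binary path keeps the grep d.bin line itself from counting as a match.

```shell
#!/bin/sh
# Hypothetical helper: reads "ps -ef" output on stdin and prints the name
# of each expected CRS daemon that is not present. Prints nothing if all
# three are found.
missing_crs_daemons() {
    ps_output=$(cat)
    for daemon in evmd.bin crsd.bin ocssd.bin; do
        if ! printf '%s\n' "$ps_output" | grep -q "/$daemon"; then
            echo "$daemon"
        fi
    done
}

# Demo on sample ps lines; on a cluster node: ps -ef | missing_crs_daemons
missing_crs_daemons <<'EOF'
oracle 5335 3525 0 Jul11 ? 00:00:05 /u01/app/oracle/product/10.2.0/crs/bin/evmd.bin
root 30486 30177 0 18:23 pts/1 00:00:00 grep d.bin
EOF
```
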
crsctl check crs

This command verifies that the above background daemons are functioning.

[[email protected] ~]$ crsctl check crs
CSS appears healthy
CRS appears healthy
EVM appears healthy
[[email protected] ~]$
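
The same check can be scripted by looking for the three "appears healthy" lines shown above. This is a minimal sketch with a hypothetical function name, crs_is_healthy. It relies only on the output text, since we have not verified what exit status crsctl check crs returns.

```shell
#!/bin/sh
# Hypothetical helper: reads "crsctl check crs" output on stdin and
# succeeds only if CSS, CRS, and EVM are each reported as healthy.
crs_is_healthy() {
    check_output=$(cat)
    for component in CSS CRS EVM; do
        printf '%s\n' "$check_output" |
            grep -q "^$component appears healthy" || return 1
    done
    return 0
}

# On a cluster node:
#   crsctl check crs | crs_is_healthy && echo "CRS stack looks healthy"
```
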
crsctl stop crs

We'll need to be logged onto the server as the root user to run this command. It will stop all HA resources on the local node, and it will also stop the above mentioned background daemons.

[[email protected] ~]$ crsctl stop crs
Insufficient user privileges.
[[email protected] ~]$ su
Password:
[[email protected] oracle]# crsctl stop crs
Stopping resources. This could take several minutes.
Successfully stopped CRS resources.
Stopping CSSD.
Shutting down CSS daemon.
Shutdown request successfully issued.
[[email protected] oracle]#
crsctl disable crs

This command will prevent CRS from starting on a reboot. Note there is no return output from the command.

[[email protected] oracle]# crsctl disable crs
[[email protected] oracle]#
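
Since the command prints nothing, one way to confirm the setting is to read the crsstart file mentioned in the command overview, which holds the string "enable" or "disable". The sketch below assumes exactly that; the function name crs_autostart_state is hypothetical, and the path to the file is passed in as an argument rather than hard-coded.

```shell
#!/bin/sh
# Hypothetical helper: prints "enabled", "disabled", or "unknown" based on
# the contents of the crsstart file whose path is given as $1.
crs_autostart_state() {
    state=$(cat "$1" 2>/dev/null) || { echo unknown; return 1; }
    case $state in
        enable*)  echo enabled ;;
        disable*) echo disabled ;;
        *)        echo unknown ;;
    esac
}

# Usage (as root, with the crsstart path for your node):
#   crs_autostart_state /path/to/crsstart
```
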
Because we wanted to do some operating system maintenance, we rebooted after this and verified that CRS did not come back online. Let's check the status by running some of the commands we've just discussed.

[[email protected] ~]$ crsstat
HA Resource Target State
----------- ------ -----
error connecting to CRSD at [(ADDRESS=(PROTOCOL=ipc)(KEY=ora_crsqs))] clsccon 184

[[email protected] ~]$ crsctl check crs
Failure 1 contacting CSS daemon
Cannot communicate with CRS
Cannot communicate with EVM

[[email protected] ~]$ ps -ef|grep d.bin
oracle 6149 5582 0 15:54 pts/1 00:00:00 grep d.bin
[[email protected] ~]$
Everything appears to be down on this node as expected.

Now let's start everything back up. We will need to be root for this, unless you have been given permissions or sudo to run crsctl start crs.

[[email protected] oracle]# crsctl start crs
Attempting to start CRS stack
The CRS stack will be started shortly
[[email protected] oracle]#
After a few minutes the registered resources for this node should come online. Let's check to be sure:

[[email protected] ~]$ crsstat
HA Resource Target State
----------- ------ -----
ora.green.ASM1.asm ONLINE ONLINE on green
ora.green.LISTENER_GREEN.lsnr ONLINE ONLINE on green
ora.green.gsd ONLINE ONLINE on green
ora.green.ons ONLINE ONLINE on green
ora.green.vip ONLINE ONLINE on green
ora.orcl.RAC.cs ONLINE ONLINE on red
ora.orcl.RAC.orcl1.srv ONLINE ONLINE on green
ora.orcl.RAC.orcl2.srv ONLINE ONLINE on red
ora.orcl.db ONLINE ONLINE on red
ora.orcl.orcl1.inst ONLINE ONLINE on green
ora.orcl.orcl2.inst ONLINE ONLINE on red
ora.red.ASM2.asm ONLINE ONLINE on red
ora.red.LISTENER_RED.lsnr ONLINE ONLINE on red
ora.red.gsd ONLINE ONLINE on red
ora.red.ons ONLINE ONLINE on red
ora.red.vip ONLINE ONLINE on red
[[email protected] ~]$
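
Rather than re-running the check by hand during those few minutes, you could poll for it. The following is a generic retry sketch, not an Oracle utility: the function name wait_for and its arguments are hypothetical. It runs a command until it succeeds or the attempt limit is reached.

```shell
#!/bin/sh
# Hypothetical helper: run a command repeatedly until it succeeds or the
# attempt limit is reached.
# Usage: wait_for ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
wait_for() {
    attempts=$1; delay=$2; shift 2
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # Sleep between attempts, but not before the first one
        if [ "$i" -gt 0 ]; then sleep "$delay"; fi
        if "$@" >/dev/null 2>&1; then return 0; fi
        i=$((i + 1))
    done
    return 1
}

# Example: poll every 30 seconds, 20 times at most, until the CRS line of
# "crsctl check crs" reports healthy:
#   wait_for 20 30 sh -c 'crsctl check crs | grep -q "CRS appears healthy"'
```

The example matches on the output text rather than on the exit status of crsctl, because the text is what the sample output in this paper shows.
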
Let's not forget to enable CRS on reboot:

[[email protected] oracle]# crsctl enable crs
crs_stop -all

This is a handy command that stops the registered resources and leaves the CRS daemons running. It covers every resource in the cluster, so it will bring down all registered resources on all nodes.

[[email protected] ~]$ crs_stop -all
Attempting to stop ora.green.gsd on member green
Attempting to stop ora.orcl.RAC.orcl2.srv on member red
Stop of ora.orcl.TEST.orcl1.srv on member green succeeded.
Attempting to stop ora.orcl.RAC.orcl1.srv on member green
Attempting to stop ora.green.ons on member green
Attempting to stop ora.orcl.RAC.cs on member red
Stop of ora.green.gsd on member green succeeded.
Stop of ora.orcl.RAC.orcl1.srv on member green succeeded.
Stop of ora.orcl.RAC.orcl2.srv on member red succeeded.
Stop of ora.orcl.TEST.orcl2.srv on member red succeeded.
Stop of ora.green.ons on member green succeeded.
--snip--
CRS-0216: Could not stop resource 'ora.orcl.orcl2.inst'.
[[email protected] ~]$
Occasionally you will get the CRS-0216 error message shown above. This is usually a false alarm, but you should re-check with crsstat and ps -ef|grep smon or similar to be sure everything has actually shut down.

Let's verify that crs_stop -all worked as expected:

[[email protected] oracle]# crsstat
HA Resource Target State
----------- ------ -----
ora.green.ASM1.asm OFFLINE OFFLINE
ora.green.LISTENER_GREEN.lsnr OFFLINE OFFLINE
ora.green.gsd OFFLINE OFFLINE
ora.green.ons OFFLINE OFFLINE
ora.green.vip OFFLINE OFFLINE
ora.orcl.RAC.cs OFFLINE OFFLINE
ora.orcl.RAC.orcl1.srv OFFLINE OFFLINE
ora.orcl.RAC.orcl2.srv OFFLINE OFFLINE
ora.orcl.db OFFLINE OFFLINE
ora.orcl.orcl1.inst OFFLINE OFFLINE
ora.orcl.orcl2.inst OFFLINE OFFLINE
ora.red.ASM2.asm OFFLINE OFFLINE
ora.red.LISTENER_RED.lsnr OFFLINE OFFLINE
ora.red.gsd OFFLINE OFFLINE
ora.red.ons OFFLINE OFFLINE
ora.red.vip OFFLINE OFFLINE
[[email protected] oracle]#
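
With sixteen resources it is easy to miss one that is still up. As a rough sketch, the listing can be filtered for anything that is not OFFLINE. The function name resources_not_offline is hypothetical, and it assumes crsstat-style output with the resource name in the first column and the state in the third, as in the listings in this paper.

```shell
#!/bin/sh
# Hypothetical helper: reads crsstat-style output on stdin and prints each
# resource whose state (column 3) is not OFFLINE. Header and separator
# lines are skipped because they do not start with "ora.".
resources_not_offline() {
    awk '$1 ~ /^ora\./ && $3 != "OFFLINE" { print $1 }'
}

# Demo on sample lines; on a cluster node: crsstat | resources_not_offline
resources_not_offline <<'EOF'
HA Resource Target State
ora.green.ASM1.asm OFFLINE OFFLINE
ora.orcl.orcl2.inst ONLINE ONLINE on red
EOF
```

An empty result means every listed resource is OFFLINE.
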
Let's move on to working with srvctl and managing individual resources. We will begin with the CRS background daemons already running, and all registered resources being offline from the last step above. We will first start the nodeapps, then the ASM instances, followed by the database instances, and lastly the services for TAF and load balancing. This is the dependency order in our particular environment. You may or may not have ASM or TAF and load balancing services to start in your environment.

srvctl start nodeapps -n (node)

This will bring up the gsd, ons, listener, and vip. The same command can shut down the nodeapps by replacing start with stop.

[[email protected] ~]$ srvctl start nodeapps -n green
[[email protected] ~]$ srvctl start nodeapps -n red
Now we will check with crsstat again to be sure the nodeapps have started.

[[email protected] ~]$ crsstat
HA Resource Target State
----------- ------ -----
ora.green.ASM1.asm OFFLINE OFFLINE
ora.green.LISTENER_GREEN.lsnr ONLINE ONLINE on green
ora.green.gsd ONLINE ONLINE on green
ora.green.ons ONLINE ONLINE on green
ora.green.vip ONLINE ONLINE on green
ora.orcl.RAC.cs OFFLINE OFFLINE
ora.orcl.RAC.orcl1.srv OFFLINE OFFLINE
ora.orcl.RAC.orcl2.srv OFFLINE OFFLINE
ora.orcl.db OFFLINE OFFLINE
ora.orcl.orcl1.inst OFFLINE OFFLINE
ora.orcl.orcl2.inst OFFLINE OFFLINE
ora.red.ASM2.asm OFFLINE OFFLINE
ora.red.LISTENER_RED.lsnr ONLINE ONLINE on red
ora.red.gsd ONLINE ONLINE on red
ora.red.ons ONLINE ONLINE on red
ora.red.vip ONLINE ONLINE on red
[[email protected] oracle]#
Now we need to start our ASM instances before we bring up our database and services.

srvctl start asm -n (node)

This will bring up our ASM instances on nodes green and red. Again, the same command will stop the ASM instances by replacing start with stop.

[[email protected] ~]$ srvctl start asm -n green
[[email protected] ~]$ srvctl start asm -n red
[[email protected] ~]$ crsstat
HA Resource Target State
----------- ------ -----
ora.green.ASM1.asm ONLINE ONLINE on green
ora.green.LISTENER_GREEN.lsnr ONLINE ONLINE on green
ora.green.gsd ONLINE ONLINE on green
ora.green.ons ONLINE ONLINE on green
ora.green.vip ONLINE ONLINE on green
ora.orcl.RAC.cs OFFLINE OFFLINE
ora.orcl.RAC.orcl1.srv OFFLINE OFFLINE
ora.orcl.RAC.orcl2.srv OFFLINE OFFLINE
ora.orcl.db OFFLINE OFFLINE
ora.orcl.orcl1.inst OFFLINE OFFLINE
ora.orcl.orcl2.inst OFFLINE OFFLINE
ora.red.ASM2.asm ONLINE ONLINE on red
ora.red.LISTENER_RED.lsnr ONLINE ONLINE on red
ora.red.gsd ONLINE ONLINE on red
ora.red.ons ONLINE ONLINE on red
ora.red.vip ONLINE ONLINE on red
[[email protected] ~]$
Now let's bring up our orcl1 and orcl2 instances and verify they are up with crsstat. Once more, we can replace start with stop to shut down an individual instance if we so choose.

srvctl start instance -d (database) -i (instance)

[[email protected] ~]$ srvctl start instance -d orcl -i orcl1
[[email protected] ~]$ srvctl start instance -d orcl -i orcl2
[[email protected] ~]$ crsstat
HA Resource Target State
----------- ------ -----
ora.green.ASM1.asm ONLINE ONLINE on green
ora.green.LISTENER_GREEN.lsnr ONLINE ONLINE on green
ora.green.gsd ONLINE ONLINE on green
ora.green.ons ONLINE ONLINE on green
ora.green.vip ONLINE ONLINE on green
ora.orcl.RAC.cs OFFLINE OFFLINE
ora.orcl.RAC.orcl1.srv OFFLINE OFFLINE
ora.orcl.RAC.orcl2.srv OFFLINE OFFLINE
ora.orcl.db ONLINE ONLINE on red
ora.orcl.orcl1.inst ONLINE ONLINE on green
ora.orcl.orcl2.inst ONLINE ONLINE on red
ora.red.ASM2.asm ONLINE ONLINE on red
ora.red.LISTENER_RED.lsnr ONLINE ONLINE on red
ora.red.gsd ONLINE ONLINE on red
ora.red.ons ONLINE ONLINE on red
ora.red.vip ONLINE ONLINE on red
[[email protected] ~]$
srvctl start service -d (database) -s (service)

Now we will finish up by bringing our load balanced/TAF service named RAC online.

[[email protected] ~]$ srvctl start service -d orcl -s RAC
[[email protected] ~]$ crsstat
HA Resource Target State
----------- ------ -----
ora.green.ASM1.asm ONLINE ONLINE on green
ora.green.LISTENER_GREEN.lsnr ONLINE ONLINE on green
ora.green.gsd ONLINE ONLINE on green
ora.green.ons ONLINE ONLINE on green
ora.green.vip ONLINE ONLINE on green
ora.orcl.RAC.cs ONLINE ONLINE on red
ora.orcl.RAC.orcl1.srv ONLINE ONLINE on green
ora.orcl.RAC.orcl2.srv ONLINE ONLINE on red
ora.orcl.db ONLINE ONLINE on red
ora.orcl.orcl1.inst ONLINE ONLINE on green
ora.orcl.orcl2.inst ONLINE ONLINE on red
ora.red.ASM2.asm ONLINE ONLINE on red
ora.red.LISTENER_RED.lsnr ONLINE ONLINE on red
ora.red.gsd ONLINE ONLINE on red
ora.red.ons ONLINE ONLINE on red
ora.red.vip ONLINE ONLINE on red
[[email protected] ~]$
There we have it; all of our resources are now online. The next steps would be to verify you can connect via SQL*Plus or your favorite application.

Conclusion

When a product or process is new to you, as Oracle 10g RAC is to many people, working with it can be an intimidating and possibly disastrous experience. This paper has hopefully given you the elementary commands you will need to manage your Oracle 10g RAC system. While it is not a complete dissection of RAC and its total command set, it should be enough to get you on your feet. You can always get the basic syntax of srvctl by typing srvctl -help. For a complete list of all options, type srvctl -h. You can also get the complete syntax for crsctl by typing crsctl at the command line. Also, do have a peek at the Oracle Clusterware and Oracle Real Application Clusters Administration and Deployment Guide, publication number B14197-04. You can find it on the Oracle website at http://download.oracle.com/docs/cd/B19306_01/rac.102/b14197/toc.htm.