CRS-4535 Cannot communicate with Cluster Ready Services

In a cluster environment, checking the status of CRS (Cluster Ready Services) may return the error CRS-4535: Cannot communicate with Cluster Ready Services, as shown below.

[root@rac1 bin]# ./crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4530: Communications failure contacting Cluster Synchronization Services daemon
CRS-4534: Cannot communicate with Event Manager

This is mainly caused by one of two things, so check both:
1. Connectivity: whether the nodes can ping each other on all of their IPs (public, private and virtual).
2. Permissions: whether the Grid owner has permission on the ASM disks on the node where the error occurred.
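The connectivity check can be scripted along these lines. This is only a minimal sketch: the function name check_nodes, the PING_CMD override and the host names in the example call are my own placeholders, not something Oracle ships, so substitute the names from your own /etc/hosts.

```shell
# Hypothetical helper: ping each host once and report the result.
# PING_CMD is an assumed override hook (useful for dry runs); it defaults to ping.
check_nodes() {
    local failed=0 host
    for host in "$@"; do
        # -c 1: send a single packet, -W 2: wait at most 2 seconds (Linux ping)
        if ${PING_CMD:-ping} -c 1 -W 2 "$host" >/dev/null 2>&1; then
            echo "OK $host"
        else
            echo "FAIL $host"
            failed=1
        fi
    done
    # Non-zero exit status if at least one host did not answer
    return "$failed"
}

# Example call with placeholder names (replace with your own):
# check_nodes rac1 rac2 rac1-priv rac2-priv rac1-vip rac2-vip
```

Run it from every node, since a one-way failure is easy to miss when you only test from one side.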

In my case, the Grid owner was the user oracle. Connectivity between the nodes was fine: they could ping each other on their public, private and virtual IPs.

So the issue lay with the permissions on the ASM disks for the Grid owner (username oracle).

This is what I found when I checked the ASM disks: they were owned by root, and oracle had no permissions on them. The process list also shows only the CSS monitor and agent, with no ocssd.bin running.

[root@rac1 bin]# cd /dev/oracleasm/disks
[root@rac1 disks]# ls -lrt
total 0
brw------- 1 root root 8, 17 May  6 10:16 DISK1
brw------- 1 root root 8, 33 May  6 10:16 DISK2
brw------- 1 root root 8, 49 May  6 10:16 DISK3
brw------- 1 root root 8, 65 May  6 10:16 DISK4
brw------- 1 root root 8, 81 May  6 10:16 DISK5
[root@rac1 bin]# ps -ef | grep css
root      3784     1  0 10:17 ?        00:00:01 /u01/app/grid/11.2.0/bin/cssdmonitor
root      3801     1  0 10:17 ?        00:00:01 /u01/app/grid/11.2.0/bin/cssdagent
root      4189  4107  0 10:26 pts/1    00:00:00 grep css
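Instead of reading the ls output by eye, the ownership check can be sketched as a small function. The name list_wrong_owner is hypothetical, and the sketch assumes GNU stat (for the -c option), which is what Linux provides.

```shell
# Hypothetical helper: print every entry in a directory that is NOT owned
# by the expected user. Assumes GNU stat (-c '%U' prints the owner name).
list_wrong_owner() {
    local dir="$1" owner="$2" f
    for f in "$dir"/*; do
        # Skip the unexpanded glob when the directory is empty
        [ -e "$f" ] || continue
        if [ "$(stat -c '%U' "$f")" != "$owner" ]; then
            echo "$f"
        fi
    done
}

# Example call for the layout in this post:
# list_wrong_owner /dev/oracleasm/disks oracle
```

With the listing above, where every disk belongs to root, this would print all five disk paths; empty output means the ownership is as expected.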

Now change the owner of these disks to oracle, as shown below, and give the oracle user read/write permission on them. I used mode 777 here; read/write for the owner and group (660) is normally sufficient and is the safer choice.

[root@rac1 disks]# chown -R oracle:dba /dev/oracleasm/disks
[root@rac1 disks]# chmod -R 777 /dev/oracleasm/disks
[root@rac1 disks]# ls -lrt
total 0
brwxrwxrwx 1 oracle dba 8, 17 May  6 10:16 DISK1
brwxrwxrwx 1 oracle dba 8, 33 May  6 10:16 DISK2
brwxrwxrwx 1 oracle dba 8, 49 May  6 10:16 DISK3
brwxrwxrwx 1 oracle dba 8, 65 May  6 10:16 DISK4
brwxrwxrwx 1 oracle dba 8, 81 May  6 10:16 DISK5
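A narrower variant of the same fix might look like the sketch below. It is not what I ran above: it swaps mode 777 for 660 (read/write for owner and group only) and touches each disk individually. The function name fix_disk_perms and the DRY_RUN switch are my own assumptions, added so the commands can be previewed before anything is changed.

```shell
# Hypothetical helper: set owner, group and mode 660 on every disk in a directory.
# With DRY_RUN set to any non-empty value, the commands are printed, not executed.
fix_disk_perms() {
    local dir="$1" owner="$2" group="$3" run="" disk
    if [ -n "${DRY_RUN:-}" ]; then
        run=echo
    fi
    for disk in "$dir"/*; do
        [ -e "$disk" ] || continue
        $run chown "$owner:$group" "$disk"
        $run chmod 660 "$disk"
    done
}

# Preview first, then run for real as root:
# DRY_RUN=1 fix_disk_perms /dev/oracleasm/disks oracle dba
# fix_disk_perms /dev/oracleasm/disks oracle dba
```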

Once the permissions are in place, start the cluster services as the root user.

Change to your $GRID_HOME/bin directory (in my case, $GRID_HOME was /u01/app/oracle/product/11.2.0/grid) and start the cluster services with the crsctl utility, as shown below.

[root@rac1 bin]# ./crsctl start cluster
CRS-2672: Attempting to start 'ora.cssd' on 'rac1'
CRS-2676: Start of 'ora.cssd' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.ctssd' on 'rac1'
CRS-2676: Start of 'ora.ctssd' on 'rac1' succeeded
CRS-2679: Attempting to clean 'ora.asm' on 'rac1'
CRS-2681: Clean of 'ora.asm' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'rac1'
CRS-2676: Start of 'ora.asm' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.crsd' on 'rac1'
CRS-2676: Start of 'ora.crsd' on 'rac1' succeeded

Check the CSS service status. The ocssd.bin daemon should now appear in the process list, running as the Grid owner:

[root@rac1 bin]# ps -ef | grep css
root      3784     1  0 10:17 ?        00:00:01 /u01/app/grid/11.2.0/bin/cssdmonitor
root      4372     1  0 10:30 ?        00:00:01 /u01/app/grid/11.2.0/bin/cssdagent
oracle    4387     1  1 10:30 ?        00:00:02 /u01/app/grid/11.2.0/bin/ocssd.bin
root      5347  4107  0 10:33 pts/1    00:00:00 grep css
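If you want a yes/no answer rather than a listing, the process check can be sketched as a filter over ps -ef output. The function name ocssd_running is hypothetical, and the sketch assumes the standard ps -ef column layout, where the command is the eighth field.

```shell
# Hypothetical helper: read `ps -ef` output on stdin and succeed only if
# some process has ocssd.bin as its command (field 8 in `ps -ef` output).
# The grep process itself is ignored because its command field is "grep".
ocssd_running() {
    awk '$8 ~ /(^|\/)ocssd\.bin$/ { found = 1 } END { exit !found }'
}

# Example:
# ps -ef | ocssd_running && echo "CSS daemon is up"
```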

Now check whether CRS (Cluster Ready Services) is online:

[root@rac1 bin]# ./crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
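For monitoring, the same check can be turned into a pass/fail test by looking for the four "online" message codes seen in the healthy output above. This is a rough sketch: the function name crs_all_online is my own, and it only matches on the codes that appear in this post.

```shell
# Hypothetical helper: read `crsctl check crs` output on stdin and verify that
# all four "online" codes from a healthy stack are present.
crs_all_online() {
    local output code
    output=$(cat)
    for code in CRS-4638 CRS-4537 CRS-4529 CRS-4533; do
        case "$output" in
            *"$code"*) ;;                       # code found, keep checking
            *) echo "missing $code"; return 1 ;;  # report the first one absent
        esac
    done
    echo "all online"
}

# Example (run from $GRID_HOME/bin as root):
# ./crsctl check crs | crs_all_online
```

Fed the failing output from the start of this post, it reports the first code that is absent and returns a non-zero status.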

Here we go!