Thursday, October 9, 2014

Oracle DBA Interview Questions and Answers - RAC

Oracle RAC Interview Questions and Answers

1. Where are the Clusterware files stored in a RAC environment?
The Clusterware software is installed on each node (in its own Oracle home), and its shared files (the voting disks and the OCR file) reside on the shared disks.

2. Where are the database software files stored in a RAC environment?
The base software is installed on each node of the cluster, and the database files are stored on the shared disks.

3. What kind of storage can we use for the shared Clusterware files?
- OCFS (Release 1 or 2)
- raw devices
- third party cluster file system such as GPFS or Veritas

4. What kind of storage can we use for the RAC database storage?
- OCFS (Release 1 or 2)
- ASM
- raw devices
- third party cluster file system such as GPFS or Veritas

5. What is a CFS?
A Cluster File System (CFS) is a file system that can be accessed (read and write) by all members of a cluster at the same time. This implies that all members of the cluster have the same view of the files.

6. What is OCFS2?
OCFS2 is version 2 of the Oracle Cluster File System, which can be used with Oracle Real Application Clusters.

7. Which files can be placed on an Oracle Cluster File System?
- Oracle Software installation (Windows only)
- Oracle files (controlfiles, datafiles, redologs, files described by the bfile datatype)
- Shared configuration files (spfile)
- OCR and voting disk
- Files created by Oracle during runtime
Note: There are some platform specific limitations.

8. Do you know any other cluster vendors?
HP Tru64 Unix, Veritas, Microsoft

9. How is it possible to install RAC if we don’t have a CFS?
This is possible by using raw devices.

10. What is a raw device?
A raw device is a disk drive that does not yet have a file system set up. Raw devices are used for Real Application Clusters since they enable the sharing of disks.

11. What is a raw partition?
A raw partition is a portion of a physical disk that is accessed at the lowest possible level. A raw partition is created when an extended partition is created and logical partitions are assigned to it without any formatting. Once a partition has been formatted, it is called a cooked partition.

12. When to use CFS over raw?
A CFS offers:
- Simpler management
- Use of Oracle Managed Files with RAC
- Single Oracle Software installation
- Autoextend enabled on Oracle datafiles
- Uniform accessibility to archive logs in case of physical node failure
- With Oracle_Home on CFS, when you apply Oracle patches CFS guarantees that the updated Oracle_Home is visible to all nodes in the cluster.
Note: This option is very dependent on the availability of a CFS on your platform.

13. When to use raw over CFS?
- Whenever a CFS is not available or not supported by Oracle.
- When performance is critical: raw devices offer the best performance, with no intermediate layer between Oracle and the disk.
Note: Autoextend fails on raw devices if the space is exhausted. However, space can be added online if needed.

14. What is CRS?
Oracle RAC 10g Release 1 introduced Oracle Cluster Ready Services (CRS), a platform-independent set of system services for cluster environments. In Release 2, Oracle has renamed this product to Oracle Clusterware.

15. What is the VIP (virtual IP) used for?
When its primary node fails, the VIP returns a dead connection IMMEDIATELY. Without a VIP, clients have to wait around 10 minutes to receive ORA-3113: “end of file on communications channel”. However, using Transparent Application Failover (TAF) could avoid ORA-3113.

16. Why do we need SSH or RSH configured on the RAC nodes?
SSH (Secure Shell, 10g+) or RSH (Remote Shell, 9i+) allows the “oracle” UNIX account to connect to another RAC node and copy files or run commands as the local “oracle” UNIX account.

17. Is SSH or RSH needed for normal RAC operations?
No. SSH or RSH is needed only for RAC installation, patch set installation and clustered database creation.

18. Do we have to have Oracle RDBMS on all nodes?
Each node of a cluster that is being used for a clustered database will typically have the RDBMS and RAC software loaded on it, but not actual data files (these need to be available via shared disk).

19. What are the restrictions on the SID with a RAC database? Is it limited to 5 characters?
The SID prefix in 10g Release 1 and prior versions was restricted to five characters by the install/config tools so that an ORACLE_SID of at most 5+3=8 characters could be supported in a RAC environment. The SID prefix limit is relaxed to 8 characters in 10g Release 2; see bug 4024251 for more information.

20. Does Real Application Clusters support heterogeneous platforms?
Real Application Clusters does not support heterogeneous platforms in the same cluster.

21. Are there any issues for the interconnect when sharing the same switch as the public network by using VLAN to separate the network?
RAC and Clusterware deployment best practices suggest that the interconnect (private connection) be deployed on a stand-alone, physically separate, dedicated switch. On a large network, the connections could be unstable.

22. What is the Load Balancing Advisory?
To assist in balancing application workload across designated resources, Oracle Database 10g Release 2 provides the Load Balancing Advisory. This advisory monitors the current workload activity across the cluster. For each instance where a service is active, it provides a percentage value of how much of the total workload should be sent to that instance, as well as a service quality flag.

23. How many nodes are supported in a RAC Database?
With 10g Release 2, we support 100 nodes in a cluster using Oracle Clusterware, and 100 instances in a RAC database. Currently DBCA has a bug where it will not go beyond 63 instances. There is also a documentation bug for the max-instances parameter. With 10g Release 1, the maximum is 63.

24. What is the Cluster Verification Utility (cluvfy)?
The Cluster Verification Utility (CVU) is a validation tool that you can use to check all the important components that need to be verified at different stages of deployment in a RAC environment.

25. What versions of the database can I use the cluster verification utility (cluvfy) with?
The cluster verification utility is released with Oracle Database 10g Release 2 but can also be used with Oracle Database 10g Release 1.

26. If I am using Vendor Clusterware such as Veritas, IBM, Sun or HP, do I still need Oracle Clusterware to run Oracle RAC 10g?
Yes. When certified, you can use vendor clusterware; however, you must still install and use Oracle Clusterware for RAC. Best practice is to leave Oracle Clusterware to manage RAC. For details see Metalink Note 332257.1, and for Veritas SFRAC see 397460.1.

27. Is RAC on VMware supported?
Yes.

28. What is the hangcheck timer used for?
The hangcheck timer regularly checks the health of the system. If the system hangs or stops, the node is restarted automatically.

There are 2 key parameters for this module:

-> hangcheck_tick: defines the period of time between checks of system health. The default value is 60 seconds; Oracle recommends setting it to 30 seconds.

-> hangcheck_margin: defines the maximum hang delay that is tolerated before hangcheck-timer resets the RAC node.
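
As a rough illustration of how the two parameters combine: the module is generally described as resetting the node once a hang lasts longer than the tick plus the margin. The sketch below only computes that window and prints the modprobe line that would load the module; it does not load anything. The margin of 180 seconds is an assumed value for illustration, not a figure from this post. On the modprobe command line the parameters are spelled with underscores (hangcheck_tick, hangcheck_margin).

```shell
# Values in seconds. tick=30 is the recommended setting quoted above;
# margin=180 is an assumed value chosen only for illustration.
hangcheck_tick=30
hangcheck_margin=180

# A hang longer than tick + margin is what triggers the node reset.
max_hang=$((hangcheck_tick + hangcheck_margin))
echo "Node is reset after a hang longer than ${max_hang} seconds"

# Dry run: print the command instead of executing it.
# Running it for real requires root on the cluster node.
modprobe_cmd="modprobe hangcheck-timer hangcheck_tick=${hangcheck_tick} hangcheck_margin=${hangcheck_margin}"
echo "$modprobe_cmd"
```

Printing the command first makes it easy to review the values before they are applied on a production node.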

29. Is the hangcheck timer still needed with Oracle RAC 10g?
Yes.

30. What files can I put on Linux OCFS2?
For optimal performance, you should only put the following files on Linux OCFS2:
- Datafiles
- Control Files
- Redo Logs
- Archive Logs
- Shared Configuration File (OCR)
- Voting File
- SPFILE

31. Is it possible to use ASM for the OCR and voting disk?
Not in 10g or 11g Release 1: there the OCR and voting disk must be on raw devices or a CFS (cluster file system). Starting with 11g Release 2, both can be stored in ASM.

32. Can I change the name of my cluster after I have created it when I am using Oracle Clusterware?
No, you must properly uninstall Oracle Clusterware and then re-install.

33. What is O2CB?
O2CB is the OCFS2 cluster stack. It consists of a set of services that must be started before OCFS2 can be used (to format or mount file systems).

34. What is the OCR file used for?
The OCR is a file that stores the cluster and RAC configuration.

35. What is the voting disk file used for?
The voting disk is a file that records and manages node membership information for the cluster.

36. What is the recommended method to make backups of a RAC environment?
Use RMAN to back up the database, dd to back up the voting disk, and hard copies of the OCR file.

37. What command would you use to check the availability of the RAC system?
crs_stat -t -v (-t -v are optional)

38. What is the minimum number of instances you need to have in order to create a RAC?
You can create a RAC with just one server.

39. Name two RAC-specific background processes.
RAC processes are: LMON, LMDx, LMSn, LCKx and DIAG.

40. Can you have many database versions in the same RAC?
Yes, but the Clusterware version must be equal to or greater than the highest database version.

41. What was RAC called before it was named RAC?
OPS: Oracle Parallel Server

42. What RAC component is used for communication between instances?
Private Interconnect.

43. What is the difference between normal views and RAC views?
A RAC view has the prefix ‘G’. For example, GV$SESSION instead of V$SESSION
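
As a small sketch of the difference: a GV$ view returns rows from every instance and adds an INST_ID column, so a single query can report per-instance figures. The snippet below counts sessions per instance through sqlplus. It assumes OS authentication as SYSDBA on a cluster node, and the column alias is illustrative. The sqlplus call is skipped when sqlplus is not on the PATH.

```shell
# Count sessions per instance. INST_ID identifies the instance each row
# came from. Single quotes keep the shell from expanding "$session".
query='SELECT inst_id, COUNT(*) AS session_count FROM gv$session GROUP BY inst_id ORDER BY inst_id;'

# Assumes OS authentication as SYSDBA; adjust the connect string otherwise.
if command -v sqlplus >/dev/null 2>&1; then
    printf '%s\n' "$query" | sqlplus -s "/ as sysdba"
fi
```

Running the same query against V$SESSION would report only the instance you are connected to.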

44. Which command will we use to manage (stop, start) RAC services in command-line mode?
srvctl

45. How many alert logs exist in a RAC environment?
One for each instance.

46. What are the Oracle Clusterware components?
Voting Disk — Oracle RAC uses the voting disk to manage cluster membership by way of a health check and arbitrates cluster ownership among the instances in case of network failures. The voting disk must reside on shared disk.

Oracle Cluster Registry (OCR) — Maintains cluster configuration information as well as configuration information about any cluster database within the cluster. The OCR must reside on shared disk that is accessible by all of the nodes in your cluster

47. How do you back up the voting disk?
#dd if=voting_disk_name of=backup_file_name
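
A minimal sketch of the same dd pattern, run against a stand-in file so it can be tried safely. The file names and the 4k block size are assumptions for illustration; on a real cluster the input would be the voting disk path reported by crsctl query css votedisk.

```shell
# Stand-in for the voting disk: a 1 MB scratch file (hypothetical path).
workdir=$(mktemp -d)
voting_disk="$workdir/voting_disk_standin"
backup_file="$workdir/voting_disk.bak"
dd if=/dev/zero of="$voting_disk" bs=1024 count=1024 2>/dev/null

# Back up with dd, as in the answer above; bs=4k is an assumed block size.
dd if="$voting_disk" of="$backup_file" bs=4k 2>/dev/null

# Verify the copy is byte-for-byte identical before relying on it.
if cmp -s "$voting_disk" "$backup_file"; then
    backup_status="verified"
else
    backup_status="mismatch"
fi
echo "Backup $backup_status: $backup_file"
```

Comparing the copy with cmp is an extra step not mentioned in the answer, but it is cheap and catches a truncated backup.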

48. How do I identify the voting disk location?
#crsctl query css votedisk

49. How do I identify the OCR file location?
Check /var/opt/oracle/ocr.loc or /etc/ocr.loc (depending on the platform)
or
#ocrcheck

50. How do you back up the OCR?
There is an automatic backup mechanism for the OCR. The default location is: $ORA_CRS_HOME/cdata/<cluster_name>/
To display backups:
#ocrconfig -showbackup
To restore a backup:
#ocrconfig -restore <backup_file_name>
With Oracle RAC 10g Release 2 or later, you can also use the export command:
#ocrconfig -export <file_name> -s online
Use the -import option to restore the exported contents.
With Oracle RAC 11g Release 1, you can make a manual backup of the OCR with the command:
#ocrconfig -manualbackup

51. What is SCAN?
Single Client Access Name (SCAN) is a new Oracle Real Application Clusters (RAC) 11g Release 2 feature that provides a single name for clients to access an Oracle Database running in a cluster. The benefit is that clients using SCAN do not need to change their configuration if you add or remove nodes in the cluster.

52. What are the Oracle Clusterware processes for 10g on Unix and Linux?
Cluster Synchronization Services (ocssd) — Manages cluster node membership and runs as the oracle user; failure of this process results in cluster restart.

Cluster Ready Services (crsd) — The crs process manages cluster resources (which could be a database, an instance, a service, a Listener, a virtual IP (VIP) address, an application process, and so on) based on the resource's configuration information that is stored in the OCR. This includes start, stop, monitor and failover operations. This process runs as the root user

Event manager daemon (evmd) — A background process that publishes events that crs creates.

Process Monitor Daemon (OPROCD) — This process monitors the cluster and provides I/O fencing. OPROCD performs its check, stops running, and if the wake up is beyond the expected time, then OPROCD resets the processor and reboots the node. An OPROCD failure results in Oracle Clusterware restarting the node. OPROCD uses the hangcheck timer on Linux platforms.

RACG (racgmain, racgimon) — Extends clusterware to support Oracle-specific requirements and complex resources. Runs server callout scripts when FAN events occur.

53. What are the Oracle database background processes specific to RAC?
•LMS—Global Cache Service Process
•LMD—Global Enqueue Service Daemon
•LMON—Global Enqueue Service Monitor
•LCK0—Instance Enqueue Process
To ensure that each Oracle RAC database instance obtains the block that it needs to satisfy a query or transaction, Oracle RAC instances use two processes, the Global Cache Service (GCS) and the Global Enqueue Service (GES). The GCS and GES maintain records of the statuses of each data file and each cached block using a Global Resource Directory (GRD). The GRD contents are distributed across all of the active instances.

54. How do you troubleshoot a node reboot?
Please check metalink ...
Note 265769.1 Troubleshooting CRS Reboots
Note 559365.1 Using Diagwait as a diagnostic to get more information for diagnosing Oracle Clusterware Node evictions.

55. Is ssh required for normal Oracle RAC operation?
"ssh" is not required for normal Oracle RAC operation. However, "ssh" should be enabled for Oracle RAC and patch set installation.

56. What is the purpose of Private Interconnect ?
Clusterware uses the private interconnect for cluster synchronization (network heartbeat) and daemon communication between the clustered nodes. This communication is based on the TCP protocol.
RAC uses the interconnect for cache fusion (UDP) and inter-process communication (TCP). Cache Fusion is the remote memory mapping of Oracle buffers, shared between the caches of participating nodes in the cluster.

57. Why do we have a Virtual IP (VIP) in Oracle RAC?
Without using VIPs or FAN, clients connected to a node that died will often wait for a TCP timeout period (which can be up to 10 min) before getting an error. As a result, you don't really have a good HA solution without using VIPs.

When a node fails, the VIP associated with it is automatically failed over to another node, and the new node re-ARPs the network, announcing a new MAC address for the IP. Subsequent packets sent to the VIP go to the new node, which sends error RST packets back to the clients. As a result, the clients get errors immediately.

58. What do you do if you see GC CR BLOCK LOST in the top 5 Timed Events in an AWR report?
This is most likely due to a fault in the interconnect network.
Check netstat -s
If you see "fragments dropped" or "packet reassemblies failed", work with your system administrator to find the fault in the network.
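
The check above can be sketched as a small filter over the netstat -s output. check_reassembly is a hypothetical helper name, not an Oracle tool. The exact counter wording varies by platform, so the pattern is deliberately loose and matches both "reassemblies failed" and "reassembles failed".

```shell
# Reads `netstat -s` output on stdin and prints any lines that mention
# dropped fragments or failed reassembly. Exit status is 0 if at least
# one such line is present; the counts themselves still need a look.
check_reassembly() {
    grep -i -E 'fragments dropped|reassembl[a-z]* failed'
}

# Typical use on a cluster node. The counters are cumulative, so compare
# two runs taken a few minutes apart to see whether they are growing:
#   netstat -s | check_reassembly
```

Counters that keep increasing between runs are the ones worth raising with the system administrator.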

59. How many nodes are supported in a RAC Database?
10g Release 2 supports 100 nodes in a cluster using Oracle Clusterware, and 100 instances in a RAC database.

60. Srvctl cannot start an instance and I get the errors PRKP-1001 and CRS-0215, but sqlplus can start it on both nodes. How do you identify the problem?
Set the environment variable SRVM_TRACE to true, then start the instance with srvctl. You will now get a detailed error stack.
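
A minimal sketch of that sequence, assuming a Bourne-style shell. ORCL and ORCL1 are hypothetical database and instance names; substitute your own. The guard simply skips the srvctl call on a machine where it is not installed.

```shell
# Enable srvctl tracing for this shell session only.
export SRVM_TRACE=true

# Hypothetical database and instance names.
db_name=ORCL
instance_name=ORCL1

# Re-run the failing command; with tracing on, srvctl prints the
# detailed error stack to the terminal.
if command -v srvctl >/dev/null 2>&1; then
    srvctl start instance -d "$db_name" -i "$instance_name"
fi

# Turn tracing off again once the error stack has been captured:
# unset SRVM_TRACE
```

Setting the variable in the session rather than in a profile keeps the verbose output from leaking into routine srvctl use.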

61. What is the purpose of the ONS daemon?
The Oracle Notification Service (ONS) daemon is a daemon started by the CRS clusterware as part of the nodeapps. There is one ONS daemon started per clustered node.
The ONS daemon receives a subset of published clusterware events via the local evmd and racgimon clusterware daemons and forwards those events to application subscribers and to the local listeners.

This is in order to facilitate:

a. the FAN (Fast Application Notification) feature, which allows applications to respond to database state changes.

b. the 10gR2 Load Balancing Advisory, the feature that permits load balancing across different RAC nodes depending on the load on each node. The RDBMS MMON process creates an advisory for distribution of work every 30 seconds and forwards it via racgimon and ONS to listeners and applications.
