Tanti Technology

Bangalore, karnataka, India
Multi-platform UNIX systems consultant and administrator in mutualized and virtualized environments, with 4.5+ years of experience in AIX system administration. This site is meant to help system administrators in their day-to-day activities. Your comments on posts are welcome. This blog is all about the IBM AIX flavour of UNIX. It is aimed at system admins who use AIX in their work life, and also at newbies who want to get certified in AIX administration. It will be updated frequently to help system admins and other new learners. DISCLAIMER: The blog owner takes no responsibility of any kind for any data loss or damage caused by trying any of the commands/methods mentioned in this blog. You use the commands/methods/scripts at your own responsibility. If you find something useful, a comment would be appreciated to let other viewers know that the solution/method worked for you.

Sunday 19 June 2011

HACMP Basics

SANDEEP TANTI

History
IBM's HACMP has existed for almost 15 years. It was not originally an IBM product: IBM bought it from CLAM, which was later renamed Availant and is now called LakeViewTech. Until August 2006, all development of HACMP was done by CLAM. Nowadays IBM does its own development of HACMP in Austin, Poughkeepsie and Bangalore.

IBM's high availability solution for AIX, High Availability Cluster Multi Processing (HACMP), consists of two components:

• High Availability: the process of ensuring an application is available for use through the use of duplicated and/or shared resources (eliminating single points of failure, SPOFs).

• Cluster Multi-Processing: multiple applications running on the same nodes with shared or concurrent access to the data.

A high availability solution based on HACMP provides automated failure detection, diagnosis, application recovery and node reintegration. With an appropriate application, HACMP can also provide concurrent access to the data for parallel processing applications, thus offering excellent horizontal scalability.

What needs to be protected? Ultimately, the goal of any IT solution in a critical environment is to provide continuous service and data protection.

High availability is just one building block in achieving the continuous-operation goal, and it depends on the availability of the hardware, the software (OS and its components), the application and the network components.

The main objective of HACMP is to eliminate single points of failure (SPOFs).

“…A fundamental design goal of (successful) cluster design is the elimination of single points of failure (SPOFs)…”


Eliminate Single Points of Failure (SPOFs)

Cluster object       Eliminated as a single point of failure by
Node                 Using multiple nodes
Power source         Using multiple circuits or uninterruptible power supplies
Network adapter      Using redundant network adapters
Network              Using multiple networks to connect nodes
TCP/IP subsystem     Using non-IP networks to connect adjoining nodes and clients
Disk adapter         Using redundant disk adapters or multiple adapters
Disk                 Using multiple disks with mirroring or RAID
Application          Add node for takeover; configure application monitor
Administrator        Add backup or a very detailed operations guide
Site                 Add an additional site


Cluster Components

Here are the recommended practices for important cluster components.


Nodes

HACMP supports clusters of up to 32 nodes, with any combination of active and standby nodes. While it
is possible to have all nodes in the cluster running applications (a configuration referred to as "mutual
takeover"), the most reliable and available clusters have at least one standby node - one node that is normally
not running any applications, but is available to take them over in the event of a failure on an active
node.

Additionally, it is important to pay attention to environmental considerations. Nodes should not have a
common power supply - which may happen if they are placed in a single rack. Similarly, building a cluster
of nodes that are actually logical partitions (LPARs) with a single footprint is useful as a test cluster, but
should not be considered for availability of production applications.
Nodes should be chosen that have sufficient I/O slots to install redundant network and disk adapters.
That is, twice as many slots as would be required for single node operation. This naturally suggests that
processors with small numbers of slots should be avoided. Use of nodes without redundant adapters
should not be considered best practice. Blades are an outstanding example of this. And, just as every cluster
resource should have a backup, the root volume group in each node should be mirrored or be on a RAID device.
Nodes should also be chosen so that when the production applications are run at peak load, there are still
sufficient CPU cycles and I/O bandwidth to allow HACMP to operate. The production application
should be carefully benchmarked (preferable) or modeled (if benchmarking is not feasible) and nodes chosen
so that they will not exceed 85% busy, even under the heaviest expected load.
Note that the takeover node should be sized to accommodate all possible workloads: if there is a single
standby backing up multiple primaries, it must be capable of servicing multiple workloads. On hardware
that supports dynamic LPAR operations, HACMP can be configured to allocate processors and memory to
a takeover node before applications are started. However, these resources must actually be available, or
acquirable through Capacity Upgrade on Demand. The worst case situation – e.g., all the applications on
a single node – must be understood and planned for.

Networks

HACMP is a network centric application. HACMP networks not only provide client access to the applications
but are used to detect and diagnose node, network and adapter failures. To do this, HACMP uses
RSCT which sends heartbeats (UDP packets) over ALL defined networks. By gathering heartbeat information
on multiple nodes, HACMP can determine what type of failure has occurred and initiate the appropriate
recovery action. Being able to distinguish between certain failures, for example the failure of a network
and the failure of a node, requires a second network! Although this additional network can be “IP
based” it is possible that the entire IP subsystem could fail within a given node. Therefore, in addition
there should be at least one, ideally two, non-IP networks. Failure to implement a non-IP network can potentially
lead to a Partitioned cluster, sometimes referred to as 'Split Brain' Syndrome. This situation can
occur if the IP network(s) between nodes becomes severed or in some cases congested. Since each node is
in fact, still very alive, HACMP would conclude the other nodes are down and initiate a takeover. After
takeover has occurred the application(s) potentially could be running simultaneously on both nodes. If the
shared disks are also online to both nodes, then the result could lead to data divergence (massive data corruption).
This is a situation which must be avoided at all costs.

The most convenient way of configuring non-IP networks is to use Disk Heartbeating as it removes the
problems of distance with rs232 serial networks. Disk heartbeat networks only require a small disk or
LUN. Be careful not to put application data on these disks. Although, it is possible to do so, you don't want
any conflict with the disk heartbeat mechanism!

Adapters

As stated above, each network defined to HACMP should have at least two adapters per node. While it is
possible to build a cluster with fewer, the reaction to adapter failures is more severe: the resource group
must be moved to another node. AIX provides support for Etherchannel, a facility that can be used to aggregate
adapters (increase bandwidth) and provide network resilience. Etherchannel is particularly useful for
fast responses to adapter / switch failures. This must be set up with some care in an HACMP cluster.
When done properly, this provides the highest level of availability against adapter failure. Refer to the IBM
techdocs website: http://www-03.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/TD101785 for further
details.
Many System p servers contain built-in Ethernet adapters. If the nodes are physically close together, it
is possible to use the built-in Ethernet adapters on two nodes and a "cross-over" Ethernet cable (sometimes
referred to as a "data transfer" cable) to build an inexpensive Ethernet network between two nodes for
heart beating. Note that this is not a substitute for a non-IP network.
Some adapters provide multiple ports. One port on such an adapter should not be used to back up another
port on that adapter, since the adapter card itself is a common point of failure. The same thing is true
of the built-in Ethernet adapters in most System p servers and currently available blades: the ports have a
common adapter. When the built-in Ethernet adapter can be used, best practice is to provide an additional
adapter in the node, with the two backing up each other.
Be aware of the network detection settings for the cluster and consider tuning these values. In HACMP terms,
these are referred to as NIM values. There are four settings per network type which can be used: slow,
normal, fast and custom. With the default setting of normal for a standard Ethernet network, the network
failure detection time would be approximately 20 seconds. With today's switched network technology this
is a large amount of time. By switching to a fast setting the detection time would be reduced by 50% (10
seconds), which in most cases would be more acceptable. Be careful, however, when using custom settings,
as setting these values too low can cause false takeovers to occur. These settings can be viewed using a variety
of techniques including: the lssrc -ls topsvcs command (from a node which is active), odmget
HACMPnim | grep -p ether, and smitty hacmp.
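For a quick look at what is actually in effect on a running node, the commands above can be used like this (the "ether" pattern assumes a standard Ethernet network module; substitute your own network type):

# lssrc -ls topsvcs                   # heartbeat interval, sensitivity and defined heartbeat rings
# odmget HACMPnim | grep -p ether     # NIM tuning values stored in the HACMP ODM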

Applications
The most important part of making an application run well in an HACMP cluster is understanding the
application's requirements. This is particularly important when designing the Resource Group policy behavior
and dependencies. For high availability to be achieved, the application must have the ability to
stop and start cleanly and not explicitly prompt for interactive input. Some applications tend to bond to a
particular OS characteristic such as a uname, serial number or IP address. In most situations, these problems
can be overcome. The vast majority of commercial software products which run under AIX are well
suited to be clustered with HACMP.

Application Data Location
Where should application binaries and configuration data reside? There are many arguments to this discussion.
Generally, keep all the application binaries and data where possible on the shared disk, as it is easy
to forget to update it on all cluster nodes when it changes. This can prevent the application from starting or
working correctly, when it is run on a backup node. However, the correct answer is not fixed. Many application
vendors have suggestions on how to set up the applications in a cluster, but these are recommendations.
Just when it seems to be clear cut as to how to implement an application, someone thinks of a new
set of circumstances. Here are some rules of thumb:
If the application is packaged in LPP format, it is usually installed on the local file systems in rootvg. This
behavior can be overcome, by bffcreate’ing the packages to disk and restoring them with the preview option.
This action will show the install paths, then symbolic links can be created prior to install which point
to the shared storage area. If the application is to be used on multiple nodes with different data or configuration,
then the application and configuration data would probably be on local disks and the data sets on
shared disk with application scripts altering the configuration files during fallover. Also, remember the
HACMP File Collections facility can be used to keep the relevant configuration files in sync across the cluster.
This is particularly useful for applications which are installed locally.

Start/Stop Scripts
Application start scripts should not assume the status of the environment. Intelligent programming should
correct any irregular conditions that may occur. The cluster manager spawns these scripts off in a separate
job in the background and carries on processing. Some things a start script should do are:
First, check that the application is not currently running! This is especially crucial for v5.4 users as
resource groups can be placed into an unmanaged state (forced down action, in previous versions).
Using the default startup options, HACMP will rerun the application start script which may cause
problems if the application is actually running. A simple and effective solution is to check the state
of the application on startup. If the application is found to be running, simply end the start script
with exit 0.
Verify the environment. Are all the disks, file systems, and IP labels available?
If different commands are to be run on different nodes, store the executing HOSTNAME in a variable.
Check the state of the data. Does it require recovery? Always assume the data is in an unknown state
since the conditions that occurred to cause the takeover cannot be assumed.
Are there prerequisite services that must be running? Is it feasible to start all prerequisite services
from within the start script? Is there an inter-resource group dependency or resource group sequencing
that can guarantee the previous resource group has started correctly? HACMP v5.2 and later has
facilities to implement checks on resource group dependencies including collocation rules in
HACMP v5.3.
Finally, when the environment looks right, start the application. If the environment is not correct and
error recovery procedures cannot fix the problem, ensure there are adequate alerts (email, SMS,
SNMP traps etc.) sent out via the network to the appropriate support administrators. A skeleton start
script illustrating these checks is sketched below.
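For illustration, here is a minimal start script skeleton following the checklist above. It is only a sketch: the owner (oracle), start command (/usr/local/bin/dbstart, as in the cllsserv example later in this post), file system (/oradata), process name (ora_pmon) and mail address are made-up placeholders, not anything HACMP supplies.

#!/bin/ksh
# Minimal HACMP application start script skeleton (illustrative sketch only)
set -x && PS4="${0##*/}"'[$LINENO] '

APP_OWNER=oracle                        # placeholder application owner
APP_START=/usr/local/bin/dbstart        # placeholder start command
DATA_FS=/oradata                        # placeholder shared file system
ALERT=admin@example.com                 # placeholder notification address

# 1. If the application is already running, do nothing and exit 0
if ps -ef | grep -q "[o]ra_pmon"; then
    exit 0
fi

# 2. Verify the environment: is the shared file system mounted?
if ! mount | grep -qw "$DATA_FS"; then
    echo "$DATA_FS not mounted on $(hostname)" | mail -s "HACMP start script problem" $ALERT
    exit 1
fi

# 3. Node-specific handling, if any, keyed off the hostname
NODE=$(hostname)

# 4. Start the application as its owner
su - "$APP_OWNER" -c "$APP_START"

exit 0
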
Stop scripts are different from start scripts in that most applications have a documented start-up routine
and not necessarily a stop routine. The assumption is once the application is started why stop it? Relying
on a failure of a node to stop an application will be effective, but to use some of the more advanced features
of HACMP the requirement exists to stop an application cleanly. Some of the issues to avoid are:


Be sure to terminate any child or spawned processes that may be using the disk resources. Consider
implementing child resource groups.
Verify that the application is stopped to the point that the file system is free to be unmounted. The
fuser command may be used to verify that the file system is free.
In some cases it may be necessary to double check that the application vendor’s stop script did actually
stop all the processes, and occasionally it may be necessary to forcibly terminate some processes.
Clearly the goal is to return the machine to the state it was in before the application start script was run.
Failure to exit the stop script with a zero return code will stop cluster processing. * Note: This is not the case with start scripts!
Remember, most vendor start/stop scripts are not designed to be cluster proof! A useful tip is to have the stop
and start scripts verbosely log their output in the same format to the /tmp/hacmp.out file. This can be achieved
by including the following line in the header of the script: set -x && PS4="${0##*/}"'[$LINENO] '
A matching stop script skeleton is sketched below.
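Again just a sketch with the same made-up placeholders as the start script above; it asks the application to stop, frees the shared file system with fuser, and always exits 0:

#!/bin/ksh
# Minimal HACMP application stop script skeleton (illustrative sketch only)
set -x && PS4="${0##*/}"'[$LINENO] '

APP_OWNER=oracle                        # placeholder application owner
APP_STOP=/usr/local/bin/dbstop          # placeholder stop command
DATA_FS=/oradata                        # placeholder shared file system

# 1. Ask the application to stop cleanly
su - "$APP_OWNER" -c "$APP_STOP"
sleep 10

# 2. Make sure nothing still holds the shared file system open
if fuser -c "$DATA_FS" 2>/dev/null | grep -q "[0-9]"; then
    fuser -kc "$DATA_FS"                # forcibly terminate leftover processes
fi

# 3. Always exit 0 - a non-zero return code stops cluster event processing
exit 0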


HACMP
HACMP Daemons
HACMP Log files
HACMP Startup and Shutdown
HACMP Version 5.x
What is new in HACMP 5.x
Cluster Communication Daemon
Heart Beating
Forced Varyon of Volume Groups
Custom Resource Group
Application Monitoring
Resource Group Tasks

HACMP Daemon

01. clstrmgr
02. clinfo
03. clmuxpd
04. cllockd

HACMP Log files

/tmp/hacmp.out: It records the output generated by the event scripts as they execute. When checking the /tmp/hacmp.out file, search for EVENT FAILED messages (see the example after this list of log files). These messages indicate that a failure has occurred. Then, starting from the failure message, read back through the log file to determine exactly what went wrong.
The /tmp/hacmp.out file is a standard text file. The system creates a new hacmp.out log file every day and retains the last seven copies. Each copy is identified by a number appended to the file name. The most recent log file is named /tmp/hacmp.out; the oldest version of the file is named /tmp/hacmp.out.7
/usr/es/adm/cluster.log: It is the main HACMP log file. HACMP error messages and messages about HACMP-related events are appended to this log with the time and date at which they occurred
/usr/es/sbin/cluster/history/cluster.mmddyyyy: It contains time-stamped, formatted messages generated by HACMP scripts. The system creates a cluster history file whenever cluster events occur, identifying each file by the file name extension mmddyyyy, where mm indicates the month, dd indicates the day, and yyyy indicates the year.
/tmp/cspoc.log: It contains time-stamped, formatted messages generated by HACMP C-SPOC commands. The /tmp/cspoc.log file resides on the node that invokes the C-SPOC command.
/tmp/emuhacmp.out: It records the output generated by the event emulator scripts as they
execute. The /tmp/emuhacmp.out file resides on the node from which the event emulator is
invoked.
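
When troubleshooting, a quick way to find failed events across the current and rotated copies of hacmp.out, and to watch events live during a failover test:

# grep "EVENT FAILED" /tmp/hacmp.out /tmp/hacmp.out.[1-7]
# tail -f /tmp/hacmp.out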

HACMP Startup and shutdown

HACMP startup option:
Cluster to re-acquire resources: If cluster services were stopped with the forced option, HACMP expects all cluster resources on this node to be in the same state when cluster services are restarted. If you have changed the state of any resources while cluster services were forced down, you can use this option to have HACMP re-acquire resources during startup.
HACMP Shutdown Modes:
Graceful: The local machine shuts itself down gracefully. Remote machines interpret this as a graceful down and do not take over resources.
Takeover: The local machine shuts itself down gracefully. Remote machines interpret this as a non-graceful down and take over resources.
Forced: The local machine shuts down cluster services without releasing any resources. Remote machines do not take over any resources. This mode is useful for system maintenance.

HACMP 5.x

New in HACMP 5.1

• SMIT Standard and Extended configuration paths (procedures)
• Automated configuration discovery
• Custom resource groups
• Non-IP networks based on heartbeating over disks
• Fast disk takeover
• Forced varyon of volume groups
• Heartbeating over IP aliases
• Heartbeating over disks
• Heartbeat monitoring of service IP addresses/labels on the takeover node
• Now there is only HACMP/ES, based on IBM Reliable Scalable Cluster Technology (RSCT)
• Improved security, by using the cluster communication daemon
• Improved performance for cluster customization and synchronization
• GPFS integration
• Cluster verification enhancements
New in HACMP 5.2
• Custom-only resource groups
• Cluster configuration auto-correction
• Cluster file collections
• Automatic cluster verification
• Application startup monitoring and multiple application monitors
• Cluster lock manager dropped
• Resource Monitoring and Control (RMC) subsystem replaces Event Management

HACMP 5.3 Limits

• 32 nodes in a cluster
• 64 resource groups in a cluster
• 256 IP addresses known to HACMP (service and boot IP labels)
• RSCT limit: 48 heartbeat rings

Cluster Communication Daemon

The Cluster Communication Daemon, clcomdES, provides secure remote command execution and HACMP ODM configuration file updates by using the principle of the "least privilege".
The cluster communication daemon (clcomdES) has the following characteristics:
• Since cluster communication does not require the standard AIX "r" commands, the dependency on the /.rhosts file has been removed. Thus, even in "standard" security mode, the cluster security has been enhanced.
• Provides reliable caching mechanism for other node's ODM copies on the local node (the node from which the configuration changes and synchronization are performed).
• Limits the commands which can be executed as root on remote nodes (only the commands in /usr/es/sbin/cluster run as root).
• clcomdES is started from /etc/inittab and is managed by the system resource controller (SRC) subsystem.
• Provides its own heartbeat mechanism, and discovers active cluster nodes (even if cluster manager or RSCT is not running).
• Uses HACMP ODM classes and the /usr/es/sbin/cluster/rhosts file to determine legitimate partners.

Heartbeating

Starting with HACMP V5.1, heartbeating is exclusively based on RSCT topology services
The heartbeat via disk (diskhb) network is a new feature introduced in HACMP V5.1, intended to provide additional protection against cluster partitioning and a simplified non-IP network configuration. This type of network can use any type of shared disk storage (Fibre Channel, SCSI, or SSA), as long as the disk used for exchanging keep-alive (KA) messages is part of an AIX enhanced concurrent volume group. The disks used for heartbeat networks are not exclusively dedicated for this purpose; they can also be used to store application shared data.
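As a rough sketch of preparing a small LUN for a diskhb network (hdisk2 and the VG name hbvg are made-up examples; the diskhb network itself is then defined through smitty hacmp):

# lspv | grep hdisk2              # confirm the disk and its PVID are visible on both nodes
# mkvg -n -C -y hbvg hdisk2       # create an enhanced concurrent capable VG, no automatic varyon
# importvg -n -y hbvg hdisk2      # on the other node, import the VG without varying it on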

Forced varyon of volume groups

HACMP V5.1 provides a new facility, the forced varyon of a volume group option on a node. You should use a forced varyon option only for volume groups that have mirrored logical volumes, and use caution when using this facility to avoid creating a partitioned cluster.
When using a forced varyon of volume groups option in a takeover situation, HACMP first tries a normal varyonvg. If this attempt fails due to lack of quorum, HACMP checks the integrity of the data to ensure that there is at least one available copy of all data in the volume group before trying to force the volume online. If there is, it runs varyonvg -f; if not, the volume group remains offline and the resource group results in an error state.
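The logic is roughly what an administrator would do by hand (sketch only; datavg is a made-up VG name, and HACMP's own data-integrity checks are more thorough):

# varyonvg datavg                 # normal attempt; fails if quorum is lost
# lsvg -p datavg                  # see which physical volumes / mirror copies are missing
# varyonvg -f datavg              # force the VG online only if at least one complete copy of the data is available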

Custom Resource groups

Startup preferences
• Online On Home Node Only: At node startup, the RG will only be brought online on the highest priority node. This behavior is equivalent to cascading RG behavior.
• Online On First Available Node: At node startup, the RG will be brought online on the first node activated. This behavior is equivalent to that of a rotating RG or a cascading RG with inactive takeover. If a settling time is configured, it will affect RGs with this behavior.
• Online On All Available Nodes: The RG should be online on all nodes in the RG. This behavior is equivalent to concurrent RG behavior. This startup preference will override certain fall-over and fall-back preferences.
Fallover preferences
• Fallover To Next Priority Node In The List: The RG will fall over to the next available node in the node list. This behavior is equivalent to that of cascading and rotating RGs.
• Fallover Using Dynamic Node Priority: The RG will fall over based on DNP calculations. The resource group must specify a DNP policy.
• Bring Offline (On Error Node Only): The RG will not fall over on error; it will simply be brought offline. This behavior is most appropriate for concurrent-like RGs.
The settling time specifies how long HACMP waits for a higher priority node (to join the cluster) to activate a custom resource group that is currently offline on that node. If you set the settling time, HACMP waits for the duration of the settling time interval to see if a higher priority node may join the cluster, rather than simply activating the resource group on the first possible node that reintegrates into the cluster.
Fallback preferences
• Fallback To Higher Priority Node: The RG will fall back to a higher priority node if one becomes available. This behavior is equivalent to cascading RG behavior. A fall-back timer will influence this behavior.
• Never Fallback: The resource group will stay where it is, even if a higher priority node comes online. This behavior is equivalent to rotating RG behavior.
A delayed fall-back timer lets a custom resource group fall back to its higher priority node at a specified time. This lets you plan for outages for maintenance associated with this resource group.
You can specify the following types of delayed fall-back timers for a custom resource group:
• Daily
• Weekly
• Monthly
• Yearly
• On a specific date

Application Monitoring

HACMP can also monitor applications in one of the following two ways:
• Application process monitoring: Detects the death of a process, using RSCT event management capability.
• Application custom monitoring: Monitors the health of an application based on a monitoring method (program or script) that you define.
When application monitoring is active, HACMP behaves as follows:
• For application process monitoring, a kernel hook informs the cluster manager that the monitored process has died, and HACMP starts the application recovery process. For the recovery action to take place, you must provide methods to stop and restart the application (the start/stop scripts from the application server definition may be used). HACMP tries to restart the application a specified number of times before sending a notification and/or actually moving the entire RG to a different node (the next node in the priority list).
• For custom application monitoring (custom method), in addition to the cleanup and restart methods, you must also provide a monitoring method (program or script) that is used for performing periodic application tests (a minimal example follows).
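A custom monitor method is simply a program or script that HACMP runs at the configured interval: exit 0 means the application is healthy, and any non-zero exit triggers the configured cleanup/restart/fallover actions. A minimal sketch, with ora_pmon as a made-up process name:

#!/bin/ksh
# Minimal custom application monitor (illustrative sketch only)
# Exit 0  = application healthy
# Exit !0 = HACMP runs the configured recovery actions
if ps -ef | grep -q "[o]ra_pmon"; then
    exit 0
fi
exit 1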

Resource Group Tasks

To list the resource groups configured for a cluster
# cllsgrp
To list the details of a resource group
# clshowres
To bring RG1 offline on Node3
# clRGmove -g RG1 -n node3 -d        (-d for down)
To bring CrucialRG online on Node3
# clRGmove -g CrucialRG -n node3 -u
To check the current resource group status
# clfindres  or  # clRGinfo
To find out the current cluster state and obtain information about the cluster
# cldump
Obtaining information via SNMP from Node: err3qci0...
_____________________________________________________________________________
Cluster Name: erpqa1
Cluster State: UP
Cluster Substate: STABLE
_____________________________________________________________________________

Node Name: err3qci0                State: UP
   Network Name: corp_ether_01     State: UP
      Address: 10.0.5.2       Label: r3qcibt1cp     State: UP
      Address: 10.0.6.2       Label: r3qcibt2cp     State: UP
      Address: 10.253.1.75    Label: sapr3qci       State: UP
   Network Name: prvt_ether_01     State: UP
      Address: 10.0.7.2       Label: r3qcibt1pt     State: UP
      Address: 10.0.8.2       Label: r3qcibt2pt     State: UP
      Address: 192.168.200.79 Label: psapr3qci      State: UP
   Network Name: ser_rs232_01      State:

Node Name: err3qdb0                State: UP
   Network Name: corp_ether_01     State: UP
      Address: 10.0.5.1       Label: r3qdbbt1cp     State: UP
      Address: 10.0.6.1       Label: r3qdbbt2cp     State: UP
      Address: 10.253.1.55    Label: sapr3qdb       State: UP
   Network Name: prvt_ether_01     State: UP
      Address: 10.0.7.1       Label: r3qdbbt1pt     State: UP
      Address: 10.0.8.1       Label: r3qdbbt2pt     State: UP
      Address: 192.168.200.8  Label: psapr3qdb      State: UP
   Network Name: ser_rs232_01      State: UP
      Address:                Label: r3qdb_ser      State: UP

Cluster Name: erpqa1

Resource Group Name: SapCI_RG
Startup Policy: Online On Home Node Only
Fallover Policy: Fallover To Next Priority Node In The List
Fallback Policy: Never Fallback
Site Policy: ignore
Priority Override Information:
Primary Instance POL:
Node                         Group State
---------------------------- ---------------
err3qci0                     ONLINE
err3qdb0                     OFFLINE

Resource Group Name: OraDB_RG
Startup Policy: Online On Home Node Only
Fallover Policy: Fallover To Next Priority Node In The List
Fallback Policy: Never Fallback
Site Policy: ignore
Priority Override Information:
Primary Instance POL:
Node                         Group State
---------------------------- ---------------
err3qdb0                     ONLINE
err3qci0                     OFFLINE

Synchronizing the VG info in HACMP if the cluster is already running (a worked example with sample names follows these steps):
01. On the system where the VG changes are made, break the reserve on the disks using the varyonvg command
# varyonvg -b -u

02. Import the VG on the system where the VG info needs to be updated. Use the -n and -F flags so as not to vary on the VG

# importvg -V -y -n -F

03. Varyon the VG without the SCSI reserves
# varyonvg -b -u

04. Change the VG so that it does not vary on automatically
# chvg -an -Qy


05. Varyoff the VG
# varyoffvg

06. Put the SCSI reserves back in the primary server
# varyonvg
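
Putting the sequence together with sample values (all hypothetical: datavg as the VG, hdisk4 as one of its disks and 100 as its major number; check your own values with ls -l /dev/datavg and lspv first):

On the node where the LVM change was made:
# varyonvg -b -u datavg                     # break the SCSI reserve, leave the VG online
On the other node:
# importvg -V 100 -y datavg -n -F hdisk4    # refresh the VG definition without varying it on
# varyonvg -b -u datavg                     # vary on without SCSI reserves
# chvg -an -Qy datavg                       # no automatic varyon, quorum enabled
# varyoffvg datavg
Back on the first node:
# varyonvg datavg                           # put the SCSI reserve back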


Some useful HACMP Commands
To list all the app servers configured, including their start and stop scripts
# cllsserv
OraDB_APP /usr/local/bin/dbstart /usr/local/bin/dbstop
SapCI_APP /usr/local/bin/sapstart /usr/local/bin/sapstop
To list the application monitoring configured on a cluster
# cllsappmon
OraDB_Mon user
SapCI_Mon user
To get the detailed information about application monitoring
# cllsappmon
# cllsappmon -h OraDB_Mon
#name type MONITOR_METHOD MONITOR_INTERVAL INVOCATION HUNG_MONITOR_SIGNA
STABILIZATION_INTERVAL FAILURE_ACTION RESTART_COUNT RESTART_INTERVAL RESTART_METHOD
NOTIFY_METHOD CLEANUP_METHOD PROCESSES PROCESS_OWNER INSTANCE_COUNT RESOURCE_TO_MONITOR
OraDB_Mon user /usr/local/bin/dbmonitor 30 longrunning 9 180 fallover
1 600 /usr/local/bin/dbstart /usr/local/bin/dbstop
To clear the HACMP logs
# clclear


HACMP Upgrading options
01. Rolling Migration
02. Snapshot Migration
To apply an Online Planning Worksheet configuration file
/usr/es/sbin/cluster/utilities/cl_opsconfig

HACMP Tips I - Files and Scripts

1. Where is the rhosts file located for HACMP ?

Location: /usr/es/sbin/cluster/etc/rhosts
Used By: clcomd daemon to validate the addresses of the incoming connections
Updated By:
It is updated automatically by clcomd daemon during the first connection.
But we should update it manually in case we configure the cluster on an unsecured network.

2. What happened to ~/.rhosts file in the current version of HACMP ?

~/.rhosts is only needed during the migration from pre-5.1 versions of hacmp.
Once migration is completed, we should remove the file if no other applications need rsh.
From HACMP V5.1, inter-node communication for cluster services is handled by clcomd daemon.

3. What is the entry added to /etc/inittab for IP Address Takeover ?

harc:2:wait:/usr/es/sbin/cluster/etc/harc.net # HACMP network startup

4. What is the entry added to the /etc/inittab file due to auto-start of HACMP ?
hacmp:2:once:/usr/es/sbin/cluster/etc/rc.init

5. What is the script used to start cluster services ?

/usr/es/sbin/cluster/etc/rc.cluster

6. rc.cluster calls a script internally to start the cluster services. What is that ?

/usr/es/sbin/cluster/utilities/clstart

7. What is the equivalent script for clstart in CSPOC ?

/usr/es/sbin/cluster/sbin/cl_clstart

8. What is the script used to stop cluster services ?

/usr/es/sbin/cluster/utilities/clstop

9. What is the equivalent script for clstop in CSPOC ?

/usr/es/sbin/cluster/sbin/cl_clstop

10. What happens when the clstrmgr daemon terminates abnormally ?

The /usr/es/sbin/cluster/utilities/clexit.rc script halts the system.
You can change the default behavior of the clexit.rc script by configuring
/usr/es/sbin/cluster/etc/hacmp.term

11. What script is invoked by the clinfo daemon in case of a network or node event ?

/usr/es/sbin/cluster/etc/clinfo.rc





HACMP Tips II - Utility Commands

The utility commands mentioned below are available under /usr/es/sbin/cluster/utilities.
If needed, add this directory to your PATH variable.

1. To list cluster and node topology information :

# cltopinfo (or) cllscf

2. To show the config for the nodes :

# cltopinfo -n

3. To show all networks configured in the cluster :

# cltopinfo -w

4. To show resources defined for all groups :

# clshowres

5. To show resources defined to selected the group :

# clshowres -g

6. To list all resource groups :

# cllsgrp

7. To list all file systems :

# cllsfs

8. To list the service IPs configured for a node :

# cllsip nodename

9. To show the whole cluster configuration :

# cldump

10. To show adapter information :

# cllsif

11. To show network information :

# cllsnw

12. To show the status of resource groups :

# clfindres

13. To list all resources :

# cllsres

14. To list all tape resources :

# cllstape

15. To list all nodes in a cluster :

# cllsnode

16. To list all application servers along with their start and stop scripts :

# cllsserv

17. To list all logical volumes in a cluster :

# cllslv

18. To list all IP networks in a cluster :

# cllsipnw

19. To list all alive network interfaces :

# cllsaliveif
C-SPOC commands are located under /usr/es/sbin/cluster/sbin. If needed, add this directory to your PATH.

1. To create a user in a cluster :

# cl_mkuser

2. To change/set passwd for a user in a cluster :

# cl_chpasswd

3. To change a user's attribute in a cluster :

# cl_chuser

4. To remove a user in a cluster :

# cl_rmuser

5. To list users in a cluster :

# cl_lsuser

6. To create a group in a cluster :

# cl_mkgroup

7. To list groups in a cluster :

# cl_lsgroup

8. To remove a group in a cluster :

# cl_rmgroup

9. To create a shared VG in a cluster :

# cl_mkvg

10. To change the attributes of a shared VG :

# cl_chvg

11. To extend a VG (add a PV to a VG) :

# cl_extendvg

12. To reduce a VG (remove a PV from a VG) :

# cl_reducevg

13. To mirror a VG :

# cl_mirrorvg

14. To unmirror a VG :

# cl_unmirrorvg

15. To list VG's in a cluster :

# cl_lsvg

16. To sync a VG :

# cl_syncvg

17. To import a volume group :

# cl_importvg

18. To import a VG into a list of nodes :

# cl_updatevg

19. To activate/varyon a VG :

# cl_activate_vgs VG_name

20. To deactivate/varyoff a VG :

# cl_deactivate_vgs VG_name

21. To create a LV :

# cl_mklv

22. To change the attributes of a LV :

# cl_chlv

23. To list a LV :

# cl_lslv

24. To remove a LV :

# cl_rmlv

25. To make copies for a LV :

# cl_mklvcopy

26. To remove copies for a LV :

# cl_rmlvcopy

27. To extend a LV :

# cl_extendlv

28. To create a file system in a cluster :

# cl_crfs

29. To create a LV followed by a FS :

# cl_crlvfs

30. To change the attribute of a FS :

# cl_chfs

31. To list file systems :

# cl_lsfs

32. To remove a FS :

# cl_rmfs

33. To show JFS2 file systems with all attributes :

# cl_lsjfs2

34. To list JFS2 filesystems and their resource groups :

# cl_showfs2

35. To activate/mount a file system :

# cl_activate_fs /filesystem_mountpoint

36. To activate/mount a NFS file system :

# cl_activate_nfs retry NFS_Hostname /filesystem_mountpoint

37. To deactivate/unmount a file system :

# cl_deactivate_fs /filesystem_mountpoint

38. To deactivate/unmount a NFS file system :

# cl_deactivate_nfs /filesystem_mountpoint

39. To export(NFS) a file system :

# cl_export_fs hostname /filesystem_mountpoint

40. To list the process numbers using the NFS directory :

# cl_nfskill -u /nfs_mountpoint

41. To kill the processes using the NFS directory :

# cl_nfskill -k /nfs_mountpoint

Here are my Q&A

1. What are the different kinds of failures HACMP will
respond to ?
ANS:
a) Node Failure
b) Network Failure
c) Network Adapter Failure

For other failures, such as disk or application failures, we have to configure
protection separately using LVM, application monitoring scripts, etc.

Be clear that HACMP provides fault resilience and is not fault tolerant
like a mainframe. People can't go for mainframes because of their high cost;
that's the reason they go for HA clusters.


2. List some of Cluster Topology objects?
ANS:
a) Node
b) Network (IP and Non-IP)
c) Network Adapter
d) Physical Volumes


3. List some of Cluster Resources ?
ANS:
a) Application Server
b) Volume Groups
c) Logical Volumes
d) File Systems
e) Service IP Label/Addresses
f) Tape resources
g) Communication Links


4. Does HACMP detect VG mirror failures ? If not, how to make the VG
redundant, or how to find out/sort out the mirror failures ?
ANS: HACMP does not detect VG mirror failures. This has to be implemented using
AIX LVM soft mirroring or on the SAN side.


5. List the steps required to configure a cluster ?
ANS:

a) Plan AIX, HACMP levels, Cluster configuration, network diagram,
etc..

b) Install AIX, fixes

c) Configure AIX
- Storage (Adapters, VG, LV, File Systems)
- Network (IP Interfaces, /etc/hosts, non-IP networks and devices)
- Application Start and stop scripts

d) Install HACMP file sets and fixes in all the cluster nodes. Then
reboot all the nodes in the cluster

e) Configure HACMP Environment
- Topology (Cluster, node names, HACMP IP and non-ip networks)
- Resources (Application Server, Service Label, VG, File System, NFS)
- Resource Groups (Identify name, nodes, policies)

f) Synchronize and test the cluster

g) Tune the system and HACMP based on test result
- syncd frequency
- Basic VMM Tuning
- Failure detection rate
- I/O Pacing

h) Start HACMP Services



6. List out some of the HACMP log files ?
ANS:
a) /usr/es/adm/cluster.log - Messages from scripts and daemons (Date
Time Node Subsystem PID Message)
b) /tmp/hacmp.out - Output from configuration, start and stop event
scripts
c) /usr/es/sbin/cluster/history/cluster.mmddyy
d) /tmp/clstrmgr.debug - Cluster manager activity
e) /tmp/clappmon..log - Application monitor logs
f) /var/ha/log/top*,/var/ha/log/grpsvcs* - RSCT Logs
g) /var/hacmp/clcomd/clcomd.log - communications daemon log
h) /var/hacmp/clverify - Previous successful and unsuccessful
verification attempts
i) /var/hacmp/log/cl_testtool.log - Cluster test tool logs


7. What are the 3 policies related to a resource group ?

ANS:
a) Start up - Online On Home Node Only, Online On First Available
Node, Online Using Node Distribution Policy, Online On All Available
Nodes.

b) Fallover - Fallover To Next Priority Node In The List, Fallover
Using Dynamic Node Priority, Bring Offline (On Error Node Only).

c) Fall back - Fallback To Higher Priority Node In The List, Never
Fallback


8. Expand the following :
ANS:
a) HACMP - High Availability Cluster Multi-Processing
b) RG - Resource Group
c) C-SPOC - Cluster Single Point of Control
d) SPOF - Single Point of Failure
e) ODM - Object Data Manager
f) SRC - System Resource Controller
g) RSCT - Reliable Scalable Cluster Technology


9. How to list the info on heartbeat data ?
ANS:
# lssrc -ls topsvcs


10. How to list out the info on cluster manager and DNP Information ?
ANS: # lssrc -ls clstrmgrES


11. What is the HA daemon that gets started by /etc/inittab ?
ANS: clcomd gets started by init process. It has an entry in /etc/
inittab


12. How will you start cluster services in a node? Give the command as
well as smitty fastpath.
ANS:
To Start Cluster Services:
Command: #/usr/es/sbin/cluster/etc/rc.cluster (Check the options
available for this command)
Smitty Fast Path: clstart

To Stop Cluster Services:
Command: /usr/es/sbin/cluster/utilities/clstop
Smitty Fast Path: clstop


13. How many network adapters are required/recommended in a node
belonging to a cluster ?
ANS:
Minimum 2 network adapters are required per node. This is required to
manage network adapter failure event.


14. For a 2 node cluster (with 1 RG) with 2 N/W adapters for each
node, how many IP Label /Address are required. Give some example ?
ANS:
Let's consider a commonly used cluster configuration.

Cluster cluster_DB with 2 nodes, nodea and nodeb.
Nodea has 2 network adapters with the nodea_boot IP label and the nodea_stdby
IP label on en0 and en1 respectively.
And nodeb has 2 network adapters with nodeb_bootip and nodeb_stdbyip
on en0 and en1 respectively.
This cluster has a VG and a service IP grouped in a resource group.
Minimum 1 service IP is required for an RG.

When we start the RG on nodea, nodea_bootip on en0 will be replaced/
aliased by the service IP, as the check below shows.
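
You can see the result from AIX itself once the RG is online (standard AIX commands, nothing HACMP-specific):

# netstat -in       # with IPAT via aliasing, the service IP appears as an extra address on en0
# ifconfig en0      # with IPAT via replacement, the boot IP is replaced by the service IP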


15. How can we achieve non-ip network (for hearbeat) ?
ANS:
Non-IP networks can be achieved through any of the following ways:
a) Serial/rs232 connection (using /dev/ttyx devices) - widely used in
old clusters
b) Disk-based heartbeat (over an ECM (VG) disk) - widely used in
recent clusters, because people want to eliminate those lengthy serial
cables
c) Target Mode SCSI - not widely used
d) Target Mode SSA - not widely used


16. What are the different ways to achieve IP Address
Takeover ?
ANS:
a) IP Address Takeover via IP Alias
b) IP Address Takeover via IP Replacement


17. Is a non-IP network required for a cluster ? Say Yes/No. Also
justify your answer.
ANS: Yes. To avoid split-brain problem.


18. How many service IP addresses can we have for a single resource
group ?
ANS: Not sure; have to check in the smitty screen.


19. What is the difference between communication interface and
communication device? Also list their usage.
ANS: Don't know how to explain. The lines below should answer:
/dev/en0 is a communication interface, whereas /dev/tty1 is a
communication device.
/dev/en0 is used for an IP network and /dev/tty1 is used for a non-IP
network. I mean tty1 is used only for heartbeat.


20. Persistent IP Label/Address is a floating IP Label. True/False.
Justify your answer
ANS: No. It resides on a single node and does not move to another node.


21. Which of the following IP Label is stored in AIX ODM.

a) Service IP
b) Boot IP
c) Stand-by IP
d) Persistent IP
Ans: Only Boot IP and Stand-by IP are stored in AIX ODM.



22. If we use a SAN disk for heartbeat, what type of VG should it
belong to? Normal, Big, Scalable, or Enhanced Concurrent Mode VG ?
ANS: ECM (Enhanced Concurrent Mode) Volume Group


23. While stopping cluster services, what are the different types of
shutdown modes available? Do justify.
ANS:
a) graceful
b) graceful with takeover
c) forced

24. How will you view the cluster status ?
ANS: #/usr/es/sbin/cluster/clstat


25. How to list out the RG Status?
ANS: #/usr/es/sbin/cluster/utilities/clRGinfo


26. What are the ways to eliminate Single Points of Failure ?
ANS:
a) Node : Using multiple nodes
b) Power Source : Using multiple circuits or uninterruptible power
supplies
c) Network Adapters : Using redundant network adapters
d) Network : Using multiple networks to connect nodes
e) TCP/IP Subsystem : Using non-IP networks to connect adjoining nodes
and clients
f) Disk Adapter : Using redundant disk adapter or multipath hardware
g) Disk : Using multiple disks with mirroring or raid
h) Application : Add node for takeover; configure application monitor
i) Administrator : Add backup or very detailed operations
guide
j) Site : Add additional site

Don't assume that HACMP will eliminate all SPOFs. We have to plan to
eliminate all kinds of SPOFs, including UPS and AC for the data center.


27. What is the max. # of nodes we can configure in a single cluster ?
ANS: Max. we can have 32 nodes in a cluster


28. What is the max. # of resource groups we can configure in a single
cluster ?
ANS: Max. we can have 64 resource groups in a cluster


29. What is the max. # of IP address can be known to a single
cluster ?
ANS: Max. 256 IP addresses/labels can be known to a cluster


30. Which of the following disk technologies are supported by HACMP ?
ANS:
a) SCSI
b) SSA
c) SAN


31. Which command lists the cluster topology ?

ANS: /usr/es/sbin/cluster/utilities/cltopinfo
It's a widely used command by HACMP admins to view the cluster topology
configuration.


32. Which command syncs the cluster ?
ANS: #cldare -rtV normal


33. What is the latest version of HACMP and what versions of AIX it
supports ?
ANS: HACMP 5.4 is the latest version of HACMP. This version supports
only AIX 5.2 and later.


34. How to test the disk heartbeat in a cluster ?

ANS:
To test the disk heartbeat link on nodes A and B, where hdisk1 is the
heartbeat path:
On Node A, #dhb_read -p hdisk1 -r
On Node B, #dhb_read -p hdisk1 -t

If the link is active, you see this message on both nodes:
Link operating normally.


35. List the daemons running for HA cluster.

ANS:
clcomd - Started during boot through /etc/inittab
clstrmgrES - Started during clstart
clsmuxpdES - Started during clstart. This daemon is not available
from HACMP 5.3; SNMP server functions are included in clstrmgrES
itself.
clinfoES - Started during clstart

36. What is the command used to move RG online ?

ANS: cldare and clRGmove
37. Does HACMP work on different operating systems?
Yes. HACMP is tightly integrated with the AIX 5L operating system and System p servers, allowing for a rich set of features which are not available with any other combination of operating system and hardware. HACMP V5 introduces support for the Linux operating system on POWER servers. HACMP for Linux supports a subset of the features available on AIX 5L; however, this multi-platform support provides a common availability infrastructure for your entire enterprise.

38. What applications work with HACMP?
All popular applications work with HACMP including DB2, Oracle, SAP, WebSphere, etc. HACMP provides Smart Assist agents to let you quickly and easily configure HACMP with specific applications. HACMP includes flexible configuration parameters that let you easily set it up for just about any application there is.

39. Does HACMP support dynamic LPAR, CUoD, On/Off CoD, or CBU?
HACMP supports Dynamic Logical Partitioning, Capacity Upgrade on Demand, On/Off Capacity on Demand and Capacity Backup Upgrade.

40. If a server has LPAR capability, can two or more LPARs be configured with unique instances of HACMP running on them without incurring additional license charges?
Yes. HACMP is a server product that has one charge unit: number of processors on which HACMP will be installed or run. Regardless of how many LPARs or instances of AIX 5L that run in the server, you are charged based on the number of active processors in the server that is running HACMP. Note that HACMP configurations containing multiple LPARs within a single server may represent a potential single point-of-failure. To avoid this, it is recommended that the backup for an LPAR be an LPAR on a different server or a standalone server.
41. Does HACMP support non-IBM hardware or operating systems?
Yes. HACMP for AIX 5L supports the hardware and operating systems as specified in the manual where HACMP V5.4 includes support for Red Hat and SUSE Linux.
HACMP - Configuration - Contd

~Go back to config HA commn. Interfaces --> Add commn. Int. --> Add discovered --> select devices --> add devices

~Again Extended topology --> Add persistent IPs --> select node1 -->
select the n/w and persistent IP.

~Let's go back to extended config --> Extended resource config --> config HA service IP label --> add the service IPs (here prod_svc, dev_svc)

~Now let's go back to extended resource config --> HA extended RG config --> Add RG --> give the RG name (here test_rg).

~Now apply the service IPs and VGs we defined in the RG basket.

~Let's go back and select change/show attributes for RG --> select RG (here test_rg) --> select the service IP and VG in their respective fields.

Now let's verify our config.
HACMP - Configuration - Verification

~Go to Extended topology --> extended verification and synchronization --> select verify --> correct errors should be yes --> press enter

~After verification is successful continue with the synch.

~Then go to C-SPOC (smitty cl_admin) --> manage HA services --> start cluster services --> start clinfo daemon should be true (or we can enter smitty clstart).

~To check whether the cluster is running or not, check with lssrc -g cluster. When doing a failover, always check /tmp/hacmp.out with
# tail -f /tmp/hacmp.out
HACMP - Disk HeartBeat

~cd to /usr/sbin/rsct/bin (not needed if path is already added)

~First execute # ./dhb_read -p hdisk2 -r on one node where hdisk2 is your heartbeat disk.

~Then execute # ./dhb_read -p hdisk2 -t on the other node.

~If they are working normally we should get a message Link operating normally.
HACMP Installation--Pre-Installation Tasks
~First do smitty tcpip and configure the communication interfaces en0 and en1 so that we can establish communication between the two nodes.

~Then update the /etc/hosts file with the non-service IPs, persistent IPs and service IPs.

~Install bos.adt, bos.compat and any cluster.* filesets from AIX CD1. Also install the rsct and bos.clvm filesets from CD3 (see the example below).
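A rough sketch of the HACMP fileset installation itself (fileset names vary by HACMP release, so verify what is on your media first):

# installp -ld /dev/cd0 | grep -i cluster       # list the HACMP filesets on the media
# installp -agXY -d /dev/cd0 cluster.es cluster.cspoc
# lslpp -l "cluster.*" "rsct.*"                 # confirm the filesets installed cleanly
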
High Availability and Hardware Availability for HACMP
High availability is sometimes confused with simple hardware availability. Fault tolerant, redundant systems (such as RAID) and dynamic switching technologies (such as DLPAR) provide recovery from certain hardware failures, but do not provide the full scope of error detection and recovery required to keep a complex application highly available.
A modern, complex application requires access to all of these components:
• Nodes (CPU, memory)
• Network interfaces (including external devices in the network topology)
• Disk or storage devices.

Recent surveys of the causes of downtime show that actual hardware failures account for only
a small percentage of unplanned outages. Other contributing factors include:
• Operator errors
• Environmental problems
• Application and operating system errors.

Reliable and recoverable hardware simply cannot protect against failures of all these different
aspects of the configuration. Keeping these varied elements—and therefore the application—highly available requires:

• Thorough and complete planning of the physical and logical procedures for access and operation of the resources on which the application depends. These procedures help to avoid failures in the first place.
• A monitoring and recovery package that automates the detection and recovery from errors.
• A well-controlled process for maintaining the hardware and software aspects of the cluster configuration while keeping the application available.
HACMP - Configuration

~smitty hacmp --> extended configuration --> extended topology --> config a HACMP cluster --> Add/change/show a HA cluster

~Here give the cluster name (here test_cl).

~Press F3 to go back

~Again Extended topology --> config a HA node --> Add a node to a HA cluster

~Here give a node name (here aix1) and a communication path (here aix1_nsvc1).

~Similarly add another node.
Add a n/w config HA n/w Again Extended topology
add ether,rs232 and then SCSI one by one.

Discover devices....hacmp attempts to discover other devices based on the info provided by us.~Now go back to extended config

~Again Extended topology --> config HA commn., Interfaces --> Add a commn., Interface --> Add discovered --> commn Int., -->select All --> select en0,en1
