Tanti Technology

Bangalore, Karnataka, India
Multi-platform UNIX systems consultant and administrator in mutualized and virtualized environments, with 4.5+ years of experience in AIX system administration. This site aims to help system administrators in their day-to-day activities, and your comments on posts are welcome. The blog is all about the IBM AIX flavour of UNIX: it is written for system admins who use AIX in their work life, and also for newbies who want to get certified in AIX administration. It will be updated frequently to help system admins and other new learners. DISCLAIMER: the blog owner takes no responsibility of any kind for any data loss or damage caused by trying any command or method mentioned in this blog; you use the commands/methods/scripts at your own responsibility. If you find something useful, a comment would be appreciated to let other viewers know that the solution/method worked for you.

Monday 14 April 2014

Perform a NIM operation without booting the NIM client machine into SMS mode



If you are not sure which Ethernet adapter to select during a migration, or during any operation where you have to pick an adapter for the ping test or to start an installation over the network, you can use the procedure below.




1. Find out the gateway of the client server using the command below:

netstat -nr
Destination         Gateway
default             x.x.x.x   --> gateway of the client

2.  In "ifconfig -a" o/p, select the interface/adapter(entX) which has an IP, which is reachable from your laptop.(which you use to do network boot)

3. Set the bootlist so that after the reboot the server boots through the network:

bootlist -m normal entX gateway=<client gateway> bserver=<NIM server IP> client=<client IP>
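
For illustration, with hypothetical addresses (client 10.10.10.21, gateway 10.10.10.1, NIM master 10.10.20.5) on adapter ent0, the command would look like this:

# all IP addresses below are examples only
bootlist -m normal ent0 gateway=10.10.10.1 bserver=10.10.20.5 client=10.10.10.21

# verify what the next normal-mode boot will use
bootlist -m normal -o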

Advantages:

1. You will not end up shutting down the wrong server from the HMC GUI.
2. You can skip the SMS ping test, where people often enter wrong IP details.
3. It also helps if you have little experience with the SMS menus.

Server hung at LED B700F120



An AIX server was migrated from AIX 5.3 to AIX 6.1 on a 9117-MMB frame, and the server hung at LED B700F120.


B700F120

Explanation

Platform firmware detected an error

 

Response

The platform is unable to support the architecture options requested by the operating system via the ibm,client-architecture-support interface.

 

Solution

6100-01 is not supported on a 9117-MMB, so if you are installing the AIX operating system on a 9117-MMB frame, make sure you use one of these levels:

AIX 6.1 with the 6100-04 Technology Level and Service Pack 3, or later.
AIX 6.1 with the 6100-03 Technology Level and Service Pack 5, or later.
AIX 6.1 with the 6100-02 Technology Level and Service Pack 8, or later.

If you have used an lpp_source and SPOT at the 6100-01 level, update them to one of the TLs mentioned above and then perform the migration.

Doing so avoids the server hanging at this LED during the reboot after the OS migration.
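
As a rough sketch of how to check and fix this from the NIM master (the resource names lpp6104 and spot6104 and the source directory are hypothetical):

# check the current level of the SPOT (resource names are examples)
lsnim -l spot6104

# add the TL/SP filesets to the lpp_source, then update the SPOT from it
nim -o update -a packages=all -a source=/export/aix/6100-04-03 lpp6104
nim -o cust -a lpp_source=lpp6104 -a fixes=update_all spot6104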

How to list files in alphabetical order in a directory



We know how to list files sorted by time using the long-listing options, but how do you list them in alphabetical order?

Here is the simple command to do that :



cd to the directory and then:

#  ls -ltr | awk 'NF >= 9 { print $9 }' | sort -d

This lists the file names in alphabetical order (the NF test skips the "total" line that ls -l prints first).
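
A shorter pipeline gives the same listing, since ls -1 prints one name per line and sort -d sorts in dictionary order:

ls -1 | sort -d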

find oslevel of the clone rootvg



We all know how to find the oslevel of a running server, but finding the oslevel of a cloned rootvg that exists on a disk is just as simple.

Using the blvset command below, you can read the oslevel from the cloned VG/disk.





/usr/lpp/bosinst/blvset -d /dev/hdisk0  -g level

      where hdisk0 is part of the cloned rootvg, as shown below:

# lspv | grep rootvg
 hdisk0          00z301dce111df74                    altinst_rootvg
 hdisk1          00c503d45e50dr04                    rootvg          active
  • If you need a few more details about the cloned disk, the command below is useful. To determine the TL level, check the timestamp of when the clone was taken against when the TL or SP upgrades happened on the server, and deduce the TL level of the clone from that.
 /usr/lpp/bosinst/blvset -d /dev/hdisk0  -g menu
locale: C C C C C C
console:
blvname: hd5
targ_dev: /dev/hdisk0
root_fsattr: hd4:jfs2:hd8
timestamp: Fri Jun 28 12:57:06 2013
padstring: 7.1 pad string:@%$#~!~~!~#$%@
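
If more than one disk might hold a clone, a small loop (disk names are only examples) reports the level on each:

# print the boot-image oslevel stored on each candidate disk
for d in hdisk0 hdisk1
do
    echo "$d: `/usr/lpp/bosinst/blvset -d /dev/$d -g level`"
done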

DLPAR not working from the HMC



Usually, when DLPAR fails with the error below, the problem is with the RMC connection, which depends on the RSCT daemons.

Error :

"A RMC network connection to the partition is not present. Verify the network settings on the partition and the HMC. If you select OK you will have to restart the partition for the resource changes to take effect."




Solution :

The easiest fix to get DLPAR working again is to restart RMC along with the RSCT daemons using the three commands below, then wait about 5 minutes before testing the DLPAR capability again.

# /usr/sbin/rsct/bin/rmcctrl -z    (stop the RMC subsystem and resource managers)
# /usr/sbin/rsct/bin/rmcctrl -A    (add the RMC subsystem to /etc/inittab and start it)
# /usr/sbin/rsct/bin/rmcctrl -p    (enable remote client connections)


# lssrc -a | grep rsct
 
  - IBM.DRM should be in the active state
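
You can also query just that subsystem directly:

# show only the dynamic reconfiguration resource manager
lssrc -s IBM.DRM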

Go back to the HMC restricted shell command prompt:

# lspartition -dlpar

The partition should show the correct hostname and IP, Active <1>, and a DCaps value such as 0x3f. These values mean the partition is capable of a DLPAR operation.

If the above does not work even after waiting 5 to 10 minutes, execute the commands above followed by the recfgct command below.

 
/usr/sbin/rsct/install/bin/recfgct

Wait a while and try again. If the problem is with RMC/RSCT, this will fix it; otherwise the cause may be a firewall between your HMC and the server.

How to delete a file using inode


If a file name contains invalid characters, it can be difficult to delete the file by name, so the best option is to delete it by its inode number.

Below is the command for the same:

 # find /directory -inum [inode-number] -exec rm -i {} \;
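
A typical run (the directory and inode number are purely illustrative): first read the inode with ls -li, then pass it to find:

# list inodes; note the number shown next to the problem file (say 1234)
ls -li /directory

# delete by inode; rm -i asks for confirmation before removing
find /directory -inum 1234 -exec rm -i {} \;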

Script to perform FTP in background



File transfers from Fix Central down to your servers usually take a lot of time, and for the whole period you have to make sure your network session does not disconnect.

To avoid this problem, we can use a script that runs the transfer in the background, so there is no need to babysit the FTP session.


Here is the script:




#!/bin/sh
USERNAME="anonymous"
PASSWORD="anonymous"
SERVER="FTP SITE SERVER NAME"

# remote directory that holds the files to download
FILE="/ecc/hsb/H60987058"

# log in non-interactively; binary mode protects *.tar.gz files
# from ASCII conversion during the transfer
ftp -n -i "$SERVER" <<EOF
user $USERNAME $PASSWORD
binary
cd $FILE
mget *.*
quit
EOF


- Just put these lines in a file and execute it in the background as below.

Save the content to the file  --> auto-ftp.ksh
Command:   chmod +x auto-ftp.ksh && nohup ./auto-ftp.ksh > /tmp/ftp.out &

- Note: the command has to be executed from the directory where you want the downloaded files to land!

How to increase the queue depth on a VIO client


To change the queue_depth attribute on an hdisk device, the disk must first be free of I/O operations, so:



1. Stop I/O on the device: unmount all of its filesystems, then either
- varyoffvg <vgname>
- or rmdev -l hdiskX


2. Change the queue_depth:

- chdev -l hdiskX -a queue_depth=<value>

- then varyonvg <vgname> (or cfgmgr -l hdiskX) to bring the disk back online.

3. Alternatively, change only the ODM with the -P option and reboot:

- chdev -l hdiskX -a queue_depth=20 -P

     shutdown -Fr
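
Putting the steps together (hdisk2, datavg, and /data are hypothetical names):

# check the current value
lsattr -El hdisk2 -a queue_depth

# quiesce the disk, change the attribute, and bring everything back
umount /data
varyoffvg datavg
chdev -l hdisk2 -a queue_depth=20
varyonvg datavg
mount /data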

0516-404 allocp: This system cannot fulfill the allocation



Issue: volume group mirroring fails with an error.

Error :

0516-404 allocp: This system cannot fulfill the allocation request. There are not enough free partitions or not enough physical volumes to keep strictness and satisfy allocation requests. The command should be retried with different allocation characteristics.
0516-1517 mklvcopy: Failed to create a valid partition allocation.
0516-842 mklvcopy: Unable to make logical partition copies for logical volume.
0516-1199 mirrorvg: Failed to create logical partition copies for logical volume.
0516-1200 mirrorvg: Failed to mirror the volume group.



In such cases, the UPPER BOUND of the LVs has to be considered.

- Find it with lslv <lvname>: the UPPER BOUND value should be at least equal to the number of disks in the VG; if it is not, you get the error above.

#  lslv testlv
LOGICAL VOLUME:    testlv                 VOLUME GROUP:   testvg
LV IDENTIFIER:      00c502df00004c00000001233479719d.6    PERMISSION:     read/write
VG STATE:           active/complete          LV STATE:       opened/syncd
TYPE:               jfs2                      WRITE VERIFY:   off
MAX LPs:            512                    PP SIZE:        128 megabyte(s)
COPIES:             1                         SCHED POLICY:   parallel
LPs:                296                         PPs:            296
STALE PPs:          0                       BB POLICY:      relocatable
INTER-POLICY:       minimum                RELOCATABLE:    yes
INTRA-POLICY:       middle                 UPPER BOUND:    128  --> Min = number of disks in the VG; Max depends on the VG type
MOUNT POINT:        /testfs             LABEL:          /testfs
DEVICE UID:         0                      DEVICE GID:     0
DEVICE PERMISSIONS: 432
MIRROR WRITE CONSISTENCY: on/ACTIVE
EACH LP COPY ON A SEPARATE PV ?: yes
Serialize IO ?:     NO
INFINITE RETRY:     no




Also, whenever you add disks to a VG, check the UPPER BOUND value of every LV in the volume group: it must be at least equal to the number of disks in the VG, and its maximum depends on the VG type (i.e. the maximum number of disks that can be part of that kind of VG).


To change the UPPER BOUND value of an LV:

chlv -u <new value> lvname

In simple words, as long as each LV's upper bound is at least equal to the number of disks, all the LV-based operating system commands (mklvcopy, mirrorvg, and so on) will work without any issues.
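
A quick sketch with hypothetical names: suppose testvg now holds 4 disks but testlv still has a lower upper bound; raise it before mirroring:

# example names; testvg is assumed to contain 4 physical volumes
lslv testlv | grep "UPPER BOUND"
# raise the upper bound to the number of disks, then retry the mirror
chlv -u 4 testlv
mirrorvg testvg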

Understanding the test (t) factor



Test factor

When adding any disk to an existing volume group on an AIX server (typically extending the VG), every newly added disk will host:

(size of disk in MB) / (PP size in MB) = number of PPs (desired)

If this number is greater than the 1016-per-disk limit, we will need to change the t-factor.


If the condition above is met and a disk would hold more than 1016 PPs, the test (t) factor comes into the picture, so we first need to work out the t-factor value to use.


Formula for calculating the factor in chvg -t:

factor * 1016 = desired number of PPs on the new disk
factor = number of PPs / 1016  --> always round this value up

Use the value obtained in the chvg command:

# chvg -t <factor> vgname

Example:


Consider a VG on an AIX server with the details below.

1. TOTAL PPs = 511
2. PP Size = 32 MB
3. Number of PVs = 1 and
4. The size of the disk that exists part of the volume group is 16384 MB.
5. Now I want to add a disk of size 32768 MB.

  • First, find how many PPs the new disk will host:
32768 / 32 = 1024 --> this is more than the 1016 limit.
  • Now calculate the test (t) factor:
        factor * 1016 = 1024
        factor = 1024 / 1016 = 1.007874..., which rounds up to 2.
So use the value 2 in the chvg command: chvg -t 2 vgname

Note: increasing the t-factor decreases the number of PVs you can have in the volume group.
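
Applied to the example above (datavg and hdisk3 are hypothetical names):

# allow up to 2 * 1016 = 2032 PPs per disk, then add the larger disk
chvg -t 2 datavg
extendvg datavg hdisk3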

PowerHA migration


A detailed PowerHA migration document.

If you have any problems or need help performing a PowerHA migration, refer to the procedure below, which clearly illustrates each step.

PROCEDURE :


Prework (at least 2 days before the actual migration):
(1) Check that the current version of HACMP is up and the cluster is stable:
# odmget HACMPcluster
# lssrc -ls clstrmgrES | grep state
(2) Verify the existing cluster and correct any errors found.
(3) Take a mksysb backup and also save a copy of the important files below (a copy sketch follows this prework list):
    OS files:
        /.rhosts
        /etc/hosts
        /etc/exports
        /etc/inittab
    Cluster files:
        /usr/es/sbin/cluster/netmon.cf
        /usr/es/sbin/cluster/etc/exports
        /usr/es/sbin/cluster/etc/rhosts
        /tmp/hacmp.log


 

(4) Take a cluster snapshot and save it locally in /tmp, with another copy in a safe place such as the NIM server.
(5) Clone the rootvg of each server onto a spare disk (map an extra LUN for the clone if none is present) to ensure a safe backout; if any issue comes up, the server can be booted from this disk. See the alt_disk_copy sketch below.
(6) Download all the prerequisite filesets for the HA migration.
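
For step (3), one simple way to stash the files while preserving their paths (the backup directory name is an example):

#!/bin/sh
# copy the pre-migration files into /tmp/ha_premig (path is an example)
BACKUP=/tmp/ha_premig
for f in /.rhosts /etc/hosts /etc/exports /etc/inittab \
         /usr/es/sbin/cluster/netmon.cf \
         /usr/es/sbin/cluster/etc/exports \
         /usr/es/sbin/cluster/etc/rhosts \
         /tmp/hacmp.log
do
    [ -f "$f" ] || continue        # skip files absent on this node
    dir=`dirname $f`
    mkdir -p "$BACKUP$dir"         # preserve the directory structure
    cp -p "$f" "$BACKUP$f"
done

For step (5), the usual tool is alt_disk_copy (hdisk1 is a hypothetical spare disk); the -B flag keeps the bootlist pointing at the current rootvg:

# clone the running rootvg to the spare disk without touching the bootlist
alt_disk_copy -B -d hdisk1
# the clone should now appear as altinst_rootvg
lspv | grep rootvg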
Migration Procedure
Let's assume one pair of participating cluster nodes, node1 and node2.
Step 1: Stop Cluster Services on a Node Hosting the Application (node1)
Stop cluster services on node1 using the graceful-with-takeover option (that is, stopping cluster services and moving the resource groups to the other node, node2):
1. Enter smit hacmp
2. Use the System Management (C-SPOC) > Manage HACMP Services > Stop Cluster Services SMIT menu to stop cluster services.
3. Select takeover for Shutdown mode.
4. Select local node only and press Enter.
The resource group containing the MQ application should now fall over to node2; verify this by:
1] running /usr/es/sbin/cluster/utilities/clRGinfo
2] executing the hostname command
3] checking with the application team that the application is running and all MQ channels are available
Step 2: Install the HACMP software
On node1, install the HA 6.1 filesets, which convert the previous HACMP configuration database (ODMs) to the new format. The installation process uses the cl_convert utility and creates the /tmp/clconvert.log file (any previously created version of the file is overwritten).
To install the HACMP software:
1. cd to the directory that contains the filesets and enter smit install
2. In SMIT, select Install and Update Software > Update Installed Software to Latest Level (Update All) and press Enter.
3. Enter the values for Preview only? and Accept new license agreements?; for all other fields, keep the defaults:
   Preview only? --> No
   Accept new license agreements? --> Yes
4. Press Enter.

Step 3: Start Cluster Services on the Upgraded Node
Start cluster services on node1.
To start cluster services on a single upgraded node:
1. Enter smit clstart
2. Enter the field values as follows and press Enter:
   Start now, on system restart or both --> now
   Start Cluster Services on these nodes --> local node (default)
   Manage Resource Groups Automatically/Manually --> Automatically
   (HACMP brings the resource groups online according to their configuration settings and the current cluster state, and starts monitoring the resource groups and applications for availability.)
   BROADCAST message at startup? --> false
   Startup Cluster Information Daemon? --> true
   Ignore verification errors? --> false
   Automatically correct errors found during cluster start? --> the value does not matter at this point, since it is a mixed-version cluster.
Note: verification is not supported on a mixed-version cluster. Run verification only when all nodes have been upgraded.
Step 4: Repeat the same steps on node2
Step 5: Verifying the Upgraded Cluster Definition
After the HACMP 6.1 software is installed on all of the nodes in the cluster and cluster services are restored, verify and synchronize the cluster configuration. Verification ensures that the cluster definition is the same on all nodes. You can verify and synchronize a cluster only when all nodes are running the same version of the software.
 
To verify the cluster:
1. Enter smit hacmp
2. In SMIT, select Extended Configuration > Extended Verification and Synchronization > Verify Changes only and press Enter.


Verifying Software Levels Installed Using AIX Commands
Run the commands lppchk -v and lppchk -c "cluster.*"
Both commands return nothing if the installation is OK.
Perform another failover to check resource group movement after the upgrade on both nodes.
How to Back Out if Anything Goes Wrong
Boot the server from the clone disk in case of any issue.
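
A minimal backout sketch, assuming the clone from prework step (5) lives on hdisk1:

# point the next boot at the clone disk and reboot
bootlist -m normal hdisk1
shutdown -Fr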