After an AIX migration everything seems to be fine. However, «lppchk -v» shows an error like the one below:
# lppchk -v
lppchk: The following filesets need to be installed or corrected to bring
the system to a consistent state:
rsct.core.rmc v=2, r<5 (not installed; requisite fileset)
The error description does not help much: it does not show which fileset's dependencies actually violate the consistency of the package database. However, we can search the ODM for filesets with such a dependency:
# odmget product | fgrep -p 'rsct.core.rmc v=2 r<5'
product:
lpp_name = "sam.core.rte"
comp_id = ""
update = 0
cp_flag = 275
fesn = ""
name = "sam.core"
state = 5
ver = 3
rel = 2
mod = 0
fix = 0
ptf = ""
media = 3
sceded_by = ""
fixinfo = ""
prereq = "*prereq rsct.core.utils 2.4.13.1\n\
*prereq rsct.core.rmc v=2 r<5\n\
*prereq rsct.basic.rte 2.4.13.1\n\
"
description = "SA CHARM Runtime Commands"
supersedes = ""
Conclusion: The fileset «sam.core.rte» has such a dependency. If you run into such a problem, consider updating the fileset causing the error, or check whether the fileset is needed at all.
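If the fileset turns out to be unnecessary, it can be deinstalled with installp. A minimal sketch using the fileset from the example above (run the preview with -p first to see what would be removed):
# installp -pu sam.core
# installp -u sam.core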
Storage Management
1. Is it possible to increase the maximum number of PPs beyond 1016?
If you want to integrate a new and larger disk into an existing Volume Group you might run into problems with the maximum number of PPs on one Physical Volume. The reason is that when creating a new Volume Group the PP size is often set to the smallest possible value. The number of PPs per PV of a standard Volume Group is limited to 1016. What to do?
You can use chvg -t to increase the number of PPs by a factor of 2, 4, 16, or 32:
# chvg -t 2 rootvg
With the above command you increase the maximum number of PPs per PV in the rootvg to 2032. But be aware that you decrease the maximum number of PVs (hdisks) per VG by the same factor. In this example the rootvg cannot contain more than 16 PVs.
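To verify the new limits you can query the volume group with lsvg; the exact field names may vary slightly between AIX levels:
# lsvg rootvg | grep -i max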
2. How can I figure out if a fibre channel card is linked to a switch port?
Check the status of the FC SCSI I/O Controller Protocol Device:
The below example shows the status of the FC SCSI I/O Controller Protocol Device of the first fibre channel adapter if the system is not connected to the switch (cable is present, but switch port not configured) - attach: none, no SCSI ID:
# lsattr -El fscsi0
attach none How this adapter is CONNECTED False
dyntrk no Dynamic Tracking of FC Devices True
fc_err_recov delayed_fail FC Fabric Event Error RECOVERY Policy True
scsi_id Adapter SCSI ID False
sw_fc_class 3 FC Class for Fabric True
... and this is how it looks if the card is connected to the switch:
# lsattr -El fscsi1
attach switch How this adapter is CONNECTED False
dyntrk no Dynamic Tracking of FC Devices True
fc_err_recov delayed_fail FC Fabric Event Error RECOVERY Policy True
scsi_id 0x610100 Adapter SCSI ID False
sw_fc_class 3 FC Class for Fabric True
... and this is how it looks if there is no cable to a switch at all:
# lsattr -El fscsi1
attach al How this adapter is CONNECTED False
dyntrk no Dynamic Tracking of FC Devices True
fc_err_recov delayed_fail FC Fabric Event Error RECOVERY Policy True
scsi_id 0x610100 Adapter SCSI ID False
sw_fc_class 3 FC Class for Fabric True
al means Arbitrated Loop. You get this if there is no cable plugged into the fibre channel card. But you also get it if the system is directly attached to a storage box (e.g. FAStT). In the latter case there is nothing wrong if you see attach: al.
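To check the attach mode of all FC protocol devices at once, a small loop works (a sketch assuming the usual fscsiN naming):
# for f in $(lsdev -C | awk '/^fscsi/ { print $1 }'); do
>   echo "$f: $(lsattr -El $f -a attach -F value)"
> done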
3. How can I create a dummy disk to reserve an hdisk number?
Below you find a situation where the next LUN that is mapped to your system would get the hdisk number 0 (hdisk0):
# lsdev -Cc disk
hdisk1 Available 06-08-00-4,0 16 Bit LVD SCSI Disk Drive
hdisk2 Available 06-08-00-5,0 16 Bit LVD SCSI Disk Drive
To avoid this you could reserve hdisk0 for a dummy disk, e.g.:
# mkdev -l hdisk0 -c disk -t osdisk -s scsi -p scsi0 -w 0,10 -d
hdisk0 defined
Now we see hdisk0 as defined:
# lsdev -Cc disk
hdisk0 Defined 06-08-00-0,10 Other SCSI Disk Drive
hdisk1 Available 06-08-00-4,0 16 Bit LVD SCSI Disk Drive
hdisk2 Available 06-08-00-5,0 16 Bit LVD SCSI Disk Drive
... and the next LUN would be mapped to hdisk3.
Unfortunately this trick only works on systems with a SCSI controller assigned. With AIX 5.3 you still have the option to create a dummy SSA disk:
# mkdev -l hdisk0 -p ssar -t hdisk -w dummy
mkdev: 0514-519 The following device was not found in the customized
device configuration database:
name='ssar'
Don't be confused by the error - we have an hdisk0 now:
# lsdev -Cc disk
hdisk0 Defined SSA Logical Disk Drive
hdisk1 Available 06-08-00-4,0 16 Bit LVD SCSI Disk Drive
hdisk2 Available 06-08-00-5,0 16 Bit LVD SCSI Disk Drive
This complicated procedure is no longer needed since AIX 7.1 and AIX 6.1 TL6 - the new rendev command is available:
# lspv
hdisk0 00c8b12ce3c7d496 rootvg active
hdisk1 00c8b12cf28e737b None
# rendev -l hdisk1 hdisk99
# lspv
hdisk0 00c8b12ce3c7d496 rootvg active
hdisk99 00c8b12cf28e737b None
4. How can I directly read out the VGDA of a PV (hdisk)?
Information about VGs, LVs, filesystems, etc. is stored in the ODM. But this information is also written to the VGDA on the disks themselves. You can read it directly from a disk's VGDA with a command like this:
# lqueryvg -Atp hdisk100
You can use
# redefinevg -d hdisk100 myvg
to synchronize the ODM with the information from the VGDA. You can also synchronize the VGDA with the information stored in the ODM:
# synclvodm myvg
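Related: the PVID stored in a disk's header (at offset 0x80) can be dumped directly with lquerypv:
# lquerypv -h /dev/hdisk100 80 10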
5. How can I unlock a SAN disk?
Finally I got my LUN mapped to my system, but when I try to create my Volume Group with mkvg -f vpath100 all I get is an I/O error. What can I do?
Probably there is still a SAN lock on the disk. For vpath devices try to unlock it with:
# lquerypr -ch /dev/vpath100
and retry creating your Volume Group. If you use the newer sddpcm driver, the command to unlock would be:
# pcmquerypr -ch /dev/hdisk100
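Before clearing the reservation you may want to query its state first - a sketch, check the sddpcm documentation of your level for the exact flags:
# pcmquerypr -vh /dev/hdisk100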
6. How can I identify a generic SCSI disk for replacement?
To identify a SCSI disk (attached to a hot swap enclosure) with AIX you can use diag to make its LED blink:
# diag
Then select
> Task Selection (Diagnostics, Advanced Diagnostics, Service Aids, etc.)
> Hot Plug Task
> SCSI and SCSI RAID Hot Plug Manager
> Identify a Device Attached to a SCSI Hot Swap Enclosure Device
You see the following screen providing you with a list of hdisks. Select the one you need to identify:
IDENTIFY DEVICE ATTACHED TO SCSI HOT SWAP ENCLOSURE DEVICE
The following is a list of devices attached to SCSI Hot Swap Enclosure devices.
Selecting a slot will set the LED indicator to Identify.
Make selection, use Enter to continue.
ses2 U0.1-P1-I1/Z1-Af
slot 1 P1-I1/Z1-A8 hdisk2
slot 2 P1-I1/Z1-A9 hdisk3
slot 3 P1-I1/Z1-Aa hdisk4
slot 4 P1-I1/Z1-Ab hdisk5
slot 5 P1-I1/Z1-Ac hdisk6
slot 6 P1-I1/Z1-Ad hdisk7
slot 7 P1-I1/Z1-Ae hdisk8
ses3 U0.1-P1-I5/Z1-Af
slot 1 P1-I5/Z1-A0 hdisk9
slot 2 P1-I5/Z1-A1 hdisk10
slot 3 P1-I5/Z1-A2 hdisk11
slot 4 P1-I5/Z1-A3 hdisk12
slot 5 P1-I5/Z1-A4 hdisk13
slot 6 P1-I5/Z1-A5 hdisk14
slot 7 +------------------------------------------------------+
| |
| The LED should be in the Identify state for the |
| selected device. |
| |
| Use 'Enter' to put the device LED in the |
| Normal state and return to the previous menu. |
| |
| F3=Cancel F10=Exit Enter |
+------------------------------------------------------+
F1=Help F10=Exit
If you already removed the hdisk with the rmdev command you would still see the slot in the above screen but no device name.
7. How can I change the name of a tape device?
You can rename a tape device (i.e. rmtX or smcX) easily with chdev. For example, if you want to rename rmt0 to rmt201 just type:
# chdev -l rmt0 -a new_name=rmt201
rmt0 changed
Please note: It only works with tapes! This is because IBM defined a special attribute new_name in the ODM only for tape drives and media changers.
Update: AIX 7.1 and AIX 6.1 TL6 introduced a new command, rendev, that can be used to rename any device. The below command would rename ent0 to ent99:
# rendev -l ent0 -n ent99
8. How can I find all hdisks containing an AIX boot signature?
# ipl_varyon -i
PVNAME BOOT DEVICE PVID VOLUME GROUP ID
hdisk0 YES 00f64183e8ff11c50000000000000000 00f6418300004c00
hdisk1 NO 00f6418384f345d00000000000000000 00f6418300004c00
hdisk2 NO 00f6418384f346210000000000000000 00f6418300004c00
hdisk3 NO 00f6418384f3466c0000000000000000 00f6418300004c00
hdisk4 NO 00f6418384f346b00000000000000000 00f6418300004c00
hdisk5 NO 00f6418384f346f20000000000000000 00f6418300004c00
hdisk6 NO 00f6418384f44fca0000000000000000 00f6418300004c00
hdisk7 NO 00f6418384f450150000000000000000 00f6418300004c00
hdisk8 NO 00f6418384f450540000000000000000 00f6418300004c00
hdisk9 NO 00f6418384f4508f0000000000000000 00f6418300004c00
hdisk10 NO 00f6418384f450ca0000000000000000 00f6418300004c00
hdisk11 NO 00f6418384f347390000000000000000 00f6418300004c00
hdisk12 NO 00f6418384f450ff0000000000000000 00f6418300004c00
Conclusion: Only hdisk0 contains a boot signature.
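If a mirrored rootvg disk is missing the boot signature, you can recreate the boot image and extend the bootlist with the standard commands:
# bosboot -ad /dev/hdisk1
# bootlist -m normal hdisk0 hdisk1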
9. How can I see statistics of an HBA?
Use the fcstat command on the FC adapter:
# fcstat fcs0
The command gives a whole page of output, not shown here, with statistics similar to those of the entstat command. If you are only interested in the port speed, you could type:
# fcstat fcs0 | grep 'Port Speed'
Port Speed (supported): 8 GBIT
Port Speed (running): 4 GBIT
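A similar filter gives a quick look at the error counters (the exact counter names vary between adapter types):
# fcstat fcs0 | grep -i error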
10. How can I find WWPNs of FC adapters from the SMS menu?
It is possible to find the WWPNs at the Open Firmware prompt - at least on recent hardware. From the HMC, boot the LPAR into the Open Firmware prompt and issue the ioinfo command:
1 = SMS Menu 5 = Default Boot List
8 = Open Firmware Prompt 6 = Stored Boot List
Memory Keyboard Network SCSI Speaker
0 > ioinfo
!!! IOINFO: FOR IBM INTERNAL USE ONLY !!!
This tool gives you information about SCSI,IDE,SATA,SAS,and USB devices attached to the system
Select a tool from the following
1. SCSIINFO
2. IDEINFO
3. SATAINFO
4. SASINFO
5. USBINFO
6. FCINFO
7. VSCSIINFO
q - quit/exit
==> 6
FCINFO Main Menu
Select a FC Node from the following list:
# Location Code Pathname
---------------------------------------------------------------
1. U5877.001.0082113-P1-C10-T1 /pci@80000002000012b/fibre-channel@0
2. U5877.001.0082113-P1-C10-T2 /pci@80000002000012b/fibre-channel@0,1
3. U5877.001.0082924-P1-C10-T1 /pci@80000002000013b/fibre-channel@0
4. U5877.001.0082924-P1-C10-T2 /pci@80000002000013b/fibre-channel@0,1
q - Quit/Exit
==> 1
FC Node Menu
FC Node String: /pci@80000002000012b/fibre-channel@0
FC Node WorldWidePortName: 10000000c9d08fd0
-----------------------------------------------------------------
1. List Attached FC Devices
2. Select a FC Device
3. Enable/Disable FC Adapter Debug flags
q - Quit/Exit
Conclusion: The WWPN of the first port of the first FC adapter is 10000000c9d08fd0. On a running AIX system you would find the same information with:
# lscfg -vpl fcs0 | grep 'Network Address'
Network Address.............10000000C9D08FD0
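To list the WWPNs of all FC adapters in one go, a loop over the fcsN devices does the job (a sketch assuming the usual naming):
# for a in $(lsdev -C | awk '/^fcs[0-9]/ { print $1 }'); do
>   echo "$a: $(lscfg -vl $a | grep 'Network Address')"
> done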
11. How can I check what qdepth the kernel actually uses for a specific LUN?
Setting the qdepth with chdev is as easy as reading it back with lsattr:
# chdev -l hdisk100 -a queue_depth=8
hdisk100 changed
# lsattr -El hdisk100 -a queue_depth
queue_depth 8 Queue DEPTH True
It is not possible to change the qdepth as long as the hdisk is in use. But you can still change the value in the ODM only (chdev with -P) and wait for the next reboot for the change to apply. Here we have a problem, though: lsattr already shows the new value while the kernel still uses the old one.
# lsattr -El hdisk100 -a queue_depth
queue_depth 20 Queue DEPTH True
# chdev -l hdisk100 -a queue_depth=8 -P
hdisk100 changed
# lsattr -El hdisk100 -a queue_depth
queue_depth 8 Queue DEPTH True
But what qdepth does the kernel actually use? The only way to get the kernel's value is to use the kernel debugger:
# echo scsidisk hdisk100 | kdb | grep queue_depth
ushort queue_depth = 0x14;
What we see is the hex value of the qdepth. Use the below command to convert the value to decimal as it would be displayed by lsattr:
# printf "%d\n" 0x14
20
12. How can I increase a LUN on the fly?
Whenever the SAN admins increase a LUN I run cfgmgr, but my volume group does not recognize the new size. What to do?
Just run
# chvg -g myvg
and the additional size can be used. Doesn't work for the rootvg and HACMP though¹.
13. How can I set the number of logical partitions to be synchronized in parallel?
In normal operation the syncvg and varyonvg commands don't synchronize logical partitions in parallel, resulting in a very long synchronization time. This behaviour can be changed by setting the NUM_PARALLEL_LPS environment variable before running the synchronization commands:
# export NUM_PARALLEL_LPS=8
# varyonvg myvg
or
# export NUM_PARALLEL_LPS=8
# syncvg -v myvg
This way 8 logical partitions will be synchronized in parallel. Depending on the available CPU resources this can speed up the synchronization by nearly a factor of 8.
With the syncvg command the same effect can be achieved with the -P flag:
# syncvg -P 8 -v myvg
However, if you prefer to run varyonvg to synchronize logical partition mirrors, setting the NUM_PARALLEL_LPS variable is your only option.
14. How can I get rid of "ghost paths"?
It happens that a LUN is connected via two paths, but lspath shows both paths twice - once as Missing and another time as Enabled:
# lspath -l hdisk151
Missing hdisk151 fscsi0
Missing hdisk151 fscsi1
Enabled hdisk151 fscsi0
Enabled hdisk151 fscsi1
The reason is usually located somewhere in the SAN infrastructure - a new switch port, a replugged cable, etc. Anyway, how can I get rid of these "ghost paths" without affecting the good paths?
Not a big deal - every path to a LUN has its unique path ID:
# lspath -l hdisk151 -F "path_id:parent:path_status:status"
0:fscsi0:Missing:N/A
1:fscsi1:Missing:N/A
2:fscsi0:Available:Enabled
3:fscsi1:Available:Enabled
So all we have to do is remove the two paths with the IDs 0 and 1...
# rmpath -dl hdisk151 -i 0
paths Deleted
# rmpath -dl hdisk151 -i 1
paths Deleted
...and the "ghost paths" are gone:
# lspath -l hdisk151 -F "path_id:parent:path_status:status"
2:fscsi0:Available:Enabled
3:fscsi1:Available:Enabled
# lspath -l hdisk151
Enabled hdisk151 fscsi0
Enabled hdisk151 fscsi1
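If many LUNs are affected, the Missing paths can be removed in a loop based on their path IDs - a sketch in ksh; review the output of lspath before deleting anything:
# lspath -F "name:parent:path_id:path_status" | grep ':Missing$' | \
    while IFS=: read disk parent id state; do
      rmpath -dl $disk -i $id
    done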
15. How do I create a mapfile to create an exact copy of a Logical Volume?
Let's say hdisk100 is the disk holding the first and only copy of an LV called mylv and you want to create a second copy on hdisk101. The below command will do the trick:
# lslv -m mylv | awk '/hdisk/ { printf( "hdisk101:%d\n", $2 ) }' | tee mylv.map
hdisk101:1
hdisk101:2
hdisk101:3
hdisk101:4
If your LV is spread over multiple disks, sed is your friend:
# lslv -m mylv | awk '/hdisk/ { printf( "%s:%d\n", $3, $2 ) }' | sed -e 's/hdisk100\:/hdisk200\:/' -e 's/hdisk101\:/hdisk201\:/' | tee mylv.map
hdisk200:1
hdisk201:1
hdisk200:2
hdisk201:2
hdisk200:3
hdisk201:3
hdisk200:4
hdisk201:4
In the above example hdisk100 is going to be copied to hdisk200 and hdisk101 to hdisk201. To actually create the mirror run mklvcopy with the -m switch:
# mklvcopy -m mylv.map mylv 2
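Note that mklvcopy creates the new copy in a stale state unless you pass -k; bring it in sync afterwards with:
# syncvg -l mylv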
16. How can I change the status of a removed PV back to active?
After an I/O failure to a PV due to a down path or a system crash, a volume group may have a disk in a removed state:
# lsvg -p rootvg
rootvg:
PV_NAME PV STATE TOTAL PPs FREE PPs FREE DISTRIBUTION
hdisk1 removed 432 136 76..00..00..00..60
hdisk2 active 432 136 76..00..00..00..60
We can use chpv to change the status of the PV back to active:
# chpv -va hdisk1
# syncvg -P 4 -v rootvg
The switch '-P 4' to syncvg may be used to speed up the synchronization process by syncing 4 logical partitions in parallel.
Miscellaneous
1. How do I create users with long login names (more than 8 characters) under AIX 5.3?
Since AIX version 5.3 one can create users with login names longer than 8 characters. In order to create such a login name you first have to raise the limit on the login name length. This can be done with:
# chdev -l sys0 -a max_logname=13
The above example allows login names with up to 12 characters (the value of max_logname includes the terminating zero byte). The new limit becomes active at the next reboot.
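You can check the configured limit with lsattr; on a running system getconf should report the active value (a sketch):
# lsattr -El sys0 -a max_logname
# getconf LOGIN_NAME_MAX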
2. Can I use passwords with more than 8 (significant) characters?
AIX always accepts passwords with more than 8 characters, but in fact only the first 8 characters are significant. If you want passwords with more significant characters, the hash algorithm has to be changed in /etc/security/login.cfg:
usw:
shells = /bin/sh,/bin/bsh,/bin/csh,/bin/ksh,/bin/tsh,/bin/ksh93,/usr/bin/sh,/usr/bin/bsh,/usr/bin/csh,/usr/bin/ksh,/usr/bin/tsh,/usr/bin/ksh93
maxlogins = 32767
logintimeout = 60
maxroles = 8
auth_type = STD_AUTH
pwd_algorithm = ssha256
The last line changes the hash algorithm from crypt to ssha256. This algorithm allows passwords with up to 255 characters. Have a look at /etc/security/pwdalg.cfg to see which other algorithms are allowed.
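Instead of editing the file by hand, the same change can be made with chsec:
# chsec -f /etc/security/login.cfg -s usw -a pwd_algorithm=ssha256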
3. What are the correct settings for daylight saving time in Central Europe?
The timezone is set by the TZ environment variable. To set the timezone globally you have to change the TZ variable in /etc/environment. For the Central European countries (Brussels time) this variable should be set as follows:
TZ=CET-1CEST,M3.5.0/2:00,M10.5.0/3:00
All services that read the timezone have to be restarted (e.g. cron). A reboot - of course - will restart everything.
Please note that AIX's default time settings for Central Europe are not correct!
Beginning with AIX 7.1 and AIX 6.1 TL5, symbolic ("Olson") values for TZ are also respected. For the Netherlands you could set:
TZ=Europe/Amsterdam
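Since cron is started from /etc/inittab, init respawns it automatically after it is killed, so it picks up the new TZ without a reboot (a sketch; double-check the PID before killing):
# ps -e -o pid,comm | grep cron
# kill <PID of cron>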
4. Can I identify deleted files still opened by a process?
Just run fuser -V -d on the filesystem you want to check for deleted but still open files. This is an example for /tmp:
# fuser -V -d /tmp
/tmp:
inode=7 size=56 fd=2 512238
The PID points to the process which still has an open file descriptor to the deleted file:
# ps -fp 512238
USER PID PPID C STIME TTY TIME CMD
root 512238 1 0 Mar 20 - 3:29 /usr/sbin/rsct/bin/ctcasd
5. How can I figure out what values are known to device attributes?
In the following example output we want to change the attribute init_link of a fibre channel adapter:
# lsattr -El fcs0
bus_intr_lvl 121 Bus interrupt level False
bus_io_addr 0xbfc00 Bus I/O address False
bus_mem_addr 0xc0040000 Bus memory address False
init_link al INIT Link flags True
intr_priority 3 Interrupt priority False
lg_term_dma 0x800000 Long term DMA True
max_xfer_size 0x100000 Maximum Transfer Size True
num_cmd_elems 200 Maximum number of COMMANDS to queue to the adapter True
pref_alpa 0x1 Preferred AL_PA True
sw_fc_class 2 FC Class for Fabric True
True in the last column indicates that we can indeed change the value of this attribute¹. But what is a valid value? This can be figured out easily with the lsattr command:
# lsattr -Rl fcs0 -a init_link
al
pt2pt
Valid values are al and pt2pt. And that's how we could change it:
# chdev -l fcs0 -a init_link=pt2pt
fcs0 changed
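If the adapter is busy, chdev refuses the change; with -P the new value is only written to the ODM and becomes active at the next reboot (or after reconfiguring the adapter):
# chdev -l fcs0 -a init_link=pt2pt -P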
6. How can I mount an ISO image file?
With AIX 6.1 TL4 or newer you can use loopmount:
# ls -l *.iso
-rw-r--r-- 1 root system 43974656 Jan 13 17:05 dvd_aix_profilemanager.iso
# loopmount -i dvd_aix_profilemanager.iso -o "-V cdrfs -o ro" -m /mnt
# df /mnt
Filesystem 512-blocks Free %Used Iused %Iused Mounted on
/dev/loop0 84812 0 100% 21203 100% /mnt
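To release the image afterwards, unmount it again; loopumount also takes care of the underlying loop device (a sketch, check the flags of your AIX level):
# loopumount -l loop0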
7. How can I fix a broken /dev/ipldevice?
I migrated the rootvg to a different disk. Now I get tons of errors when running any mirroring command. I know a reboot solves the problem, but can I fix it without a reboot?
The problem is that /dev/ipldevice points to the device the system was booted from. When you removed this device from the rootvg, /dev/ipldevice points to a non-existing device and you see error messages like these:
# unmirrorvg rootvg hdisk2
0516-1734 rmlvcopy: Warning, savebase failed. Please manually run 'savebase' before rebooting.
0516-1734 unmirrorvg: Warning, savebase failed. Please manually run 'savebase' before rebooting.
You can fix it by relinking /dev/ipldevice to the disk holding the BLV. If you have your rootvg mirrored, choose the first one:
# lslv -l hd5
hd5:N/A
PV COPIES IN BAND DISTRIBUTION
hdisk16 001:000:000 0% 001:000:000:000:000
# cd /dev
# ls -l ipldevice
crw------- 2 root system 17, 2 Nov 18 2010 ipldevice
# rm -f ipldevice
# ln rhdisk16 ipldevice
# ls -l ipldevice rhdisk16
crw------- 2 root system 17, 16 Jun 25 10:58 ipldevice
crw------- 2 root system 17, 16 Jun 25 10:58 rhdisk16
# savebase
Please note that a hardlink is required.
8. How do I extend a dump device?
«sysdumpdev -e» estimates the size of the dump:
# sysdumpdev -e
0453-041 Estimated dump size in bytes: 547146956
and «sysdumpdev -l» shows the location of the dump device:
# sysdumpdev -l
primary /dev/hd7
secondary /dev/sysdumpnull
copy directory /var/adm/ras
forced copy flag TRUE
always allow dump TRUE
dump compression ON
In our case it's hd7. The size of the dump device is the size of the underlying LV:
# lslv hd7 | egrep 'PP SIZE|LPs'
MAX LPs: 512 PP SIZE: 256 megabyte(s)
LPs: 2
PPs: 2
In our example we need a dump device of at least 547146956 bytes (= 522 MB), which is a bit more than what we have (2 * 256 MB = 512 MB). So we need to increase our dump device by 1 LP:
# extendlv hd7 1
# lslv hd7 | egrep 'PP SIZE|LPs'
MAX LPs: 512 PP SIZE: 256 megabyte(s)
LPs: 3
PPs: 3
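As a final check, re-run the estimate and make sure it still fits into the enlarged device; with 3 LPs of 256 MB we now have 768 MB:
# sysdumpdev -e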