Logical Volume Manager :
Volume Group
To create a vg on hdisk1 :
# mkvg -vg newvg hdisk1
To drain outstanding I/O for a VG and suspend future I/O :
# chvg -suspend vg03
To resume normal I/O operations for a vg :
# chvg -resume vg03
To unlock a VG left in a locked state by the abnormal termination of another LVM operation :
# chvg -unlock vg03
To add a physical volume to a volume group :
# extendvg vg3 hdisk3
To remove a PV from a VG :
# reducevg vg01 hdisk1
To remove a PV and all residing LVs from a VG with no confirmation :
# reducevg -rmlv -f vg01 hdisk1
To activate a vg :
# activatevg vg03
To deactivate a vg :
# deactivatevg vg03
To mirror the VIO Server’s rootvg to hdisk4 and reboot the VIO Server :
# mirrorios -f hdisk4
To mirror the VIO Server’s rootvg to hdisk4 without rebooting the server :
# mirrorios -defer hdisk4
To remove the rootvg mirror from hdisk4 :
# unmirrorios hdisk4
To import a VG from hdisk07 :
# importvg -vg vg001 hdisk07
To export a VG :
# exportvg vg3
Note: A volume group that contains a paging space can’t be exported.
To sync a VG :
# syncvg -vg vg01
To sync a LV :
# syncvg -lv lv001
To sync PVs :
# syncvg -pv hdisk4 hdisk5
To redefine a VG based on the VGDA from hdisk04 :
# redefvg -dev hdisk04
Logical Volume :
To create a logical volume of size 1MB in vg01 :
# mklv -lv lv001 vg01 1M hdisk1
To create a logical volume with mirror in place in vg01 of size 1GB :
# mklv -mirror vg01 1G
To extend an LV by 3MB :
# extendlv lv01 3M
To extend an LV by 1GB with space taken from hdisk5 :
# extendlv lv01 1G hdisk5
To remove a logical volume :
# rmlv lv05
To display the properties of a logical volume :
# lslv lv03
To display information about an LV by physical volume :
# lslv -pv lv03
To display LVs that can be used as backing devices :
# lslv -free
To make a copy for lv01 in hdisk03 :
# mklvcopy lv01 hdisk03
To remove lv01’s copy from hdisk03 :
# rmlvcopy lv01 hdisk03
To copy the contents of lv01 to lv02 :
# cplv lv01 lv02
To copy the contents of lv01 to a new lv in vg01 :
# cplv -vg vg01 lv01
To change the name of oldlv to newlv :
# chlv -lv newlv oldlv
Physical Volume
To display all physical volumes in the system :
# lspv
To display the status and characteristics of hdisk03 :
# lspv hdisk03
To list all the available PVs used as virtual SCSI backing devices :
# lspv -avail
To list PVs that can be used as virtual SCSI backing devices and are not currently a backing device :
# lspv -free
To move physical partitions from hdisk1 to hdisk2 :
# migratepv hdisk1 hdisk2
To move physical partitions in lv01 from hdisk1 to hdisk2 :
# migratepv -lv lv01 hdisk1 hdisk2
Tanti Technology
- sandeep tanti
- Bangalore, karnataka, India
Monday, 20 June 2011
USER ADMINISTRATION
A few restrictions on user names:
1. User names cannot start with a
• dash or minus sign(-)
• plus sign (+)
• At symbol (@)
• Tilde (~)
2. User names cannot include
• colon (:)
• single or double quotation marks( ' or ")
• hash symbol (#)
• comma (,)
• equal sign (=)
• Back or forward Slashes ( \ or /)
• Question mark (?)
• Back quote (`)
• White space (space or tab)
3. User names cannot be ALL or default, because those names are reserved for the AIX OS.
4. User names can have a maximum of 8 characters in AIX V5.2 or earlier. Starting with AIX 5.3, the maximum can be raised to 255 characters. You can change this setting by using the command below,
# chdev -l sys0 -a max_logname=255
To view the setting, use either of the commands below
# lsattr -El sys0 -a max_logname
# getconf LOGIN_NAME_MAX
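The naming rules above can be checked in a script before mkuser is called. The following is a minimal sketch in POSIX shell; the function name is_valid_username and its default length of 8 are assumptions made for illustration and are not part of AIX.

```shell
#!/bin/sh
# Hypothetical helper (not an AIX command): succeeds if the name obeys the
# restrictions listed above. $2 is the maximum length; 8 is assumed if omitted.
is_valid_username() {
    name=$1
    max_len=${2:-8}
    tab=$(printf '\t')

    [ -n "$name" ] || return 1                  # empty names are rejected
    [ "${#name}" -le "$max_len" ] || return 1   # length limit

    case $name in
        ALL|default) return 1 ;;                # reserved names
        -*|+*|@*|'~'*) return 1 ;;              # forbidden first characters
        *:*|*\'*|*\"*|*'#'*|*,*|*=*|*\\*|*/*|*'?'*|*'`'*) return 1 ;;
        *' '*|*"$tab"*) return 1 ;;             # white space
    esac
    return 0
}

is_valid_username jack && echo "jack: ok"
is_valid_username "-jack" || echo "-jack: rejected"
```

Pass the configured limit as the second argument (for example the value reported by getconf LOGIN_NAME_MAX) when the system allows longer names.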
Configuration Files:
/etc/passwd :
Contains the basic user configuration details like user name, password flag, uid, gid, gecos (description), home directory, shell.
/etc/security/.profile :
It is the template for the user's .profile file. It is copied to the user's home directory when the user is created.
/etc/security/limits :
It contains all the resource limits (ulimits) for the users.
Here are the various ulimit values ...
fsize, fsize_hard - Soft and hard limit for the size of a file a user can create
core, core_hard - Soft and hard limit for the Size of core file a user can create
cpu, cpu_hard - Soft and hard limit for the amount of CPU time (in seconds) a process can use
data, data_hard - Soft and hard limit for the size of the process data segment
stack, stack_hard - Soft and hard limit for the size of the process stack segment
rss, rss_hard - Soft and hard limit for the physical memory allowed
nofiles, nofiles_hard - Soft and hard limit for the number of open file descriptors at one time
nproc, nproc_hard - Soft and hard limit for the number of running processes at one time
/etc/security/passwd :
This file contains the user's password information such as password, lastupdate and flags.
Here are the various flags :
ADMIN - It can be set so that only the root user can change the user's password.
ADMCHG - It can be set so that the user is prompted to change his or her password on the next login/su.
NOCHECK - It can be set so that any additional restrictions in /etc/security/user are ignored.
/etc/security/user : This file contains important security settings for every user.
Here are the parameters configured in the file for each user :
account_locked - To lock the user account. It can take values TRUE or FALSE.
admin - To specify whether the user is an administrative user or not. It can take values TRUE or FALSE.
expires - To set the expiration date of the account, after which the user can no longer log in. It takes a value in the format MMDDhhmmyy.
histexpire - To specify the # of weeks during which the user can't reuse a password. It can take values between 0-260.
histsize - To specify the # of previously used passwords that can't be reused. It can take values between 0-50.
login - To specify whether a user can log in or not. It can take values TRUE or FALSE.
maxage - To specify the # of weeks a password is valid. It can take values between 0-52.
minage - To specify the # of weeks a user must wait before changing his or her password. It can take values between 0-52.
rlogin - To specify whether the account can be accessed remotely via telnet, ssh, ftp. It can take values TRUE or FALSE.
su - To specify whether other users can use su to access this account. It can take values TRUE or FALSE.
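Files such as /etc/security/user are laid out in stanzas: a name followed by a colon, then indented attribute = value lines. As a rough sketch of how one attribute can be read from such a file with standard tools, the hypothetical stanza_get function below parses a made-up sample file; the user name and values in the sample are invented for illustration.

```shell
#!/bin/sh
# Sketch: print the value of one attribute from one stanza of a stanza file.
# Usage: stanza_get FILE STANZA ATTRIBUTE   (hypothetical helper)
stanza_get() {
    awk -v stanza="$2" -v attr="$3" '
        # A stanza header is an unindented name followed by a colon.
        /^[^ \t*][^:]*:[ \t]*$/ {
            current = $0
            sub(/:[ \t]*$/, "", current)
            next
        }
        current == stanza {
            eq = index($0, "=")
            if (eq == 0) next
            key = substr($0, 1, eq - 1)
            val = substr($0, eq + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", key)
            gsub(/^[ \t]+|[ \t]+$/, "", val)
            if (key == attr) { print val; exit }
        }
    ' "$1"
}

# Made-up sample data in the stanza layout, written to a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
default:
        maxage = 0
        rlogin = true

jack:
        maxage = 8
        rlogin = false
EOF

stanza_get "$sample" jack maxage      # prints 8
stanza_get "$sample" default rlogin   # prints true
```

The sketch does not fall back to the default stanza when a user stanza lacks the attribute. On AIX itself, lssec -f /etc/security/user -s jack -a maxage reads the same attribute without any parsing.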
/usr/lib/security/mkuser.default : This file contains the default values that are set while creating a user.
/etc/security/login.cfg : This file contains the message (herald) that is displayed whenever you log in to the system.
You can change it using the chsec command or by editing this file directly in the vi editor.
Here are a few user attributes that you may be interested in.
id - User identification number, a unique ID for every user. The root user's id is always 0.
pgrp - Primary group of a user
groups - Secondary groups of a user. A user can belong to a maximum of 128 groups in AIX 5.3 and 6.1.
home - Home directory to store the user's files
shell - Shell that runs when the user logs in
gecos - Description or some comments about the user
There are 5 main commands used in user administration :
mkuser - Add a user
chuser - Change an attribute of a user
lsuser - List the attributes of a user
rmuser - Remove a user
passwd - To set password for a user and for various other purposes
These command names can also be used as fastpaths for smitty.
For example, # smitty mkuser will open a form to create a user.
For the whole of user administration, you can use
# smitty user and go through the menu items for the various operations.
Now let us see the commands to administer users ...
1. To create a user called 'jack' with default settings and allocate the next available uid :
# mkuser jack
2. To create a user with home directory /opt/$username and primary group 'dba' :
# mkuser home=/opt/jack pgrp=dba jack
3. To display information about the user :
# finger jack
4. To change the primary group for a user :
# chuser pgrp=oracle jack
5. To list the attributes of a user in stanza structure :
# lsuser -f jack
6. To list the attributes of a user as colon-separated records :
# lsuser -c jack
7. To list home and shell attributes for the users jack and tom :
# lsuser -a shell home jack,tom
8. To set the password for a newly created user :
# passwd jack
9. To clear the ADMCHG flag for jack :
# pwdadm -c jack
If you don't do this after setting a password for jack, he will be prompted to change his password on the first login.
10. To change the gecos for a user :
# passwd -f jack
11. To change the shell for a user :
# passwd -s jack
12. To list the last password update date/time and the flags for a user :
# pwdadm -q jack
13. To set the ADMIN flag for a user :
# pwdadm -f ADMIN jack
The ADMIN flag ensures that only the root user can change the password for jack.
14. To remove the user :
# rmuser jack
Note: rmuser doesn't remove the user's home directory.
You have to remove it yourself, preferably after taking a backup.
15. To remove the user along with his password information :
# rmuser -p jack
16. To list the currently logged in users :
# who
Note: This command will show the contents of /etc/utmp which is a binary file.
17. To list the login and logout information for the machine :
# last
Note: This command shows the contents of the /var/adm/wtmp file, which is a binary file. Over time, this file can take up a lot of space in the /var file system. Hence, nullify the file once every 6 months, or more often depending on the # of login/logout actions on the system.
To clear (nullify) the wtmp file, you can use either of the commands below
# cp /dev/null /var/adm/wtmp
# > /var/adm/wtmp
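Rather than clearing the file on a fixed schedule, a script can empty it only once it has grown past a chosen size. The sketch below is one possible way to do that; the function name truncate_if_larger and the thresholds are assumptions, and the demonstration runs against a scratch file instead of the real /var/adm/wtmp.

```shell
#!/bin/sh
# Hypothetical helper: empty FILE only when it is larger than MAX_BYTES.
# Usage: truncate_if_larger FILE MAX_BYTES
truncate_if_larger() {
    file=$1
    max_bytes=$2
    [ -f "$file" ] || return 1
    size=$(( $(wc -c < "$file") ))    # arithmetic strips any padding from wc
    if [ "$size" -gt "$max_bytes" ]; then
        : > "$file"                   # same effect as "> file" above
        echo "truncated $file ($size bytes)"
    fi
}

# Demonstration on a scratch file rather than the real /var/adm/wtmp.
demo=$(mktemp)
printf 'xxxxxxxxxx' > "$demo"       # 10 bytes
truncate_if_larger "$demo" 100      # below the threshold: left alone
truncate_if_larger "$demo" 5        # above the threshold: emptied
```

Truncating wtmp discards the login history shown by last, so copy the file elsewhere first if that history must be kept.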
18. To change the default message (herald) that is shown at the login prompt :
# chsec -f /etc/security/login.cfg -s default -a herald="<your message>"
19. As a user, you should protect (lock) your terminal whenever you step away, for example for a coffee break.
# lock -> To lock your telnet or ssh terminal
If you use X Windows, you can use the command below
# xlock
20. Sometimes you may want to log in as root to execute some admin commands. For this, you don't have to log out from the current user and log in as root. You can use the su command to switch user, execute the commands, and type 'exit' to leave the su session.
To su to root, you can use any of the below commands
# su - root
# su -
To su to another user called tom,
# su - tom
These su operations are logged in the /var/adm/sulog file. You have to nullify this file periodically to free up space in the /var file system.
21. How to disable direct root login via telnet and ssh ?
To disable direct root login through telnet or ssh, you have to set the 'rlogin' attribute for the root user to false.
You can use the command below to do so.
# chuser rlogin=false root
22. How to enforce automatic logoff after a certain timeout period ?
To enforce automatic logoff after a timeout period of 10 minutes, enter the following line in /etc/security/.profile after the AIX installation.
TMOUT=600 ; TIMEOUT=600 ; export readonly TMOUT TIMEOUT
/usr/bin/mkuser Contains the mkuser command.
/usr/lib/security/mkuser.default
Contains the default values for new users.
/etc/passwd
Contains the basic attributes of users.
/etc/security/user
Contains the extended attributes of users.
/etc/security/user.roles
Contains the administrative role attributes of users.
/etc/security/passwd
Contains password information.
/etc/security/limits
Defines resource quotas and limits for each user.
/etc/security/environ
Contains the environment attributes of users.
/etc/group
Contains the basic attributes of groups.
/etc/security/group
Contains the extended attributes of groups.
/etc/security/.ids Contains standard and administrative user IDs and group IDs.
/usr/bin/passwd Contains the passwd command.
/etc/passwd Contains user IDs, user names, home directories, login shell, and finger information.
/etc/security/passwd Contains encrypted passwords and security information.
/usr/bin/chuser Contains the chuser command.
/etc/passwd
Contains the basic attributes of users.
/etc/group
Contains the basic attributes of groups.
/etc/security/group
Contains the extended attributes of groups.
/etc/security/user
Contains the extended attributes of users.
/etc/security/user.roles
Contains the administrative role attributes of users.
/etc/security/lastlog
Contains the last login attributes of users.
/etc/security/limits
Defines resource quotas and limits for each user.
/etc/security/audit/config
Contains audit configuration information.
/etc/security/environ
Contains the environment attributes of users.
AIX commands and tools for DB2 troubleshooting
Introduction
There are many scenarios where troubleshooting DB2 issues benefits from gathering operating-system-level data and analyzing it to understand the issues further.
This article discusses a number of problems you may face with your database, including CPU usage problems, orphan processes, database corruption, memory leaks, hangs and unresponsive applications.
The article explains some AIX utilities and commands that help you understand and resolve each of these issues. The data you collect from running these commands can be sent to the IBM Technical Support Team when opening a problem management request (PMR) in order to expedite the PMR support process. The end of each section of this article discusses the documents you should gather to send to the Technical Support Team. While this article gives troubleshooting tips to use as a guideline, you should contact the IBM Technical Support Team for official advice about these problems.
Monitor CPU usage
In working with your database, you might notice a certain DB2 process consuming a large amount of CPU time. This section describes some AIX utilities and commands which you can use either to analyse the issue yourself or to gather data before submitting a PMR to IBM Technical Support:
Through the ps command
A ps command reveals the current status of active processes. You can use
ps auxw | sort -r +3 | head -10
to sort the process list and show the top 10 resource-consuming processes. Note that +3 starts the sort key at the fourth field (%MEM); use +2 instead to key on the %CPU column. Listing 1 shows the ps output:
Listing 1. Sample ps output
root@mavrickit $ ps auxw|sort -r +3|head -10
USER PID %CPU %MEM SZ RSS TTY STAT STIME TIME COMMAND
scot 1658958 0.1 9.0 218016 214804 - A Sep 13 38:16 db2agent (idle) 0
dpf 1036486 0.0 1.0 14376 14068 - A Sep 17 3:10 db2hmon 0
scot 1822932 0.0 1.0 12196 11608 - A Sep 12 6:41 db2hmon 0
dpf 1011760 0.0 0.0 9264 9060 - A Sep 17 3:03 db2hmon 3
dpf 1532116 0.0 0.0 9264 9020 - A Sep 17 3:04 db2hmon 2
dpf 786672 0.0 0.0 9264 8984 - A Sep 17 3:02 db2hmon 5
dpf 1077470 0.0 0.0 9264 8968 - A Sep 17 3:03 db2hmon 1
dpf 1269798 0.0 0.0 9248 9044 - A Sep 17 2:50 db2hmon 4
db2inst1 454756 0.0 0.0 9012 7120 - A Jul 19 0:52 db2sysc 0
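As a sketch of the same idea with the POSIX -k key syntax, the hypothetical top_cpu function below keys explicitly and numerically on the %CPU column (field 3). It reads ps auxw-style text from standard input, so it can be tried on saved output; the sample rows are copied from Listing 1.

```shell
#!/bin/sh
# Sketch: read `ps auxw`-style text on stdin and print the N busiest processes.
# Usage: ps auxw | top_cpu N   (hypothetical helper)
top_cpu() {
    # Drop the header line, then sort numerically, descending, on field 3 (%CPU).
    awk 'NR > 1' | sort -k3,3nr | head -n "$1"
}

# Three rows copied from Listing 1, header included.
sample_ps='USER PID %CPU %MEM SZ RSS TTY STAT STIME TIME COMMAND
dpf 1036486 0.0 1.0 14376 14068 - A Sep 17 3:10 db2hmon 0
scot 1658958 0.1 9.0 218016 214804 - A Sep 13 38:16 db2agent (idle) 0
scot 1822932 0.0 1.0 12196 11608 - A Sep 12 6:41 db2hmon 0'

printf '%s\n' "$sample_ps" | top_cpu 1
```

With the sample above, the single line printed is the db2agent row for PID 1658958, the only process with a non-zero %CPU value.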
Through the topas command
When executing a ps -ef command, you see the CPU usage of a certain process. You can also use the topas command to get further details. Similar to the ps command, a topas command retrieves selected statistics about the activity on the local system. Listing 2 is a sample topas output that shows a DB2 process consuming 33.3% CPU. You can use the topas output to get specific information such as the process id, the CPU usage and the instance owner who started the process. It is normal to see several db2sysc processes for a single instance owner. DB2 processes are renamed depending on the utility being used to list process information:
Listing 2. Sample topas output
Name PID CPU% PgSp Owner
db2sysc 105428 33.3 11.7 udbtest
db2sysc 38994 14.0 11.9 udbtest
test 14480 1.4 0.0 root
db2sysc 36348 0.8 1.6 udbtest
db2sysc 116978 0.5 1.6 udbtest
db2sysc 120548 0.5 1.5 udbtest
sharon 30318 0.3 0.5 root
lrud 9030 0.3 0.0 root
db2sysc 130252 0.3 1.6 udbtest
db2sysc 130936 0.3 1.6 udbtest
topas 120598 0.3 3.0 udbtest
db2sysc 62248 0.2 1.6 udbtest
db2sysc 83970 0.2 1.6 udbtest
db2sysc 113870 0.2 1.7 root
Through the vmstat command
The vmstat command can be used to monitor CPU utilization; you can get details on the amount of user CPU utilization as well as system CPU usage. Listing 3 shows the output from a vmstat command:
Listing 3. Sample vmstat output
kthr memory page faults cpu
----- ----------- ------------------------ ------------ -----------
r b avm fre re pi po fr sr cy in sy cs us sy id wa
32 3 1673185 44373 0 0 0 0 0 0 4009 60051 9744 62 38 0 0
24 0 1673442 44296 0 0 0 0 0 0 4237 63775 9214 67 33 0 0
30 3 1678417 39478 0 0 0 0 0 0 3955 70833 8457 69 31 0 0
33 1 1677126 40816 0 0 0 0 0 0 4101 68745 8336 68 31 0 0
28 0 1678606 39183 0 0 0 0 0 0 4525 75183 8708 63 37 0 0
35 1 1676959 40793 0 0 0 0 0 0 4085 70195 9271 72 28 0 0
23 0 1671318 46504 0 0 0 0 0 0 4780 68416 9360 64 36 0 0
30 0 1677740 40178 0 0 0 0 0 0 4326 58747 9201 66 34 0 0
30 1 1683402 34425 0 0 0 0 0 0 4419 76528 10042 60 40 0 0
0 0 1684160 33808 0 0 0 0 0 0 4186 72187 9661 73 27 0 0
When reading vmstat output such as the above, you can ignore the first line. The important columns to look at are us, sy, id and wa, where:
id: Time spent idle.
wa: Time spent waiting for I/O.
us: Time spent running non-kernel code (user time).
sy: Time spent running kernel code (system time).
In Listing 3, the system averages roughly 65% user CPU usage and 35% system CPU usage. The pi and po values are 0, so there are no paging issues. The wa column is also 0, so there do not appear to be any I/O issues.
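The averages quoted for Listing 3 can be computed instead of estimated by eye. The sketch below assumes that us, sy, id and wa are the last four columns of each data row, as they are in the listing; the function name vmstat_cpu_avg is hypothetical, and the sample rows are the first three data rows of Listing 3.

```shell
#!/bin/sh
# Sketch: average the us/sy/id/wa columns of vmstat output read from stdin.
# Assumes those are the LAST four columns, as in the listing above.
vmstat_cpu_avg() {
    awk '
        # Data rows start with a number; header and ruler lines do not.
        $1 ~ /^[0-9]+$/ && NF >= 17 {
            us += $(NF-3); sy += $(NF-2); id += $(NF-1); wa += $NF; n++
        }
        END {
            if (n > 0)
                printf "us=%.1f sy=%.1f id=%.1f wa=%.1f\n", us/n, sy/n, id/n, wa/n
        }'
}

sample_vmstat='kthr memory page faults cpu
----- ----------- ------------------------ ------------ -----------
r b avm fre re pi po fr sr cy in sy cs us sy id wa
32 3 1673185 44373 0 0 0 0 0 0 4009 60051 9744 62 38 0 0
24 0 1673442 44296 0 0 0 0 0 0 4237 63775 9214 67 33 0 0
30 3 1678417 39478 0 0 0 0 0 0 3955 70833 8457 69 31 0 0'

printf '%s\n' "$sample_vmstat" | vmstat_cpu_avg   # prints us=66.0 sy=34.0 id=0.0 wa=0.0
```

The function averages every data row it sees; drop the first sample before piping the output in if you want to follow the advice about ignoring the first line.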
Listing 4 shows the wa (waiting on I/O) to be unusually high and this indicates there might be I/O bottlenecks on the system which in turn causes the CPU usage to be inefficient. You can check errpt -a output to see if there are any reported issues with the media or I/O on the system.
Listing 4. Sample vmstat output showing I/O issues
Kthr memory page faults cpu
----- ----------- ------------------------ ------------ -----------
r b avm fre re pi po fr sr cy in sy cs us sy id wa
2 8 495803 3344 0 0 0 929 1689 0 998 6066 1832 4 3 76 16
0 30 495807 3340 0 0 0 0 0 0 1093 4697 1326 0 2 0 98
0 30 495807 3340 0 0 0 0 0 0 1055 2291 1289 0 1 0 99
0 30 495807 3676 0 2 0 376 656 0 1128 6803 2210 1 2 0 97
0 29 495807 3292 0 1 3 2266 3219 0 1921 8089 2528 14 4 0 82
1 29 495810 3226 0 1 0 5427 7572 0 3175 16788 4257 37 11 0 52
4 24 495810 3247 0 3 0 6830 10018 0 2483 10691 2498 40 7 0 53
4 25 495810 3247 0 0 0 3969 6752 0 1900 14037 1960 33 5 1 61
2 26 495810 3262 0 2 0 5558 9587 0 2162 10629 2695 50 8 0 42
3 22 495810 3245 0 1 0 4084 7547 0 1894 10866 1970 53 17 0 30
Through the iostat command
An iostat command quickly tells you if your system has a disk I/O-bound performance problem. Listing 5 is an example of an iostat command output:
Listing 5. Sample iostat output
System configuration: lcpu=4 disk=331
tty: tin tout avg-cpu: % user % sys % idle % iowait
0.0 724.0 17.9 12.3 0.0 69.7
Disks: % tm_act Kbps tps Kb_read Kb_wrtn
hdisk119 100.0 5159.2 394.4 1560 24236
hdisk115 100.0 5129.6 393.0 1656 23992
hdiskpower26 100.0 10288.8 790.8 3216 48228
%tm_act : Reports the percentage of time that the physical disk was active, that is, busy servicing requests.
Kbps : Reports the amount of data transferred to or from the drive, in kilobytes per second.
tps : Reports the number of transfers per second issued to the physical disk.
Kb_read : Reports the total data (in kilobytes) read from the physical volume during the measured interval.
Kb_wrtn : Reports the total data (in kilobytes) written to the physical volume during the measured interval.
To check whether you are experiencing resource contention, focus on the %tm_act value in the above output. A high value, especially one above 40%, implies that processes are waiting for I/O to complete and that you have an I/O issue on your hands. Checking which disks have the highest activity percentage, and whether DB2 uses those disks, gives you a better idea of whether the two are related.
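The 40% rule of thumb is easy to apply mechanically when a system has hundreds of disks. The following sketch filters iostat output for disks whose %tm_act exceeds a limit; the function name busy_disks is hypothetical, the limit defaults to 40, and the sample lines are taken from Listing 5.

```shell
#!/bin/sh
# Sketch: list disks whose %tm_act (second column) exceeds a limit.
# Usage: iostat ... | busy_disks [LIMIT]   (hypothetical helper, LIMIT defaults to 40)
busy_disks() {
    awk -v limit="${1:-40}" '
        # Disk rows start with a device name such as hdisk119 or hdiskpower26.
        $1 ~ /^hdisk/ && $2 + 0 > limit + 0 { print $1, $2 }'
}

sample_iostat='Disks: % tm_act Kbps tps Kb_read Kb_wrtn
hdisk119 100.0 5159.2 394.4 1560 24236
hdisk115 100.0 5129.6 393.0 1656 23992
hdiskpower26 100.0 10288.8 790.8 3216 48228'

printf '%s\n' "$sample_iostat" | busy_disks
```

All three disks in the sample are reported, since each shows 100.0 in the %tm_act column.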
What to collect
You should collect the following information before opening a PMR with IBM Technical Support:
• db2support.zip
• of high cpu process
• of high cpu process
Technical Support might also send you the db2service.perf1 script, which collects data repeatedly over a period of time. The output of the script needs to be bundled and sent back to the support team for further analysis.
Troubleshoot orphan processes
There are scenarios when, even after doing a db2stop, you notice (by doing a ps -ef | grep db2) certain DB2 processes, such as the db2fmp process, still running and consuming resources. After an abnormal shutdown, it is advisable to run ipclean once the instance has been stopped. A db2stop should inherently shut down all DB2-related processes; however, if an application using those processes was abnormally terminated, the related DB2 processes might become orphan processes.
Orphan DB2 processes are those that are not attached or linked to any other DB2 process. Abnormal termination of an application includes shutting it down with Ctrl+C, closing the ksh session, or killing it with the -9 option.
One way of confirming that a process is orphaned is to try to match its process ID (PID) from the ps -ef output with the Coordinator pid/thread column of the db2 list applications show detail output. If the PID cannot be found in the db2 list applications output, then it is an orphan process. For example, if you issue a db2 list applications show detail command, you get this output:
Listing 6. Sample list applications output
CONNECT Auth Id                  : JDE
Application Name                 : test.exe
Appl. Handle                     : 2079
Application Id                   : AC1C5C38.G80D.011F44162421
Seq#                             : 0001
Number of Agents                 : 1
Coordinating DB partition number : 0
Coordinator pid/thread           : 2068646
Status                           : UOW Waiting
Status Change Time               : 04/04/2006 09:25:17.036230
DB Name                          : PTPROD
DB Path                          : /db2pd/otprod/ptprod/otprod/NODE0000/SQL00001/
--NOTICE PID 2068646. This is the PID on the local server.
Part of the ps -ef output from the server:
ps -ef |grep 2068646
otprod 2068646 483566 0 09:06:28 - 0:59 db2agent (PTPROD) 0
This output shows the process with PID of 2068646 is not an orphaned process and is still attached to a DB2 process.
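The PID comparison described above can be sketched as a small filter. This is only an illustration: it assumes you have already saved the DB2 process PIDs from ps -ef into one file and the Coordinator pid values from the list applications output into another, one PID per line. The function name and both file names are hypothetical.

```shell
# orphan_candidates PS_PIDS_FILE COORD_PIDS_FILE
# Prints every PID from the first file that does not appear in the second.
# Both files are assumed to hold one PID per line.
orphan_candidates() {
    awk -v coord="$2" '
        BEGIN {
            # Load the coordinator PIDs reported by DB2.
            while ((getline line < coord) > 0) {
                split(line, f, " ")
                seen[f[1]] = 1
            }
        }
        !($1 in seen) { print $1 }
    ' "$1"
}

# Possible use:
#   orphan_candidates db2_pids.txt coordinator_pids.txt
```

A PID printed here is only a candidate; confirm it against the full ps -ef output before treating it as orphaned.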
In order to avoid orphan processes, you may want to do the following: Make normal, clean exits at the client side so that DB2 is aware and can clean up resources on the server. Tweak values of TCPKEEPIDLE time to a number less than the default, and tune the DB2CHECKCLIENTINTERVAL and KEEPALIVE values.
What to collect
If you do notice orphan processes and wish to investigate this issue, you should collect the following information before opening a PMR with IBM Technical Support:
- grep db2 output
-db2support.zip with -c option
- A callstack of the process that is collected using dbx, db2pd -stack or kill -36. The dbx command is a popular command line debugger used in both Solaris and AIX systems. The dbx output is helpful and can be run as follows:
Listing 7. The dbx command
dbx -a <PID>
At the dbx prompt type
th --- Displays all threads for the process
th info --- Displays additional info about the threads
where --- Get stack trace for thread 1
th current 1 --- Makes t1 current
where --- Displays stack for thread 1
th current 2 --- Makes thread 2 current
where --- Displays stack for thread 2.
... continue for all threads of the process
detach --- Detach from process
dbx -a
Detect database corruption
You can start to investigate whether the database is corrupted if a user complains of not being able to access certain database objects or is unable to connect to a specific database partition. The following section highlights some of the errors that are logged by DB2 and how you can ensure that there are no operating system (OS) level issues affecting or causing DB2 database corruption. You might notice errors similar to the one in Listing 8 being logged in the db2diag.log:
Listing 8. Corruption errors
RETCODE : ZRC=0x87040001=-2029780991=SQLD_BADPAGE "Bad Data Page"
DIA8500C A data file error has occurred, record id is "".
Or
RETCODE: ZRC=0x86020019=-2046689255=SQLB_CSUM "Bad Page, Checksum Error"
DIA8426C A invalid page checksum was found for page "".
Or
2007-07-09-11.29.45.696176+120 I16992C16377 LEVEL: Severe
PID : 68098 TID : 1 PROC : db2agent (sample)
INSTANCE: instest NODE : 000 DB : sample
APPHDL : 0-635 APPID: *LOCAL.instest.070709082609
FUNCTION: DB2 UDB, buffer pool services, sqlbcres, probe:20
MESSAGE : Important: CBIT Error
DATA #1 : Hexdump, 4096 bytes
These errors are logged when DB2 tries to access data in a container and there is some form of corruption. In such an instance when DB2 cannot access the data, the database might be marked as bad. You can narrow down where there might be possible corruption. In the db2diag.log, look for messages similar to the following:
Listing 9. Corruption errors showing database object details
2006-04-15-03.15.37.271601-360 I235258C487 LEVEL: Error
PID : 152482 TID : 1 PROC : db2reorg (SAMPLE) 0
INSTANCE: instest NODE : 000 DB : SAMPLE
APPHDL : 0-68 APPID: *LOCAL.SAMPLE.060415091532
FUNCTION: DB2 UDB, buffer pool services, sqlbrdpg, probe:1146
DATA #1 : String, 124 bytes
Obj={pool:5;obj:517;type:0} State=x27 Parent={5;517}, EM=55456,
PP0=55488 Page=55520 Cont=0 Offset=55552 BlkSize=12
BadPage
The above errors indicate that corruption has occurred in the table space with ID 5 (pool:5) and the table with ID 517 (obj:517). To check which table this refers to, execute the following SQL query:
Listing 10. Query to find a table with corruption
db2 "select tabname, tbspace from syscat.tables where tbspaceid = 5 and tableid = 517"
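When many such messages are logged, the pool and obj numbers can be pulled out of the Obj={...} text automatically. The sketch below assumes the message layout shown in Listing 9; the function name corrupt_table_query is invented for this example, and it only prints the query text rather than running it.

```shell
# corrupt_table_query
# Reads db2diag.log text on stdin. For each line containing
# Obj={pool:N;obj:M;...} it prints the catalog query for that object.
corrupt_table_query() {
    sed -n 's/.*Obj={pool:\([0-9][0-9]*\);obj:\([0-9][0-9]*\);.*/select tabname, tbspace from syscat.tables where tbspaceid = \1 and tableid = \2/p'
}

# Possible use:
#   corrupt_table_query < db2diag.log | sort -u
```

Each printed line can then be passed to the db2 command line processor in the same way as the query in Listing 10.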
At the operating system (OS) level, the most common causes of corruption are hardware issues and file system corruption. For example, the db2diag.log might show the database being marked damaged with an ECORRUPT (89) error as follows:
Listing 11. Sample file system-related corruption errors
2007-05-22-13.45.52.268785-240 E20501C453 LEVEL: Error (OS)
PID : 1646696 TID : 1 PROC : db2agent (SAMPLE) 0
INSTANCE: tprod NODE : 000 DB : SAMPLE
APPHDL : 0-32 APPID: GA260B45.M505.012BC2174219
FUNCTION: DB2 UDB, oper system services, sqloopenp, probe:80
CALLED : OS, -, unspecified_system_function
OSERR : ECORRUPT (89) "Invalid file system control data detected."
You can check the following
1. Review the errpt -a output and look for hardware I/O or disk-related messages. Listing 12 is an example of errpt -a output that shows file system corruption:
Listing 12. Sample errpt output
LABEL: J2_FSCK_REQUIRED
IDENTIFIER: B6DB68E0
Date/Time: Thu Jun 7 20:59:49 DFT 2007
Sequence Number: 139206
Machine Id: 000BA256D600
Node Id: cmab
Class: O
Type: INFO
Resource Name: SYSJ2
Description
FILE SYSTEM RECOVERY REQUIRED
Probable Causes
INVALID FILE SYSTEM CONTROL DATA DETECTED
Recommended Actions
PERFORM FULL FILE SYSTEM RECOVERY USING FSCK UTILITY
OBTAIN DUMP
CHECK ERROR LOG FOR ADDITIONAL RELATED ENTRIES
Detail Data
ERROR CODE
0000 0005
JFS2 MAJOR/MINOR DEVICE NUMBER
0032 0004
CALLER
0028 8EC8
CALLER
0025 D5E4
CALLER
002B 4AC8
2. Run the fsck command on the file system where the container resides to be sure that it is sound. fsck interactively checks and repairs any file system malfunction. From the pSeries and AIX Information Center we can find the following examples of using the fsck command.
Listing 13. The fsck command
To check all the default file systems enter:
fsck
This form of the fsck command asks you for permission
before making any changes to a file system.
To check the file system /dev/hd1, enter:
fsck /dev/hd1
This checks the unmounted file system located on the /dev/hd1 device.
What to collect
You should collect the following information before opening a PMR with IBM Technical Support:
1. errpt -a
2. db2support.zip
3. fsck results
Debug memory leaks
It is important to distinguish, if possible, between a memory leak and a system-wide performance degradation due to increased demands for memory. So initially it is pertinent to check that nothing has changed in the environment that could explain increased memory usage. The rest of this section discusses how to use AIX Operating System techniques to spot, track and debug those leaks. The article does not discuss detailed DB2 tools and techniques, although there is some mention where necessary.
What is a memory leak?
A memory leak is a particular kind of unintentional memory consumption in which a program fails to release memory that it no longer needs. This condition is normally the result of a bug in a program that prevents it from freeing up memory that it no longer needs. The term is meant as a humorous misnomer, since memory is not physically lost from the computer. Rather, memory is allocated to a program, and that program subsequently loses the ability to access it due to program logic flaws.
Specifically, it is a bug in the code whereby malloc() memory allocation calls are not matched by corresponding free() calls, which leaves blocks allocated but never released. Typically this is a slow process that occurs over days or weeks, particularly if the process is left active, as is often the case. Some leaks are never noticed, particularly if the application terminates and its processes are destroyed.
Listing 14 is an example of a C code snippet that demonstrates a memory leak. In this instance, memory was available and pointed to by the variable 's', but the pointer was not saved. After this function returns, the pointer is destroyed and the allocated memory becomes unreachable, but it remains allocated.
Listing 14. Sample C code
#include <stdio.h>
#include <stdlib.h>
void f(void)
{
void* s;
s = malloc(50); /* get memory */
return; /* memory leak - see note below */
/*
* Memory was available and pointed to by s, but not saved.
* After this function returns, the pointer is destroyed,
* and the allocated memory becomes unreachable.
*
* To "fix" this code, either the f() function itself
* needs to add "free(s)" somewhere or the s needs
* to be returned from the f() and the caller of f() needs
* to do the free().
*/
}
int main(void)
{
/* this is an infinite loop calling the above function */
while (1) f(); /* Malloc will return NULL sooner or later, due to lack of memory */
return 0;
}
How to spot, track and debug memory leaks
To begin with, you should call IBM if you suspect a DB2 process is leaking memory. But how do you know that you are experiencing this situation? This section discusses some of the options.
The first option is to use the ps utility. The ps utility can be used to quickly and simply determine if a process is leaking. This example demonstrates how a particular process is growing in size:
Listing 15. Sample 'ps aux' output showing the process growing in size
ps aux:
1st iteration:
USER PID %CPU %MEM SZ RSS TTY STAT STIME TIME
COMMAND
db2inst1 225284 0.2 0.0 19468 18280 - A 11:26:06 10:34
db2logmgr
2nd iteration:
db2inst1 225284 0.1 0.0 19696 18512 - A 11:26:06 10:34
db2logmgr
3rd iteration:
db2inst1 225284 0.1 0.0 19908 18724 - A 11:26:06 10:36
db2logmgr
4th iteration:
db2inst1 225284 0.1 0.0 20116 18932 - A 11:26:06 10:36
db2logmgr
5th iteration:
db2inst1 225284 0.1 0.0 20312 19128 - A 11:26:06 10:37
db2logmgr
ps -kelf:
1st iteration:
F S UID PID PPID C PRI NI ADDR SZ WCHAN
STIME TTY TIME CMD
40001 A db2inst1 225284 254158 0 60 20 580e59400 18466
11:26:06 - 10:34 db2logmgr (***) 0
2nd iteration:
40001 A db2inst1 225284 254158 1 60 20 580e59400 18696
11:26:06 - 10:34 db2logmgr (***) 0
3rd iteration:
40001 A db2inst1 225284 254158 0 60 20 580e59400 18900
11:26:06 - 10:36 db2logmgr (***) 0
4th iteration:
40001 A db2inst1 225284 254158 0 60 20 580e59400 20106
11:26:06 - 10:36 db2logmgr (***) 0
5th iteration:
40001 A db2inst1 225284 254158 0 60 20 580e59400 20312
11:26:06 - 10:37 db2logmgr (***) 0
The SZ and RSS values in the ps aux output are the two key columns to focus on when trying to spot a potential memory leak. As you can see, both values increase with every iteration. This observation alone is not sufficient to determine the root cause, and more debugging is certainly required. Again, please raise this issue with IBM Technical Support, but what follows are some likely problem determination steps IBM will take.
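Comparing the samples by eye works for five iterations but not for fifty. As a rough sketch, assuming you have already extracted one SZ value per sample (oldest first), the helper below reports whether every sample is larger than the previous one. The function name sz_growth is hypothetical.

```shell
# sz_growth
# Reads one SZ value per line on stdin, oldest sample first, and reports
# whether each sample is strictly larger than the one before it.
sz_growth() {
    awk 'NR == 1 { first = prev = $1 + 0; steady = 1; next }
         { if ($1 + 0 <= prev) steady = 0; prev = $1 + 0 }
         END {
             if (NR > 1 && steady) print "growing: " first " -> " prev
             else print "not steadily growing"
         }'
}

# Possible use, assuming SZ is the fifth column of ps aux as in Listing 15
# and the samples were appended to a file named sz_samples.txt:
#   ps aux | awk '$2 == 225284 { print $5 }' >> sz_samples.txt
#   sz_growth < sz_samples.txt
```

Steady growth is still only a hint: a process that is legitimately caching data shows the same pattern, which is why the core-based steps that follow are needed.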
Debug using procmap and gencore
As root:
1. procmap <PID> > procmap.1
2. ps aux > ps_aux.1
3. ps -kelf > ps_kelf.1
4. gencore <PID> <file>
Sleep for a period of time, then:
5. procmap <PID> > procmap.2
6. ps aux > ps_aux.2
7. ps -kelf > ps_kelf.2
8. gencore <PID> <file>
Then repeat these steps again for another 2 or 3 iterations. Please note, on 64 bit AIX, the gencore creates very large files. Regardless of the word size, fullcore needs to be enabled. The following commands can be used to check that the environment is set up correctly:
Listing 16. The lsattr command
lsattr -El sys0| grep -i core
fullcore true Enable full CORE dump True
The limits for the instance owner also need to be set appropriately. You may well be asked to enable MALLOCDEBUG and export it to the DB2 environment. What follows is an example of this:
To start DB2 memory debugging the next time the instance is started, run:
db2set DB2MEMDBG=FFDC
To start malloc debugging the next time the instance is started, run:
export MALLOCDEBUG=log:extended,stack_depth:12
Then add MALLOCDEBUG to the DB2 registry variable DB2ENVLIST:
db2set DB2ENVLIST=MALLOCDEBUG
Then stop and restart DB2.
Once the core files have been created, you can use snapcore to bundle the core files and libraries into a pax file. An example of snapcore is as follows:
Listing 17. Sample snapcore
snapcore /home/db2inst1/sqllib/db2dump/c123456/core
/home/db2inst1/sqllib/adm/db2sysc
This creates a file with a .pax extension in /tmp/snapcore by default. The core file is useless without the executable that produced it. In this case the executable is db2sysc, not db2logmgr: db2logmgr, which was seen to be growing, is a process name, not an executable. DB2 support is then able to interrogate the core to track the DB2 malloc() allocations against free() calls.
Recover from hangs
What is a hang
A hang occurs when a process has not moved forward or changed after a period of time. This can happen if a thread or process reaches a point in its execution where it can go no further and is waiting for a response. It also occurs when the process is in a very tight loop and never completes the function.
The first step is to identify if what you are experiencing is a hang or a severe degradation. Then you need to understand what is affected, or the scope. Some simple questions can help a lot:
• Why do you think it has hung?
• Are all DB2 commands hanging?
• How long has the command been running for?
• How long does it normally run for?
Then, to assess the scope:
• Are OS commands hanging too? If the answer to this is yes, then you need to get assistance from the AIX support team.
• Are db2 connect statements affected?
• Can SQL be issued over existing connections?
• If in a DPF environment, can you issue commands against other partitions?
• Can you issue commands against other databases?
Recovery
Remember to collect the stacks before you recover. Once you have the stacks, the only choice you have is to issue db2_kill. Then check for any processes and IPC resources (shared memory, message queues and semaphores) left lying around after the kill. You may have to remove any you find manually. You could also try ipclean to remove these resources. If the IPC resources are not cleared out by ipclean or ipcrm and the processes are not removed by kill -9, then the process is most likely hung in the kernel and you need to call AIX support.
Once the instance has come down, restart it with db2start and then issue a restart db command.
What to collect
The single most important piece of information to collect is a stack trace of the process that is believed to be hung. IBM DB2 support cannot debug a hang without this, and the stack trace must be collected prior to recovering DB2. If this is not done, you may have another outage in the future.
There will be pressure to restart DB2, but you must resist. The system must be in a hung state in order to diagnose the root cause of the problem and do the necessary debugging. A restart clears the situation and you have lost the window of opportunity to make the necessary changes. More seriously, you cannot provide any confidence that it won't recur. Thus, you need to resist the pressure to restart DB2 until you have collected all the diagnostics.
The following lists contrast good problem determination (PD) and data capture with poor PD and data capture. Note that the best PD and data capture requires the fewest steps and has a better chance of determining the root cause.
Poor PD and data capture:
• Occurrence
• Detection
• Recovery
• FFDC on (requires restart)
• Restart (outage #2): schedule the outage and hope the problem does not recur before then
• Occurrence (outage #3)
• Detection
• Data Collection
• Recovery
• Diagnosis (clock ticking)
Better PD and data capture:
• Occurrence (outage #1)
• Detection
• Recovery
• FFDC on
• Occurrence (outage #2)
• Detection
• Data Collection
• Recovery
• Diagnosis (clock ticking)
Good PD and data capture:
• Occurrence (outage #1)
• Detection
• Data Collection
• Recovery
• Diagnosis (clock ticking)
Stack traces
A stack trace is a snapshot of the function calls at a particular point in time. So multiple stack traces, a few minutes apart, provide a sense of motion. There are a variety of ways to collect stack traces; the following lists are, in my opinion, the most reliable:
procstack <PID> >> pid.pstack.out
This is an AIX utility that simply dumps the stack to a file. The example appends to the file because the command is run again later and the earlier output should not be overwritten.
kill -36 <PID>
This command does not kill the process; it sends a signal that makes the process dump its stack. This creates a fully formatted trap file in the DIAGPATH area of DB2. Because it gives more information than procstack, and because of the way it works internally, it is generally more expensive, particularly if there are hundreds of processes, which is often the case.
The main focus of this article is AIX operating system tools for debugging DB2, but no discussion of hang problem determination is complete without mentioning db2pd. The following invocations can be used to generate stack traces:
db2pd -stacks (This generates stack dumps for all PIDs)
db2pd -stack <PID> (This generates a stack dump for the PID specified)
The trap file is created in the DIAGPATH area. Listing 18 shows an example of its usage:
Listing 18. db2pd -stacks usage
1. -stacks
$ db2pd -stacks
Attempting to dump all stack traces for instance.
See current DIAGPATH for trapfiles.
2. -stack
$ db2pd -stack 1454326
Attempting to dump stack trace for pid 1454326.
See current DIAGPATH for trapfile.
DB2 support will ask you to tar and compress the DIAGPATH area. Most commonly they will ask you to run a db2support command, which does it for you provided the correct flags are used. However, if you use the OS method of procstack, you have to submit the output files yourself.
truss
The truss command can be used, but it is not as effective as a stack dump and is only likely to reveal anything if the process is looping and the problem can be reproduced. If the process is hung, only a stack dump can reveal how it got there.
ps
It is also a good idea to collect ps listings for all partitions, if applicable, before and after the stack dumps. If you collect the data manually the pseudo-code looks like this:
Listing 19. procstack
procstack <PID or PIDs> >> procstack.out
ps -eafl >> pseafl.out
ps aux >> psaux.out
sleep 120
Repeat for at least 3 iterations.
Or:
kill -36 <PID or PIDs>
ps -eafl >> pseafl.out
ps aux >> psaux.out
sleep 120
Repeat for at least 3 iterations.
NB: IBM DB2 support can provide a data collect script which automates this process.
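Until you have the support script, the manual procedure in Listing 19 can be wrapped in a loop. The following is a minimal sketch, not the IBM script: the function name collect_stacks and the STACK_CMD override are assumptions made for this illustration. STACK_CMD exists only so the loop can be tried on a machine that has no procstack.

```shell
# collect_stacks PID [ITERATIONS] [INTERVAL_SECONDS]
# Appends a stack dump and two ps listings to files in the current
# directory, ITERATIONS times (default 3), INTERVAL_SECONDS apart (default 120).
# STACK_CMD defaults to the AIX procstack utility.
collect_stacks() {
    pid=$1
    iterations=${2:-3}
    interval=${3:-120}
    i=1
    while [ "$i" -le "$iterations" ]; do
        date >> procstack.out
        ${STACK_CMD:-procstack} "$pid" >> procstack.out 2>&1
        ps -eafl >> pseafl.out 2>&1
        ps aux >> psaux.out 2>&1
        # No need to wait after the final iteration.
        if [ "$i" -lt "$iterations" ]; then
            sleep "$interval"
        fi
        i=$((i + 1))
    done
}

# Possible use (the PID is only an example):
#   collect_stacks 1454326
```

The date stamp written before each dump makes it easier to line the stacks up with the ps listings afterwards.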
Investigate unresponsive applications
Sometimes applications are merely unresponsive, and you have to figure out why it is unresponsive and how to get it to respond. If you issue a force application and it does not respond, you may be left wondering what you can do. First of all, it is important to know that force makes no guarantees to force. It is simply a wrapper around an OS kill command.
Without going into the architectural details of DB2, there are some situations which are dangerous to force. As such, the db2agent sets its priority level to be higher than that of the force. Under these circumstances, force does not work, and this is by design.
The bottom line is, not every unresponsive application is caused by a bug. It is possible that the application is just doing something important and not responding to any additional commands until it completes its current task.
Recovery
Recovery almost certainly requires a db2stop and db2start, as DB2 does not take kindly to key engine processes being killed; doing so tends to invoke a panic and bring the instance down. I would assess the impact the rogue application is having and, if possible, leave it in situ until you can recycle the instance. If it is holding locks that contend with other users and this is adversely affecting the application, you may have to take an outage to remove it.
What to collect
The debugging of an unresponsive application is treated in the same way as a hang, but clearly the scope is narrower. You need to collect the following elements to send to IBM Technical Support:
- Iterative stack traces of the db2agent or DB2 process that is unresponsive.
- ps listings and other items, like: db2level, dbm cfg, db cfg, db2diag.log and possibly an application snapshot.
Conclusion
Problem determination in DB2 is made simpler because of the tools and utilities available in AIX. Often it is necessary to use both AIX and DB2 tools and commands to figure out what the problem is. This article discusses some of the problems associated with troubleshooting in DB2 and has hopefully given you the tools you need to fix your database.
1.ODM Delete Command easy Step to remove a Disk
odmdelete -o CuDv -q name=hdisk1
2.Check the status of a mksysb tape (Guessing tape drive is rmt0)
chdev -l rmt0 -a block_size=0
mt -t /dev/rmt0.1 fsf 3
lsmksysb -c -f /dev/rmt0.1
or
restore -Tvf /dev/rmt0.1 -s4
3.How to remove vpath
rmdev -Rdl dpo
4.NIM showlog command example
nim -o showlog -a full_log=yes -a log_type=nimerr 530TL4spot
5.Command to boot from network (provided maint boot enabled in the boot server)
bootlist -m normal ent0 speed=auto duplex=auto gateway=X.X.X.X bserver=X.X.X.X client=X.X.X.X
( replace x with the real IPs and speed/duplex according to your network speed settings)
6.Remove a mksysb image from NIM Server
nim -o remove -a rm_image=yes mksysbname
7.Create a image.data from mksysb image
restore -xvqf /images/mksysb.image ./image.data
8.List all ODM Definitions
odmget CuAt - to see all the attributes
odmget CuDv - to see all the devices
9.To remove a mirror copy from a LV
/usr/sbin/rmlvcopy fslv01 1 hdisk4 hdisk5
10.Creating a spot from mksysb
nim -o define -t spot -a source=mksysb1 -a server=master -a location=/export/spot spot1
11.Restore a file from mksysb image
restore -xvqf ./mksysb.image ./etc/passwd
12.Create an lpp source from existing directory
nim -o define -t lpp_source -a server=master -a location=/export/lpp_source/530TL5lpp 530TL5lpp
13.Create a spot from existing lpp source
nim -o define -t spot -a server=master -a location=/export/spot/530TL5spot -a source=530TL5lpp 530TL5spot
14.How to update a lpp source from a downloaded file sets
gencopy -X -b "-qv" -d /TMP_FOR_UPDATE_CD -t /export/lpp_source/530TL6lpp/ -f ALL 2>&1
Learn 10 good UNIX usage habits from IBM
http://www-128.ibm.com/developerworks/aix/library/au-badunixhabits.html
15.How to find a Tape is Mksysb or not
Run this command to see the list of files. If it doesn't show anything then the tape is NOT MKSYSB
chdev -l rmt0 -a block_size=0
mt -t /dev/rmt0.1 fsf 3
lsmksysb -c -f /dev/rmt0.1
or
restore -Tvf /dev/rmt0.1 -s4
16.To define a mksysb resource custimgname in NIM
nim -o define -t mksysb -a server=master -a location=/images/custimg.img custimgname
17.How to find out the Physical Location of a disk
lsdev -Cc disk -l hdisk0 -F "name location"
18.Install all software from CD
/usr/sbin/installp -aX -Y -d/dev/cd0 * all
19.Install Atape software from utility directory
/usr/sbin/installp -aX -Y -d/utility Atape*
20.To display BOS installation status information while the installation is progressing, run the following command on the master:
lsnim -a info -a Cstate ClientName
or
lsnim -l ClientName
21.To perform a base system installation on a machine venus from the NIM Server, run the following. (If you don't want any bosinst_data, script, fb_script or image_data, just leave them out of the command line.)
nim -o bos_inst -a source=rte -a spot=530ML7SP3spot -a lpp_source=530ML7SP3lpp -a bosinst_data=No_Prompt -a script=FTPSCR -a fb_script=Install_Drivers \
-a accept_licenses=yes -a preserve_res=yes -a no_client_boot=yes -a set_bootlist=no -a force_push=no venus
Or with fewer options
nim -o bos_inst -a source=rte -a spot=530ML7SP3spot -a lpp_source=530ML7SP3lpp -a bosinst_data=No_Prompt -a script=FTPSCR -a fb_script=Install_Drivers\
-a accept_licenses=yes -a no_client_boot=yes -a force_push=no venus
Now boot the client machine from the network
22.To resync a logical volume in AIX. Here is an example
Note down the LV IDENTIFIER
root@zeus lslv hd6
LOGICAL VOLUME: hd6 VOLUME GROUP: rootvg
LV IDENTIFIER: 00c8411e00004c000000011731887e00.2 PERMISSION: read/write
VG STATE: active/complete LV STATE: opened/stale
TYPE: paging WRITE VERIFY: off
MAX LPs: 512 PP SIZE: 128 megabyte(s)
COPIES: 2 SCHED POLICY: parallel
LPs: 2 PPs: 4
STALE PPs: 2 BB POLICY: non-relocatable
INTER-POLICY: minimum RELOCATABLE: yes
INTRA-POLICY: middle UPPER BOUND: 32
MOUNT POINT: N/A LABEL: None
MIRROR WRITE CONSISTENCY: off
EACH LP COPY ON A SEPARATE PV ?: yes
Serialize IO ?: NO
Now run this command on the STALE LV
root@zeus lresynclv -l 00c8411e00004c000000011731887e00.2
Alternatively, the following script resyncs all the logical volumes of all volume groups. Modify it as per your requirements. I created it for our test environment, where it has worked so far. Please try it on a test box before you use it.
lsvg|while read VG
do
lsvg -l $VG|awk '{print $1}'
done|sed -e '/LV/d' -e '/\:/d'|while read LV
do
lslv $LV|grep 'LV IDENT'|awk '{print $3}'
done|while read LVIDENT
do
lresynclv -l $LVIDENT
done
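The script above resyncs every logical volume whether or not it is stale. A narrower variant, sketched below, picks the LV IDENTIFIER out of the lslv output only when the LV STATE field reports stale. It assumes the lslv layout shown in the example above; the function name stale_lv_id is made up for this illustration.

```shell
# stale_lv_id
# Reads the output of lslv for one logical volume on stdin and prints its
# LV IDENTIFIER only if the LV STATE field contains "stale".
stale_lv_id() {
    awk '/LV IDENTIFIER:/ { id = $3 }
         /LV STATE:/ && /stale/ { stale = 1 }
         END { if (stale && id != "") print id }'
}

# Possible use on AIX:
#   id=$(lslv hd6 | stale_lv_id)
#   [ -n "$id" ] && lresynclv -l "$id"
```

As with the original script, try this on a test box first.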
23.To add all the filesets on /dev/cd0 to NIM lpp source 530TL5lpp
nim -o update -a packages=all -a source=/dev/cd0 530ML5lpp
24.To add all the filesets from a directory /utility/aixml to NIM lpp source 530TL5lpp
nim -o update -a packages=all -a source=/utility/aixml 530ML5lpp
25.How to create a NIM LPP Source from CD
To create from an existing directory use the full path of the existing directory ex. /export/lpp_source/510ML2lpp instead of /dev/cd0
gencopy -X -b "-qv" -d /dev/cd0 -t /export/lpp_source/510ML8lpp -f file 2>&1
26.How to create a lpp_source from existing directory, i.e. /export/lpp_source/520TL10SP2lpp
nim -o define -t lpp_source -a server=master -a location=/export/lpp_source/520TL10SP2lpp 520TL10SP2lpp
27. NIM Network boot problem
# smitty nim_control_boot ==>
>> Limit Boot Image Creation to Defined Interfaces?
>> It was currently set to "NO"
28.Booting a Client from NIM Server in diagnostics mode from the command line
Follow this procedure for performing the diag operation from the master and client. To perform the diag operation from the client, enter:
nimclient -o diag -a spot=SPOTName
To perform the diag operation from the master, enter:
nim -o diag -a spot=SPOTName MachineObjectName
29.How to restore a Customer non bootable mksysb image to Client M/C (Try to use the next procedure in NIM Server and try NIM load instead)
Don't try restbyname in NIM Server. It will overwrite all the NIM server data with the tape Image. Now either use tctl or mt ( Guessing we are using rmt0)
If you want to restore in client M/C itself then
tctl -f /dev/rmt0.1 rewind
tctl -f /dev/rmt0.1 fsf 3
restbyname -xqf /dev/rmt0.1
30.How to create a NIM Image from Customer mksysb Tape
On the NIM Server, find a filesystem with at least 5-6 GB of free space. I prefer a separate filesystem for this. Let us assume we are using the /export/mksysb directory, so cd to /export/mksysb and restore the image from tape for the server venus
cd /export/mksysb
chdev -l rmt0 -a block_size=0 (To make sure it can read any block size)
mt -t /dev/rmt0.1 fsf 3
dd if=/dev/rmt0.1 of=/export/mksysb/mksysb.venus bs=4m
(and use this image. Using 4m to avoid any dd buffer error.)
nim -o define -t mksysb -a server=master -a location=/export/mksysb/mksysb.venus venus_mksysb
Now initiate the MKSYSB installation for the client venus
nim -o bos_inst -a source=mksysb -a mksysb=venus_mksysb -a spot=530ML7SP3spot -a lpp_source=530ML7SP3lpp -a accept_licenses=yes -a no_client_boot=yes -a force_push=no venus
Not all of these options may be necessary. In our environment we normally allocate the mksysb and the necessary lpp_source and spot first. In our NIM definition the bosinst.data is called No_Prompt, the script is called FTPSCR and the FB script is called Install_Drivers. These are just names, but the resources do a lot more than their names suggest.
nim -o allocate -a source=mksysb -a mksysb=mksysb.venus -a lpp_source=530TL5lpp -a spot=530TL5spot -a bosinst_data=No_Prompt -a script=FTPSCR \
-a fb_script=Install_Drivers -a accept_licenses=yes -a boot_client=no venus
31.How to display NIM Machines
lsnim -c machines
32.How to display NIM networks
lsnim -c networks
33.If an NFS mount fails with the following error message -
RPC: 1832-019 Program not registered
Then run the following on the client. If both the server and the client are new, run it on both. Uncomment portmap in /etc/rc.tcpip if not already done, and
make sure rc.nfs is not commented out in /etc/inittab
stopsrc -g nfs
startsrc -s portmap
/etc/rc.nfs
Now it should mount.
34.Installing AIX when booting from a mksysb tape fails.
Try a clone load first. A clone load means booting from AIX CD1 and then recovering from tape. Alternatively, you can try the following procedure. You need to access the firmware command line prompt, which usually appears as an option in the SMS menus. At the firmware command line prompt, type the following two commands:
setenv real-base 1000000
reset-all
The system will then reboot, and you will be able to boot from tape, assuming that you have a valid boot image on your tape media.
35.Create a Filesystem using command line
mkvg -y testvg hdisk1
mklv -y testlv testvg 500 hdisk1 (500 is 500 LP )
chlv -t jfs2 testlv
crfs -v jfs -a nbpi=16384 -A yes -d testlv -p rw -m /custimg
crfs -v jfs2 -A yes -d testlv -p rw -m /custimg
or
Create a JFS2 filesystem of size 10MB on VG testvg with mount point /fs1, adding an entry to /etc/filesystems
crfs -v jfs2 -g testvg -a size=10M -m /fs1 -A yes
36.ODM command to delete network.
odmdelete -q "name=en0" -o CuAt
odmdelete -q "parent=en0" -o CuDv
odmdelete -q "name=en0" -o CuDv
odmdelete -q "name=en0" -o CuDep
odmdelete -q "dependency=en0" -o CuDep
odmdelete -q "value1=en0" -o CuDvDr
odmdelete -q "value3=en0" -o CuDvDr
odmdelete -q name=inet0 -o CuAt
37.Etherchannel problem after loading the server from Customer mksysb tape
You must remove the ODM entries before you configure the EtherChannel.
Run this for the correct network interface, e.g. en0
odmdelete -q name=en0 -o CuAt
odmdelete -q name=inet0 -o CuAt
38.How to remove a failed Disk from ODM
If you have been working with a PVID value rather than with an hdisk name,
ensure that the PVID is removed from the ODM with the following command. The
32-digit value supplied consists of the PVID plus 16 zeros. For example:
odmdelete -q value=0073659c2c6d26f10000000000000000 -o CuAt ( add 16 zeros)
To get the PVID run
lsvg -p vgname
Then run
rmlvcopy <lvname> 1 0073659c2c6d26f1 (16 Digit PVID)
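Typing 16 zeros by hand is error-prone. The helper below is a small sketch that builds the 32-digit value from a 16-digit PVID; the function name pad_pvid is invented for this example.

```shell
# pad_pvid PVID
# Appends 16 zeros to a 16-digit hexadecimal PVID, giving the 32-digit
# value used in the odmdelete query above. Rejects anything else.
pad_pvid() {
    case $1 in
        *[!0-9a-fA-F]*)
            echo "pad_pvid: not a hex PVID: $1" >&2
            return 1
            ;;
    esac
    if [ "${#1}" -ne 16 ]; then
        echo "pad_pvid: expected 16 hex digits, got ${#1}" >&2
        return 1
    fi
    printf '%s0000000000000000\n' "$1"
}

# Possible use:
#   odmdelete -q "value=$(pad_pvid 0073659c2c6d26f1)" -o CuAt
```

The length check catches a PVID that was copied with a digit missing, which would otherwise produce a query that silently matches nothing.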
39.Restoring tar backup with absolute pathname to different directory
A tar backup created using absolute path names can only be restored to the directory from which it was created. One way to restore it to a different directory is by using the pax command. For example, suppose you receive a tar tape created using absolute path names.
tar -cvf /dev/rmt0 /work/*
but want to restore it to the /test directory. The pax command would be:
pax -rf /dev/rmt0 -s/work/test/p
The -s/work/test/p does the directory change.
40.Determine the path to your system's error log file by running the following command:
/usr/lib/errdemon -l
41.To change the maximum size of the error log file enter:
/usr/lib/errdemon -s LogSize
42.To change the size of the error log device driver's internal buffer, enter:
/usr/lib/errdemon -B BufferSize
43.To list all events for which logging is currently disabled, enter:
errpt -t -F Log=0
44.To list all events for which reporting is currently disabled, enter:
errpt -t -F Report=0
45.IBM 3494 Library testing commands
mtlib -l /dev/lmcp0 -D -E
mtlib -l /dev/lmcp0 -qM
See man mtlib for more options.
46.Vpaths not created for all hdisks of an AIX host or missing vpaths for some hdisks.
In some cases a customer may notice that some hdisks are not associated with any vpaths. Or a customer may not see the expected number of vpaths for the number of hdisks that they have on their AIX host.
In either case the problem could be caused by the fact that the hdisks with no vpath association are listed in a file called /etc/vpexclude. If this file exists a customer can remove the file and run cfgmgr and the hdisks will now be associated with vpaths.
The only way that the vpexclude file can be created is if a customer runs a querysn command on the AIX host or if the customer manually edits the /etc/vpexclude file to include the hdisks.
47.Resetting the NIM state from the command line
Follow this procedure for resetting the NIM state from the command line.
To return a machine to the ready state, enter:
nim -Fo reset MachineName
To deallocate resources, enter:
nim -o deallocate -a subclass=all MachineName
48.Recovering the /etc/niminfo file from the command line
nimconfig -r
49.To list all duplicate and conflicting updates in the /myimages image source directory
/usr/lib/instl/lppmgr -d /myimages -u
50.To remove all duplicate and conflicting updates in the /myimages image source directory, type:
/usr/lib/instl/lppmgr -d /myimages -u -r
51.How to change the console to tty0 if tty0 not available
smitty devices > add a tty > tty rs232 Asynchronous Terminal > sa0 (or sa1). In the next screen, set the port to 0, the baud rate to 9600 and Enable LOGIN to enable, then press Enter. Now run smitty console and change the device from /dev/lft0 to /dev/tty0.
52.To attempt to boot through a gateway using Ethernet with duplex and speed set to auto, and then try other devices, enter the command below. bserver is the boot server, which may also be your NIM server. You must specify a gateway even if you don't have one; in that case use 0.0.0.0. client is the server you want to load from NIM.
bootlist -m normal ent0 speed=auto duplex=auto gateway=192.168.0.1 bserver=192.168.0.10 client=192.168.0.45 hdisk0 rmt0
53.ODMDELETE COMMAND TO DELETE NIM OBJECTS
Suppose you want to delete the TRYME mksysb entry, which lsnim shows as mksysb.TRYME, and you are unable to delete it the normal way.
MAKE SURE YOU BACK UP THE NIM DATABASE BEFORE THIS. READ THE LAST LINE TOO. OTHERWISE THE NIM SERVER WON'T WORK
odmget nim_attr >/tmp/nim_attr.out
vi /tmp/nim_attr.out and look for the TRYME entry
Note down the id number, for example id=1161733976
odmdelete -o nim_attr -q id=1161733976
Now delete it from nim_object
odmget nim_object >/tmp/nim_objects.out
vi that file and note down the id for TRYME
odmdelete -o nim_object -q id=1162344443
Now, from the WebSM screen or smitty nim, add the routing information back to the NIM MASTER object:
resources -> master -> properties -> nim interface (add the interface again)
54.Identifying the Origin of "core" Files
When an application core dumps, a "core" file is placed in the current directory. Core files are often a symptom of a problem that needs attention. You can determine which application caused the "core" file by going to the directory where the core file is located and running the command:
$ lquerypv -h core 6b0 64
The name of the application causing the core file is listed in the section on the right. In the sample output below, the "ftpd" application
caused the core file.
000006B0 7FFFFFFF FFFFFFFF 7FFFFFFF FFFFFFFF |................|
000006C0 00000000 000007D0 7FFFFFFF FFFFFFFF |................|
000006D0 00170000 53245A2C 00000000 00000015 |....S$Z,........|
000006E0 66747064 00000000 00000000 00000000 |ftpd............|
000006F0 00000000 00000000 00000000 00000000 |................|
00000700 00000000 00000000 00000000 000000CF |................|
00000710 00000000 00000020 00000000 000000BE |....... ........|
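The name in the right-hand column is simply the ASCII rendering of the hex words on the left: 66747064 is the four bytes f, t, p, d. The sketch below decodes such a hex string in plain shell, which can help when the right-hand column is truncated or hard to read. hex_to_ascii is a hypothetical helper, not an AIX command.

```shell
#!/bin/sh
# hex_to_ascii: decode a string of hex digit pairs into characters,
# skipping NUL (00) padding bytes. (Hypothetical helper name.)
hex_to_ascii() {
    hex=$1
    out=
    # An odd number of digits cannot be split into byte pairs.
    if [ $(( ${#hex} % 2 )) -ne 0 ]; then
        echo "odd number of hex digits: $hex" >&2
        return 1
    fi
    while [ -n "$hex" ]; do
        rest=${hex#??}          # everything after the first two digits
        pair=${hex%"$rest"}     # the first two digits
        hex=$rest
        if [ "$pair" != "00" ]; then
            # Convert the pair to octal, then let printf emit that byte.
            oct=$(printf '%03o' "0x$pair")
            out=$out$(printf "\\$oct")
        fi
    done
    printf '%s\n' "$out"
}

hex_to_ascii 6674706400000000   # prints ftpd
```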
In addition, AIX can be configured to detect when core files are created and mail a message to root, alerting root that an application has failed. The instructions for setting this up are in a README file in the /usr/samples/findcore directory. These programs are delivered with the bos.sysmgt.serv_aid fileset.
55.Extend a filesystem in AIX command line
Suppose you want to extend the /usr file system to 4GB:
chfs -a size=4G /usr
or, expressed in megabytes (4G is 4096M, so 4000M is slightly smaller):
chfs -a size=4000M /usr
To add space to the existing size, for example 2GB more:
chfs -a size=+2G /usr
You can extend the root file system the same way. Suppose the new size you want is 2GB:
chfs -a size=2G /
or
chfs -a size=2000M /
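The G and M forms above are close but not identical. The sketch below converts a chfs-style size into 512-byte blocks so that two values can be compared before running chfs. It assumes 1M is 1024*1024 bytes and 1G is 1024M, and that a size with no suffix is already a count of 512-byte blocks. size_to_blocks is a hypothetical helper name.

```shell
#!/bin/sh
# size_to_blocks: convert a chfs-style size (plain number, nM or nG)
# into 512-byte blocks. Assumes 1M = 2048 blocks and 1G = 1024M.
size_to_blocks() {
    case $1 in
        *G) echo $(( ${1%G} * 1024 * 2048 )) ;;
        *M) echo $(( ${1%M} * 2048 )) ;;
        *)  echo "$1" ;;
    esac
}

size_to_blocks 4G      # prints 8388608
size_to_blocks 4000M   # prints 8192000
```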
56.Sendmail Warning: .cf file is out of date: sendmail AIX5.3/8.13.4 supports version 10, .cf file is version 9
Solution : vi /etc/mail/sendmail.cf and change V9 to V10
57.How to erase complete data from a disk on aix 5.2 TL6 and 5.3TL4
diag -d hdiskX -T format
58.How to make IP changes permanent from command line
/usr/sbin/mktcpip -h'P550B_LP01' -a'30.3.0.120' -m'255.255.0.0' -i'en2' -g'30.3.0.120'
59.How to copy from one streaming tape to a another tape
tcopy /dev/rmt0 /dev/rmt1
60.How to check the integrity of a tape
tapechk
61.How to display all the VLAN adapters
lsdev -Cc adapter -t eth -s vlan
62.How to use BSD style network setting in AIX
Use the smit configtcp fast path and select BSD Style rc Configuration,
then configure the /etc/rc.bsdnet file using a standard text editor.
63.How to check the last fsck log of /utility filesystem
/sbin/helpers/jfs2/fscklog /utility
64.How to check the inode status of a file or inode or to check last accessed time etc
/sbin/helpers/jfs2/istat /etc/passwd
or
/sbin/helpers/jfs2/istat 40 /dev/hd4 ( to check inode 40 of /dev/hd4)
65.How to cleanup deleted ODM spaces
/usr/samples/odm/odmclean -d CuDvDr
66.How to find which fileset contains a particular binary for example ls
lslpp -w /usr/bin/ls
67.To display if the hardware is 32-bit or 64-bit, type:
bootinfo -y
68.How to change AIX OS from 32 bit kernel to 64 Bit kernel
ln -sf /usr/lib/boot/unix_64 /unix
ln -sf /usr/lib/boot/unix_64 /usr/lib/boot/unix
bosboot -ad /dev/ipldevice
shutdown -r
69.How to know if the kernel is 32-bit enabled or 64-bit enabled ?
bootinfo -K
70.How to lock and unlock a user
To unlock
chuser account_locked=false user
or
chsec -f /etc/security/user -a account_locked=false -s user
To lock
chuser account_locked=true user
or
chsec -f /etc/security/user -a account_locked=true -s user
71.How to define whether the user name should be echoed on a port
In /etc/security/login.cfg, edit the default stanza and set usernameecho = false
or
chsec -f /etc/security/login.cfg -s default -a usernameecho=false
72.How to change the password prompt for example
chsec -f /etc/security/login.cfg -s default -a pwdprompt="Enter your Password now: "
73.How to change the login prompt for telnet sessions so that it displays the quoted text
chsec -f /etc/security/login.cfg -s default -a herald="Enter your user ID now: "
74.How to suppress the login messages
touch .hushlogin
75.How to save current network parameter options for next boot
/usr/sbin/tunsave -a -F nextboot -t no
76.How to reset the failed login count of user asis
chsec -f /etc/security/lastlog -a "unsuccessful_login_count=0" -s 'asis'
77.How to restore a file from a savevg backup
/usr/bin/restorevgfiles -s -r -f'/dev/rmt0' -b'4096' -a'' /etc/passwd
78.How to preview information about a savevg backup with block size 4MB
listvgbackup -l -f'/dev/rmt0' -b'4096' -a''
79.What is the command to create VG on VPATH device
mkvg4vp
80.What is the command to add a Datapath PV to a vg
extendvg4vp
81.How to identify a PCI Slot at U1.5-P2-I8
drslot -c pci -i -s 'U1.5-P2-I8'
82.How to display all graphics adapters in a machine
lsdisp
83.How to display all Read Write Optical Device List ( Optical Jukebox)
lsdev -Cc rwoptical
84.How to add path to available Data Path Devices
/usr/sbin/addpaths
85.How to define and configure all Data path Devices
/usr/lib/methods/cfcallvpath
86.How to display all the vpath devices
lsdev -Cc disk -s dpo -t vpath
87.How to display Data Path Device Configuration
lsvpcfg
88.How to configure a defined tty
mkdev -l tty0
89.How to display the PMTU table
pmtu display
or
netstat -in
90.How to display all locked users (including system users)
usrck -l ALL (lowercase L)
91.How to generate hardware and software inventory of a server
/usr/sbin/geninv -c
or
/usr/sbin/geninv -l
92.How to display and change setting of the core files
lscore - to display settings
chcore - to change settings
93.How to search for and correct physical partitions that are stale or unable to perform I/O operations on rootvg (see the man page for more options for this command)
mirscan -v rootvg
94.How to determine the status of your system battery
diag -B -c
95.How to run diagnostics on all SCSI devices without user action
diag -S 5 -c
96.How to determine if the 64-bit kernel extension is loaded ?
genkex |grep 64
97.Restore a Backup by Name
To restore a remote backup archive by name, use the following command:
rsh remotehost "dd if=/dev/rmt0 bs=blocksize" | restore -xvqdf- pathname
98.Restore a Backup by inode
To restore a remote backup archive by inode, use the following command:
rsh remotehost "dd if=/dev/rmt0 bs=blocksize" | restore -xvqf- pathname
99.Restore a Remote cpio Archive
To restore a remote archive created with the cpio command, use the following command:
rsh remotehost "dd if=/dev/rmt0 ibs=blocksize obs=5120" | cpio -icvdumB
100.Restore a tar Archive
To restore a remote tar archive, use the following command:
rsh remotehost "dd if=/dev/rmt0 bs=blocksize" | tar -xvpf- pathname
101.Restore a Remote Dump
To restore a remote dump of the /myfs file system, use the following command:
cd /myfs
rrestore -rvf remotehost:/dev/rmt0
102.Backup by Name
To remotely create a backup archive by name, use the following command:
find pathname -print | backup -ivqf- | rsh remotehost "dd of=/dev/rmt0 bs=blocksize conv=sync"
103.To remotely create a backup archive by inode, first unmount your file system, then use the backup command. For example:
umount /myfs
backup -0 -uf- /myfs | rsh remotehost "dd of=/dev/rmt0 bs=blocksize conv=sync"
104.To create and copy an archive to the remote tape device, use the following command:
find pathname -print | cpio -ovcB | rsh remotehost "dd ibs=5120 obs=blocksize of=/dev/rmt0"
105.Create a tar Archive remotely :
tar -cvdf - pathname | rsh remotehost "dd of=/dev/rmt0 bs=blocksize conv=sync"
106.Create a Remote Dump remotely. To create a remote dump of the /myfs file system, use the following command:
rdump -u -0 -f remotehost:/dev/rmt0 /myfs
107.How to compare two directories
dircmp /dir1 /dir2
108.How to identify if a file is sparsely-allocated, for ex. /etc/passwd.
fileplace -v /etc/passwd
109.How to display the placement of file blocks within logical or physical volumes
fileplace -v /usr/bin/ls
fileplace -p /usr/bin/ls ( Will display the PV it resides in)
110.How to verify the list of bootable PVs :
ipl_varyon -i
111.How to display the filesystems in a volume group
lsvgfs rootvg
112.How to display the jfs/jfs2 file systems, run
lsjfs
or
lsjfs2
113.How to clean up a failed software installation
installp -C
114.How to unlock a rootvg
putlvodm -K `getlvodm -v rootvg`
115.How to run 64BIT application on 32 bit kernel
Smitty -> System Environments ->Enable 64bit Application environment
or
/etc/methods/cfg64
and run the following command
mkitab "load64bit:2:wait:/etc/methods/cfg64 >/dev/console 2>&1 # Enable 64-bit execs"
116.How to make AIX reply to broadcast pings
no -o bcastping=1
How to output the status of the VGDA of rootvg on hdisk0
lqueryvg -g `getlvodm -v rootvg` -At -p hdisk0
117.How to change a user's attribute, such as the minimum password length
chsec -f /etc/security/user -s sid -a minlen=8
or
chuser minlen=8 sid
118.How to determine the tape block size
Use the dd command to read a single block from the device and find out what block size is used for the archive:
dd if=/dev/rmt0 bs=128k count=1 | wc -c
This will return to you the size in bytes of the block being read. Assuming that your backup was made with the
same physical block size, you can change your device to use this block size.
or
Use the tcopy command as follows to find out the block size:
# tcopy /dev/rmt0
tcopy : Tape File: 1; Records: 1 to 7179 ; size:512
tcopy : Tape File: 1; End of file after :7179 records; 3675648 bytes
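The tcopy summary also lets you cross-check the block size: dividing the total bytes by the number of records should give the reported record size. A minimal sketch of that arithmetic, using the figures from the sample output above (block_size is a hypothetical helper name):

```shell
#!/bin/sh
# block_size: bytes per record = total bytes / number of records.
block_size() {
    total_bytes=$1
    records=$2
    if [ "$records" -le 0 ]; then
        echo "record count must be positive" >&2
        return 1
    fi
    echo $(( total_bytes / records ))
}

block_size 3675648 7179   # prints 512
```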
119.How to mirror a terminal
portmir -t pts/0 ( To start)
portmir -o (To stop)
120.How to restart inetd
refresh -s inetd
How to identify the current run level at the command line:
# cat /etc/.init.state
2
or
who -r
121.How to display the names of the files added to the system during installation of a specified fileset, for example openssh
lslpp -f openssh.base.server
122.How to list all the software on a CD-ROM (to list a directory instead, use its path)
installp -L -d /dev/cd0
123.How to resize the VG size after increasing the lun sizes on Fast-T
chvg -g vgname
How to check the LVCB data
getlvcb -AT
How to find the latest service pack inside a SPOT
nim -o fix_query 530TL8spot |grep SP
124.How to configure an STK L700 library with AIX for Veritas NetBackup 6.x
You need to know two things first:
1. Which fcs adapter the fibre robotic device is zoned to, for example fcs0 or fcs1.
2. The FCID of the robot, which you can find on the fibre switch in the zone or by running fcstat. It will look like 0x242DB1.
Now run:
1) /usr/openv/volmgr/bin/driver/install_ovpass
2) mkdev -c media_changer -t ovpass -s fcp -p fscsi0 -w 0x0242DB1,0
(fscsi0 if connected to fcs0, fscsi1 if fcs1; the FCID from the fibre switch, followed by ,0)
3) /usr/openv/volmgr/bin/scsi_command -d /dev/ovpass0 -inquiry (will show the robot)
4) /usr/openv/volmgr/scan will give you details of the robot if it was added correctly
Then run the NetBackup admin GUI:
/usr/openv/netbackup/bin/jnbSA&
Discover everything from the main menu wizard. Don't go to device robot. Most of the time Veritas discovers devices, including robots, correctly.
125.How to find the devices in a predefined subclass
lsdev -P -H
then run
lsdev -Cc disk -Fname -sscsi - for scsi disks
lsdev -Cc cdrom -Fname -sscsi - for scsi cdrom
lsdev -Cc disk -Fname -sfcp - for fiber disks
lsdev -Cc tape -Fname -sfcp - for fiber tapes
How to find the system id number of AIX Server
lsattr -El sys0 -a systemid
or
uname -u
126.How to check and repair two file systems simultaneously on different drives
(from dfsck man page from AIX Server)
dfsck -p /dev/hd1 - -p /dev/hd7
How to fix a SAN disk issue after recovering a mksysb to different hardware connected to different SAN disks
Scenario:
In a recent disaster recovery scenario I had to recover two LPARs to different hardware. The mksysb images were created on two LPARs connected to EMC storage, and I was recovering them to two different machines connected to an IBM Shark. Both run AIX 5.3 TL10. After the recovery I found that one LPAR could see the Shark SAN disks, but they showed as Defined, and the other LPAR showed no SAN disks at all.
Solution I used: First I ran lslpp -l |egrep 'emc|ibm2105|sdd' and found that the mksysb contained the CLARiiON drivers, PowerPath, ibm2105 and so on. As we were not using CLARiiON or EMC disks, I was free to remove those packages. So I ran installp -u EMC* to remove all the EMC software, then uninstalled the ibm2105 packages the same way. The second LPAR then showed the Shark disks automatically, but they showed as Defined. Next I ran
rmdev -rdl fscsi0 and rmdev -rdl fscsi1 ( as SAN disks are connected to fcs0 & fcs1)
After that I ran cfgmgr -vl fcs0 and cfgmgr -vl fcs1, and all the disks came up as MPIO devices in the Available state. If I wanted the vpath software, I would download the latest SDD drivers and ibm2105.rte from the IBM website and install them.
If you want to learn how to install AIX 5L, here is a link from IBM. I think this is one of the best documents, covering almost everything about AIX installation.
http://www-128.ibm.com/developerworks/aix/library/au-install-aix.html
Introduction
There are many scenarios where the troubleshooting of DB2 issues can involve and benefit from gathering operating system level data and analyzing it to understand the issues further.
This article discusses a number of problems you may face with your database, including CPU usage problems, orphan processes, database corruption, memory leaks, hangs and unresponsive applications.
It explains some AIX utilities and commands to help you understand and resolve each of these troublesome issues. The data you collect from running these commands can be sent to the IBM Technical Support Team when opening a problem management request (PMR) in order to expedite the PMR support process. The end of each section of this article discusses the documents you should gather to send to the Technical Support Team. While this article gives troubleshooting tips to use as a guideline, you should contact the IBM Technical Support Team for official advice about these problems.
1.Monitor CPU usage
In working with your database, you might notice a certain DB2 process consuming a large amount of CPU time. This section describes some AIX utilities and commands which you can use either to analyse the issue yourself or to gather data before submitting a PMR to IBM Technical Support:
2.Through ps Command:
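A ps command reveals the current status of active processes. You can use
ps auxw | sort -r +3 | head -10
to sort the process list and show the top 10 entries. Note that +3 starts the sort key at the fourth column (%MEM), which is why Listing 1 is ordered by memory use; to rank by %CPU, sort on the third column instead (sort -rn -k 3). Listing 1 shows the ps output: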
Listing 1. Sample ps output
root@mavrickit $ ps auxw|sort -r +3|head -10
USER PID %CPU %MEM SZ RSS TTY STAT STIME TIME COMMAND
scot 1658958 0.1 9.0 218016 214804 - A Sep 13 38:16 db2agent (idle) 0
dpf 1036486 0.0 1.0 14376 14068 - A Sep 17 3:10 db2hmon 0
scot 1822932 0.0 1.0 12196 11608 - A Sep 12 6:41 db2hmon 0
dpf 1011760 0.0 0.0 9264 9060 - A Sep 17 3:03 db2hmon 3
dpf 1532116 0.0 0.0 9264 9020 - A Sep 17 3:04 db2hmon 2
dpf 786672 0.0 0.0 9264 8984 - A Sep 17 3:02 db2hmon 5
dpf 1077470 0.0 0.0 9264 8968 - A Sep 17 3:03 db2hmon 1
dpf 1269798 0.0 0.0 9248 9044 - A Sep 17 2:50 db2hmon 4
db2inst1 454756 0.0 0.0 9012 7120 - A Jul 19 0:52 db2sysc 0
3.Through topas Command
When executing a ps -ef command, you see the CPU usage of a certain process. You can also use the topas command to get further details. Similar to the ps command, a topas command retrieves selected statistics about the activity on the local system. Listing 2 is a sample topas output that shows a DB2 process consuming 33.3% CPU. You can use the topas output to get specific information such as the process id, the CPU usage and the instance owner who started the process. It is normal to see several db2sysc processes for a single instance owner. DB2 processes are renamed depending on the utility being used to list process information:
Listing 2. Sample topas output
Name PID CPU% PgSp Owner
db2sysc 105428 33.3 11.7 udbtest
db2sysc 38994 14.0 11.9 udbtest
test 14480 1.4 0.0 root
db2sysc 36348 0.8 1.6 udbtest
db2sysc 116978 0.5 1.6 udbtest
db2sysc 120548 0.5 1.5 udbtest
sharon 30318 0.3 0.5 root
lrud 9030 0.3 0.0 root
db2sysc 130252 0.3 1.6 udbtest
db2sysc 130936 0.3 1.6 udbtest
topas 120598 0.3 3.0 udbtest
db2sysc 62248 0.2 1.6 udbtest
db2sysc 83970 0.2 1.6 udbtest
db2sysc 113870 0.2 1.7 root
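Because one instance owner usually has several db2sysc processes, it can help to total their CPU% rather than read them one by one. A small awk sketch, assuming topas-style lines with the process name in column 1 and CPU% in column 3 as in Listing 2 (sum_cpu is a hypothetical helper, and the input is a shortened copy of the listing):

```shell
#!/bin/sh
# sum_cpu: total the CPU% column (field 3) for one process name
# (field 1) in "Name PID CPU% PgSp Owner" lines read from stdin.
sum_cpu() {
    awk -v name="$1" '$1 == name { total += $3 }
                      END { printf "%.1f\n", total }'
}

sum_cpu db2sysc <<'EOF'
Name PID CPU% PgSp Owner
db2sysc 105428 33.3 11.7 udbtest
db2sysc 38994 14.0 11.9 udbtest
test 14480 1.4 0.0 root
db2sysc 36348 0.8 1.6 udbtest
EOF
# prints 48.1
```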
Through vmstat Command
The vmstat command can be used to monitor CPU utilization; you can get details on the amount of user CPU utilization as well as system CPU usage. Listing 3 shows the output from a vmstat command:
Listing 3. Sample vmstat output
kthr memory page faults cpu
----- ----------- ------------------------ ------------ -----------
r b avm fre re pi po fr sr cy in sy cs us sy id wa
32 3 1673185 44373 0 0 0 0 0 0 4009 60051 9744 62 38 0 0
24 0 1673442 44296 0 0 0 0 0 0 4237 63775 9214 67 33 0 0
30 3 1678417 39478 0 0 0 0 0 0 3955 70833 8457 69 31 0 0
33 1 1677126 40816 0 0 0 0 0 0 4101 68745 8336 68 31 0 0
28 0 1678606 39183 0 0 0 0 0 0 4525 75183 8708 63 37 0 0
35 1 1676959 40793 0 0 0 0 0 0 4085 70195 9271 72 28 0 0
23 0 1671318 46504 0 0 0 0 0 0 4780 68416 9360 64 36 0 0
30 0 1677740 40178 0 0 0 0 0 0 4326 58747 9201 66 34 0 0
30 1 1683402 34425 0 0 0 0 0 0 4419 76528 10042 60 40 0 0
0 0 1684160 33808 0 0 0 0 0 0 4186 72187 9661 73 27 0 0
When reading vmstat output like the above, you can ignore the first line. The important columns to look at are us, sy, id and wa, where:
id: Time spent idle.
wa: Time spent waiting for I/O.
us: Time spent running non-kernel code. (user time)
sy: Time spent running kernel code. (system time)
In Listing 3, the system is averaging roughly 66% user CPU usage and 34% system CPU usage. The pi and po values are 0, so there are no paging issues. The wa column is also 0, so there do not seem to be any I/O issues.
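Averages like these are easy to compute from saved vmstat output instead of estimating them by eye. The sketch below assumes the 17-column layout shown in Listing 3, where us, sy, id and wa are fields 14 to 17. vmstat_avg is a hypothetical helper, and it does not drop the first data line, so remove that line yourself if it is the since-boot summary.

```shell
#!/bin/sh
# vmstat_avg: average the us, sy, id and wa columns (fields 14-17)
# of vmstat data lines read from stdin. Header and separator lines
# are skipped because their first field is not a number.
vmstat_avg() {
    awk '$1 ~ /^[0-9]+$/ && NF >= 17 {
             us += $14; sy += $15; id += $16; wa += $17; n++
         }
         END {
             if (n == 0) exit 1
             printf "us=%.1f sy=%.1f id=%.1f wa=%.1f\n", us/n, sy/n, id/n, wa/n
         }'
}

vmstat_avg <<'EOF'
r b avm fre re pi po fr sr cy in sy cs us sy id wa
32 3 1673185 44373 0 0 0 0 0 0 4009 60051 9744 62 38 0 0
24 0 1673442 44296 0 0 0 0 0 0 4237 63775 9214 67 33 0 0
30 3 1678417 39478 0 0 0 0 0 0 3955 70833 8457 69 31 0 0
EOF
# prints us=66.0 sy=34.0 id=0.0 wa=0.0
```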
Listing 4 shows the wa (waiting on I/O) to be unusually high and this indicates there might be I/O bottlenecks on the system which in turn causes the CPU usage to be inefficient. You can check errpt -a output to see if there are any reported issues with the media or I/O on the system.
Listing 4. Sample vmstat output showing I/O issues
kthr memory page faults cpu
----- ----------- ------------------------ ------------ -----------
r b avm fre re pi po fr sr cy in sy cs us sy id wa
2 8 495803 3344 0 0 0 929 1689 0 998 6066 1832 4 3 76 16
0 30 495807 3340 0 0 0 0 0 0 1093 4697 1326 0 2 0 98
0 30 495807 3340 0 0 0 0 0 0 1055 2291 1289 0 1 0 99
0 30 495807 3676 0 2 0 376 656 0 1128 6803 2210 1 2 0 97
0 29 495807 3292 0 1 3 2266 3219 0 1921 8089 2528 14 4 0 82
1 29 495810 3226 0 1 0 5427 7572 0 3175 16788 4257 37 11 0 52
4 24 495810 3247 0 3 0 6830 10018 0 2483 10691 2498 40 7 0 53
4 25 495810 3247 0 0 0 3969 6752 0 1900 14037 1960 33 5 1 61
2 26 495810 3262 0 2 0 5558 9587 0 2162 10629 2695 50 8 0 42
3 22 495810 3245 0 1 0 4084 7547 0 1894 10866 1970 53 17 0 30
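To quantify how often the system is stuck waiting on I/O, you can count the samples whose wa value crosses a threshold. In this sketch the default threshold of 25 is an arbitrary illustration, not an IBM-recommended limit, and high_wa is a hypothetical helper. The input is the first three data lines of Listing 4.

```shell
#!/bin/sh
# high_wa: count vmstat samples whose wa column (field 17) is at or
# above a threshold (default 25, chosen only for illustration).
high_wa() {
    awk -v limit="${1:-25}" '
        $1 ~ /^[0-9]+$/ && NF >= 17 { n++; if ($17 >= limit) hits++ }
        END { printf "%d of %d samples at or above %d%% wa\n", hits, n, limit }'
}

high_wa 25 <<'EOF'
2 8 495803 3344 0 0 0 929 1689 0 998 6066 1832 4 3 76 16
0 30 495807 3340 0 0 0 0 0 0 1093 4697 1326 0 2 0 98
0 30 495807 3340 0 0 0 0 0 0 1055 2291 1289 0 1 0 99
EOF
# prints 2 of 3 samples at or above 25% wa
```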
4.Through iostat Command
An iostat command quickly tells you if your system has a disk I/O-bound performance problem. Listing 5 is an example of an iostat command output:
Listing 5. Sample iostat output
System configuration: lcpu=4 disk=331
tty: tin tout avg-cpu: % user % sys % idle % iowait
0.0 724.0 17.9 12.3 0.0 69.7
Disks: % tm_act Kbps tps Kb_read Kb_wrtn
hdisk119 100.0 5159.2 394.4 1560 24236
hdisk115 100.0 5129.6 393.0 1656 23992
hdiskpower26 100.0 10288.8 790.8 3216 48228
%tm_act : The percentage of time that the physical disk was active (busy servicing requests).
Kbps : The amount of data transferred to or from the drive, in kilobytes per second.
tps : The number of transfers per second issued to the physical disk.
Kb_read : The total data (in kilobytes) read from the physical volume during the measured interval.
Kb_wrtn : The total data (in kilobytes) written to the physical volume during the measured interval.
To check if you are experiencing resource contention, you can focus on the %tm_act value from the above output. An increase in this value, especially more than 40%, implies that processes are waiting for I/O to complete, and you have an I/O issue on your hands. Checking which hard disk has higher disk activity percentage and whether DB2 uses those hard disks gives you a better idea if these two factors are related.
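With hundreds of disks in the report, filtering on %tm_act is quicker than scanning by eye. A sketch that prints only the disks above a limit, assuming the disk name is in column 1 and %tm_act in column 2 as in Listing 5. busy_disks is a hypothetical helper, and the hdisk7 line is invented to show a disk below the limit.

```shell
#!/bin/sh
# busy_disks: print the name and %tm_act (field 2) of every disk
# whose %tm_act exceeds a limit (default 40).
busy_disks() {
    awk -v limit="${1:-40}" '
        $1 ~ /^(hdisk|vpath)/ && $2 + 0 > limit { print $1, $2 }'
}

busy_disks 40 <<'EOF'
Disks: % tm_act Kbps tps Kb_read Kb_wrtn
hdisk119 100.0 5159.2 394.4 1560 24236
hdisk7 12.5 80.0 10.0 100 60
hdiskpower26 100.0 10288.8 790.8 3216 48228
EOF
# prints:
# hdisk119 100.0
# hdiskpower26 100.0
```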
What to collect
You should collect the following information before opening a PMR with IBM Technical Support:
• db2support.zip
• of high cpu process
• of high cpu process
Technical support might also send you the db2service.perf1 script which basically collects data repeatedly over a period of time. The output of the script needs to be bundled and sent back to the support team for their further analysis.
35.Troubleshoot orphan processes
There are scenarios when, even after doing a db2stop, you notice (by doing a ps -ef | grep db2) certain DB2 processes, such as the db2fmp process, still running and consuming resources. If there was an abnormal shutdown, it is advisable to run ipclean after the instance has been stopped. Doing a db2stop should inherently shut down all DB2-related processes; however, if an application using those processes was abnormally terminated, the related DB2 processes might become orphan processes.
Orphan DB2 processes are those which are not attached or linked to any other DB2 processes. Abnormal termination of an application includes shutting it down by doing a Ctrl+C, closing the KSH session or killing it with a -9 option.
One way of confirming that the process is orphaned, is to try and match the process ID (PID) of the orphaned process from the ps -ef output with the Coordinator column of the db2 list applications show detail output. If the PID cannot be found in the db2 list apps output, then it is an orphan process. For example, if you issue a db2 list applications show detail command, you get this output:
Listing 6. Sample list applications output
CONNECT Auth Id : JDE
Application Name : test.exe
Appl. Handle : 2079
Application Id : AC1C5C38.G80D.011F44162421
Seq# : 0001
Number of Agents : 1
Coordinating DB partition number : 0
Coordinator pid/thread : 2068646
Status : UOW Waiting
Status Change Time : 04/04/2006 09:25:17.036230
DB Name : PTPROD
DB Path : /db2pd/otprod/ptprod/otprod/NODE0000/SQL00001/
--NOTICE PID 2068646. This is the PID on the local server.
Part of the ps -ef output from the server:
ps -ef |grep 2068646
otprod 2068646 483566 0 09:06:28 - 0:59 db2agent (PTPROD) 0
This output shows the process with PID of 2068646 is not an orphaned process and is still attached to a DB2 process.
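The comparison described above can be scripted once you have the coordinator PIDs in hand. This sketch only shows the matching step: it takes a PID and a list of coordinator PIDs as plain arguments and does not call db2 or ps itself. is_orphan is a hypothetical helper, and 999999 is a made-up PID.

```shell
#!/bin/sh
# is_orphan: succeed (exit 0) when a PID is absent from the list of
# coordinator PIDs taken from "db2 list applications show detail".
is_orphan() {
    pid=$1
    shift
    for coordinator in "$@"; do
        if [ "$coordinator" = "$pid" ]; then
            return 1    # attached to an application, not an orphan
        fi
    done
    return 0
}

# 2068646 is the coordinator PID from Listing 6; 999999 is made up.
for p in 2068646 999999; do
    if is_orphan "$p" 2068646; then
        echo "$p: possible orphan"
    else
        echo "$p: attached to an application"
    fi
done
# prints:
# 2068646: attached to an application
# 999999: possible orphan
```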
In order to avoid orphan processes, you may want to do the following: Make normal, clean exits at the client side so that DB2 is aware and can clean up resources on the server. Tweak values of TCPKEEPIDLE time to a number less than the default, and tune the DB2CHECKCLIENTINTERVAL and KEEPALIVE values.
6.What to collect
If you do notice orphan processes and wish to investigate this issue, you should collect the following information before opening a PMR with IBM Technical Support:
- ps -ef | grep db2 output
-db2support.zip with -c option
- A callstack of the process that is collected using dbx, db2pd -stack or kill -36
Listing 7. The dbx command
dbx -a PID (PID is the process ID of the process to attach to)
At the dbx prompt type
th --- Displays all threads for the process
th info --- Displays additional info about the threads
where --- Get stack trace for thread 1
th current 1 --- Makes t1 current
where --- Displays stack for thread 1
th current 2 --- Makes thread 2 current
where --- Displays stack for thread 2.
... continue for all threads of the process
detach --- Detach from process
dbx -a
7.Detect database corruption
You can start to investigate whether the database is corrupted if a user complains of not being able to access certain database objects or is unable to connect to a specific database partition. The following section highlights some of the errors that are logged by DB2 and how you can ensure that there are no operating system (OS) level issues affecting or causing DB2 database corruption. You might notice errors similar to the one in Listing 8 being logged in the db2diag.log:
Listing 8. Corruption errors
RETCODE : ZRC=0x87040001=-2029780991=SQLD_BADPAGE "Bad Data Page"
DIA8500C A data file error has occurred, record id is "".
Or
RETCODE: ZRC=0x86020019=-2046689255=SQLB_CSUM "Bad Page, Checksum Error"
DIA8426C A invalid page checksum was found for page "".
Or
2007-07-09-11.29.45.696176+120 I16992C16377 LEVEL: Severe
PID : 68098 TID : 1 PROC : db2agent (sample)
INSTANCE: instest NODE : 000 DB : sample
APPHDL : 0-635 APPID: *LOCAL.instest.070709082609
FUNCTION: DB2 UDB, buffer pool services, sqlbcres, probe:20
MESSAGE : Important: CBIT Error
DATA #1 : Hexdump, 4096 bytes
These errors are logged when DB2 tries to access data in a container and there is some form of corruption. In such an instance when DB2 cannot access the data, the database might be marked as bad. You can narrow down where there might be possible corruption. In the db2diag.log, look for messages similar to the following:
Listing 9. Corruption errors showing database object details
2006-04-15-03.15.37.271601-360 I235258C487 LEVEL: Error
PID : 152482 TID : 1 PROC : db2reorg (SAMPLE) 0
INSTANCE: instest NODE : 000 DB : SAMPLE
APPHDL : 0-68 APPID: *LOCAL.SAMPLE.060415091532
FUNCTION: DB2 UDB, buffer pool services, sqlbrdpg, probe:1146
DATA #1 : String, 124 bytes
Obj={pool:5;obj:517;type:0} State=x27 Parent={5;517}, EM=55456,
PP0=55488 Page=55520 Cont=0 Offset=55552 BlkSize=12
BadPage
The above errors indicate corruption has occurred in tablespace:5 and tableid:517. To check which table this refers to, execute the following SQL query:
Listing 10. Query to find a table with corruption
db2 "select tabname, tbspace from syscat.tables where tbspaceid = 5 and tableid = 517"
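When many such messages appear, the pool and obj numbers can be pulled out of the Obj={...} string automatically and dropped into the query. A sed sketch, assuming the field order pool, obj, type shown in Listing 9 (parse_obj is a hypothetical helper):

```shell
#!/bin/sh
# parse_obj: extract the pool (table space ID) and obj (table ID)
# numbers from a db2diag.log "Obj={pool:N;obj:M;...}" string.
parse_obj() {
    printf '%s\n' "$1" |
        sed -n 's/.*Obj={pool:\([0-9][0-9]*\);obj:\([0-9][0-9]*\);.*/tbspaceid=\1 tableid=\2/p'
}

parse_obj 'Obj={pool:5;obj:517;type:0} State=x27 Parent={5;517}, EM=55456,'
# prints tbspaceid=5 tableid=517
```

Lines that do not contain an Obj={pool:...;obj:...;} string produce no output.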
On the operating system (OS) level, the most common causes of corruption are hardware issues or file system corruption. For example, the db2diag.log might show the database being marked damaged with an ECORRUPT (89) error, as follows:
Listing 11. Sample file system-related corruption errors
2007-05-22-13.45.52.268785-240 E20501C453 LEVEL: Error (OS)
PID : 1646696 TID : 1 PROC : db2agent (SAMPLE) 0
INSTANCE: tprod NODE : 000 DB : SAMPLE
APPHDL : 0-32 APPID: GA260B45.M505.012BC2174219
FUNCTION: DB2 UDB, oper system services, sqloopenp, probe:80
CALLED : OS, -, unspecified_system_function
OSERR : ECORRUPT (89) "Invalid file system control data detected."
8.You can check the following
1. Review the errpt -a output and look for hardware I/O or disk-related messages. Listing 12 is an example of an errpt -a output which shows a file system corruption:
Listing 12. Sample errpt output
LABEL: J2_FSCK_REQUIRED
IDENTIFIER: B6DB68E0
Date/Time: Thu Jun 7 20:59:49 DFT 2007
Sequence Number: 139206
Machine Id: 000BA256D600
Node Id: cmab
Class: O
Type: INFO
Resource Name: SYSJ2
Description
FILE SYSTEM RECOVERY REQUIRED
Probable Causes
INVALID FILE SYSTEM CONTROL DATA DETECTED
Recommended Actions
PERFORM FULL FILE SYSTEM RECOVERY USING FSCK UTILITY
OBTAIN DUMP
CHECK ERROR LOG FOR ADDITIONAL RELATED ENTRIES
Detail Data
ERROR CODE
0000 0005
JFS2 MAJOR/MINOR DEVICE NUMBER
0032 0004
CALLER
0028 8EC8
CALLER
0025 D5E4
CALLER
002B 4AC8
2. Run the fsck command on the file system where the container resides to be sure that it is sound. fsck interactively checks and repairs any file system malfunction. From the pSeries and AIX Information Center we can find the following examples of using the fsck command.
Listing 13. The fsck command
To check all the default file systems enter:
fsck
This form of the fsck command asks you for permission
before making any changes to a file system.
To check the file system /dev/hd1, enter:
fsck /dev/hd1
This checks the unmounted file system located on the /dev/hd1 device.
9.What to collect
You should collect the following information before opening a PMR with IBM Technical Support:
1. errpt -a
2. db2support.zip
3. fsck results
10.Debug memory leaks
It is important to distinguish, if possible, between a memory leak and a system-wide performance degradation due to increased demands for memory. So initially it is pertinent to check that nothing has changed in the environment that could explain increased memory usage. The rest of this section discusses how to use AIX Operating System techniques to spot, track and debug those leaks. The article does not discuss detailed DB2 tools and techniques, although there is some mention where necessary.
11.What is a memory leak?
A memory leak is a particular kind of unintentional memory consumption in which a program fails to release memory that it no longer needs. This condition is normally the result of a bug in the program. The term is meant as a humorous misnomer, since memory is not physically lost from the computer. Rather, memory is allocated to a program, and that program subsequently loses the ability to access it due to program logic flaws.
Specifically, it is a bug in the code whereby malloc() memory allocation calls are not matched by corresponding free() calls, which leaves unfreed blocks behind. Typically this is a slow process that occurs over days or weeks, particularly if the process is left active, as is often the case. Some leaks are never detected, particularly if the application terminates and its processes are destroyed.
Listing 14 is an example of a C code snippet that demonstrates a memory leak. In this instance, memory was allocated and pointed to by the variable 's', but the pointer was not saved. After the function returns, the pointer is destroyed and the allocated memory becomes unreachable, but it remains allocated.
Listing 14. Sample c code
#include <stdio.h>
#include <stdlib.h>
void f(void)
{
void* s;
s = malloc(50); /* get memory */
return; /* memory leak - see note below */
/*
* Memory was available and pointed to by s, but not saved.
* After this function returns, the pointer is destroyed,
* and the allocated memory becomes unreachable.
*
* To "fix" this code, either the f() function itself
* needs to add "free(s)" somewhere or the s needs
* to be returned from the f() and the caller of f() needs
* to do the free().
*/
}
int main(void)
{
/* this is an infinite loop calling the above function */
while (1) f(); /* Malloc will return NULL sooner or later, due to lack of memory */
return 0;
}
12.How to spot, track and debug memory leaks
To begin with, you should call IBM if you suspect a DB2 process is leaking memory. But how do you know that you are experiencing this situation? This section discusses some of the options.
The first option is to use the ps utility. The ps utility can be used to quickly and simply determine if a process is leaking. This example demonstrates how a particular process is growing in size:
Listing 15. Sample 'ps aux' output showing the process growing in size
ps aux:
1st iteration:
USER     PID    %CPU %MEM SZ    RSS   TTY STAT STIME    TIME  COMMAND
db2inst1 225284 0.2  0.0  19468 18280 -   A    11:26:06 10:34 db2logmgr
2nd iteration:
db2inst1 225284 0.1  0.0  19696 18512 -   A    11:26:06 10:34 db2logmgr
3rd iteration:
db2inst1 225284 0.1  0.0  19908 18724 -   A    11:26:06 10:36 db2logmgr
4th iteration:
db2inst1 225284 0.1  0.0  20116 18932 -   A    11:26:06 10:36 db2logmgr
5th iteration:
db2inst1 225284 0.1  0.0  20312 19128 -   A    11:26:06 10:37 db2logmgr
ps -kelf:
1st iteration:
F     S UID      PID    PPID   C PRI NI ADDR      SZ    WCHAN STIME    TTY TIME  CMD
40001 A db2inst1 225284 254158 0 60  20 580e59400 18466       11:26:06 -   10:34 db2logmgr (***) 0
2nd iteration:
40001 A db2inst1 225284 254158 1 60  20 580e59400 18696       11:26:06 -   10:34 db2logmgr (***) 0
3rd iteration:
40001 A db2inst1 225284 254158 0 60  20 580e59400 18900       11:26:06 -   10:36 db2logmgr (***) 0
4th iteration:
40001 A db2inst1 225284 254158 0 60  20 580e59400 20106       11:26:06 -   10:36 db2logmgr (***) 0
5th iteration:
40001 A db2inst1 225284 254158 0 60  20 580e59400 20312       11:26:06 -   10:37 db2logmgr (***) 0
The SZ and RSS values in the ps aux output are the two key columns to focus on when trying to spot a potential memory leak. As you can see, both values increase with every iteration. This is not sufficient, however, to determine the root cause, and more debugging is certainly required. Again, please raise this issue with IBM Technical Support; what follows are some likely problem determination steps IBM will take.
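The comparison of SZ values across iterations can also be scripted. The following is a minimal sketch, assuming the repeated ps aux captures are available as plain text with one process per line and SZ in the fifth column, as in Listing 15. The function names sz_samples and always_growing are illustrative and are not part of AIX or DB2.

```shell
# Print the SZ column (field 5 of "ps aux") for one PID.
# Usage: sz_samples PID [FILE...]   (reads stdin when no file is given)
sz_samples() {
    pid=$1
    shift
    awk -v pid="$pid" '$2 == pid { print $5 }' "$@"
}

# Read one sample per line on stdin, oldest first.
# Exit status 0 only if there are at least two samples and every
# sample is strictly larger than the one before it.
always_growing() {
    awk '
        NR > 1 && $1 + 0 <= prev { flat = 1 }
        { prev = $1 + 0 }
        END { exit ((NR < 2 || flat) ? 1 : 0) }
    '
}
```

For example, sz_samples 225284 ps_aux.1 ps_aux.2 ps_aux.3 | always_growing && echo "SZ grew in every capture". Steady growth is only a hint of a leak, not proof of one.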
13.Debug using procmap and gencore
As root:
1. procmap
2. ps aux > ps_aux.1
3. ps -kelf > ps_kelf.1
4. gencore
1. procmap
2. ps aux > ps_aux.2
3. ps -kelf > ps_kelf.2
4. gencore
Then repeat these steps for another 2 or 3 iterations. Please note that on 64-bit AIX, gencore creates very large files. Regardless of the word size, fullcore needs to be enabled. The following command can be used to check that the environment is set up correctly:
Listing 16. The lsattr command
lsattr -El sys0| grep -i core
fullcore true Enable full CORE dump True
The limits for the instance owner need to be set appropriately too. You may well be asked to enable MALLOCDEBUG and export it to the DB2 environment. What follows is an example of this:
To start DB2 memory debugging the next time the instance is started, run: db2set DB2MEMDBG=FFDC
To start malloc debugging the next time the instance is started, run: export MALLOCDEBUG=log:extended,stack_depth:12
Then append MALLOCDEBUG to the DB2 registry variable DB2ENVLIST:
db2set DB2ENVLIST=MALLOCDEBUG
Then stop and restart DB2.
Once the core files have been created, you can use snapcore to bundle the core files and libraries into a pax file. An example of snapcore is as follows:
Listing 17. Sample snapcore
snapcore /home/db2inst1/sqllib/db2dump/c123456/core /home/db2inst1/sqllib/adm/db2sysc
This creates a file with a .pax extension in /tmp/snapcore by default. The core file is useless without the executable that produced it. In this case that executable is db2sysc, not db2logmgr: db2logmgr was the process seen to be growing, but it is a process name rather than an executable. DB2 support is then able to interrogate the core to track the DB2 malloc() allocations against free() calls.
Recover from hangs
14.What is a hang
A hang occurs when a process has not moved forward or changed after a period of time. This can happen if a thread or process reaches a point in its execution where it can go no further and is waiting for a response. It also occurs when the process is in a very tight loop and never completes the function.
The first step is to identify if what you are experiencing is a hang or a severe degradation. Then you need to understand what is affected, or the scope. Some simple questions can help a lot:
• Why do you think it has hung?
• Are all DB2 commands hanging?
• How long has the command been running for?
• How long does it normally run for?
Then, to assess the scope:
• Are OS commands hanging too? If the answer to this is yes, then you need to get assistance from the AIX support team.
• Are db2 connect statements affected?
• Can SQL be issued over existing connections?
• If in a DPF environment, can you issue commands against other partitions?
• Can you issue commands against other databases?
Recovery
Remember, please collect the stacks before you recover. Once you have the stacks, the only choice you have is to issue db2_kill. Then check for any processes and IPC resources (shared memory, message queues and semaphores) left lying around after the kill. You may have to remove any you find manually. You could also try ipclean to remove these resources. If the IPC resources are not cleared out by ipclean or ipcrm, and the processes are not removed by kill -9, then the process is most likely hung in the kernel and you need to call AIX support.
Once it has come down, restart with db2start and then do a restart db command.
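Checking for leftover IPC resources can be partly scripted. The following is a minimal sketch that reads ipcs output and prints, without running them, the ipcrm commands for one owner. It assumes the ipcs column layout T ID KEY MODE OWNER GROUP, and the function name ipcrm_plan is illustrative. Review the printed commands before executing any of them.

```shell
# Read "ipcs" output on stdin and print one ipcrm command for each
# message queue (q), shared memory segment (m) or semaphore set (s)
# owned by the given user. Nothing is removed; the commands are only printed.
# Usage: ipcs | ipcrm_plan OWNER
ipcrm_plan() {
    awk -v owner="$1" '
        ($1 == "q" || $1 == "m" || $1 == "s") && $5 == owner {
            printf "ipcrm -%s %s\n", $1, $2
        }
    '
}
```

For example, ipcs | ipcrm_plan db2inst1 lists candidate commands for the instance owner used in the earlier listings.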
15.What to collect
The single most important piece of information to collect is a stack trace of the process that is believed to be hung. IBM DB2 support cannot debug a hang without this, and the stack trace must be collected prior to recovering DB2. If this is not done, you may have another outage in the future.
There will be pressure to restart DB2, but you must resist. The system must be in a hung state in order to diagnose the root cause of the problem and do the necessary debugging. A restart clears the situation and you have lost the window of opportunity to make the necessary changes. More seriously, you cannot provide any confidence that it won't recur. Thus, you need to resist the pressure to restart DB2 until you have collected all the diagnostics.
The following lists compare poor, better, and good problem determination (PD) and data capture. Note that the best PD and data capture requires the fewest steps and has a better chance of success in determining the root cause.
Poor PD and data capture:
• Occurrence
• Detection
• Recovery
• FFDC on (requires restart)
• Restart (outage #2): schedule an outage and hope the problem does not recur before then
• Occurrence (outage #3)
• Detection
• Data Collection
• Recovery
• Diagnosis (clock ticking)
Better PD and data capture:
• Occurrence (outage #1)
• Detection
• Recovery
• FFDC on
• Occurrence (outage #2)
• Detection
• Data Collection
• Recovery
• Diagnosis (clock ticking)
Good PD and data capture:
• Occurrence (outage #1)
• Detection
• Data Collection
• Recovery
• Diagnosis (clock ticking)
Stack traces
A stack trace is a snapshot of the function calls at a particular point in time, so multiple stack traces taken a few minutes apart provide a sense of motion. There are a variety of ways to collect stack traces; the following are, in my opinion, the most reliable:
procstack
This is an AIX utility that just dumps the stack to a file. In this instance, I am appending to the file because it is run again later and I do not want to overwrite it.
kill -36
This command does not kill the process; it sends a signal that makes the process dump its stack. This creates a fully formatted trap file in the DIAGPATH area of DB2. Because it gives more information than procstack, and because of the way it works internally, it is generally more expensive, particularly if there are hundreds of processes, which is often the case. The main focus of this article is AIX operating system tools for debugging DB2, but no discussion of hang problem determination is complete without mentioning db2pd. The following invocations can be used to generate stack traces:
db2pd -stacks (this generates stack dumps for all PIDs)
db2pd -stack <pid> (this generates a stack dump for a single PID)
The trap file is created in the DIAGPATH area. Listing 18 shows an example of its usage:
Listing 18. db2pd -stacks usage
1. -stacks
$ db2pd -stacks
Attempting to dump all stack traces for instance.
See current DIAGPATH for trapfiles.
2. -stack
$ db2pd -stack 1454326
Attempting to dump stack trace for pid 1454326.
See current DIAGPATH for trapfile.
DB2 support will ask you to tar and compress the DIAGPATH area. Most commonly they will ask you to run a db2support command, which does it for you, provided the correct flags are used. However, if you use the OS method of procstack, you have to submit the output files yourself.
Truss
The truss command can be used, but it is not as effective as a stack dump and is only likely to reveal anything if the process is looping and the problem can be reproduced. If the process is hung, only a stack dump can reveal how it got there.
ps
It is also a good idea to collect ps listings for all partitions, if applicable, before and after the stack dumps. If you collect the data manually, the pseudo-code looks like this:
Listing 19. procstack
procstack <pid or pids> >> procstack.out
ps -eafl >> pseafl.out
ps aux >> psaux.out
sleep 120
Repeat for at least 3 iterations.
Or:
kill -36 <pid or pids>
ps -eafl >> pseafl.out
ps aux >> psaux.out
sleep 120
Repeat for at least 3 iterations.
NB: IBM DB2 support can provide a data collect script which automates this process.
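The manual loop in Listing 19 can be sketched as a small shell function. This is only an illustration under stated assumptions: the function name stack_collection_plan is hypothetical, the output file names are the ones from Listing 19, and the function prints the commands rather than running them so that the plan can be reviewed first. It is not the data collection script that IBM DB2 support provides.

```shell
# Print (do not run) the procstack data-collection commands from Listing 19.
# Usage: stack_collection_plan PID [ITERATIONS] [SLEEP_SECONDS]
stack_collection_plan() {
    plan_pid=$1
    plan_iterations=${2:-3}     # the text asks for at least 3 iterations
    plan_pause=${3:-120}        # seconds between iterations
    plan_i=1
    while [ "$plan_i" -le "$plan_iterations" ]; do
        echo "procstack $plan_pid >> procstack.out"
        echo "ps -eafl >> pseafl.out"
        echo "ps aux >> psaux.out"
        # No need to wait after the final iteration.
        if [ "$plan_i" -lt "$plan_iterations" ]; then
            echo "sleep $plan_pause"
        fi
        plan_i=$((plan_i + 1))
    done
}
```

For example, stack_collection_plan 1454326 prints three rounds of commands for the PID used in Listing 18. Once reviewed, the output can be piped to sh on the affected host.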
Investigate unresponsive applications
Sometimes an application is merely unresponsive, and you have to figure out why it is unresponsive and how to get it to respond. If you issue a force application and it does not respond, you may be left wondering what you can do. First of all, it is important to know that force makes no guarantee that the application will actually be forced. It is simply a wrapper around an OS kill command.
Without going into the architectural details of DB2, there are some situations which are dangerous to force. As such, the db2agent sets its priority level to be higher than that of the force. Under these circumstances, force does not work, and this is by design.
The bottom line is, not every unresponsive application is caused by a bug. It is possible that the application is just doing something important and not responding to any additional commands until it completes its current task.
Recovery
Recovery almost certainly requires a db2stop and db2start, as DB2 does not take kindly to key engine processes being killed. It tends to panic and bring the instance down. I would assess the impact the rogue application is having and, if possible, leave it in situ until you can recycle. It may be holding locks that are contending with other users, for example, and this is adversely affecting the application, in which case you may have to take an outage to remove it.
What to collect
The debugging of an unresponsive application is treated in the same way as a hang, but clearly the scope is narrower. You need to collect the following elements to send to IBM Technical Support:
- Iterative stack traces of the db2agent or DB2 process that is unresponsive.
- ps listings and other items, like: db2level, dbm cfg, db cfg, db2diag.log and possibly an application snapshot.
Conclusion
Problem determination in DB2 is made simpler because of the tools and utilities available in AIX. Often it is necessary to use both AIX and DB2 tools and commands to figure out what the problem is. This article discusses some of the problems associated with troubleshooting in DB2 and has hopefully given you the tools you need to fix your database.
1.ODM Delete Command easy Step to remove a Disk
odmdelete -o CuDv -q name=hdisk1
2.Check the status of a mksysb tape (Guessing tape drive is rmt0)
chdev -l rmt0 -a block_size=0
mt -t /dev/rmt0.1 fsf 3
lsmksysb -c -f /dev/rmt0.1
or
restore -Tvf /dev/rmt0.1 -s4
3.How to remove vpath
rmdev -Rdl dpo
4.NIM showlog command example
nim -o showlog -a full_log=yes -a log_type=nimerr 530TL4spot
5.Command to boot from network (provided maint boot enabled in the boot server)
bootlist -m normal ent0 speed=auto duplex=auto gateway=X.X.X.X bserver=X.X.X.X client=X.X.X.X
( replace x with the real IPs and speed/duplex according to your network speed settings)
6.Remove a mksysb image from NIM Server
nim -o remove -a rm_image=yes mksysbname
7.Create a image.data from mksysb image
restore -xvqf /images/mksysb.image ./image.data
8.List all ODM Definitions
odmget CuAt - to see all the attributes
odmget CuDv - to see all the devices
9.To remove a mirror copy from a LV
/usr/sbin/rmlvcopy fslv01 1 hdisk4 hdisk5
10.Creating a spot from mksysb
nim -o define -t spot -a source=mksysb1 -a server=master -a location=/export/spot spot1
11.Restore a file from mksysb image
restore -xvqf ./mksysb.image ./etc/passwd
12.Create an lpp source from existing directory
nim -o define -t lpp_source -a server=master -a location=/export/lpp_source/530TL5lpp 530TL5lpp
13.Create a spot from existing lpp source
nim -o define -t spot -a server=master -a location=/export/spot/530TL5spot -a source=530TL5lpp 530TL5spot
14.How to update a lpp source from a downloaded file sets
gencopy -X -b "-qv" -d /TMP_FOR_UPDATE_CD -t /export/lpp_source/530TL6lpp/ -f ALL 2>&1
15.How to find out whether a tape is a mksysb tape or not
Run these commands to see the list of files. If nothing is listed, the tape is NOT a mksysb tape.
chdev -l rmt0 -a block_size=0
mt -t /dev/rmt0.1 fsf 3
lsmksysb -c -f /dev/rmt0.1
or
restore -Tvf /dev/rmt0.1 -s4
16.To define a mksysb resource custimgname in NIM
nim -o define -t mksysb -a server=master -a location=/images/custimg.img custimgname
17.How to find out the Physical Location of a disk
lsdev -Cc disk -l hdisk0 -F "name location"
18.Install all software from CD
/usr/sbin/installp -aX -Y -d/dev/cd0 all
19.Install Atape software from utility directory
/usr/sbin/installp -aX -Y -d/utility Atape*
20.To display BOS installation status information while the installation is progressing, run the following command on the master:
lsnim -a info -a Cstate ClientName
or
lsnim -l ClientName
21.To perform a base system installation on the machine venus from the NIM server, run the following command. (If you don't want any bosinst_data, script, fb_script or image_data resources, just leave them out of the command line.)
nim -o bos_inst -a source=rte -a spot=530ML7SP3spot -a lpp_source=530ML7SP3lpp -a bosinst_data=No_Prompt -a script=FTPSCR -a fb_script=Install_Drivers \
-a accept_licenses=yes -a preserve_res=yes -a no_client_boot=yes -a set_bootlist=no -a force_push=no venus
Or with fewer options
nim -o bos_inst -a source=rte -a spot=530ML7SP3spot -a lpp_source=530ML7SP3lpp -a bosinst_data=No_Prompt -a script=FTPSCR -a fb_script=Install_Drivers \
-a accept_licenses=yes -a no_client_boot=yes -a force_push=no venus
Now boot the client machine from the network
22.To resync a logical volume in AIX. Here is an example
Note down the LV IDENTIFIER
root@zeus lslv hd6
LOGICAL VOLUME: hd6 VOLUME GROUP: rootvg
LV IDENTIFIER: 00c8411e00004c000000011731887e00.2 PERMISSION: read/write
VG STATE: active/complete LV STATE: opened/stale
TYPE: paging WRITE VERIFY: off
MAX LPs: 512 PP SIZE: 128 megabyte(s)
COPIES: 2 SCHED POLICY: parallel
LPs: 2 PPs: 4
STALE PPs: 2 BB POLICY: non-relocatable
INTER-POLICY: minimum RELOCATABLE: yes
INTRA-POLICY: middle UPPER BOUND: 32
MOUNT POINT: N/A LABEL: None
MIRROR WRITE CONSISTENCY: off
EACH LP COPY ON A SEPARATE PV ?: yes
Serialize IO ?: NO
Now run this command on the STALE LV
root@zeus lresynclv -l 00c8411e00004c000000011731887e00.2
Alternatively, this script resyncs all the logical volumes of all volume groups. Modify it as per your requirements. I created it for our test environment, and so far it seems to work there. Please test it on a test box before you use it.
# List every volume group, then every LV name in each of them
lsvg|while read VG
do
lsvg -l $VG|awk '{print $1}'
done|sed -e '/LV/d' -e '/\:/d'|while read LV
do
# Translate each LV name into its LV identifier
lslv $LV|grep 'LV IDENT'|awk '{print $3}'
done|while read LVIDENT
do
lresynclv -l $LVIDENT
done
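The script above resyncs every logical volume, stale or not. As a hedged variation, the sketch below extracts the LV identifier from lslv output only when the LV STATE field reports stale, so that lresynclv can be limited to the volumes that need it. It assumes the lslv layout shown in the example above, and the function name stale_lv_ident is illustrative.

```shell
# Read the output of "lslv LVname" on stdin and print the LV identifier
# only if the "LV STATE:" field contains "stale". Prints nothing otherwise.
stale_lv_ident() {
    awk '
        /LV IDENTIFIER:/        { ident = $3 }
        /LV STATE:/ && /stale/  { stale = 1 }
        END { if (stale && ident != "") print ident }
    '
}
```

For example, id=$(lslv hd6 | stale_lv_ident) followed by [ -n "$id" ] && lresynclv -l "$id" would resync hd6 only when it is stale.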
23.To add all the filesets on /dev/cd0 to NIM lpp source 530TL5lpp
nim -o update -a packages=all -a source=/dev/cd0 530TL5lpp
24.To add all the filesets from a directory /utility/aixml to NIM lpp source 530TL5lpp
nim -o update -a packages=all -a source=/utility/aixml 530TL5lpp
25.How to create a NIM LPP Source from CD
To create from an existing directory use the full path of the existing directory ex. /export/lpp_source/510ML2lpp instead of /dev/cd0
gencopy -X -b "-qv" -d /dev/cd0 -t /export/lpp_source/510ML8lpp -f file 2>&1
26.How to create an lpp_source from an existing directory, i.e. /export/lpp_source/520TL10SP2lpp
nim -o define -t lpp_source -a server=master -a location=/export/lpp_source/520TL10SP2lpp 520TL10SP2lpp
27. NIM Network boot problem
# smitty nim_control_boot ==>
>> Limit Boot Image Creation to Defined Interfaces?
>> It was currently set to "NO"
28.Booting a Client from NIM Server in diagnostics mode from the command line
Follow this procedure for performing the diag operation from the master and client. To perform the diag operation from the client, enter:
nimclient -o diag -a spot=SPOTName
To perform the diag operation from the master, enter:
nim -o diag -a spot=SPOTName MachineObjectName
29.How to restore a Customer non bootable mksysb image to Client M/C (Try to use the next procedure in NIM Server and try NIM load instead)
Don't try restbyname in NIM Server. It will overwrite all the NIM server data with the tape Image. Now either use tctl or mt ( Guessing we are using rmt0)
If you want to restore in client M/C itself then
tctl -f /dev/rmt0.1 rewind
tctl -f /dev/rmt0.1 fsf 3
restbyname -xqf /dev/rmt0.1
30.How to create a NIM Image from Customer mksysb Tape
On the NIM server, find a filesystem with at least 5-6 GB of free space. I prefer a separate filesystem for this. Let us assume we are using the /export/mksysb directory, so cd to /export/mksysb and restore the image from tape for the server venus
cd /export/mksysb
chdev -l rmt0 -a block_size=0 (To make sure it can read any block size)
mt -t /dev/rmt0.1 fsf 3
dd if=/dev/rmt0.1 of=/export/mksysb/mksysb.venus bs=4m
(and use this image. bs=4m is used to avoid any dd buffer error.)
nim -o define -t mksysb -a server=master -a location=/export/mksysb/mksysb.venus venus_mksysb
Now initiate the MKSYSB installation for the client venus
nim -o bos_inst -a source=mksysb -a mksysb=venus_mksysb -a spot=530ML7SP3spot -a lpp_source=530ML7SP3lpp -a accept_licenses=yes -a no_client_boot=yes -a force_push=no venus
All of this information might not be necessary. In our environment we normally allocate the mksysb and the necessary lpp_source and SPOT first. In our NIM definitions the bosinst_data resource is called No_Prompt, the script is called FTPSCR and the fb_script is called Install_Drivers. These are just names, but the resources do a lot more than their names suggest.
nim -o allocate -a source=mksysb -a mksysb=mksysb.venus -a lpp_source=530TL5lpp -a spot=530TL5spot -a bosinst_data=No_Prompt -a script=FTPSCR \
-a fb_script=Install_Drivers -a accept_licenses=yes -a boot_client=no venus
31.How to display NIM Machines
lsnim -c machines
32.How to display NIM networks
lsnim -c networks
33.If an NFS mount fails with the following error message:
RPC: 1832-019 Program not registered
then run the steps below on the client. If both the server and the client are new, run them on both. Uncomment portmap in /etc/rc.tcpip if not already done, and
make sure rc.nfs is not commented out in /etc/inittab
stopsrc -g nfs
startsrc -s portmap
/etc/rc.nfs
Now it should mount.
34.Installing Aix when booting from a mksysb tape fails.
Try a clone load first. A clone load means booting from AIX CD1 and then recovering from tape. Alternatively, you can try the following procedure. You need to access the firmware command line prompt, which usually appears as an option in the SMS menus. At the firmware command line prompt, type the following two commands:
setenv real-base 1000000
reset-all
The system will then reboot, and you will be able to boot from tape, assuming that you have a valid boot image on your tape media.
35.Create a Filesystem using command line
mkvg -y testvg hdisk1
mklv -y testlv testvg 500 hdisk1 (500 is 500 LP )
chlv -t jfs2 testlv
crfs -v jfs -a nbpi=16384 -A yes -d testlv -p rw -m /custimg
crfs -v jfs2 -A yes -d testlv -p rw -m /custimg
or
Create a JFS2 file system of size 10MB on VG testvg, mounted at /fs1, and add an entry to /etc/filesystems
crfs -v jfs2 -g testvg -a size=10M -m /fs1 -A yes
36.ODM command to delete network.
odmdelete -q name=en0 -o CuAt
odmdelete -q parent=en0 -o CuDv
odmdelete -q name=en0 -o CuDv
odmdelete -q name=en0 -o CuDep
odmdelete -q dependency=en0 -o CuDep
odmdelete -q value1=en0 -o CuDvDr
odmdelete -q value3=en0 -o CuDvDr
odmdelete -q name=inet0 -o CuAt
37.Etherchannel problem after loading the server from Customer mksysb tape
You must remove the ODM entries before you configure EtherChannel.
Run this for the correct network interface, for example en0:
odmdelete -q name=en0 -o CuAt
odmdelete -q name=inet0 -o CuAt
38.How to remove a failed Disk from ODM
If you have been working with a PVID value rather than with an hdisk name,
ensure that the PVID is removed from the ODM with the following command. The
32-digit value supplied consists of the PVID plus 16 zeros. For example:
odmdelete -q value=0073659c2c6d26f10000000000000000 -o CuAt ( add 16 zeros)
To get the PVID run
lsvg -p vgname
Then run
rmlvcopy 1 0073659c2c6d26f1 (16 Digit PVID)
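The padding rule described above (the 16-digit PVID followed by 16 zeros) is easy to get wrong by hand. The following is a minimal sketch of a helper that builds the 32-digit value. The function name pvid_to_odm_value is illustrative and not an AIX command.

```shell
# Print the 32-digit ODM value for a 16-digit hexadecimal PVID
# by appending 16 zeros. Rejects input that is not 16 hex digits.
pvid_to_odm_value() {
    odm_pvid=$1
    case $odm_pvid in
        ''|*[!0-9a-fA-F]*)
            echo "invalid PVID: $odm_pvid" >&2
            return 1
            ;;
    esac
    if [ "${#odm_pvid}" -ne 16 ]; then
        echo "invalid PVID length: $odm_pvid" >&2
        return 1
    fi
    printf '%s0000000000000000\n' "$odm_pvid"
}
```

For example, odmdelete -q value=$(pvid_to_odm_value 0073659c2c6d26f1) -o CuAt produces the same command as the one shown above.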
39.Restoring tar backup with absolute pathname to different directory
A tar backup created using absolute path names can only be restored to the directory from which it was created. One way to restore it to a different directory is by using the pax command. For example, suppose you receive a tar tape created using absolute path names.
tar -cvf /dev/rmt0 /work/*
but want to restore it to the /test directory. The pax command would be:
pax -rf /dev/rmt0 -s/work/test/p
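The -s/work/test/p option does the directory change: using / as the delimiter, it replaces the first occurrence of "work" in each path name with "test", and the trailing p flag reports each successful substitution.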
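Before restoring, it can help to preview what a substitution would do to a path name. The sketch below is an assumption-laden stand-in: it uses sed, whose s command follows the same ed-style old/new syntax for a simple case like this one, so it only approximates what pax itself would do. The function name preview_rename is illustrative.

```shell
# Apply a substitution written as /old/new/ to a single path name with sed,
# so the rewritten name can be inspected before running pax.
# Usage: preview_rename /old/new/ PATHNAME
preview_rename() {
    printf '%s\n' "$2" | sed "s$1"
}
```

For example, preview_rename /work/test/ /work/data/file1 prints /test/data/file1. Only the first occurrence of "work" is replaced, as with pax when the g flag is not given.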
40.Determine the path to your system's error log file by running the following command:
/usr/lib/errdemon -l
41.To change the maximum size of the error log file enter:
/usr/lib/errdemon -s LogSize
42.To change the size of the error log device driver's internal buffer, enter:
/usr/lib/errdemon -B BufferSize
43.To list all events for which logging is currently disabled, enter:
errpt -t -F Log=0
44.To list all events for which reporting is currently disabled, enter:
errpt -t -F Report=0
45.IBM 3494 Library testing commands
mtlib -l /dev/lmcp0 -D -E
mtlib -l /dev/lmcp0 -qM
See man mtlib for more options
46.Vpaths not created for all hdisks of an AIX host or missing vpaths for some hdisks.
In some cases a customer may notice that some hdisks are not associated with any vpaths. Or a customer may not see the expected number of vpaths for the number of hdisks that they have on their AIX host.
In either case the problem could be caused by the fact that the hdisks with no vpath association are listed in a file called /etc/vpexclude. If this file exists a customer can remove the file and run cfgmgr and the hdisks will now be associated with vpaths.
The only way that the vpexclude file can be created is if a customer runs a querysn command on the AIX host or if the customer manually edits the /etc/vpexclude file to include the hdisks.
47.Resetting the NIM state from the command line
Follow this procedure for resetting the NIM state from the command line.
To return a machine to the ready state, enter:
nim -Fo reset MachineName
To deallocate resources, enter:
nim -o deallocate -a subclass=all MachineName
48.Recovering the /etc/niminfo file from the command line
nimconfig -r
49.To list all duplicate and conflicting updates in the /myimages image source directory
/usr/lib/instl/lppmgr -d /myimages -u
50.To remove all duplicate and conflicting updates in the /myimages image source directory, type:
/usr/lib/instl/lppmgr -d /myimages -u -r
51.How to change the console to tty0 if tty0 not available
smitty devices > add a tty >tty rs232 Asynchronous terminal > sa0 ( or sa1) in next screen select port to 0, baud rate to 9600 and Enable Login to enable and hit enter. Now run smitty console and change the device from /dev/lft0 to /dev/tty0
52.To attempt to boot through a gateway using Ethernet with duplex and speed set to auto, and then try other devices, enter the command below. bserver is the boot server, which may be your NIM server too. Even if you don't have a gateway you need to specify one; in that case use 0.0.0.0 as the gateway. client is the server you want to load from NIM.
bootlist -m normal ent0 speed=auto duplex=auto gateway=192.168.0.1 bserver=192.168.0.10 client=192.168.0.45 hdisk0 rmt0
53.ODMDELETE COMMAND TO DELETE NIM OBJECTS
Suppose you want to delete the entry for the TRYME mksysb, lsnim shows the name as mksysb.TRYME, and you are unable to delete it the normal way.
MAKE SURE YOU BACK UP THE NIM DATABASE BEFORE THIS. READ THE LAST LINE TOO. OTHERWISE THE NIM SERVER WON'T WORK
odmget nim_attr >/tmp/nim_attr.out
vi /tmp/nim_attr.out and look for TRYME entry
Note down the id no for Ex. id=1161733976
odmdelete -o nim_attr -q id=1161733976
Now delete it from nim_object
odmget nim_object >/tmp/nim_object.out
vi that file and note down the id for TRYME
odmdelete -o nim_object -q id=1162344443
now from websm screen or smitty nim add the routing information to NIM
MASTER object
resources -> master ->properties ->nim interface. ( Add the interface again)
54.Identifying the Origin of "core" Files
When an application core dumps, a "core" file is placed in the current directory. Core files are often a symptom of a problem that needs attention. You can determine which application caused the "core" file by going to the directory where the core file is located and running the command:
$ lquerypv -h core 6b0 64
The name of the application causing the core file is listed in the section on the right. In the sample output below, the "ftpd" application caused the core file.
000006B0 7FFFFFFF FFFFFFFF 7FFFFFFF FFFFFFFF |................|
000006C0 00000000 000007D0 7FFFFFFF FFFFFFFF |................|
000006D0 00170000 53245A2C 00000000 00000015 |....S$Z,........|
000006E0 66747064 00000000 00000000 00000000 |ftpd............|
000006F0 00000000 00000000 00000000 00000000 |................|
00000700 00000000 00000000 00000000 000000CF |................|
00000710 00000000 00000020 00000000 000000BE |....... ........|
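Picking the program name out of the right-hand column can be automated. The following is a rough heuristic sketch, not a documented AIX method: it takes the text between the | bars, splits it on the dots that stand for non-printable bytes, and keeps only pieces that look like a command name. The function name core_ascii_words is illustrative, and names containing dots would be split.

```shell
# Read "lquerypv -h" style hex dump lines on stdin and print the
# name-like words found in the ASCII column between the | bars.
core_ascii_words() {
    sed -n 's/^[^|]*|\(.*\)|[[:space:]]*$/\1/p' |
        tr -s '.' '\n' |
        grep -E '^[A-Za-z][A-Za-z0-9_-]+$'
}
```

For example, lquerypv -h core 6b0 64 | core_ascii_words would print ftpd for the sample output above.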
In addition, AIX can be configured to detect when core files are created and mail a message to root, alerting root that an application has failed. The instructions for setting this up are in a README file in the /usr/samples/findcore directory. These programs are delivered with the bos.sysmgt.serv_aid fileset.
55.Extend a filesystem in AIX command line
Suppose you want to extend /usr file system to 4GB
chfs -a size=4G /usr
or
chfs -a size=4000M /usr
or you want to add some more space like 2GB with existing size
chfs -a size=+2G /usr
You can extend the root file system same way. Suppose the new size you want is 2GB
then
chfs -a size=2G /
or
chfs -a size=2000M /
56.Sendmail Warning: .cf file is out of date: sendmail AIX5.3/8.13.4 supports version 10, .cf file is version 9
Solution : vi /etc/mail/sendmail.cf and change V9 to V10
57.How to erase complete data from a disk on aix 5.2 TL6 and 5.3TL4
diag -d hdiskX -T format
58.How to make IP changes permanent from command line
/usr/sbin/mktcpip -h'P550B_LP01' -a'30.3.0.120' -m'255.255.0.0' -i'en2' -g'30.3.0.120'
59.How to copy from one streaming tape to a another tape
tcopy /dev/rmt0 /dev/rmt1
60.How to check the integrity of a tape
tapechk
61.How to display all the VLAN Adapter
lsdev -Cc adapter -t eth -s vlan
62.How to use BSD style network setting in AIX
smit configtcp fast path and then select BSD Style rc Configuration.
and configure the /etc/rc.bsdnet file using a standard text editor.
63.How to check the last fsck log of /utility filesystem
/sbin/helpers/jfs2/fscklog /utility
64.How to check the inode status of a file or inode or to check last accessed time etc
/sbin/helpers/jfs2/istat /etc/passwd
or
/sbin/helpers/jfs2/istat 40 /dev/hd4 ( to check inode 40 of /dev/hd4)
65.How to cleanup deleted ODM spaces
/usr/samples/odm/odmclean -d CuDvDr
66.How to find which fileset contains a particular binary for example ls
lslpp -w /usr/bin/ls
67.To display if the hardware is 32-bit or 64-bit, type:
bootinfo -y
68.How to change AIX OS from 32 bit kernel to 64 Bit kernel
ln -sf /usr/lib/boot/unix_64 /unix
ln -sf /usr/lib/boot/unix_64 /usr/lib/boot/unix
bosboot -ad /dev/ipldevice
shutdown -r
69.How to know if the kernel is 32-bit enabled or 64-bit enabled ?
bootinfo -K
70.How to lock and unlock a user
To unlock
chuser account_locked=false user
or
chsec -f /etc/security/user -a account_locked=false -s user
To lock
chuser account_locked=true user
or
chsec -f /etc/security/user -a account_locked=true -s user
71.How to define whether the user name should be echoed on a port
Edit the default stanza in /etc/security/login.cfg with vi and set usernameecho = false
or
chsec -f /etc/security/login.cfg -s default -a usernameecho=false
72.How to change the password prompt for example
chsec -f /etc/security/login.cfg -s default -a pwdprompt="Enter your Password now: "
73.How to change login prompt from telnet session like it will display the words in quote
chsec -f /etc/security/login.cfg -s default -a herald="Enter your user ID now: "
74.How to suppress the login messages
touch .hushlogin
75.How to save current network parameter options for next boot
/usr/sbin/tunsave -a -F nextboot -t no
76.How to reset a user "asis"s failed login count
chsec -f /etc/security/lastlog -a "unsuccessful_login_count=0" -s 'asis'
77.How to restore a file from a savevg backup
/usr/bin/restorevgfiles -s -r -f'/dev/rmt0' -b'4096' -a'' /etc/passwd
78.How to preview information about a savevg backup with block size 4MB
listvgbackup -l -f'/dev/rmt0' -b'4096' -a''
79.What is the command to create VG on VPATH device
mkvg4vp
80.What is the command to add a Datapath PV to a vg
extendvg4vp
81.How to identify a PCI Slot at U1.5-P2-I8
drslot -c pci -i -s 'U1.5-P2-I8'
82.How to display all graphics adapters in a machine
lsdisp
83.How to display all Read Write Optical Device List ( Optical Jukebox)
lsdev -Cc rwoptical
84.How to add path to available Data Path Devices
/usr/sbin/addpaths
85.How to define and configure all Data path Devices
/usr/lib/methods/cfcallvpath
86.How to display all the vpath devices
lsdev -Cc disk -s dpo -t vpath
87.How to display Data Path Device Configuration
lsvpcfg
88.How to configure a defined tty
mkdev -l tty0
89.How to display the PMTU table
pmtu display
or
netstat -in
90.How to display all locked users (including system users)
usrck -l ALL (lowercase L)
91.How to generate hardware and software inventory of a server
/usr/sbin/geninv -c
or
/usr/sbin/geninv -l
92.How to display and change setting of the core files
lscore - to display settings
chcore - to change settings
93.How to search for and correct physical partitions that are stale or unable to perform I/O operations on rootvg. (See the manual for more options for this command)
mirscan -v rootvg
94.How to determine the status of your system battery
diag -B -c
95.How to run diagnostics on all SCSI devices without user action
diag -S 5 -c
96.How to determine if the 64-bit kernel extension is loaded ?
genkex |grep 64
97.Restore a Backup by Name
To restore a remote backup archive by name, use the following command:
rsh remotehost "dd if=/dev/rmt0 bs=blocksize" | restore -xvqdf- pathname
98.Restore a Backup by inode
To restore a remote backup archive by inode, use the following command:
rsh remotehost "dd if=/dev/rmt0 bs=blocksize" | restore -xvqf- pathname
99.Restore a Remote cpio Archive
To restore a remote archive created with the cpio command, use the following command:
rsh remotehost "dd if=/dev/rmt0 ibs=blocksize obs=5120" | cpio -icvdumB
100.Restore a tar Archive
To restore a remote tar archive, use the following command:
rsh remotehost "dd if=/dev/rmt0 bs=blocksize" | tar -xvpf- pathname
101.Restore a Remote Dump
To restore a remote dump of the /myfs file system, use the following command:
cd /myfs
rrestore -rvf remotehost:/dev/rmt0
102.Backup by Name
To remotely create a backup archive by name, use the following command:
find pathname -print | backup -ivqf- | rsh remotehost "dd of=/dev/rmt0 bs=blocksize conv=sync"
103.To remotely create a backup archive by inode, first unmount your file system then use the backup command. For example:
umount /myfs
backup -0 -uf- /myfs | rsh remotehost "dd of=/dev/rmt0 bs=blocksize conv=sync"
104.To create and copy an archive to the remote tape device, use the following command:
find pathname -print | cpio -ovcB | rsh remotehost "dd ibs=5120 obs=blocksize of=/dev/rmt0"
105.Create a tar Archive remotely :
tar -cvdf - pathname | rsh remotehost "dd of=/dev/rmt0 bs=blocksize conv=sync"
106.Create a Remote Dump remotely. To create a remote dump of the /myfs file system, use the following command:
rdump -u -0 -f remotehost:/dev/rmt0 /myfs
107.How to compare two directories
dircmp /dir1 /dir2
108.How to identify if a file is sparsely-allocated, for ex. /etc/passwd.
fileplace -v /etc/passwd
109.How to display the placement of file blocks within logical or physical volumes
fileplace -v /usr/bin/ls
fileplace -p /usr/bin/ls ( Will display the PV it resides in)
110.How to verify the list of bootable PVs :
ipl_varyon -i
111.How to display the filesystems in a volume group
lsvgfs rootvg
112.How to display the jfs/jfs2 file systems, run
lsjfs
or
lsjfs2
113.How to clean up a failed software installation
installp -C
114.How to unlock a rootvg
putlvodm -K `getlvodm -v rootvg`
115.How to run 64BIT application on 32 bit kernel
Smitty -> System Environments ->Enable 64bit Application environment
or
/etc/methods/cfg64
and run the following command
mkitab "load64bit:2:wait:/etc/methods/cfg64 >/dev/console 2>&1 # Enable 64-bit execs"
116.How to make AIX reply to broadcast pings
no -o bcastping=1
How to print the status of the VGDA of rootvg on hdisk0
lqueryvg -g `getlvodm -v rootvg` -At -p hdisk0
117.How to change a user's attribute, such as the minimum password length
chsec -f /etc/security/user -s sid -a minlen=8
or
chuser minlen=8 sid
118.How to determine the tape block size
Use the dd command to read a single block from the device and find out what block size is used for the archive:
dd if=/dev/rmt0 bs=128k count=1 | wc -c
This will return to you the size in bytes of the block being read. Assuming that your backup was made with the
same physical block size, you can change your device to use this block size.
or
Use the tcopy command as follows to find out the block size:
# tcopy /dev/rmt0
tcopy : Tape File: 1; Records: 1 to 7179 ; size:512
tcopy : Tape File: 1; End of file after :7179 records; 3675648 bytes
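The dd approach from item 118 can be wrapped in a small function. This is only a sketch: the name tape_block_size is hypothetical, and it assumes the device path is passed as the first argument and that no block is larger than 128 KB.

```shell
# Hypothetical helper: read one block (up to 128 KB) from a device or file
# and print how many bytes came back.
tape_block_size() {
    dd if="$1" bs=128k count=1 2>/dev/null | wc -c | tr -d ' '
}

# Usage on a tape drive (commented out because it needs real hardware):
# tape_block_size /dev/rmt0
```

The tr call only strips the padding that some wc implementations put in front of the number.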
119.How to mirror a terminal
portmir -t pts/0 ( To start)
portmir -o (To stop)
120.How to restart inetd
refresh -s inetd
Q.How to identify the current run level at the command line:
# cat /etc/.init.state
2
or
who -r
121.How to display the names of the files added to the system during installation of a specified fileset, for example openssh
lslpp -f openssh.base.server
122.How to list all the software on a CD-ROM (to list a directory instead, use its path)
installp -L -d /dev/cd0
123.How to resize the VG size after increasing the lun sizes on Fast-T
chvg -g vgname
How to check the LVCB data
getlvcb -AT
How to find the latest service pack inside a SPOT
nim -o fix_query 530TL8spot |grep SP
124.How to configure STK L700 Library with AIX for Veritas Netbackup 6.x
You need to know two things first
1. Which fcs card you zoned the Fiber Robotic device
For Example fcs0 or fcs1
2. FCID of the robot, which you will find on the fibre switch in the zone, or by running fcstat. It will look like 0x242DB1
Now you need to run
1) /usr/openv/volmgr/bin/driver/install_ovpass
2) mkdev -c media_changer -t ovpass -s fcp -p fscsi0 -w 0x0242DB1,0
(fscsi0 if connected to fcs0, fscsi1 if fcs1; FCID from the fibre switch, with ,0 added after it)
3) /usr/openv/volmgr/bin/scsi_command -d /dev/ovpass0 -inquiry (will show the robot)
4) /usr/openv/volmgr/scan will give you details of the robot if added correctly
Then run the NetBackup Admin GUI
/usr/openv/netbackup/bin/jnbSA&
And discover everything from the main menu wizard. Don't go to device robot. Most of the time Veritas discovers devices, including robots, correctly
125. How to find the devices in pre defined subclass
lsdev -P -H
then run
lsdev -Cc disk -Fname -sscsi - for scsi disks
lsdev -Cc cdrom -Fname -sscsi - for scsi cdrom
lsdev -Cc disk -Fname -sfcp - for fiber disks
lsdev -Cc tape -Fname -sfcp - for fiber tapes
How to find the system id number of AIX Server
lsattr -El sys0 -a systemid
or
uname -u
126 . How to check and repair two file systems simultaneously on different drives
(from dfsck man page from AIX Server)
dfsck -p /dev/hd1 - -p /dev/hd7
How to fix SAN disk issues after recovering a mksysb to different hardware connected to different SAN disks.
Scenario :
In a recent disaster recovery scenario I had to recover two LPARs to different hardware. The mksysb images were created on two LPARs connected to EMC storage, and I was recovering them to two different machines connected to an IBM Shark. Both run AIX 5.3 TL10. After the recovery, one LPAR could see the Shark SAN disks, but they showed as Defined. On the other LPAR no SAN disks showed up at all.
Solution I used : First I ran lslpp -l |egrep 'emc|ibm2105|sdd' and found that the mksysb contained the CLARiiON drivers, PowerPath, ibm2105 and so on. As we were not using CLARiiON or EMC disks, I was free to remove those packages. So I ran installp -u EMC* to remove all the EMC software, and then uninstalled the ibm2105 packages the same way. The second LPAR then showed the Shark disks automatically, but still as Defined. Next I ran
rmdev -rdl fscsi0 and rmdev -rdl fscsi1 ( as SAN disks are connected to fcs0 & fcs1)
After that I ran cfgmgr -vl fcs0 and cfgmgr -vl fcs1, and all the disks came up as MPIO devices in the Available state. If I wanted the vpath software, I would download the latest SDD drivers and ibm2105.rte from the IBM website and install them.
If you want to learn how to install AIX 5L, here is the link from IBM. I think this is one of the best documents, covering almost everything about AIX installation.
http://www-128.ibm.com/developerworks/aix/library/au-install-aix.html
AIX tips
AIX - Tips n Tricks - Part I
1. To confirm which network adapter is plugged into the switch in IBM AIX
Try this next time to confirm what adapter is actually plugged in to a switch:
This will not only tell you where you have a connection but it will also tell you what speed the port on the switch is.
# netstat -v | grep -E "ETHER|Media"
ETHERNET STATISTICS (ent0) :
Media Speed Selected: 100 Mbps Full Duplex
Media Speed Running: 100 Mbps Full Duplex
ETHERNET STATISTICS (ent1) :
Media Speed Selected: 1000 Mbps Full Duplex
Media Speed Running: 1000 Mbps Full Duplex
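To get one line per adapter out of this, the output can be reduced with a little awk. The sketch below assumes the output looks like the sample above; the function name media_speed_summary is hypothetical.

```shell
# Hypothetical helper: reduce "netstat -v" output to one line per adapter.
media_speed_summary() {
    awk '
        /ETHERNET STATISTICS/ {
            # The adapter name is the text between the parentheses.
            name = $0
            sub(/^.*\(/, "", name)
            sub(/\).*$/, "", name)
        }
        /Media Speed Running:/ {
            speed = $0
            sub(/^.*Media Speed Running:[ \t]*/, "", speed)
            print name ": " speed
        }
    '
}

# On a live system: netstat -v | media_speed_summary
# Demonstration using the sample output shown above:
media_speed_summary <<'EOF'
ETHERNET STATISTICS (ent0) :
Media Speed Selected: 100 Mbps Full Duplex
Media Speed Running: 100 Mbps Full Duplex
ETHERNET STATISTICS (ent1) :
Media Speed Selected: 1000 Mbps Full Duplex
Media Speed Running: 1000 Mbps Full Duplex
EOF
```

An adapter whose running speed differs from the selected speed is worth a closer look at the switch port settings.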
----------------------------------------------------------------------------------------------------------------
2. How to determine if IBM AIX 64 bit kernel (software) is installed on your IBM AIX server?
# lslpp -l bos.64bit
bos.64bit 4.3.3.76 COMMITTED Base Operating System 64 bit
----------------------------------------------------------------------------------------------------------------
3. How do you create a snapshot of your IBM AIX server to send to IBM for tech support ?
This is required whenever you face issues with your server and you seek help from IBM to sort them out.
# snap -gc
This will create a file called snap.pax.Z in the directory /tmp/ibmsupt. Send this file to IBM so that they will get full configuration of your server.
----------------------------------------------------------------------------------------------------------------
4. How to enable extended history in AIX 5.3 ?
In AIX 5.3, you have the capability to have a time stamped history. To enable it, just set the following variable:
EXTENDED_HISTORY=ON
Example:
export EXTENDED_HISTORY=ON
If required add this line to your .profile.
----------------------------------------------------------------------------------------------------------------
5. How to find the microcode level of tape drives ?
To find microcode level (firmware) of tape drives in IBM AIX:
# tapeutil -f /dev/rmt1 vpd
----------------------------------------------------------------------------------------------------------------
6. pgrep and pkill - how to terminate processes ?
You can use the pgrep and pkill commands to identify and stop command processes that you no longer want to run. These commands are useful when you mistakenly start a process that takes a long time to run.
To terminate a process:
a. pgrep - to find out the PID(s) for the process(es)
b. pkill - followed by the process name (or kill followed by the PID)
The following example illustrates how to find all the processes with a specific name (xterm) and terminate the xterm process that was started last.
# pgrep xterm
17818
17828
17758
18210
# pkill -n xterm
Note: If you need to forcibly terminate a process, use the -9 option to the pkill command.
For Example
# pkill -9 -n xterm
----------------------------------------------------------------------------------------------------------------
7. How to modify Asynchronous I/O variables in AIX ?
To modify the minservers asynchronous I/O variable (MINIMUM number of servers) in IBM AIX:
# chdev -l aio0 -a minservers='1'
To modify the maxservers asynchronous I/O variable (MAXIMUM number of servers per cpu) in IBM AIX 5L:
# chdev -l aio0 -a maxservers='10'
To modify the maxservers asynchronous I/O variable (MAXIMUM number of servers) in IBM AIX v4.3:
# chdev -l aio0 -a maxservers='80'
To modify the requests asynchronous I/O variable (Maximum number of REQUESTS) in IBM AIX:
# chdev -l aio0 -a requests='4096'
Notes:
1) Values will only take effect after a reboot
2) You may use multiple -a options on the same command line
3) The maxservers variable is PER CPU for AIX 5L and TOTAL for AIX v4.3
----------------------------------------------------------------------------------------------------------------
8. How to display microcode and firmware levels of the system and adapters in IBM AIX ?
To display microcode level information for all supported devices in IBM AIX :
# lsmcode -A
sys0!system:SF240_284 (t) SF240_261 (p) SF240_284 (t)
ent0!14108902.DV0210
ent1!14108902.DV0210
ent2!14108902.DV0210
ent3!14108902.DV0210
sisscsia0!44415254.05080064
sisscsia1!44415255.050A0064
hdisk0!ST37320.4A553042.43373038
hdisk1!ST37320.4A553042.43373038
----------------------------------------------------------------------------------------------------------------
9. How to determine if simultaneous multi-threading (SMT) is enabled in AIX ?
Your system is capable of SMT if it's a POWER5-based system running AIX 5L Version 5.3.
To determine if it is enabled:
# smtctl
To enable SMT: # smtctl -m on [ -w boot | now ]
To disable SMT: # smtctl -m off [ -w boot | now ]
Note: If neither the -w boot nor the -w now option is specified, the mode change is made immediately. It persists across subsequent reboots if you run the bosboot command before the next system reboot.
----------------------------------------------------------------------------------------------------------------
10. How to find top users of memory space in IBM AIX ?
To list the top ten users of paging space in IBM AIX:
# svmon -Pgt 10
To list the top ten users of realmem in IBM AIX:
# svmon -Put 10
----------------------------------------------------------------------------------------------------------------
11. How to list the filesystems in a volume group in IBM AIX ?
# lsvgfs volume_group
----------------------------------------------------------------------------------------------------------------
12. How to query the volume group descriptor area on a drive in IBM AIX ?
To query the volume group descriptor area on the drive, so you can find out if there's a VG on the disk, even if there isn't anything imported on the drive in IBM AIX:
# lqueryvg -Atp hdisk#
----------------------------------------------------------------------------------------------------------------
13. How to set IBM AIX for full core dumps and files to unlimited
To set IBM AIX for full core dumps to unlimited:
# ulimit -c unlimited
To set IBM AIX for files to unlimited:
# ulimit -f unlimited
To view your ulimit settings:
# ulimit -a
----------------------------------------------------------------------------------------------------------------
14. How to determine what the speed and duplex is of an interface in AIX ?
# entstat -d en0 | grep "Media Speed"
------------------------------------------------------------------------------------------------
15. How to find the highest technology level and service pack installed in IBM AIX ?
Starting in 2006, in IBM AIX, a Maintenance Level is referred to as a Technology Level and is only released twice per year.
The Service Pack concept allows service-only updates (also known as PTFs) that are released between Technology Levels to be grouped together for easier identification.
# oslevel -s
Sample output for a V5.3 system with Technology Level 4 and Service Pack 2 installed would be:
5300-04-02
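In scripts it is often handy to split the level string into its parts. The sketch below assumes that oslevel -s prints a hyphen-separated string of the form release-TL-SP, such as 5300-04-02 for V5.3 with Technology Level 4 and Service Pack 2. The function name parse_oslevel is hypothetical.

```shell
# Hypothetical helper: split an "oslevel -s" string into release, TL and SP.
# Any further hyphen-separated fields are ignored.
parse_oslevel() {
    echo "$1" | awk -F '-' '{ printf "release=%s tl=%s sp=%s\n", $1, $2, $3 }'
}

# On a live system: parse_oslevel "$(oslevel -s)"
parse_oslevel 5300-04-02
# prints: release=5300 tl=04 sp=02
```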
----------------------------------------------------------------------------------------------------------------
16. How to find number of active processors in IBM AIX ?
To find the number of active processors in IBM AIX:
# bindprocessor -q
The available processors are: 0 1 2
To find the number of processors installed (but not necessarily available):
# lscfg -v | grep proc
proc0 00-00 Processor
proc2 00-02 Processor
proc4 00-04 Processor
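If only the number of active processors is needed, the bindprocessor output can be counted with awk. This is a sketch that assumes the output format shown above; the function name count_available_cpus is hypothetical.

```shell
# Hypothetical helper: count the CPU numbers listed after the colon.
count_available_cpus() {
    awk -F: '/available processors/ { print split($2, cpus, " ") }'
}

# On a live system: bindprocessor -q | count_available_cpus
echo "The available processors are: 0 1 2" | count_available_cpus
# prints: 3
```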
----------------------------------------------------------------------------------------------------------------
17. How to read the contents of /etc/security/failedlogin in IBM AIX ?
# who /etc/security/failedlogin
# /usr/sbin/acct/fwtmp < /etc/security/failedlogin
----------------------------------------------------------------------------------------------------------------
18. How to reserve or lock a terminal under AIX 5L ?
This command will lock your terminal and reserve it for later use in AIX 5L
# lock
# lock -30 (lock for 30 minutes)
Note: default lock time is 15 minutes
lock will ask you for a password twice, then lock the terminal. You can unlock it by entering the password a third time.
----------------------------------------------------------------------------------------------------------------
19. How to stop IBM AIX from forcing a user to change their password at first login ?
# pwdadm -c username
----------------------------------------------------------------------------------------------------------------
20. Redhat Package Manager for IBM AIX and Linux
To query all packages installed:
# rpm -q -a
To list files in a specific package:
# rpm -q -l package.rpm
To install a RPM package:
# rpm -i package.rpm
To delete (erase) a RPM package:
# rpm -e package.rpm
To query for RPM package owning file:
# rpm -q -f /path/to/file
To upgrade a RPM package
# rpm -U package.rpm
----------------------------------------------------------------------------------------------------------------
21. How to capture TCPIP packet information in IBM AIX ?
Here is a command to capture TCP/IP packet information between your server and another in IBM AIX:
Become root user.
Find a temporary directory to capture the data (/tmp in this example).
Run the iptrace command:
# iptrace -a -d host_destination -b /tmp/ip.out
iptrace will run in the background and results will be in /tmp/ip.out
To see the results of the trace:
# ipreport /tmp/ip.out | more
Don't forget to kill iptrace when you're done:
# ps -ef | grep iptrace | grep -v grep | awk '{system("kill " $2)}'
Some other cool options of iptrace:
-d : specify destination IP address
-s : specify origin IP address
-b : show 2-way traffic (as in "-s xxx -b" or "-d xxx -b")
-a : no ARP requests (less pollution in the trace)
To see all packets going in and out of server, unixserv, without ARP requests:
# iptrace -a -d unixserv -b /tmp/ip.out
iptrace and ipreport are in IBM AIX LPP "bos.net.tcp.server"
----------------------------------------------------------------------------------------------------------------
22. How to check if a system dump completed successfully on IBM AIX ?
To verify if a system dump completed successfully on an IBM AIX server:
# sysdumpdev -L
0453-039
Device name: /dev/hd6
Major device number: 10
Minor device number: 1
Size: 124371456 bytes
Date/Time: Sun Sep 15 12:19:02 EDT 2002
Dump status: 0 ( 0 = Ok)
dump completed successfully
0481-195 Failed to copy the dump from /dev/hd6 to /var/adm/ras
----------------------------------------------------------------------------------------------------------------
23. How to check two file systems simultaneously on different drives in IBM AIX ?
The dfsck command permits you to interact with two fsck commands at once. To aid in this, the dfsck command displays the file system name with each message.
When responding to a question from the dfsck command, prefix your response with a 1 or a 2 to indicate whether the answer refers to the first or second file system group.
# dfsck [ FlagList1 ] FileSystem1 [ FlagList2 ] FileSystem2
Example to check two filesystems:
# dfsck -p /dev/hd1 - -p /dev/hd7
Note: you can also specify the file system names found in /etc/filesystems.
Attention: Do not use the dfsck command to check the root file system.
----------------------------------------------------------------------------------------------------------------
24. How to make a disk flash in IBM AIX ?
For disk replacement, it is often useful to make the disk flash so that you know which disk to replace in IBM AIX:
# diag
Press Enter to continue
Select "Task Selection (Diagnostics, Advanced Diagnostics, Service Aids, etc.)"
Select "Hot Plug Task"
Select "SCSI and SCSI RAID Hot Plug Manager"
Select "Identify a Device Attached to a SCSI Hot Swap Enclosure Device"
Select the slot you wish the disk to flash
Replace the appropriate disk by checking which disk is flashing
----------------------------------------------------------------------------------------------------------------
25. How to determine which application created the OS core file in AIX ?
# /usr/sbin/lquerypv -h /path/to/core 6b0 64
The output of this command is neat, clean and easy to read. Here is an example:
# lquerypv -h core 6b0 64
000006B0 7FFFFFFF FFFFFFFF 7FFFFFFF FFFFFFFF |................|
000006C0 00000000 000007D0 7FFFFFFF FFFFFFFF |................|
000006D0 00120000 1312C9C0 00000000 00000017 |................|
000006E0 6E657473 63617065 5F616978 34000000 |netscape_aix4...|
000006F0 00000000 00000000 00000000 00000000 |................|
00000700 00000000 00000000 00000000 00000ADB |................|
00000710 00000000 000008BF 00000000 00000A1E |................|
The executable is located between the pipes on the right hand side of the output. In this case, the core was generated by Netscape.
----------------------------------------------------------------------------------------------------------------
26. How to find the system id number of an IBM AIX server ?
# lsattr -El sys0 -a systemid
# uname -u
# lscfg -vp | grep -p "System VPD:" | grep -i Serial
Note: commands may not work on all IBM models
----------------------------------------------------------------------------------------------------------------
27. Tips on Memory Tuning :
Do not use the vmtune command in AIX 5L. From AIX 5L, the vmo and ioo commands are introduced to tune memory and I/O. Here is how we set various memory options now:
# vmo -p -o maxfree=128
# vmo -p -o minperm%=5
# vmo -p -o maxclient%=10
# vmo -p -o maxperm%=10
# vmo -p -o maxfree=632
# vmo -p -o minfree=600
# ioo -p -o maxpgahead=32
To see the results of the changes:
# vmo -L
# ioo -L
If you have a problem with slow telnet sessions:
# chdev -l sys0 -a maxpout='33' -a minpout='24'
To set AIO (asynchronous IO) options use smitty
----------------------------------------------------------------------------------------------------------------
28. If you reinstall an IBM AIX server and the mksysb used was for a server on another vlan, you can end up with 2 default routes. Let's see how to solve this issue
In order to see the default routes stored in the ODM:
# lsattr -El inet0 | grep Route
route net,-hopcount,0,,0,172.26.247.92 Route True
route net,-hopcount,0,,0,172.26.14.1 Route True
To see the default routes you have in your routing table:
# netstat -rn | grep default
default 172.26.14.1 UGc 0 0 en0 - - =>
default 172.26.247.92 UGc 0 0 en0 - -
To remove one of the default routes, use smitty and not the route command, otherwise you will end up with 2 default routes again after a reboot.
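A quick way to spot this situation from a script is to count the default entries in the routing table. This is only a sketch: it assumes netstat -rn prints the word default in the first column, as in the output above, and the function name count_default_routes is hypothetical.

```shell
# Hypothetical helper: count routing table lines whose first field is "default".
count_default_routes() {
    awk '$1 == "default" { n++ } END { print n + 0 }'
}

# On a live system:
# if [ "$(netstat -rn | count_default_routes)" -gt 1 ]; then
#     echo "More than one default route is configured"
# fi

# Demonstration using the sample output shown above:
count_default_routes <<'EOF'
default 172.26.14.1 UGc 0 0 en0 - - =>
default 172.26.247.92 UGc 0 0 en0 - -
EOF
# prints: 2
```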
----------------------------------------------------------------------------------------------------------------
29. EFix - How to manage ?
To list efix :
# emgr -l
To install an efix package:
# emgr -e efixPackage
----------------------------------------------------------------------------------------------------------------
30. Kernel Processes in AIX: An Overview
# ps -kl
# pstat -a (as root)
There are several kprocs, and they do a number of things. Often they have a fixed priority and will run ahead of any user processes.
A couple are:
Kproc Kernel (wait) wait process
Kproc Kernel (lrud) Least Recently Used Daemon (mem mgmt)
Kproc Kernel (swapper) Memory/Process swapping ?
Kproc Kernel (kbiod) Kernel Block I/O daemon (disk I/O)?
Kproc Kernel (gil) 1032 Networking off-level stuff
Kproc Kernel (netm) Network memory allocator
Kproc Kernel (aump) Automounter
A kproc is a kernel process, started by the kernel either on behalf of another kernel process or as a result of an application initiating a system call or a call to a kernel service.
Wait - You will find that the "wait" kproc will have accumulated a lot of CPU time. This just means your system is idle a lot. When nothing else needs to run, the wait kproc is charged the time slice.
GIL - "Global ISR List"
ISR -> Interrupt Service Routines - a multithreaded kproc that runs at a fixed priority of 37. It is used to process various timers (tcp, streams, ....) and also to pass packets from the demux layer to the IP layer for non-CDLI drivers.
ps -lk
This will list out all the kprocs. The processes with a nice value of -- are running with fixed priorities. Their nice values cannot be changed. On my system that would be processes such as swapper (pid=0, pri=16) and the wait kproc (pid=514, pri=127).
Now do a pstat -a to find out what they really are.
This will show you the real name of each kproc.
The aump kernel thread is left over after you stop the automounter daemon.
It goes away after a reboot. However, there is an APAR for it: IY33240
----------------------------------------------------------------------------------------------------------------
31. How to manage network tuning parameters in AIX ?
The no command is used to configure network tuning parameters in IBM AIX. The no command sets or displays current or next boot values for network tuning parameters. This command can also make permanent changes or defer changes until the next reboot.
Syntax:
no [ -p | -r ] { -o Tunable[=NewValue] }
no [ -p | -r ] { -d Tunable }
no [ -p | -r ] { -D }
no [ -p | -r ] -a
no -h [ Tunable ]
no -L [ Tunable ]
no -x [ Tunable ]
where:
-a : displays current, reboot or permanent value for all tunable parameters
-d : resets Tunable to its default value
-D : resets all tunables to their default value
-h : displays help about Tunable parameter
-L : lists the characteristics of one or all Tunables
-o : displays the value or sets the Tunable to NewValue
-p : makes changes apply to both current and reboot values (I heard this option in many interviews)
-r : makes changes apply to reboot values (I heard this option in many interviews)
-x : lists characteristics of one or all tunables using spreadsheet format
Example to set the value of tunable tcp_sendspace to 65536 permanently:
# no -p -o tcp_sendspace=65536
Example to set the value of tunable sb_max to 2097152 at next reboot:
# no -r -o sb_max=2097152
Example to display all values of tunables:
# no -a
Example to display all values of tunables at next reboot:
# no -r -a
Example to set the value of tunable tcp_sendspace to 65536:
# no -o tcp_sendspace=65536
----------------------------------------------------------------------------------------------------------------
32. How to verify your ntp setup is working properly ?
# ntpq -c peers
remote refid st t when poll reach delay offset disp
=========================================================================
*time.domain.co doghaus.cns.uto 3 u 12 64 177 1.46 -2.515 145.19
If you have a star (*) in the first column of the name of the time server, your time is being synchronised properly.
The third column, st, is the stratum. The lower the number, the closer you are to the time source.
Stratum 16 means you are not synchronised.
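The star check can be scripted. The sketch below assumes the peer listing has the layout shown above, with the selected peer marked by a leading * on its line; the function name ntp_synced_peer is hypothetical.

```shell
# Hypothetical helper: print the peer marked with "*" and return success,
# or return failure if no peer is marked.
ntp_synced_peer() {
    awk '/^\*/ { print substr($1, 2); found = 1 } END { exit !found }'
}

# On a live system:
# ntpq -c peers | ntp_synced_peer || echo "time is not synchronised"

# Demonstration using the sample line shown above:
echo "*time.domain.co doghaus.cns.uto 3 u 12 64 177 1.46 -2.515 145.19" | ntp_synced_peer
# prints: time.domain.co
```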
----------------------------------------------------------------------------------------------------------------
33. How to synchronise your server time to a time server ?
If you don't want to run NTP (network time protocol), you can update your system time with the command:
# ntpdate timeserver.domain.com
----------------------------------------------------------------------------------------------------------------
34.How to list open files ?
To list all open files:
# lsof
To list all open files on a device:
# lsof /dev/hd4
----------------------------------------------------------------------------------------------------------------
35. How to find the world-wide name (WWN) or network address of a fibre-channel (FC) card in IBM AIX ?
First find the name of your fibre-channel cards:
# lscfg -vp | grep fcs
Then get the WWN (for fcs0 in this example):
# lscfg -vp -l fcs0 | grep "Network Address"
----------------------------------------------------------------------------------------------------------------
36. How to change a server hostname with the uname command in IBM AIX ?
To display the current hostname
# uname -n
localhost
To change the hostname
# uname -S newhostname
Again display the hostname
# uname -n
newhostname
----------------------------------------------------------------------------------------------------------------
37. How can you find out when a system was installed?
Enter the following command:
lslpp -h bos.rte
The output of this command will show the history of when the operating system was installed. Read the entry for the AIX level (ie, 4.3.3.0).
----------------------------------------------------------------------------------------------------------------
38. How to make sure all of the user definitions are correct in the user database
#usrck -n ALL
Do the same for the groups:
#grpck -n ALL
----------------------------------------------------------------------------------------------------------------
39. Reduce the Size of a File System in Your Root Volume Group ?
http://publib16.boulder.ibm.com/pseries/en_US/infocenter/howto/HT_baseadmn_rootvg_reduce.htm#baseadmn_rootvg_reduce
----------------------------------------------------------------------------------------------------------------
40.How to reset an Unknown Root Password ?
http://publib16.boulder.ibm.com/pseries/en_US/infocenter/howto/HT_baseadmn_recoverrootpswd.htm#baseadmn_recoverrootpswd
----------------------------------------------------------------------------------------------------------------
41. How to configure Domain Name Servers ?
http://publib16.boulder.ibm.com/pseries/en_US/infocenter/howto/HT_commadmn_dns.htm#commadmn_dns
-----------------------------------------------------------------------------------------------------------------
42. How to re-create corrupted boot image ?
http://publib16.boulder.ibm.com/pseries/en_US/infocenter/howto/HT_baseadmn_bad_boot_img.htm#baseadmn_bad_boot_img
----------------------------------------------------------------------------------------------------------------
43. How to configure NIM master server using EZNIM ?
http://publib16.boulder.ibm.com/pseries/en_US/infocenter/howto/HT_insgdrf_configure_eznim.htm#insgdrf_configure_eznim
----------------------------------------------------------------------------------------------------------------
44. How do I use network (For normal users)
http://publib16.boulder.ibm.com/pseries/en_US/infocenter/howto_user/Use_Network_U.htm#category_use_network
----------------------------------------------------------------------------------------------------------------
45. How do I view system and environment information?
http://publib16.boulder.ibm.com/pseries/en_US/infocenter/howto_user/AccessSys_Envir_U.htm#category_accesssys_envir
----------------------------------------------------------------------------------------------------------------
46. How do I use shell scripts ?
http://publib16.boulder.ibm.com/pseries/en_US/infocenter/howto_user/Use_Shells_U.htm#category_use_shells
----------------------------------------------------------------------------------------------------------------
47. How do I redirect standard input, output, and error?
http://publib16.boulder.ibm.com/pseries/en_US/infocenter/howto_user/RedirStand_IO_Error_U.htm#category_redirstand_io_error
------------------------------------------------------------------------------------------------
48. How do I make my system more secure?
http://publib16.boulder.ibm.com/pseries/en_US/infocenter/howto_user/Security_U.htm#category_security
----------------------------------------------------------------------------------------------------------------
49. How to configure network adapters for Redundancy ?
http://users.ca.astound.net/~baspence/AIXtip/etherchannel.htm
----------------------------------------------------------------------------------------------------------------
50. Useful link about AIX Error Log codes, LED codes, 7-Digit Error codes.
http://rainsux.dyndns.org/AIX5L-Messages-Codes.html
----------------------------------------------------------------------------------------------------------------
51. How to disable remote root login ?
When multiple users have root access to a system, a common security question is who logged in as root? One alternative is to disable remote logins for the root id (chuser rlogin=false root). This forces users to first log in with their regular user id, then "su -" to root. All "su" activity is captured in /var/adm/sulog, thus answering the question of "who logged in as root."
Comment: In general it is a good practice to disable root remote access as it provides two layers of password protection.
----------------------------------------------------------------------------------------------------------------
52. Replacing a disk drive in AIX.
http://users.ca.astound.net/~baspence/AIXtip/download/failed_disk.pdf
----------------------------------------------------------------------------------------------------------------
53. How to enable Non-root Users to Administer Passwords ?
The AIX pwdadm command can be used to offload password administration to non-root administrators. The pwdadm command allows the administrator to change another user's password, or force users to change their password at the next login. To enable a non-root administrator to use pwdadm, simply add their ID to the "security" group.
For more information: http://www.rs6000.ibm.com/doc_link/en_US/a_doc_lib/cmds/aixcmds4/pwdadm.htm
----------------------------------------------------------------------------------------------------------------
54. Fun with device locations
Here are a few commands to locate physical devices. These commands are useful in a partitioned environment where locations are virtual.
lsdev -Cc adapter -s pci - lists all adapter slots
lsdev -p adapter - lists devices owned by an adapter
lsdev -Cl adapter -F parent - lists the parent adapter for a device (like a disk drive)
lsdev -Cl adapter - virtual device location (for LPARs)
lscfg -vl adapter - actual device location
So, for example, to locate the physical adapter connected to hdisk0:
# Identify the parent adapter
lsdev -Cl hdisk0 -F parent
# Locate the parent adapter
lscfg -vl parent
----------------------------------------------------------------------------------------------------------------
55. How to automate setting passwords ?
The "chpasswd" command is easier to use than "passwd" when setting a list of user passwords. It can be used from the command line or shell script. For example, to change passwords for users listed in a file, type the following
cat mypasswords | chpasswd
Where the mypasswords file contains
user1:password1
user2:password2
......
For more information see the following URL
http://publib16.boulder.ibm.com/doc_link/en_US/a_doc_lib/cmds/aixcmds1/chpasswd.htm
----------------------------------------------------------------------------------------------------------------
56. How to list files if 'ls' is missing or corrupt ?
echo *
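The trick works because the shell itself expands the * pattern, so no external binary is needed. A minimal sketch, assuming a POSIX shell and a mktemp that supports -d; the scratch directory and file names are hypothetical:

```shell
# Scratch directory with a few sample files (hypothetical names).
dir=$(mktemp -d)
touch "$dir/alpha" "$dir/beta" "$dir/.hidden"
cd "$dir" || exit 1

# The shell expands * on its own; note that * skips names starting with a dot.
listing=$(echo *)
echo "$listing"

# One name per line, including dot files, using only shell built-ins.
for f in * .[!.]*; do
    if [ -e "$f" ]; then
        printf '%s\n' "$f"
    fi
done

cd / && rm -rf "$dir"
```

The loop form is handy when file names contain spaces, since each name is printed on its own line.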
----------------------------------------------------------------------------------------------------------------
57. How to change the timezone and language in /etc/environment ?
chtz (timezone eg GMT0BST)
chlang (language eg En_GB)
----------------------------------------------------------------------------------------------------------------
58. Find large files
How do you find really large files in a file system:
find . -size +1024 -xdev -exec ls -l {} \;
The -xdev flag is used to only search within the same file system, instead of traversing the full directory tree. The amount specified (1024) is in blocks of 512 bytes.
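The block arithmetic can be checked on a scratch directory. The following is only a sketch, assuming a find that treats a bare -size number as 512-byte blocks (as described above); the file names are hypothetical:

```shell
# Scratch directory with one small and one large file (hypothetical names).
dir=$(mktemp -d)
printf 'tiny' > "$dir/small.txt"
# 2048 blocks of 512 bytes = 1 MB, which is above the +1024 threshold.
dd if=/dev/zero of="$dir/big.dat" bs=512 count=2048 2>/dev/null

# -size +1024 matches files larger than 1024 blocks of 512 bytes (512 KB).
# -xdev keeps find from descending into other mounted file systems.
large=$(find "$dir" -xdev -type f -size +1024)
echo "$large"

rm -rf "$dir"
```

Only big.dat should be reported; raising the number (for example +2097152 for files over 1 GB) narrows the search to really large files.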
----------------------------------------------------------------------------------------------------------------
59. Monitoring a system without logging in
Let's say you have a helpdesk that must be able to run a script under user-id root to check or monitor a system:
First, create the script you want your helpdesk to run.
Modify your /etc/inetd.conf file and add:
check stream tcp wait root /usr/local/bin/script.sh
where script.sh is the script you've written.
Modify your /etc/services file and add:
check 4321/tcp
You may change the port number to anything you like, as long as it's not in use. Then run "refresh -s inetd" so that inetd rereads its configuration.
Now, you may run:
telnet [system] 4321
and your script will be run and its output displayed on your screen. If the output of the script isn't displayed on your screen for very long, just put a sleep command at the end of your script.
----------------------------------------------------------------------------------------------------------------
60. Changing maxuproc requires a reboot?
When you change MAXUPROC (the maximum number of processes allowed per user), the smitty help panel tells you that changes to this operating system parameter take effect after the next system reboot.
This help text is wrong. If MAXUPROC is increased, the change takes effect immediately. If it is decreased, the change takes effect after the next system reboot.
This help panel text from smitty will be changed in AIX 5.3 (APAR IY52397).
----------------------------------------------------------------------------------------------------------------
61. Defunct processes
Defunct processes are commonly known as "zombies". You can't "kill" a zombie as it is already dead. Zombies are created when a process (typically a child process) terminates either abnormally or normally and its spawning process (typically a parent process) does not "wait" for it (or has yet to "wait" for it) to return an exit status.
It should be noted that zombies DO NOT consume any system resources (except a process slot in the process table). They stay until the parent process waits for them or itself exits (init then reaps them); if neither happens, they remain until the server is rebooted.
Zombies commonly occur on programs that were (incompletely) ported from old BSD systems to modern SysV systems, because the semantics of signals and/or waiting is different between these two OS families.
------------------------------------------------------------------------------------------------
62. DLpar with DVD-ROM
Adding a DVD-ROM with DLPAR is very easy. Removing it, however, can be somewhat more difficult, especially when you've run cfgmgr and devices have been configured.
This is how to remove it:
# rmdev -dl cd0
(Remove all cdrom devices found with lsdev -Cc cdrom)
# rmdev -dl ide0
Then remove the devices found with
# lsdev -C | grep pci
PCI devices that are still in use cannot be removed. The one not in use is the PCI device on which the DVD-ROM drive was configured. You have to remove it before you can do a DLPAR remove operation on it.
Now do your DLPAR remove operation on the HMC.
------------------------------------------------------------------------------------------------
63. How do you send an attachment via mail from AIX ?
Uuencode is the answer:
uuencode [source-file] [filename].b64 | mail -v -s "subject" [email-address]
For example:
# uuencode /etc/motd motd.b64 | mail -v -s "Message of the day" email@hostname.com
I use the .b64 extension, which gets recognized by Winzip. When you receive the email in Outlook, it will have an attachment that can be opened with Winzip.
------------------------------------------------------------------------------------------------
64. FTP umask
A way to change the default 027 umask of ftp is to change the entry in /etc/inetd.conf for ftpd:
ftp stream tcp6 nowait root /usr/sbin/ftpd ftpd -l -u 117
This will create files with umask 117 (mode 660).
Using the -l option will make sure the FTP sessions are logged to the syslogd. If you want to see these FTP messages in the syslogd output, then you should add in /etc/syslog.conf:
daemon.info [filename]
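To see why a umask of 117 yields mode 660, the same mask can be tried in an ordinary shell. This is a minimal sketch, not the ftpd code path itself; the file name is hypothetical:

```shell
dir=$(mktemp -d)

# Run in a subshell so the umask change does not leak into the calling shell.
perms=$(
    umask 117
    : > "$dir/upload.dat"                  # new files start from mode 666
    ls -l "$dir/upload.dat" | cut -c1-10   # keep only the permission string
)
echo "$perms"

rm -rf "$dir"
```

Mode 666 with the bits in 117 masked off leaves 660, which ls shows as -rw-rw----.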
AIX - Tips n Tricks - Part II
1. How to configure the system and create a restricted shell user ?
The example below shows how to create a restricted shell user (this user can execute only the "ls" and "vi" commands).
a) Make a reduced bin directory to contain links to programs for the user or users:
# mkdir /usr/rbin
b) Link the necessary commands and programs in the reduced bin directory.
For example, give access to the ls and vi commands:
# ln -s /usr/bin/ls /usr/rbin/ls
# ln -s /usr/bin/vi /usr/rbin/vi
c) Add Rsh as a valid shell in /etc/security/login.cfg:
# vi /etc/security/login.cfg
d) Add /usr/bin/Rsh to the list of shells in the usw stanza:
usw:
shells = /bin/sh,/bin/bsh,/bin/csh,/bin/ksh,/bin/tsh,/bin/ksh93,/usr/bin/sh,/usr/bin/bsh,/usr/bin/csh,/usr/bin/ksh,/usr/bin/tsh,/usr/bin/ksh93,/usr/sbin/uucp/uucico,/usr/sbin/sliplogin,/usr/sbin/snapp,/usr/bin/Rsh
e) Add the restricted shell user:
# mkuser shell="/usr/bin/Rsh" alex
f) Assign an initial password:
# passwd alex
g) Change the ownership of the user's profile to root:
# chown root:system /home/alex/.profile
h) Change the permissions of the user's profile to 755:
# chmod 755 /home/alex/.profile
i) Edit the user's profile, setting the PATH and SHELL variables:
# vi /home/alex/.profile
Set PATH to the new bin directory and set SHELL to Rsh:
PATH=/usr/rbin; SHELL=/usr/bin/Rsh; export PATH SHELL
----------------------------------------------------------------------------------------------------------------
2. How to change the default welcome (herald) message on the login display ?
Edit the file /etc/security/login.cfg and update the herald parameter ...
default:
herald = "Unauthorized use of this system is prohibited\n\nlogin: "
sak_enable = false
logintimes =
logindisable = 0
logininterval = 0
loginreenable = 0
logindelay = 0
You can also use the command below to change the herald value:
# chsec -f /etc/security/login.cfg -s default -a herald="Unauthorized use of this system is prohibited.\n\nlogin: "
----------------------------------------------------------------------------------------------------------------
3. How to set automatic logoff (only for terminals) ?
Edit the /etc/security/.profile file to include an automatic logoff value for all users, as in the following example:
TMOUT=600 ; TIMEOUT=600 ; readonly TMOUT TIMEOUT ; export TMOUT TIMEOUT
The number 600, in this example, is in seconds, which is equal to 10 minutes. However, this method only works from the shell.
----------------------------------------------------------------------------------------------------------------
4. How to auto forward the mails ?
Create a $HOME/.forward file and add addresses or aliases.
When mail is sent to a local user, the sendmail command checks for the $HOME/.forward file.
If the file exists, the message is not sent to the user. The message is sent to the addresses or aliases in the $HOME/.forward file.
----------------------------------------------------------------------------------------------------------------
5. How to set(define) and unset a variable in a shell or shell script ?
# x=3 -> Defines a value for the variable 'x'
# echo $x -> Displays the value of the variable 'x'
3
# unset x -> Unsets the variable
# echo $x -> Again display its value
#
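Inside a script, an empty echo cannot distinguish an unset variable from one set to an empty string. A minimal sketch using the standard ${x+word} expansion; the helper variable names before and after are hypothetical:

```shell
x=3
echo "$x"

# ${x+set} expands to "set" only while x is defined, even if x is empty.
before=${x+set}

unset x
after=${x+set}

echo "before unset: '$before'  after unset: '$after'"
```

After unset, the expansion is empty, which a script can test with [ -z "$after" ].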
----------------------------------------------------------------------------------------------------------------
6. How to send file1 as a message to user alex ?
# mail alex < file1
----------------------------------------------------------------------------------------------------------------
7. How to display mail queue ?
Note: mailq is the queue where your mails are stored
# mailq (or) sendmail -bp
There is 1 request in the mail queue
---QID---- --Size-- -----Q-Time----- ----------Sender/ Recipient-----------
OAA 19258 * 29 Mon Jun 26 14:57 root
----------------------------------------------------------------------------------------------------------------
8. What is the sendmail command?
It receives formatted messages and routes them to one or more users. It can deliver messages to users on local or remote machines. It is started by the tcpip subsystem and uses /etc/mail/sendmail.cf as its config file.
Once this daemon has started, you can find its process id in
/etc/sendmail.pid.
----------------------------------------------------------------------------------------------------------------
9. How to define mail aliases for users?
a) Add the aliases to /etc/aliases.
For Example,
nobody: /dev/null
certify: user02, user5801@server3, root@server4, user5911@se
b) Rebuild the aliases database using
newaliases (or) sendmail -bi
----------------------------------------------------------------------------------------------------------------
10. If logging in with telnet takes a long time (for example, 2 minutes), what might be the issue?
There might be a problem with DNS resolution. Check /etc/resolv.conf and test the DNS connection with the nslookup command.
----------------------------------------------------------------------------------------------------------------
11. While attempting to log in, you see the message below. How do you solve this issue ?
'All available login sessions are in use.'
Check the number of AIX user licenses using "lslicense".
If required, increase the number of licenses using the "chlicense" command.
----------------------------------------------------------------------------------------------------------------
12. The Oracle DBA says that his database is not able to go beyond a certain limit. For example, the oracle userid is not able to start more than 500 processes. What's the issue?
This is because the "maxuproc" value is 500. Check the value using "lsattr -El sys0 -a maxuproc".
If required, change the value using
# chdev -l sys0 -a maxuproc=1000
Normally for Oracle production machines, you have to consult with the DBAs while installing the server and set an agreed value.
----------------------------------------------------------------------------------------------------------------
13. Errpt is not displaying any reports. The /var/adm/ras/errlog file exists and errdemon is running fine. What might be the issue?
The errlog file seems to be corrupted. Stop the errdemon (/usr/lib/errstop) and delete the file.
Then start the errdemon (/usr/lib/errdemon). While starting, the daemon creates the errlog file automatically.
----------------------------------------------------------------------------------------------------------------
14. How to list IDE controllers in your system ?
# lscfg -l ide*
DEVICE LOCATION DESCRIPTION
ide0 01-00-00 ATA/IDE Controller Device
ide1 01-00-01 ATA/IDE Controller Device
The sample output from the lscfg -l ide* command above shows:
There are 2 IDE I/O controllers configured in the server.
Controllers ide0 and ide1 are located on the system planar (notice the 1st and 2nd digits in the location code).
The planar indicator is the second digit in the location value, with a value of 1.
The 6th digit indicates the controller number.
----------------------------------------------------------------------------------------------------------------
15. After a successful login, the login command displays the message of the day, the date and time of the last successful and unsuccessful login attempts for this user, and the total number of unsuccessful login attempts for this user since the last change of authentication information (usually a password).
How do you suppress these messages?
You can suppress these messages by creating a “.hushlogin” file in your home directory.
For Example,
At the prompt in your home directory, type the following:
# touch .hushlogin
The touch command creates the empty file named .hushlogin if it does not already exist. The next time you log in, all login messages will be suppressed. You can instruct the system to retain only the message of the day, while suppressing other login messages.
----------------------------------------------------------------------------------------------------------------
16. Which files does the system read when you log in ?
First File : /etc/environment - contains variables specifying the basic environment for all processes.
Second File: /etc/profile - controls system-wide default variables
Third File : $HOME/.profile - lets you customize your individual working environment
Fourth File: $HOME/.env - lets you customize your individual working environment variables.
----------------------------------------------------------------------------------------------------------------
17. How to override variables defined in /etc/environment for a particular user?
A fourth file that the operating system uses at login time is the
$HOME/.env file, if your .profile contains the following line:
export ENV=$HOME/.env
The .env file lets you customize your individual working environment variables. The .env file contains the individual user environment variables that override the variables set in the /etc/environment file. You can customize your environment variables as desired by modifying your .env file.
----------------------------------------------------------------------------------------------------------------
18. How to change the font in AIX ?
To change the font to an italic, roman, and bold face of the same size, type the following:
# chfont -n /usr/lpp/fonts/It114.snf /usr/lpp/fonts/Bld14.snf /usr/lpp/fonts/Rom14.snf
You can also use smitty chfont.
----------------------------------------------------------------------------------------------------------------
19. How to run a process in the background ?
For example, to run script1.sh in the background, run
# script1.sh &
But the script process gets killed if you close the terminal.
So make it a practice to run it using nohup:
# nohup script1.sh &
With nohup, the process is not killed if you close the telnet session. Output from the process/script is stored in a file called nohup.out in the directory from where you started the process.
This helps if, for example, you want to start a backup using mksysb and then close your terminal or leave the office: you can safely use "nohup command &". The next morning, you can view the contents of nohup.out to check the status of the backup job.
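The nohup pattern can be tried with a short stand-in job. This is a sketch only: job.sh and job.log are hypothetical names, and the output is redirected explicitly instead of relying on the default nohup.out:

```shell
dir=$(mktemp -d)
cd "$dir" || exit 1

# Hypothetical stand-in for a long-running job such as a backup script.
cat > job.sh <<'EOF'
echo "job started"
sleep 1
echo "job finished"
EOF

# nohup makes the job ignore the hangup signal sent when the terminal closes.
# stdin, stdout and stderr are all detached from the terminal.
nohup sh ./job.sh > job.log 2>&1 < /dev/null &
job_pid=$!

# A real session could log out here; wait is used only to show the result.
wait "$job_pid"
result=$(cat job.log)
echo "$result"

cd / && rm -rf "$dir"
```

Naming the log file per job avoids several nohup runs in the same directory appending to one shared nohup.out.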
----------------------------------------------------------------------------------------------------------------
20. What is the default priority for a process?
Default priority is 0. Priority numbers are in the range of -20 to 20. A higher number means a lower priority, and a lower number means a higher priority when using resources.
To set the priority while starting a process, use the "nice" command.
If the process is already running, you can use the "renice" command to change its priority.
----------------------------------------------------------------------------------------------------------------
21. How to stop, resume and bring a process to the foreground ?
To stop (pause) a foreground process, press
Ctrl+Z.
Note: Ctrl+Z works in the Korn shell (ksh) and C shell (csh), but not in the Bourne shell (bsh).
To restart a stopped process, you must either be the user who started the process or have root user authority.
To restart a stopped process, enter
# kill -19 pid
To run it in foreground, enter
# fg pid
where pid is the process id which can be obtained from the following command
ps -ef | grep process_name | awk '{print $2}'
----------------------------------------------------------------------------------------------------------------
22. How to display a program output as well as copying to a file ?
Normally usage of output redirection suppresses the output on screen.
Ex. ls -l > file1
If we want to save the output to a file and also show it on screen, we use the tee command.
Ex: ls -l | tee -a file1
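The difference between tee and tee -a shows up when the command is run twice. A minimal sketch with a hypothetical scratch file:

```shell
dir=$(mktemp -d)
log="$dir/file1"

# First run: tee prints to the screen and creates (or overwrites) the file.
printf 'first run\n' | tee "$log"
# Second run: -a appends to the file instead of overwriting it.
printf 'second run\n' | tee -a "$log"

# Both lines are now in the file.
lines=$(wc -l < "$log" | tr -d ' ')
echo "$lines"

rm -rf "$dir"
```

Without -a, the second run would have left only one line in the file.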
----------------------------------------------------------------------------------------------------------------
23. How to capture your terminal screen to a file ?
To capture the screen of a terminal, at the prompt, type the following:
#script
The system displays information similar to the following:
Script command is started. The file is typescript.
Everything displayed on the screen is now copied to the "typescript" file.
To stop the script command, press Ctrl-D or type exit and press Enter.
The system displays information similar to the following:
^D
Script command is complete. The file is typescript.
Use the cat command to display the contents of your file.
----------------------------------------------------------------------------------------------------------------
24. What are the supported file systems in AIX ?
a) JFS (or) JFS2 - Disk based file system
b) NFS - Network based File system
c) CDRFS - CDROM based file system
d) UDFS - DVD-ROM based file system
e) RAMFS - RAM based file system used while booting the system
----------------------------------------------------------------------------------------------------------------
25. What are the different directory abbreviations?
Abbreviation Meaning
. The current working directory
.. The parent of the current working directory
~ Your home directory
$HOME Your home directory
----------------------------------------------------------------------------------------------------------------
26. What are the different directory path names ?
Absolute path name:
Traces the path from the /(root) directory. Absolute path names always
begin with the slash (/) symbol.
Ex. /home/raja/dir1
Relative path name:
Traces the path from the current directory through its parent or its
subdirectories and files. As user "raja", I can say ./dir1 since I'm already in /home/raja
----------------------------------------------------------------------------------------------------------------
27. How to move a directory ?
# mvdir book manual
This moves the book directory under the directory named manual, if the
manual directory exists. Otherwise, the book directory is renamed to manual.
----------------------------------------------------------------------------------------------------------------
28. Which RAID levels does AIX LVM support?
RAID-0 - Striping
RAID-1 - Mirroring
RAID-10 (or) RAID 0+1 - Mirroring and striping
----------------------------------------------------------------------------------------------------------------
29. How to read and remove mails from my system mailbox?
At your system command line prompt, enter the mail command:
# mail
If there is no mail in your system mailbox, the system responds with a message:
No mail for YourID
If there is mail in your mailbox, the system displays a listing of the messages in your system mailbox:
# mail
Here Type ? for help.
"/usr/mail/lance": 3 messages 3 new
>N 1 karen Tue Apr 27 16:10 12/321 "Dept Meeting"
N 2 lois Tue Apr 27 16:50 10/350 "System News"
N 3 tom Tue Apr 27 17:00 11/356 "Tools Available"
The current message is always prefixed with a greater-than symbol (>).
Each one-line entry displays the following fields:
status - Indicates the class of the message.
number - Identifies the piece of mail to the mail program.
sender - Identifies the address of the person who sent the mail.
date - Specifies the date the message was received.
size - Defines the number of lines and characters contained in the
message (this includes the header).
subject - Identifies the subject of the message, if it has one.
The status can be any of the following:
N - A new message.
P - A message that will be preserved in
----------------------------------------------------------------------------------------------------------------
30. After logging in as an application user (oradba), when I issued "crontab -l" the system threw the error below
0481-103 Cannot open a file in the /var/spool/cron/crontabs directory.
What is the solution?
Here is the solution
a) Create an empty file /var/spool/cron/crontabs/oradba
b) Change the ownership of the file to root:cron (owner root, group cron)
c) Login as oradba and issue "crontab -l" to verify the cron.
----------------------------------------------------------------------------------------------------------------
31. How to identify the program listening on a given port ?
METHOD I: # lsof -P -n -i :505 (for port 505)
METHOD II:
# netstat -Aan|grep 9404
f100060006952b98 tcp 0 0 *.9404 *.* LISTEN
f100060006a90b98 tcp 0 0 *.19404 *.* LISTEN
# rmsock f100060006952b98 tcpcb
The socket 0x6952808 is being held by proccess 753870 (java).
----------------------------------------------------------------------------------------------------------------
32. How to display non-printable characters in a text file ?
Let's create a file with non-printable characters.
# vi filename.txt
^I^I^I^I$
$
$
$
this is a test$
^I^I^I^I$
~
: set list
Now we will list the file so that non-printable chars are viewed
# cat -vet filename.txt
^I^I^I^I$
$
$
$
this is a test$
^I^I^I^I$
# od -c filename.txt
0000000 \t \t \t \t \n \n \n \n t h i s i s
0000020 a t e s t \n \t \t \t \t \n
0000034
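The same check can be reproduced without an editor by building the file with printf. A minimal sketch; sample.txt is a hypothetical file name:

```shell
dir=$(mktemp -d)
# Two-line sample: a line holding two tabs, then a line of plain text.
printf '\t\t\nthis is a test\n' > "$dir/sample.txt"

# -v shows control characters, -e marks each line end with $,
# and -t shows tabs as ^I.
visible=$(cat -vet "$dir/sample.txt")
echo "$visible"

# od -c dumps every byte, using escapes such as \t and \n.
od -c "$dir/sample.txt"

rm -rf "$dir"
```

A trailing space or tab before the $ marker is the usual sign of invisible characters that break scripts or config files.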
----------------------------------------------------------------------------------------------------------------
33. How to display specific lines in a text files ?
For illustration purposes, I'm using the cat -n filename to show the line numbers in this script.
# cat -n filename
...
8 for i in $*
9
10 do
11
12 typeset -i16 hex
13 hex=$i
14 print $i equals $hex in hexadecimal
15
16 typeset -i8 oct
17 oct=$i
18 print $i equals $oct in octal
19
20 typeset -i2 bin
21 bin=$i
22 print $i equals $bin in binary
23
24 print
25 done
...
Prints out the for loop without displaying the line numbers
# sed -n 8,25p filename | tee for_loop
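The line-range form of sed can be tried on a small generated file. A minimal sketch; the ten sample lines stand in for the script shown above:

```shell
dir=$(mktemp -d)
# Ten numbered sample lines stand in for a real file.
i=1
while [ "$i" -le 10 ]; do
    echo "line $i"
    i=$((i + 1))
done > "$dir/filename"

# -n suppresses automatic printing; 3,5p prints only lines 3 through 5.
selected=$(sed -n '3,5p' "$dir/filename")
echo "$selected"

# Quitting at the last wanted line avoids reading the rest of a large file.
same=$(sed -n '3,5p;5q' "$dir/filename")

rm -rf "$dir"
```

Both commands print the same three lines; the second form only matters for speed on very large files.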
----------------------------------------------------------------------------------------------------------------
34. How to recover the root password in AIX ?
If you have forgotten the root password, you can easily recover it, but the system requires 2 reboots.
Here is the way I follow:
Password recovery is one of the simplest troubleshooting procedures in
AIX. Once you boot from CD, you see a menu with 3 menu items.
Select the 3rd item,
i.e., "Start Maintenance Mode for System Recovery" ->
"Access a Root Volume Group" ->
"Access this volume group and start a shell".
This will open a shell prompt. Then just use the "passwd" command to
set a new password for root.
That's it. The root password has been changed.
Now you can reboot the machine from the rootvg hard disk (normally it should be hdisk0).
----------------------------------------------------------------------------------------------------------------
34. How to find out the (real) memory usage ?
# svmon -G
size inuse free pin virtual
memory 2097152 2097026 126 195637 1237158
pg space 524288 61023
work pers clnt lpage
pin 195404 233 0 0
in use 1189840 906786 400 0
The size and inuse columns of the memory and pgspace output represent real memory and paging space usage respectively.
The size is measured as the number of 4K pages.
Here in this case used memory is
= ((2097026 x 4)/1024)/1024 GB of used memory
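The page arithmetic above can be scripted. A minimal sketch that plugs in the numbers from the sample svmon -G output (4 KB pages) and uses awk for the fractional division; the variable names are hypothetical:

```shell
# Values taken from the svmon -G sample above (units: 4 KB pages).
size_pages=2097152
inuse_pages=2097026

# pages * 4 gives KB; dividing by 1024 gives MB, and again gives GB.
total_mb=$(awk -v p="$size_pages"  'BEGIN { printf "%.2f", p * 4 / 1024 }')
used_mb=$(awk  -v p="$inuse_pages" 'BEGIN { printf "%.2f", p * 4 / 1024 }')
used_gb=$(awk  -v p="$inuse_pages" 'BEGIN { printf "%.2f", p * 4 / 1024 / 1024 }')

echo "used: ${used_mb} MB of ${total_mb} MB (about ${used_gb} GB)"
```

With the sample figures, almost all of the 8 GB of real memory is in use, which matches the very small free value (126 pages).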
----------------------------------------------------------------------------------------------------------------
35. Here are some of the errors you get when paging space is low.
INIT: Paging space is low!
ksh: cannot fork no swap space
Not enough memory
Fork function failed
fork () system call failed
Unable to fork, too many processes
Fork failure - not enough memory available
Fork function not allowed. Not enough memory available.
----------------------------------------------------------------------------------------------------------------
36. How is the default paging space size determined ?
It follows the following standard
Set paging space to 2 times the amount of RAM
Paging space can use no more than 20% of total disk space in the root volume Group
Paging space can be no larger than 2 GB
1. To confirm which network adapter is plugged into the switch in IBM AIX
Try this next time to confirm what adapter is actually plugged in to a switch:
This will not only tell you where you have a connection but it will also tell you what speed the port on the switch is.
# netstat -v | grep -E "ETHER|Media"
ETHERNET STATISTICS (ent0) :
Media Speed Selected: 100 Mbps Full Duplex
Media Speed Running: 100 Mbps Full Duplex
ETHERNET STATISTICS (ent1) :
Media Speed Selected: 1000 Mbps Full Duplex
Media Speed Running: 1000 Mbps Full Duplex
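The filter relies on grep -E alternation and can be tried without a network adapter. A minimal sketch over canned text; the "Device Type" line is a hypothetical filler line, not real netstat output:

```shell
# Abridged stand-in for netstat -v output. The "Device Type" line is a
# hypothetical filler, added to show that non-matching lines are dropped.
sample='ETHERNET STATISTICS (ent0) :
Device Type: sample adapter
Media Speed Selected: 100 Mbps Full Duplex
Media Speed Running: 100 Mbps Full Duplex'

# -E enables extended regular expressions, so | means "either pattern".
matched=$(printf '%s\n' "$sample" | grep -E "ETHER|Media")
echo "$matched"

match_count=$(printf '%s\n' "$matched" | wc -l | tr -d ' ')
```

Three of the four sample lines match: the adapter header and the two Media Speed lines.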
----------------------------------------------------------------------------------------------------------------
2. How to determine if IBM AIX 64 bit kernel (software) is installed on your IBM AIX server?
# lslpp -l bos.64bit
bos.64bit 4.3.3.76 COMMITTED Base Operating System 64 bit
----------------------------------------------------------------------------------------------------------------
3. How you create a snapshot of your IBM AIX server to send to IBM for tech support ?
This is required whenever you face issues with your server and seek help from IBM to sort them out.
# snap -gc
This will create a file called snap.pax.Z in the directory /tmp/ibmsupt. Send this file to IBM so that they will get full configuration of your server.
----------------------------------------------------------------------------------------------------------------
4. How to enable extended history in AIX 5.3 ?
In AIX 5.3, you have the capability to have a time stamped history. To enable it, just set the following variable:
EXTENDED_HISTORY=ON
Example:
export EXTENDED_HISTORY=ON
If required add this line to your .profile.
----------------------------------------------------------------------------------------------------------------
5. How to find the microcode level of tape drives ?
To find microcode level (firmware) of tape drives in IBM AIX:
# tapeutil -f /dev/rmt1 vpd
----------------------------------------------------------------------------------------------------------------
6. pgrep and pkill - how to terminate processes ?
You can use the pgrep and pkill commands to identify and stop command processes that you no longer want to run. These commands are useful when you mistakenly start a process that takes a long time to run.
To terminate a process:
a. pgrep - to find out the PID(s) for the process(es)
b. pkill - followed by the process name, or kill followed by the PID(s)
The following example illustrates how to find all the processes with a specific name (xterm) and terminate the xterm process that was started last.
# pgrep xterm
17818
17828
17758
18210
# pkill -n xterm
Note: If you need to forcibly terminate a process, use the -9 option to the pkill command.
For Example
# pkill -9 -n xterm
----------------------------------------------------------------------------------------------------------------
7. How to modify Asynchronous I/O variables in AIX ?
To modify the minservers asynchronous I/O variable (MINIMUM number of servers) in IBM AIX:
# chdev -l aio0 -a minservers='1'
To modify the maxservers asynchronous I/O variable (MAXIMUM number of servers per cpu) in IBM AIX 5L:
# chdev -l aio0 -a maxservers='10'
To modify the maxservers asynchronous I/O variable (MAXIMUM number of servers) in IBM AIX v4.3:
# chdev -l aio0 -a maxservers='80'
To modify the requests asynchronous I/O variable (Maximum number of REQUESTS) in IBM AIX:
# chdev -l aio0 -a requests='4096'
Notes:
1) Values will only take effect after a reboot
2) You may use multiple -a options on the same command line
3) The maxservers variable is PER CPU for AIX 5L and TOTAL for AIX v4.3
----------------------------------------------------------------------------------------------------------------
8. How to display microcode and firmware levels of the system and adapters in IBM AIX ?
To display microcode level information for all supported devices in IBM AIX :
# lsmcode -A
sys0!system:SF240_284 (t) SF240_261 (p) SF240_284 (t)
ent0!14108902.DV0210
ent1!14108902.DV0210
ent2!14108902.DV0210
ent3!14108902.DV0210
sisscsia0!44415254.05080064
sisscsia1!44415255.050A0064
hdisk0!ST37320.4A553042.43373038
hdisk1!ST37320.4A553042.43373038
----------------------------------------------------------------------------------------------------------------
9. How to determine if simultaneous multi-threading (SMT) is enabled in AIX ?
Your system is capable of SMT if it's a POWER5-based system running AIX 5L Version 5.3.
To determine if it is enabled:
# smtctl
To enable SMT: # smtctl -m on [ -w boot | now ]
To disable SMT: # smtctl -m off [ -w boot | now ]
Note: If neither the -w boot nor the -w now option is specified, the mode change is made immediately. It persists across subsequent reboots if you run the bosboot command before the next system reboot.
----------------------------------------------------------------------------------------------------------------
10. How to find top users of memory space in IBM AIX ?
To list the top ten users of paging space in IBM AIX:
# svmon -Pgt 10
To list the top ten users of realmem in IBM AIX:
# svmon -Put 10
----------------------------------------------------------------------------------------------------------------
11. How to list the filesystems in a volume group in IBM AIX ?
# lsvgfs volume_group
----------------------------------------------------------------------------------------------------------------
12. How to query the volume group descriptor area on a drive in IBM AIX ?
To query the volume group descriptor area on the drive, so you can find out if there's a VG on the disk, even if there isn't anything imported on the drive in IBM AIX:
# lqueryvg -Atp hdisk#
----------------------------------------------------------------------------------------------------------------
13. How to set IBM AIX for full core dumps and files to unlimited
To set IBM AIX for full core dumps to unlimited:
# ulimit -c unlimited
To set IBM AIX for files to unlimited:
# ulimit -f unlimited
To view your ulimit settings:
# ulimit -a
----------------------------------------------------------------------------------------------------------------
14. How to determine what the speed and duplex is of an interface in AIX ?
# entstat -d en0 | grep "Media Speed"
------------------------------------------------------------------------------------------------
15. How to find the highest technology level and service pack installed in IBM AIX ?
Starting in 2006, a Maintenance Level in IBM AIX is referred to as a Technology Level and is only released twice per year.
The Service Pack concept allows service-only updates (also known as PTFs) that are released between Technology Levels to be grouped together for easier identification.
To display the highest Technology Level and Service Pack installed (for example on a V5.3 system with Technology Level 4 and Service Pack 2):
# oslevel -s
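In scripts it is often handy to split the level string into its parts. This is a small sketch that assumes the string has the form BASE-TL-SP (some levels append further fields, which are ignored here); the value 5300-04-02 and the function name split_oslevel are hypothetical.

```shell
# Split an "oslevel -s" style string into base level, Technology Level
# and Service Pack. The BASE-TL-SP layout is an assumption.
split_oslevel() {
    echo "$1" | awk -F- '{ printf "base=%s tl=%s sp=%s\n", $1, $2, $3 }'
}

# Hypothetical value for a V5.3 system with TL 4 and SP 2.
split_oslevel "5300-04-02"
```

On a live system: split_oslevel "$(oslevel -s)"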
----------------------------------------------------------------------------------------------------------------
16. How to find number of active processors in IBM AIX ?
To find the number of active processors in IBM AIX:
# bindprocessor -q
The available processors are: 0 1 2
To find the number of processors installed (but not necessarily available):
# lscfg -v | grep proc
proc0 00-00 Processor
proc2 00-02 Processor
proc4 00-04 Processor
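If only the number of installed processors is needed, the matching lines can be counted instead of listed. A minimal sketch, using the three sample lines above as input; the function name count_procs is an assumption.

```shell
# Count processor entries in "lscfg -v | grep proc" style output on stdin.
count_procs() {
    grep -c 'proc[0-9]'
}

# Sample lines copied from the output above.
printf '%s\n' \
    'proc0 00-00 Processor' \
    'proc2 00-02 Processor' \
    'proc4 00-04 Processor' | count_procs
```

On a live system: lscfg -v | count_procs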
----------------------------------------------------------------------------------------------------------------
17. How to read the contents of /etc/security/failedlogin in IBM AIX ?
# who /etc/security/failedlogin
# /usr/sbin/acct/fwtmp < /etc/security/failedlogin
----------------------------------------------------------------------------------------------------------------
18. How to reserve or lock a terminal under AIX 5L ?
The lock command locks your terminal and reserves it for later use in AIX 5L:
# lock
# lock -30 (lock for 30 minutes)
Note: the default lock time is 15 minutes. lock asks you for a password twice and then locks the terminal. You can unlock it by entering the password a third time.
----------------------------------------------------------------------------------------------------------------
19. How to stop IBM AIX from forcing a user to change their password at first login ?
# pwdadm -c username
----------------------------------------------------------------------------------------------------------------
20. Redhat Package Manager for IBM AIX and Linux
To query all packages installed:
# rpm -q -a
To list the files in a specific package file:
# rpm -q -l -p package.rpm
To install a RPM package:
# rpm -i package.rpm
To delete (erase) an installed RPM package (give the package name, not the file name):
# rpm -e package
To query for the RPM package owning a file:
# rpm -q -f /path/to/file
To upgrade a RPM package:
# rpm -U package.rpm
----------------------------------------------------------------------------------------------------------------
21. How to capture TCPIP packet information in IBM AIX ?
Here is how to capture TCP/IP packet information between your server and another in IBM AIX:
Become the root user.
Find a temporary directory to capture the data (/tmp in this example).
Run the iptrace command:
# iptrace -a -d host_destination -b /tmp/ip.out
iptrace will run in the background and the results will be in /tmp/ip.out
To see the results of the trace:
# ipreport /tmp/ip.out | more
Don't forget to kill iptrace when you're done:
# ps -ef | grep iptrace | grep -v grep | awk '{system("kill " $2)}'
Some other useful options of iptrace:
-d : specify destination IP address
-s : specify origin IP address
-b : show 2-way traffic (as in "-s xxx -b" or "-d xxx -b")
-a : no ARP requests (less pollution in the trace)
To see all packets going in and out of server, unixserv, without ARP requests:
# iptrace -a -d unixserv -b /tmp/ip.out
iptrace and ipreport are in IBM AIX LPP "bos.net.tcp.server"
----------------------------------------------------------------------------------------------------------------
22. How to check if a system dump completed successfully on IBM AIX ?
To verify if a system dump completed successfully on an IBM AIX server:
# sysdumpdev -L
0453-039
Device name: /dev/hd6
Major device number: 10
Minor device number: 1
Size: 124371456 bytes
Date/Time: Sun Sep 15 12:19:02 EDT 2002
Dump status: 0 ( 0 = Ok)
dump completed successfully
0481-195 Failed to copy the dump from /dev/hd6 to /var/adm/ras
----------------------------------------------------------------------------------------------------------------
23. How to check two file systems simultaneously on different drives in IBM AIX ?
The dfsck command permits you to interact with two fsck commands at once, so you can check two file systems simultaneously on different drives. To aid in this, the dfsck command displays the file system name with each message.
When responding to a question from the dfsck command, prefix your response with a 1 or a 2 to indicate whether the answer refers to the first or second file system group.
# dfsck [ FlagList1 ] FileSystem1 [ FlagList2 ] FileSystem2
Example to check two filesystems:
# dfsck -p /dev/hd1 - -p /dev/hd7
Note: you can also specify the file system names found in /etc/filesystems.
Attention: Do not use the dfsck command to check the root file system.
----------------------------------------------------------------------------------------------------------------
24. How to make a disk flash in IBM AIX ?
For disk replacement, it is often useful to make the disk flash so that you know which disk to replace in IBM AIX:
# diag
Press Enter to continue
Select "Task Selection (Diagnostics, Advanced Diagnostics, Service Aids, etc.)"
Select "Hot Plug Task"
Select "SCSI and SCSI RAID Hot Plug Manager"
Select "Identify a Device Attached to a SCSI Hot Swap Enclosure Device"
Select the slot you wish the disk to flash
Replace the appropriate disk by checking which disk is flashing
----------------------------------------------------------------------------------------------------------------
25. How to determine which application created the OS core file in AIX ?
# /usr/sbin/lquerypv -h /path/to/core 6b0 64
The output of this command is neat, clean and easy to read. Here is an example:
# lquerypv -h core 6b0 64
000006B0 7FFFFFFF FFFFFFFF 7FFFFFFF FFFFFFFF |................|
000006C0 00000000 000007D0 7FFFFFFF FFFFFFFF |................|
000006D0 00120000 1312C9C0 00000000 00000017 |................|
000006E0 6E657473 63617065 5F616978 34000000 |netscape_aix4...|
000006F0 00000000 00000000 00000000 00000000 |................|
00000700 00000000 00000000 00000000 00000ADB |................|
00000710 00000000 000008BF 00000000 00000A1E |................|
The executable name appears between the pipes on the right hand side of the output. In this case, the core was generated by Netscape.
----------------------------------------------------------------------------------------------------------------
26. How to find the system id number of an IBM AIX server ?
# lsattr -El sys0 -a systemid
# uname -u
# lscfg -vp | grep -p "System VPD:" | grep -i Serial
Note: commands may not work on all IBM models
----------------------------------------------------------------------------------------------------------------
27. Tips on Memory Tuning :
Do not use the vmtune command in AIX 5L. In AIX 5L, the vmo and ioo commands were introduced to tune memory and I/O. Here is how we set various memory options now:
# vmo -p -o maxfree=128
# vmo -p -o minperm%=5
# vmo -p -o maxclient%=10
# vmo -p -o maxperm%=10
# vmo -p -o maxfree=632
# vmo -p -o minfree=600
# ioo -p -o maxpgahead=32
To see the results of the changes:
# vmo -L
# ioo -L
If you have a problem with slow telnet sessions:
# chdev -l sys0 -a maxpout='33' -a minpout='24'
To set AIO (asynchronous I/O) options, use smitty.
----------------------------------------------------------------------------------------------------------------
28. If you reinstall an IBM AIX server and the mksysb used was for a server on another vlan, you can end up with 2 default routes. Let's see how to solve this issue.
To see the default routes stored in the ODM:
# lsattr -El inet0 | grep Route
route net,-hopcount,0,,0,172.26.247.92 Route True
route net,-hopcount,0,,0,172.26.14.1 Route True
To see the default routes you have in your routing table:
# netstat -rn | grep default
default 172.26.14.1 UGc 0 0 en0 - - =>
default 172.26.247.92 UGc 0 0 en0 - -
To remove one of the default routes, use smitty and not the route command, otherwise you will end up with 2 default routes again after a reboot.
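A quick check for this condition can be scripted. The following is a sketch only: it counts the "default" entries in netstat -rn style output and warns when there is more than one. The function name count_default_routes is an assumption, and the sample lines are the ones shown above.

```shell
# Count "default" entries in "netstat -rn" output read on stdin.
count_default_routes() {
    awk '$1 == "default" { n++ } END { print n + 0 }'
}

# Sample lines copied from the netstat output above.
routes='default 172.26.14.1 UGc 0 0 en0 - - =>
default 172.26.247.92 UGc 0 0 en0 - -'

n=$(printf '%s\n' "$routes" | count_default_routes)
if [ "$n" -gt 1 ]; then
    echo "WARNING: $n default routes found"
fi
```

On a live system: netstat -rn | count_default_routes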
----------------------------------------------------------------------------------------------------------------
29. EFix - How to manage ?
To list efix :
# emgr -l
To install an efix package:
# emgr -e efixPackage
----------------------------------------------------------------------------------------------------------------
30. Kernel Processes in AIX: An Overview
# ps -kl
# pstat -a (as root)
There are several kprocs, and they do a number of things. Often they have a fixed priority and run ahead of any user processes.
A few of them are:
Kproc Kernel (wait) Wait process
Kproc Kernel (lrud) Least Recently Used daemon (memory management)
Kproc Kernel (swapper) Memory/process swapping ?
Kproc Kernel (kbiod) Kernel block I/O daemon (disk I/O) ?
Kproc Kernel (gil) Networking off-level work
Kproc Kernel (netm) Network memory allocator
Kproc Kernel (aump) Automounter
A kproc is a kernel process, started by the kernel on behalf of either another kernel process, or as a result of an application initiating a system call or call to a kernel service.
wait - You will find that the "wait" kproc has accumulated a lot of CPU time. This just means your system is idle a lot. When nothing else needs to run, the wait kproc is charged the time slice.
gil - "Global ISR List" (ISR = Interrupt Service Routine). This multithreaded kproc runs at a fixed priority of 37. It is used to process various timers (tcp, streams, ...) and also to pass packets from the demux layer to the IP layer for non-CDLI drivers.
aump - The aump kernel thread is left over after you stop the automounter daemon. It goes away after a reboot. However, there is an APAR for it: IY33240
To list all the kprocs:
# ps -lk
The processes with a nice value of -- are running with fixed priorities. Their nice values cannot be changed. On my system that would be processes such as swapper (pid=0, pri=16) and the wait kproc (pid=514, pri=127).
To find out what the kprocs really are, run pstat -a. This will show you the real name of each kproc.
----------------------------------------------------------------------------------------------------------------
31. How to manage network tuning parameters in AIX ?
The no command is used to configure network tuning parameters in IBM AIX. The no command sets or displays current or next boot values for network tuning parameters. This command can also make permanent changes or defer changes until the next reboot.
Syntax:
no [ -p | -r ] { -o Tunable[=NewValue] }
no [ -p | -r ] { -d Tunable }
no [ -p | -r ] { -D }
no [ -p | -r ] -a
no -h [ Tunable ]
no -L [ Tunable ]
no -x [ Tunable ]
where:
-a : displays current, reboot or permanent value for all tunable parameters
-d : resets Tunable to its default value
-D : resets all tunables to their default value
-h : displays help about Tunable parameter
-L : lists the characteristics of one or all Tunables
-o : displays the value or sets the Tunable to NewValue
-p : makes changes apply to both current and reboot values (I have heard this option asked about in many interviews)
-r : makes changes apply to reboot values only (I have heard this option asked about in many interviews)
-x : lists characteristics of one or all tunables using spreadsheet format
Example to set the value of tunable tcp_sendspace to 65536 permanently:
# no -p -o tcp_sendspace=65536
Example to set the value of tunable sb_max to 2097152 at next reboot:
# no -r -o sb_max=2097152
Example to display all values of tunables:
# no -a
Example to display all values of tunables at next reboot:
# no -r -a
Example to set the current value of tunable tcp_sendspace to 65536 (not kept across a reboot):
# no -o tcp_sendspace=65536
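When the same tunables have to be applied on several servers, the commands can be generated from a list and reviewed before they are run. This is a dry-run sketch: it only prints the commands, it applies -p to every entry, and the function name gen_no_cmds is an assumption.

```shell
# Print (do not run) one "no -p -o" command for each name=value line on stdin.
gen_no_cmds() {
    while IFS='=' read -r name value; do
        [ -n "$name" ] || continue          # skip blank lines
        printf 'no -p -o %s=%s\n' "$name" "$value"
    done
}

# Values taken from the examples above.
printf '%s\n' 'tcp_sendspace=65536' 'sb_max=2097152' | gen_no_cmds
```

Once the printed commands look right, they can be pasted into a root shell or piped to sh.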
----------------------------------------------------------------------------------------------------------------
32. How to verify your ntp setup is working properly ?
# ntpq -c peers
remote refid st t when poll reach delay offset disp
=========================================================================
*time.domain.co doghaus.cns.uto 3 u 12 64 177 1.46 -2.515 145.19
If there is a star (*) in front of the name of the time server in the first column, your time is being synchronised properly.
The third column, st, is the stratum. The lower the number, the closer you are to the time source.
Stratum 16 means you are not synchronised.
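The check for the star can be scripted, for example in a monitoring job. A minimal sketch, assuming the column layout shown above (name first, stratum third); the function name synced_peer is made up for illustration.

```shell
# Print the name and stratum of the peer marked with "*" (the selected
# time source) from "ntpq -c peers" output read on stdin.
synced_peer() {
    awk '/^\*/ { print substr($1, 2), $3 }'
}

# Sample line copied from the output above.
echo '*time.domain.co doghaus.cns.uto 3 u 12 64 177 1.46 -2.515 145.19' | synced_peer
```

If the function prints nothing, no peer is currently selected and the clock is not synchronised.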
----------------------------------------------------------------------------------------------------------------
33. How to synchronise your server time to a time server ?
If you don't want to run NTP (network time protocol), you can update your system time with the command:
# ntpdate timeserver.domain.com
----------------------------------------------------------------------------------------------------------------
34. How to list open files ?
To list all open files:
# lsof
To list all open files on a device:
# lsof /dev/hd4
----------------------------------------------------------------------------------------------------------------
35. How to find the world-wide name (WWN) or network address of a fibre-channel (FC) card in IBM AIX ?
First find the name of your fibre-channel cards:
# lsdev -Cc adapter | grep fcs
Then get the WWN (for fcs0 in this example):
# lscfg -vp -l fcs0 | grep "Network Address"
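To get the bare WWN for use in a script, the label and its dot padding can be stripped. This is a sketch under assumptions: the sample line imitates the usual "label....value" layout of lscfg output, the WWN in it is made up, and fc_wwn is a hypothetical function name.

```shell
# Extract the value from a "Network Address" line of "lscfg -vp -l fcs0"
# output read on stdin (the dot padding between label and value is stripped).
fc_wwn() {
    sed -n 's/.*Network Address\.*//p'
}

# Hypothetical sample line; the WWN shown is made up.
echo '        Network Address.............10000000C9123456' | fc_wwn
```

On a live system: lscfg -vp -l fcs0 | fc_wwn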
----------------------------------------------------------------------------------------------------------------
36. How to change a server hostname with the uname command in IBM AIX ?
To display the current hostname
# uname -n
localhost
To change the hostname
# uname -S newhostname
Again display the hostname
# uname -n
newhostname
----------------------------------------------------------------------------------------------------------------
37. How can you find out when a system was installed?
Enter the following command:
# lslpp -h bos.rte
The output of this command shows the history of when the operating system was installed. Read the entry for the AIX level (e.g. 4.3.3.0).
----------------------------------------------------------------------------------------------------------------
38. How to make sure all of the user definitions are correct in the user database
# usrck -n ALL
Do the same for the groups:
# grpck -n ALL
----------------------------------------------------------------------------------------------------------------
39. Reduce the Size of a File System in Your Root Volume Group ?
http://publib16.boulder.ibm.com/pseries/en_US/infocenter/howto/HT_baseadmn_rootvg_reduce.htm#baseadmn_rootvg_reduce
----------------------------------------------------------------------------------------------------------------
40. How to reset an Unknown Root Password ?
http://publib16.boulder.ibm.com/pseries/en_US/infocenter/howto/HT_baseadmn_recoverrootpswd.htm#baseadmn_recoverrootpswd
----------------------------------------------------------------------------------------------------------------
41. How to configure Domain Name Servers ?
http://publib16.boulder.ibm.com/pseries/en_US/infocenter/howto/HT_commadmn_dns.htm#commadmn_dns
-----------------------------------------------------------------------------------------------------------------
42. How to re-create corrupted boot image ?
http://publib16.boulder.ibm.com/pseries/en_US/infocenter/howto/HT_baseadmn_bad_boot_img.htm#baseadmn_bad_boot_img
----------------------------------------------------------------------------------------------------------------
43. How to configure NIM master server using EZNIM ?
http://publib16.boulder.ibm.com/pseries/en_US/infocenter/howto/HT_insgdrf_configure_eznim.htm#insgdrf_configure_eznim
----------------------------------------------------------------------------------------------------------------
44. How do I use network (For normal users)
http://publib16.boulder.ibm.com/pseries/en_US/infocenter/howto_user/Use_Network_U.htm#category_use_network
----------------------------------------------------------------------------------------------------------------
45. How do I view system and environment information?
http://publib16.boulder.ibm.com/pseries/en_US/infocenter/howto_user/AccessSys_Envir_U.htm#category_accesssys_envir
----------------------------------------------------------------------------------------------------------------
46. How do I use shell scripts ?
http://publib16.boulder.ibm.com/pseries/en_US/infocenter/howto_user/Use_Shells_U.htm#category_use_shells
----------------------------------------------------------------------------------------------------------------
47. How do I redirect standard input, output, and error?
http://publib16.boulder.ibm.com/pseries/en_US/infocenter/howto_user/RedirStand_IO_Error_U.htm#category_redirstand_io_error
------------------------------------------------------------------------------------------------
48. How do I make my system more secure?
http://publib16.boulder.ibm.com/pseries/en_US/infocenter/howto_user/Security_U.htm#category_security
----------------------------------------------------------------------------------------------------------------
49. How to configure network adapters for Redundancy ?
http://users.ca.astound.net/~baspence/AIXtip/etherchannel.htm
----------------------------------------------------------------------------------------------------------------
50. Useful link about AIX Error Log codes, LED codes, 7-Digit Error codes.
http://rainsux.dyndns.org/AIX5L-Messages-Codes.html
----------------------------------------------------------------------------------------------------------------
51. How to disable remote root login ?
When multiple users have root access to a system, a common security question is "who logged in as root?" One alternative is to disable remote logins for the root id (chuser rlogin=false root). This forces users to first log in with their regular user id, then "su -" to root. All "su" activity is captured in /var/adm/sulog, thus answering the question of "who logged in as root."
Comment: In general it is a good practice to disable root remote access as it provides two layers of password protection.
----------------------------------------------------------------------------------------------------------------
52. Replacing a disk drive in AIX.
http://users.ca.astound.net/~baspence/AIXtip/download/failed_disk.pdf
----------------------------------------------------------------------------------------------------------------
53. How to enable Non-root Users to Administer Passwords ?
The AIX pwdadm command can be used to offload password administration to non-root administrators. The pwdadm command allows the administrator to change another user's password, or force users to change their password at the next login. To enable a non-root administrator to use pwdadm, simply add their ID to the "security" group.
For more information: http://www.rs6000.ibm.com/doc_link/en_US/a_doc_lib/cmds/aixcmds4/pwdadm.htm
----------------------------------------------------------------------------------------------------------------
54. Fun with device locations
Here are a few commands to locate physical devices. These commands are useful in a partitioned environment where locations are virtual.
lsdev -Cc adapter -s pci - lists all adapter slots
lsdev -p adapter - lists devices owned by an adapter
lsdev -Cl device -F parent - lists the parent adapter for a device (like a disk drive)
lsdev -Cl adapter - virtual device location (for LPARs)
lscfg -vl adapter - actual device location
So, for example, to locate the physical adapter connected to hdisk0:
# Identify the parent adapter
lsdev -Cl hdisk0 -F parent
# Locate the parent adapter
lscfg -vl parent
----------------------------------------------------------------------------------------------------------------
55. How to automate setting passwords ?
The "chpasswd" command is easier to use than "passwd" when setting a list of user passwords. It can be used from the command line or a shell script. For example, to change passwords for users listed in a file, type the following:
cat mypasswords | chpasswd
Where the mypasswords file contains:
user1:password1
user2:password2
......
For more information see the following URL
http://publib16.boulder.ibm.com/doc_link/en_US/a_doc_lib/cmds/aixcmds1/chpasswd.htm
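Before feeding a long list to chpasswd, it can be worth checking that every line really has the user:password form. The following is a minimal sketch of such a check; the function name check_pwfile is an assumption, and it validates only the format, not whether the users exist.

```shell
# Return 0 if every non-empty line on stdin has the form user:password,
# otherwise print the offending line numbers and return 1.
check_pwfile() {
    awk -F: '
        NF == 0 { next }                       # ignore empty lines
        NF < 2 || $1 == "" || $2 == "" { print "bad line " NR; bad = 1 }
        END { exit bad }
    '
}

printf '%s\n' 'user1:password1' 'user2:password2' | check_pwfile && echo "format ok"
```

A possible use: check_pwfile < mypasswords && cat mypasswords | chpasswd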
----------------------------------------------------------------------------------------------------------------
56. How to list files if 'ls' is missing or corrupt ?
echo *
----------------------------------------------------------------------------------------------------------------
57. How to change the timezone and language in /etc/environment ?
chtz (timezone eg GMT0BST)
chlang (language eg En_GB)
----------------------------------------------------------------------------------------------------------------
58. Find large files
How do you find really large files in a file system:
find . -size +1024 -xdev -exec ls -l {} \;
The -xdev flag is used to only search within the same file system, instead of traversing the full directory tree. The amount specified (1024) is in blocks of 512 bytes.
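Because the size is given in 512-byte blocks, +1024 only means "larger than 512 KB". A small helper makes it easier to think in megabytes. This is a sketch; the function name mb_to_blocks is made up for illustration.

```shell
# Convert megabytes to 512-byte blocks for use with "find -size".
mb_to_blocks() {
    echo $(( $1 * 2048 ))     # 1 MB = 1048576 bytes = 2048 blocks of 512 bytes
}

mb_to_blocks 100
# Example use (not run here):
#   find . -xdev -size +"$(mb_to_blocks 100)" -exec ls -l {} \;
```

The commented find line lists files larger than 100 MB in the current file system.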
----------------------------------------------------------------------------------------------------------------
59. Monitoring a system without logging in
Let's say you have a helpdesk, where they must be able to run a script under user-id root to check or monitor a system:
First, create the script you wish your helpdesk to run.
Modify your /etc/inetd.conf file and add:
check stream tcp nowait root /usr/local/bin/script.sh script.sh
where script.sh is the script you've written.
Modify your /etc/services file and add:
check 4321/tcp
You may change the port number to anything you like, as long as it's not in use.
Refresh inetd so that it rereads its configuration:
refresh -s inetd
Now, you may run:
telnet [system] 4321
and your script will be run and its output displayed on your screen. If the output of the script isn't displayed on your screen long enough, just put a sleep command at the end of your script.
----------------------------------------------------------------------------------------------------------------
60. Changing maxuproc requires a reboot?
When you change MAXUPROC (maximum number of processes allowed per user), the smitty help panel tells you that changes to this operating system parameter will take effect after the next system reboot.
This help information is wrong. If MAXUPROC is increased, the change takes effect immediately. If it is decreased, it takes effect after the next system reboot.
The smitty help panel text will be changed in AIX 5.3 (APAR IY52397).
----------------------------------------------------------------------------------------------------------------
61. Defunct processes
Defunct processes are commonly known as "zombies". You can't "kill" a zombie as it is already dead. Zombies are created when a process (typically a child process) terminates either abnormally or normally and its spawning process (typically a parent process) does not "wait" for it (or has yet to "wait" for it) to return an exit status.
It should be noted that zombies DO NOT consume any system resources (except a slot in the process table). They remain until the parent process waits for them or itself exits (after which init reaps them); if neither happens, they stay until the server is rebooted.
Zombies commonly occur with programs that were (incompletely) ported from old BSD systems to modern SysV systems, because the semantics of signals and/or waiting differ between these two OS families.
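Since a zombie can only be cleaned up through its parent, the useful information is the parent's PID. The sketch below lists the PID and parent PID of each defunct entry. It assumes ps -ef style columns (PID second, PPID third); the sample lines are hypothetical and list_zombies is an illustrative name.

```shell
# Print "PID PPID" for every <defunct> entry in "ps -ef" output on stdin.
# Assumes PID and PPID are the second and third columns, as in "ps -ef".
list_zombies() {
    awk '/<defunct>/ { print $2, $3 }'
}

# Hypothetical sample; PIDs and column spacing are made up.
printf '%s\n' \
    '     UID   PID  PPID   C    STIME    TTY  TIME CMD' \
    '    root  4242  4100   0                  0:00 <defunct>' \
    '    root  4100     1   0 10:15:01      -  0:00 /usr/local/bin/parent' | list_zombies
```

On a live system: ps -ef | list_zombies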
------------------------------------------------------------------------------------------------
62. DLpar with DVD-ROM
Adding a DVD-ROM with DLPAR is very easy. Removing it, however, can be somewhat more difficult, especially when you've run cfgmgr and devices have been configured.
This is how to remove it:
# rmdev -dl cd0
(Remove all cdrom devices found with lsdev -Cc cdrom)
# rmdev -dl ide0
Then remove the devices found with
# lsdev -C | grep pci
PCI devices that are still in use cannot be removed. The one not in use is the PCI device on which the DVD-ROM drive was configured. You have to remove it before you can do a DLPAR remove operation on it.
Now do your DLPAR remove operation on the HMC.
------------------------------------------------------------------------------------------------
63. How do you send an attachment via mail from AIX ?
Uuencode is the answer:
uuencode [source-file] [filename].b64 | mail -v -s "subject" [email-address]
For example:
# uuencode /etc/motd motd.b64 | mail -v -s "Message of the day" email@hostname.com
I use the .b64 extension, which gets recognized by Winzip. When you receive your email in Outlook, you will have an attachment, which can be opened by Winzip.
------------------------------------------------------------------------------------------------
64. FTP umask
A way to change the default 027 umask of ftp is to change the entry in /etc/inetd.conf for ftpd:
ftp stream tcp6 nowait root /usr/sbin/ftpd ftpd -l -u 117
This will create files with umask 117 (mode 660). Run "refresh -s inetd" to make inetd reread its configuration.
Using the -l option makes sure the FTP sessions are logged to syslogd. If you want to see these FTP messages in the syslogd output, add the following to /etc/syslog.conf:
daemon.info [filename]
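The relation between the ftpd umask and the resulting file mode can be checked with a little octal arithmetic: new regular files start from mode 666 and the umask bits are cleared from it. A minimal sketch (the function name mode_for_umask is an assumption):

```shell
# Print the permission bits a new regular file gets under a given umask
# (regular files start from mode 666 before the mask is applied).
mode_for_umask() {
    printf '%o\n' "$(( 0666 & ~0$1 ))"
}

mode_for_umask 117    # the umask set with "ftpd -u 117"
mode_for_umask 027    # the default ftp umask mentioned above
```

This shows why umask 117 leads to mode 660, while the default 027 gives 640.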
AIX - Tips n Tricks - Part II
1. How to configure the system and create a restricted shell user ?
The example below shows how to create a restricted shell user (this user can execute only the "ls" and "vi" commands):
a) Make a reduced bin directory to contain links to programs for the user or users:
# mkdir /usr/rbin
b) Link the necessary commands and programs in the reduced bin directory.
For example, give access to the ls and vi commands:
# ln -s /usr/bin/ls /usr/rbin/ls
# ln -s /usr/bin/vi /usr/rbin/vi
c) Add Rsh as a valid shell in /etc/security/login.cfg:
# vi /etc/security/login.cfg
d) Add /usr/bin/Rsh to the list of shells in the usw stanza:
usw:
shells = /bin/sh,/bin/bsh,/bin/csh,/bin/ksh,/bin/tsh,/bin/ksh93,/usr/bin/sh,/usr/bin/bsh,/usr/bin/csh,/usr/bin/ksh,/usr/bin/tsh,/usr/bin/ksh93,/usr/sbin/uucp/uucico,/usr/sbin/sliplogin,/usr/sbin/snapp,/usr/bin/Rsh
e) Add the restricted shell user:
# mkuser shell="/usr/bin/Rsh" alex
f) Assign an initial password:
# passwd alex
g) Change the ownership of the user's profile to root:
# chown root:system /home/alex/.profile
h) Change the permissions of the user's profile to 755:
# chmod 755 /home/alex/.profile
i) Edit the user's profile, setting the PATH and SHELL variables:
# vi /home/alex/.profile
Set PATH to the new bin directory and set SHELL to Rsh:
PATH=/usr/rbin; SHELL=/usr/bin/Rsh; export PATH SHELL
----------------------------------------------------------------------------------------------------------------
2. How to change the default welcome (herald) message on the login display ?
Edit the file /etc/security/login.cfg and update the herald parameter ...
default:
herald = "Unauthorized use of this system is prohibited\n\nlogin: "
sak_enable = false
logintimes =
logindisable = 0
logininterval = 0
loginreenable = 0
logindelay = 0
You can also use the command below to change the herald value:
# chsec -f /etc/security/login.cfg -s default -a herald="Unauthorized use of this system is prohibited.\n\nlogin: "
----------------------------------------------------------------------------------------------------------------
3. How to set automatic logoff (only for terminals) ?
Edit the /etc/security/.profile file to include an automatic logoff value for all users, as in the following example:
TMOUT=600 ; TIMEOUT=600 ; export TMOUT TIMEOUT ; readonly TMOUT TIMEOUT
The number 600, in this example, is in seconds, which is equal to 10 minutes. However, this method only works from the shell.
----------------------------------------------------------------------------------------------------------------
4. How to auto forward the mails ?
Create a $HOME/.forward file and add addresses or aliases to it.
When mail is sent to a local user, the sendmail command checks for the $HOME/.forward file.
If the file exists, the message is not sent to the user. The message is sent to the addresses or aliases in the $HOME/.forward file.
----------------------------------------------------------------------------------------------------------------
5. How to set(define) and unset a variable in a shell or shell script ?
# x=3 -> Defines a value for the variable 'x'
# echo $x -> Displays the value of the 'x' variable
3
# unset x -> Unsets the variable
# echo $x -> Displays its value again (now empty)
#
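In a script it is often necessary to tell an unset variable from one that is merely empty. A short sketch using standard parameter expansion: ${x+set} expands to "set" only when x is defined, and ${x:-default} supplies a fallback value.

```shell
x=3
[ -n "${x+set}" ] && echo "x is set to $x"

unset x
[ -n "${x+set}" ] || echo "x is unset"

# Fall back to a default value when x is unset or empty.
echo "${x:-default}"
```

The variable name x follows the example above; "default" is just a placeholder string.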
----------------------------------------------------------------------------------------------------------------
6. How to send file1 as a message to user alex ?
# mail alex < file1
----------------------------------------------------------------------------------------------------------------
7. How to display mail queue ?
Note: mailq is the queue where your mails are stored
# mailq (or) sendmail -bp
There is 1 request in the mail queue
---QID---- --Size-- -----Q-Time----- ----------Sender/ Recipient-----------
OAA 19258 * 29 Mon Jun 26 14:57 root
----------------------------------------------------------------------------------------------------------------
8. What is the sendmail command ?
It receives formatted messages and routes them to one or more users. It can deliver messages to users on local or remote machines. It is started by the tcpip subsystem and uses /etc/mail/sendmail.cf as its config file.
Once this daemon has started, you can find its process id in
/etc/sendmail.pid.
----------------------------------------------------------------------------------------------------------------
9. How to define mail aliases for users?
a) Add the aliases to /etc/aliases.
For Example,
nobody: /dev/null
certify: user02, user5801@server3, root@server4, user5911@se
b) Rebuild the aliases database using
newaliases (or) sendmail -bi
----------------------------------------------------------------------------------------------------------------
10. If logging in with telnet takes a long time (for example 2 minutes), what might be the issue?
There might be a problem with DNS resolution. Check /etc/resolv.conf and check the DNS connection with the nslookup command.
----------------------------------------------------------------------------------------------------------------
11. While attempting to log in, you see the message below. How do you solve this issue ?
'All available login sessions are in use.'
Check the number of AIX user licenses using "lslicense".
If required, increase the number of licenses using the "chlicense" command.
----------------------------------------------------------------------------------------------------------------
12. The Oracle DBA says that his database is not able to go beyond a certain limit. For example, the oracle userid is not able to start more than 500 processes. What is the issue?
This is because the "maxuproc" value is 500. Check the value using "lsattr -El sys0 -a maxuproc".
If required, change the value using
# chdev -l sys0 -a maxuproc=1000
Normally for Oracle production machines, you have to consult with the DBAs while installing the server and set an agreed value.
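To see how close a user is to the maxuproc limit, the current number of processes per user can be counted. The following is a sketch: it relies on the POSIX form ps -e -o user= (one owner name per process, no header), and procs_per_user is an illustrative function name.

```shell
# Count processes per user from a list of user names on stdin
# (one name per process), busiest user first.
procs_per_user() {
    sort | uniq -c | sort -rn
}

# "ps -e -o user=" prints one owner name per process with no header.
ps -e -o user= | procs_per_user | head -5
```

Comparing the count for the oracle user with the maxuproc value shows how much headroom is left.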
----------------------------------------------------------------------------------------------------------------
13. Errpt is not displaying any reports. The /var/adm/ras/errlog file is in place and errdemon is running fine. What might be the issue?
The errlog file seems to be corrupted. Stop the errdemon (/usr/lib/errstop) and delete the file.
Then start the errdemon (/usr/lib/errdemon). While starting, the daemon creates the errlog file automatically.
----------------------------------------------------------------------------------------------------------------
14. How to list IDE controllers in your system ?
# lscfg -l ide*
DEVICE LOCATION DESCRIPTION
ide0 01-00-00 ATA/IDE Controller Device
ide1 01-00-01 ATA/IDE Controller Device
The sample output above from the lscfg -l ide* command shows:
There are 2 IDE I/O controllers configured in the server.
Controllers ide0 and ide1 are located on the system planar (notice the 1st and 2nd digits in the location code).
The planar indicator is the second digit in the location value, with a value of 1.
The 6th digit indicates the controller number.
----------------------------------------------------------------------------------------------------------------
15. After a successful login, the login command displays the message of the day, the date and time of the last successful and unsuccessful login attempts for this user, and the total number of unsuccessful login attempts for this user since the last change of authentication information (usually a password).
How do you suppress these messages?
You can suppress these messages by creating a “.hushlogin” file in your home directory.
For Example,
At the prompt in your home directory, type the following:
# touch .hushlogin
The touch command creates the empty file named .hushlogin if it does not already exist. The next time you log in, all login messages will be suppressed. You can instruct the system to retain only the message of the day, while suppressing other login messages.
------------------------------------------------------------------------------------------------------------
16. Which files does the system read when you log in ?
First File : /etc/environment - contains variables specifying the basic environment for all processes.
Second File: /etc/profile - controls system-wide default variables
Third File : $HOME/.profile - lets you customize your individual working environment
Fourth File: $HOME/.env - lets you customize your individual working environment variables.
------------------------------------------------------------------------------------------------------------
17. How to override variables defined in /etc/environment for a particular user?
A fourth file that the operating system uses at login time is the $HOME/.env file, if your .profile contains the following line:
export ENV=$HOME/.env
The .env file lets you customize your individual working environment variables. The .env file contains the individual user environment variables that override the variables set in the /etc/environment file. You can customize your environment variables as desired by modifying your .env file.
------------------------------------------------------------------------------------------------------------
18. How to change the font in AIX ?
To change the font to an italic, roman, and bold face of the same size, type the following:
# chfont -n /usr/lpp/fonts/It114.snf /usr/lpp/fonts/Bld14.snf /usr/lpp/fonts/Rom14.snf
You can also use smitty chfont.
------------------------------------------------------------------------------------------------------------
19. How to run a process in the background ?
For example, to run script1.sh in the background, run
# script1.sh &
But this process gets killed if you close the terminal.
So make it a practice to run it using nohup:
# nohup script1.sh &
With nohup, the process is not killed when you close the telnet session. Output from the process/script is stored in a file called nohup.out in the directory from which you started the process.
This helps if, for example, you want to start a backup using mksysb and then close your terminal or leave the office: you can safely use "nohup command &". The next morning, you can view the contents of nohup.out to know the status of the backup job.
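A minimal sketch of the nohup pattern described above. The job here is only a stand-in (a short sleep and an echo), and the log file name job.log is a hypothetical choice; redirecting output explicitly is an alternative to relying on nohup.out.

```shell
#!/bin/sh
# Stand-in for a long-running job such as a backup (hypothetical command).
job='sleep 1; echo "backup finished"'

# Start the job immune to hangups and send stdout and stderr to job.log
# instead of the default nohup.out.
nohup sh -c "$job" > job.log 2>&1 &
job_pid=$!
echo "started job with pid $job_pid"

# In real use you would log out here. For the demo, wait for the job
# and then read the log, as you would the next morning.
wait "$job_pid"
cat job.log
```

Keeping the pid in a variable makes it easy to check later, with ps, whether the job is still running.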
------------------------------------------------------------------------------------------------------------
20. What is the default priority for a process?
The default priority (nice value) is 0. Priority numbers are in the range of -20 to 20. The highest number is the lowest priority, and the lowest number has the highest priority when using resources.
To set the priority while starting a process, use the nice command.
If the process is already running, you can use the "renice" command to change its priority.
------------------------------------------------------------------------------------------------------------
21. How to stop, resume, and bring a process to the foreground?
To stop (pause) a foreground process, press Ctrl+Z.
Note: Ctrl+Z works in the Korn shell (ksh) and C shell (csh), but not in the Bourne shell (bsh).
To restart a stopped process, you must either be the user who started the process or have root user authority.
To restart a stopped process, enter
# kill -19 pid
(On AIX, signal 19 is SIGCONT.)
where pid is the process ID, which can be obtained from the following command
# ps -ef | grep process_name | grep -v grep | awk '{print $2}'
To run it in the foreground, use the job number shown by the jobs command rather than the pid:
# fg %job_number
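A small sketch of pausing and resuming a process with signal names instead of numbers. Names are used here because the numbers differ between systems (19 is SIGCONT on AIX but SIGSTOP on Linux). The sleep process is only a hypothetical stand-in for a real job.

```shell
#!/bin/sh
# Start a stand-in background process (any long-running command works).
sleep 30 &
pid=$!

# Pause it; this has a similar effect to Ctrl+Z on a foreground job.
kill -s STOP "$pid"

# Resume it by signal name rather than by number.
kill -s CONT "$pid"

# kill -0 sends no signal; it only checks that the process still exists.
if kill -0 "$pid" 2>/dev/null; then
    resumed=yes
else
    resumed=no
fi
echo "process $pid still running after resume: $resumed"

# Clean up the demo process.
kill "$pid"
wait "$pid" 2>/dev/null || true
```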
------------------------------------------------------------------------------------------------------------
22. How to display a program's output as well as copy it to a file ?
Normally, output redirection suppresses the output on screen.
Ex: ls -l > file1
If you want to redirect the output and also show it on screen, use the tee command. The -a flag appends to the file instead of overwriting it.
Ex: ls -l | tee -a file1
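To illustrate the difference between tee and tee -a, here is a short sketch. The scratch file name under /tmp is a hypothetical choice for the demo.

```shell
#!/bin/sh
log=/tmp/tee_demo.$$      # hypothetical scratch file for the demo

# First run: show the text on screen and write it to the file
# (any previous content would be overwritten).
echo "first run" | tee "$log"

# Second run: -a appends to the file instead of overwriting it.
echo "second run" | tee -a "$log"

# Count the lines now stored in the file; the arithmetic expansion
# strips any padding that wc may add.
lines=$(($(wc -l < "$log")))
echo "lines in log: $lines"

rm -f "$log"
```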
------------------------------------------------------------------------------------------------------------
23. How to capture your terminal screen to a file ?
To capture the screen of a terminal, at the prompt, type the following:
# script
The system displays information similar to the following:
Script command is started. The file is typescript.
Everything displayed on the screen is now copied to the "typescript" file.
To stop the script command, press Ctrl-D or type exit and press Enter.
The system displays information similar to the following:
^D
Script command is complete. The file is typescript.
Use the cat command to display the contents of your file.
------------------------------------------------------------------------------------------------------------
24. What are the supported file systems in AIX ?
a) JFS (or) JFS2 - Disk based file system
b) NFS - Network based File system
c) CDRFS - CDROM based file system
d) UDFS - DVD-ROM based file system
e) RAMFS - RAM based file system used while booting the system
------------------------------------------------------------------------------------------------------------
25. What are the different directory abbreviations?
Abbreviation Meaning
. The current working directory
.. The parent of the current working directory
~ Your home directory
$HOME Your home directory
------------------------------------------------------------------------------------------------------------
26. What are the different directory path names ?
Absolute path name:
Traces the path from the / (root) directory. Absolute path names always begin with the slash (/) symbol.
Ex. /home/raja/dir1
Relative path name:
Traces the path from the current directory through its parent or its subdirectories and files. As user "raja", I can say ./dir1 since I'm already in /home/raja.
------------------------------------------------------------------------------------------------------------
27. How to move a directory ?
# mvdir book manual
This moves the book directory under the directory named manual, if the manual directory exists. Otherwise, the book directory is renamed to manual.
------------------------------------------------------------------------------------------------------------
28. Which RAID levels does AIX LVM support?
RAID-0 - Striping
RAID-1 - Mirroring
RAID-10 (or) RAID 0+1 - Mirroring and striping
------------------------------------------------------------------------------------------------------------
29. How to read and remove mails from my system mailbox?
At your system command line prompt, enter the mail command:
# mail
If there is no mail in your system mailbox, the system responds with a message:
No mail for YourID
If there is mail in your mailbox, the system displays a listing of the messages in your system mailbox:
Here Type ? for help.
"/usr/mail/lance": 3 messages 3 new
>N 1 karen Tue Apr 27 16:10 12/321 "Dept Meeting"
N 2 lois Tue Apr 27 16:50 10/350 "System News"
N 3 tom Tue Apr 27 17:00 11/356 "Tools Available"
The current message is always prefixed with a greater-than symbol (>).
Each one-line entry displays the following fields:
status - Indicates the class of the message.
number - Identifies the piece of mail to the mail program.
sender - Identifies the address of the person who sent the mail.
date - Specifies the date the message was received.
size - Defines the number of lines and characters contained in the message (this includes the header).
subject - Identifies the subject of the message, if it has one.
The status can be any of the following:
N - A new message.
P - A message that will be preserved in
------------------------------------------------------------------------------------------------------------
30. After logging in as an application user (oradba), when I issued "crontab -l" the system threw the below error
0481-103 Cannot open a file in the /var/spool/cron/crontabs directory.
What is the solution?
Here is the solution
a) Create an empty file /var/spool/cron/crontabs/oradba
b) Change the ownership of the file to user root and group cron (root.cron)
c) Log in as oradba and issue "crontab -l" to verify the cron.
------------------------------------------------------------------------------------------------------------
31. How to identify the program listening on a given port ?
METHOD I: # lsof -P -n -i :505 (for port 505)
METHOD II:
# netstat -Aan|grep 9404
f100060006952b98 tcp 0 0 *.9404 *.* LISTEN
f100060006a90b98 tcp 0 0 *.19404 *.* LISTEN
# rmsock f100060006952b98 tcpcb
The socket 0x6952808 is being held by proccess 753870 (java).
------------------------------------------------------------------------------------------------------------
32. How to display non-printable characters in a text file ?
Let's create a file with non-printable characters (tabs and empty lines) and view them in vi using ":set list".
# vi filename.txt
^I^I^I^I$
$
$
$
this is a test$
^I^I^I^I$
~
:set list
Now list the file so that the non-printable characters are visible (tabs appear as ^I and line ends as $):
# cat -vet filename.txt
^I^I^I^I$
$
$
$
this is a test$
^I^I^I^I$
# od -c filename.txt
0000000 \t \t \t \t \n \n \n \n t h i s i s
0000020 a t e s t \n \t \t \t \t \n
0000034
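The same check can be reproduced on a small sample file. This is a sketch only; the file name and its contents are hypothetical.

```shell
#!/bin/sh
f=/tmp/nonprint_demo.$$    # hypothetical scratch file

# Two tabs before "one", an empty line, and a trailing space after "two".
printf '\t\tone\n\ntwo \n' > "$f"

# -v shows control characters, -e marks each line end with $,
# -t shows tabs as ^I.
visible=$(cat -vet "$f")
printf '%s\n' "$visible"

# od -c prints every byte, so tabs and newlines show as \t and \n.
od -c "$f"

rm -f "$f"
```

The trailing space after "two" becomes easy to spot because the $ marker does not sit directly against the word.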
------------------------------------------------------------------------------------------------------------
33. How to display specific lines in a text file ?
For illustration purposes, I'm using cat -n filename to show the line numbers in this script.
# cat -n filename
...
8 for i in $*
9
10 do
11
12 typeset -i16 hex
13 hex=$i
14 print $i equals $hex in hexadecimal
15
16 typeset -i8 oct
17 oct=$i
18 print $i equals $oct in octal
19
20 typeset -i2 bin
21 bin=$i
22 print $i equals $bin in binary
23
24 print
25 done
...
The following prints the for loop (lines 8 to 25) without the line numbers and also saves it to the file for_loop:
# sed -n 8,25p filename | tee for_loop
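As a quick way to try the line-range technique, the sketch below builds a hypothetical 30-line sample file and extracts lines 8 to 25 with sed, then shows an equivalent awk form.

```shell
#!/bin/sh
f=/tmp/sed_demo.$$        # hypothetical scratch file

# Build a 30-line sample file: "line 1" ... "line 30".
i=1
while [ "$i" -le 30 ]; do
    echo "line $i"
    i=$((i + 1))
done > "$f"

# -n suppresses the default output; 8,25p prints only lines 8 to 25.
sed -n '8,25p' "$f" > "$f.out"

first=$(head -n 1 "$f.out")
last=$(tail -n 1 "$f.out")
count=$(($(wc -l < "$f.out")))
echo "extracted $count lines, from '$first' to '$last'"

# Equivalent selection with awk, using the record number NR.
awk 'NR >= 8 && NR <= 25' "$f" > /dev/null

rm -f "$f" "$f.out"
```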
------------------------------------------------------------------------------------------------------------
34. How to recover the root password in AIX ?
If you have forgotten the root password, you can easily recover it, but the system requires 2 reboots.
Here is the way I follow.
Password recovery is one of the simplest troubleshooting procedures in AIX. Once you boot from the CD, you see a menu with 3 menu items.
Select the 3rd item, i.e.,
"Start Maintenance Mode for System Recovery" ->
"Access a Root Volume Group" ->
"Access this volume group and start a shell".
This opens a shell prompt. Then just use the "passwd" command to set a new password for root.
That's it. The root password has been changed.
Now you can reboot the machine from the rootvg hard disk (normally it should be hdisk0).
------------------------------------------------------------------------------------------------------------
34. How to find out the (real) memory usage ?
# svmon -G
               size      inuse       free        pin    virtual
memory      2097152    2097026        126     195637    1237158
pg space     524288      61023

               work       pers       clnt      lpage
pin          195404        233          0          0
in use      1189840     906786        400          0
The size and inuse columns of the memory and pg space rows represent real memory and paging space usage respectively.
The size is measured as the number of 4K pages.
In this case, the used memory is
= ((2097026 x 4)/1024)/1024 GB, which is approximately 8 GB of used memory
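The conversion above can be wrapped in a small helper. This is a sketch that assumes 4 KB pages, as stated above; the function name pages_to_gb is hypothetical.

```shell
#!/bin/sh
# Convert a count of 4 KB pages (as printed by svmon -G) to gigabytes:
# pages * 4 gives KB, /1024 gives MB, /1024 again gives GB.
pages_to_gb() {
    awk -v pages="$1" 'BEGIN { printf "%.2f\n", pages * 4 / 1024 / 1024 }'
}

# Values taken from the sample svmon -G output.
echo "memory size:   $(pages_to_gb 2097152) GB"
echo "memory in use: $(pages_to_gb 2097026) GB"
echo "paging size:   $(pages_to_gb 524288) GB"
```

With the sample numbers, the machine has 8 GB of real memory and 2 GB of paging space, and almost all of the real memory is in use.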
------------------------------------------------------------------------------------------------------------
35. Here are some of the errors you get when paging space is low.
INIT: Paging space is low!
ksh: cannot fork no swap space
Not enough memory
Fork function failed
fork () system call failed
Unable to fork, too many processes
Fork failure - not enough memory available
Fork function not allowed. Not enough memory available.
------------------------------------------------------------------------------------------------------------
36. How is the default paging space size determined ?
The default size is determined according to the following rules:
Set paging space to 2 times the amount of RAM
Paging space can use no more than 20% of the total disk space in the root volume group
Paging space can be no larger than 2 GB
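The three rules can be combined into one calculation. The sketch below only encodes the rules as listed above and is not the installer's actual algorithm; the function name and its arguments (RAM and rootvg size, both in MB) are hypothetical.

```shell
#!/bin/sh
# Apply the three rules in order: start with 2 x RAM, then cap the result
# at 20% of the rootvg size and at 2 GB (2048 MB).
default_paging_mb() {
    ram_mb=$1
    rootvg_mb=$2

    size=$((ram_mb * 2))
    cap=$((rootvg_mb * 20 / 100))

    if [ "$size" -gt "$cap" ]; then
        size=$cap
    fi
    if [ "$size" -gt 2048 ]; then
        size=2048
    fi
    echo "$size"
}

# 512 MB RAM, 20 GB rootvg: 2 x RAM fits under both caps.
default_paging_mb 512 20480
# 4 GB RAM, 20 GB rootvg: limited by the 2 GB ceiling.
default_paging_mb 4096 20480
# 512 MB RAM, 2 GB rootvg: limited by the 20% rule.
default_paging_mb 512 2048
```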