Tanti Technology


Tuesday 10 December 2013

Paging Space Tips on AIX 5.3 or above


Technote (FAQ)
Question
This document contains tips for allocating paging space on the system. The information contained in this document is valid for AIX 5.3 and above.

Answer
AIX 5L provides two enhancements for managing paging space. A new command, swapoff, allows you to deactivate a paging space. The -d flag of the chps command provides the ability to decrease the size of a paging space. With both commands, a system reboot is no longer required.
Deactivating a paging space
To deactivate a paging space with the swapoff command, you can either use:
#swapoff device name {device name ...}
or a system management tool, such as SMIT (fast path swapoff).
This command may fail due to:
  • Paging space size constraints. While deactivating one paging space, it is necessary to move all pages in use to another active paging space (which must be large enough to accommodate those pages).
  • I/O errors. If I/O errors should occur:
    • Check the error log.
    • Deactivate the problematic paging space for the next system reboot with the chps command, and reboot the system,
      OR
      deactivate the problematic paging space for the next system reboot with Web-based System Manager by selecting the problematic paging space from either the Paging Space, Logical Volume or Volume Groups plug-in and selecting Stop...(2) from the Selected pulldown or popup menu.
*Note: Do not try to reactivate paging spaces with I/O errors before you have checked the corresponding disk with the appropriate diagnostic tools. In this case, the lsps command displays the string I/O error in the Active column.
Decreasing the size of a paging space
By using the new -d flag, you can decrease the size of an existing paging space using the chps command as follows:
     #chps -d LogicalPartitions PagingSpace
or specify it on the SMIT panel (fast path 'smitty chps').
Using Web-based System Manager, a paging space can be dynamically decreased in size by selecting that paging space, bringing up the Properties dialog for that paging space, and inputting the size to deallocate in either megabytes or physical partitions. Web-based System Manager then issues the appropriate commands to perform the action and automatically notifies you of success or any error condition it encounters.
The actual processing is done by the shell script shrinkps. When decreasing the size of an active paging space, shrinkps creates a temporary paging space, moves all pages from the paging space to be decreased to this temporary one, deletes the old paging space, recreates it with the new size, moves all the pages back, and finally deletes the temporary paging space. This temporary paging space is always created in the same volume group as the one you try to decrease. It is therefore necessary to have enough space available in the volume group for this temporary paging space. If you decrease the size of a deactivated paging space, the creation of a temporary paging space is not necessary and is therefore omitted.
The following example shows the commands needed to remove one logical partition from paging01:
#lsps -a
Page Space Physical Volume Volume Group Size %Used Active Auto Type
paging01     hdisk0         rootvg      48MB   1    yes    yes lv
hd6          hdisk0         rootvg      32MB   11   yes    yes lv

#chps -d 1 paging01
shrinkps:Temporary paging space paging00 created.
shrinkps:Paging space paging01 removed.
shrinkps:Paging space paging01 recreated with new size.

#lsps -a
Page Space Physical Volume Volume Group Size %Used Active Auto Type
paging01     hdisk0         rootvg      32MB   1    yes    yes lv
hd6          hdisk0         rootvg      32MB   12   yes    yes lv
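The size change in this example can be checked with a little arithmetic. The following is a minimal Python sketch, not an AIX tool: the function name size_after_decrease is hypothetical, and the 16 MB partition size is inferred from the example (paging01 goes from 48MB to 32MB when one logical partition is removed); lsps does not display it. The 32 MB floor for the primary paging space is the limit described in this document.

```python
PRIMARY_MINIMUM_MB = 32  # chps refuses to shrink the primary paging space below this


def size_after_decrease(current_mb, partitions_removed, pp_size_mb, primary=False):
    """Return the paging space size in MB after 'chps -d partitions_removed'."""
    new_size = current_mb - partitions_removed * pp_size_mb
    if new_size <= 0:
        raise ValueError("cannot remove every partition of a paging space")
    if primary and new_size < PRIMARY_MINIMUM_MB:
        raise ValueError("primary paging space cannot go below 32 MB")
    return new_size


# Values from the paging01 example; pp_size_mb=16 is inferred, not reported by lsps.
print(size_after_decrease(48, 1, 16))
```

Running the same check before issuing chps -d on hd6 shows quickly whether the request would be rejected.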

As the above description shows, deactivating or decreasing the size of an active paging space can noticeably degrade performance, depending on the size and usage of the paging space and the current system workload. The main advantage is that no system reboot is necessary to rearrange the paging space.
If you are working with the primary paging space (usually hd6), this command will prevent you from decreasing the size below 32 MB or actually deleting it. If you decrease the primary paging space, a temporary boot image and a temporary /sbin/rc.boot pointing to this temporary primary paging space will be created to make sure the system is always in a state where it can be safely rebooted.
NOTE: These command enhancements are not available through the Web-based System Manager. The Web-based System Manager allows you, by default, to specify the increase in size for a paging space in the megabytes field.

The Performance Overview of the Virtual Memory Manager (VMM) section of the Resource Management Overview chapter in the AIX 5L Version 5.1 Performance Management Guide can be found at the following URL: http://publibn.boulder.ibm.com/doc_link/en_US/a_doc_lib/aixbman/prftungd/2365c22.htm#HDRI36312
The amount of paging space required depends on the types of activities performed on the system. If paging space runs low, processes may be lost, and if paging space runs out, the system may panic. When a paging-space low condition is detected, additional paging space should be activated.
The system monitors the number of free paging space blocks and detects when a paging space shortage exists. When the number of free paging space blocks falls below a threshold known as the paging space warning level, the system sends the SIGDANGER signal to all processes except kprocs. If the shortage continues, free paging space blocks can fall below a second threshold known as the paging space kill level. In this event, the SIGKILL signal is sent to processes that are major users of paging space and do not have a signal handler for the SIGDANGER signal. (The default action for the SIGDANGER signal is to ignore the signal.) The system continues sending SIGKILL signals until the number of free paging space blocks is above the paging space kill level.
You can ensure the existence of sufficient paging space for processes that dynamically allocate memory by monitoring the paging space levels with the psdanger subroutine or by using special allocation routines. The disclaim subroutine can be used to prevent processes from ending when the paging space kill level is reached. To do this, define a signal handler for the SIGDANGER signal and release memory and paging space resources allocated in the process's data and stack areas and in shared memory segments.
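The handler pattern described above can be sketched as follows. This is only an illustration under several assumptions: a C program would call disclaim() in the handler, for which Python has no binding, so the sketch simply drops references to a hypothetical in-process cache; whether a given Python build on AIX exposes signal.SIGDANGER is also an assumption, so the code falls back to SIGUSR1 to stay runnable on other systems.

```python
import signal

# SIGDANGER exists only on AIX. Falling back to SIGUSR1 keeps the sketch
# runnable elsewhere; the fallback is for demonstration only.
DANGER_SIGNAL = getattr(signal, "SIGDANGER", None) or getattr(signal, "SIGUSR1", None)

# Hypothetical cache standing in for memory the process can afford to give up.
cache = {"report": bytearray(1024 * 1024)}


def on_paging_space_danger(signum, frame):
    """Release optional memory when the paging space warning level is reached."""
    cache.clear()


def install_handler():
    """Register the handler; call this from the main thread at startup."""
    signal.signal(DANGER_SIGNAL, on_paging_space_danger)


if __name__ == "__main__":
    install_handler()
```

According to the description above, a process that has a SIGDANGER handler installed is not among those selected for SIGKILL at the kill level, which is the main reason to register one even if it releases little memory.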
For more information on persistent and working segments, request The AIX Virtual Memory Manager (VMM) document.

Paging space requirements are unique for each system, depending on the applications that are running, the number of active users, and other factors. With the appearance of large amounts of RAM and database applications, previous paging space rules of thumb have become invalid.
Option 1
Systems with large amounts of memory typically do not need large amounts of paging space. In a persistent storage environment, where the system hosts a few small programs and a large amount of data, the system may need less than one times (1X) its RAM size for paging space. For example, a 100GB database server that runs on a system with 16GB of RAM and uses only 2GB of working storage does not need 16GB, or even 8GB, of paging space. Because the 100GB database is mostly persistent storage and requires little or no paging space, it needs only the amount of paging space that allows all the working storage to be paged out to disk.
Option 2
The 1X RAM rule is suggested for systems with 4GB of RAM or less. However, monitor the paging space during a period of heavy load to establish whether its size is sufficient. Check the npswarn value in the vmo command output and compare it to the %Used value in the lsps command output. When paging space usage reaches the level that corresponds to npswarn, SIGDANGER signals are sent to processes. At this point, it is a good idea to increase the paging space or add another one. The npswarn value is explained in the Tuning paging space thresholds section of this document.
Option 3
For RAM sizes greater than 4GB, such as 16GB, 32GB or even 96GB, the memory requirements of the applications have to be researched in order to approximate the recommended paging space size. When researching these memory requirements, keep in mind how the paging space will be allocated, that is, deferred or late.

Before creating a new paging space or enlarging an existing paging space, consider the following:
  • If a disk drive containing an active hd6 paging space logical volume is removed from the system, the system will crash.

  • Do not put more than one paging space logical volume on a physical volume. If you add more than one paging space to one of the physical volumes, the paging activity is no longer spread equally across the physical volumes.

  • All processes started during the boot process are allocated paging space on the default paging space logical volume (hd6). When additional paging space logical volumes are activated, paging space is allocated in a "round robin" manner, in 4KB chunks.
     
  • Avoid putting a paging space logical volume on the same physical volume as a heavily active logical volume, such as that used by a database.
     
  • It is not necessary to put a paging space logical volume on each physical volume.
     
  • Make each paging space logical volume roughly equal in size.
     
  • If paging spaces are of different sizes, and the smaller ones become full, paging activity will no longer be spread across all of the physical volumes.
     
  • Do not extend a paging space logical volume onto multiple physical volumes.
     
  • For best system performance, put paging space logical volumes on physical volumes that are each attached to a different disk controller.
     
  • Creating the default paging space (hd6) on an ESS, EMC, or RAID array is technically supported, but it is not recommended and should be avoided if possible.
NOTE: If the system is paging enough to cause an I/O bottleneck, tuning the location of the paging space is not the answer.
In this case, consult Chapter 7, "Monitoring and Tuning Memory Use", in the Performance Management Guide.
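Two of the placement guidelines above (one paging space per physical volume, and equal sizes) are easy to check mechanically. The following Python sketch is illustrative only: check_layout is a hypothetical helper, the rows are typed in by hand from lsps -a output, and "roughly equal" is read strictly as "identical size", which is a simplifying assumption.

```python
from collections import defaultdict


def check_layout(rows):
    """rows: list of (name, physical_volume, size_mb) tuples taken from 'lsps -a'."""
    warnings = []
    by_pv = defaultdict(list)
    for name, pv, _size in rows:
        by_pv[pv].append(name)
    for pv, names in sorted(by_pv.items()):
        if len(names) > 1:
            warnings.append(f"{pv} holds more than one paging space: {', '.join(names)}")
    # Strict reading of "roughly equal": every paging space has the same size.
    sizes = {size for _name, _pv, size in rows}
    if len(sizes) > 1:
        warnings.append("paging spaces are not equal in size")
    return warnings


# Rows taken from the earlier paging01/hd6 example in this document.
rows = [("paging01", "hdisk0", 48), ("hd6", "hdisk0", 32)]
for line in check_layout(rows):
    print(line)
```

For the two-row example both warnings are reported, since paging01 and hd6 share hdisk0 and differ in size.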

Allocating more paging space than necessary results in unused paging space that wastes disk space. However, allocating too little paging space can result in one or more of the avoidable symptoms listed below. Use the following guidelines for determining the necessary paging space:
  • Enlarge paging space if any of the following messages are displayed on the console or in response to a command on any terminal:
INIT: Paging space is low
ksh: cannot fork no swap space
Not enough memory
Fork function failed
fork () system call failed
Unable to fork, too many processes
Fork failure - not enough memory available
Fork function not allowed. Not enough memory available.
Cannot fork: Not enough space
SIGKILL

  • Add a paging space if the average of the %Used column in the output of the lsps -a command is greater than 80.
     
  • Add a paging space if the %Used column in the output of the lsps -s command is greater than 80.
NOTE: Only extend a paging space as a last option.
Use the following commands to determine if you need to make changes regarding paging space logical volumes:
iostat
Check the tm_act field for the hdisk containing the paging space for a high percentage relative to the other hdisks.
vmstat
Ensure that the fr:sr ratio in the page columns of the vmstat output does not consistently exceed 1:4.
lsps
Use the -a flag to list all characteristics of all paging spaces. The size is given in megabytes. Use the -s flag to list the summary characteristics of all paging spaces. This information consists of the total paging space in megabytes and the percentage of paging space currently assigned (used). If the -s flag is specified, all other flags are ignored.
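The 80% guideline for lsps -a can be automated by parsing the command output. The sketch below is a rough Python illustration, assuming the column layout shown in the examples in this document (name, physical volume, volume group, size, %Used, and so on); parse_lsps_a and needs_more_paging_space are hypothetical names, and the sample text is copied from the paging01 example rather than captured from a live system.

```python
def parse_lsps_a(text):
    """Parse 'lsps -a' output into a list of dicts (column layout assumed)."""
    spaces = []
    for line in text.strip().splitlines()[1:]:       # skip the header line
        fields = line.split()
        if len(fields) < 8:
            continue                                   # ignore unexpected lines
        spaces.append({
            "name": fields[0],
            "pv": fields[1],
            "vg": fields[2],
            "size_mb": int(fields[3].rstrip("MB")),    # "48MB" -> 48
            "used_pct": int(fields[4]),
        })
    return spaces


def needs_more_paging_space(spaces, limit=80):
    """Apply the guideline: average %Used above the limit means add paging space."""
    if not spaces:
        raise ValueError("no paging spaces found")
    average = sum(s["used_pct"] for s in spaces) / len(spaces)
    return average > limit


sample = """Page Space Physical Volume Volume Group Size %Used Active Auto Type
paging01     hdisk0         rootvg      48MB   1    yes    yes lv
hd6          hdisk0         rootvg      32MB   11   yes    yes lv
"""
print(needs_more_paging_space(parse_lsps_a(sample)))
```

On a real system the text would come from running lsps -a, for example through subprocess, instead of the embedded sample.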


Follow these steps to add a paging space. Please note that the command output shown in this example may differ from the command output on your system.
  1. Check existing paging spaces and available physical volumes.
The following command lists characteristics for all existing paging spaces:
#lsps -a
Page Space  Physical Volume   Volume Group    Size    %Used  Active  Auto  Type
paging02      hdisk2             rootvg       512MB     3     yes     yes   lv
paging01      hdisk3             testcase     512MB     3     yes     yes   lv
paging00      hdisk1             rootvg       512MB     2     yes     yes   lv
hd6           hdisk0             rootvg       512MB     3     yes     yes   lv

The following command lists the available physical volumes:
#lspv
hdisk0         000336524e264c40    rootvg
hdisk1         00033652f9fe5c81    doomvg
hdisk2         00302593eb30798f    none
hdisk3         00033652fa08edca    testcase

  2. Make sure the physical volume where the paging space will be assigned is part of a volume group (check column 3). In this example, paging space can be created on all hdisks, except hdisk2.
The following command displays detailed information about the physical volume within a volume group where you plan to assign the paging space.
#lspv hdisk3
PHYSICAL VOLUME:    hdisk3                   VOLUME GROUP:     testcase
PV IDENTIFIER:      00033652fa08edca         VG IDENTIFIER     0008508436f7d210
PV STATE:           active
STALE PARTITIONS:   0                        ALLOCATABLE:      yes
PP SIZE:            8 megabyte(s)            LOGICAL VOLUMES:  3
TOTAL PPs:          268 (2144 megabytes)     VG DESCRIPTORS:   2
FREE PPs:           201 (1608 megabytes)
USED PPs:           67 (536 megabytes)
FREE DISTRIBUTION:  00..00..00..00..07
USED DISTRIBUTION:  54..54..53..53..47

  3. Make a note of the following items:
    PP SIZE: (8 megabyte(s) in this example)
    FREE PPs: (201 (1608 megabytes) in this example)
     
  4. Decide the size of the new paging space, remembering that keeping paging space sizes equal improves system performance. This example creates a paging space of 512 MB. Use the smitty fast path to display the Volume Group name window.
Do one of the following:
#smitty mkps
OR
#smitty
  >>System Storage Management
      >>Logical Volume Manager
              >>Paging Space
                    >>Add another paging space.

  5. Choose the name of the volume group to which the physical volume belongs. The next screen is displayed, with the SIZE of paging space field highlighted. The value entered in this field is the number of logical partitions.
     
  6. To calculate the number of logical partitions, divide the number of MBs to be created by the PP SIZE value. Enter the result in the highlighted field.
      Number of logical partitions = Prospective paging space size in MB / PP SIZE
      In this example, 512 MB / 8 MB = 64 logical partitions.

  7. Select the physical volume where you want to assign the new paging space. The F4 key can be used to assist in selecting available physical volumes.
     
  8. Using the Tab key, select yes for the option to start using this paging space now and yes for the option to use this paging space each time the system is restarted. Press the Enter key after selecting these two options.
     
  9. Run the lsps -a command to compare the new paging space with any others on the system.
     
    • Make sure the value in both the Active and Auto columns is set to yes.
       
    • Verify that the value in the %Used column is at least 1%. The system uses the paging space in a round robin fashion and %Used value increases over time.
No reboot is required, and you can complete this procedure on a production system.
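The partition calculation used in the procedure above, together with a check against the FREE PPs value noted from lspv, can be sketched in Python. The function names are hypothetical, and rounding up when the requested size is not a multiple of the PP size is an assumption of this sketch rather than something the procedure states.

```python
import math


def logical_partitions_needed(size_mb, pp_size_mb):
    """Number of logical partitions for the requested size (rounded up)."""
    return math.ceil(size_mb / pp_size_mb)


def fits_on_volume(size_mb, pp_size_mb, free_pps):
    """True if the physical volume has enough free partitions for the request."""
    return logical_partitions_needed(size_mb, pp_size_mb) <= free_pps


# Values from the hdisk3 example: PP SIZE 8 MB, FREE PPs 201, target 512 MB.
print(logical_partitions_needed(512, 8))
print(fits_on_volume(512, 8, 201))
```

The first value is what would be typed into the SIZE of paging space field in SMIT.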
Increase paging space
Follow these steps to increase the size of an existing paging space:
  1. Determine the PP SIZE and FREE PP values for the disk where you want to increase the paging space. Follow the instructions in Steps 1 through 3 of the previous procedure.
     
  2. Decide the number of megabytes by which you want to increase the paging space.
     
  3. Divide the number of megabytes by the PP SIZE value to determine the equivalent number of logical partitions needed to increase the paging space.
     
  4. Use the following smitty fastpath command to display the dialog box for making changes to an existing paging space:
smitty chps

  5. Complete the entries on the dialog box as follows:
     
    • Enter the number of logical partitions needed to increase the paging space (as determined in Step 3).
       
    • Set the value to yes in the Use this paging space each time the system is RESTARTED? field.
       
  6. Press Enter to execute the dialog.
     
  7. Run the lsps -a command to verify the size of the paging space and to check that the value in both the Active and Auto columns is set to yes.

For additional information, refer to the chapter on "Monitoring and Tuning Memory Use" in the Performance Management Guide.

Tuning paging space thresholds
If available paging space becomes depleted, the operating system attempts to release resources as follows:
  • First, by warning processes to release paging space
     
  • Then, if there is still insufficient paging space for the current process, by killing processes.
The VMM uses the values of two parameters that specify the thresholds for sending warning or kill signals to processes:
npswarn
The value of this parameter specifies the paging space warning threshold, below which warning signals are sent to processes.
npskill
The value of this parameter specifies the paging space kill threshold, below which kill signals are sent to certain processes.

Choosing npswarn and npskill settings
These values for the npswarn and npskill parameters are set by means of arguments to the vmo command:
Parameter: npswarn
vmo flag: -o
Description: Specifies the number of free paging space pages at which the operating system begins sending the SIGDANGER signal to processes. If the npswarn threshold is reached and a process is handling this signal, the process can choose to ignore the signal or do some other action, such as exit or release memory by using the disclaim() subroutine. The default value in operating system version 4 is determined by the following formula:
 npswarn = 4*npskill
The value of npswarn must be greater than zero and less than the total number of paging space pages on the system. This parameter can be changed by using the vmo -o command (on AIX 5.3 or above).

Parameter: npskill
vmo flag: -o
Description: Specifies the number of free paging space pages at which the operating system begins killing processes. If the npskill threshold is reached, a SIGKILL signal is sent to the youngest process. Processes that are handling SIGDANGER or processes that are using the early page-space allocation (paging space is allocated as soon as memory is requested) are exempt from being killed. The formula to determine the default value of npskill is as follows:
 npskill = number_of_paging_space_pages/128
The npskill value must be greater than zero and less than the total number of paging space pages on the system. This parameter can be changed by using the vmo -o command.

Example
Real memory = 16GB
Paging space = 4096MB
Convert paging space to 4KB pages.
4096MB *(1024KB/4KB)=1048576 4KB pages
npskill = number_of_paging_space_pages/128
            = 1048576 pages/128
            = 8192 pages
npswarn = 4*npskill
              = 4*8192
              = 32768 pages
Npskill % of paging space = ((1048576 - 8192)/1048576)*100   = 99.2
Npswarn % of paging space = ((1048576 - 32768)/1048576)*100  = 96.8

Example of an 8GB paging space:
Npskill % of paging space = ((2097152 - 16384)/2097152)*100  = 99.2
Npswarn % of paging space = ((2097152 - 65536)/2097152)*100  = 96.8

Example of a 16GB paging space:
Npskill % of paging space = ((4194304 - 32768)/4194304)*100  = 99.2
Npswarn % of paging space = ((4194304 - 131072)/4194304)*100 = 96.8

The default npswarn and npskill percentages are the same for paging spaces of 4GB, 8GB and 16GB. Notice that the npswarn default corresponds to 96.8% usage: this is the paging space usage percentage at which SIGDANGER signals are sent to processes. When paging space usage reaches 99.2%, the SIGKILL signal is sent to the youngest process. To gain more notice time, raise the npswarn value so that the warning is sent at a lower usage percentage.
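The default threshold arithmetic from the examples above can be reproduced with a short Python sketch. The function names are hypothetical, and integer division for the /128 step is an assumption. The exact results are 99.21875% and 96.875%; the figures 99.2 and 96.8 quoted in this document are those values cut to one decimal place. The last helper goes the other way and estimates an npswarn value for a chosen warning percentage.

```python
PAGE_KB = 4  # paging space is counted in 4KB pages


def default_thresholds(paging_space_mb):
    """Return (total_pages, npskill, npswarn) using the default formulas."""
    pages = paging_space_mb * 1024 // PAGE_KB
    npskill = pages // 128
    npswarn = 4 * npskill
    return pages, npskill, npswarn


def used_percent_at(threshold_pages, total_pages):
    """Paging space usage (percent) at which a free-page threshold is crossed."""
    return (total_pages - threshold_pages) / total_pages * 100


def npswarn_for_used_percent(used_pct, total_pages):
    """Free-page threshold that sends SIGDANGER at the given usage percentage."""
    return int(total_pages * (100 - used_pct) / 100)


pages, npskill, npswarn = default_thresholds(4096)
print(pages, npskill, npswarn)
print(used_percent_at(npskill, pages), used_percent_at(npswarn, pages))
print(npswarn_for_used_percent(90, pages))
```

The value printed last is the number that would be passed to vmo -o npswarn= to be warned at roughly 90% usage on a 4096MB paging space.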
Other vmo parameters
The following parameters are also set by arguments to the vmo command:
Parameter: nokilluid
vmo flag: -o
Description: When the nokilluid option is set to a nonzero value with the vmo -o command (on AIX 5.3 or above), user IDs lower than this value are exempt from being killed because of low paging space conditions. This option is only available in AIX 4.3.3.2 and later.

For more information on the vmo flags see Appendix G of the Performance Management Guide.
Tuning the pacefork retry interval parameter with schedo
If a process cannot be forked due to a lack of paging space pages, the scheduler retries the fork five times. After each try, the scheduler delays for a default of 10 clock ticks.
The pacefork parameter, set with the -o flag of the schedo command, specifies the number of clock ticks to wait before retrying a failed fork() call. For example, if a fork() subroutine call fails because there is not enough space available to create a new process, the system retries the call after waiting the specified number of clock ticks. The default value is 10, and because there is one clock tick every 10 ms, the system retries the fork() call every 100 ms.
If the paging space is only low due to brief, sporadic workload peaks, increasing the retry interval might allow processes to delay long enough to be released. See the following example:
     # schedo -o pacefork=15
In this way, when the system retries the fork() call, there is a higher chance of success because some processes might have finished their execution and, consequently, released pages from paging space.
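The effect of changing pacefork can be estimated from the figures given above (five retries, one clock tick every 10 ms). This is a small Python sketch with hypothetical function names; treating the total window as retries multiplied by the interval is a simplification that ignores the time spent in the fork attempts themselves.

```python
TICK_MS = 10       # one clock tick every 10 ms, as stated above
FORK_RETRIES = 5   # the scheduler retries a failed fork five times


def retry_interval_ms(pacefork_ticks=10):
    """Delay between fork retries for a given pacefork value."""
    return pacefork_ticks * TICK_MS


def total_retry_window_ms(pacefork_ticks=10):
    """Approximate total time spent waiting across all retries."""
    return FORK_RETRIES * retry_interval_ms(pacefork_ticks)


print(retry_interval_ms(), total_retry_window_ms())        # default pacefork=10
print(retry_interval_ms(15), total_retry_window_ms(15))    # pacefork=15
```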

If 'lsps -a' fails with the following error (due to a clone alt_disk install putting an entry in the ODM for the alt_disk volume):
0516-010 : Volume group must be varied on; use varyonvg command.
Check paging space configuration (to see if hd6 is mirrored and opened/syncd):
- lsps -s
- lsvg -l rootvg
If a clone alt_disk install has occurred on the system, it will put an entry in the ODM for the alt_disk volume. To remove the ODM entry for the alt_disk volume, run the following commands:
- odmget CuAt | grep paging
- odmget -q value=paging CuAt
- odmdelete -q name= -o CuAt
- odmdelete -q name= -o CuDv
- odmdelete -q value3= -o CuDvDr
- odmdelete -q dependency= -o CuDep
- rm /dev/
- sync;sync;sync;
- savebase

To determine whether altinst_rootvg is assigned to a disk, and to remove the disk assignment:
- lspv                       
- alt_disk_install -X altinst_rootvg
- lspv                       
- lsps -a

If a clone alt_disk install has NOT occurred on the system, run the following commands:
- lsps -a
Page Space      Physical Volume   Volume Group    Size %Used Active   Auto  Type
0516-010 : Volume group must be varied on; use varyonvg command.
  hd6             hdisk1            rootvg        6144MB     1   yes   yes lv

- odmget CuAt | grep -p paging
CuAt:
        name = "hd6"
        attribute = "type"
        value = "paging"
        type = "R"
        generic = "DU"
        rep = "s"
        nls_index = 639
CuAt:
        name = "paging01"
        attribute = "lvserial_id"
        value = "0008fb9a00004c00000000fc229a5310.1"
        type = "R"
        generic = "D"
        rep = "n"
        nls_index = 648
CuAt:
        name = "paging01"
        attribute = "intra"
        value = "c"
        type = "R"
        generic = "DU"
        rep = "l"
        nls_index = 641
CuAt:
        name = "paging01"
        attribute = "type"
        value = "paging"
        type = "R"
        generic = "DU"
        rep = "s"
        nls_index = 639
CuAt:
        name = "paging01"
        attribute = "size"
        value = "16"
        type = "R"
        generic = "DU"
        rep = "r"
        nls_index = 647
- odmget -q name=paging01 CuDv           
 *Note: Make note of 'parent = "{volume group name}"'

- exportvg {volume group name}
- lsps -a
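When the odmget output is long, the paging space entries can be picked out programmatically. The following Python sketch parses stanza text in the layout shown above and lists the device names whose type attribute is "paging". The function names are hypothetical, the sample is a shortened copy of the output in this document, and the parser assumes one "key = value" pair per line.

```python
def parse_odm_stanzas(text):
    """Parse 'odmget' stanza output into a list of dicts."""
    stanzas = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.endswith(":") and "=" not in line:
            current = {"class": line[:-1]}            # e.g. "CuAt"
            stanzas.append(current)
        elif "=" in line and current is not None:
            key, value = line.split("=", 1)
            current[key.strip()] = value.strip().strip('"')
    return stanzas


def paging_space_names(stanzas):
    """Names of devices whose 'type' attribute has the value 'paging'."""
    return sorted({s["name"] for s in stanzas
                   if s.get("attribute") == "type" and s.get("value") == "paging"})


sample = '''CuAt:
        name = "hd6"
        attribute = "type"
        value = "paging"
CuAt:
        name = "paging01"
        attribute = "intra"
        value = "c"
CuAt:
        name = "paging01"
        attribute = "type"
        value = "paging"
'''
print(paging_space_names(parse_odm_stanzas(sample)))
```

Comparing this list with the names reported by lsps -a shows which ODM entry has no matching active paging space.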


Paging space garbage collection consists of enhanced paging space management algorithms to reclaim paging space when needed. AIX 5.3 and above implement two paging space garbage collection (PSGC) methods for deferred allocation working segments:
  • Garbage collection of paging space on re-pagein (which applies to the deferred page space allocation policy). This method is enabled by default and operates when the system becomes low on free paging space disk blocks. To dynamically change the thresholds that control the re-pagein garbage collection, run the following commands:
- vmo -o rpgclean=1
- vmo -o rpgcontrol=1
- vmo -a | grep numpsblks
- vmo -o npsrpgmax={numpsblks - 1}
- vmo -o npsrpgmin=1

 *Note: numpsblks is the number of paging space blocks
  • Garbage collection by paging space scrubbing for in-memory frames. This mechanism, enabled dynamically with the vmo command, frees paging space disk blocks from pages in memory:
- vmo -o scrub=1
- vmo -o scrubclean=1
- vmo -d npsscrubmax
- vmo -d npsscrubmin

If paging space scrubbing garbage collection is enabled, an internal timer triggers a kernel service that starts the garbage collection process every 60 seconds when the number of free paging space blocks on the system is within the limits of the lower and upper scrubbing thresholds.
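For the re-pagein thresholds, the npsrpgmax value of numpsblks - 1 can be derived from the vmo output instead of being computed by hand. This Python sketch assumes that vmo -a prints the tunable as "numpsblks = <number>"; that format, the sample value, and the helper name are assumptions made for illustration.

```python
import re


def npsrpgmax_from_vmo(vmo_output):
    """Return numpsblks - 1 from text in the assumed 'numpsblks = N' format."""
    match = re.search(r"numpsblks\s*=\s*(\d+)", vmo_output)
    if match is None:
        raise ValueError("numpsblks not found in vmo output")
    return int(match.group(1)) - 1


sample = "            numpsblks = 1048576"   # assumed output format and value
print(f"vmo -o npsrpgmax={npsrpgmax_from_vmo(sample)}")
```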