Virtual-memory segments are partitioned in units called pages; each page
is either located in real physical memory (RAM) or stored on disk until
it is needed. AIX uses virtual memory to address more memory than is
physically available in the system. The management of memory pages in
RAM or on disk is handled by the VMM.
A page is a fixed-size block of data (usually 4096 bytes). A page might
be resident in memory (that is, mapped into a location in physical
memory), or a page might be resident on a disk (that is, paged out of
physical memory into paging space or a file system).
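Because page counts show up in most of the command output in these notes, a small conversion helper can be handy. This is only a sketch: it assumes the usual 4096-byte page size mentioned above, and the helper name pages_to_mib is made up for illustration.

```python
PAGE_SIZE = 4096  # bytes; assumes the usual 4 KB page size


def pages_to_mib(pages, page_size=PAGE_SIZE):
    """Convert a count of pages into mebibytes."""
    return pages * page_size / (1024 * 1024)


# Example figure: 786432 pages of 4 KB each
print(pages_to_mib(786432))  # 3072.0
```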
The VMM maintains a free list of available page frames. The VMM also
uses a page-replacement algorithm to determine which virtual-memory
pages currently in RAM will have their page frames reassigned to the
free list.
AIX tries to use all of RAM all of the time, except for a small amount
which it maintains on the free list. To maintain this small amount of
unallocated pages the VMM uses page outs and page steals to free up
space and reassign those page frames to the free list.
overhead -- The load that AIX incurs while sharing resources
between user processes and performing its internal accounting.
page -- A fixed-size (4 KB) block of memory.
page fault -- Occurs when a process tries to access an address in
virtual memory that does not have a location in physical memory.
In response, the system tries to load the appropriate data from disk.
page stealing daemon -- The daemon responsible for releasing pages of
memory for use by other processes. (It makes room for incoming pages by
swapping out memory pages that are not part of the working set of a
process.)
paging in -- Reading pages from paging space (swap) back into physical memory.
paging out -- Writing pages from physical memory to disk so that their
page frames can be released for other use.
The kernel continuously checks whether the number of pages on the free
list is below a threshold. If so, the page stealing daemon becomes
active and begins copying pages to the swap area, starting with the
least recently used pages. Each page placed on the free list then
becomes available for use by other processes. Pages written out to swap
must be read back into physical memory when the process needs them again.
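The idea of stealing least recently used pages until the free list is back above a threshold can be sketched as follows. This is a toy model, not how the AIX kernel implements it: the names touch and steal_lru are hypothetical, and an ordered dictionary stands in for the kernel's real bookkeeping.

```python
from collections import OrderedDict


def touch(resident, page_id):
    """Record an access: move the page to the most-recently-used end."""
    resident.move_to_end(page_id)


def steal_lru(resident, free_frames, threshold):
    """Steal least recently used pages until free_frames reaches threshold.

    resident maps page id -> owner, ordered from least to most recently used.
    Returns the list of stolen page ids and the new free frame count.
    """
    stolen = []
    while free_frames < threshold and resident:
        page_id, _owner = resident.popitem(last=False)  # oldest entry first
        stolen.append(page_id)
        free_frames += 1
    return stolen, free_frames


resident = OrderedDict((p, "proc") for p in ["a", "b", "c", "d"])
touch(resident, "a")  # order is now b, c, d, a
stolen, free_frames = steal_lru(resident, free_frames=1, threshold=3)
print(stolen, free_frames)  # ['b', 'c'] 3
```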
The AIX VMM integrates cached file data with the management of other
types of virtual memory (for example, process data, process stack, and
so forth). It caches the file data as pages, just like virtual memory
for processes.
(In most modern computer systems, each thread has a reserved region of memory referred to as its stack.)
------------------
Working Storage
Working storage pages are pages that contain volatile data (in other words, data that is not preserved across a reboot).
Examples of virtual memory regions that consist of working storage pages are:
- Process data
- Stack
- Shared memory
- Kernel data
When modified working storage pages need to be paged out (moved from
memory to the disk), they are written to paging space. Working storage
pages are never written to a file system.
When a process exits, the system releases all of its private working
storage pages. Thus, the working storage pages that hold the data and
stack of a process are released when the process exits.
Permanent Storage
Permanent storage pages are pages that contain permanent data (that is,
data that is preserved across a reboot). This permanent data is just
file data. So, permanent storage pages are basically just pieces of
files cached in memory.
When a modified permanent storage page needs to be paged out (moved from memory to disk), it is written to a file system.
You can divide permanent storage pages into two sub-types:
- Non-client pages (also known as persistent pages): pages containing cached Journaled File System (JFS) file data
- Client pages: pages containing cached data for all other
file systems (for example, JFS2 and Network File System (NFS))
------------------
In order to help optimize which pages are selected for replacement by
the page replacement daemons, AIX classifies pages into one of two
types:
- Computational pages: pages used for the text, data, stack, and shared memory of a process
- Non-computational pages: pages containing file data for files that are being read and written.
All working storage pages are computational. A working storage page is never marked as non-computational.
Depending on how you use the permanent storage pages, the pages can be
computational or non-computational. If a file contains executable text
for a process, the system treats the file as computational and marks all
of the permanent storage pages in the file as computational. If the
file does not contain executable text, the system treats the file as
non-computational and marks all of the pages in the file as
non-computational.
Once a file has been marked as computational, it remains marked as a
computational file until the file is deleted (or the system is
rebooted). Thus, a file remains marked as computational even after it is
moved or renamed.
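The classification rules above can be written down as a small model. This is only a sketch of the rules as described here: the class PageClassifier and its methods are hypothetical names, and file_id is assumed to be some identifier that survives a move or rename.

```python
class PageClassifier:
    """Toy model of the computational / non-computational rules."""

    def __init__(self):
        # Files that have been used as executable text; the mark is sticky.
        self._computational_files = set()

    def mark_executed(self, file_id):
        """Called when a file is found to contain executable text."""
        self._computational_files.add(file_id)

    def file_deleted(self, file_id):
        """Deleting the file (or a reboot) is what clears the mark."""
        self._computational_files.discard(file_id)

    def classify(self, storage_type, file_id=None):
        if storage_type == "working":
            return "computational"  # working storage is never non-computational
        if storage_type == "permanent":
            if file_id in self._computational_files:
                return "computational"
            return "non-computational"
        raise ValueError("unknown storage type: %r" % storage_type)


c = PageClassifier()
c.mark_executed(1001)
print(c.classify("working"))          # computational
print(c.classify("permanent", 1001))  # computational
print(c.classify("permanent", 2002))  # non-computational
```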
------------------
Page replacement
The AIX page replacement daemons scan memory a page at a time to find
pages to evict in order to free up memory. The page replacement daemons
must choose pages carefully to minimize the performance impact of paging
on the system, and the page replacement daemons target pages of
different classes based on tunable parameter settings and system
conditions.
There are a number of tunable parameters that you can use to control how AIX selects pages to replace.
------------------
minperm and maxperm
The two most basic page replacement tunable parameters are minperm and
maxperm. These tunable parameters are used to indicate how much memory
the AIX kernel should use to cache non-computational pages. The maxperm
tunable parameter indicates the maximum amount of memory that should be
used to cache non-computational pages. The minperm limit indicates the
target minimum amount of memory that should be used for
non-computational pages.
By default, maxperm is an "un-strict" limit, so it allows more
non-computational files to be cached in memory when there is available
free memory. The maxperm limit can be made a "strict" limit by setting
the strict_maxperm tunable parameter to 1.
(The disadvantage of a strict limit is that the number of
non-computational pages cannot grow beyond maxperm to use memory that
would otherwise sit free on the system.)
numperm (lru_file_repage)
The number of non-computational pages is referred to as numperm. The
vmstat -v command displays the numperm value for a system as a
percentage of a system’s real memory.
When the number of non-computational pages (numperm) is greater than or
equal to maxperm, the AIX page replacement daemons strictly target
non-computational pages (for example, cached files that are not
executables).
When the number of non-computational pages (numperm) is less than or
equal to minperm, the AIX page replacement daemons target both
computational and non-computational pages. In this case, AIX scans both
classes of pages and evicts the least recently used pages.
When the number of non-computational pages (numperm) is between minperm
and maxperm, the lru_file_repage (least recently used) tunable parameter
controls what kind of pages the AIX page replacement daemons should
steal.
In most customer environments, it is best to have the kernel always
target non-computational pages, because paging computational pages (for
example, a process’s stack, data, and so forth) usually has a much
higher performance cost on a process than paging non-computational
pages (that is, data file cache). Thus, the lru_file_repage tunable
parameter can be set to 0. In this case, the AIX kernel always targets
non-computational pages when numperm is between minperm and maxperm.
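The three numperm ranges described above can be summarized as a decision function. This is a sketch of the rules as stated in these notes, not of the kernel code: the function name and the returned strings are invented for illustration, and the behavior with lru_file_repage=1 is left as an opaque case because it is not described here.

```python
def replacement_targets(numperm, minperm, maxperm, lru_file_repage=0):
    """Return which page classes the page replacement daemons target.

    All three values are percentages, as shown by vmstat -v.
    """
    if numperm >= maxperm:
        return "non-computational only"
    if numperm <= minperm:
        return "computational and non-computational"
    # minperm < numperm < maxperm: lru_file_repage decides
    if lru_file_repage == 0:
        return "non-computational only"
    return "decided by lru_file_repage heuristics"


# Illustrative values only
print(replacement_targets(2.2, 20, 80))      # computational and non-computational
print(replacement_targets(22.5, 3, 90))      # non-computational only
print(replacement_targets(22.5, 3, 90, 1))   # decided by lru_file_repage heuristics
```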
------
maxclient
The maxclient tunable parameter specifies a limit on the maximum amount
of memory that should be used to cache non-computational client pages.
Because all non-computational client pages are a subset of the total
number of non-computational permanent storage pages, the maxclient limit
must always be less than or equal to the maxperm limit.
numclient
The number of non-computational client pages is referred to as
numclient. The vmstat -v command displays the numclient value for a
system as a percentage of a system’s real memory.
By default, the maxclient limit is a strict limit. This means that the
AIX kernel does not allow the non-computational client file cache to
exceed the maxclient limit (that is, the AIX kernel does not allow
numclient to exceed maxclient). When numclient reaches the maxclient
limit, the AIX page replacement daemons strictly target client pages.
------
minfree, maxfree
Two other important parameters are minfree and maxfree. If the number of
pages on your free list (vmstat -v: free pages) falls below the minfree
parameter, the VMM starts to steal pages just to replenish the free
list, which hurts performance. It continues to do this until the free
list has at least the number of pages in the maxfree parameter.
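The minfree/maxfree pair works like a low and high water mark. A minimal sketch, assuming the behavior is exactly as described above; the function name and the sample values 960 and 1088 are only illustrative.

```python
def pages_to_steal(free_pages, minfree, maxfree):
    """Return how many pages must be stolen to refill the free list.

    Stealing starts only when free_pages drops below minfree, and then
    continues until the free list holds maxfree pages.
    """
    if minfree > maxfree:
        raise ValueError("minfree must not exceed maxfree")
    if free_pages >= minfree:
        return 0  # free list is healthy, nothing to do
    return maxfree - free_pages


print(pages_to_steal(1000, minfree=960, maxfree=1088))  # 0
print(pages_to_steal(900, minfree=960, maxfree=1088))   # 188
```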
# vmstat -v <-- for non-computational file-cache
4980736 memory pages
739175 lruable pages
432957 free pages
1 memory pools
84650 pinned pages
80.0 maxpin percentage
20.0 minperm percentage <<- system’s minperm% setting
80.0 maxperm percentage <<- system’s maxperm% setting
2.2 numperm percentage <<- % of memory containing non-comp. pages
16529 file pages <<- # of non-comp. pages
0.0 compressed percentage
0 compressed pages
2.2 numclient percentage <<- % of memory containing non-comp. client pages
80.0 maxclient percentage <<- system’s maxclient% setting
16503 client pages <<- # of client pages
So, in the above example, there are 16529 non-computational file pages
mapped into memory. These non-computational pages consume 2.2 percent of
memory. Of these 16529 non-computational file pages, 16503 of them are
client pages.
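The figures above can be cross-checked with a short script. This is a sketch that parses a saved copy of the output (without the "<<-" annotations, which are notes and not part of real output). One assumption, drawn only from these sample numbers: the reported 2.2 matches file pages divided by lruable pages, so lruable pages is used as the denominator here.

```python
import re

SAMPLE = """\
4980736 memory pages
739175 lruable pages
432957 free pages
1 memory pools
84650 pinned pages
80.0 maxpin percentage
20.0 minperm percentage
80.0 maxperm percentage
2.2 numperm percentage
16529 file pages
0.0 compressed percentage
0 compressed pages
2.2 numclient percentage
80.0 maxclient percentage
16503 client pages
"""


def parse_vmstat_v(text):
    """Turn 'VALUE label' lines into a {label: value} dictionary."""
    stats = {}
    for line in text.splitlines():
        match = re.match(r"\s*([\d.]+)\s+(.+?)\s*$", line)
        if match:
            stats[match.group(2)] = float(match.group(1))
    return stats


def percent_of_lruable(stats, key):
    """Express a page count as a percentage of lruable pages (assumption)."""
    return round(stats[key] / stats["lruable pages"] * 100, 1)


stats = parse_vmstat_v(SAMPLE)
print(percent_of_lruable(stats, "file pages"))    # 2.2
print(percent_of_lruable(stats, "client pages"))  # 2.2
```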
The vmstat output does not provide information about computational file
pages. Information about computational file pages can be gathered from
the svmon command:
# svmon -G <--in memory pages of each type (work, pers., client)
               size      inuse       free        pin    virtual
memory       786432     209710     576722     133537     188426
pg space     131072       1121
               work       pers       clnt
pin          133537          0          0
in use       188426          0      21284
- work: working storage
- pers: persistent storage (persistent storage pages are non-client pages - that is, JFS pages.)
- clnt: client storage
For each page type, svmon displays two rows:
- in use: number of 4K pages mapped into memory
- pin: number of 4K pages mapped into memory and pinned (pin is a subset of inuse)
So, in the above example, there are 188426 working storage pages mapped
into memory. Of those 188426 working storage pages, 133537 of them are
pinned (that is, can’t be paged out).
There are no persistent storage pages (because there are no JFS
filesystems in use on the system). There are 21284 client storage pages,
and none of them are pinned.
The svmon command does not display the number of permanent storage
pages, but it can be calculated from the svmon output. As mentioned
earlier, the number of permanent storage pages is the sum of the number
of persistent storage pages and the number of client storage pages. So,
in the above example, there are a total of 21284 permanent storage pages
on the system:
0 persistent storage pages + 21284 client storage pages = 21284 permanent storage pages
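The same arithmetic as a short sketch, using the "in use" row of the svmon example. The dictionary layout and function name are invented for illustration. As a sanity check, the three page types also add up to the overall inuse figure (209710) in the first svmon table.

```python
# "in use" row from the svmon -G example above
in_use = {"work": 188426, "pers": 0, "clnt": 21284}


def permanent_pages(pages_by_type):
    """Permanent storage = persistent (JFS) pages + client pages."""
    return pages_by_type["pers"] + pages_by_type["clnt"]


print(permanent_pages(in_use))  # 21284
print(sum(in_use.values()))     # 209710
```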
The type of information reported by svmon is slightly different from
that of vmstat. svmon reports information about the number of in-memory pages
of different types: working, persistent (that is, non-client), and
client. svmon does not report information about computational versus
non-computational. svmon just reports the total number of in-memory
pages of each page type.
In contrast, vmstat reports information about non-computational versus computational pages.
To illustrate this difference, consider the above example of svmon
output. Some of the 21284 client pages will be computational, and the
rest of the 21284 client pages will be non-computational. To determine
the breakdown of these client pages between computational and
non-computational, use the vmstat command to determine how many of the
21284 client pages are non-computational.
-----------
suggested:
lru_file_repage = 0
maxperm = 90%
maxclient = 90%
minperm = 3%
strict_maxclient = 1 (default)
strict_maxperm = 0 (default)
# vmo -p -o lru_file_repage=0 -o maxclient%=90 -o maxperm%=90 -o minperm%=3
# vmo -p -o strict_maxclient=1 -o strict_maxperm=0
The above tunable parameter settings are the default settings for AIX Version 6.1.
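If these settings are kept in a script, a small helper can check them and print the matching vmo command line before anything is applied. This is only a sketch: the helper name is hypothetical, it just builds a string (it does not run vmo), and the one check it performs is the rule from above that maxclient must not exceed maxperm.

```python
def build_vmo_command(settings, persistent=True):
    """Build a vmo command line from a {tunable: value} dictionary."""
    maxclient = settings.get("maxclient%")
    maxperm = settings.get("maxperm%")
    if maxclient is not None and maxperm is not None and maxclient > maxperm:
        raise ValueError("maxclient% must be <= maxperm%")
    parts = ["vmo"]
    if persistent:
        parts.append("-p")  # same flag as in the commands above
    for name, value in settings.items():
        parts += ["-o", "%s=%s" % (name, value)]
    return " ".join(parts)


suggested = {"lru_file_repage": 0, "maxclient%": 90, "maxperm%": 90, "minperm%": 3}
print(build_vmo_command(suggested))
# vmo -p -o lru_file_repage=0 -o maxclient%=90 -o maxperm%=90 -o minperm%=3
```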
-----------------------------
minfree: Minimum acceptable number of real-memory page frames in the
free list. When the size of the free list falls below this number, the
VMM begins stealing pages. It continues stealing pages until the size of
the free list reaches maxfree.
-----------------------------
An example:
topas:
MEMORY
Real,MB 26623
% Comp 57 <--used by processes (OS + applications); on my system this matched nmon Process + System (46 + 11)
% Noncomp 22 <--fs cache
% Client 22 <--fs cache (for jfs2)
nmon:
FileSystemCache
(numperm) 22.5% <--this is for fs cache
Process 46.0% <--this is for appl. processes
System 11.3% <--this is for the OS
Free 20.2% <--free
-----
Total 100.0%
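The relation between the topas and nmon figures can be checked with a few lines. A sketch using the numbers from this example only; the dictionary keys simply mirror the nmon labels.

```python
nmon = {"FileSystemCache": 22.5, "Process": 46.0, "System": 11.3, "Free": 20.2}


def total_percent(parts):
    """All nmon memory categories together should account for 100%."""
    return round(sum(parts.values()), 1)


def computational_percent(parts):
    """Process + System corresponds to topas '% Comp'."""
    return round(parts["Process"] + parts["System"], 1)


print(total_percent(nmon))          # 100.0
print(computational_percent(nmon))  # 57.3, shown by topas as 57
```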
-----------------------------
Excerpts from a tuning document:
Set vmo:lru_file_repage=0; default=1 # Mandatory critical change
This change directs lrud to steal only JFS/JFS2 file-buffer pages
unless/until numperm/numclient is less-than/equal-to vmo:minperm%, at
which point lrud begins stealing both JFS/JFS2 file-buffer pages and
computational memory pages.
Essentially, stealing computational memory causes paging space page outs.
I have found this change already made by most AIX 5.3 customers.
Set vmo:page_steal_method=1; default=0 # helpful, not critical
This change switches the lrud page-stealing algorithm from a physical
memory address page-scanning method (=0) to a List-based page-scanning
method (=1).
Set ioo:sync_release_ilock=1; default=0 # helpful, not critical
Default value =0 means that the i-node lock is held while all dirty
pages of a file are flushed; thus, I/O to a file is blocked when the
syncd daemon is running. Setting =1 will cause a sync() to flush all I/O
to a file without holding the i-node lock, and then use the i-node
lock to do the commit.
Execute vmstat -v and compare the following values/settings:
minperm should be 10, 5 or 3; default=20
maxperm should be 80 or higher; default=80 or 90
maxclient should be 80 or higher; default=80 or 90
numperm real-time percent of non-computational memory (includes client below)
numclient real-time percent of JFS2/NFS/vxfs filesystem buffer-cache
Paging space page outs are triggered when numperm or numclient is less than or equal to minperm.
Typically numperm and numclient are greater than minperm, and as such, no paging space page outs can be triggered.