Dec 24 2009

Linux Kernel Configuration

Category: Technology | ssmax @ 15:38:59
You can determine the amount of System V IPC resources available by looking at the contents of the following files:
  /proc/sys/kernel/shmmax - The maximum size of a shared memory segment.
  /proc/sys/kernel/shmmni - The maximum number of shared memory segments.
  /proc/sys/kernel/shmall - The maximum amount of shared memory
                              that can be allocated.
  /proc/sys/kernel/sem    - The maximum number and size of semaphore sets
                              that can be allocated.
For example, to view the maximum size of a shared memory segment that can be created enter:
  cat /proc/sys/kernel/shmmax

To change the maximum size of a shared memory segment to 256 MB enter:

  echo 268435456 > /proc/sys/kernel/shmmax
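Writing to /proc takes effect immediately but does not survive a reboot. One common way to make shared memory settings persistent is /etc/sysctl.conf (the exact file and the page size used for shmall may vary by distribution; the values below are examples, not recommendations):

```shell
# /etc/sysctl.conf fragment (example values, not recommendations):
# kernel.shmmax - maximum size of a single shared memory segment, in bytes (256 MB)
kernel.shmmax = 268435456
# kernel.shmall - maximum total shared memory, in pages (usually 4 KB each)
kernel.shmall = 2097152

# Apply the file without rebooting:
# sysctl -p
```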

To view the maximum number of semaphores and semaphore sets which can be created enter:

cat /proc/sys/kernel/sem

This returns 4 numbers indicating:

 SEMMSL - The maximum number of semaphores in a semaphore set
 SEMMNS - The maximum number of semaphores in the system
 SEMOPM - The maximum number of operations in a single semop call
 SEMMNI - The maximum number of semaphore sets

 For WebSphere MQ:

  • the SEMMSL value must be 128 or greater
  • the SEMOPM value must be 5 or greater
  • the SEMMNS value must be 16384 or greater
  • the SEMMNI value must be 1024 or greater

 To increase the maximum number of semaphores available to WebSphere MQ, you should update the SEMMNS and SEMMNI values.
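All four values live in the single file /proc/sys/kernel/sem, and writing it replaces all of them at once, so check the current values first and never write anything lower than what is already there. A sketch using the WebSphere MQ minimums quoted above:

```shell
# Check the current values before changing anything:
cat /proc/sys/kernel/sem

# Order is: SEMMSL SEMMNS SEMOPM SEMMNI
# These are the WebSphere MQ minimums from above; raise them as needed,
# and keep any existing value that is already higher.
echo "128 16384 5 1024" > /proc/sys/kernel/sem

# Verify:
cat /proc/sys/kernel/sem
```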

 

Maximum open files

If the system is heavily loaded, you might need to increase the maximum possible number of open files. If your distribution supports the proc filesystem, you can do this by issuing the following command:

  echo 32768 > /proc/sys/fs/file-max

If you are using a pluggable security module such as PAM (Pluggable Authentication Module), ensure that this does not unduly restrict the number of open files for the ‘mqm’ user.
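When pam_limits is in use, per-user limits are typically raised in /etc/security/limits.conf (the path and syntax may vary by distribution). A sketch for the 'mqm' user, with an example limit of 10240:

```shell
# /etc/security/limits.conf fragment (assumes pam_limits is enabled;
# 10240 is an example value):
#
#   mqm  soft  nofile  10240
#   mqm  hard  nofile  10240

# Check the effective open-files limit from a shell running as mqm:
ulimit -n
```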

 

TCP Tuning Background

The following is a summary of techniques to maximize TCP WAN throughput.

TCP uses what is called the “congestion window”, or CWND, to determine how many packets can be sent at one time. The larger the congestion window size, the higher the throughput. The TCP “slow start” and “congestion avoidance” algorithms determine the size of the congestion window. The maximum congestion window is related to the amount of buffer space that the kernel allocates for each socket. For each socket, there is a default value for the buffer size, which can be changed by the program using a system library call just before opening the socket. There is also a kernel enforced maximum buffer size. The buffer size can be adjusted for both the send and receive ends of the socket.

To get maximal throughput it is critical to use optimal TCP send and receive socket buffer sizes for the link you are using. If the buffers are too small, the TCP congestion window will never fully open up. If the receiver buffers are too large, TCP flow control breaks down and the sender can overrun the receiver, which will cause the TCP window to shut down. This is likely to happen if the sending host is faster than the receiving host. Overly large windows on the sending side are not a big problem as long as you have excess memory.

The optimal buffer size is twice the bandwidth*delay product of the link:

buffer size = 2 * bandwidth * delay

The ping program can be used to get the delay, and tools such as pathrate to get the end-to-end capacity (the bandwidth of the slowest hop in your path). Since ping gives the round trip time (RTT), this formula can be used instead of the previous one:

buffer size = bandwidth * RTT.

For example, if your ping time is 50 ms, and the end-to-end network consists of all 100BT Ethernet and OC3 (155 Mbps), the TCP buffers should be 0.05 sec * (100 Mbit/s / 8 bits per byte) = 625 KBytes. (When in doubt, 10 MB/s is a good first approximation for network bandwidth on high-speed R and E networks like ESnet.)
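The arithmetic above can be checked with a quick shell calculation; the numbers here match the 100 Mbps / 50 ms example, so substitute your own link's figures:

```shell
# Bandwidth-delay product for the example link:
#   100 Mbit/s bottleneck bandwidth, 50 ms round-trip time from ping.
BANDWIDTH_BITS=100000000   # link bandwidth in bits per second
RTT_MS=50                  # round-trip time in milliseconds

# buffer size = bandwidth * RTT, converted to bytes
BUFFER_BYTES=$(( BANDWIDTH_BITS / 8 * RTT_MS / 1000 ))
echo "${BUFFER_BYTES} bytes ($(( BUFFER_BYTES / 1000 )) KB)"
# -> 625000 bytes (625 KB)
```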

There are two TCP settings you need to know about: the default TCP send and receive buffer size, and the maximum TCP send and receive buffer size. Note that most UNIX OSes by default have a maximum TCP buffer size that is way too small for 1 Gbps pipes, and all have a maximum that is too small for 10 Gbps flows. For instructions on how to increase the maximum TCP buffer, see the OS-specific instructions for setting system defaults.

Linux, FreeBSD, Windows, and OSX all now support TCP autotuning, so you no longer need to worry about setting the default buffer sizes. But for Solaris or other older OSes you’ll need to use the UNIX setsockopt call in your sender and receiver to set the optimal buffer size for the link you are using.

/proc/sys/net/core/rmem_max – Maximum TCP Receive Window
/proc/sys/net/core/wmem_max – Maximum TCP Send Window
/proc/sys/net/ipv4/tcp_timestamps – timestamps (RFC 1323) add 12 bytes to the TCP header…
/proc/sys/net/ipv4/tcp_sack – tcp selective acknowledgements.
/proc/sys/net/ipv4/tcp_window_scaling – support for large TCP windows (RFC 1323). Needs to be set to 1 if the maximum TCP window is larger than 64 KB.
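Putting the tunables above together, a sketch of settings for a host moving data over a high-bandwidth, high-latency path. The 16 MB maximum here is an example; size it from the bandwidth*delay formula for your own links:

```shell
# Raise the kernel's maximum socket buffer sizes (example: 16 MB):
echo 16777216 > /proc/sys/net/core/rmem_max
echo 16777216 > /proc/sys/net/core/wmem_max

# Enable window scaling (required for windows larger than 64 KB),
# selective acknowledgements, and timestamps:
echo 1 > /proc/sys/net/ipv4/tcp_window_scaling
echo 1 > /proc/sys/net/ipv4/tcp_sack
echo 1 > /proc/sys/net/ipv4/tcp_timestamps
```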