tuning(7)
NAME
tuning - performance tuning under FreeBSD
SYSTEM SETUP - DISKLABEL, NEWFS, TUNEFS, SWAP
When using bsdlabel(8) or sysinstall(8) to lay out your file systems on
a hard disk it is important to remember that hard drives can transfer
data much more quickly from outer tracks than they can from inner
tracks. To take advantage of this you should try to pack your smaller
file systems and swap closer to the outer tracks, follow with the larger
file systems, and end with the largest file systems. It is also
important to size the standard system file systems such that you will
not be forced to resize them later as you scale the machine up. I
usually create, in order, a 128M root, 1G swap, 128M /var, 128M
/var/tmp, 3G /usr, and use any remaining space for /home.

You should typically size your swap space to approximately 2x main
memory. If you do not have a lot of RAM, though, you will generally
want a lot more swap. It is not recommended that you configure any less
than 256M of swap on a system and you should keep in mind future memory
expansion when sizing the swap partition. The kernel's VM paging
algorithms are tuned to perform best when there is at least 2x swap
versus main memory. Configuring too little swap can lead to
inefficiencies in the VM page scanning code as well as create issues
later on if you add more memory to your machine. Finally, on larger
systems with multiple SCSI disks (or multiple IDE disks operating on
different controllers), we strongly recommend that you configure swap on
each drive. The swap partitions on the drives should be approximately
the same size. The kernel can handle arbitrary sizes but internal data
structures scale to 4 times the largest swap partition. Keeping the
swap partitions near the same size will allow the kernel to optimally
stripe swap space across the N disks. Do not worry about overdoing it a
little; swap space is the saving grace of UNIX and even if you do not
normally use much swap, it can give you more time to recover from a
runaway program before being forced to reboot.
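
The swap sizing rules above can be sketched as a small calculation. This
is only an illustration in Python, not a tool shipped with the system;
the function name recommended_swap_mb and its parameters are
hypothetical. It applies the 2x main memory rule, the 256M minimum, and
equal-sized partitions across drives. It does not model the advice to
configure "a lot more" swap on machines with little RAM.

```python
def recommended_swap_mb(ram_mb, n_disks=1, floor_mb=256):
    """Return (total_swap_mb, per_disk_mb) using the rules of thumb above.

    ram_mb   -- main memory in megabytes
    n_disks  -- number of drives that will each carry a swap partition
    floor_mb -- smallest total swap worth configuring (256M in the text)
    """
    # 2x main memory, but never less than the recommended minimum.
    total = max(2 * ram_mb, floor_mb)
    # Equal-sized partitions; round up so together they cover the total.
    per_disk = -(-total // n_disks)
    return per_disk * n_disks, per_disk


print(recommended_swap_mb(512))       # one disk, 512M of RAM
print(recommended_swap_mb(64))        # little RAM: the 256M floor applies
print(recommended_swap_mb(1024, 3))   # 1G of RAM, swap on three drives
```

If you expect to add memory later, pass the planned amount of RAM
rather than the amount installed today.
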

How you size your /var partition depends heavily on what you intend to
use the machine for. This partition is primarily used to hold
mailboxes, the print spool, and log files. Some people even make
/var/log its own partition (but except for extreme cases it is not worth
the waste of a partition ID). If your machine is intended to act as a
mail or print server, or you are running a heavily visited web server,
you should consider creating a much larger partition, perhaps a gig or
more. It is very easy to underestimate log file storage requirements.

Sizing /var/tmp depends on the kind of temporary file usage you think
you will need. 128M is the minimum we recommend. Also note that
sysinstall will create a /tmp directory. Dedicating a partition for
temporary file storage is important for two reasons: first, it reduces
the possibility of file system corruption in a crash, and second, it
reduces the chance that a runaway process which fills up [/var]/tmp will
blow up more critical subsystems (mail, logging, etc). Filling up
[/var]/tmp is a very common problem to have.

In the old days there were differences between /tmp and /var/tmp, but
the introduction of /var (and /var/tmp) led to massive confusion among
program writers, so today programs haphazardly use one or the other and
thus no real distinction can be made between the two. So it makes sense
to have just one temporary directory and softlink to it from the other
tmp directory locations. However you handle /tmp, the one thing you do
not want to do is leave it sitting on the root partition where it might
cause root to fill up or possibly corrupt root in a crash/reboot
situation.

The /usr partition holds the bulk of the files required to support the
system and a subdirectory within it called /usr/local holds the bulk of
the files installed from the ports(7) hierarchy. If you do not use
ports all that much and do not intend to keep system source (/usr/src)
on the machine, you can get away with a 1 gigabyte /usr partition.
However, if you install a lot of ports (especially window managers and
Linux-emulated binaries), we recommend at least a 2 gigabyte /usr and if
you also intend to keep system source on the machine, we recommend a 3
gigabyte /usr. Do not underestimate the amount of space you will need
in this partition; it can creep up and surprise you!

The /home partition is typically used to hold user-specific data. I
usually size it to the remainder of the disk.

Why partition at all? Why not create one big / partition and be done
with it? Then I do not have to worry about undersizing things! Well,
there are several reasons this is not a good idea. First, each
partition has different operational characteristics and separating them
allows the file system to tune itself to those characteristics. For
example, the root and /usr partitions are read-mostly, with very little
writing, while a lot of reading and writing could occur in /var and
/var/tmp. By properly partitioning your system, fragmentation
introduced in the smaller, more heavily write-loaded partitions will not
bleed over into the mostly-read partitions. Additionally, keeping the
write-loaded partitions closer to the edge of the disk (i.e., before the
really big partitions instead of after in the partition table) will
increase I/O performance in the partitions where you need it the most.
Now it is true that you might also need I/O performance in the larger
partitions, but they are so large that shifting them more towards the
edge of the disk will not lead to a significant performance improvement
whereas moving /var to the edge can have a huge impact. Finally, there
are safety concerns. Having a small neat root partition that is
essentially read-only gives it a greater chance of surviving a bad crash
intact.

Properly partitioning your system also allows you to tune newfs(8) and
tunefs(8) parameters. Tuning newfs(8) requires more experience but can
lead to significant improvements in performance. There are three
parameters that are relatively safe to tune: blocksize, bytes/i-node,
and cylinders/group.

FreeBSD performs best when using 8K or 16K file system block sizes. The
default file system block size is 16K, which provides best performance
for most applications, with the exception of those that perform random
access on large files (such as database server software). Such
applications tend to perform better with a smaller block size, although
modern disk characteristics are such that the performance gain from
using a smaller block size may not be worth consideration. Using a
block size larger than 16K can cause fragmentation of the buffer cache
and lead to lower performance.

The defaults may be unsuitable for a file system that requires a very
large number of i-nodes or is intended to hold a large number of very
small files. Such a file system should be created with an 8K or 4K
block size. This also requires you to specify a smaller fragment size.
We recommend always using a fragment size that is 1/8 the block size
(less testing has been done on other fragment size factors). The
newfs(8) options for this would be ``newfs -f 1024 -b 8192 ...''.

If a large partition is intended to be used to hold fewer, larger files,
such as database files, you can increase the bytes/i-node ratio which
reduces the number of i-nodes (maximum number of files and directories
that can be created) for that partition. Decreasing the number of
i-nodes in a file system can greatly reduce fsck(8) recovery times after
a crash. Do not use this option unless you are actually storing large
files on the partition, because if you overcompensate you can wind up
with a file system that has lots of free space remaining but cannot
accommodate any more files. Using 32768, 65536, or 262144 bytes/i-node
is recommended. You can go higher but it will have only incremental
effects on fsck(8) recovery times. For example, ``newfs -i 32768 ...''.
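
To get a feel for the trade-off between bytes/i-node and the number of
files a partition can hold, the arithmetic can be sketched in Python as
follows. The helper names and the 100 gigabyte partition are
hypothetical, and the i-node counts are only rough estimates: we assume
a simple division of partition size by density, whereas the count that
newfs(8) actually produces will differ somewhat because of file system
layout details. The fragment size helper applies the 1/8 rule
recommended above.

```python
GIB = 1024 ** 3


def approx_inodes(partition_bytes, bytes_per_inode):
    """Rough upper bound on files and directories (an estimate only)."""
    return partition_bytes // bytes_per_inode


def fragment_size(block_size):
    """Fragment size at the recommended 1/8 of the block size."""
    return block_size // 8


# Hypothetical 100G partition at the three recommended densities.
for density in (32768, 65536, 262144):
    print(density, approx_inodes(100 * GIB, density))

# An 8K block size pairs with a 1K fragment, as in ``newfs -f 1024 -b 8192''.
print(fragment_size(8192))
```

If the estimated i-node count is anywhere near the number of files you
expect to store, choose a lower bytes/i-node value.
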

tunefs(8) may be used to further tune a file system. This command can
be run in single-user mode without having to reformat the file system.
However, this is possibly the most abused program in the system. Many
people attempt to increase available file system space by setting the
minfree percentage to 0. This can lead to severe file system
fragmentation and we do not recommend that you do this. Really the only
tunefs(8) option worthwhile here is turning on softupdates with
``tunefs -n enable /filesystem''. (Note: in FreeBSD 4.5 and later,
softupdates can be turned on using the -U option to newfs(8), and
sysinstall(8) will typically enable softupdates automatically for
non-root file systems.) Softupdates drastically improves meta-data
performance, mainly file creation and deletion. We recommend enabling
softupdates on most file systems; however, there are two limitations to
softupdates that you should be aware of when determining whether to use
it on a file system. First, softupdates guarantees file system
consistency in the case of a crash but could very easily be several
seconds (even a minute!) behind on pending writes to the physical disk.
If you crash you may lose more work than otherwise. Secondly,
softupdates delays the freeing of file system blocks. If you have a
file system (such as the root file system) which is close to full, doing
a major update of it, e.g. ``make installworld'', can run it out of
space and cause the update to fail. For this reason, softupdates will
not be enabled on the root file system during a typical install. There
is no loss of performance since the root file system is rarely written
to.

A number of run-time mount(8) options exist that can help you tune the
system. The most obvious and most dangerous one is async. Do not ever
use it; it is far too dangerous. A less dangerous and more useful
mount(8) option is called noatime. UNIX file systems normally update
the last-accessed time of a file or directory whenever it is accessed.
This operation is handled in FreeBSD with a delayed write and normally
does not create a burden on the system. However, if your system is
accessing a huge number of files on a continuing basis the buffer cache
can wind up getting polluted with atime updates, creating a burden on
the system. For example, if you are running a heavily loaded web site,
or a news server with lots of readers, you might want to consider
turning off atime updates on your larger partitions with this mount(8)
option. However, you should not gratuitously turn off atime updates
everywhere. For example, the /var file system customarily holds
mailboxes, and atime (in combination with mtime) is used to determine
whether a mailbox has new mail. You might as well leave atime turned on
for mostly read-only partitions such as / and /usr as well. This is
especially useful for / since some system utilities use the atime field
for reporting.
STRIPING DISKS
In larger systems you can stripe partitions from several drives together
to create a much larger overall partition. Striping can also improve
the performance of a file system by splitting I/O operations across two
or more disks. The vinum(8) and ccdconfig(8) utilities may be used to
create simple striped file systems. Generally speaking, striping
smaller partitions such as the root and /var/tmp, or essentially
read-only partitions such as /usr, is a complete waste of time. You
should only stripe partitions that require serious I/O performance,
typically /var, /home, or custom partitions used to hold databases and
web pages. Choosing the proper stripe size is also important. File
systems tend to store meta-data on power-of-2 boundaries and you usually
want to reduce seeking rather than increase seeking. This means you
want to use a large off-center stripe size such as 1152 sectors so
sequential I/O does not seek both disks and so meta-data is distributed
across both disks rather than concentrated on a single disk. If you
really need to get sophisticated, we recommend using a real hardware
RAID controller from the list of FreeBSD supported controllers.
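
The effect of an off-center stripe size can be illustrated with a small
Python sketch. It is a simplified model, not a description of how
vinum(8) or ccdconfig(8) are implemented: we assume a plain round-robin
stripe across two disks, and the meta-data positions (one every 2048
sectors) are hypothetical, chosen only because they fall on power-of-2
boundaries.

```python
def disk_for_sector(sector, stripe_sectors, n_disks):
    """Disk index holding a logical sector under round-robin striping."""
    return (sector // stripe_sectors) % n_disks


def disks_hit(offsets, stripe_sectors, n_disks=2):
    """Map each logical sector offset to the disk it lands on."""
    return [disk_for_sector(s, stripe_sectors, n_disks) for s in offsets]


# Hypothetical meta-data locations on power-of-2 boundaries.
metadata_offsets = [k * 2048 for k in range(10)]

# Power-of-2 stripe: every meta-data block lands on the same disk.
print(disks_hit(metadata_offsets, 1024))

# Off-center stripe of 1152 sectors: meta-data spreads over both disks.
print(disks_hit(metadata_offsets, 1152))
```

With the 1024-sector stripe every sampled offset maps to disk 0, while
the 1152-sector stripe places some on each disk.
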
SYSCTL TUNING
sysctl(8) variables permit system behavior to be monitored and
controlled at run-time. Some sysctls simply report on the behavior of
the system; others allow the system behavior to be modified; some may be
set at boot time using rc.conf(5), but most will be set via
sysctl.conf(5). There are several hundred sysctls in the system,
including many that appear to be candidates for tuning but actually are
not. In this document we will only cover the ones that have the
greatest effect on the system.

The kern.ipc.shm_use_phys sysctl defaults to 0 (off) and may be set to 0
(off) or 1 (on). Setting this parameter to 1 will cause all System V
shared memory segments to be mapped to unpageable physical RAM. This
feature only has an effect if you are either (A) mapping small amounts
of shared memory across many (hundreds) of processes, or (B) mapping
large amounts of shared memory across any number of processes. This
feature allows the kernel to remove a great deal of internal memory
management page-tracking overhead at the cost of wiring the shared
memory into core, making it unswappable.

The vfs.vmiodirenable sysctl defaults to 1 (on). This parameter
controls how directories are cached by the system. Most directories are
small and use but a single fragment (typically 1K) in the file system
and even less (typically 512 bytes) in the buffer cache. However, when
this sysctl is turned off the buffer cache will only cache a fixed
number of directories even if you have a huge amount of memory. Turning
on this sysctl allows the buffer cache to use the VM Page Cache to cache
the directories. The advantage is that all of memory is now available
for caching directories. The disadvantage is that the minimum in-core
memory used to cache a directory is the physical page size (typically
4K) rather than 512 bytes. We recommend turning this option off in
memory-constrained environments; however, when on, it will substantially
improve the performance of services that manipulate a large number of
files. Such services can include web caches, large mail systems, and
news systems. Turning on this option will generally not reduce
performance even with the wasted memory but you should experiment to
find out.

The vfs.write_behind sysctl defaults to 1 (on). This tells the file
system to issue media writes as full clusters are collected, which
typically occurs when writing large sequential files. The idea is to
avoid saturating the buffer cache with dirty buffers when it would not
benefit I/O performance. However, this may stall processes and under
certain circumstances you may wish to turn it off.

The vfs.hirunningspace sysctl determines how much outstanding write I/O
may be queued to disk controllers system-wide at any given instant. The
default is usually sufficient but on machines with lots of disks you may
want to bump it up to four or five megabytes. Note that setting too
high a value (exceeding the buffer cache's write threshold) can lead to
extremely bad clustering performance. Do not set this value arbitrarily
high! Also, higher write queueing values may add latency to reads
occurring at the same time.

There are various other buffer-cache and VM page cache related sysctls.
We do not recommend modifying these values. As of FreeBSD 4.3, the VM
system does an extremely good job tuning itself.

The net.inet.tcp.sendspace and net.inet.tcp.recvspace sysctls are of
particular interest if you are running network intensive applications.
They control the amount of send and receive buffer space allowed for any
given TCP connection. The default sending buffer is 32K; the default
receiving buffer is 64K. You can often improve bandwidth utilization by
increasing the default at the cost of eating up more kernel memory for
each connection. We do not recommend increasing the defaults if you are
serving hundreds or thousands of simultaneous connections because it is
possible to quickly run the system out of memory due to stalled
connections building up. But if you need high bandwidth over fewer
connections, especially if you have gigabit Ethernet, increasing these
defaults can make a huge difference. You can adjust the buffer size for
incoming and outgoing data separately. For example, if your machine is
primarily doing web serving you may want to decrease the recvspace in
order to be able to increase the sendspace without eating too much
kernel memory. Note that the routing table (see route(8)) can be used
to introduce route-specific send and receive buffer size defaults.

As an additional management tool you can use pipes in your firewall
rules (see ipfw(8)) to limit the bandwidth going to or from particular
IP blocks or ports. For example, if you have a T1 you might want to
limit your web traffic to 70% of the T1's bandwidth in order to leave
the remainder available for mail and interactive use. Normally a
heavily loaded web server will not introduce significant latencies into
other services even if the network link is maxed out, but enforcing a
limit can smooth things out and lead to longer term stability. Many
people also enforce artificial bandwidth limitations in order to ensure
that they are not charged for using too much bandwidth.

Setting the send or receive TCP buffer to values larger than 65535 will
result in only a marginal performance improvement unless both hosts
support the window scaling extension of the TCP protocol, which is
controlled by the net.inet.tcp.rfc1323 sysctl. These extensions should
be enabled and the TCP buffer size should be set to a value larger than
65536 in order to obtain good performance from certain types of network
links; specifically, gigabit WAN links and high-latency satellite links.
RFC1323 support is enabled by default.
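
One common way to judge whether a link needs buffers above 65535 bytes
is to compute its bandwidth*delay product, the amount of data that must
be in flight to keep the link full. The following Python sketch shows
the arithmetic; the function names and the three example links are
hypothetical and are not measurements from any real system.

```python
def bdp_bytes(bandwidth_bits_per_sec, rtt_ms):
    """Bandwidth*delay product in bytes for a given round trip time."""
    # bits/s * ms / 1000 gives bits in flight; divide by 8 for bytes.
    return bandwidth_bits_per_sec * rtt_ms // 8000


def needs_window_scaling(buffer_bytes):
    """True if the buffer exceeds what an unscaled TCP window can use."""
    return buffer_bytes > 65535


# Hypothetical links: (description, bits per second, round trip in ms).
links = [
    ("100 Mbit LAN, 1 ms", 100_000_000, 1),
    ("2 Mbit satellite, 600 ms", 2_000_000, 600),
    ("gigabit WAN, 50 ms", 1_000_000_000, 50),
]

for name, bps, rtt in links:
    size = bdp_bytes(bps, rtt)
    print(name, size, needs_window_scaling(size))
```

In this model the LAN is well served by the default buffers, while the
satellite and gigabit WAN links only benefit from larger buffers if
window scaling is available on both hosts.
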

The net.inet.tcp.always_keepalive sysctl determines whether or not the
TCP implementation should attempt to detect dead TCP connections by
intermittently delivering ``keepalives'' on the connection. By default,
this is enabled for all applications; by setting this sysctl to 0, only
applications that specifically request keepalives will use them. In
most environments, TCP keepalives will improve the management of system
state by expiring dead TCP connections, particularly for systems serving
dialup users who may not always terminate individual TCP connections
before disconnecting from the network. However, in some environments,
temporary network outages may be incorrectly identified as dead
sessions, resulting in unexpectedly terminated TCP connections. In such
environments, setting the sysctl to 0 may reduce the occurrence of TCP
session disconnections.

The net.inet.tcp.delayed_ack TCP feature is largely misunderstood.
Historically speaking, this feature was designed to allow the
acknowledgement to transmitted data to be returned along with the
response. For example, when you type over a remote shell, the
acknowledgement to the character you send can be returned along with the
data representing the echo of the character. With delayed acks turned
off, the acknowledgement may be sent in its own packet, before the
remote service has a chance to echo the data it just received. This
same concept also applies to any interactive protocol (e.g. SMTP, WWW,
POP3), and can cut the number of tiny packets flowing across the network
in half. The FreeBSD delayed ACK implementation also follows the TCP
protocol rule that at least every other packet be acknowledged even if
the standard 100ms timeout has not yet passed. Normally the worst a
delayed ACK can do is slightly delay the teardown of a connection, or
slightly delay the ramp-up of a slow-start TCP connection. While we are
not sure, we believe that the several FAQs related to packages such as
SAMBA and SQUID which advise turning off delayed acks may be referring
to the slow-start issue. In FreeBSD, it would be more beneficial to
increase the slow-start flight size via the
net.inet.tcp.slowstart_flightsize sysctl rather than disable delayed
acks.

The net.inet.tcp.inflight.enable sysctl turns on bandwidth delay product
limiting for all TCP connections. The system will attempt to calculate
the bandwidth delay product for each connection and limit the amount of
data queued to the network to just the amount required to maintain
optimum throughput. This feature is useful if you are serving data over
modems, GigE, or high speed WAN links (or any other link with a high
bandwidth*delay product), especially if you are also using window
scaling or have configured a large send window. If you enable this
option, you should also be sure to set net.inet.tcp.inflight.debug to 0
(disable debugging), and for production use setting
net.inet.tcp.inflight.min to at least 6144 may be beneficial. Note
however, that setting high minimums may effectively disable bandwidth
limiting depending on the link. The limiting feature reduces the amount
of data built up in intermediate router and switch packet queues as well
as reduces the amount of data built up in the local host's interface
queue. With fewer packets queued up, interactive connections,
especially over slow modems, will also be able to operate with lower
round trip times. However, note that this feature only affects data
transmission (uploading / server side). It does not affect data
reception (downloading).

Adjusting net.inet.tcp.inflight.stab is not recommended. This parameter
defaults to 20, representing 2 maximal packets added to the bandwidth
delay product window calculation. The additional window is required to
stabilize the algorithm and improve responsiveness to changing
conditions, but it can also result in higher ping times over slow links
(though still much lower than you would get without the inflight
algorithm). In such cases you may wish to try reducing this parameter
to 15, 10, or 5, and you may also have to reduce
net.inet.tcp.inflight.min (for example, to 3500) to get the desired
effect. Reducing these parameters should be done as a last resort only.

The net.inet.ip.portrange.* sysctls control the port number ranges
automatically bound to TCP and UDP sockets. There are three ranges: a
low range, a default range, and a high range, selectable via the
IP_PORTRANGE setsockopt(2) call. Most network programs use the default
range which is controlled by net.inet.ip.portrange.first and
net.inet.ip.portrange.last, which default to 49152 and 65535,
respectively. Bound port ranges are used for outgoing connections, and
it is possible to run the system out of ports under certain
circumstances. This most commonly occurs when you are running a heavily
loaded web proxy. The port range is not an issue when running a server
which handles mainly incoming connections, such as a normal web server,
or has a limited number of outgoing connections, such as a mail relay.
For situations where you may run out of ports, we recommend decreasing
net.inet.ip.portrange.first modestly. A range of 10000 to 30000 ports
may be reasonable. You should also consider firewall effects when
changing the port range. Some firewalls may block large ranges of ports
(usually low-numbered ports) and expect systems to use higher ranges of
ports for outgoing connections. By default net.inet.ip.portrange.last
is set at the maximum allowable port number.
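
The size of the default port range, and the value of
net.inet.ip.portrange.first needed to reach a given number of ports, is
simple arithmetic. A minimal Python sketch follows; the helper names are
hypothetical, and we assume net.inet.ip.portrange.last is left at its
default of 65535.

```python
MAX_PORT = 65535


def ports_in_range(first, last=MAX_PORT):
    """Number of ports available between first and last, inclusive."""
    return last - first + 1


def first_for(port_count, last=MAX_PORT):
    """Value of portrange.first that yields port_count ports up to last."""
    return last - port_count + 1


print(ports_in_range(49152))   # size of the default range
print(first_for(30000))        # portrange.first for a 30000-port range
```

Check the resulting value of first against your firewall rules before
applying it, for the reasons given above.
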

The kern.ipc.somaxconn sysctl limits the size of the listen queue for
accepting new TCP connections. The default value of 128 is typically
too low for robust handling of new connections in a heavily loaded web
server environment. For such environments, we recommend increasing this
value to 1024 or higher. The service daemon may itself limit the listen
queue size (e.g. sendmail(8), apache) but will often have a directive in
its configuration file to adjust the queue size up. Larger listen
queues also do a better job of fending off denial of service attacks.

The kern.maxfiles sysctl determines how many open files the system
supports. The default is typically a few thousand but you may need to
bump this up to ten or twenty thousand if you are running databases or
large descriptor-heavy daemons. The read-only kern.openfiles sysctl may
be interrogated to determine the current number of open files on the
system.

The vm.swap_idle_enabled sysctl is useful in large multi-user systems
where you have lots of users entering and leaving the system and lots of
idle processes. Such systems tend to generate a great deal of
continuous pressure on free memory reserves. Turning this feature on
and adjusting the swapout hysteresis (in idle seconds) via
vm.swap_idle_threshold1 and vm.swap_idle_threshold2 allows you to
depress the priority of pages associated with idle processes more
quickly than the normal pageout algorithm. This gives a helping hand to
the pageout daemon. Do not turn this option on unless you need it,
because the tradeoff you are making is to essentially pre-page memory
sooner rather than later, eating more swap and disk bandwidth. In a
small system this option will have a detrimental effect but in a large
system that is already doing moderate paging this option allows the VM
system to stage whole processes into and out of memory more easily.
LOADER TUNABLES
Some aspects of the system behavior may not be tunable at runtime
because memory allocations they perform must occur early in the boot
process. To change loader tunables, you must set their values in
loader.conf(5) and reboot the system.

kern.maxusers controls the scaling of a number of static system tables,
including defaults for the maximum number of open files, sizing of
network memory resources, etc. As of FreeBSD 4.5, kern.maxusers is
automatically sized at boot based on the amount of memory available in
the system, and may be determined at run-time by inspecting the value of
the read-only kern.maxusers sysctl. Some sites will require larger or
smaller values of kern.maxusers and may set it as a loader tunable;
values of 64, 128, and 256 are not uncommon. We do not recommend going
above 256 unless you need a huge number of file descriptors; many of the
tunable values set to their defaults by kern.maxusers may be
individually overridden at boot-time or run-time as described elsewhere
in this document. Systems older than FreeBSD 4.4 must set this value
via the kernel config(8) option maxusers instead.

kern.ipc.nmbclusters may be adjusted to increase the number of network
mbufs the system is willing to allocate. Each cluster represents
approximately 2K of memory, so a value of 1024 represents 2M of kernel
memory reserved for network buffers. You can do a simple calculation to
figure out how many you need. If you have a web server which maxes out
at 1000 simultaneous connections, and each connection eats a 16K receive
and 16K send buffer, you need approximately 32MB worth of network
buffers to deal with it. A good rule of thumb is to multiply by 2, so
32MBx2 = 64MB/2K = 32768. So for this case you would want to set
kern.ipc.nmbclusters to 32768. We recommend values between 1024 and
4096 for machines with moderate amounts of memory, and between 4096 and
32768 for machines with greater amounts of memory. Under no
circumstances should you specify an arbitrarily high value for this
parameter; it could lead to a boot-time crash. The -m option to
netstat(1) may be used to observe network cluster use. Older versions
of FreeBSD do not have this tunable and require that the kernel
config(8) option NMBCLUSTERS be set instead.

More and more programs are using the sendfile(2) system call to transmit
files over the network. The kern.ipc.nsfbufs sysctl controls the number
of file system buffers sendfile(2) is allowed to use to perform its
work. This parameter nominally scales with kern.maxusers so you should
not need to modify this parameter except under extreme circumstances.
See the TUNING section in the sendfile(2) manual page for details.
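
The kern.ipc.nmbclusters calculation described earlier in this section
can be written out as a short Python sketch. The function name and
parameters are hypothetical; the 2K cluster size and the safety factor
of 2 come from the rule of thumb in the text.

```python
CLUSTER_BYTES = 2048  # each cluster represents approximately 2K


def nmbclusters_for(connections, recv_buf, send_buf, safety_factor=2):
    """Clusters needed for the given connection count and buffer sizes."""
    needed = connections * (recv_buf + send_buf) * safety_factor
    # Round up to a whole number of clusters.
    return -(-needed // CLUSTER_BYTES)


# 1000 connections, each with a 16K receive and a 16K send buffer.
exact = nmbclusters_for(1000, 16 * 1024, 16 * 1024)
print(exact)

# The text rounds the 32000K total up to 32MB before doubling.
rounded = 2 * 32 * 1024 * 1024 // CLUSTER_BYTES
print(rounded)
```

The two results differ only because the text rounds up to 32MB; either
lies at the top of the range recommended for machines with greater
amounts of memory.
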
KERNEL CONFIG TUNING
There are a number of kernel options that you may have to fiddle with in
a large-scale system. In order to change these options you need to be
able to compile a new kernel from source. The config(8) manual page and
the handbook are good starting points for learning how to do this.
Generally the first thing you do when creating your own custom kernel is
to strip out all the drivers and services you do not use. Removing
things like INET6 and drivers you do not have will reduce the size of
your kernel, sometimes by a megabyte or more, leaving more memory
available for applications.

SCSI_DELAY may be used to reduce system boot times. The defaults are
fairly high and can be responsible for 5+ seconds of delay in the boot
process. Reducing SCSI_DELAY to something below 5 seconds could work
(especially with modern drives).

There are a number of *_CPU options that can be commented out. If you
only want the kernel to run on a Pentium class CPU, you can easily
remove I486_CPU, but only remove I586_CPU if you are sure your CPU is
being recognized as a Pentium II or better. Some clones may be
recognized as a Pentium or even a 486 and not be able to boot without
those options. If it works, great! The operating system will be able
to better use higher-end CPU features for MMU, task switching, timebase,
and even device operations. Additionally, higher-end CPUs support 4MB
MMU pages, which the kernel uses to map the kernel itself into memory,
increasing its efficiency under heavy syscall loads.
IDE WRITE CACHING
FreeBSD 4.3 flirted with turning off IDE write caching. This reduced
write bandwidth to IDE disks but was considered necessary due to serious
data consistency issues introduced by hard drive vendors. Basically the
problem is that IDE drives lie about when a write completes. With IDE
write caching turned on, IDE hard drives will not only write data to
disk out of order, they will sometimes delay some of the blocks
indefinitely under heavy disk load. A crash or power failure can result
in serious file system corruption. So our default was changed to be
safe. Unfortunately, the result was such a huge loss in performance
that we caved in and changed the default back to on after the release.
You should check the default on your system by observing the hw.ata.wc
sysctl variable. If IDE write caching is turned off, you can turn it
back on by setting the hw.ata.wc loader tunable to 1. More information
on tuning the ATA driver system may be found in the ata(4) manual page.
If you need performance, go with SCSI.
CPU, MEMORY, DISK, NETWORK
The type of tuning you do depends heavily on where your system begins to
bottleneck as load increases. If your system runs out of CPU (idle
times are perpetually 0%) then you need to consider upgrading the CPU or
moving to an SMP motherboard (multiple CPUs), or perhaps you need to
revisit the programs that are causing the load and try to optimize them.
If your system is paging to swap a lot you need to consider adding more
memory. If your system is saturating the disk you typically see high
CPU idle times and total disk saturation. systat(1) can be used to
monitor this. There are many solutions to saturated disks: increasing
memory for caching, mirroring disks, distributing operations across
several machines, and so forth. If disk performance is an issue and you
are using IDE drives, switching to SCSI can help a great deal. While
modern IDE drives compare with SCSI in raw sequential bandwidth, the
moment you start seeking around the disk SCSI drives usually win.

Finally, you might run out of network suds. The first line of defense
for improving network performance is to make sure you are using switches
instead of hubs, especially these days when switches are almost as
cheap. Hubs have severe problems under heavy loads due to collision
back-off and one bad host can severely degrade the entire LAN. Second,
optimize the network path as much as possible. For example, in
firewall(7) we describe a firewall protecting internal hosts with a
topology where the externally visible hosts are not routed through it.
Use 100BaseT rather than 10BaseT, or use 1000BaseT rather than 100BaseT,
depending on your needs. Most bottlenecks occur at the WAN link (e.g.
modem, T1, DSL, whatever). If expanding the link is not an option it
may be possible to use the dummynet(4) feature to implement peak shaving
or other forms of traffic shaping to prevent the overloaded service
(such as web services) from affecting other services (such as email), or
vice versa. In home installations this could be used to give
interactive traffic (your browser, ssh(1) logins) priority over services
you export from your box (web services, email).
SEE ALSO
netstat(1), systat(1), ata(4), dummynet(4), login.conf(5), rc.conf(5),
sysctl.conf(5), firewall(7), hier(7), ports(7), boot(8), bsdlabel(8),
ccdconfig(8), config(8), fsck(8), ifconfig(8), ipfw(8), loader(8),
mount(8), newfs(8), route(8), sysctl(8), sysinstall(8), tunefs(8),
vinum(8)
HISTORY
The tuning manual page was originally written by Matthew Dillon and
first appeared in FreeBSD 4.3, May 2001.

BSD June 25, 2002