.nr H1 7
.nr H2 0
.ds RH "System Operation
.ds CF \*(DY
.bp
.LG
.B
.ce
7. SYSTEM OPERATION
.sp 2
.R
.NL
.PP
This section describes procedures used to operate a PDP-11 UNIX system.
Procedures described here are used periodically, to reboot the system,
analyze error messages from devices, do disk backups, monitor
system performance, recompile system software and control local changes.
.NH 2
Bootstrap and shutdown procedures
.PP
The system boot procedure varies with the hardware configuration,
but generally uses the console emulator or a ROM routine to boot
one of the disks.  /boot comes up and prompts (with ``: '')
for the name of the system to load.  Simply hitting a carriage return
will load the default system.
The system will come up with a single-user shell on the console.
To bring the system up to a multi-user configuration from the single-user
status, all you have to do is hit ^D on the console (you should check
and, if necessary, set the date before going multiuser; see \fIdate\fP\|(1)).
The system will then execute /etc/rc,
a multi-user restart script, and come up on the terminals listed as
active in the file /etc/ttys.
See
\fIinit\fP\|(8)
and
\fIttys\fP\|(5).
Note, however, that this does not cause a file system check to be performed.
Unless the system was taken down cleanly, you should run
``fsck \-p'' or force a reboot with
\fIreboot\fP\|(8)
to have the disks checked.
.PP
In an automatic reboot, the system checks the disks and comes up multi-user
without intervention at the console.
If the file system check fails, or is interrupted
(after it prints the date) from the console when a delete/rubout is hit,
it will leave the system in special-session mode, allowing root to
log in on one of a limited number of terminals (generally including a dialup)
to repair file systems, etc.
The system is then brought to normal multiuser operations by signaling
init with a SIGINT signal (with ``kill \-INT 1'').
.PP
To take the system down to a single user state you can use
.DS
\fB#\fP kill 1
.DE
or use the
\fIshutdown\fP\|(8)
command (which is much more polite if there are other users logged in)
when you are up multi-user.
Either command will kill all processes and give you a shell on the console,
almost as if you had just booted.  File systems remain mounted after the
system is taken single-user.  If you wish to come up multi-user again, you
should do this by:
.DS
\fB#\fP cd /
\fB#\fP /etc/umount \-a
\fB#\fP ^D
.DE
The system can also be halted or rebooted with
.IR reboot (8)
if automatic reboots are enabled.
Otherwise, the system is halted by switching to single-user mode to kill
all processes, updating the disks with a ``sync'' command, and then
halting.
.PP
Each system shutdown, crash, processor halt and reboot
is recorded in the file /usr/adm/shutdownlog
with the cause.
.NH 2
Device errors and diagnostics
.PP
When errors occur on peripherals or in the system, the system
prints a warning diagnostic on the console.  These messages are collected
regularly and written into a system error log file
/usr/adm/messages by
.IR dmesg (8).
.PP
Error messages printed by the devices in the system are described with the
drivers for the devices in section 4 of the Berkeley \s-2PDP-11\s0
UNIX Programmer's manual.
If errors occur indicating hardware problems, you should contact
your hardware support group or field service.  It is a good idea to
examine the error log file regularly
(e.g. with ``tail \-r /usr/adm/messages'').
.PP
If you have \s-2DEC\s0 field service, they should know how to interpret
these messages.  If they do not, tell them to contact the \s-2DEC\s0 \s-2UNIX\s0
Engineering Group.
.NH 2
File system checks, backups and disaster recovery
.PP
Periodically (say every week or so in the absence of any problems)
and always (usually automatically) after a crash,
all the file systems should be checked for consistency
by
\fIfsck\fP\|(8).
The procedures of
\fIboot\fP\|(8) or
\fIreboot\fP\|(8)
should be used to get the system to a state where a file system
check can be performed manually or automatically.
.PP
Dumping of the file systems should be done regularly,
since once the system is going it is easy to
become complacent.
Complete and incremental dumps are easily done with
\fIdump\fP\|(8).
You should arrange to do a towers-of-Hanoi dump sequence; we tune
ours so that almost all files are dumped on two tapes and kept for at
least a week in almost every case.  We take full dumps every month (and keep
these indefinitely).
.PP
Dumping of files by name is best done by
\fItar\fP\|(1)
but the amount of data that can be moved in this way is limited
to a single tape.
Finally, if there are enough drives, entire
disks can be copied with
\fIdd\fP\|(1)
using the raw special files and an appropriate
block size.
.PP
It is desirable that full dumps of the root file system are made regularly.
This is especially true when only one disk is available.
Then, if the
root file system is damaged by a hardware or software failure, you
can rebuild a workable disk using a standalone restore in the
same way that \fIrestor\fP was used to build the initial root file
system.
.PP
Exhaustion of user-file space is certain to occur
now and then;
the only mechanisms for controlling this phenomenon
are occasional use of
\fIdf\fP\|(1),
\fIdu\fP\|(1),
\fIquot\fP\|(8),
threatening
messages of the day, personal letters, and (probably as
a last resort) quotas (see \fIsetquota\fP\|(8)).
.NH 2
Moving file system data
.PP
If you have the equipment,
the best way to move a file system
is to dump it to magtape using
\fIdump\fP\|(8),
to use
\fImkfs\fP\|(8)
to create the new file system,
and restore, using \fIrestor\fP\|(8), the tape.
If for some reason you don't want to use magtape,
dump accepts an argument telling where to put the dump;
you might use another disk.
Sometimes a file system has to be increased in logical size
without copying.
The super-block of the device has a word
giving the highest address that can be allocated.
For small increases, this word can be patched
using the debugger \fIadb\fP\|(1)
and the free list reconstructed using
\fIfsck\fP\|(8).
The size should not be increased greatly
by this technique, since the file system will then be short of inode slots.
Read and understand the description given in
\fIfilsys\fP\|(5)
before playing around in this way.
.PP
If you have to merge a file system into another, existing one,
the best bet is to
use
\fItar\fP\|(1).
If you must shrink a file system, the best bet is to dump
the original and restor it onto the new file system.
However, this will not work if the i-list on the smaller file system
is smaller than the maximum allocated inode on the larger.
If this is the case, reconstruct the file system from scratch
on another file system (perhaps using \fItar\fP\|(1)) and then dump it.
If you
are playing with the root file system and only have one drive
the procedure is more complicated.
What you do is the following:
.IP 1.
GET A SECOND PACK!!!!
.IP 2.
Dump the root file system to tape using
\fIdump\fP\|(8).
.IP 3.
Bring the system down and mount the new pack.
.IP 4.
Load the standalone versions of
\fImkfs\fP\|(8)
and
\fIrestor\fP\|(8)
as in sections 2.1-2.3 above.
.IP 5.
Boot normally
using the newly created disk file system.
.PP
Note that if you add new disk
drivers they should also be added to the standalone system in
/usr/src/sys/stand.
.NH 2
Monitoring System Performance
.PP
The
.IR iostat (8)
and
.IR vmstat (8)
programs provided with the system are designed to aid in monitoring
systemwide activity.
By running them
when the system is active you can judge the system activity in several
dimensions: job distribution, virtual memory load, swapping
activity, disk and CPU utilization.
Ideally, there should be few blocked (DW) jobs,
there should be little swapping activity, there should
be available bandwidth on the disk devices (most single arms peak
out at 30-35 tps in practice), and the user CPU utilization (US) should
be high (above 60%).
.PP
If the system is busy, then the count of active jobs may be large,
and several of these jobs may often be blocked (DW).  
.PP
If you run
.I vmstat
when the system is busy (a ``vmstat 5'' gives all the
numbers computed by the system), you can find
imbalances by noting abnormal job distributions.  If many
processes are blocked (DW), then the disk subsystem
is overloaded or imbalanced.  If you have several non-DMA
devices or open teletype lines that are ``ringing'', or user programs
that are doing high-speed non-buffered input/output, then the system
time may go high (60-70% or higher).
It is often possible to pin down the cause of high system time by
looking to see if there is excessive context switching (CS), interrupt
activity (IN) or system call activity (SY).
.PP
If the system is heavily loaded, or if you have little memory
for your load (248K is little in almost any case), then the system
will be forced to swap.  This is likely to be accompanied by a noticeable
reduction in system performance and pregnant pauses when interactive
jobs such as editors swap out.
If you expect to be in a memory-poor environment
for an extended period you might consider administratively
limiting system load.
.NH 2
Adding users
.PP
New users can be added to the system by adding a line to the
password file
/etc/passwd.
You should add accounts for the initial user community, giving
each a directory and a password, and putting users who will wish
to share software in the same group.
User id's should be assigned starting with 16 or higher, as lower
id's are treated specially by the system.
Default startup files should probably provided for new users
and can be copied from /usr/public.
Initial passwords should be set also.
.PP
A number of guest accounts have been provided on the distribution
system; these accounts are for people at Berkeley and at Bell Laboratories
who have done major work on UNIX in the past.  You can delete these accounts,
or leave them on the system if you expect that these people would have
occasion to login as guests on your system.
.NH 2
Accounting
.PP
UNIX currently optionally records two kinds of accounting information:
connect time accounting and process resource accounting.  The connect
time accounting information is normally stored in the file /usr/adm/wtmp,
which is summarized by the program
.IR ac (8).
The process time accounting information is stored in the file /usr/adm/acct,
and analyzed and summarized by the program
.IR sa (8).
.PP
If you need to implement recharge for computing time, you can implement
procedures based on the information provided by these commands.
A convenient way to do this is to give commands to the clock daemon
/etc/cron
to be executed every day at a specified time.  This is done by adding
lines to /usr/adm/crontab; see
.IR cron (8)
for details.
.NH 2
Resource control
.PP
Resource control in the current version of UNIX is rather primitive.
Disk space usage can be monitored by
.IR du (1)
or
.IR quot (8)
as was previously mentioned.
Disk quotas can be set and changed with \fIsetquota\fP\|(8)
if the kernel has been configured for quotas.
Our quota mechanism is simplistic and easily defeated but
does make users more aware of the amount of space they use.
.NH 2
Files which need periodic attention
.PP
We conclude the discussion of system operations by listing
the files and directories that continue to grow
and thus require periodic truncation,
along with references to relevant manual pages.
.IR Cron (8)
can be used to run scripts to
truncate these periodically, possibly summarizing first
or saving recent entries.
Some of these can be disabled if you don't need to collect the information.
.TS
center;
lb l a.
/usr/adm/acct	sa(8)	raw process account data
/usr/adm/messages	dmesg(8)	system error log
/usr/adm/shutdownlog	shutdown(8)	log of system reboots
/usr/adm/wtmp	ac(8)	login session accounting
/usr/spool/uucp/LOGFILE	uulog(1)	uucp log file
/usr/spool/uucp/SYSLOG	uulog(1)	more uucp logging
/usr/dict/spellhist	spell(1)	spell log
/usr/lib/learn/log	learn(1)	learn lesson logging
/usr/sys	savecore(8)	system core images
.TE