[ENBD] manpage for /proc/nbdinfo (new!)
Peter T. Breuer
enbd@lists.community.tummy.com
Mon, 6 Jan 2003 17:49:03 +0100 (MET)
Today's the last day of xmas hols, so I've finally gotten round to
writing a manpage for /proc/nbdinfo.
Add to your manpage collection. Section 8.
Peter
.TH NBDINFO 8 "6 January 2003"
.SH NAME
nbdinfo \- ENBD control and information file in /proc
.SH SYNOPSIS
.B cat /proc/nbdinfo
.br
.B echo command > /proc/nbdinfo
.SH DESCRIPTION
The
.I nbdinfo
file is an interface in /proc to many of the ENBD module's mode controls,
accounting information, and statistics.
.SH ACCOUNTING INFORMATION AND STATISTICS
The output from /proc/nbdinfo is divided into sections, one per active
ENBD device. The information pertaining to device ndb (or /dev/nd/b, if
using the devfs scheme), for example, is headed by a single line saying
.IP
Device b: Open
.LP
and information for that section is prefixed by a "[b]" at the
beginning of each line of output.
.PP
Devices that are not active and have
never been active will have a single abbreviated line of output
corresponding to them, saying, for example
.IP
Device a: Closed
.LP
Consecutive closed devices will be indicated with a range designator,
for example:
.IP
Device c-p: Closed
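.PP
A script that reads the file may want to expand these header lines
into individual device letters. The following is a minimal sketch, not
part of ENBD: the function name is hypothetical, and it assumes that
devices are always designated by single letters.
.PP
.nf
```python
def expand_device_header(line):
    """Return (letters, state) for a header such as 'Device c-p: Closed'."""
    prefix = "Device "
    if not line.startswith(prefix):
        raise ValueError("not a device header: " + repr(line))
    designator, _, state = line[len(prefix):].partition(":")
    first, dash, last = designator.strip().partition("-")
    if dash:
        # A range designator: expand it to every letter it covers.
        letters = [chr(c) for c in range(ord(first), ord(last) + 1)]
    else:
        letters = [first]
    return letters, state.strip()


print(expand_device_header("Device b: Open"))
print(expand_device_header("Device c-p: Closed"))
```
.fi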
.PP
Each section commences with a state indicator, showing the flags set
on the device and other important variables.
.IP
[b] State: flags
.LP
The flags include
.PP
.B uninitialized
- the device structures have not yet been set up by the driver (this should indicate a memory error);
.PP
.B verify
- the device has the right magic (i.e. it is not obviously
corrupt);
.PP
.B signed
- the client daemons have registered a server's signature;
.PP
.B rw/ro
- the device is in read-write (or read-only) mode;
.PP
.B merge requests
- the device is aggregating incoming kernel requests
to some limit specified by the merge_requests command (see section COMMANDS);
.PP
.B buffer writes
- the device will buffer writes internally instead of
passing them on to the remote resource. This mode is used to provide
diskless node root file systems, which should be largely read only,
with a few local modifications;
.PP
.B enabled
- the device is in principle accepting kernel requests and
passing them on to the remote server, which is in good health. When
contact is lost, the device will be disabled;
.PP
.B validated
- the partition table on the device (if any) has been
scanned;
.PP
.B remote invalid
- the remote resource has disappeared although we are
still connected to the remote server and the latter is responding.
This usually indicates that the remote resource is a removable medium
such as a CD-ROM or floppy, and it is being changed;
.PP
.B show_errs
- the device will error out requests when there is a problem,
instead of blocking them. This changes the behaviour when the device is
disabled or the remote resource is unavailable;
.PP
.B direct
- the device is using direct i/o. This is an experimental
option;
.PP
.B plug
- always shown (in 2.2 and earlier kernels this was a kernel mode);
.PP
.B sync
- the device is in synchronous mode;
.PP
.B md5sum
- the device is currently in the mode where it uses md5summing
techniques to accelerate writes;
.PP
.B acct
- accounting is being performed for the device, as specified by
the acct command (see section COMMANDS);
.PP
.B last error
- in case the device has errored, an indicator of the last error;
.PP
.B lives
- a count of the number of times in its lifetime that the device has
totally disconnected from the remote and been reconnected. A high count
may indicate network or other problems;
.PP
.B bp
- always zero (in future kernels, a count of memory pages
available).
.PP
After the State line, a line showing information on the kernel queues is
shown. For example:
.IP
[b] Queued: +0R/0W curr (check 0R/0W) +1R/0W max
.LP
The statistics show blocks, per read and per write.
.PP
The device uses a userspace buffer to communicate with the daemons. Its
size is shown next. It is followed by lines showing the block size and
the size of the device in kilobytes and in blocks:
.IP
[b] Buffersize: 262144 (sectors=512, blocks=256)
.br
[b] Blocksize: 1024 (log=10)
.br
[b] Size: 4096KB
.br
[b] Blocks: 4096
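.PP
The figures in this sample are mutually consistent, which a short
script can confirm. This is only a sketch: the helper names are
hypothetical, and reading "sectors=512" as a count of 512-byte sectors
is an assumption based on the numbers shown.
.PP
.nf
```python
SAMPLE = """[b] Buffersize: 262144 (sectors=512, blocks=256)
[b] Blocksize: 1024 (log=10)
[b] Size: 4096KB
[b] Blocks: 4096"""


def first_number(text):
    """Return the leading run of digits in text as an integer."""
    digits = ""
    for ch in text.strip():
        if not ch.isdigit():
            break
        digits += ch
    return int(digits)


def size_fields(report):
    """Map each field name to the first number on its line."""
    fields = {}
    for line in report.splitlines():
        _, _, rest = line.partition("] ")
        key, _, value = rest.partition(":")
        fields[key] = first_number(value)
    return fields


f = size_fields(SAMPLE)
# 4096 blocks of 1024 bytes make up the 4096KB device.
print(f["Blocks"] * f["Blocksize"] == f["Size"] * 1024)
# The 262144-byte buffer holds 256 blocks of 1024 bytes.
print(f["Buffersize"] // f["Blocksize"])
```
.fi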
.PP
The next lines of output pertain to the individual client daemons,
and the output is columnized.
.PP
When the device is in RAID mode the daemons will be organized into
distinct groups, and the Groups line shows their group allegiance.
The Sockets line that follows shows the state of the individual
connections. A "+" indicates a good client connection, and a "*"
shows the last client to have been active. If the connection is
bad, a "-" and a "." will be shown instead, respectively.
.IP
[b] Groups: 2 (0) (0) (1) (1)
.br
[b] Sockets: 4 (+) (+) (+) (*)
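.PP
The columnized lines lend themselves to simple parsing. The sketch
below uses hypothetical helper names and assumes that every per-daemon
column is a single parenthesized token, as in the sample above. It
reports which daemon slots have a bad connection.
.PP
.nf
```python
def parse_columns(line):
    """Split '[b] Sockets: 4 (+) (+) (+) (*)' into (name, total, columns)."""
    _, _, rest = line.partition("] ")
    name, _, values = rest.partition(":")
    tokens = values.split()
    columns = [t[1:-1] for t in tokens[1:]
               if t.startswith("(") and t.endswith(")")]
    return name, tokens[0], columns


def bad_sockets(line):
    """Return the column indexes whose marker is '-' or '.'."""
    _, _, columns = parse_columns(line)
    return [i for i, mark in enumerate(columns) if mark in ("-", ".")]


print(parse_columns("[b] Sockets: 4 (+) (+) (+) (*)"))
print(bad_sockets("[b] Sockets: 4 (+) (-) (+) (.)"))
```
.fi
.PP
The second call uses an invented line to show the bad-connection markers.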
.PP
The following lines are concerned with transfer statistics.
The accounting is in blocks, and the total is at left, with
subtotals for each daemon in the corresponding columns. The
Requested line shows the number of kernel requests entering the device.
At far right the total is broken down into read and write components,
and the maximum seen in a single request is recorded. The Despatched
line shows the number of requests satisfied. The number of write
requests subjected to md5summing acceleration is also shown. "eq"
means that the md5sum technique determined that the source and target
of the write were equal in content, and the write was skipped,
"ne" means the write was not skipped, as the contents were not equal,
and "dn" means that the remote server denied the md5sum request.
.IP
[b] Requested: 4 (0) (0) (4) (0) 4R/0W max 4
.br
[b] Despatched: 4 (0) (0) (4) (0) 4R/0W md5 0W (0 eq, 0 ne, 0 dn)
.br
[b] Errored: 0 (0) (0) (0) (0) 0+0
.br
[b] Pending: 0 (0) (0) (0) (0) 0R/0W+0R/0W
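.PP
As an illustration of the layout, the Requested line can be split into
its total, per-daemon subtotals, read/write breakdown, and maximum.
This is a sketch with a hypothetical function name. It assumes the
field order of the sample, and the check that the subtotals add up to
the total is an observation about the sample rather than a documented
guarantee.
.PP
.nf
```python
def parse_requested(line):
    """Parse '[b] Requested: 4 (0) (0) (4) (0) 4R/0W max 4'."""
    _, _, rest = line.partition(": ")
    tokens = rest.split()
    per_daemon = [int(t[1:-1]) for t in tokens[1:]
                  if t.startswith("(") and t.endswith(")")]
    # The read/write breakdown is the token shaped like '4R/0W'.
    breakdown = next(t for t in tokens if "R/" in t)
    reads, _, writes = breakdown.partition("/")
    return {
        "total": int(tokens[0]),
        "per_daemon": per_daemon,
        "reads": int(reads.rstrip("R")),
        "writes": int(writes.rstrip("W")),
        "max": int(tokens[-1]),
    }


stats = parse_requested("[b] Requested: 4 (0) (0) (4) (0) 4R/0W max 4")
print(stats)
print(sum(stats["per_daemon"]) == stats["total"])
```
.fi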
.PP
The Errored and Pending lines show requests errored out by the device
and requests queued internally, respectively. The "+" totals at the
right distinguish between requests on the kernel queue and requests on
the driver's internal queues. The kernel queue statistics are after
the "+".
.PP
There follow lines showing the current device speeds, in bytes per
second:
.IP
[b] B/s now: 0 (0R+0W)
.br
[b] B/s ave: 0 (0R+0W)
.br
[b] B/s max: 0 (0R+0W)
.PP
The next line breaks the requests total down per size of request.
The size (in blocks) is given after the "%" and the percentage
is before the "%".
.IP
[b] Spectrum: 100%4
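.PP
A Spectrum line can be turned into (percentage, size) pairs as
sketched below. The function name is hypothetical, and the assumption
that several entries are separated by white space is a guess, since
the sample shows only one entry.
.PP
.nf
```python
def parse_spectrum(line):
    """Return a list of (percent, size_in_blocks) pairs."""
    _, _, rest = line.partition(": ")
    entries = []
    for token in rest.split():
        percent, _, size = token.partition("%")
        entries.append((int(percent), int(size)))
    return entries


# Here 100 percent of the requests were 4 blocks long.
print(parse_spectrum("[b] Spectrum: 100%4"))
```
.fi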
.PP
There then follows internal state information about the number of
threads of execution currently running through the device:
.IP
[b] Kthreads: 0 (0 waiting/0 running/1 max)
.PP
The same kind of information is shown for the user space client
threads, but in more detail. A "+" indicates that the thread is
currently blocked in kernel, presumably waiting on an event.
A "-" indicates that the thread is out of kernel. It may indicate a
client daemon death. The succeeding line shows the process IDs of
the corresponding user space daemons.
.IP
[b] Cthreads: 4 (+) (+) (+) (+)
.br
[b] Cpids: 4 (1189) (1190) (1191) (1192)
.SH COMMANDS
The /proc/nbdinfo interface accepts certain instructions written to
it. For example:
.IP
echo enable[b]=0 > /proc/nbdinfo
.LP
will disable device ndb.
.PP
The general format of a command is one of
.IP
command[letter] = value
.br
command = value
.LP
In the latter case, the command applies to all (initialized, open) devices.
Normally, the "letter" designates the target device. Numbers ("0", "1", ...)
may be used instead of letters. Spaces around the equals sign are
discarded, as are leading and trailing spaces.
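.PP
The command syntax can be illustrated with a pair of helpers that
build and take apart command strings. This is a sketch only: both
function names are hypothetical, and the parsing mimics the rules
described here rather than reproducing the driver code.
.PP
.nf
```python
def format_command(name, value, device=None):
    """Build 'name[device]=value', or 'name=value' for all devices."""
    target = "" if device is None else "[" + str(device) + "]"
    return name + target + "=" + str(value)


def parse_command(text):
    """Split a command into (name, device, value), ignoring spaces."""
    left, equals, value = text.strip().partition("=")
    if not equals:
        raise ValueError("no '=' in command: " + repr(text))
    name, bracket, rest = left.strip().partition("[")
    device = rest.rstrip("]").strip() if bracket else None
    return name.strip(), device, value.strip()


print(format_command("enable", 0, "b"))
print(parse_command(" enable[b] = 0 "))
print(parse_command("acct=1"))
```
.fi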
.PP
In addition, the instructions "0" and "1" are emergency escapes
which tell all devices to shut down. The "1" form also zeros the module
reference counter, so the module may be removed from the kernel
(expect a minor oops if there are other kernel components still
referencing it, such as, for example, if the device were mounted in the
file system).
.PP
The list of commands recognized is as follows (the list may change in
future, check the write_proc code in the driver if in doubt):
.IP
.B merge_requests
- the maximum extra number of blocks to be aggregated per request, over
the natural blocksize.
.IP
.B debug
- if compiled in, turn on (1) or turn off (0) debugging on the device.
.IP
.B sync
- (or "sync_intvl") the interval between forced device syncs. 0
indicates never.
.IP
.B show_errs
- turn on (1) or turn off (0) the behaviour that errors out failed
requests instead of retrying them later. This makes the difference
between erroring and blocking behaviour on i/o to a failed device.
.IP
.B plug
- no longer used.
.IP
.B md5sum
- put the device in (1) md5summing mode, or take it out (0). In any
case the thresholds shown in /proc/sys/dev/enbd/ variables still
apply and may take the device out of md5summing mode after the
threshold number of failures, or put it into md5summing mode after
the threshold number of ordinary writes.
.IP
.B rahead
- changes the number of blocks of read ahead performed on the device.
.IP
.B acct
- turns on (1) or turns off (0) accounting on the device.
.IP
.B enable
- turns on (1) or turns off (0) the device.
.IP
.B direct
- puts the device in (1) or out of (0) direct i/o mode. Experimental.
.IP
.B zero
- zeros (1) the accounting counters on the device.
.IP
.B setfaulty
- marks the group given as an argument as faulty, when in a RAID1
configuration (i.e. the Groups line shows more than one group).
The group will not be written to, but when it comes back online,
all missed writes will be caught up with.
.IP
.B hotremove
- marks the RAID1 group given as argument as absent, as though a disk
were being changed. When the group comes back on line, a complete
resync will be performed.
.IP
.B hotadd
- marks the RAID1 group given as argument as present, allowing resyncs
to take place.
.SH ERRORS
On an illegal command or argument value (out of bounds, malformed,
etc.), the write to
.I /proc/nbdinfo
returns -EINVAL.
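.PP
From a program, a rejected command shows up as a failed write with
errno set to EINVAL. The following sketch shows one way to handle
that. The function name is hypothetical, and the trailing newline
simply imitates what echo sends.
.PP
.nf
```python
import errno
import os


def send_command(command, path="/proc/nbdinfo"):
    """Write one command; return False if it is rejected with EINVAL."""
    data = (command + os.linesep).encode("ascii")
    fd = os.open(path, os.O_WRONLY)
    try:
        os.write(fd, data)
    except OSError as err:
        if err.errno == errno.EINVAL:
            return False
        raise  # any other failure is unexpected here
    finally:
        os.close(fd)
    return True
```
.fi
.PP
A caller with sufficient privileges might then use
send_command("enable[b]=0") and test the result.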
.SH AUTHOR
Peter Breuer wrote the nbdinfo interface.
.SH "SEE ALSO"
enbd-client(8), enbd.conf(5).