.\" bib | pic | tbl | eqn | troff -ms
.if n .ds Lq "
.if n .ds Rq "
.if n .ds -  --
.if t .ds Lq \\&``
.if t .ds Rq \\&''
.if t .ds -  \-
.de Tp
.ps 10
.vs 12
..
.de pT
.ps \n(PS
.vs \n(VS
..
.EQ
delim $$
gsize 12
.EN
.lg 0
.nr PS 12
.ps 12
.nr VS 14
.vs 14
.TL
.ps 14
.vs 16
\!.ps 14
\!.vs 16
Adding An Auditing Subsystem: A Retrospective
.AU
.ps 12
.vs 14
\!.ps 12
\!.vs 14
Matt Bishop
.AI
Department of Mathematics and Computer Science
Dartmouth College
Hanover, NH  03755
\f2and\fP
Research Institute for Advanced Computer Science
NASA Ames Research Center
Moffett Field, CA  94035
.AB no
.AE
.LP
.ls 2
.NH
Introduction
.PP
The importance of designing an auditing subsystem
as part of the system to be audited is fundamental
and has been discussed repeatedly in the literature
(see for example
[. bonyun role well defined auditing process,
picciotto design effective auditing subsystem.]).
However,
this is not always possible.
In the computing environment of the Numerical Aerodynamic Simulation project
at NASA Ames Research Center,
a set of vendor-supplied operating systems based on
.UX
was selected for use on all computers;
as these systems
were to be off-the-shelf products used without modification to the kernels,
any auditing abilities beyond those supplied by the vendors
would have to be retrofitted as system programs
rather than as modifications to the kernel.
Therefore the unified auditing subsystem discussed in
[. bonyun role well defined auditing process.]
was split into several parts;
one part monitors messages indicating failed login attempts,
another monitors attempts to access privileged accounts,
and so forth.
.PP
In order to understand how forbidding modifications to the kernel
affects the effectiveness of these various parts of the auditing subsystems,
we must very precisely specify what the term \*(Lqauditing\*(Rq
means,
and provide a model in which the tools built
to provide that capability for the system may be analyzed.
This paper attempts to provide such a model and analysis.
It begins by examining what actions \*(Lqauditing\*(Rq
and its associated process \*(Lqlogging\*(Rq encompass,
and presents a general taxonomy based on the effects
of the implementation of each.
We then describe that part of the auditing subsystem
designed to monitor the file system,
analyze its security deficiencies using the model,
and recommend remedies where appropriate.
.NH
Logging vs. Auditing
.PP
Although often used interchangeably,
.I auditing
and
.I logging
mean two different things.
\*(LqTo audit\*(Rq is \*(Lqto examine and verify (as the books of account
of a company or a treasurer's account)\*(Rq,
whereas \*(Lqto log\*(Rq is \*(Lqto make a note or record of
(the speed, progress, performance, or other sequential
detail of something) esp. in a journal
or other record of data\*(Rq
[.websters dictionary.].
So logging is simply making a record;
auditing is analyzing that record.
.PP
Logging is very common in computer science.
Logs provide information that can be used to restore
file systems and databases to consistent states
after crashes
[.gray recovery manager system r database manager,
haerder principles transaction oriented database recovery,
mitchell comparison network based file servers,
sturgis issues design use distributed file system.].
More relevant to this discussion is the use of logging
for security purposes;
this is done in computer systems ranging from those at
class C2 or higher
[.orange book.],
to those not certified secure but
dealing in electronic fund transfer
[.kinnon audit security implications eft.]
or any other type of data processing
[.bowman security auditability.].
Information appropriate to the security of the system is logged;
in most cases,
some type of auditing is performed on the log
(often called an
.I "audit trail"
or
.I "activity log" )
and action consistent with the state of the system
as recorded in the log is taken.
This audit has three steps.
First,
information in the log is
.I reduced
to eliminate irrelevant or redundant data.
The remaining information is
.I analyzed
either to format the data in the log or
to determine whether or not a compromise has occurred
or would occur if a specific action were taken;
and finally,
the audit mechanism
.I notifies
a program or an auditor of the results.
.PP
These distinctions may be emphasized by considering the entry and
exit of people from a secured building.
At the door is a security guard who signs people in and out.
This record is a log,
and the guard is performing the logging function.
At the end of the day,
the main office obtains the logs.
To determine if there is a potential security problem,
a clerk first eliminates all names showing both entry and exit;
this is the reduction step.
He or she then determines if the people still in the building
are authorized to be there at nighttime;
this is the analysis step.
If someone who should not be there is still in the building,
the security office is called and directed
to locate and to escort the person out of the building;
this is the notification step.
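.PP
The three audit steps in the building example can be sketched in a few lines.
This is only an illustration:
the names,
the log format,
and the functions reduce_log, analyze, and notify are hypothetical
and are not part of any tool described in this paper.
.DS L
```python
# Hypothetical sign-in log for the building example; names are invented.
from collections import Counter

def reduce_log(log):
    # Reduction: drop everyone whose entries and exits balance.
    inside = Counter()
    for name, action in log:
        inside[name] += 1 if action == "in" else -1
    return sorted(name for name, count in inside.items() if count > 0)

def analyze(still_inside, night_authorized):
    # Analysis: who remains in the building without authorization?
    return [name for name in still_inside if name not in night_authorized]

def notify(violations):
    # Notification: produce the message passed to the security office.
    if not violations:
        return "no action needed"
    return "escort out: " + ", ".join(violations)

log = [("ann", "in"), ("bob", "in"), ("cy", "in"), ("ann", "out")]
message = notify(analyze(reduce_log(log), night_authorized={"bob"}))
print(message)
```
.DE
.LP
The guard's sign-in sheet corresponds to the list named log;
everything after it is the audit.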
.PP
The distinction being drawn between logging and auditing is very broad;
as another example,
consider the same building with a new rule:
visitors must be escorted by employees who work within the building.
When a visitor arrives,
the guard records personal information and who the visitor is to see
(the logging step).
He or she then determines how to contact that employee,
which may entail calls to numerous people
(the reduction step).
When that person is reached the guard
informs him or her there is a visitor,
and learns whether or not the employee will come
to escort the visitor (the analysis step).
Finally,
the guard informs the visitor of the result (the notification step).
While these are not the conventional uses of the words \*(Lqlogging\*(Rq
and \*(Lqauditing,\*(Rq
they certainly fall under the purview of the definitions above.
.PP
We should at this point distinguish our notion of \*(Lqlogging\*(Rq
and \*(Lqauditing\*(Rq from the orthogonal concepts of
\*(Lqpassive auditing\*(Rq and \*(Lqactive auditing\*(Rq as defined in
[. bonyun role well defined auditing process.].
\*(LqPassive auditing\*(Rq is essentially logging with the expectation
that the log will be available for analysis;
whether or not the log will be analyzed is irrelevant to
our notion of \*(Lqlogging,\*(Rq
since we are separating logging from the auditing
process entirely.
\*(LqActive auditing\*(Rq is the complement of \*(Lqpassive auditing,\*(Rq
and is restricted to determining if the information in the log
constitutes one (or more) of a set of exceptional conditions
and if so,
taking action;
our use of \*(Lqauditing\*(Rq eliminates this restriction entirely
and simply refers to the analysis and taking of action.
Note that action may be taken even if no exceptional condition has occurred;
this may be done to reassure systems administrators that the system is
still functioning.
.PP
Since the file scanning tool is to audit the file system,
it must first log data,
then audit using that log.
For this reason,
an analysis of both the logging phase and the auditing phase
will provide a model with which
we may explain the reasons for the design choices made and
examine the security problems they raise.
.NH
An Analysis of Logging
.PP
Central to security monitoring is the logging mechanism,
which provides the auditing tools with information
about the state of the system.
Recall that a computer system may be represented as a sequence of states;
${roman "{"} S sub 0 , S sub 1 , S sub 2 , ... {roman "}"}$.
Let $s sub i$ be that part of $S sub i$ which is to be logged
(and,
possibly,
audited).
A log is
.I complete
if it records sufficient information to construct $s sub i$
for any $i$.
Further,
a log may record either states
$s sub {t sub 0} , s sub {t sub 1} , s sub {t sub 2} , ...$
or an initial state $s sub {t sub 0}$
and the changes
$DELTA ( s sub {t sub 0} , s sub {t sub 1} ) , DELTA ( s sub {t sub 1} , s sub {t sub 2} ) , ...$
between successive states
(where $j <= t sub j$).
We shall therefore distinguish them by calling the former
.I "state logging"
and the latter
.I "change logging" .
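.PP
The relationship between the two kinds of log can be made concrete
with a small sketch.
It assumes,
purely for illustration,
that the logged part of the state is a table of file modes
and that a change is a pair naming a file and its new mode;
the function names are hypothetical.
.DS L
```python
# Illustrative only: the logged state is modeled as a dict of file modes.
def apply_change(state, change):
    # A change is (key, new_value); None means the key was removed.
    key, value = change
    new_state = dict(state)
    if value is None:
        new_state.pop(key, None)
    else:
        new_state[key] = value
    return new_state

def states_from_change_log(initial_state, changes):
    # Rebuild the successive states from the initial state and the changes.
    states = [dict(initial_state)]
    for change in changes:
        states.append(apply_change(states[-1], change))
    return states

s0 = {"/etc/passwd": 0o644}
change_log = [("/etc/passwd", 0o666), ("/tmp/x", 0o600), ("/tmp/x", None)]
state_log = states_from_change_log(s0, change_log)
print(state_log[-1])
```
.DE
.LP
Here change_log is what a change logging mechanism would record,
and state_log is the corresponding state log derived from it.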
.PP
Assume both mechanisms are used to log information
about a particular computer system that starts in a known state $s sub 0$.
Assume further that all events relating to the states $s sub i$
are recorded by the change logging mechanism.
Then,
if all states are recorded by the state logging mechanism,
the two logs contain equivalent information
(although each state $s sub {t sub i}$ must be derived
from the log produced by the change logging mechanism).
In practice,
though,
several considerations work against this equivalence.
.PP
Directly related to security is the effect of an attacker
attempting to conceal some change to the state of the system.
With change logging,
only the message corresponding to the change from the attack need be altered,
since the alteration will be reflected in subsequent states reconstructed from
the change log.
With state logging,
each log entry subsequent to the attack must be altered
since the change from the attack would otherwise be reflected
in each log entry after the change.
So state logging better records information
that can be used to detect (or reconstruct) an attack.
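.PP
A toy calculation shows the difference in effort.
The audited state is modeled as a single integer
and each change as an increment;
both are simplifying assumptions made only for this sketch.
.DS L
```python
def replay(initial, changes):
    # Return the list of states implied by an initial value and a change log.
    states = [initial]
    for delta in changes:
        states.append(states[-1] + delta)
    return states

honest_changes = [0, 0, 0, 0]      # nothing should change the audited value
attacked_changes = [0, 5, 0, 0]    # the attack is the second change

true_states = replay(10, attacked_changes)
clean_states = replay(10, honest_changes)

# Entries an attacker must rewrite to make each log look clean.
change_edits = sum(a != h for a, h in zip(attacked_changes, honest_changes))
state_edits = sum(t != c for t, c in zip(true_states, clean_states))
print(change_edits, state_edits)
```
.DE
.LP
One entry of the change log must be altered,
but every state log entry after the attack must be altered.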
.PP
Other considerations mitigate this advantage;
for example,
the cost of the logging.
Typically,
logging a state is much more costly than logging a change,
since obtaining $s sub i$ requires all parts of $s sub i$ to be
scanned and recorded,
whereas obtaining the relevant information about the change
requires instrumenting only the system call or the external event handler
causing the change.
As computer systems usually change state very rapidly,
it is not feasible to scan a system every millisecond
because of the impact on users of the system.
So,
state logging is done periodically,
and as a result the state log is not complete.
To reduce the impact upon the system,
this logging is often done at night or when the computer
is unavailable to the general user community.
This leaves a very large window of vulnerability
when an attacker can make changes to the
parts of the system state being logged,
obtain whatever is desired,
and then change those relevant parts to their original values;
unless the logging occurs during the time of the attack,
the attack will be undetected from the logs.
Change logging impacts the users much less,
and while it records specific events that may indicate an attack,
the state of the system at each change can be derived
from the log assuming the initial state of the system is known.
If all events (including external exceptions)
are instrumented and logged,
and the initial state is known,
then the state log derived from the change log would be complete.
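.PP
The window of vulnerability left by periodic state logging
can be illustrated with a short sketch.
The timeline,
the sampling periods,
and the function names are invented for the example.
.DS L
```python
def periodic_snapshots(timeline, period):
    # State logging: record the state only at every period-th instant.
    return {t: timeline[t] for t in range(0, len(timeline), period)}

def detected(snapshots, expected):
    # The audit sees an attack only if some snapshot differs from expected.
    return any(mode != expected for mode in snapshots.values())

# Mode of one audited file at instants 0..9; the attacker loosens it at
# instant 3 and restores it at instant 6 (values are invented).
timeline = [0o644, 0o644, 0o644, 0o666, 0o666,
            0o666, 0o644, 0o644, 0o644, 0o644]

nightly = periodic_snapshots(timeline, period=8)   # samples instants 0 and 8
frequent = periodic_snapshots(timeline, period=2)  # samples 0, 2, 4, 6, 8
print(detected(nightly, 0o644), detected(frequent, 0o644))
```
.DE
.LP
The infrequent scan misses the attack entirely;
the frequent scan catches it,
at the cost of scanning four times as often.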
.PP
Secondly,
in practice the state logging mechanism does not indicate why
the state changed,
but merely the new state of the system.
A change log,
on the other hand,
indicates why the state changed and preserves enough information
to determine the new state.
This provides information that a system security officer can use
to determine if the sequence of events was an attempted attack,
an error,
or indicates that a user bears further monitoring.
So in addition to monitoring the state,
change logging may also be used to monitor users' actions which,
in turn,
may be used to detect attempts to thwart security
[.denning intrusion detection model,
lunt real time intrusion detection prototype.].
.PP
Most implementations of a change logging mechanism
focus on tracking events that indicate an attack,
and for that reason their implementation
either makes no record of the
initial state $s sub 0$ or assumes $s sub 0$ is secure.
This means that the system may initially be in a nonsecure state,
and an attacker could gain control of the computer.
While the steps the attacker takes would show in the log,
unless additional precautions are taken,
the attacker could simply erase the log.
.PP
As
[.bonyun role well defined auditing process.]
points out,
the events being logged must be chosen carefully,
because if the change logging mechanism does not instrument all aspects of the
system which could affect $s sub i$,
then it is not possible to derive any state accurately from the change log,
and indeed events indicating an attack on the system may not be recorded.
Hence,
logging mechanisms should always be designed in synchrony with the computer
system so they are an integral part of both the structure and the
components of the system.
.PP
Because of these considerations,
a combination of the two types of logging is in practice the most effective.
A state log of the relevant parts of the system state
should be made when the system is booted,
and a change log thereafter.
This allows both the monitoring of user actions via the change log
and the analysis of the state of the system by combining the initial state
with the changes obtained from the change log.
.NH
Types of Auditing
.PP
In this section,
we use
.I auditing
to refer to those activities needed
to reduce, analyze,
and (possibly) act on the information in an activity log.
If the results of the analysis are simply communicated
to a system security officer,
the audit is
.I "informative" ;
if the results dictate some immediate,
automatic response or action within the system itself
and without any human intervention,
the audit is
.I "responsive" .
Responsive auditing
is very common in statistical database security;
for example,
a mechanism that keeps track of values obtained using random sample queries
and prevents compromise by returning the same answers
whenever a query is repeated
[.denning cryptography data security.]
is a responsive auditing mechanism;
the action taken is to return a precomputed result,
rather than the result that would be obtained by rerunning the
random sample query.
Informative auditing is used generally in operating system security,
because the number of events logged for security purposes is so great
that no responsive auditing mechanism could cope with them
and yet allow the system to maintain a reasonable response or throughput time.
.PP
In the simplest types of computer systems,
both auditing mechanisms lie on the host being audited.
Unless this host has a secure trusted computing base,
there is little if any guarantee that a determined attacker cannot
interfere with the logging and/or auditing mechanisms
and defeat the notification step of the audit.
The quick response is to move the audit mechanism
to another machine.
This introduces a new angle of attack,
namely via the transmission software and hardware;
and here the difference between informative and
responsive auditing becomes quite important.
.PP
Assume the auditing mechanisms lie on a remote,
physically secure computer called the
.I "audit machine"
(which may very well be a personal computer or a workstation).
The logs are also maintained on the audit machine,
and logging is done by writing from another computer
over a secure communication channel.
All reductions and analyses are done on the audit machine.
.PP
Consider first the security of the transmissions from the main computer
to the audit machine.
As the audit machine is physically secure,
the attacker (presumably) cannot penetrate the facility
and erase or alter the log.
Since the communications channel into the audit machine does not allow
previously sent messages to be erased,
the attacker cannot erase the log.
If a trusted authentication mechanism is used to ensure that messages
sent to the audit machine are genuine log messages,
the attacker cannot even forge log entries or other messages.
As the auditing software is not resident on the main computer,
the attacker cannot tamper with it.
This leaves two vulnerable areas:
the logging software (which is resident on the main machine)
and the notification mechanism.
.PP
The logging software may be attacked in one of several ways,
the result being that logging is disabled
(which the audit machine can detect easily),
many genuine but spurious messages are produced
(and these messages will be eliminated during the reduction phase)
or the messages produced will be incorrect and misleading.
To prevent this requires the logging software to be protected
which (as we have said) is possible only on a secure machine.
.PP
The importance of the distinction
between responsive and informative auditing
lies in the interaction of the computing system with the auditing subsystem.
There are two aspects to this.
First,
the computer system must send information
to the auditing subsystem when an informative auditing mechanism is in place;
but with a responsive auditing mechanism,
the auditing subsystem must be able to send information back
to the (relevant component of the) computer system as well.
This has several implications for system security.
.PP
The ability of an attacker to defeat the notification process depends
to a large degree on whether informative or responsive auditing is being used.
If informative auditing is being used,
the notification phase can proceed through the audit machine
and not involve the monitored host at all.
Then the attacker cannot alter the results of the audit by
tampering with a message from the auditing system to the auditor
except by physically intercepting it because
the message is not sent on the audited computer;
it is composed and printed on the audit machine.
If the responsible person has a computer available,
a more secure medium than printed messages can be used;
the auditing software can write the information,
encrypted using a public-key cryptosystem,
onto a floppy disk or to tape.
This medium is then delivered to the auditor,
who loads the data onto his or her computer,
decrypts the message,
and acts accordingly.
Since public-key cryptosystems can be used to ensure both privacy and
authenticity,
the auditor would have a firm basis for accepting the results of the audit.
The attacker could not tamper with the results
without the auditor learning about it.
.PP
Responsive auditing requires the results of the audit
to be transmitted back to the audited computer
so that it may act upon the result.
This means that the communications channel between the audited computer
and the audit machine is two-way,
and an attacker may attack two pieces of software:
the logging routine (as noted)
and the routines that act upon the results of the audit.
So,
responsive auditing schemes have a greater window of vulnerability
than informative auditing schemes.
.PP
The second aspect of
the interaction of the computing system with the auditing subsystem
that affects security belongs to the realm of human factors.
If the auditing process is informative,
a human must sift through the results to
determine what is and is not significant.
Experience has shown that if the ratio of what is significant
to what is not significant is low,
humans may very well miss important results.
Further,
if the output is confusing or poorly designed,
the administrators may very well miss something.
But since with responsive auditing an automatic mechanism
does the winnowing,
this \*(Lqboredom factor\*(Rq does not apply
and less care need be taken with the presentation of the results.
.PP
Performance considerations touch on both these aspects.
Informative auditing can be performed during off hours
or when the system is not available to regular users,
so its impact on the system from the users' perspective is minimal.
Responsive auditing,
however,
must be performed after a query or command makes
an entry (or set of entries)
but before any action is taken.
Since the audit cannot be deferred,
it must be able to run when the system is being used by regular users,
and will therefore impact the performance at once.
Worse,
since the audit takes place whenever the command or query is issued,
the impact may be continual rather than occasional.
It may be possible to ameliorate this impact
by preserving the audit trail in reduced form,
so whenever an audit is made,
only the entries made since the last audit need be reduced
and combined with previously reduced data.
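.PP
One way such an incremental reduction might look is sketched below.
The log format (pairs of user name and login success),
the threshold,
and the class ReducedTrail are all assumptions made for the example.
.DS L
```python
from collections import Counter

class ReducedTrail:
    # Keeps the audit trail in reduced form: failed logins counted per user.
    def __init__(self):
        self.failures = Counter()
        self.next_entry = 0   # index of the first log entry not yet reduced

    def audit(self, log, threshold):
        # Reduce only the entries added since the previous audit.
        for user, ok in log[self.next_entry:]:
            if not ok:
                self.failures[user] += 1
        self.next_entry = len(log)
        return sorted(u for u, n in self.failures.items() if n >= threshold)

log = [("root", False), ("amy", True), ("root", False)]
trail = ReducedTrail()
first = trail.audit(log, threshold=3)
log.append(("root", False))
second = trail.audit(log, threshold=3)
print(first, second)
```
.DE
.LP
The second audit examines only the one new entry,
yet its result reflects the whole log.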
.PP
Human factors issues make designing the output from an informative auditing
mechanism more difficult than that from a responsive auditing mechanism.
The latter output will have a well-defined format dictated
by the program processing the results,
and extraneous output can simply be ignored by the recipient.
The former must be easy to read,
succinct,
and clear.
This often leads to the inclusion of mechanisms which
suppress irrelevant information,
since human beings will tend to miss important information
present among a mass of irrelevant information.
Yet such mechanisms usually suppress important information unintentionally,
and so present a danger that must be dealt with during the mechanism's design.
.PP
A major problem of both types of auditing
systems is to preplan precisely what characteristics
are to be audited.
As observed in
[.bonyun role well defined auditing process.],
people designing audit systems \*(Lq...tend,
from time to time,
to create their own special purpose [auditing systems] designed
only\l'|0\(ul'
to satisfy their own initial requirements.\*(Rq
An auditing package may satisfy all needs for a time,
but when applied to a new situation,
fail miserably.
As an example,
consider a responsive audit mechanism for a small statistical database
that works by creating a matrix for queries,
and applying linear analysis to the matrix to determine if answering a query
will allow the questioner to deduce an individual record
[.chin auditing inference control statistical databases.].
Such an audit tool can determine if the database will be compromised in time
$O( n sup 2 )$,
which for a small $n$ is acceptable.
But as the number of entries grows,
the time needed for the audit mechanism to analyze the rows of the matrix
for linear independence becomes unacceptably high.
Notice that this problem is less serious with informative audit mechanisms,
because they do not take action to block commands or queries;
the only people impacted are the recipients of the audit results.
.PP
Finally,
adding a security monitoring system as an afterthought
frequently produces serious problems.
(In fact,
this is true for most security tools.)
Such systems can in general be evaded far more easily
than can security monitoring mechanisms designed into the system.
As an example,
consider a file monitoring program
which logs changes to files on the system.
If the program is not built into the kernel,
then it must use a special library to make entries in the log,
and a clever attacker can avoid linking that library
(by creating one of his or her own,
which issues the appropriate supervisor call without
making an entry in the log,
for example).
If the program is built into the kernel of the system,
though,
it cannot be (easily) subverted,
because an attacker must replace the kernel with one that does not
monitor \*- a decidedly nontrivial task!
.NH
Examples and Discussion of the Model
.PP
Some examples will serve to make the ideas in the model more concrete.
So,
in this section we shall consider some logging and auditing schemes,
place them within the above model,
and discuss some security problems with each.
.NH 2
Statistical Database Control: Random-Sample-Queries
.PP
This method,
introduced in
[.denning secure statistical databases random sample queries.],
takes a query $q$ concerning some class $C$ of records in the database
and applies to each record $r^\(mo^C$ a selection function $f(C,^r)$
to determine whether or not $r$ is to be used to compute the response to $q$.
Because the control determines the state of each record in $C$
with respect to $f$
(that is,
whether or not $f(C,^r)~=~1$),
it uses state logging;
because the auditing mechanism must determine whether or not
the record is to be used immediately
and respond to the database query manager,
the form of auditing is responsive.
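.PP
A minimal sketch of the idea follows.
The hash-based selection function shown here is an assumption
chosen only because it is deterministic;
it is not the construction given in the cited work,
and the record values are invented.
.DS L
```python
import hashlib

def f(class_name, record_id, p=0.75):
    # Hypothetical selection function: deterministic in (C, r), so the same
    # query always samples the same records. Returns 1 to use the record.
    key = (class_name + ":" + str(record_id)).encode()
    digest = hashlib.sha256(key).digest()
    return 1 if digest[0] < int(p * 256) else 0

def sampled_mean(class_name, records):
    # records maps record id -> value for the records in class C.
    chosen = [v for r, v in records.items() if f(class_name, r) == 1]
    return sum(chosen) / len(chosen) if chosen else None

salaries = {1: 40.0, 2: 55.0, 3: 61.0, 4: 47.0, 5: 52.0}
first = sampled_mean("dept=math", salaries)
again = sampled_mean("dept=math", salaries)
print(first == again)
```
.DE
.LP
Because the selection depends only on the class and the record,
repeating a query cannot be used to average away the sampling.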
.NH 2
Statistical Database Control: Query-Set-Overlap
.PP
This control records all sets $D sub i$, $i ^=^ 1 ,..., n$
about which queries have been answered,
and answers a new query about a set $C$ if and only if
the number of records in $C ^ \(ca ^ D sub i$ is less than some parameter
(for all $i ^=^ 1 ,..., n$).
Because the logging mechanism must record each query set,
and hence the changes to the set of acceptable queries,
it uses a change logging mechanism,
and since the control must work immediately,
it uses responsive auditing.
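.PP
The control can be sketched as follows.
The class name,
the summing query,
and the data are hypothetical;
only the overlap test itself comes from the description above.
.DS L
```python
class OverlapControl:
    # Change log of answered query sets plus the overlap test.
    def __init__(self, limit):
        self.limit = limit        # the parameter bounding each overlap
        self.answered = []        # the log: one frozenset per answered query

    def ask(self, query_set, values):
        c = frozenset(query_set)
        # Responsive audit: refuse if any overlap reaches the limit.
        if any(len(c & d) >= self.limit for d in self.answered):
            return None
        self.answered.append(c)   # log the change before releasing the answer
        return sum(values[r] for r in c)

values = {1: 10, 2: 20, 3: 30, 4: 40}
control = OverlapControl(limit=2)
a = control.ask({1, 2, 3}, values)   # answered: no earlier queries
b = control.ask({2, 3, 4}, values)   # refused: shares 2 records with the first
c = control.ask({4}, values)         # answered: shares none with the first
print(a, b, c)
```
.DE
.LP
Each call compares the new query set with every logged set,
so the cost of the audit grows with the length of the log.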
.NH 2
Computer System Monitoring: File System Scanner
.PP
A set of programs scans file systems every night,
recording characteristics of the files and transmitting them to
another computer,
where they are compared to expected values of the characteristics.
The audit system notifies administrators of any problems
via electronic mail on the machine on which the audit takes place.
The logging done here is state logging because it captures
parts of the state of the relevant files,
and the auditing is informative because no action,
other than notifying administrators,
is taken.
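.PP
The logging and comparison steps of such a scanner
might be sketched as below.
The characteristics recorded here
(permission bits, size, and a SHA-256 digest)
and the function names are assumptions for illustration;
transmission to the second computer and the mailing of results are omitted.
.DS L
```python
import hashlib
import os
import stat
import tempfile

def scan(paths):
    # Logging step: record a few characteristics of each file.
    log = {}
    for path in paths:
        info = os.stat(path)
        with open(path, "rb") as handle:
            digest = hashlib.sha256(handle.read()).hexdigest()
        log[path] = {"mode": stat.S_IMODE(info.st_mode),
                     "size": info.st_size, "sha256": digest}
    return log

def audit(observed, expected):
    # Audit step: report every characteristic that differs from expectation.
    report = []
    for path, want in sorted(expected.items()):
        have = observed.get(path)
        if have is None:
            report.append(path + ": missing")
            continue
        for key in sorted(want):
            if have[key] != want[key]:
                report.append(path + ": " + key + " changed")
    return report

with tempfile.TemporaryDirectory() as root:
    target = os.path.join(root, "passwd")
    with open(target, "w") as handle:
        handle.write("root:x:0:0")
    expected = scan([target])
    with open(target, "a") as handle:
        handle.write("evil:x:0:0")
    report = audit(scan([target]), expected)
print(report)
```
.DE
.LP
The report lists what changed but takes no corrective action,
which is what makes the audit informative rather than responsive.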
.NH 2
Computer System Monitoring: Auditing Subsystem
.PP
The subsystem used here is described in detail in
[.picciotto design effective auditing subsystem.]
and involved instrumenting the kernel of a workstation
to record specific system calls,
and from that log produce an audit trail.
This clearly is change logging,
and since its primary purpose is to allow reconstruction
of events culminating in a breach of system security,
the auditing is informative only.
.NH 2
Backing Up Computer Systems
.PP
The data on computer systems is often backed up by copying the data
from the computer system to some
.I "backup medium"
such as tapes.
This is an example of logging without auditing.
Furthermore,
since all files may be recorded in an \*(Lqepoch dump\*(Rq,
or changes to the files since the last dump may be saved in an
\*(Lqincremental dump\*(Rq,
the logging may be either state or change logging.
.NH 2
Discussion of the Examples
.PP
Both query set overlap controls and the auditing subsystem
assume that the change log is accurate;
if an attacker is able to subvert either system's log,
reconstructing a successful attack on either system
might be impossible.
For this reason,
the logging mechanism must be an integral part of the system.
The auditing subsystem in fact recognizes this
and requires that only authorized users be able to access the log
if it is stored locally;
since the subsystem is implemented on a workstation
with enhanced security features
[.cummings compartmented mode workstation.],
the designers believe the underlying computing base
provides sufficient security.
Similarly,
if query set overlap is used\(dg,
the log must be kept in a protected area (either locally or remotely).
.FS
.IP \(dg \w'\(dg\0'u
Since the auditing system for a query set overlap control
would have to compare the current query with every past query,
it should be noted that
this technique is infeasible under most practical conditions.
.FE
This would require some trusted communication path or trusted computing base.
On the other hand,
to defeat random sample query controls
and the file system scanner,
an attacker would have to tamper with every invocation
of the state function $f$ for a particular record $r$,
or with every message involving a set of files,
to prevent a change from being entered into the log;
this is certainly possible,
but can probably be more easily detected than a change
to just one log message.
.PP
Both statistical database controls require that the auditing mechanism
respond to entries in the log very quickly,
to determine whether or not the relevant statistic may be released;
in the process,
the auditing system transmits a response to the system.
If the system has been successfully penetrated,
the attacker can alter this response to whatever is desired;
this would allow the attacker to obtain records
which should be concealed.
(The records might be on a remote host,
and so the attacker may not be able to get access
to them directly even if the machine on which
the statistical database resides is penetrated.)
The computer system monitoring tools do not suffer from this
vulnerability if the audit is performed on a machine
other than the one being audited and the results
are transmitted to the relevant people using a physically
secure printer that cannot be tampered with.
In this sense,
informative audits are less susceptible to compromise
because the window in which an attacker can alter logs or results
is smaller.
.NH
Overview of the File Monitoring Tool's Design
.PP
The charter of the work on the file monitoring package
(as that part of the retrofitted auditing package
which deals with files is known)
was to provide a tool that would enable systems administrators
to examine the state of the files on the system
and determine which,
if any,
may have been changed accidentally
or may provide an attacker with means to gain entry
into the system.
For instance,
the tool should be able to detect a password file
accidentally made writeable by everyone,
since this allows anyone to obtain any desired privileges;
similarly,
if a system program has been altered,
this should be reported.
.PP
Several policy statements and site considerations guided
the design and development of this auditing tool.
(It should be emphasized that these decisions were made
long before the audit tool was even discussed,
and the site had been in operation for many years.)
The first constraint was that none of the systems to be audited
provided a trusted computing base,
so the tool would rely on untrusted systems programs and libraries;
that the goal was to detect neither intruders nor deliberate,
malicious tampering with files is a recognition of this limit.
.PP
The second constraint was that no modifications to the kernel could be made.
This policy decision sprang from the computing branch's
primary purpose being to support aerodynamical simulations,
not develop secure computers;
it was determined that having to integrate changes into each new kernel
sent by the vendors would require more time and manpower than the
project could commit.
The effect of this was to force the tool to be designed around
a state logging mechanism,
which in turn
limited the times when the tool could be run
because of the impact on system performance indicated by the model.
.PP
The model further indicates that the logs produced by the state logging
mechanism will be incomplete.
This is true,
but barring modifications to the kernel
any change logging mechanism will also produce incomplete logs.
Essentially,
obtaining a complete change log would require every supervisor call and
every external event handler within the kernel to be instrumented.
This could not be done reliably for two reasons.
First,
the external event handlers resided within the kernel
and so were not accessible to any other program.
Secondly,
even though there is a user-level library that translates high-level
language constructs into machine language instructions generating
supervisor calls,
and this could be instrumented to record file accesses,
any competent programmer could write a private version of that library
thus evading the overhead of the instrumented version.
Since the auditing process would reduce the (incomplete) change log
to determine the state of each file being monitored,
the state obtained by the reduction may bear little if any resemblance
to the actual state.
Systems administrators could easily obtain a false sense of confidence,
reasoning \*(Lqsince all events are logged, the reported state is accurate;\*(Rq
and if not they would still have to look at the file to determine its
actual state.
So such a mechanism makes the tool worse than useless;
it makes it flatly misleading and therefore dangerous.
This touches on the user interface of the package.
.PP
The third constraint dealt specifically with this interface.
The intended audience was system administrators,
all of whom wanted to spend as little time as possible dealing with
the results of any audit.
The statement of those results had to be clear and succinct.
This was complicated by the evolving nature of the systems;
since the organization was changing as the requirements of the user
community grew and evolved,
a responsive audit mechanism \-
one which would alter the characteristics or contents of the files if
they were detected as changed \-
was unacceptable.
The staff simply could not update a master file indicating
what changes were made to which files.
A responsive auditing mechanism would undo these changes;
an informative mechanism,
merely report them.
So,
the design was to use an informative auditing mechanism.
.PP
Because the auditing mechanism was to be informative,
the user interface had to be designed quite carefully;
in particular,
only important changes should be reported,
lest the recipients of the results begin to skim those reports.
On the other hand,
the interface could not delete too many changes
because it might eliminate important information.
The need to balance these two opposing tendencies
led to placing this balancing under the control of
the administrators of each system;
the mechanisms provided will be discussed at length.
.PP
In summary,
the primary goal of this tool is to check the file systems for conditions
which may allow the alteration of files without authorization.
If system files are altered accidentally,
the next run of the subsystem will detect and report this;
however,
if the change resulted from actions of a reasonably sophisticated attacker,
that attacker can easily evade this mechanism.
It is not a replacement or a substitute for a well-integrated
auditing subsystem that is designed into the kernel;
it is simply a supplement to the auditing facilities that currently
exist on the NAS machines.
.PP
We shall now describe in some detail how this file monitoring package works;
this will make the security analysis to follow more comprehensible.
As with the description of the model,
let us first look at the logging phase,
then at the auditing phase.
.NH
An Implementation of Logging
.PP
The logging software consists of several shell scripts and
.UX
programs;
in general,
existing systems programs were used rather than rewriting them.
(This is in line with the philosophy of the
.UX
based systems on which the software was to run
[.ritchie thompson unix time sharing system.].)
The file characteristics recorded in the log are listed in table 1;
all are easily obtainable from the system.
.KS
.Tp
.TS
center, box;
c s
l l.
Table 1. Recorded Characteristics of Files
\f2what\fP	\f2meaning\fP
=
name	file name
type	what it is (\f2e\fP.\f2g\fP. directory)
mode	protection mask
links	number of (hard) links
user	owner of the file
group	group of the file
size	number of bytes in the file \f2or\fP major, minor device numbers
date	date of last modification to data
.TE
.KE
.LP
The type is recorded because it may indicate whether or not a change
is important and if so where to look for the change;
for example, if the file named is a directory,
then one of the files it contains has been moved,
or a file has been added to or deleted from it,
but if the file is a regular file it must be read
to determine what (if indeed anything) has changed.
Altering the protection mask may allow users to change the
contents of a file even though site policy is to disallow that;
the same holds for altering the user and group characteristics.
Changes in the size and the time of last modification
may indicate that the file is being tampered with,
and changes in the number of links to the file
may reflect the planting of a Trojan horse.
.PP
The characteristics associated with each file are written on a single line
and then sorted by file name.
The resulting file constitutes the log
of the state of the file tree at that time.
Some sample log entries (with characteristics
listed in the order in table 1) are:
.DS B
.Tp
.ta 8n 16n 24n
.cs R 25m
.ss 25m
\!.cs R 25m
\!.ss 25m
\!\!.cs R 25m
\!\!.ss 25m
\&.       d  0755  2  root  root   1024   May  5, 1987 at 23:53:50
passwd  -  4755  3  root  staff  27648  Jan  9, 1987 at 18:23:47
ps      -  2755  1  root  kmem   38912  Dec 10, 1986 at 10:26:32
sh      -  0755  1  root  root   23552  Jan  9, 1987 at 18:25:44
.sp
\!\!.ss
\!\!.cs R
\!.ss
\!.cs R
.cs R
.ss
.pT
.ce
Figure 1.  Sample log entries.
.DE
.LP
In the figure,
the first entry is for a directory (\*(Lqd\*(Rq);
other known types are regular file (\*(Lq-\*(Rq),
block and character special files (\*(Lqb\*(Rq and \*(Lqc\*(Rq),
symbolic links (\*(Lql\*(Rq),
named pipes (\*(Lqf\*(Rq),
and sockets (\*(Lqs\*(Rq).
Unknown types are also reported (as \*(Lq?\*(Rq).
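.PP
The layout of the entries in figure 1 can be illustrated by a small parser.
The Python fragment below is only a sketch and is not part of the package;
the names
.I LogEntry
and
.I parse_entry
are hypothetical,
and the fragment assumes that file names contain no blanks
and that the device numbers of special files are written without blanks.
.DS B
.Tp
```python
from typing import NamedTuple

# Type codes described in the text for figure 1.
KNOWN_TYPES = {"d", "-", "b", "c", "l", "f", "s", "?"}

class LogEntry(NamedTuple):
    name: str
    type: str
    mode: str
    links: int
    user: str
    group: str
    size: str   # byte count, or device numbers for special files
    date: str

def parse_entry(line: str) -> LogEntry:
    """Split one log line into the eight fields of table 1."""
    # The first seven fields are blank-separated; the date is the rest.
    parts = line.split(None, 7)
    if len(parts) != 8:
        raise ValueError("malformed log entry: " + repr(line))
    name, ftype, mode, links, user, group, size, date = parts
    if ftype not in KNOWN_TYPES:
        raise ValueError("unknown type code: " + repr(ftype))
    date = " ".join(date.split())   # collapse runs of blanks
    return LogEntry(name, ftype, mode, int(links), user, group, size, date)
```
.pT
.DE
.LP
Keeping the mode and size as strings rather than numbers
makes a later field-by-field comparison with a master file straightforward.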
.PP
Several programs are involved in this scan.
When a host is to have part of its file system audited,
the auditing process calls the logging program
either directly or using a remote procedure call
(in
.UX
shell programming,
this is done using either of the commands
.I rsh
or
.I remsh
[.bsd reference manual,
system five reference manual.]).
This program,
.I auditls ,
takes several parameters.
The first names the root of the file hierarchy to be scanned,
the second indicates whether the logging process is to descend
that tree recursively,
and the other parameters indicate the location of programs
to be used in the logging process.
The final parameter indicates which class of files
is to be examined
(the classes are
.I setuid ,
.I setgid ,
.I "block device" ,
.I "character device" ,
any combination of the above classes,
or
.I "all files" ).
.PP
If the logging is to recurse,
the names of all the files in the file tree are printed using
.I find (1);
if not,
the names of all the files in the root directory of the tree
are listed using
.I ls (1).
In either case the resulting list is passed
through a program that eliminates files not of the requested class;
then this list is passed through a second program that prints the
requisite characteristics.
The list of these characteristics is the desired log.
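.PP
The listing and filtering stages just described can be sketched as follows.
This is an illustration in Python rather than the shell scripts and
programs the package actually uses;
the function names are hypothetical,
and the choice of which names a nonrecursive listing includes is an assumption.
.DS B
.Tp
```python
import os
import stat

def file_classes(st_mode: int) -> set:
    """Return the classes (as named in the text) a file mode belongs to."""
    classes = set()
    if st_mode & stat.S_ISUID:
        classes.add("setuid")
    if st_mode & stat.S_ISGID:
        classes.add("setgid")
    if stat.S_ISBLK(st_mode):
        classes.add("block device")
    if stat.S_ISCHR(st_mode):
        classes.add("character device")
    return classes

def list_names(root: str, recurse: bool) -> list:
    """List names under root: the whole tree if recurse, else one level."""
    if not recurse:
        return sorted(os.path.join(root, n) for n in os.listdir(root))
    names = [root]
    for dirpath, dirnames, filenames in os.walk(root):
        names.extend(os.path.join(dirpath, n) for n in dirnames + filenames)
    return sorted(names)

def select(names, wanted=None):
    """Keep names in any wanted class; wanted=None stands for all files."""
    kept = []
    for name in names:
        mode = os.lstat(name).st_mode   # lstat: do not follow links
        if wanted is None or file_classes(mode) & set(wanted):
            kept.append(name)
    return kept
```
.pT
.DE
.LP
For instance,
passing the classes setuid and setgid as
.I wanted
keeps every file that has either bit set.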
.PP
Obtaining the characteristics of a file from the system is
quite straightforward.
All
.UX
files have associated with them the following structure:
.DS B
.Tp
.ta 8n 16n 24n 32n 40n 48n 56n 64n
struct stat {
	dev_t	st_dev;		/* device inode resides on */
	ino_t	st_ino;		/* this inode's number */
	u_short	st_mode;	/* protection */
	short	st_nlink;		/* number of hard links to the file */
	short	st_uid;		/* user ID of owner */
	short	st_gid;		/* group ID of owner */
	dev_t	st_rdev;		/* the device type, for inode that is device */
	off_t	st_size;		/* total size of file, in bytes */
	time_t	st_atime;	/* file last access time */
	time_t	st_mtime;	/* file last modify time */
	time_t	st_ctime;	/* file last status change time */
	long	st_blksize;	/* optimal blocksize for file system i/o ops */
	long	st_blocks;	/* actual number of blocks allocated */
};
.sp
.pT
.ce
Figure 2.  \f2Stat\fP structure for file characteristics
.DE
.LP
The relevant program has options to obtain any of the information
from that structure,
so if it is desired to include in the log additional characteristics
found in that structure,
the modification would be trivial.
One characteristic that could not be obtained trivially,
however,
is a cryptographic checksum;
introducing that poses operational problems that will be
discussed in a later section.
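.PP
As an illustration of how directly the characteristics of table 1
map onto the fields of figure 2,
the following Python sketch builds one record from the result of
.I lstat .
It is not the program used by the package;
the function names are hypothetical,
numeric user and group IDs stand in for names,
and the date layout only approximates that of figure 1.
.DS B
.Tp
```python
import os
import stat
import time

TYPE_CODES = [
    (stat.S_ISDIR, "d"), (stat.S_ISREG, "-"), (stat.S_ISBLK, "b"),
    (stat.S_ISCHR, "c"), (stat.S_ISLNK, "l"), (stat.S_ISFIFO, "f"),
    (stat.S_ISSOCK, "s"),
]

def type_code(st_mode: int) -> str:
    """Map the file type bits of st_mode to a one-letter code."""
    for predicate, code in TYPE_CODES:
        if predicate(st_mode):
            return code
    return "?"

def characteristics(path: str) -> list:
    """Return the table 1 fields for path, in the order of table 1."""
    st = os.lstat(path)          # lstat: do not follow symbolic links
    if stat.S_ISBLK(st.st_mode) or stat.S_ISCHR(st.st_mode):
        size = "%d,%d" % (os.major(st.st_rdev), os.minor(st.st_rdev))
    else:
        size = str(st.st_size)
    when = time.strftime("%b %d, %Y at %H:%M:%S",
                         time.localtime(st.st_mtime))
    return [
        os.path.basename(path) or path,
        type_code(st.st_mode),
        "%04o" % stat.S_IMODE(st.st_mode),
        str(st.st_nlink),
        str(st.st_uid),          # numeric IDs; name lookup omitted
        str(st.st_gid),
        size,
        when,
    ]
```
.pT
.DE
.LP
Adding another field of the structure to the record,
such as the time of last status change,
would take one more line in the returned list.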
.PP
In order to prevent the accidental invocation of a Trojan horse,
all programs used by the shell script
.I auditls
are invoked with full path names;
at no time does that program ever have to look
at a search path to locate a command.
On different versions of
.UX ,
the programs reside in different locations;
for example,
in Berkeley systems,
.I test
(which determines whether
.I auditls
is to recurse down the file tree)
is located in the directory \*(Lq/bin\*(Rq
but on System V systems,
it is built into the command interpreter.
Further,
at least two of the programs must be installed
at the time the file monitor package is installed
on the audit machine.
So,
the names of the programs
passed to
.I auditls
are specified in a per-host file called
.I Environ
which is configured by the administrator
at the time that host is added to the audit system.
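.PP
The paper does not give the layout of the
.I Environ
file,
so the following Python sketch simply assumes one line per program
of the form \*(Lqname=/full/path\*(Rq;
both that layout and the name
.I load_environ
are hypothetical.
The point illustrated is the check that every program is given by a
full path name,
so that no search path is ever consulted.
.DS B
.Tp
```python
import os

def load_environ(text: str) -> dict:
    """Parse hypothetical 'name=/full/path' lines into a lookup table."""
    programs = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue                 # skip blank lines and comments
        name, sep, path = line.partition("=")
        name, path = name.strip(), path.strip()
        if not sep or not os.path.isabs(path):
            # A relative name would force a search-path lookup.
            raise ValueError("full path name required: " + repr(line))
        programs[name] = path
    return programs
```
.pT
.DE
.LP
Rejecting a relative name when the file is read,
rather than when the program is invoked,
reports a misconfigured host before any part of the audit has run.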
.PP
Another problem arises when the host being audited
is not the one on which the audit software is running.
Since in this case the remote execution programs
.I rsh
or
.I remsh
must be used,
and the network software is often unreliable,
the logging connections to remote machines on occasion hang;
since the audits of various machines are done sequentially,
this stops the entire auditing process.
To overcome this problem,
a timeout parameter was added to close the connection after a set period
of time when no output was obtained.
This partially solved the problem,
but it also introduced a new one which we shall discuss in the next section.
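.PP
A timeout of this kind,
one that fires only after a period with no output,
can be sketched in Python as follows.
The fragment is an assumption about how such a timeout might be built,
not the mechanism used in the package,
and the name
.I run_with_idle_timeout
is hypothetical;
it relies on the
.I select
call and so assumes a
.UX -like
host.
.DS B
.Tp
```python
import os
import select
import subprocess

def run_with_idle_timeout(argv, idle_seconds):
    """Run argv; give up if no output arrives for idle_seconds.

    Returns (output_bytes, timed_out).
    """
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE,
                            stdin=subprocess.DEVNULL)
    fd = proc.stdout.fileno()
    chunks = []
    timed_out = False
    while True:
        ready, _, _ = select.select([fd], [], [], idle_seconds)
        if not ready:            # silence for too long: assume a hang
            timed_out = True
            proc.kill()
            break
        data = os.read(fd, 65536)
        if not data:             # end of file: the command finished
            break
        chunks.append(data)
    proc.stdout.close()
    proc.wait()
    return b"".join(chunks), timed_out
```
.pT
.DE
.LP
The sketch returns whatever output arrived together with a flag saying
whether the timeout fired,
so a caller can tell a log that was cut short by the timeout
from one that ended normally.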
.NH
An Implementation of Auditing
.PP
The auditing phase begins
when the output from the logging phase is obtained.
Associated with each file system to be audited is a
.I "master file"
which contains a log of the expected characteristics
of each file being scanned.
These files are initially generated by running the logging program and
having the administrator inspect them.
The logs are compared to these master files;
if they differ,
the files in the master list but not in the log have been deleted;
the files not in the master list but in the log have been added;
and the files in both whose characteristics differ have been changed in some way.
The audit program constructs an electronic letter
listing the characteristics of each file in these three groups
(if the characteristics have changed,
both the old and new ones are shown;
see figure 3)
and mails this letter to the system administrator and,
if so instructed,
to the users who own (or used to own) the file.
The letter presents the data in such a way that
the recipient can tell at a glance what has changed:
.DS B
.ps 8
.vs 10
.ta 8n 16n 24n
.cs R 25m
.ss 25m
\!.cs R 25m
\!.ss 25m
\!\!.cs R 25m
\!\!.ss 25m
From root Wed May 20 11:05:55 1987
Received: by hydra.riacs.edu (4.12/2.0N)
           id AA07912; Wed, 20 May 87 11:05:53 pdt
Message-Id: <8705201805.AA07912@hydra.riacs.edu>
Date: Wed, 20 May 87 11:05:53 pdt
From: root <root>
To: mab
Subject: Audit report for hydra.riacs.edu:/tmp

Audit date:  Wed May 20 11:05:40 PDT 1987
Host system: hydra.riacs.edu
File system: /tmp
Options: update-master-list,nonrecursive
Copies sent to: mab

The following files have been added:

file name  type  mode  links  user  group  size            date
---------  ----  ----  -----  ----  -----  ----            ----
Am10132       -  0644      1  mab   root      0  May 19, 1987 at 07:27:37

The following files have been deleted:

 file name   type mode  links   user   group  size            date
 ---------   ---- ----  -----   ----   -----  ----            ----
Am10130        -  0644      1  mab     root      0  May 19, 1987 at 07:27:37
dqms007633     -  0666      1  daemon  root    118  May 20, 1987 at 11:02:38

The following files have been changed; the previous attributes
are shown on the line with (old), and the current attributes
are shown on the line with (new):

    file name      type  mode  links  user  group  size            date
    ---------      ----  ----  -----  ----  -----  ----            ----
(old) .               d  0777      5  root  root   3072  May 20, 1987 at 11:02:45
(new) .               d  0777      5  root  root   3072  May 20, 1987 at 11:05:24
(old) newsa007443     -  0600      1  mab   root   6936  May 20, 1987 at 10:59:31
(new) newsa007443     -  0600      1  mab   root   6936  May 20, 1987 at 11:04:53
\!\!.ss
\!\!.cs R
\!.ss
\!.cs R
.cs R
.ss
.sp
.ps \n(PS
.vs \n(VS
.ce
Figure 3.  Sample output from an audit
.DE
.LP
Then if so requested,
the log becomes the new master file.
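.PP
The three-way comparison of the log with the master file
can be sketched in a few lines of Python.
This is an illustration only;
the name
.I compare
and the representation of each file as a mapping from its name
to a tuple of characteristics are assumptions.
.DS B
.Tp
```python
def compare(master: dict, log: dict):
    """Split names into deleted, added and changed groups.

    master and log map a file name to its tuple of characteristics.
    """
    deleted = sorted(n for n in master if n not in log)
    added = sorted(n for n in log if n not in master)
    changed = sorted(n for n in master
                     if n in log and master[n] != log[n])
    return deleted, added, changed
```
.pT
.DE
.LP
Applied to the data of figure 3,
the sketch would place the two files missing from the log in the first group,
the one new file in the second,
and the two entries whose dates differ in the third.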
.PP
Several embellishments became necessary very quickly.
In terms of security,
the one which had the most impact derived from
system administrators' complaints that they were being inundated
by irrelevant information.
Specifically,
one of the more important system directories is \*(Lq/etc,\*(Rq
which contains system configuration and administration information.
Many files there must not be changed except by system administrators;
on the other hand,
there are several files which change whenever a user logs in or out,
or whenever a user changes his or her password.
Reports of the latter files changing tended to obscure
reports of the former changing.
The spooling directories present the extreme version of this problem;
transient files containing data to be printed
were being reported as deleted and added.
.PP
Two methods were devised to handle this problem.
With respect to files in system directories like \*(Lq/etc,\*(Rq
the system administrators did not mean that
.I all
characteristics of files expected to change should be suppressed;
rather,
they were concerned only that those characteristics
affected by expected changes to the files be ignored.
As an example,
consider the password file.
Changes to the file's size or date of last modification are expected
whenever a user changes his or her password, and so need not be reported;
but changes to the ownership or protection mode should be,
because only the superuser should be able to write on the password file!
So,
the program that compared the log to the master file
was altered to accept wildcards in any field of the master file
except the file name.
Using the password file as an example,
the size and date of last modification fields of the entry
for the file would contain wildcard characters;
this way,
any changes in characteristics other than those two will be reported,
but system administrators will not receive a message from this audit subsystem
when a password is changed.
The program that replaced the master file with the log
was also modified to propagate the wildcards
into the new master file.
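.PP
Both the wildcard comparison and the propagation of wildcards
can be sketched briefly in Python.
The function names are hypothetical,
the asterisk is assumed as the wildcard character,
and each argument is the list of fields of one entry with the
file name left out.
.DS B
.Tp
```python
WILDCARD = "*"   # assumed spelling of the wildcard character

def fields_match(master_fields, log_fields) -> bool:
    """Compare field by field; a wildcard in the master matches anything."""
    if len(master_fields) != len(log_fields):
        return False
    return all(m == WILDCARD or m == v
               for m, v in zip(master_fields, log_fields))

def new_master_fields(master_fields, log_fields) -> list:
    """Build the replacement entry, carrying wildcards forward."""
    return [m if m == WILDCARD else v
            for m, v in zip(master_fields, log_fields)]
```
.pT
.DE
.LP
Without the second function,
the first replacement of the master file by the log would overwrite
the wildcards with concrete sizes and dates,
and the unwanted reports would resume on the next run.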
.PP
Such a scheme would not work for the spool directories,
because every file inserted in that directory
has a unique name generated by the process generating the file.
For these cases,
an
.I ignore
mechanism was adopted.
Associated with each master file was an ignore file,
containing one pattern per line;
before the results of the file scan were compared to the master list,
all files in the output matching a pattern in the ignore list were deleted.
Note that this mechanism should be used with a great deal of care,
since it provides a filter behind which
an attacker can hide programs designed to break the system.
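.PP
The ignore mechanism,
and the danger it carries,
can be seen in a short Python sketch.
The pattern syntax of the ignore file is not specified here,
so shell-style patterns are assumed;
the name
.I apply_ignore
is hypothetical.
.DS B
.Tp
```python
from fnmatch import fnmatchcase

def apply_ignore(log: dict, patterns) -> dict:
    """Drop every log entry whose name matches an ignore pattern."""
    return {name: fields for name, fields in log.items()
            if not any(fnmatchcase(name, p) for p in patterns)}
```
.pT
.DE
.LP
A pattern broad enough to cover the spooler's transient files
also covers any other file an attacker chooses to give a matching name,
and that file then never reaches the comparison with the master file.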
.PP
The second problem arose because much of the networking software is
unreliable.
We discussed the timeout solution to hung connections above;
unfortunately,
that solution created a second problem.
The network software occasionally transmitted part of a log properly,
and then terminated without indicating that the termination was abnormal.
On many occasions
between 5% and 30% of a log was transmitted before the connection was closed.
The audit phase would then report that 70-95% of a file system had been deleted.
Since in the NAS environment
the system is configured to generate new master files
during each run,
the next time the file system was audited and the remote connection worked,
that audit phase would report that the file system had grown by
as much as 1900% (the case in which only 5% of the log had been received)!
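.PP
The size of the reported growth follows from simple arithmetic.
If only a fraction
.I f
of the log was received and became the master file,
then a later complete log appears larger than that master file by
(1 - f)/f.
The Python fragment below merely evaluates this expression;
the function name is hypothetical.
.DS B
.Tp
```python
def apparent_growth_percent(received_fraction: float) -> float:
    """Growth reported when a truncated master meets a complete log."""
    if not 0.0 < received_fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    return (1.0 - received_fraction) / received_fraction * 100.0
```
.pT
.DE
.LP
A fraction of 0.05 gives a growth of 1900%,
while a fraction of 0.30 gives about 233%.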
.PP
The solution was to define a \*(Lqthreshold\*(Rq percentage
.I t
(typically between 5 and 10%);
if any file system appears to have
.I t %
or less of the number of files listed in its master file,
the results of the logging are saved
and an error message is sent to the systems administrator
warning of a potential problem with the audit.
It is up to the administrator to determine
if there is indeed a network problem
or if ($100 - t$)% of the file system has been deleted.
If the number of files in the file tree is too small
(on the order of 20 or fewer),
though,
this threshold parameter is ignored and
all changes are reported.
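.PP
The threshold test can be written down in a few lines.
The Python sketch below is an illustration under stated assumptions:
the function and parameter names are hypothetical,
the default of 10% is one value from the typical range given above,
and the cutoff for small file trees is taken to be 20 files
counted in the master file.
.DS B
.Tp
```python
def log_is_suspect(n_logged: int, n_master: int,
                   t_percent: float = 10.0, min_files: int = 20) -> bool:
    """True if the log looks truncated and should be held for review."""
    if n_master <= min_files:
        return False             # small trees: report all changes
    # Suspect when the log holds t percent or less of the master's files.
    return n_logged * 100 <= t_percent * n_master
```
.pT
.DE
.LP
When the function returns true,
the log would be saved and a warning sent
instead of the usual report of deletions.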
.NH
Performance of the Existing System
.PP
Versions of this system have been in use now for two years,
and have worked quite well.
The latest version,
incorporating all the features described above and several others,
is running on a \s-2VAX\s0 11/780;
table 2 summarizes the files and file systems scanned.
.KS
.ls 1
.Tp
.TS H
box, center;
c s s s s
c s s c s
c c c c c
l c n n n.
Table 2. Summary of files scanned
file tree	time for scan
root	recursive?	#files	CPU time	real time
=
.TH
\f2special\fP	y	685	824.5s	5064.6s
/	n	31	27.6s	59.9s
/RCS	y	3	25.5s	50.9s
/bin	y	67	30.6s	77.2s
/dev	y	301	54.7s	143.1s
/etc	y	274	50.6s	182.8s
/lib	y	8	26.8s	73.6s
/usr	n	27	27.4s	53.9s
/usr/adm	y	36	29.0s	67.0s
/usr/bin	y	105	36.4s	124.0s
/usr/crash	y	5	28.8s	56.8s
/usr/dict	y	16	27.9s	56.4s
/usr/doc	y	846	82.7s	147.4s
/usr/encore	y	6	26.5s	48.1s
/usr/etc	y	3	25.3s	42.3s
/usr/games	y	96	31.7s	60.0s
/usr/guest	y	0	24.6s	39.3s
/usr/hosts	y	249	43.4s	87.5s
/usr/include	y	94	37.4s	74.3s
/usr/lib	y	1225	112.3s	286.7s
/usr/local	n	10	27.3s	66.1s
/usr/local/adm	y	95	33.8s	56.9s
/usr/local/bin	y	87	30.9s	54.0s
/usr/local/doc	y	24	26.7s	54.0s
/usr/local/etc	y	62	28.2s	61.3s
/usr/local/lib	y	3715	293.7s	516.8s
/usr/local/man	y	81	32.7s	57.5s
/usr/local/src	y	1222	182.5s	490.0s
/usr/local/tftpboot	y	13	26.5s	67.4s
/usr/man	y	2419	265.2s	492.0s
/usr/mdec	y	20	26.1s	47.0s
/usr/msgs	y	1	24.7s	36.4s
/usr/new	y	104	30.8s	42.7s
/usr/old	y	2	24.9s	39.8s
/usr/preserve	y	0	24.2s	34.9s
/usr/pub	y	3	26.1s	40.6s
/usr/spool	y	2414	725.6s	1475.4s
/usr/src	y	17148	935.7s	1961.6s
/usr/ucb	y	112	31.3s	45.5s
/usr/unsupported	y	3213	236.1s	391.9s
.TE
.pT
.KE
.PP
The first column is the root of the file tree audited;
the second column indicates whether the entire file hierarchy
under that root is examined
(\*(Lqy\*(Rq means it is,
\*(Lqn\*(Rq means only the files directly in the named directory are examined).
The third column indicates the number of objects
(files, directories, etc.)
about which characteristics were obtained.
The fourth and fifth columns tell how many seconds
in both CPU time and wall clock time the scan for each file tree
(or directory) took.
The entry
.I special
represents the audit of all setuid,
setgid,
block device,
and character device files on the system;
characteristics of no other files are obtained
during this run of the audit tool.
.PP
The audit took 4606.7 seconds (or about 1\(14 hours) of CPU time
and 12827.6 seconds (or about 3\(12 hours) of wall clock time.
(Note that there is a slight increment from the time to set up
the command in each case,
but this is on the order of 1 second per file tree and so has been ignored.)
Since the audit is usually done at 3 AM,
the system would be more lightly loaded than when these timings were made,
so the wall clock time would be smaller;
in fact,
other methods \- such as auditing file trees in parallel \-
can reduce the wall clock time even more.
So,
while the time to run the audit is high,
it can be reduced considerably
by taking full advantage of the package's features.
Even so,
our experience has been that running this program early in the morning
does not impact the system noticeably.
.NH
Security Analysis
.PP
In this section,
we apply our discussion of the model propounded above
to the file monitoring package
to locate security weaknesses.
We shall begin with the logging phase,
proceed to the auditing phase,
and end with some general observations.
.NH 2
Window of Vulnerability
.PP
One characteristic of state logging mechanisms is infrequency of use
due to performance considerations,
and this tool is no exception.
A file which has been accidentally changed will not be reported
until the next run of the tool,
so if an attacker is able to use that change to penetrate the system
and restores the changed characteristics to their previous values,
that accidental change will not be detected.
It is imperative that users of this tool understand
that this package merely provides information
about the state of files at the precise moment the file's
characteristics are obtained;
its goal is not to detect changes that have been made deliberately,
or that have been undone (as in the example above).
(Such a warning is circulated with the package.)
Enabling the detection of all changes would require
a change logging mechanism integrated with the kernel,
which is presently not acceptable for policy reasons.
The only way to reduce this window of vulnerability
is to run the package more frequently,
but doing so must be balanced with the performance degradation
that other users of the system will experience.
.NH 2
Nonsecurity of the Host Systems
.PP
Our discussion of the model showed there is little
if any guarantee that an attacker cannot interfere with the logging
and auditing software on a nonsecure host.
This package runs on systems using various nonsecure versions of the
.UX
operating system,
and it is very widely known that
.UX
and most of its derivatives are nonsecure
[.reid reflections widespread breakins,
stoll stalking wily hacker,
hogan protection imperfect,
bishop security problems unix operating system,
grampp morris unix operating system security.].
Thus,
an attacker who modifies the software can change monitored files
without those changes being detected.
Simply auditing the auditing and logging software itself is not enough,
because changes to those files can also be made undetectable!
(For a classic example of such an attack,
see
[.thompson reflections trusting trust.].)
Again,
this is recognized in the claim that
the package will only detect accidental changes to monitored files
and cannot detect changes made by sophisticated attackers.
.PP
To lower the probability that an attacker will be able
to modify the auditing software,
the auditing part of the tool can be run on a \*(Lqsecure\*(Rq machine;
if no such machine is available,
the software can be placed on a physically secure audit machine
that does not permit any remote logins
and only those remote connections necessary for
the software on the audit machine to carry out its functions
(and only when it expects such connections to exist).
This system would have to allow a remote outbound copy
and remote command execution
for the audit system to work;
other services,
such as mail transfer,
should be denied.
(It may, however, be necessary to allow outbound mail;
this is discussed in the section on the dissemination of results below.)
Other services (for example, a key distribution facility such as Kerberos
[.steiner neuman schiller kerberos authentication service.])
could also reside on this machine provided they allowed
only the most rigidly controlled access to their services
on that host.
.PP
It should be noted that this still leaves
the remote logging software vulnerable,
since that
.I must
run on a computer connected to the file system being audited.
To lessen (but not eliminate) the possibility
of the logging software being tampered with,
the programs should be cross-compiled on the audit machine
and copied over at the beginning of each audit.
If that is not possible,
either they should be re-installed at the beginning of each run,
or they should be copied over to the audit machine and
have their integrity checked then,
for example by using the checksumming method discussed in the previous section.
.PP
Note that if the remote network software is suitably corrupted,
the audit software can be fooled completely
regardless of what steps are taken to detect the tampering.
This re-emphasizes the point made in the introduction:
a retrofitted package such as this can
.I never
give the same level of security as a package designed into the system
at the same time as the system is being designed.
.NH 2
Nonsecurity of the Log-Audit Link in Remote Auditing
.PP
When the audit software and the logging software
reside on different hosts,
the system is vulnerable at the transmission medium.
Such communications are usually over a local-area Ethernet,
a medium with respect to which
reading,
modifying,
and inserting packets in transit from one host to another is very simple
[.herbison security ethernet.].
Hence,
an attacker need not alter the software at either end \-
he or she could simply record the true characteristics sent for an
uncorrupted log,
and then during the attack replace the altered characteristics
with the recorded ones while they are in transit.
.PP
There are several ways to hinder this.
The first is to protect the network cable physically,
so no unauthorized person may tap into it.
This will not prevent an attacker who gains control of a host
legitimately on the network from staging such an attack.
The second way is to use cryptography to protect the messages.
.PP
The classic, single-key form of cryptography will ensure privacy;
however, it can be defeated by non-cryptographic means.
Because the audit software and the logging software
must each have a copy of the key resident,
a clever attacker can determine that key by
tapping into the network and reading the messages
from the audit software to the logging software,
or analyzing the
logging software (possibly including its image in the machine's main memory),
and use that to decrypt, alter, and re-encrypt the log message
as it is in transit over the network.
Public key cryptography does not have this vulnerability,
since the audit software's public key is sent to the logging software.
If an attacker finds that key,
it is still computationally infeasible for him or her
to determine the private key which would enable the log messages to be
decrypted.
Thus,
any message the logging software sends will be genuine.
.PP
However,
spoofing is still a threat
if the attacker could detect when the auditing software was expecting
a log message,
and also could prevent the genuine log messages
from being sent.
Then he or she could simply forge messages indicating all was well.
Preventing this requires authenticity,
which in turn requires the logging software to have a private key,
which an adept attacker can determine
(just as if a classical cryptographic method were in use).
If the attacker could not tell when the auditing software
was expecting log messages,
then he or she could generate many false log messages;
if these were in addition to the genuine ones,
this would merely generate numerous warnings or reports
of problems with the file system being audited,
since either the log report would appear unintelligible
(generating an error report)
or two lines \-
the genuine one and the fabricated one \-
would be inserted into the log.
Preventing this determination would be quite difficult,
however.
.NH 2
Nonsecurity of the Dissemination of Results
.PP
Yet another vulnerability arising from informative auditing mechanisms
is that of the notification stage;
problems could escape detection if the results being sent
were altered or deleted.
Currently,
those results
and any errors encountered are mailed to the appropriate person
using electronic mail.
Electronic mail is not encrypted
or otherwise protected while in transit to its recipient's host.
This lack introduces a vulnerability related to
informing the administrators of the results of the audit:
an attacker can just change the mail!
.PP
Mail in transit can be altered in one of two ways.
If the mail is sent using a store-and-forward mechanism
(such as UUCP
[.nowitz uucp overview.]),
it can be altered on an intermediate node;
also,
if the letter is sent over a network without using encryption,
an active wiretapper could easily alter its contents
by tapping the network wire,
intercepting the message,
altering it,
and sending it on.
The message is also vulnerable when it arrives at its destination;
should an attacker succeed in penetrating
the machine on which the administrator receives mail,
the letter giving the results of the audit could easily be altered.
.PP
There are a number of proposals extant that would alleviate these problems,
although none would solve them entirely because there is no such thing
as software that is completely trustworthy.
One proposal
[.rfc1040,
linn kent electronic mail privacy enhancement.]
has been made to provide privacy and authenticity to electronic mail;
implementing this scheme would require only that the relevant
software (and the computing base on which it runs) be trusted.
Assuming that,
the only effective attack would then be to delete the letter entirely.
Since the file monitoring package can (and should) be configured to
generate a letter for each file system checked
(that is,
either a report of changes or a letter saying there were no changes
will always be sent),
the recipient would discover this.
Because the mechanism used to provide authenticity and privacy is a public-key
cryptosystem,
the attacker would not be able to access the key and hence
not be able to forge a letter that the recipient would accept as authentic.
.PP
Unfortunately,
implementations of the scheme to protect electronic mail
are still quite experimental,
and none are stable enough to use on a \*(Lqdrop-in\*(Rq basis.
For this reason,
electronic distribution of the results should be abandoned completely,
and an alternate method \-
such as hardcopy or distribution on WORM disks \-
should be used.
This would require that a trusted courier have access to both the
machine on which the auditing software runs
and the administrators;
if this is not possible,
the administrators will need access to the area
in which the secure machine is located.
.NH 2
Accidental Elimination of Important Results
.PP
One other characteristic of informative auditing mechanisms
is that they must present their results in
an intelligible,
succinct,
easy-to-understand format so the human interpreting those results
is not overwhelmed or confused by them.
Experience has shown that 
if a system administrator receives too much information,
all of it will be ignored,
the real alarms along with the false ones;
hence,
without these suppression mechanisms,
there is a good chance that the results from the file monitoring package
would be ignored.
So there is a tradeoff,
just as in other areas of security
[.reid reflections widespread breakin.],
and the goal of the practitioner is to choose the balance
most useful to the users of the tool.
.PP
Although added at the request of the system administrators using the package,
the ignore mechanism is simply too dangerous to leave in place.
However,
the use of wildcards in the master file does not always work
(recall the file spooling example in the previous section),
so an alternate, additional mechanism must be used.
One such mechanism would be a
.I "pattern master file" .
.PP
One pattern master file would be associated with each master file,
and would have entries just like the master file except
that the names might be patterns.
Then if a file is not found in the master file,
the pattern master file would be searched for an entry
with a matching pattern.
When one is found,
the log entry is compared to the characteristics listed for that pattern.
.PP
For example,
suppose outgoing mail
is kept in two files in the mail spool directory;
the files have names of the form
\*(LqLtr\f2nnn\fP\*(Rq
and
\*(LqAdr\f2nnn\fP\*(Rq,
where
.I nnn
is any three decimal digits.
A pattern master file for the mail spool directory
might contain the lines
.DS B
.Tp
.ta 8n 16n 24n
.cs R 25m
.ss 25m
\!.cs R 25m
\!.ss 25m
\!\!.cs R 25m
\!\!.ss 25m
Ltr[0-9][0-9][0-9]  -  0444  1  root  daemon  *  *
Adr[0-9][0-9][0-9]  -  0644  1  daemon  daemon  *  *
\!\!.ss
\!\!.cs R
\!.ss
\!.cs R
.cs R
.ss
.sp
.pT
.ce
Figure 4.  Pattern master file.
.DE
.LP
Then if an attacker tried to hide an executable file with the name
\*(LqLtr123\*(Rq,
the file would be reported because its protection mask (0755)
did not match the expected one (0444).
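.PP
The lookup in a pattern master file can be sketched as follows.
This Python fragment is an illustration of the proposal,
not an existing part of the package;
the function name and the three result strings are hypothetical,
and shell-style pattern matching is assumed.
The table holds the two entries of figure 4.
.DS B
.Tp
```python
from fnmatch import fnmatchcase

PATTERN_MASTER = [
    ("Ltr[0-9][0-9][0-9]", ["-", "0444", "1", "root", "daemon", "*", "*"]),
    ("Adr[0-9][0-9][0-9]", ["-", "0644", "1", "daemon", "daemon", "*", "*"]),
]

def check_against_patterns(name, fields, pattern_master=PATTERN_MASTER):
    """Return 'ok', 'changed' or 'added' for a file absent from the master."""
    for pattern, expected in pattern_master:
        if fnmatchcase(name, pattern):
            # Wildcard fields match anything; the rest must be equal.
            same = all(e == "*" or e == f
                       for e, f in zip(expected, fields))
            return "ok" if same else "changed"
    return "added"               # no pattern covers this name
```
.pT
.DE
.LP
Unlike the ignore mechanism,
a name that matches a pattern is still compared field by field,
so only files that look like genuine spool files are passed over.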
.PP
The next problem arises because the machine on which the tool runs is untrusted;
as we pointed out in the discussion of the model,
this means that the tool's trust in the system programs
on which it is built may be misplaced.
.NH 2
Reliance on System Software
.PP
This package assumes the underlying programs on the system
have not been modified to produce incorrect results.
(Appendix I shows which system programs
the logging and auditing parts of the package use.)
Since none of the hosts on which the tool runs is in fact secure,
an attacker could simply replace the standard system programs
with programs of his or her choosing.
A clever attacker could do so in such a way
that the programs would not appear to be changed when
their characteristics were logged!
This is a very serious vulnerability;
unfortunately,
it is also one against which little can be done.
.NH 2
Cryptographic Checksumming
.PP
As our discussion of auditing emphasized,
one major weakness of any auditing tool
is the assumption that the characteristics being logged
will be sufficient to detect all changes to the files being monitored.
This is not necessarily true;
in the context of this particular tool,
a combination of well-intentioned commands
(or a skilled attacker)
could easily reset most of the characteristics
after changing a file.
Cryptographic checksumming is the only way to detect
(with a high probability,
although not to a level of certainty)
alterations to the data contained in a file.
.PP
But using such a checksum introduces several problems.
The first involves the way the
.UX
operating system keeps track of times related to the files.
The time of last modification to the data,
the time of last access (read or write) of the data,
and the time of last modification to either the data
or parts of the file descriptor\(dg,
.FS
.IP \(dg 2n
Specifically,
changes to the protection mode,
the owner or group,
the number of hard links,
the name of the file,
and
the time of last access or modification to the data
all reset this quantity.
In
.UX
parlance,
the file descriptor is called an
.I inode .
.FE
are recorded in the inode;
when the scanner runs,
if it checksums the file,
it will alter the time of last access.
It is possible to reset this time;
however, doing so will change the time of last modification to the inode.
In other words,
of the three quantities recorded,
at most two may be left as they were before checksumming
should the auditor checksum the file.
(Incidentally,
this is the reason the final field of the log entries
gives the time of last modification to the data,
rather than to the data and the inode;
when checksumming is added,
checksumming the file and resetting the relevant quantities
will not alter that final field in the log.)
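.PP
The interaction among the three times can be illustrated with a short sketch.
It is written in Python and is not part of the package;
the function name
.I checksum_preserving_times
is hypothetical,
and SHA-256 from the standard
.I hashlib
module stands in for the DES-based checksum discussed in this section.
The sketch restores the access and data modification times after reading the file,
and in doing so necessarily changes the inode modification time.
.DS L
```python
import hashlib
import os


def checksum_preserving_times(path):
    """Checksum path, then restore its access and data-modification times."""
    before = os.stat(path)
    digest = hashlib.sha256()            # stand-in for the DES-based checksum
    with open(path, "rb") as handle:     # reading may update the access time
        for block in iter(lambda: handle.read(8192), b""):
            digest.update(block)
    # Restoring the two times is itself an inode change, so the third
    # quantity (st_ctime) cannot also be preserved.
    os.utime(path, ns=(before.st_atime_ns, before.st_mtime_ns))
    return digest.hexdigest()
```
.DE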
.PP
The second difficulty is time.
There are many checksumming algorithms described in the literature;
all, however, take a substantial amount of time to compute unless
special-purpose hardware is used.
For example,
using a fast portable software version of the DES
[.bishop application fast des implementation.],
a block chaining scheme with ciphertext feedback
would take about 2\(12 seconds to checksum an 8000 byte file on a VAX 11/780.
Referring back to the auditing configuration described in table 2,
we shall examine how checksumming files impacts performance on that
system.
.PP
Table 3 summarizes the relevant characteristics of the system involved.
.KS
.ls 1
.Tp
.TS H
box, center;
c s s s
c c c s
c c c c
l n n n.
Table 3. Checksumming characteristics
file tree	# files to	total bytes
root	checksum	not rounded	rounded
=
.TH
\f2special\fP	247	7407796	7407800
/	17	1105703	1105736
/RCS	3	10738	10744
/bin	67	1264654	1264656
/dev	2	8278	8288
/etc	263	2878298	2878784
/lib	8	464326	464344
/usr	0	0	0
/usr/adm	35	620183	620216
/usr/bin	105	1925088	1925128
/usr/crash	5	16954	16976
/usr/dict	13	3874180	3874224
/usr/doc	746	6433680	6436360
/usr/encore	6	191568	191568
/usr/etc	3	3698745	3698752
/usr/games	93	2732439	2732632
/usr/guest	0	0	0
/usr/hosts	125	2794324	2794328
/usr/include	64	131018	131216
/usr/lib	1193	6760972	6764760
/usr/local	0	0	0
/usr/local/adm	89	324906	325168
/usr/local/bin	81	8711404	8711488
/usr/local/doc	22	648966	649048
/usr/local/etc	60	1263036	1263136
/usr/local/lib	3628	40044008	40054752
/usr/local/man	59	203997	204216
/usr/local/src	1144	12464794	12468432
/usr/local/tftpboot	12	2976874	2976920
/usr/man	2386	5615737	5624264
/usr/mdec	20	53096	53112
/usr/msgs	1	8	8
/usr/new	99	3620056	3620184
/usr/old	2	53248	53248
/usr/preserve	0	0	0
/usr/pub	3	5558	5576
/usr/spool	2189	12977963	12980592
/usr/src	16266	84788416	84837752
/usr/ucb	110	5829638	5829664
/usr/unsupported	3067	48595808	48604272
.TE
.pT
.KE
.LP
In this table,
the column \*(Lq# files to checksum\*(Rq is the number of files
to which checksumming may be applied;
the other files are directories,
device files,
or other types of files for which checksums are meaningless.
The total number of bytes in the files for which checksums are meaningful
is given in the third column;
since some cryptographic checksumming algorithms
encrypt 8 bytes at a time,
the fourth column contains the sum of the sizes of those files
rounded up to the nearest 8 bytes.
Assuming 1 DES encryption takes 2.443\(mu10\u\s-2\-3\s0\d seconds,
table 4 summarizes the length of time needed
to checksum the files in each file system:
.KS
.ls 1
.Tp
.TS H
box, center;
c s   s s
c c | c c
c c | c c
l n | l n.
Table 4. Time to checksum (CPU seconds)
file system	DES in	file system	DES in
root	CFB mode	root	CFB mode
=
.TH
\f2special\fP	2262.2	/usr/local	0.0
/	337.7	/usr/local/adm	99.3
/RCS	3.3	/usr/local/bin	2660.3
/bin	386.2	/usr/local/doc	198.2
/dev	2.5	/usr/local/etc	385.7
/etc	879.1	/usr/local/lib	12231.7
/lib	141.8	/usr/local/man	62.4
/usr	0.0	/usr/local/src	3807.5
/usr/adm	189.4	/usr/local/tftpboot	909.1
/usr/bin	587.9	/usr/man	1717.5
/usr/crash	5.2	/usr/mdec	16.2
/usr/dict	1183.1	/usr/msgs	0.0
/usr/doc	1965.5	/usr/new	1105.5
/usr/encore	58.5	/usr/old	16.3
/usr/etc	1129.5	/usr/preserve	0.0
/usr/games	834.5	/usr/pub	1.7
/usr/guest	0.0	/usr/spool	3963.9
/usr/hosts	853.3	/usr/src	25907.3
/usr/include	40.1	/usr/ucb	1780.2
/usr/lib	2065.8	/usr/unsupported	14842.5
.TE
.pT
.KE
.LP
So,
to checksum all files would take 82630.9 seconds (about 23 hours) of
additional CPU time.
This is clearly unacceptable.
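.PP
The entries of Table 4 follow from Table 3 by simple arithmetic:
one DES encryption per 8-byte block of the rounded byte total.
The Python sketch below makes the calculation explicit;
the names
.I round_up
and
.I checksum_seconds
are hypothetical,
and the three byte totals are copied from Table 3.
.DS L
```python
DES_SECONDS = 2.443e-3     # time for one 8-byte DES encryption, from the text
BLOCK_BYTES = 8


def round_up(size):
    """Round a file size up to a whole number of 8-byte blocks."""
    return -(-size // BLOCK_BYTES) * BLOCK_BYTES


def checksum_seconds(rounded_bytes):
    """CPU seconds to checksum rounded_bytes at one encryption per block."""
    return rounded_bytes // BLOCK_BYTES * DES_SECONDS


# Rounded byte totals copied from three rows of Table 3.
ROUNDED = {"/": 1105736, "/etc": 2878784, "/usr/pub": 5576}
for tree, size in ROUNDED.items():
    print(tree, round(checksum_seconds(size), 1))
```
.DE
.LP
For these three trees the computed values agree with the corresponding
entries of Table 4 (337.7, 879.1, and 1.7 seconds).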
.PP
If the cryptographic checksumming is added as an option to
.I lstat ,
only certain file systems need be checksummed on each run;
in fact,
most likely only those executables in the superuser's path would be
checksummed daily,
since if those are altered the damage could not be contained.
The directories involved would typically be
\*(Lq/\*(Rq,
\*(Lq/bin\*(Rq,
\*(Lq/etc\*(Rq,
\*(Lq/lib\*(Rq,
\*(Lq/usr/bin\*(Rq,
\*(Lq/usr/lib\*(Rq,
\*(Lq/usr/etc\*(Rq,
and
\*(Lq/usr/ucb\*(Rq;
this would add 7308.2 seconds (about 2 hours) to the CPU time.
This is much more acceptable.
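.PP
The daily figure is simply the sum of the Table 4 entries
for the directories listed.
A small Python sketch of that selection follows;
the names
.I TABLE_4 ,
.I DAILY ,
and
.I daily_cost
are hypothetical,
and only a few rows of Table 4 are reproduced.
.DS L
```python
# Per-tree times in seconds, copied from Table 4.
TABLE_4 = {
    "/": 337.7, "/bin": 386.2, "/etc": 879.1, "/lib": 141.8,
    "/usr/bin": 587.9, "/usr/lib": 2065.8, "/usr/etc": 1129.5,
    "/usr/ucb": 1780.2, "/usr/src": 25907.3,
}

# Trees assumed to lie on the superuser's search path (the list in the text).
DAILY = ["/", "/bin", "/etc", "/lib",
         "/usr/bin", "/usr/lib", "/usr/etc", "/usr/ucb"]


def daily_cost(times, trees):
    """Total CPU seconds needed to checksum only the named trees."""
    return sum(times[tree] for tree in trees)


seconds = daily_cost(TABLE_4, DAILY)
print(round(seconds, 1), "seconds, or", round(seconds / 3600, 2), "hours")
```
.DE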
.PP
The process can be made even quicker,
but at the cost of sending more information to the logging process.
Some files in the directories named above need not be checksummed,
because the data in them is expected to change.
For example,
when a user logs in,
an entry is added to a file in \*(Lq/etc\*(Rq.
Checksumming this file would be ridiculous,
since it would simply show the file had changed.
So adding a mechanism to specify the files to be checksummed
independently of
.I lstat
would reduce the time needed for cryptographic checksumming.
However,
the files that need not be checksummed
must be identified very carefully
lest important files whose contents should not be altered
be erroneously included in this list.
.PP
Such a scheme could be implemented by adding another program to
.I auditls
that would accept as input some initializing information
(for example,
a polynomial or an initial vector)
followed by a list of files to checksum.
It would then send back the checksums,
and those would be compared to the precomputed master checksums.
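.PP
One possible shape for such a program is sketched below in Python.
It is an assumption-laden illustration rather than a design for the package:
the function names
.I keyed_checksum
and
.I changed_files
are hypothetical,
a secret key plays the role of the initializing information,
and HMAC with SHA-256 from the standard
.I hmac
and
.I hashlib
modules is substituted for the DES-based checksum.
.DS L
```python
import hashlib
import hmac


def keyed_checksum(path, key):
    """Return a keyed checksum of the data in path."""
    mac = hmac.new(key, digestmod=hashlib.sha256)
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(8192), b""):
            mac.update(block)
    return mac.hexdigest()


def changed_files(paths, key, master):
    """Return the paths whose checksum differs from the master checksum."""
    changed = []
    for path in paths:
        expected = master.get(path, "")      # unlisted files always differ
        if not hmac.compare_digest(keyed_checksum(path, key), expected):
            changed.append(path)
    return changed
```
.DE
.LP
In this sketch a file absent from the master list is reported as changed,
so that a file cannot escape notice merely by being unlisted.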
.PP
Checksumming is not a panacea,
however;
aside from issues of choosing an algorithm well
[.meyer matyas cryptography.],
the checksum will spot only those alterations extant when the
logging is performed.
In other words,
an attacker can copy the file,
alter it,
and then,
just before the logging process accesses the file to
compute the checksum,
replace the altered file with the previously copied version.
This re-emphasizes a point made throughout this paper:
a retrofitted package such as this can
.I never
give the same level of security as a package designed into the system
at the same time as the system is being designed.
.NH
Conclusion
.PP
We have described a file monitoring tool developed
to detect changes to files;
the tool had to be placed on a system
in such a way that the kernel of that system
was not modified.
A brief overview of that package was given,
and the weaknesses of the package were examined
in light of the model propounded.
These weaknesses may be summarized as follows:
.IP \(** 3n
Since the tool uses a state logging mechanism in a production environment,
it can be run only infrequently
(once a day) and so there is a very large time window
when an attacker can change something,
work undetected,
and erase evidence of his or her work.
.IP \(** 3n
The tool attempts to eliminate unnecessary messages
so that messages warning of potentially serious problems
stand out.
While this mechanism reduces the number of false alarms,
a clever attacker could exploit it to conceal evidence of his or her attack.
.IP \(** 3n
Since
.I "no part"
of the tool runs on a protected machine,
an attacker could alter it
and thus escape detection.
.IP \(** 3n
When auditing is done on a remote host,
the logging messages are sent over an Ethernet in the clear
and so could be altered in transit.
Since results are disseminated using electronic mail,
the same is true of the results;
also,
both are dependent on the network software being uncorrupted.
.IP \(** 3n
As the hosts to which electronic mail is sent are nonsecure,
the letters containing the results of the audit
could be altered on those hosts.
.PP
This analysis leads to methods of correcting
or ameliorating some of these weaknesses,
but others simply cannot be corrected, owing to the constraints
under which the package was designed and implemented,
and the model again has anticipated these.
.PP
The model of logging and auditing that we have described
is comprehensive enough to encompass very different schemes
used in a variety of contexts;
for example,
statistical database query control
and file access monitoring systems
do not seem to be related,
yet they create closely related security problems,
and the mechanisms designed to improve the security of one
will also improve the security of the other.
Using this model to classify different auditing schemes
makes their usefulness for a given situation more readily apparent,
since much of the analysis stems from
the particular classification.
This will assist designers and system managers in their analysis
of security monitoring products and schemes.
.SH
References
.LP
.ls 1
.[]
