CERT



The Common Intrusion Detection Framework (CIDF)

Stuart Staniford-Chen*, Department of Computer Science, University of California at Davis.
Brian Tung, Information Sciences Institute, University of Southern California.
Dan Schnackenberg, The Boeing Company.

1. Introduction

This position paper gives a quick overview of the objectives and approach of the Common Intrusion Detection Framework (CIDF) working group. The working group was formed as a collaboration among intrusion detection and response (IDR) projects funded by DARPA (Defense Advanced Research Projects Agency), although the effort is now open to anyone who wishes to participate.

The goal of the CIDF working group is to develop a set of specifications enabling-

This document first provides some background explaining the need for standardization. This is followed by the objectives and requirements for interoperable IDR components that were developed by the CIDF working group, and then the approach taken by the CIDF working group.

2. Background

Attacks against computer networks and systems are growing ever more sophisticated. It is no longer expected, or even possible, for a single intrusion detection (ID) system to deal with every plausible form of attack.

At the same time, those attacks are taking place on a grander scale. They can be orchestrated across a wide-area network, and over a long period of time. The need to deploy a *distributed* ID system is becoming increasingly urgent.

In such an environment, the ability of intrusion detection systems and their components to share *advanced* information about these attacks is especially important. Such sharing would allow systems to combine attack indicators to more accurately identify and pinpoint attacks. Integration with response components and network management systems would also allow administrators to automatically deploy sophisticated response and recovery tactics.

In order to share that information, however, the various systems must agree on how to express it. Furthermore, since these ID systems may be deployed in widely varying environments, they must also agree on how to locate and communicate with one another.

3. Objectives and Requirements

The overall objective is to enable software reuse and interoperability for IDR components. The infrastructure on which the IDR components rely for interoperability must be secure, robust, and scalable. The focus of the CIDF working group's efforts has been to define an application-layer "language" for describing information of interest to IDR components, and a protocol for encoding that information for sharing between components.

First, the *language* used to express information about intrusions and related matters must be able to describe almost arbitrary objects, so expressions cannot be confined to a fixed format. Instead, the language must be flexible enough to allow a component to express whatever relevant information it has available. At the same time, the language must not be so free-form that the receiving component cannot interpret it.

This language should have a wide enough vocabulary and sophisticated enough syntax to cover a broad range of expressions. This language must be expressive enough to express-

This language must also allow one to express relationships between events, to justify analysis results, and to explain complex responses (e.g., if this happens, then do that, else do the other thing). In short, one must be able to write "sentences" that are about other sentences. To be more specific, this language should be able to express at least the following:

Beyond expressiveness, the language must also have the following attributes.

  1. Unique in expression. It is probably not possible to have an expressive language that literally admits of exactly one formulation for each sentiment. Our requirement, therefore, is as follows: If a sender and a receiver can agree on the *objects* of interest, but not on the *way* they will express information about those objects, then they should still be able to understand each other. If they cannot, then the language is too arbitrary.

  2. Precise. Two receivers reading the same message must not draw mutually contradictory conclusions from it.

  3. Layered. There should be a mechanism in the language by which specific concepts are defined in terms of more general ones.

  4. Self-defining. It should be self-evident from a message how each datum within it should be interpreted. For example, a sequence of four octets (i.e., bytes) is not merely four octets, but an IPv4 address; and not merely an IPv4 address, but the address of a host; and not merely the address of a host, but the address of a host from which an FTP command was issued; and so forth.

  5. Extensible. There should be a mechanism by which a sender can use its own vocabulary, and indicate that fact to receivers, in such a way that receivers can either recover the meaning of the new vocabulary, or decide whether and how to interpret the rest of a message in which it occurs.
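
The "self-defining" requirement above can be illustrated with a minimal sketch. It wraps four raw octets in successive layers of interpretation, mirroring the IPv4 example in item 4. The label names (FTPCommandSource, HostAddress, IPv4Address) and the `describe` helper are hypothetical and chosen only for illustration; they are not taken from the CIDF specification.

```python
import ipaddress

# Four octets, meaningless on their own (192.0.2.7 is a documentation address).
raw = bytes([192, 0, 2, 7])

# Interpret the octets as an IPv4 address.
addr = ipaddress.IPv4Address(raw)

# Each wrapper adds one layer of meaning, from most specific (outermost)
# to most general (innermost). All labels here are hypothetical.
datum = ("FTPCommandSource", ("HostAddress", ("IPv4Address", str(addr))))


def describe(node):
    """Return (labels, value): the chain of labels wrapping a value, outermost first."""
    labels = []
    while isinstance(node, tuple):
        label, node = node
        labels.append(label)
    return labels, node


labels, value = describe(datum)
print(" > ".join(labels), "=", value)
```

A receiver that knows only the innermost label can still recover the address, even if it does not understand the outer layers.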

The method of encoding this information must also have the following attributes.

  1. Efficient. In comparison with a hard-wired format, a format that can be understood by any compliant receiver should be no more than twice as long over the long run. (In other words, the marginal cost of using this language should be no more than a factor of two.)

  2. Simple. Components that do not need to understand the semantics of the full language to fulfil their role in the system (e.g., sensors) should not be required to understand the full language to send and receive simple messages.

  3. Portable. The encoding for the language should not depend on the endian-ness of the host on which a message is encoded, or on the details of its networking.

  4. Minimal Complexity. A language that meets the above objectives will necessarily be complex; however, the complexity of the language and its encoding should be minimized to the extent feasible.
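
To make the "portable" requirement concrete, the following sketch packs a tagged field with an explicit network byte order, so the bytes are the same whatever the endianness of the encoding host. The tag/length/payload layout, the tag value, and the function names are assumptions made for this illustration; this is not the CIDF encoding.

```python
import struct

# Hypothetical numeric tag standing in for a semantic identifier.
HOSTNAME_TAG = 1


def encode_field(tag: int, payload: bytes) -> bytes:
    """Pack a field as a 2-byte tag, a 2-byte length, then the payload bytes."""
    # "!" selects network (big-endian) byte order with standard sizes,
    # independent of the host's native byte order.
    return struct.pack("!HH", tag, len(payload)) + payload


def decode_field(data: bytes):
    """Return (tag, payload, remaining_bytes) for the first field in data."""
    tag, length = struct.unpack_from("!HH", data, 0)
    payload = data[4:4 + length]
    if len(payload) != length:
        raise ValueError("truncated field")
    return tag, payload, data[4 + length:]


encoded = encode_field(HOSTNAME_TAG, b"first.example.com")
tag, payload, rest = decode_field(encoded)
print(tag, payload, len(encoded))
```

In this layout the fixed header costs four bytes per field, which gives a rough feel for how the overhead of a general format might be weighed against the factor-of-two efficiency bound.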

4. Current CIDF Language and Encoding

In this section, we will describe the approach that the CIDF group has taken toward achieving these objectives. We do not claim that it is the only possible approach, but it is the one that the current CIDF group has agreed upon.

The CIDF group decided on a format called S-expressions, which, like Lisp expressions, are lists grouped within parentheses. These S-expressions are headed by semantic identifiers, or SIDs for short, which indicate some semantics for the grouped list. For example, the S-expression

    (HostName 'first.example.com')

is headed by the SID HostName, which indicates that the following string, 'first.example.com', is to be interpreted as the name of a host. Larger S-expressions may indicate the role of that host within an event, say. For instance, the following S-expression

    (Delete
        (Context
            (HostName 'first.example.com')
            (Time '16:40:32 Jun 14 1998')
        )
        (Initiator
            (UserName 'joe')
        )
        (Source
            (FileName '/etc/passwd')
        )
    )

is a complete sentence asserting that the user with username 'joe' deleted the file '/etc/passwd' from the host 'first.example.com' at 16:40:32 on Jun 14 1998. This example highlights special kinds of SIDs, such as verb SIDs (e.g., Delete) that show what happened, and role SIDs (e.g., Context, Initiator, and Source) that show "whodunit", and what it was done to, where it was done, and so forth. Other SIDs give details (such as the time) about these various components of the event.
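
As an illustration of how a receiver might read such a sentence in its ASCII form, here is a minimal parsing sketch. It assumes only the surface syntax shown above (parentheses, bare SIDs, and single-quoted strings); the `parse` function and the dictionary layout are our own illustrative choices, not part of the CIDF specification, and the sketch does not handle the encoded form.

```python
import re

# Tokens: a parenthesis, a single-quoted string, or a bare symbol such as a SID.
TOKEN = re.compile(r"\(|\)|'[^']*'|[^\s()']+")


def parse(text):
    """Parse one S-expression into nested Python lists (illustrative only)."""
    tokens = TOKEN.findall(text)
    pos = 0

    def read():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of input")
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            items = []
            while pos < len(tokens) and tokens[pos] != ")":
                items.append(read())
            if pos >= len(tokens):
                raise ValueError("missing closing parenthesis")
            pos += 1  # consume ")"
            return items
        if tok == ")":
            raise ValueError("unexpected closing parenthesis")
        if tok.startswith("'"):
            return tok[1:-1]  # quoted string: strip the quotes
        return tok  # bare symbol, e.g. a SID

    tree = read()
    if pos != len(tokens):
        raise ValueError("trailing tokens after expression")
    return tree


sentence = """
(Delete
    (Context (HostName 'first.example.com') (Time '16:40:32 Jun 14 1998'))
    (Initiator (UserName 'joe'))
    (Source (FileName '/etc/passwd')))
"""

tree = parse(sentence)
verb = tree[0]  # the verb SID heads the sentence
# Each role clause is [RoleSID, [SID, value], ...]; map role -> {SID: value}.
roles = {clause[0]: dict(clause[1:]) for clause in tree[1:]}
print(verb, roles["Initiator"]["UserName"], roles["Source"]["FileName"])
```

The verb SID is the head of the outer list, and each role SID heads a sublist of detail SIDs, so the "who", "what", and "where" of the event fall out of the list structure.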

These "sentences" in ASCII form are inefficient, so to save space, we have developed an encoding format that reduces the size of the messages.

When properly encapsulated, these encoded sentences form Generalized Intrusion Detection Objects, or GIDOs for short (CIDF is certainly not short on acronyms, as you can see!). In fact, the terms "sentence" and GIDO are often used interchangeably.

5. Infrastructure

Before ID components can even speak to each other, they must locate other components that have some reason to talk to them. A common language is no use if there is no common sphere of interest. Therefore, CIDF is also developing a "matchmaking" service which connects components that produce certain kinds of GIDOs with those that consume them.

The basic approach is to use a large-scale directory service, LDAP (Lightweight Directory Access Protocol). Each component registers with the directory service and advertises the kinds of GIDOs it consumes and/or produces. On this basis the components are placed into *categories*, which allow other components to easily find those they wish to talk to.
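
The matchmaking idea can be sketched with a small in-memory registry. This is only a stand-in for the directory service: it does not use LDAP, and the `Matchmaker` class, its method names, the component names, and the GIDO kind "FileDeletion" are all hypothetical.

```python
from collections import defaultdict


class Matchmaker:
    """Toy registry mapping GIDO kinds to the components that produce or consume them."""

    def __init__(self):
        self._producers = defaultdict(set)
        self._consumers = defaultdict(set)

    def register(self, component, produces=(), consumes=()):
        """Advertise the kinds of GIDOs a component produces and/or consumes."""
        for kind in produces:
            self._producers[kind].add(component)
        for kind in consumes:
            self._consumers[kind].add(component)

    def producers_of(self, kind):
        """Return the sorted names of components that produce this kind."""
        return sorted(self._producers.get(kind, ()))

    def consumers_of(self, kind):
        """Return the sorted names of components that consume this kind."""
        return sorted(self._consumers.get(kind, ()))


directory = Matchmaker()
directory.register("host-sensor", produces=["FileDeletion"])
directory.register("analyzer", consumes=["FileDeletion"], produces=["Alert"])
directory.register("responder", consumes=["Alert"])

print(directory.consumers_of("FileDeletion"))
print(directory.producers_of("Alert"))
```

Here each GIDO kind plays the part of a category: a producer looks up the kind it emits to find interested consumers, and vice versa.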

The directory may also contain public key certificates, which allow components to authenticate each other and to verify each other's authorization information before sending GIDOs back and forth.

The additional infrastructure requirement is for a message layer that provides secure (with privacy, authenticity, and integrity mechanisms) and reliable messaging in environments subject to attack.

6. Status (July 1998)

The CIDF language, APIs, and infrastructure are far from complete. The language needs to be constrained further to eliminate confusing ambiguities; much of the directory service still needs to be ironed out; and common APIs are needed to meet the initial objective of reusable components.

Nevertheless, the CIDF working group is making progress. We recently conducted an interoperability test, involving independently developed programs using the CIDF specification. Many of the programs achieved full interoperability (within the limited scope of their tests), and two other programs attempting to do real (albeit simple) intrusion detection came within a hair's breadth of interoperating on the very first try.

We are presently planning a more complete test of CIDF in which several ID systems will interoperate; this is slated for March 1999.


* Presenting



