CERT
ISW'97 site

DEFINING SURVIVABILITY

Randy Browne
New Jersey Computer and Communications Corporation
2001 U.S. Route 46 East; Suite 310
Parsippany, New Jersey 07054-1315

In keeping with current thinking on open systems, one would like a survivable system to consist of a large number of individual survivable networks (clusters) that are "self-extending". Any number of such survivable clusters could be connected through various gateways and would, with little (if any) human intervention, automatically form a larger survivable cluster, perhaps by exchanging threat, vulnerability and service data, and then automatically replicating and/or migrating resources across cluster boundaries to actually increase (and not reduce!) the survivability of services. To thwart rapid and highly automated attacks, such "self-extending" clusters would have to be much more highly automated than contemporary "human/interactive" intrusion-detection systems. One (but not the only) technical barrier to achieving "self-extending" survivability is finding an adequate definition of survivability. In this paper, I briefly touch on three issues, all of which pertain both to defining survivability and to solving the problem, with the main emphasis on definition.

(1) INITIAL RESEARCH SHOULD FOCUS ON DOMAIN-SPECIFIC SURVIVABILITY. In the near term, it would be best to address survivability thoroughly for specific domains, and to defer emphasis on general paradigms and mechanisms. My opinion arises in part from my concurrence with the view implied in the call for this workshop: that survivability is a "compound-property" (i.e. it means more than "the sum of" safety, correctness, security, integrity, service assurance, etc., as opposed to meaning something unrelated to these). Consequently, survivability is at least as hard as the integrity problem, since survivability should, among other things, mean or imply "maintaining some kind of integrity under duress". Much research was conducted during the 1980s to find general integrity models. While general integrity mechanisms have been found (e.g. Type Enforcement [BOEBERTKAIN]), there is no consensus on a general semantics for integrity, and almost all of the meaningful issues are domain-specific. Since survivability is (at least) a "compound-property", a suitable definition of survivability would likely inherit some domain dependencies from its "sub-properties" (e.g. integrity).

(2) A CHALLENGE IS TO FIND "BLACK-BOX" SURVIVABILITY DEFINITIONS. Even if we restrict ourselves to domain-specific survivability, it is still a challenge to define survivability without referring to a system's high-level design. It is easier to come up with good but ad hoc "gray-box" definitions of survivability than with "black-box" definitions (which are probably simpler, more abstract, and consequently preferable). This is related to the first issue, but an example will illuminate other matters.

Consider a simple real-time communications system wherein every message must be delivered within 3 seconds of its input. Suppose that at (time) T=0, a message with content "A" is input to the system; at T=1, a message with content "B" is input. Then, at T=2, a "virus" attacks the system, with the result that at T=3, "B" is output; at T=4, "A" is output. The "obvious" interpretation of the "virus attack" is that "B" was delivered on time (2 seconds after input, i.e. 1 second before its deadline) and "A" was not (4 seconds after input, i.e. 1 second late). But an equally valid interpretation is that "A" was corrupted into "B" (and delivered on time) and "B" was corrupted into "A" (and also delivered on time). Unless we consider causal relationships between inputs and outputs, we cannot know whether our real-time communications system violates integrity or violates service assurance (i.e. timing constraints). Each property can be judged satisfied when considered in isolation from the other: the first interpretation preserves integrity, and the second preserves service assurance. No single interpretation, however, satisfies both simultaneously.
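The ambiguity in the example above can be made concrete with a small sketch. It is only an illustration under stated assumptions: the names (DEADLINE, judge, interpretations) are hypothetical, each input is assumed to cause exactly one output, and "integrity" is reduced to "content unchanged" while "service assurance" is reduced to "delivered within 3 seconds of input". The sketch enumerates every causal pairing of inputs to outputs and reports which property each pairing preserves.

```python
from itertools import permutations

DEADLINE = 3  # assumed: seconds allowed between a message's input and its output

# (time, content) events taken from the example in the text
inputs = [(0, "A"), (1, "B")]
outputs = [(3, "B"), (4, "A")]


def judge(pairing):
    """Evaluate one hypothesized causal pairing of inputs to outputs.

    pairing is a list of ((t_in, c_in), (t_out, c_out)) tuples.
    Returns (integrity_ok, timing_ok).
    """
    # Integrity (simplified): every message leaves with the content it entered with.
    integrity_ok = all(c_in == c_out for (_, c_in), (_, c_out) in pairing)
    # Service assurance (simplified): every message leaves within the deadline.
    timing_ok = all(
        0 <= t_out - t_in <= DEADLINE for (t_in, _), (t_out, _) in pairing
    )
    return integrity_ok, timing_ok


def interpretations(ins, outs):
    """Enumerate every one-to-one assignment of inputs to outputs."""
    results = []
    for perm in permutations(outs):
        pairing = list(zip(ins, perm))
        results.append((pairing, judge(pairing)))
    return results


results = interpretations(inputs, outputs)
for pairing, (integrity_ok, timing_ok) in results:
    print(pairing, "integrity:", integrity_ok, "timing:", timing_ok)
```

Under these assumptions, one pairing preserves timing but not integrity and the other preserves integrity but not timing; neither preserves both. Observing inputs and outputs alone does not select between the two pairings, which is why a causal relationship has to be supplied from elsewhere.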

The above example is relevant to defining survivability because, for analysis purposes, it is desirable to separate survivability into its "sub-properties" (e.g. integrity, service assurance, etc.) and to establish each "sub-property" individually. The various "sub-properties" would then have to be consistently recombined into an overall survivability condition (probably using auxiliary conditions, noting that consistently recombining trustworthiness properties is hard [MCLEAN]). Partly because of the above example, and extrapolating from the results in [BOULAHIA-CUPPENS], I suspect that for a "compound-property" like survivability, causality best captures the link between "sub-properties". In the above example, integrity and service assurance are linked by a particular (causal) input/output relationship for a communications system. Had we been discussing a database system, the input/output behavior could entail a very different link between integrity and service assurance. Unfortunately, if I am correct about the need to express causal relationships (to link the "sub-properties" of survivability) in a specification, a top-level survivability specification for a complex system is likely to be "littered" with auxiliary variables making reference to internal system state, subsystems, etc., which makes it more of a "gray-box" definition of survivability than a "black-box" one. The challenge is to find "dark" (if not completely "black") definitions of survivability.

(3) MUCH "PREDICATE TECHNOLOGY" NEEDS TO BE DEVELOPED. One of this author's research interests is how mobile computing enhances survivability; yet, with the current state of the art, mobile computing can actually impair survivability, since various trustworthiness properties can interact in complex ways. In a tactical combat environment, one might have a critical system service replicated among a small number of server hosts (which are geographically dispersed for protection). Unfortunately, mobile communication patterns may covertly disclose the physical location of those servers (through a so-called "covert channel") and thus expose those servers to the risk of physical destruction in combat (I intend information survivability to include physical threats). Note that this can be a problem even if "users" do not have any explicit secrecy requirements; the requirement to avoid disclosing the physical locations of critical servers can come from the overall threat and not directly from "users' wishes".

Covert channel phenomena can thus impair survivability, as in the mobile computing example. Although the non-disclosure problem is well studied (perhaps even over-studied), the few past attempts (e.g. [BROWNE, WEBER]) have not produced a thorough and pragmatic treatment of the covert channel problem, which remains largely unresolved. Thus, various "predicate technologies" (such as covert disclosure protection) need to be developed further in order to implement survivable systems. We must recognize that survivability demands better solutions to old problems, and not just solutions to new problems.

REFERENCES

[BOEBERTKAIN] Boebert, W.E., and Kain, R.Y., "A Practical Alternative to Hierarchical Integrity Policies", 8th NCSC, Gaithersburg, Maryland, 1985.

[BROWNE] Browne, R., "An Entropy Conservation Law for Testing the Completeness of Covert Channel Analysis", 2nd ACM Conference on Computer and Communications Security, Fairfax, Virginia, 1994.

[BOULAHIA-CUPPENS] Boulahia-Cuppens, N., and Cuppens, F., "Asynchronous Composition and Required Security Conditions", 1994 IEEE Symposium on Research in Security and Privacy, Oakland, California.

[MCLEAN] McLean, J., "A General Theory of Composition for Trace Sets Closed Under Selective Interleaving Functions", 1994 IEEE Symposium on Research in Security and Privacy, Oakland, California.

[WEBER] Weber, D., "Quantitative Hook-Up Security for Covert Channel Analysis", The 1988 IEEE Computer Security Foundations Workshop.


