CERT
Back to [16]   [17]    Forwards to [18]



Position Paper for the 1998 Information Survivability Workshop

Emergent Algorithms for Survivable Systems

David A. Fisher    &    Howard F. Lipson
dfisher@cert.org          hfl@cert.org

CERT® Coordination Center
Software Engineering Institute

Key sectors of our society are becoming increasingly dependent upon highly distributed information systems that operate in unbounded networks, such as the Internet. This includes many of the information system components of our nation’s critical infrastructures [PCCIP 97]. As a result, the risks to our society associated with successful intrusions into these information systems have never been greater.

Traditional security approaches may not be adequate to protect large-scale highly distributed systems that operate in unbounded networks. In the realm of networked information systems, the natural escalation of threats versus countermeasures has demonstrated time and again that practical systems are always vulnerable to attack. In fact, the fundamental underlying assumption of survivability is that no individual component of a system can be made immune to all attacks, accidents, and design errors. Most attributes of survivability are characteristics of a system as a whole and not of its individual components.

This paper introduces the concept of emergent algorithms as a new method for enhancing survivability in unbounded systems. Survivability involves the generation and maintenance of software quality attributes in the form of global nonfunctional properties that often cannot exist at the level of individual system components. Thus, distributed or network-based solutions for enhancing survivability are not only characteristic of the problems of primary concern, but may also be essential to their solution.

Distributed solutions that merely partition sequential solutions and then distribute the parts over a network often require resources in each node proportional to the size of the network. Emergent algorithms offer the possibility of fulfilling survivability, performance, and other nonfunctional software quality goals, in highly distributed systems operating on unbounded networks.

The emergent character of attributes for survivability and the need for more effective methods to generate and maintain other global nonfunctional properties in unbounded networks, suggest an approach analogous to those of natural processes in biological systems, social behavior, and economic systems in generating emergent properties. The current fascination with emergent computation [For 91a, For 91b, Res 96] arises to a great extent from observing global properties that are naturally generated, but do not exist anywhere locally within components of those systems.

Intuitively an emergent algorithm produces global effects through cooperative local actions distributed throughout a system. The task is to refine this notion into a set of guidelines that are useful for developing and analyzing practical emergent algorithms that fulfill explicit functional and nonfunctional requirements. A first step in this process is to characterize emergent algorithms in a manner that differentiates them from other computations, through formal analysis and testing.

A useful definition of emergent algorithm would include computations that produce global properties that do not exist locally, but should not exclude similarly structured computations in which the properties exist locally as well as globally. This line of thought quickly leads to the realization that any attempt to characterize emergent algorithms entirely by the effects they produce will be either too limiting or will include any computation that generates global properties. Their characterization must include aspects of how the computation is organized, structured and carried out.

The final distribution of the results of a computation cannot serve as a useful criterion to distinguish emergent algorithms.  :Thus:

An emergent algorithm may produce results that are distributed globally but do not exist locally, that are entirely local, or that are represented both locally and globally.

Intuitively, emergent computations involve local actions and distributed cooperation.

The concept of "local action" includes the requirement that the nodes involved in a computation can communicate directly only with their immediate neighbors. There is no need to place limitations on the topology of the interconnections among nodes and their neighbors, but there must some limitation on the number of immediate neighbors per node. Any number proportional to the total number of nodes would fail to impose a formal limit. It is convenient and often desirable to limit the number of immediate neighbors per node to a fixed constant independent of the number of nodes, although any value less than proportional to the number of nodes could be used.

Each node executing an emergent algorithm can communicate directly only with a number of immediate neighbors that is less than proportional to the total number of nodes in the system.

Global visibility is the ability to observe the state of nodes throughout a system. If an emergent algorithm could read the state of nodes beyond its immediate neighbors (i.e. nodes proportional in number to size of the system), it would violate the communications requirement above. Similarly, central control is the ability to alter the state of nodes throughout a system. If an emergent algorithm could write to nodes beyond its immediate neighbors, it likewise would violate the communications requirement above.

An emergent algorithm can make use of neither global visibility nor central control. Each participant in an emergent computation can neither read nor write directly to nodes proportional in number to the size of the system.

Emergent computations are cooperative and distributed. For convenience and by custom, distributed normally means that the number of nodes involved in a computation is proportional to the total number of nodes in the system.

Results of an emergent algorithm must be critically dependent on information that is distributed over multiple nodes whose number is not bounded by any constant independent of the size of the system.

Whether in real-life situations, genetic algorithm, or networked computer systems, the topology of the interconnections that give meaning to the term immediate neighbor can change frequently. In the physical world, immediate or near neighbor is typically defined by distances which change as the participants change position. Networks are regularly reconfigured or expanded to satisfy changing needs. Thus, although every node participating in an emergent algorithm must be able to communicate with its immediate neighbors, the algorithm itself should not depend on knowledge of the network topology. The critical role of communications and relative unimportance of topology suggests that many aspects of emergent algorithms could be expressed most effectively as protocols. In the absence of global visibility and control, the only aspects of an emergent computation not covered by protocols are the state transitions within individual nodes:

An emergent algorithms can be expressed as a protocol together with a set of local state transitions.

Perturbation and undirected communications activity can be beneficial to the success and performance of emergent computations. Computations cannot adapt or converge to solutions until they have access to the appropriate information. In the context of emergent algorithms, information can only be transmitted through sequences of immediate neighbor communications. The available sources and needed destinations of information are often unknown locally. Strategies that depend on sending information only to where it is known to be needed (and only when it is needed), or on requesting information only from where it is known to be available, may be unsuccessful. Performance of an emergent algorithm may sometimes be improved by using otherwise idle communications capacity to transmit available information that might be needed elsewhere.

Activity, in the form of frequent communications among neighboring nodes, has the effect of distributing information to where it may be needed and is essential for rapid and efficient emergent computation.

Emergent algorithms are inherently distributed in character, may involve unreliable information and untrustworthy participants (automated or human) including those outside the administrative control of the system, may be implemented on platforms that are dynamically changing or not fully known, and may involve components that have been compromised. Thus, emergent algorithms may produce stochastic results. This is not inherent in the algorithms themselves, but as a practical matter is almost always dictated by the characteristics of the application. Any application with survivability requirements, and accompanying assumptions of compromised components, imposes a stochastic character on solutions.

An emergent algorithm may demonstrate stochastic behavior or have outcomes that are guaranteed only as a function of some statistical properties of the problem space.

Given all of the above considerations, we are led to the following definition:

An emergent algorithm is any computation that achieves formally or stochastically predicable global effects, by communicating directly with only a bounded number of immediate neighbors and without the use of central control or global visibility.

As the definition indicates, an emergent algorithm produces global effects through cooperative local actions distributed throughout a system.

If one is modeling natural phenomena, biological processes, or social systems that involve emergent properties, performance may not be a major concern. Our primary concern is with the practical application of emergent algorithms where conventional algorithms are unavailable or too expensive. Thus, the execution and storage costs associated with emergent algorithms have been a major focus of our work at the CERT/CC.

To gain insight into the potential performance of emergent algorithms, the above guidelines were used to develop emergent algorithms to solve some distributed network problems for which there are currently known and widely used solutions (such as the problem of Internet routing). The resulting emergent algorithm for Internet routing has the same execution costs as currently used routing algorithms (i.e., order ln3N, where "N" is the number of nodes in the system), but has space requirements per node that are extremely small relative to current solutions (i.e., order ln3N versus order N ln N).

 

Footnote:

® CERT is registered in the U.S. Patent & Trademark Office

References

[For 91a]
S. Forrest, editor, Emergent Computation — Proceedings of the Ninth Annual CNLS Conference, MIT Press 1991, 452 pp.

[For 91b]
S. Forrest, "Self-Organizing, Collective, and Cooperative Phenomena in Natural and Artificial Computing Networks", Introduction to the Proceedings of the Ninth Annual CNLS Conference [see For 91 a], p. 1-11.

[PCCIP 97]
Presidential Commission on Critical Infrastructure Protection, Critical Foundations — Protecting America’s Infrastructures, The Report of the Presidential Commission on Critical Infrastructure Protection, October 1997, 173 pp.

[Res 96]
M. Resnick, "Beyond the Centralized Mindset", Journal of the Learning Sciences, Volume 5, No. 1, 1996, p. 1-22.




Back to the Table of Contents
Back to [16]   [17]    Forwards to [18]