[24]
![Forwards to [25]](../all_the_pictures/arrow_right.jpg)
Borg: A Scalable and Secure Distributed Information System
Han Kiliccote1 and Pradeep Khosla2Institute for Complex Engineered Systems
Hamburg Hall 1201
Carnegie Mellon University
Pittsburgh, PA 15213
1. Introduction
Borg is a novel paradigm for distributed agent-based information systems that retains advantages of centralized architectures while eliminating the need for a central server. In Borg agents are responsible for performing the services usually provided by the server and master agent. Also, by applying a novel security and replication scheme, Borg makes the information system safe against attacks, robust towards failures and resilient to performance bottlenecksData and reasoning capabilities are distributed to Borg Agents. The agents in this architecture perform the services usually provided by the server and master agent. However, agents still perceive the services provided by other agents as performed by a "monolithic server" or "master agent." Agents can request services (such as to store data) from this "virtual" server. This virtual server is the aggregation of all the agents and acts as a single logical unit. Each agent connected to the system becomes part of the whole system. It downloads part of the knowledge stored in other Borg Agents, and contributes to the queries. The information in the database is replicated using an innovative method multiple times, which eliminates dependency on any individual. Even if significant percentages of the Borg agents are damaged or destroyed, the overall performance is minimally affected and no information is lost. Since there is no central command structure (master agent) or central storage facility (server) in Borg, it also does not contain a single point of failure, i.e., it is not possible to destroy Borg or cause a security breach by eliminating or capturing a few selected individuals.
While retaining the advantages of centralized architectures, our approach eliminates the need of a central server and master agent by making agents responsible for performing the services usually provided by the server and master agent. Also, by applying a novel security and replication scheme we developed, we make the system safe against attacks, robust towards failures and resilient to performance bottlenecks.
2. Overview
Borg assumes that during the normal operation of the system, some percentage of agents in the system will not be available, i.e., the unavailability of some agents is not an exception but the norm. Based on this assumption, to make the system robust against component failure, Borg fully distributes all information in a redundant fashion.Since in the envisioned application domains, the number of agents can be very large (e.g., millions of agents), it is necessary to devise automatic schemes for partitioning and distributing knowledge that meet our security, reliability and efficiency criteria. When a Borg Agent is created and added to the system, it downloads some part of the information that can be extracted from other agents. The amount of information that is extracted depends on various aspects. These include: the available information in the agent-based system, storage available in the agent, the processor speed of the new agent, the expected reliability of the links to the machine in which the Borg agent is running, the reliability of this machine (if known), and the value of the information for potential intruders. Our approach will also incorporate (not yet implemented) information such as MTTF (mean time to failure) and MTTR (mean time to repair). Since many of these criteria are dynamic, the amount of information stored in an agent may change dynamically.
The mechanisms for distributing information across agents are entirely distributed, to minimize the vulnerability and to ensure the scalability of the approach. There is no central database that specifies which information is stored on which agent; instead, each agent can determine by itself the agent that can fulfill a request for information, and it also possesses information about how to find out about the expertise of other agents. This is accomplished by associating each piece of information with an identifier that can be used to find the location of the information that does this. This novel approach makes it possible to obtain information without having to query all agents (e.g., via broadcast). It therefore avoids unnecessary consumption of precious bandwidth resources.
3. Security through Distribution
Security is one of the primary advantages of Borg. The first security scheme is provided by the very nature of the Borg architecture. In Borg, information is dynamically broken into smaller pieces and distributed to many agents. Thus, agents never possess large quantities of information that can endanger the mission if intruders captured or otherwise gained access to the information contained in a small number of agents. This type of information hiding is similar to steganography (making information invisible). Steganography deals with the hiding of messages so that potential monitors do not even know that a message is being sent. It is different from cryptography where the monitors have the encrypted message but cannot decrypt it easily. This scheme is similar to a jigsaw puzzle. For example, the enemy, by examining the 20 pieces of a jigsaw puzzle shown in Figure 1, cannot extract the information stored in the bigger picture because there are so many other pieces missing.

Figure 1. Puzzle pieces and completed puzzle
Since their invention 20 years ago, various secure information dispersal algorithms have been proposed. One of the first such protocols, described in (Blackley 1979), divides any message (e.g., the secret recipe of a soft drink) into n pieces, called shares or shadows, such that any m of them can be used to reconstruct the message. Such a scheme is also called an (m-n)-threshold scheme. For example with a (4, 10)-threshold scheme, someone can divide the message among ten participants, such that any four of them can put their shares together and reconstruct the secret message. Even if six participants are not available, the message can still be constructed. However, the threshold scheme guarantees that three participants cannot, in anyway, reconstruct the message. Blackley's scheme work in an m dimensional space where secrets are random points and shares are m-1 dimensional planes.

Figure 1 Simplified Blackley's Secure Information Dispersal Scheme, m = 2, n = 3
Secret sharing schemes were invented independently by (Shamir 1979) and (Blackley 1979) and have been extensively investigated since then (refer to Simmons 1992 for an introduction to the topic). During the last twenty years, various methods and algorithms were developed to extend the capabilities of threshold schemes. Some of these algorithms even guarantee integrity of the information with cheaters (participants that deliberately reveal wrong shares), verify the validity of the shares without reconstructing the message, prevent the reconstruction of the share if more than k participants do not want to reveal the secret, and dynamically update the shares so that a participant can be disenrolled.
We are currently working on a distributed virtual processor that extends the secret sharing schemes so that the distributed information can also be processed securely. In this virtual processor, the information is distributed to multiple agents using a special secret sharing scheme we developed. To process the information, each agent uses the special methods we developed and updates their local shares without communicating with (thus without revealing the information). At the end of the computation, only the output of the computation needs to be combined to reveal the result of the computation.
4. Methods for agent and information replication
As the number of agents becomes very large, the system is subject to frequent component failure. Thus, reliability against failure is a primary concern in developing large agent-based systems. The mechanisms we developed for securing information are also used to automatically replicate information many times across many agents. Every time information is replicated, it is partitioned differently. Using the analogy of the jigsaw puzzle, our approach creates many puzzles, all with different pieces (cuts). The proposed scheme provides reliability for individual component failure. Our new mechanisms ensure that after the dispersion, the data is in fact replicated n - m times. Thus, as long as not all n - m agents that carry a piece of information fail all at the same time, no information will be lost. We are also working on providing mechanisms for automatically propagating information among agents that carry duplicates of similar information and mechanisms that regenerate shares when a large percentage of Borg agents become unavailable.Technically, our scheme is realized through one-way hashing functions. More specifically, our approach employs a family of different hashing functions, which are guaranteed to sort information into different buckets (and hence to yield different partitionings of the information).
In the future, we are planning to employ methods that dynamically update the structure and the behavior of the system by monitoring and analyzing system wide performance metrics. For example, one extension we are working on will allow Borg to dynamically create more agents and replicate the most-requested information and services in those agents as the load in the system increases or failures cause the system reliability to drop below a certain level. We are also planning to add functionality to Borg that can be used to automatically eliminate agents, or change the services provided by certain agents, so as to improve the overall system performance and robustness. We are also planning to develop methods that will enable Borg to intelligently recover from and to manage faults, by substituting working agents for those that have failed. Our research will devise mechanisms that increase the overall system performance, by replicating services and information stored in agents.
5. Summary
Borg is a general framework for developing highly distributed, very large agent-based information systems. The infrastructure of Borg is founded on a serverless distributed semantic network architecture. This architecture distributes information stored in the global knowledge base to multiple computers so that breaking into a few selected machines will reveal no information or no information is lost by the failure of a small percentage of components or links.This architecture can be used in many mission-critical applications because (1) the architecture provides the ease of implementation of centralized architectures; (2) the reliability is higher compared to other systems; (3) security of the information is attained by using an innovative method; (4) the architecture is scalable to very large number of agents.
1 Research Engineer. Institute for Complex Engineered Systems. Carnegie Mellon University
2 Professor and Director. Institute for Complex Engineered Systems. Carnegie Mellon University
[24]
![Forwards to [25]](../all_the_pictures/arrow_right.jpg)





