- Pattern Name
- BeeHive
- Aliases
- Hub, Collector, Concentrator, Accumulator, Amortize Failure
- Context
- A data-mining process must make a best effort to locate, and then process, information that is highly multi-dimensional and distributed.
- Problem
- A system must process information that is available through a vast state space that is constantly changing. Processing an area until it is exhausted is not enough because new information may become available at any time. Iterating through the state space sequentially is impossible due to its size.
- Forces
- Part of the effort to locate new information must be dedicated to inspecting both completely new and previously inspected areas of the state space. Areas already inspected may well hold appropriate new information, but it appears at a relatively slow rate.
- Solution
- Break up the problem into many independent agents, each inspecting a portion of the state space in a somewhat random way. Have each agent return not only with the information harvested itself, but where the information came from as well. Record the information in a centralized repository. As each agent returns with its information, it is re-instructed to search in a new area with adjusted search criteria depending on current needs. Uninspected portions of the state space are penetrated by a stochastic process. When an area rich in appropriate information is revealed, then more agents are redirected into that state space when they become available.
- Resulting Context
- Parts of the data space finally get some attention rather than none. The effort spent investigating the vast unknown does not occupy a major portion of the processing power. This is essential particularly if well-known portions of the state space have a 'peak period' that requires dedicated resources. 'Off-peak' investigation optimizes overall utility of such systems because no absolute 'downtime' occurs.
- Design Rationale
- When a system must comb through large amounts of information in a vast state space, the biological model of a Bee Hive is a useful model for a 'collector-concentrator' approach to what is being gathered. Each bee goes out looking for a nectar source and returns with whatever it finds and instructions on where to find it. Each bee is subject to a wide variety of conditions and factors that might by chance place it near a resource it can notify other bees about. The key factor here is that in order to exploit the state space effectively, a large portion of the initial effort to gather resources must be spent on failure. After having searched through the areas that yield nothing, eventually [shift to calculation-based metaphor] the collector node directs the satellite nodes to independently exploit the state space of the richest sources. The collector node also spends a portion of its dialog with the satellites directing them to as-yet uninspected areas or to revisit previously unfruitful areas. In N-dimensional terms, the system as a whole does not exhaustively traverse the state space, but inspects it stochastically. This 'fills in the picture' like the plotting of sparse points in a fractal pattern. Eventually, depending on the richness of the data source, the satellites may penetrate, via an N-dimensional 'pinhole', an area that is much richer than its 'neighborhood'. This methodology amortizes the cost of failure while taking full advantage of available resources.
- Examples
- Groups of Intelligent Agents with a common goal, Data mining the World Wide Web, ad hoc queries of Data Warehouses that search for critical data patterns, Making honey. (Immune response too, perhaps? -- BrianFoote ; Yes, the goal is to eliminate antigens. Each white blood cell is an independent agent, mutating antibody RNA sequences. -- David)
- Related Patterns
- ModelYourEnvironment, CriticalNumberOfWorkers, CriticalResourceFlow, ModelYourSelf
-- DavidCymbala
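The Solution above might be sketched roughly as follows. This is only a minimal illustration, not anyone's actual system: the names Hive, report, next_assignment, explore_fraction, and source are all made up for the example, the state space is assumed to be a finite list of regions, and "adjusted search criteria" is reduced to simply picking the next region. The hub records what each agent found and where, sends most returning agents to the richest known region, and keeps a fixed fraction wandering at random so that new and previously barren areas still get inspected.

```python
import random
from collections import defaultdict


class Hive:
    """Hypothetical central repository: records findings and where they came from."""

    def __init__(self, regions, explore_fraction=0.2, rng=None):
        self.regions = list(regions)              # assumed: a finite list of areas
        self.explore_fraction = explore_fraction  # share of effort spent "on failure"
        self.rng = rng if rng is not None else random.Random()
        self.visits = defaultdict(int)            # region -> number of inspections
        self.yield_total = defaultdict(float)     # region -> items harvested so far
        self.findings = []                        # (region, item) pairs

    def report(self, region, items):
        """An agent returns with its harvest and the place it came from."""
        self.visits[region] += 1
        self.yield_total[region] += len(items)
        self.findings.extend((region, item) for item in items)

    def next_assignment(self):
        """Re-instruct a returning agent: explore at random or exploit the richest area."""
        if not self.visits or self.rng.random() < self.explore_fraction:
            # Stochastic exploration; may revisit previously unfruitful areas.
            return self.rng.choice(self.regions)
        # Exploitation: the visited region with the best average yield so far.
        return max(self.visits, key=lambda r: self.yield_total[r] / self.visits[r])


def run(hive, source, n_agents=5, rounds=100):
    """Drive n_agents through the given number of rounds.

    source(region) is an assumed callable that returns the list of items
    an agent finds when it inspects that region.
    """
    assignments = [hive.next_assignment() for _ in range(n_agents)]
    for _ in range(rounds):
        for i, region in enumerate(assignments):
            hive.report(region, source(region))
            assignments[i] = hive.next_assignment()
    return hive


if __name__ == "__main__":
    # Toy state space: only region 7 holds anything.
    demo = run(Hive(range(10)), lambda r: ["nectar"] * 3 if r == 7 else [])
    print(dict(demo.visits))
```

Raising explore_fraction spends more effort on the unknown; lowering it concentrates agents on known sources, which is the trade-off the Forces section describes.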
I appreciate your design rationale section, especially! That helped me most, even the N-dimensional stochastic search bit. Then I went back and read the pattern top to bottom understanding it. It does remind me a little bit of the stochastic anti-aliasing algorithm used in computer graphics (although there the data is static while enormous). I could do with some examples I understand though, if I get to put in a request. Simply saying data mining and ad hoc queries doesn't educate or convince me. Are there specific systems out there that use this technique (besides bees)? -- AlistairCockburn
After attending the Autonomous Agents '98 conference, I found a veritable flood of examples of this pattern. Agents are ideal examples of uses for it. The RoboCup competition displayed some intriguing examples of centralized control strategies versus individual robot autonomy. Teams made each robot smart enough to handle movement and navigation tasks, then worked their way up to team strategy in the centralized part of the system. -- David
try http://av.yahoo.com/bin/query?p=robocup&z=2&hc=0&hs=0
Verrry Interesting. Questions:
-- PeterMerel
A BlackBoard can be used by systems that create decision-theoretic plans and attempt to create a crude or sophisticated model of the strategy environment. BeeHive probably doesn't directly imply this aspect of the system. I'm hoping to beef up ModelYourEnvironment to address these issues.
-- David
Bees probably model the external environment as a group, building mutual
awareness of food sources. The 'dance' that a bee does to transmit the
information to the other bees in the hive probably also reinforces the
model of the internal state of the hive that the bees collectively hold.
-- David
Remember, HoneyCombs have a HexagonalArchitecture
CategoryPattern