A common requirement in many emerging applications is the ability to process a continuous, high-volume stream of data in real time. The real-time nature of data stream systems and the vast amounts of data they must process introduce fundamental new problems. A useful type of monitoring query, called a threshold query, detects when a value rises above or falls below a predetermined threshold. A threshold query over distributed streams can be implemented by collecting all the data (for example, all mail items in a spam-filtering application) at a central location, but such a solution is very costly in terms of communication load.
This technology implements a threshold query over monitored data streams with a significantly reduced communication load. A vector of time-varying variables is derived from each stream. Given an arbitrary function from the vector space to the real numbers and a threshold value, the invention detects when the value of the function over the weighted average of the vectors derived from the streams crosses above or below the threshold, while minimizing communication between the nodes holding the streams. A geometric analysis of the problem yields a constraint on each individual stream. As data arrives on the streams, every node verifies locally that the constraint on its stream has not been violated. The analysis guarantees that as long as the constraints on all the streams are upheld, the result of the query remains unchanged, and thus no communication is required. Solutions exist both for closely coupled environments, where the nodes can communicate efficiently, and for loosely coupled environments, where the broadcast cost is high.
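The per-node check described above can be illustrated with a minimal sketch. It is not the invention's actual implementation: the form of the local constraint used here (a ball whose diameter runs from the last shared estimate to the node's "drift" vector) is an assumption borrowed from the general geometric-monitoring idea, the monitored function is fixed to the Euclidean norm because its range over a ball has a closed form, and all function and parameter names are hypothetical.

```python
import math

def ball_from_drift(estimate, last_sent, current):
    """Return (center, radius) of the ball whose diameter runs from the
    shared estimate vector to this node's drift vector.

    Assumption: drift = estimate + (current - last_sent), where last_sent
    is the local vector this node reported at the last synchronization.
    """
    drift = [e + (c - s) for e, s, c in zip(estimate, last_sent, current)]
    center = [(e + d) / 2 for e, d in zip(estimate, drift)]
    radius = math.dist(estimate, drift) / 2
    return center, radius

def norm_range_on_ball(center, radius):
    """Exact (min, max) of the Euclidean norm over a ball."""
    c = math.hypot(*center)
    return max(0.0, c - radius), c + radius

def local_constraint_holds(estimate, last_sent, current, threshold):
    """True if the monitored function (here: the Euclidean norm) stays on
    the same side of the threshold over the whole ball as it is at the
    estimate. While this holds, the node has no need to communicate."""
    center, radius = ball_from_drift(estimate, last_sent, current)
    lo, hi = norm_range_on_ball(center, radius)
    if math.hypot(*estimate) > threshold:
        return lo > threshold   # whole ball must stay above the threshold
    return hi <= threshold      # whole ball must stay at or below it

# Hypothetical usage: the estimate is below the threshold of 2.0.
estimate = [1.0, 0.0]
small_change = local_constraint_holds(estimate, [1.0, 0.0], [1.5, 0.5], 2.0)
large_change = local_constraint_holds(estimate, [1.0, 0.0], [4.0, 0.0], 2.0)
```

In this sketch a node with a small local change keeps silent, while a node whose ball reaches across the threshold would have to report. For an arbitrary function the range over the ball generally has no closed form and would need to be bounded numerically.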
• Lower high-volume data stream monitoring costs
• Higher data mining speed
• Increased data stream monitoring volume
• Enhancing the performance of distributed classification tasks (for example, spam mail filtering)
• Performing global, complex monitoring tasks in sensor networks, large distributed systems, and large communications networks
Technical Keywords: Threshold monitoring, geometric analysis, threshold query, data stream, data mining
Market Keywords: Threshold monitoring, geometric analysis, threshold query, data stream, data mining