Big Data: Searching for a Needle in a Needle Stack

Arvind

In the world of Big Data, ‘search’ is the quintessential problem that we always hear about: finding something you are looking for in a massive amount of data of all types, formats and sizes. An analogy often used to describe Big Data search problems is the proverbial “needle in the haystack”, with the additional caveat that the haystack is getting bigger every day. While search is an extremely challenging problem, there are two good places to start: you usually know what you are looking for (the needle), and it is easy to tell the difference between a single piece of hay and the needle. The problem of searching in Big Data therefore often becomes a problem of organizing the data, finding the most efficient ways to partition it, and scaling the process of looking at every piece of hay.

There is another class of problems in Big Data around ‘discovery’. You are looking in that same massive dataset, but now you are looking for something that you don’t know. Something unknown. Something unexpected. Something hidden. Since you don’t know what you are looking for, you cannot describe it and often don’t know where to start. If search is looking for a needle in a haystack, discovery is looking for a needle in a needle stack!

But discovery problems are being worked on every day. Intelligence analysts are trying to discover new threats. Medical researchers are trying to discover new drugs. Financial analysts are trying to discover new trading strategies. Across every industry, discovery problems abound, and solving them generates incredible value for organizations.

From the time of Archimedes to the California Gold Rush to modern times, various versions and spellings of the Greek word heureka (meaning ‘I have found’) have been used to convey the discovery of something of incredible value that was unknown, unexpected, or hidden. It speaks to the fundamental human emotion of finding something for the very first time: something that no one else had thought of or could even describe. That moment of discovery. That moment of elation.

YarcData is proud to announce the launch of ‘The uRiKA Moment’, our blog dedicated to the discovery of unknown relationships in Big Data, and to the theory, science, application and practice of graph analytics that helps data scientists and business users in their voyage of discovery. Here’s to your uRiKA moment…
