DARK DATA EXPLAINED – Prof. Ahmed Banafa
Dark data is defined as the information assets organizations collect, process and store during regular business activities, but generally fail to use for other purposes (for example, analytics, business relationships and direct monetization). Similar to dark matter in physics, dark data often comprises most of an organization’s universe of information assets. In practice, organizations often retain dark data for compliance purposes only. Storing and securing this data typically costs more (and sometimes carries greater risk) than the value it returns.
Dark data is a type of unstructured, untagged and untapped data that is found in data repositories and has not been analyzed or processed. It is similar to big data, which is large and complex unstructured data (images posted on Facebook, email, text messages, GPS signals from mobile phones, tweets, TikTok videos, Snaps, Instagram pictures, and other social media updates) that cannot be processed by traditional database tools. Dark data differs in that business and IT administrators mostly neglect its value.
Dark data is also known as dusty data.
Dark data is found in log files and data archives stored within large enterprise-class data storage locations. It includes all data objects and types that have yet to be analyzed for business or competitive intelligence, or to aid business decision making. Typically, dark data is complex to analyze and stored in locations where analysis is difficult, so the overall process can be costly. It can also include data objects that have not been captured by the enterprise, or data that is external to the organization, such as data stored by partners or customers.
Up to 90 percent of big data is dark data.
With the growing accumulation of structured, unstructured and semi-structured data in organizations, increasingly through the adoption of big data applications, dark data has come specifically to denote operational data that is left unanalyzed. Such data is seen as an economic opportunity for companies if they can take advantage of it to drive new revenues or reduce internal costs. Some examples of data that is often left dark include server log files that can give clues to website visitor behavior, customer call detail records that can indicate consumer sentiment, and mobile geo-location data that can reveal traffic patterns to aid in business planning.
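As a rough illustration of the server log example, the sketch below counts successful page requests from web server access logs. It assumes the logs follow the Common Log Format; the sample lines, the function name page_hits and the regular expression are hypothetical and would need adjusting to a real server's log layout.

```python
import re
from collections import Counter

# Assumed layout (Common Log Format):
# host ident user [timestamp] "METHOD path protocol" status size
LOG_LINE = re.compile(
    r'^(\S+) \S+ \S+ \[[^\]]+\] "(\S+) (\S+) [^"]*" (\d{3}) \S+'
)

def page_hits(lines):
    """Count successful (HTTP 200) requests per requested path."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group(4) == "200":
            hits[match.group(3)] += 1
    return hits

# Hypothetical sample data, for illustration only.
sample_log = [
    '203.0.113.5 - - [10/Oct/2023:13:55:36 +0000] "GET /pricing HTTP/1.1" 200 2326',
    '203.0.113.9 - - [10/Oct/2023:13:56:01 +0000] "GET /docs HTTP/1.1" 200 5120',
    '203.0.113.5 - - [10/Oct/2023:13:57:12 +0000] "GET /pricing HTTP/1.1" 200 2326',
    '203.0.113.7 - - [10/Oct/2023:13:58:40 +0000] "GET /missing HTTP/1.1" 404 153',
]

if __name__ == "__main__":
    for path, count in page_hits(sample_log).most_common():
        print(path, count)
```

Even a simple count like this turns an otherwise unread log file into a ranking of the pages visitors actually request.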
Dark data may also be used to describe data that can no longer be accessed because it has been stored on devices that have become obsolete.
Types of Dark Data
Data that is not currently being collected.
Data that is being collected, but that is difficult to access at the right time and place.
Data that is collected and available, but that has not yet been productized, or fully applied.
Dark matter is a form of matter thought to account for approximately 85% of the matter in the universe. It is composed of particles that do not absorb, reflect, or emit light, so it cannot be detected by observing electromagnetic radiation. Unlike dark matter, dark data can be brought to light, and so can its potential ROI.
What’s more, a simple way of thinking about what to do with the data, namely a cost-benefit analysis, can remove the complexity surrounding the previously mysterious dark data.
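One possible way to frame such a cost-benefit analysis is sketched below. It is a deliberately simplified model, not an established method: the field names, the three-year horizon and all figures are hypothetical assumptions, and a real analysis would also weigh compliance obligations and the uncertainty of the value estimates.

```python
from dataclasses import dataclass

@dataclass
class DarkDataset:
    # All fields are hypothetical estimates supplied by the organization.
    name: str
    annual_storage_cost: float    # cost to store and secure the data
    annual_risk_cost: float       # expected cost of breach or compliance exposure
    analysis_cost: float          # one-off cost to make the data usable
    expected_annual_value: float  # new revenue or cost savings if analyzed

def net_benefit(ds: DarkDataset, years: int = 3) -> float:
    """Expected value over the horizon minus analysis and carrying costs."""
    benefit = ds.expected_annual_value * years
    cost = ds.analysis_cost + (ds.annual_storage_cost + ds.annual_risk_cost) * years
    return benefit - cost

def recommend(ds: DarkDataset, years: int = 3) -> str:
    """Suggest analyzing the data if the net benefit is positive."""
    return "analyze" if net_benefit(ds, years) > 0 else "review retention"

# Illustrative figures only.
server_logs = DarkDataset("server logs", 10_000, 2_000, 30_000, 40_000)
old_archives = DarkDataset("old archives", 5_000, 1_000, 20_000, 3_000)

if __name__ == "__main__":
    for ds in (server_logs, old_archives):
        print(ds.name, net_benefit(ds), recommend(ds))
```

With these illustrative numbers the server logs are worth analyzing, while the old archives cost more to keep and process than they are expected to return.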
Value of Dark Data
The primary challenge presented by dark data is not just storing it, but determining its real value, if any at all.