Big Data Analytics
One of the hottest terms in IT is big data: data sets so large and complex, typically in the high terabyte range and beyond, that traditional tools such as relational databases are unable to process them.
What Is Big Data?
Big data refers to collections of data sets whose size is beyond the ability of commonly used software tools, such as database management systems or traditional data processing applications, to capture, manage, and analyze within an acceptable time frame or at a reasonable cost.
Interestingly, data volume isn't the only way to characterize big data. For example, in his now-seminal 2001 article 3D Data Management: Controlling Data Volume, Velocity, and Variety, Gartner analyst Doug Laney described big data in terms of what is now known as the 3Vs:
- Volume: The overall size of the data set
- Velocity: The rate at which the data arrives and also how fast it needs to be processed
- Variety: The wide range of data that the data set may contain, such as web logs, audio, images, sensor or device data, and unstructured text, among many other types
Big data creates tremendous opportunity for the world economy, in fields ranging from national security to marketing, credit risk analysis, medical research, and urban planning. The extraordinary benefits of big data are tempered, however, by concerns over privacy and data protection.
With advances in ICT, and with devices connected to the Internet and to each other, the volume of data collected, stored, and processed by managers and employees is increasing every day. The mobile Internet in particular has enabled the emergence of big data analytics.
Big data analytics, as described earlier, refers to the process of examining large data sets to uncover hidden patterns, correlations, and other useful information in order to make better decisions.
Big data sizes are constantly increasing, from a few dozen terabytes in 2012 to many petabytes in a single data set today.
A petabyte is about a million gigabytes or the equivalent of about 20 million filing cabinets full of written data.
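The arithmetic behind that comparison is easy to check. This minimal sketch uses decimal (SI) units, in which each step up is a factor of 1,000:

```python
# SI data units: each step is a factor of 1,000.
GIGABYTE = 10**9   # bytes
PETABYTE = 10**15  # bytes

# A petabyte is about a million gigabytes.
gigabytes_per_petabyte = PETABYTE // GIGABYTE
print(gigabytes_per_petabyte)  # 1000000
```

(Binary units, where a gibibyte is 2**30 bytes, give a slightly different figure, which is why the text says "about" a million.)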
Walmart collects more than 2.5 petabytes of data every hour from customer transactions and uses that data to make better decisions.
Facebook uses the personal data you put on your page, tracks and monitors your online behavior, and then searches through all that data to identify and suggest potential “friends” and to target advertisements.
In an era of digital media, businesses and their customers expect more from big data. Big data has value for which customers will pay, making it an excellent business proposition. People and companies are attaching sensors to things that have never been measured before. On a personal level, consider what happens when you use a Fitbit.
The device’s sensors gather data from your movements and activities, including the number of stairs climbed, distance walked or run, calories consumed, calories burned, sleep patterns, and total number of steps taken. Most users were unaware of much of this data until they strapped on a Fitbit’s sensors.
Fitbit users can access data about their habits, which is synced from the device to the user’s smartphone or computer. A dashboard allows users to track their progress. Moreover, this kind of health data can be aggregated, and health habits shared with others, such as a health professional or a health analytics program.
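A hedged sketch of how such synced activity records might be aggregated into the daily totals a dashboard displays. The record format and numbers are invented for illustration and do not reflect Fitbit's actual data model or API:

```python
from collections import defaultdict

# Hypothetical synced activity records: (date, steps, calories_burned).
records = [
    ("2024-05-01", 4200, 180),
    ("2024-05-01", 6100, 250),
    ("2024-05-02", 9800, 410),
]

# Aggregate per-day totals, as a progress dashboard might show them.
daily = defaultdict(lambda: {"steps": 0, "calories": 0})
for date, steps, calories in records:
    daily[date]["steps"] += steps
    daily[date]["calories"] += calories

for date, totals in sorted(daily.items()):
    print(date, totals["steps"], totals["calories"])
```

The same grouping-and-summing pattern, scaled up, is how aggregated health data from many users could be summarized for a health professional or an analytics program.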
Using the aggregated data, a doctor could develop a more thorough picture of a patient’s overall health and habits. In industry, the Industrial Internet of Things is producing previously unimagined amounts of useful data.
- Adapted from Encyclopedia of Information Science and Technology, Fourth Edition, by Mehdi Khosrow-Pour, D.B.A.