
Big Data! What Is It Good For? Just About Everything!

by Guest Blogger

By Guest Blogger, Richard Villars, Vice President of Storage and IT Executive Strategies with IDC

As my colleague, Matt Eastwood, mentioned in an earlier blog, data creation is occurring at a record rate. After noting this development, business and IT executives tend to ask IDC one question: “So what’s driving all this data growth?”

In the financial services industry, the answer is easy to provide. The volume of diverse financial transactions occurring around the globe continues to accelerate. Players in this industry face ever stricter anti-money laundering regulations that require big investments in “Big Data” solutions. For most other industries, however, the answer to this question of data sources provides some interesting insights into which use cases are driving Big Data developments. Industries that just recently began to digitize their content are joining, or soon will join, financial services as some of the biggest Big Data consumers.

  • Media/entertainment: Moved to digital recording, production and delivery and are now collecting large amounts of rich content as well as user viewing and gamer playing behavior data
  • Healthcare: Quickly moving to electronic medical records and images, which they want to use for short-term public health monitoring and long-term epidemiological research programs
  • Life sciences: Will soon be generating large volumes of low-cost (<$1,000) gene sequencing data that needs to be analyzed to look for genetic variations and potential treatment effectiveness
  • Video surveillance: Transitioning from CCTV to IP TV cameras and recording systems that organizations want to automatically analyze for behavioral patterns
  • Transportation, logistics, retail, utilities, telecommunications: Generating data at an accelerating rate from fleet GPS transceivers, RFID tag readers, “smart meters”, and cell phones (call data records, or CDRs) that they want to use to optimize operations
  • Web and social media: Solutions such as TMZ, Facebook, and Twitter are among the “newest” new data sources. A number of new businesses are now building “Big Data” environments that leverage consumers’ (conscious or unconscious) nearly continuous streams of data about themselves (e.g., likes, locations, opinions).

IDC had a chance to speak with Cameron Befus, VP of Engineering at Tynt, an emerging leader in analysis for the media and advertising industries. A couple of years ago, Tynt took notice of the growing volume of “copy and paste” actions on many information websites. They wondered if anybody cared. They wrote a little program that would track these actions (they added other actions such as “print” later). As Cameron noted, “we started showing it to publishers; their eyes got big; and they wanted to know more about what they could do with it.” At the one-year mark they were collecting and analyzing over 4 billion data points (e.g., web site cut and paste operations) per month. Today (less than a year later), they’re approaching 20 billion data points per month with peak loads approaching 30,000 events per second. To handle this load they deployed a set of Hadoop clusters (production and development) with over 100 nodes and half a petabyte of capacity.
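To put those figures in perspective, a quick back-of-the-envelope calculation compares the average event rate implied by 20 billion data points per month with the quoted peak of 30,000 events per second. The 30-day month and the even spread of events are assumptions made here purely for illustration, and the function and variable names are hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_events_per_second(events_per_month: float, days_in_month: int = 30) -> float:
    """Average event rate, assuming events are spread evenly over the month."""
    return events_per_month / (days_in_month * SECONDS_PER_DAY)


monthly_events = 20e9   # "approaching 20 billion data points per month"
peak_rate = 30_000      # "peak loads approaching 30,000 events per second"

avg_rate = average_events_per_second(monthly_events)
peak_to_average = peak_rate / avg_rate

print(f"average: {avg_rate:,.0f} events/s")
print(f"peak is about {peak_to_average:.1f}x the average")
```

Under these assumptions the average works out to roughly 7,700 events per second, so the quoted peak is close to four times the average load. That gap between average and peak is the kind of number that shapes how much headroom a cluster needs.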

After talking about these data sources, the next question IDC hears is, “so how do I decide if a Big Data solution can deliver business value for my organization?” Regardless of industry or sector, the ultimate value of a specific Big Data use needs to be judged based on one or more of three criteria.

Does it provide more useful information?

IDC spoke to a major retailer that is implementing a digital video system throughout its stores, not only to monitor theft, but to analyze the flow of shoppers through the store at different times of day, week, and year. It also wants to compare flows in different regions. This effort makes it easier for the retailer to tune and assess layouts and promotion spaces on a store-by-store basis.

Does it improve the fidelity of the information?

IDC spoke to several earth sciences and medical epidemiological research teams using “Big Data” systems to monitor and assess the quality of data being collected from remote sensor systems. They are using Big Data not just to look for broad patterns (the obvious and traditional HPC use case), but to identify and eliminate anomalous and false data caused by malfunctions, user error or temporary environmental anomalies (think of a bird’s nest in the sensor).
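As a rough illustration of this kind of data-quality screening, the sketch below flags sensor readings that sit far from the median of a batch, using the median absolute deviation (MAD). This is one common technique picked here for illustration, not a description of what these research teams actually run; the function name, the 3.5 threshold, and the sample readings are all made up.

```python
from statistics import median


def flag_anomalies(readings, threshold=3.5):
    """Return the indices of readings whose modified z-score exceeds threshold.

    The modified z-score is 0.6745 * |x - median| / MAD, where MAD is the
    median absolute deviation of the batch.
    """
    med = median(readings)
    mad = median([abs(x - med) for x in readings])
    if mad == 0:
        # At least half the readings are identical; this simple sketch
        # cannot score the rest, so it flags nothing.
        return []
    return [
        i for i, x in enumerate(readings)
        if 0.6745 * abs(x - med) / mad > threshold
    ]


# Hypothetical temperature readings with one obviously bad value.
readings = [20.1, 20.3, 19.8, 20.0, 85.0, 20.2, 19.9]
suspect = flag_anomalies(readings)
print(suspect)  # -> [4]
```

A median-based score is used rather than a mean and standard deviation because a single wild reading would otherwise inflate the very statistics used to detect it.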

Does it improve the timeliness of the response?

As my colleague, Steve Conway, noted in another blog post, several private and government health care agencies around the world are deploying Big Data systems to reduce the time to detect insurance fraud from months (after checks have been mailed and cashed) to days (eliminating the legal and financial costs associated with fund recovery).

As a CIO or senior IT executive, you need to understand what your organization is planning around Big Data. Only then can you begin to develop a Big Data IT infrastructure strategy to support it.

Richard Villars is Vice President of Storage and IT Executive Strategies with IDC. His postings are his own opinions and may not represent AMD’s positions, strategies or opinions. Links to third party sites, and references to third party trademarks, are provided for convenience and illustrative purposes only. Unless explicitly stated, AMD is not responsible for the contents of such links, and no third party endorsement of AMD or any of its products is implied.
