Big Data! It’s Real, It’s Real-time, and It’s Already Changing Your World
Because I work in IT market research, a lot of people ask me, “What’s the next big thing?” It would be easy to pick a trend like data center transformation, converged IT infrastructure, the cloud, or the explosion in mobile devices. All are interesting developments that will alter the IT world. In terms of fundamental impact on business and society, however, I have to say that Big Data will be the most significant development in the next several years. It promises to completely reshape many industries and professions.
The first question that follows that declaration is: “What is Big Data?”
As will become clear in future posts, Big Data isn’t a specific technology, though many link it to Hadoop and MapReduce. It isn’t about any particular type of data, though many link it to unstructured, social media, or machine-generated data. It isn’t even about solving a specific problem, though many link it to goals such as fraud detection or smart grids.
Big Data is about the use of technologies (hardware and software) to manage, mine and analyze large collections of information to solve a wide range of complex problems. Virtually all of these Big Data projects address one or more of the following challenges:
- Dealing with heterogeneous data from multiple sources, often structured and unstructured (e.g., a diagnostic system that can simultaneously scan medical images, medical/pharmaceutical journals, epidemiological/genetic databases, and a patient’s own records)
- Dealing with high volumes (in terms of both size and rate) of data that are dynamic and constantly changing (e.g., a smart grid electric metering system that delivers 3,000 times as much data per site as traditional systems)
- Dealing with unpredictable content that has no apparent schema or structure (e.g., scanning public Twitter streams of people praising/complaining about a new product launch to anticipate demand spikes or brand image challenges)
- Enabling real-time or near-real-time collection, analysis, and use of information and conclusions
This final challenge, getting from data to value in hours, minutes, seconds, or less, is the “new” development that makes Big Data so important. Government agencies and companies in industries such as retail, healthcare, and telecommunications have dealt with large volumes of information for decades. In the past, however, the ability to find, analyze, and use that data often required massive investments in people or computing resources. Results were measured in months or, at best, weeks (usually long after the danger or opportunity had passed). The perception that Big Data projects are major science projects that are expensive, risky, and require skills not found in many traditional IT organizations remains the biggest barrier to broader adoption.
If you did a little investigating inside some of your own business units, you would likely be surprised by what’s already there in terms of Big Data projects. Thanks to a sustained and dramatic decline in the costs of compute power, memory, and storage capacity (along with new data handling techniques like Hadoop and MapReduce), it’s possible for some bright folks in your organization to deal effectively with all of the data variety, volume, and complexity problems. Even better, they can do so while it’s still possible to take advantage of the knowledge.
One final word of caution, however. Many of these Big Data projects are best described as “junior science projects” with a small core of servers and storage assets. They aren’t the next iteration of a Google-like compute grid, at least not yet. From a business and IT governance standpoint, though, these kinds of “junior science projects” can quickly turn into the next “Manhattan Project,” with company-wide and industry-wide business, organizational, and legal consequences.
For your IT organization, integrating Big Data initiatives and requirements into data center and IT services plans will be vital. IDC expects many of your primary IT suppliers to make sustained investments via acquisition and new product packaging in solutions that target the “Big Data” environment. Beyond more classic IT product companies, however, IDC also expects Big Data to be an area where leading cloud service providers (the early leaders in developing and deploying Big IT solutions for their internal needs) will launch more targeted cloud-based offerings as part of expanding their market reach into critical business areas. Expect that some smart team will soon be coming to talk with you about this great new idea that will transform the company, your customers or your community.
Welcome to the world of Big Data.
Richard Villars is Vice President of Storage and IT Executive Strategies with IDC. His postings are his own opinions and may not represent AMD’s positions, strategies or opinions. Links to third party sites, and references to third party trademarks, are provided for convenience and illustrative purposes only. Unless explicitly stated, AMD is not responsible for the contents of such links, and no third party endorsement of AMD or any of its products is implied.

