The (March) Madness of Big Data

April 8, 2015

The NCAA Tournament is over, and things are a bit touchy around here these days, since my boss is a huge Wisconsin fan. (Sorry about your Badgers, Matt…)

When the Duke Blue Devils won their historic fifth NCAA Men’s Division I Basketball Championship, they concluded a tournament that is legendary for its unpredictability; hence its nickname: “March Madness.”

Since the event expanded to 64 teams in 1985 (four more teams were added for preliminary play-in games in 2011), only once have all four regional No. 1 seeds advanced to the Final Four. This year was only the fifth time in history that at least three No. 1 seeds have wound up playing the final weekend. First-round upsets and unforeseen deep runs happen every year.

Last year, the “Sage of Omaha” Warren Buffett tacitly acknowledged the near impossibility of predicting the outcome of the event by offering $1 billion to anyone who could correctly guess the outcome of all 63 non-play-in games – the elusive “perfect bracket.” All contestants were eliminated from his pool on the tournament’s second day. Connecticut, an unheralded seventh seed, went on to win the 2014 title over eighth-seed Kentucky.
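To put the scale of that challenge in numbers, here is a back-of-the-envelope sketch in Python. It assumes every game is an independent pick, which is a simplification, and the 70 percent per-game accuracy in the second calculation is a purely illustrative figure rather than a measured one.

```python
from fractions import Fraction

GAMES = 63  # non-play-in games in a 64-team single-elimination bracket


def perfect_bracket_odds(per_game_accuracy: Fraction, games: int = GAMES) -> Fraction:
    """Probability of picking every game right, assuming independent games."""
    return per_game_accuracy ** games


# A picker who flips a coin for every game.
coin_flip = perfect_bracket_odds(Fraction(1, 2))
print(f"Coin-flip picker: 1 in {coin_flip.denominator:,}")

# Hypothetical picker who gets 70% of games right (illustrative assumption only).
skilled = perfect_bracket_odds(Fraction(7, 10))
print(f"70% picker: about 1 in {round(1 / skilled):,}")
```

With pure coin flips, the odds come out to 1 in 2^63, or roughly 1 in 9.2 quintillion. Even a much better picker still faces very long odds under this simple model.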

Given its popularity among sports fans and its difficulty for prognosticators, the NCAA Tournament – and amateur and professional sporting events as a whole – is a prime opportunity for big data forecasting. It is not alone in this regard, as analytics are also reshaping the insurance, healthcare, and consumer goods industries. Let’s look at four ways in which big data is changing these established fields:

1) Bringing big data to bracket forecasting
While actual game outcomes remain a tough nut to crack, data analytics have been successfully used to predict which schools get at-large bids on Selection Sunday, the weekend before the tournament starts. More than half of the event’s slots go to teams that did not win their conference tournaments but were deemed worthy of entry based on their season achievements.

A predictive model called the “Dance Card” formula was developed by several university professors and is powered by analytics from SAS. It correctly predicted all 37 at-large spots in 2013 and was 93.5 percent accurate over the decade before that.

Work continues on extending tools like this one to predict game results and eventual tournament winners, helping fans with their competitive bracket pools. With the NCAA Tournament in particular, many data points must be accounted for and data center resources have to be put to work.

“Predicting the World Cup, NFL season, and now the NCAA tournament, requires the incorporation of player and team stats, tournament trends, game outcomes, location of contests, league trends through multiple seasons and data from Web and social channels,” Bing’s Walter Sun told SportTechie this year.

2) Can big data help with product recalls?
When a product is recalled, the biggest challenge for manufacturers and retailers is often making consumers aware that a recall is under way in the first place. People can easily miss the press releases or the fine print in the grocery store.

Fortunately, there seems to be a solution to this problem. Consumers already supply retailers with a lot of information via loyalty cards. Businesses such as Costco actually require this data, in the form of memberships, and use their extensive databases to make recalls much easier to execute.

Data points such as email addresses, physical addresses, and phone numbers enable Costco to reach its customers quickly. The number of bounced emails and text messages also lets the merchant gauge the efficacy of its recall notifications.
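As a rough illustration of how bounce counts can serve as a reach metric, here is a minimal Python sketch. The RecallCampaign class, its field names, and all of the numbers are hypothetical and are not drawn from Costco or any real recall.

```python
from dataclasses import dataclass


@dataclass
class RecallCampaign:
    """Hypothetical tally for one channel of a recall notification."""
    channel: str
    sent: int
    bounced: int

    def delivery_rate(self) -> float:
        """Share of messages that did not bounce (0.0 if nothing was sent)."""
        if self.sent == 0:
            return 0.0
        return (self.sent - self.bounced) / self.sent


# Illustrative numbers only, not real recall data.
campaigns = [
    RecallCampaign("email", sent=10_000, bounced=400),
    RecallCampaign("sms", sent=8_000, bounced=1_000),
]

for c in campaigns:
    print(f"{c.channel}: {c.delivery_rate():.1%} delivered")
```

A message that did not bounce was delivered, not necessarily read, so a figure like this is best treated as an upper bound on how many customers actually saw the notice.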

Expect big data to become as interwoven with product recalls as it is with sporting event prognostication. Models such as Costco’s and the automobile industry’s (all cars are registered) show how databases can be put to work in making consumers aware of critical product defects.

3) Understanding diseases through genomic data
Genomics is still a relatively new field, but it is already reshaping how healthcare professionals understand the causes and potential treatments of many conditions. For example, by combing through vast amounts of genomic data it may be possible to see what role genetics plays in predisposition to diseases. Our client, Counsyl, is doing just that. They are completely disrupting the traditional genetic testing model through big data.

Moreover, big data methodologies have been useful in identifying medical patients with genetic mutations that did not result in commonly associated medical conditions. Studying these individuals could be instrumental in improving treatments for patients who are symptomatic.

With the surge in health-related mobile apps, fitness trackers, and wearable devices like the Apple Watch, the growing amount of health data in circulation could also be harnessed to improve treatments. Healthcare today is biased toward acute care (i.e., addressing problems once they arise), but having a greater amount of data to work with could create a basis for more proactive regimens.

4) Insurance industry playing catch-up on big data initiatives
Insurers have many opportunities to use data to adjust their premiums. Right now, however, the industry is not that deeply invested in initiatives such as telematics, which could take the form of integrated boxes in cars that reward drivers for driving cautiously.

In the years ahead, more insurance companies may experiment with big data methodologies to dynamically price their premiums. In addition to the “black boxes” used for auto insurance, data from wearables could be used in health insurance pricing, and Internet-enabled kitchen appliances could factor into home insurance plans.
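To make the idea of dynamic pricing concrete, here is a minimal Python sketch of a telematics-based adjustment. Everything in it is an assumption made for illustration: the adjusted_premium function, the 0-to-1 safety score, the 20 percent caps, and the use of a surcharge as well as a discount. It does not describe any real insurer's pricing model.

```python
def adjusted_premium(base_premium: float, safety_score: float,
                     max_discount: float = 0.20,
                     max_surcharge: float = 0.20) -> float:
    """Scale a base premium by a driving safety score in [0, 1].

    A score of 0.5 leaves the premium unchanged, 1.0 earns the full
    discount, and 0.0 incurs the full surcharge. All parameters are
    hypothetical and not drawn from any real insurer's pricing.
    """
    if not 0.0 <= safety_score <= 1.0:
        raise ValueError("safety_score must be between 0 and 1")
    if safety_score >= 0.5:
        # Linear discount as the score rises from 0.5 to 1.0.
        factor = 1.0 - max_discount * (safety_score - 0.5) / 0.5
    else:
        # Linear surcharge as the score falls from 0.5 to 0.0.
        factor = 1.0 + max_surcharge * (0.5 - safety_score) / 0.5
    return round(base_premium * factor, 2)


print(adjusted_premium(1000.0, 1.0))  # cautious driver
print(adjusted_premium(1000.0, 0.5))  # average driver
print(adjusted_premium(1000.0, 0.0))  # risky driver
```

The same shape could in principle apply to wearable or appliance data: a device-derived score feeds a bounded adjustment on top of a conventionally priced base premium.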

Overall, look for insurers to explore new business models as the Internet of Things comes into focus. Having more data sources to draw upon could reshape how these companies handle premiums and plans.

In summary, big data is affecting our lives, businesses, and IT strategies in many ways, from basketball to genetic testing to insurance.

At Digital Realty, we support the diverse IT workloads, including big data, of a wide array of industries. How might big data affect your IT strategy, now or in the future?

Let’s talk about it! Drop me a line via our Contact Us page, or ping me on Twitter.

Andrew Schaap, SVP of U.S. Sales (@andrewschaap1)
