Discover Performance: HP Software's community for IT leaders // November 2013
The analytics payoff of getting your data to talk to you
HP Autonomy VP Brian Weiss talks about the next generation of analytics—and how it’s improving bottom lines right now.
Brian Weiss hears it over and over from execs grappling with the exponential growth of business data: “I know there’s value in there, but how do I get it out?”
Weiss has answers. As leader of the Autonomy domain expert group, he focuses on advanced analytics strategies that help enterprises draw out insights that have gone untapped in their various data repositories.
“The goal of any enterprise today is to get smart enough to use new analytics technology in ways that pay off,” he says, “to optimize the IT infrastructure so that it connects data into new intelligence sources and is responsive enough to innovate with what it discovers.”
Discover Performance recently talked with the Autonomy vice president for our article on “connected intelligence.” Here, the conversation touches on emerging approaches to data analytics, and what a CIO should be thinking about when choosing an analytics engine.
Q: All this talk about applying advanced analytics to make sense of huge volumes of data—what should IT leaders be looking for?
Brian Weiss: It used to be that when we talked about dealing with data volumes, it was all about storage: “How do I store all this cheaply? It’s a pain that I can’t delete or find anything.” Today, it’s more about: “I’ve got all this user-created data, and I know there’s great value in there, but how do I get it out?”
The challenge is that much of this information is unstructured, so one way is to hire a slew of people to sift through, read, and analyze everything. But of course that doesn’t work. You simply can’t do that, given the volume and variety of data constantly coming in.
Computers with powerful and fast analytics can now do this for us, inferring patterns and seeing nuances in unstructured data automatically, the way a human would. That lets us understand information in context, and it lets the machine help us sift through reams of noise to answer questions we might not have thought to ask. And people are now beginning to realize that this is not only possible, it’s happening. They’re seeing real examples. Let’s say you want to look at what customers are saying on YouTube, Twitter, Facebook, and in survey responses, and then correlate that with telemetry from cars. You can get insight from stuff like this in real time, and you can act on it in time to make a difference.
Q: Tell us what you mean by understanding data in context.
BW: OK, let’s look at an example. Say I’m the lead attorney defending an engineering firm in a major lawsuit, and during e-discovery I need to understand how the word “bridge” was used in millions of files. I can determine the precise context in which it was used and get the system to apply that contextual meaning the way I do. The word “bridge” can, of course, refer to a physical structure, a card game, a dental fixture, a musical passage, and so on. Simple keyword search gives me all of those uses and more. But since I am looking for relevance (a human construct), I really need the machine to zero in on the context that applies to my legal research. I can do this with conceptual search technology. The computer can automatically discern the different meanings of that term and surface information pertaining to the bridge that fell down, rather than the witty banter about Sunday’s card game.
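To make the “bridge” example concrete, here is a minimal Python sketch of the difference between plain keyword matching and ranking the same hits against a description of the context you care about. It is only an illustration under simplifying assumptions: it uses bag-of-words cosine similarity as a stand-in, not the method any particular product (including HP IDOL) uses, and the sample documents, the context string, and the function names are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical sample documents: three different senses of "bridge".
documents = [
    "The bridge collapsed after the steel girders failed the load inspection.",
    "We played bridge on Sunday and my partner kept bidding badly.",
    "The dentist fitted a bridge to replace the missing tooth.",
]

# Hypothetical description of the context the attorney cares about.
context = "structural engineering steel girders load inspection collapsed span"


def keyword_search(term, docs):
    """Plain keyword match: returns every document containing the term."""
    return [d for d in docs if term in tokenize(d)]


def rank_by_context(term, context_text, docs):
    """Rank keyword hits by word overlap with the context description."""
    ctx = Counter(tokenize(context_text))
    hits = keyword_search(term, docs)
    scored = [(cosine(Counter(tokenize(d)), ctx), d) for d in hits]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    print(len(keyword_search("bridge", documents)), "keyword hits")
    for score, doc in rank_by_context("bridge", context, documents):
        print(f"{score:.2f}  {doc}")
```

The keyword search returns all three documents, while the context ranking puts the structural-engineering document first and scores the card game and the dental fixture at zero. A real conceptual search engine would need far more than shared vocabulary, but the sketch shows why relevance depends on the surrounding words rather than on the term alone.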
Q: We all hear a lot about not only the volume of data that large enterprises struggle with today, but also the incredible variety of types and sources. How much of a challenge is this?
BW: Human information, or unstructured data, is growing at a phenomenal rate, but to most computers it’s fundamentally just noise. Most systems simply don’t know how to deal with things like email text and attachments, audio, tweets, video, and so on. We take for granted the complexity of ideas, slang, sentiment, and the fact that our brains sort out new relationships and patterns in a very fluid way.
Without a powerful analytics engine that can understand all types of data, and adapt as the meaning changes, your only choice is to throw more people at the problem. That’s the rub.
Q: So what are the implications of all this for how a business operates day-to-day? How does the IT leader need to help change internal thinking and processes?
BW: You should be thinking about all the places in your environment where you believe the only path to information insight is to throw more people at the problem to get the answers you need. In that e-discovery example, you might need a staff of hundreds of junior lawyers and analysts to sift through data and analyze text in context, looking at millions of source documents—SharePoint files, emails, memos, audio, etc. You’d say, “Read this doc and tell me if it is relevant.” Weeks later, you might have a few valuable insights to work with. And a huge bill.
But now, a human information analytics engine like HP IDOL can read and summarize what it finds in millions of documents before you ask any question at all, and do this very, very fast. You don’t need to guess at the terms that might get you there, because the data is now talking to you. You have the ability to tell the system to look for the sorts of things you need, in context, and have it explain what it means the way a human would see it.
Q: That’s great. Can you give us a few more examples?
BW: Sure. Consider that we now have the ability to train computers to help categorize data as humans might, based on context and concepts. Take retail marketing. You can now use technology to quickly and accurately segment customers according to what they are talking about, reading, and saying they prefer by analyzing the content—in addition to the action of buying the book or clicking the “like” button.
In a hospital, you can look beyond the structured metadata of a checkbox diagnosis form, and bring in an analysis of the notes that doctors transcribe after a patient is diagnosed, or the way patients express themselves in response to caregivers. Patterns emerge from this sort of machine-assisted insight that can be very valuable to a practitioner.
Or, as a movie studio, you can intelligently analyze Twitter feeds, blogs, email, and surveys to get immediate feedback on how viewers react to your movie preview and marketing efforts. You can then make adjustments right up until the film’s release, or even fine-tune the marketing in near real time while the film is still in theaters.
Q: For organizations struggling with huge volumes of data, where do you start?
BW: Often it makes sense to start small, by connecting and analyzing your current data sets—and then move on to live data. This has proven to be a good way to dip your toe into automated analytics on your information. You can find out what insights you might derive, while training the pattern analytics to target the precise information clusters you care about most in the content. The idea is to train the computer to find and categorize noisy data the same way that you or I would if we were given the task of doing it manually.
Everyone has at least one electronic junk drawer! So data cleanup is a great place to start extracting value from your content with this sort of automation. You’re tidying up data—for risk and cost reduction, among other things. But you’re also enriching your understanding of what you end up keeping, and enabling the business to see the value that a machine-assisted process can bring to the table.
For more on formulating the strategy to draw actionable intelligence from your data, download our free ebook, “Connecting data for new business insight” (reg. req’d), and visit HP.com/haven.