Channel: Enterprise Software Musings by Holger Mueller

Why is analytics so hard? Or: The holy grail

I have been going over this for a while, and today a Twitter interaction with @DAHowlett, @InFullBloom and @JimHolincheck put me into action - thanks for the jolt!



Let's look at the scope, as I understand it and will use it going forward: I am NOT talking about analytics as a marketing buzzword, but about the place where analysis happens and actions / guidance result from it. Wikipedia pretty much nails it:


From Wikipedia.

So let's start with analysis. The problem with analysis is - it keeps moving, moving and then moving.

Sales data analysis is a good example to illustrate the problem: Today every enterprise user with appropriate access can get total sales. So far so good. But now it gets more complicated - you can slice sales data by product, geography, customer, sales organization, sales channel, etc. And quickly you arrive at a data warehouse situation. Data warehouses are well understood and billions have been spent to create and maintain them. Can they answer all sales analysis questions? To a certain extent, as far as the dimensions of the star schema foresee them. But not when you need to add new dimensions and vast volumes of data, take for example
  • the effect of your Super Bowl ad on sales,
  • the effect of a celebrity embracing your product on Facebook,
  • the effect of a power outage in New Jersey,
  • the effect of flooding in Thailand, etc.
So what technology allows you to store information and then analyse it with questions you did not foresee at the time of storing the information? Right, you know it's MapReduce / Hadoop. A huge part of the problem is solved.
[Sidebar: Ironic that MapReduce / Hadoop were developed to predict sales data - predict a fair price for a display ad from understanding who has been doing similar searches with certain socio-demographic characteristics.]
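To make the "store first, ask later" idea concrete, here is a minimal sketch of the MapReduce pattern in plain Python - not Hadoop itself. The event records, field names and the `map_reduce` helper are all made up for illustration; the point is only that the question (the mapper) is written after the data was stored.

```python
from collections import defaultdict

# Hypothetical raw sales events, stored as-is without a fixed set of dimensions.
EVENTS = [
    {"day": "2012-10-28", "region": "NJ", "amount": 120.0},
    {"day": "2012-10-29", "region": "NJ", "amount": 15.0, "note": "power outage"},
    {"day": "2012-10-29", "region": "NY", "amount": 200.0},
    {"day": "2012-10-30", "region": "NJ", "amount": 10.0, "note": "power outage"},
]


def map_reduce(records, mapper, reducer):
    """Run the mapper over every record, group values by key, then reduce each group."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return {key: reducer(values) for key, values in groups.items()}


def outage_mapper(record):
    """A question nobody planned for at storage time: outage sales vs. normal sales."""
    key = "outage" if "power outage" in record.get("note", "") else "normal"
    yield key, record["amount"]


sales_by_outage = map_reduce(EVENTS, outage_mapper, sum)

# A second, equally unplanned question over the same raw records.
sales_by_region = map_reduce(EVENTS, lambda r: [(r["region"], r["amount"])], sum)
```

In a real cluster the map and reduce steps run in parallel across many machines; the sketch only shows the shape of the programming model.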

It gets trickier with the other aspect of analytics - the action / guidance. To do that you need to create a model that can predict outcomes. And the model often also needs to know which outcomes are favorable.
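A toy sketch of those two ingredients - a model that predicts outcomes, plus a definition of which outcome is favorable - could look like the following. The history numbers, the straight-line model and the profit rule are all assumptions chosen for illustration, not a recommendation of any particular modelling technique.

```python
# Hypothetical history: (ad spend in $k, resulting sales in $k).
HISTORY = [(1.0, 12.0), (2.0, 14.0), (3.0, 16.0), (4.0, 18.0)]


def fit_line(points):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def recommend(points, candidate_spends, favorable):
    """Predict the outcome of each candidate action and pick the one 'favorable' scores highest."""
    intercept, slope = fit_line(points)
    predictions = {spend: intercept + slope * spend for spend in candidate_spends}
    best = max(predictions, key=lambda spend: favorable(spend, predictions[spend]))
    return best, predictions


# 'Favorable' here means highest predicted sales minus spend (an assumption).
best_spend, predicted = recommend(
    HISTORY, [0.0, 2.0, 5.0], lambda spend, sales: sales - spend
)
```

The prediction part and the "what is favorable" part are deliberately separate arguments: swapping the scoring rule changes the guidance without touching the model.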

And now to the holy grail: Yes, you can analyse data, and you can give the end user tools to perform analysis within the confines of a data warehouse. The reality test of course is - will your user use your data warehouse... or Microsoft Excel, as Josh Greenbaum correctly points out in a side conversation:



When you need to open it up to MapReduce / Hadoop, you can't empower the end user anymore - the tools are not there yet - the whole 'big data' industry is still too much in its infancy. We are in the 2nd phase of maturation of the industry, where tools are being created to allow more programmers to get to the problem (see Pig etc.). We are far away from handing them to the end user.

As far as modelling is concerned - we are at an even earlier stage when it comes to applying models. Though models are well understood and modelling techniques do not see too much innovation - the heavy lifting is still left to the 'eggheads' and PhDs. We haven't even seen the 2nd phase of maturation, as we have with MapReduce / Hadoop. Some very brave and smart startups are trying to leapfrog the 2nd phase and move right away to the broad end user enablement of the 4th phase (see e.g. Sparklinglogic).

This is why - IMHO - the enabling of the end user is the 'holy grail' of analytics. Whoever manages to tackle this first will have a first mover advantage and a solid lead and opportunity in a very big market.


[Update1]
Interestingly enough, there were more publications with thoughts and data suggesting that analytics is still in phase 1 / 2.
  • Jeff Bertolucci (@JBertolucci) over at IDC points out that there is a skills shortage, predicted by McKinsey to be 140-190k data analyst experts. He wrote about this earlier, too.
  • And Chris Murphy (@Murph_CJ) points out how enterprises are trying to help themselves - with P&G hosting a conference with other enterprises (McDonald's, Boeing, BP, Disney, FedEx, GE, Goldman Sachs) and networking about the use of analytics.
Appendix 1
End user enabled analytic apps are a very different problem from what @JimHolincheck refers to above with TurboTax. Even though it is very complex, the tax code has a finite number of possibilities for filling out your tax return. And TurboTax does a great job empowering taxpayers to prepare their tax returns. The technique is what @InFullBloom calls interrogatory configuration. I called it - like Oracle - self service setup. Only not applied to tax returns, but to HRMS setup or general business application setup. But we are far away from that today - if we want to use it for analytical class problems and applications.
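As a rough illustration of why a finite set of possibilities lends itself to this technique, here is a tiny question-driven setup sketch. The questions, answers and resulting settings are invented for this example and do not reflect TurboTax or any real HRMS product.

```python
# Hypothetical question tree: each answer leads either to another question
# (a string key) or to a final configuration (a dict).
QUESTIONS = {
    "start": (
        "Do you pay employees in more than one country?",
        {"yes": "currencies", "no": {"payroll": "single-country"}},
    ),
    "currencies": (
        "Do you pay in more than one currency?",
        {
            "yes": {"payroll": "multi-country, multi-currency"},
            "no": {"payroll": "multi-country, single-currency"},
        },
    ),
}


def configure(answers, node="start"):
    """Walk the question tree with the given answers until a configuration is reached."""
    while True:
        _question, branches = QUESTIONS[node]
        outcome = branches[answers[node]]
        if isinstance(outcome, dict):
            return outcome
        node = outcome


config = configure({"start": "yes", "currencies": "no"})
```

This works because every path through the tree can be enumerated up front. Analytical questions like the ones above have no such closed list, which is why the approach does not carry over directly.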

Appendix 2
Here are the 4 phases of innovation I refer to:
  • Phase 1 - Only experts can apply the technique to make the innovation happen.
  • Phase 2 - Through tools more trained professionals in the relevant technology can make the innovation happen.
  • Phase 3 - A business user can - with appropriate, but affordable training - use the innovation.
  • Phase 4 - Any user can use the innovation, with no / minimal training. 
