Archive for the ‘Data Mining’ Category

Analytics Start-Up Series – 2 of Many

I will continue with part 2 of the start up series and focus on the third part of the element wheel – New Product Design (Air)

New Product Design

What? By the time I joined the team, we were still defining what we wanted to do, and what we didn’t want to do. Analytics has come to connote not just data mining and predictive analytics, but even research analytics, product based analytics, dashboards, etc. We wanted to focus more on marketing analytics, and that’s why we were MCoE.
However, the problem/uniqueness of our positioning was that we were focusing on a process driven analytics approach, while 90% of the listeners, with different levels of analytics understanding, had largely thought only of a product driven approach. Even “SAS-skills” were “SAS” skills.

For whom?
Moreover, we needed to decide which verticals we would build our presence in. That’s tricky. Being small and new, you want to prove a point. You are ready for any project that comes your way. However, if you do your first project in healthcare with your key focus being FS, the next time you go to a client, you have a healthcare case study to talk about. If it’s an FS client, you neither want to own the case study, nor disown it. Boom!

Why? The bottom line for a sales guy remains – why would someone buy what you’re selling? Is there an identified need? Is it expressed? Would you need to educate the buyer? In the Indian market, for instance, Fractal Analytics, I think, has done a great job of educating the financial services sector about the need for, and the opportunities in, analytics. There are similar examples elsewhere and in other industries too. Having said that, if we look at the analytics market today, the education is taken care of. The market does have an expressed need for analytics. If we take a closer look, the first industries to adopt analytics were financial services and telecom. And both these sectors loved keeping their data to themselves. They built strong in-house teams. However, things have changed in the last few years as firms have started engaging third party vendors (just as they adopted consulting/consultants as third party unbiased experts with a broader view) for analytics. Today, a lot of other sectors, including Public Sector, Healthcare, etc., have also emerged as buyers of analytics. Marketics, for instance, had more depth in FMCG than FS, given the ex-P&G background of its leadership team.

Where? From a third party vendor point of view, almost everyone wants a piece of the US analytics market. The other markets have been slower to adopt analytics outsourcing. The other truth is the relatively crowded analytics vendor market in the US. Net result, apart from some of the early movers like Fair Isaac, not a lot of vendors have been able to build large scale. However, my MCoE stint taught me that there is a significant opportunity lying in the Asian and European markets as well, if you have the right connections, credibility and content.

Lessons learnt – Identify a market that you understand well, and where you have the credibility to sell. Sell only a bit, and understand it in depth, and avoid trying to be everything to everyone.


Analytics Start-Up Series – 1 of Many

Yesterday, I was talking to one of my friends who wanted to get my perspective on what I learned or did not learn about starting an analytics company (having been a part of multiple start-up environments). That’s when I thought it might be a good idea to pen down my thoughts on this.
Disclaimer: This applies to my understanding of a company which is doing offshore analytics.

I always looked at my learning along four dimensions

This post is going to be the first in a series of posts where I will write about my experiences.

1. Strategic Alignment With Overall Business

I am drawing largely from my first company experience here, which was with a very large Indian IT firm thinking of setting up an analytics practice. When I joined the team, Gayatri Balaji was leading the initiative, and she had Dhiraj Narang helping her. Shivani Sohal and I joined her as part of our third training stint. Apart from Gayatri, the three of us were raw with no analytics background/experience.

Analytics, or Marketing Center of Excellence (MCoE) was a prized initiative of the company at that point. It was an attempt to move up the value chain by doing “intelligent” work (in the Business Intelligence way, not that the company was not doing any intelligent work!).
However, that being said, what our small and inexperienced team (with the exception of Gayatri) soon realized is that it’s one thing to say that we want to do “this”, and another to align it to the overall business.

There are four sets of challenges that we discovered –

A. Existing product portfolio

Context: The company already had a BI practice, a CRM practice, and a lot of analytics was already being done in different relationship pockets. To put it bluntly, every time a relationship needed someone with “SAS skills”, they hired one and put him/her on the relationship. No need to aggregate the “SAS skills” (which is what analytics job postings have reduced the required analytics skill-sets to!). Additionally, tools like SAP and Oracle have their own analytically intelligent layers, and SAP and Oracle are separate practices within the organization. Imagine your plight when you’re talking to a client about analytics and she says – “Well! But that’s pretty similar to what your BI team talked about. They probably had a higher product focus, though!” And you start looking at the account manager, who has probably introduced every practice to the client (to grow the account). It was not our fault that we were the new baby on the block. Interestingly, the leadership had never thought of aggregating the knowledge lying here and there in the firm to have a solid ground from the beginning.

Lesson Learnt:
If you’re going to cannibalize your existing product line, you need to be sure about what you are offering, why you are offering it, and how you will work with the existing product lines.

B. Stakeholder Alignment

In the organizational power play – strategic positioning can mean that you are the weakest (fresh out of the closet) player, or the strongest player (the whole company is looking at you). Usually, you are not stuck in the middle. If you are the strongest, the performance becomes extremely short term oriented. Stakeholders want to see quick wins, proof of concepts and a latent potential (as visible through a practice bursting at the seams!)

a) “Who’s with you?” – We realized pretty soon that very few important players had been sold the concept and its importance to the overall business. The attitude towards the new practice ranged from “This is neat!” to “Oh! So we are wasting money on analytics this time”. To an extent, varying levels of cynicism are expected in large organizations. The problem we faced was cynicism in the minds of decision makers/policy makers.
b) “Who gets the credit?” – Drawing from my other experiences, an analytics team can potentially be in direct conflict/synergy with another practice. For example, an offshore analytics center (analytics outsourcing) model can be a potential threat as well as a support to analytics consulting. Analytics (platform independent) can be a threat to product driven analytics, but can be used to augment the nature of analytics as well.
c) “Who gets the money?” – Given that analytics is a horizontal solution and not specific to an industry, revenue recognition is always a challenge. All the verticals stake a claim to the analytics revenue, while the analytics unit may have a separate revenue target. For instance, a 100MM target for the Financial Services vertical will be achieved through products, services, analytics, implementation, etc. However, the 20MM analytics target will be achieved through a combination of work done across verticals such as Financial Services and Healthcare. Every dollar generated by the analytics team will be claimed by the respective vertical. However, the effort devoted to selling analytics will be lower, because there is no specific FS-Analytics revenue target.

Lesson Learnt: If you’re an outsider roped in to run a new business initiative, make sure that you understand the power play and the relative buy-in. Understand the weak and strong points of everyone you’re compulsorily going to deal with. Equally important is to understand the relative aspirations, which helps you share or transfer credit for the work done in a politically correct manner. Sales is a tricky issue that we will touch upon separately later.

(…to be continued)

More on Sports

It’s surprising that the sports and analytics phenomenon should get so much press coverage suddenly.

I saw an article in Viewpoint, the journal of Marsh & McLennan, which again talks about the maths of sports.

Some of my previous posts on this topic are here – 1, 2 and 3

Analytics and Invasion of Privacy

Lately, I have seen a string of articles on the concerns around Department of Homeland Security’s A.D.V.I.S.E program (Analysis, Dissemination, Visualization, Insight and Semantic Enhancement). People have had concerns around invasion of privacy and DHS acting like a peeping tom.

A similar set of concerns came around the DoD and Microsoft deal for analyzing Electronic Patient Records.

“Dr. Deborah Peel, chairwoman of the Patient Privacy Rights Foundation, views the patient information not as a goldmine ripe for exploitation but as a collection of personal and sensitive health information that needs to be zealously guarded and only accessed with express consent by the patient.”

This blog here raises an interesting point

“data mining by definition compromises the privacy of people represented in that database – if your personal information is included, there is no way to opt out.”

The least that can be said is that the concerns are valid. But having said that, here is what I think of the problem of privacy invasion –

1. What information is private information? When a customer signs up for something like a loyalty card and agrees to give information about their demographic profile, income, tastes & preferences, this is voluntary sharing of information. What they also realize, in addition, is that their purchase behavior can be tracked on the card (that’s how they earn loyalty points which are redeemable against tangible benefits). This kind of information, according to me, is not private information in the strictest sense (for the organization which has collected this data painstakingly).

2. Can private information be kept private? If we don’t associate a piece of information with a named individual in a database, keeping the name or the identifier as a random number, the analysis insights at the end of the day tell me a profile. Individuals with attributes a, b and c are more likely to behave in a certain manner. At this stage, we still don’t know who has attributes a, b or c. This step is critical to upholding user privacy.

3. What could be called privacy intrusion? Even though Google has tried to bring some changes, what it usually does (I am talking about simple examples like ads next to your emails when you are accessing Gmail accounts) can be called fairly intrusive.

4. What can organizations do? As a third party analytics services provider, we must realize that data security standards need to be absolutely non-negotiable. This requires

a) working only with masked data,

b) removing information that helps identify an individual as far as possible,

c) maintaining high security standards while transferring/porting data, and

d) creating an onsite-offshore delivery model where the team works onsite for some time and creates master data tables that alleviate data security concerns.
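
As a rough illustration of the masking idea above (random identifiers instead of names, identifying fields removed, analysis done at the profile level), here is a minimal Python sketch. The field names (`customer_id`, `name`, `email`, `phone`, `segment`, `spend`) and both helper functions are hypothetical, made up for this example; they are not taken from any real project. Also note that masking of this kind reduces exposure but does not by itself guarantee anonymity, since combinations of the remaining attributes can sometimes point back to a person.

```python
import secrets
from collections import defaultdict

# Hypothetical list of directly identifying fields to strip out.
IDENTIFYING_FIELDS = {"name", "email", "phone"}

def mask_records(records, id_field="customer_id"):
    """Replace real identifiers with random tokens and drop identifying fields.

    Returns the masked records plus the id-to-token mapping. In the
    onsite-offshore model, the mapping would stay with the data owner.
    """
    token_for = {}
    masked = []
    for rec in records:
        real_id = rec[id_field]
        if real_id not in token_for:
            token_for[real_id] = secrets.token_hex(8)  # random, not derived from the id
        clean = {k: v for k, v in rec.items()
                 if k not in IDENTIFYING_FIELDS and k != id_field}
        clean["token"] = token_for[real_id]
        masked.append(clean)
    return masked, token_for

def average_spend_by_segment(masked):
    """Profile-level insight: average spend per segment, no individuals named."""
    totals = defaultdict(lambda: [0.0, 0])
    for rec in masked:
        bucket = totals[rec["segment"]]
        bucket[0] += rec["spend"]
        bucket[1] += 1
    return {seg: total / count for seg, (total, count) in totals.items()}

# Made-up sample data, for illustration only.
records = [
    {"customer_id": 1, "name": "A", "email": "a@example.com", "segment": "young", "spend": 100.0},
    {"customer_id": 1, "name": "A", "email": "a@example.com", "segment": "young", "spend": 50.0},
    {"customer_id": 2, "name": "B", "email": "b@example.com", "segment": "senior", "spend": 30.0},
]
masked, mapping = mask_records(records)
profile = average_spend_by_segment(masked)
```

The analyst only ever sees `masked` and `profile`; answering “who” would require the mapping, which can be held back until the group-level strategy is settled.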

5. How big is the problem? Well, as any analytics provider will tell you, the real value of information is not in “who”, it lies in “what”, “how” and “why”! Once an organization has answered these three critical questions, “who” is the final step of the strategic gameplan, and can be answered at a group level, rather than individual level.

Having said this, projects like ADVISE are bound to create a fair bit of skepticism around the way private information will be treated, and the impact of falling into the group of false positives (being identified as a terrorist when you are not one!).

Analytically Sports… Continued!

Fractal Analytics seems to be a step ahead of me! Here is a news item that covered their prediction on the first match of the World Cup (Cricket), between West Indies and Pakistan. Lo and behold, 2 of the predicted scores match exactly! Nobody would have expected that granular a performance prediction to be correct 75% of the time (as per their claims).

Going back to some of my earlier posts on use of analytics in sports – at Diamond Analytics Blog and here itself, what Fractal has already managed to do is a proof of concept.

The factors that I don’t see them looking at are the location/playground/weather/batting order/bowling order, etc., which do have a big impact on performances.

> Under overcast conditions, the chances of Indian batsmen holing out to the wicketkeeper go up significantly.
> The chances of genuine swing bowlers running through the side on grassy pitches are high.
> On flat tracks, against minnows, on subcontinent kinda pitches, batsmen have a field day.

These are examples of hypotheses that can be tested using data.
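
To make “tested using data” a little more concrete, here is a minimal sketch of how the first hypothesis could be checked with a two-proportion z-test, written in plain Python. The counts are invented purely for illustration (they are not real cricket statistics), and the function name is my own; a real analysis would also need to control for factors such as opposition, venue and batting order.

```python
import math

def two_proportion_z(success_a, n_a, success_b, n_b):
    """Return (z, two-sided p-value) for H0: the two proportions are equal."""
    p_a = success_a / n_a
    p_b = success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value under the standard normal, via the complementary error function.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts: dismissals caught behind, out of all dismissals.
overcast_caught, overcast_total = 30, 100
clear_caught, clear_total = 20, 100

z, p_value = two_proportion_z(overcast_caught, overcast_total,
                              clear_caught, clear_total)
print(f"z = {z:.2f}, p = {p_value:.3f}")
```

A small p-value would suggest the difference in dismissal rates is unlikely to be pure chance; with counts this small, even a 10-point gap may not clear the usual 5% bar, which is itself a useful reminder of how much data such hypotheses need.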

It would be interesting to see how teams can use a model like this to decide team composition, play batting orders, etc.!

Complex Data to Complex Knowledge

Dell Zhang quoted the challenging problems in Data Mining research [ICDM ‘05’].

It will be interesting to touch upon each of these problems in greater detail. However, for now, the most interesting bit is 4. Mining Complex Knowledge from Complex Data. That is what defines the heart of data mining for me.

Mining – From its origins in the extraction of minerals, mining has traditionally implied the extraction of extremely valuable stuff from the earth. Wikipedia says any material that cannot be grown from agricultural processes, or created artificially in a laboratory or factory, is usually mined. What is implicit here is the application of intelligence for achieving this feat.

What organizations are increasingly finding difficult to do is to revisit the (apparently) already mined data and come up with new strategies. And when we say already mined, most organizations find it difficult to let go of the semi-cooked analysis that might have been done to meet immediate requirements of marketing executives breathing down the neck of analytics departments.

Complex – A complex is a whole that comprehends a number of parts, especially one with interconnected or mutually related parts. [Wikipedia]

For most of the organizations today, integrating parts of information to see the bigger picture is the new challenge. Today, strategies are not being formed at department level and there is a higher need for departments to come together for an integrated strategy. A perfect example would be the need for IT, Marketing, Customer Services and Products team to work together for an end-to-end customer offering.

Data to Knowledge is the heart of analytics and there can be a host of tools used for traversing the distance.

Like every problem solving exercise, Data Mining and Analytics is an extremely structured exercise involving a series of rigorous steps

  1. Business Understanding – involving setting the context and defining the problem to be solved.
  2. Data Understanding – which involves getting a sense of the data that is available, that can be made available, and that needs to be available for solving the problem
  3. Data Preparation – One of the most important and rigorous steps of an analytics project, this involves bringing various data elements together and creating a data story. Understanding linkages between various data sources, their integrations and disintegrations, tying them with the problem objective to create new variables, vintage of data, changing shape and design of data capture at the enterprise level are all seemingly tedious but life-saving checkpoints!
  4. Modeling/Segmentation/Solutioning – This is the point where the wheat is separated from the chaff. Having got your data together, can you use the appropriate statistical and analytical techniques such as cluster analysis, regression, neural networks, et al. to solve the problem at hand? The solutions here range from a simple reporting dashboard to complex algorithms that are not easy to explain.
  5. Validation & Deployment – A true romantic movie is never over unless all the things have fallen into place. We need to be able to establish beyond doubt that the results are accurate. Predictive modeling projects have been known to use advanced validation techniques such as coefficient blasting, in and out of time validation, sensitivity analysis, bootstrapping, etc. Deployment faces a different set of challenges in being able to replicate the solution on a production server for ongoing maintenance and reporting.
  6. The key stakeholder buy-in – This is a step that everyone overlooks as part of the analytics lifecycle. However, this step has nothing much to do with analytics apart from making sure that the first 5 steps are correct to the last dot and cross and are well documented for everyone’s reference.
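
Two of the validation techniques named in step 5, in and out of time validation and bootstrapping, can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the `month` and `correct` field names, the cutoff and the sample rows are all hypothetical, and the bootstrap shown is the simple percentile variant.

```python
import random

def split_in_out_of_time(rows, cutoff, time_key="month"):
    """Rows up to the cutoff are used to build the model; later rows test it."""
    in_time = [r for r in rows if r[time_key] <= cutoff]
    out_of_time = [r for r in rows if r[time_key] > cutoff]
    return in_time, out_of_time

def bootstrap_interval(values, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the mean of `values`."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(values)
    means = sorted(sum(rng.choices(values, k=n)) / n
                   for _ in range(n_resamples))
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper

# Made-up scored records: 1 means the model's prediction was correct.
rows = [{"month": m, "correct": c}
        for m, c in [(1, 1), (2, 1), (3, 0), (4, 1), (5, 1),
                     (6, 0), (7, 1), (8, 1), (9, 0), (10, 1)]]

in_time, out_of_time = split_in_out_of_time(rows, cutoff=6)
hits = [r["correct"] for r in out_of_time]
accuracy = sum(hits) / len(hits)
lower, upper = bootstrap_interval(hits)
```

The out-of-time accuracy says how the model holds up on a period it never saw, and the bootstrap interval gives a sense of how much that number could move with a different sample.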

That’s where the sermon of Rabbi Amit gets over.

Crime Analysis Blog

Here… I want to link it to my RFID post .. Crime Analysis using RFIDs!!

Thanks Sandro for pointing me to this…
