The Buzz: Data analysts are going to replace domain experts
The Anti-Buzz: Data analysts are going to help experts ask the right questions
I hit you with the provocative title “The Death of the Expert” because that is precisely the title I was faced with earlier this week. I don’t normally go into too many details about my own career, but this week the annual Knowledge Discovery and Data Mining (KDD) conference was held in Chicago and I was in attendance. On Tuesday they hit us with the industry panels, and they saved their largest room for “The Death of the Expert.” How could you not attend something so outrageous? Especially when there was a chance that the outrageous was also true?
I will spoil the surprise for you as quickly as it was spoiled for all of us: The expert is not dead, not even close. Like all good panels, they would argue over several points, but on that point they were unanimous. (The panelists were clearly not consulted when the title was being chosen.) On the contrary, they were quick to point out the narrowness of any model they themselves had ever built, and that making good use of those models required an expert to interpret and explain the results.
Still, this is as good an occasion as any to discuss with you the tension, both real and imagined, that might exist between domain experts like you and the data analysts who are ostensibly trying to steal your job.
What data mining does better
We could list several things here. The obvious benefits are speed and a lack of tedious mistakes (a computer doesn’t get tired after staring at a table of numbers all day). These are effectively the benefits you always get from computing, though. No, the real benefit you get from data mining is a lack of bias; or more accurately: an opportunity to eliminate some bias.
For some context: a company called Kaggle has facilitated finding solutions to large problems posed by large companies by essentially just throwing them to the masses and letting the best techniques rise to the top. The winners invariably produce something which outperforms the best expert analysis to date; hence “the death of the expert”. The leap is achieved largely by these new models being “naive” enough to not make the “expert” assumptions that had been quietly coloring the results in the past.
In other words, data analysts work “with the blinders off” and are willing to consider things established experts are not. Everything comes with the added bonus of being backed up by good statistics.
Why experts are still needed
Distilling it as much as I can, experts do two very important things that data analysts would struggle to do. The first is that they understand what problems need to be solved. The analyst might set new records answering an important question, but the expert is the one who knew what question to ask in the first place. The second is that they can explain results. Good predictive models are, well, good at predicting, but not so articulate about why a prediction makes sense, nor very sensitive to the nuances of what that prediction means.
An example given by the panel was an off-shore oil rig, where an analyst discovered something new: a certain statistic could accurately predict the future failure of systems on the rig. It was the expert who looked at this result and had the eureka moment, understanding how one thing could lead to another. In the sea of data, the expert would not have pulled this insight out on their own, but only they could make sense of it. (By the way, explanation is an area of research interest in data mining, and a largely acknowledged weakness.)
To draw up a fictional example that might hit closer to home: You’re the expert. You have a patient. You observe symptoms, share them with an analysis tool, and get back global statistics on what diagnosis was made upon observing these symptoms. What do you do with this information? Let’s say 70% of the world diagnoses symptoms X as condition A and 30% diagnose it as condition B. Does this mean 30% of the world is wrong? Or does it mean that 30% of people with symptoms X have condition B? Do you accept the majority diagnosis? Thankfully you’re an expert and can provide expert answers to these questions.
The data is a tool. Perhaps you always diagnose condition B when you see symptoms X and never even considered condition A. In this case the data is providing you with an opportunity to eliminate bias. Maybe you had considered both conditions and checking the data is a mere formality – you’re sure it’s condition B and you’re the expert.
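To make the ambiguity in that 70/30 split concrete, here is a minimal sketch. It is purely illustrative: the function name `diagnosis_shares`, the idea of a single "accuracy" number for clinicians, and the two scenarios are my own assumptions, not anything a real analysis tool reports. It shows that two very different worlds can produce exactly the same global statistics.

```python
def diagnosis_shares(prevalence_a, accuracy):
    """Expected share of A and B diagnoses among patients with symptoms X.

    prevalence_a: fraction of those patients who truly have condition A
                  (hypothetical; the rest are assumed to have condition B).
    accuracy:     probability that a clinician names the true condition
                  (hypothetical, and assumed equal for A and B).
    """
    prevalence_b = 1.0 - prevalence_a
    # A diagnosis of A comes from a correct call on a true A
    # or an incorrect call on a true B.
    share_a = prevalence_a * accuracy + prevalence_b * (1.0 - accuracy)
    return share_a, 1.0 - share_a


# World 1: every patient has A, but clinicians are right only 70% of the time.
world_all_a = diagnosis_shares(prevalence_a=1.0, accuracy=0.7)

# World 2: clinicians are always right, but 30% of patients truly have B.
world_mixed = diagnosis_shares(prevalence_a=0.7, accuracy=1.0)

for name, (share_a, share_b) in [("all A, fallible clinicians", world_all_a),
                                 ("mixed patients, perfect clinicians", world_mixed)]:
    print(f"{name}: A diagnosed {share_a:.0%}, B diagnosed {share_b:.0%}")
```

Both worlds report the same 70% and 30%, so the numbers alone cannot say which one you are living in. Telling them apart takes knowledge from outside the table, which is exactly what the expert brings.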
One thing that was emphasized at the panel was the need for both sides to trust each other. In situations where the analyst is a person, this is easier to manage, as the analyst and the expert can forge a professional relationship. In your case, you are more likely to see analysts take the form of software, and trusting software is another subject entirely.
The question is, do you see these new technologies as a threat, or as a tool? They will change your job, but they will not replace your human intelligence. Your practice management software streamlined many of your receptionist’s bookkeeping duties, but it did not replace your receptionist; it simply allowed them to focus on other, less automatable tasks. All the same, some experts in some domains will choose to feel threatened.
I summarized the panel to a colleague who did not attend it. He made the slightly chilling observation that data analysts are replacing experts after all; it is just that they are only replacing the bad ones. If there is a threat, this is where it is. Data mining sets the minimum bar a little higher; if you are not more useful than a computer geek with a few statistics courses under their belt, then you are no expert at all.