Artificial intelligence (AI) was supposed to be the panacea for a huge range of business ills, but it doesn’t always produce positive, actionable outcomes. Why is this? More and more organizations currently use AI, machine learning (ML), or data analysis in their operations, yet the business value from these implementations simply isn’t being realised.
There’s a range of reasons why AI is flailing rather than surging ahead. People struggle to understand what AI really does, and its limitations. Often these misunderstandings are media-driven, with sensationalist or overly simplified reporting. However, these data science myths are also trotted out by many people who claim to know what they’re talking about.
8 Myths About AI
1. AI will magically fix all your problems
AI promises great things: increasing revenue, decreasing costs, identifying fraud before it happens, and taking away all your repetitive and monotonous work. But organizations that go into AI and ML with grand dreams will often find the reality falls short.
Adopting AI should be a gradual, incremental process. Organizations should start with projects such as automating business processes or improving customer satisfaction. Over time, as capabilities and understanding of AI grow, it can be used to tackle the big financial challenges.
Remember the Pareto Principle too: 80% of the outcomes come from 20% of the inputs. There’s no need to delve into granular details and problems that aren’t going to yield measurable results. They simply bog the system, and your staff, down.
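As a rough illustration of the Pareto idea, the sketch below ranks a handful of process problems by impact and picks the few that account for most of it. The problem names, the hours, and the helper `vital_few` are all hypothetical, invented purely for this example.

```python
# Hypothetical hours lost per month to each process problem (made-up numbers).
hours_lost = {
    "manual invoice entry": 400,
    "duplicate customer records": 250,
    "late stock reports": 150,
    "password resets": 80,
    "printer queue": 40,
    "meeting room booking": 30,
    "expense approvals": 25,
    "other": 25,
}

def vital_few(impact_by_item, target_share=0.8):
    """Return the highest-impact items that together reach the target share."""
    total = sum(impact_by_item.values())
    ranked = sorted(impact_by_item.items(), key=lambda pair: pair[1], reverse=True)
    selected = []
    running = 0
    for item, impact in ranked:
        if running / total >= target_share:
            break  # the items already selected cover the target share
        selected.append(item)
        running += impact
    return selected

priorities = vital_few(hours_lost)
print(priorities)
```

In this made-up data, three of the eight problems account for 80% of the lost hours, so those are the ones worth pointing a first AI project at.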
2. Machine learning is about ‘thinking like a human’
Humans are complex creatures, and our brains are incredibly complicated. We use heuristics all the time, ‘rules of thumb’ that we’ve learned with years of experience. We learn stereotypes that allow us to make snap judgements but aren’t necessarily correct. We don’t want computers to think like humans because humans have faulty thinking.
The truth is that machine learning is about making predictions from data. If that data is of poor quality, the results won’t be objective.
Garbage In = Garbage Out
Machine learning simply learns the biases in the data and the assumptions the team makes. Why does this matter? Bias in algorithms, data and the team can result in measurable losses to the business.
For example, banks use AI to make decisions about who to lend money to. Who is a risk? Who will be more likely to pay back their loan on time? We know that much data is biased, so machine learning based on that data will be flawed too.
Historically, men were the mortgage holders. Women applying for loans were routinely turned down for reasons that had nothing to do with their financial means or ability to pay back the bank. If AI were to look at that data, it wouldn’t say ‘Oh, I see the banking system was historically patriarchal’; it would say ‘women are turned down for loans more often, therefore women’s mortgage applications should be rejected’. Remember the Apple Credit Card fiasco, in which a husband was given a line of credit 20x higher than his wife’s?
Women are actually a lower credit risk than men, on every measure. Women pay loans on time, and default less. Therefore, if a bank implements ML that is flawed, it is not only lending money to higher-risk men, but also missing out on the income that low-risk women would provide.
The other side of this is that the law says you can’t discriminate on the basis of gender; it’s illegal to consider gender when determining creditworthiness. But gender-blind credit lending can still discriminate against women.
Machine learning is about learning from data, which is often flawed, biased, and far from objective. Most of the time the bias is not overt; it hides in subtleties and proxies in the training data set.
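One way to make this concrete is to compare outcomes across groups in the historical data before any model is trained on it. The sketch below is illustrative only: the records, the field names `group` and `approved`, and the helper `approval_rates` are all hypothetical, and a real audit would also need to hunt for proxies of the protected attribute.

```python
from collections import defaultdict

# Hypothetical historical loan decisions; every value is made up.
applications = [
    {"group": "men", "approved": True},
    {"group": "men", "approved": True},
    {"group": "men", "approved": True},
    {"group": "men", "approved": False},
    {"group": "women", "approved": True},
    {"group": "women", "approved": False},
    {"group": "women", "approved": False},
    {"group": "women", "approved": False},
]

def approval_rates(records):
    """Return the share of approved applications for each group."""
    totals = defaultdict(int)
    approved = defaultdict(int)
    for record in records:
        totals[record["group"]] += 1
        if record["approved"]:
            approved[record["group"]] += 1
    return {group: approved[group] / totals[group] for group in totals}

rates = approval_rates(applications)
# Ratio of the lowest approval rate to the highest. A value well below 1.0
# suggests the data records very different outcomes for different groups.
disparity = min(rates.values()) / max(rates.values())
print(rates, round(disparity, 2))
```

A model trained on data like this would have every reason to reproduce the gap, which is why the check belongs before training rather than after deployment.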
3. AI is plug and play
With all the SaaS programmes and big promises made by software companies, you’d be forgiven for thinking that AI is easy. Just put the data in, and the machines will whiz through the information and spit out what you want to know. No coding knowledge needed!
But even if staff understand the programme, there is so much work that needs to happen first.
Data cleansing: There’s data, and there’s good data. There’s no point putting in huge amounts of data if it’s wrong or incomplete, if the sample is too small, or if it records the wrong information altogether.
Understanding the outcome of the task: When a business or client says they want a drill, a good data scientist knows they actually want a hole. A good data scientist also knows whether the data can deliver that outcome.
Domain knowledge: The reality of data science is that the industry is lacking talent; there simply aren’t enough quality data analysts and scientists. This shortage of trained and experienced staff is hampering AI’s effectiveness and uptake in the market. Organizations don’t have (or can’t find) appropriately skilled data scientists as staff, and so outsource to third-party providers. Relying on external vendors is only a short-term fix; domain knowledge is vital to produce accurate results.
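The data cleansing step in particular lends itself to a quick automated check before anything is fed to a model. Here is a minimal sketch, assuming records arrive as plain dictionaries. The field names, the plausible age range of 0 to 120, the `min_rows` threshold, and the helper `quality_report` are all assumptions made for illustration.

```python
# Hypothetical customer records; None marks a missing value.
records = [
    {"customer_id": 1, "age": 34, "monthly_spend": 120.0},
    {"customer_id": 2, "age": None, "monthly_spend": 80.0},
    {"customer_id": 3, "age": 212, "monthly_spend": 45.5},  # implausible age
    {"customer_id": 3, "age": 212, "monthly_spend": 45.5},  # exact duplicate
    {"customer_id": 4, "age": 51, "monthly_spend": None},
]

def quality_report(rows, min_rows=1000):
    """Count common data problems before any modelling starts.

    min_rows is an assumed cut-off for "too small a sample".
    """
    seen = set()
    report = {
        "rows": len(rows),
        "duplicates": 0,
        "rows_with_missing": 0,
        "implausible_age": 0,
    }
    for row in rows:
        key = (row["customer_id"], row["age"], row["monthly_spend"])
        if key in seen:
            report["duplicates"] += 1
            continue  # do not count a duplicate's problems a second time
        seen.add(key)
        if any(value is None for value in row.values()):
            report["rows_with_missing"] += 1
        age = row["age"]
        if age is not None and not 0 <= age <= 120:
            report["implausible_age"] += 1
    report["too_small"] = len(rows) < min_rows
    return report

report = quality_report(records)
print(report)
```

None of this replaces domain knowledge: someone still has to decide what counts as implausible and how many rows are enough.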
4. Machine learning predicts the future
This is true, if the future is exactly the same as the past. ML trains on historical data, and makes predictions on the theory that exactly the same thing will happen again.
There’s more to ML than just making predictions, though. You can use it to create business insights and simplify processes, to add new products or features, as well as for forecasting. If you don’t use ML to change how your business makes decisions, what’s the point?
5. Predictions automatically get better over time
ML uses different algorithms, called models, to create its predictions. The minute you put a model into production, it starts degrading. This is because data can change, the environment can change, and people change, while the model stays the same. This is why models need to be retrained, or replaced with new models that are a better fit.
This degradation is often due to data drift. This is when whatever the model is trying to predict is changed by variables nobody accounted for. For instance, if you’re predicting sales in a physical store, other variables need to be taken into account, such as the weather, what holidays are coming up, and what your competitors are doing.
An example of concept drift is when a skin cancer diagnostic system misses skin cancers because of ignored variables. The machine knows to look for raised edges, irregular shapes, and changes over time, which alert the clinician to the suspected cancer. However, if the machine does not consider the color of the skin (due to sun exposure or race), there will be false negatives.
Poor generalization, often caused by covariate shift, is another problem that plagues models. If the data used to train a model came from one population, perhaps a wealthy Western country, then the model overfits to that group. Its predictions for other groups and unseen data will not be accurate because it doesn’t generalize well.
Measures must be taken to prevent model degradation. ML performance must be monitored after the model is deployed. If the model degrades, either restructure it or try another, better-fitting model. It might need new features or changed parameters. This is called continuous learning: if predictions are to stay accurate, they need to be checked and adjusted.
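Monitoring doesn’t have to start out sophisticated. The sketch below uses a deliberately simple check, a standardized mean shift, rather than a full statistical drift test: it measures how far the average of a feature in production has moved from its training average, in units of the training standard deviation. The sales figures, the `ALERT_THRESHOLD` value, and the helper `mean_shift` are all assumptions for illustration.

```python
from statistics import mean, stdev

def mean_shift(training_values, production_values):
    """Distance between production and training means, in training standard deviations."""
    baseline_mean = mean(training_values)
    baseline_std = stdev(training_values)
    return abs(mean(production_values) - baseline_mean) / baseline_std

# Hypothetical daily store sales the model was trained on...
training_sales = [100, 102, 98, 101, 99, 103, 97]
# ...and what it sees after a competitor opens nearby.
production_sales = [80, 82, 79, 81, 78, 83, 77]

ALERT_THRESHOLD = 2.0  # assumed cut-off; tune it for your own data

shift = mean_shift(training_sales, production_sales)
needs_review = shift > ALERT_THRESHOLD
print(round(shift, 2), needs_review)
```

When a check like this fires, it only says the inputs have moved. Deciding whether to retrain, add features, or swap the model still takes a person who knows the business.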
6. Machine learning is about delivering higher accuracy
Accuracy is good, but on its own it doesn’t tell you how well a model performs for the business. A model that has 51% accuracy could predict lottery numbers correctly, and you win ten million dollars. A model with 99% accuracy could give a false negative when predicting a fraudulent loan application that results in huge losses.
ML works on probabilities, not certainties.
Just as models need constant reassessment, results need to be checked for precision. What is the ratio of false negatives to false positives? What’s the business cost of these errors? How much potential revenue did you lose? Is the system not discriminating enough, so that you’re overloading your sales team with too many leads, or are they twiddling their thumbs because the system is too fussy and rejects too many leads?
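To see how accuracy and business value can pull apart, consider a fraud model that never flags anything. The sketch below is a toy example: the labels, the two cost figures, and the helper `confusion_counts` are hypothetical, chosen only to show the arithmetic.

```python
def confusion_counts(actual, predicted):
    """Count true/false positives and negatives for binary labels (1 = fraud)."""
    tp = fp = tn = fn = 0
    for truth, guess in zip(actual, predicted):
        if truth and guess:
            tp += 1
        elif guess:
            fp += 1
        elif truth:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn

# Hypothetical loan applications: 98 legitimate, 2 fraudulent.
actual = [0] * 98 + [1] * 2
# A model that approves everything and never flags fraud.
predicted = [0] * 100

tp, fp, tn, fn = confusion_counts(actual, predicted)
accuracy = (tp + tn) / len(actual)
recall = tp / (tp + fn) if (tp + fn) else 0.0

# Assumed business costs per error; replace with your own figures.
COST_FALSE_NEGATIVE = 50_000  # a fraudulent loan that slipped through
COST_FALSE_POSITIVE = 100     # a good customer sent for manual review
total_cost = fn * COST_FALSE_NEGATIVE + fp * COST_FALSE_POSITIVE

print(accuracy, recall, total_cost)
```

On these made-up numbers the model is 98% accurate, catches none of the fraud, and costs 100,000 in missed cases, which is the gap the questions above are meant to expose.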
7. AI and ML are replacing people
Yes, and the sky is falling, Chicken Little. Every time there’s a big, threatening change, people panic that jobs will be lost. This creates resistance to adoption of AI, as people dig their heels in to resist job insecurity. One study showed that 38% of people expect that technology will eliminate jobs at their workplace in the next three years. It’s predicted that by 2030, there will be up to 20 million jobs lost to robots in the manufacturing sector. Those are some scary numbers.
The truth is that AI and ML are augmenting people.
They are taking boring, repetitive tasks, and allowing people to get on with creative, unpredictable, more complex tasks. AI should be working hand in hand with humans to make positive changes in the workplace.
We can look back at the industrial revolution to see what the future holds for the AI revolution. This significant overhaul of almost everything about working in the 18th and 19th centuries did not cause long-term, widespread job loss and suffering. People always find new jobs (often after a period of painful adjustment), and fear of mass unemployment is ill-founded.
While AI will cause job losses, it’s expected any losses will be offset by new jobs created in the stronger, wealthier economy.
Automation and AI will change jobs and lives; that is inarguable. But for the most part, those changes will be positive.
8. The more data the better for machine learning
Garbage in, garbage out, again. If you’re feeding the machine irrelevant information, or data that is wrong or hasn’t been cleansed, the results are going to reflect that. Data scientists say that about 50% of their role is cleansing data, and there’s a reason for that.
Even the cleverest machine can’t create insights from faulty data.
The benefits of data science could be massive
This one isn’t a myth; the business outcomes from data science, when done well, could be all those things that were promised. Faster, better, stronger, the superman of organizations.
But to use AI on an organizational level there needs to be a broader understanding of what it can do, and where it’s not going to be useful. Otherwise, it’s simply another of those 80-something percent of data science projects that never get off the ground. To get ROI on your data science investment, be realistic about what AI can do, and apply it judiciously to well-defined projects with great data. While that doesn’t sound as enticing (and easy) as plug-and-play-and-predict-the-future, it’s a far more successful strategy to get the results AI can deliver.
Read This Next
If you’re interested in more, read Shatter the Seven Myths of Machine Learning.