AI is Arriving, and Most Companies Are Not Ready: Why Bad Data is an Existential Risk
For years, experts have pointed to Artificial Intelligence (AI) as the next frontier and the likely driver of the Fourth Industrial Revolution. Companies like IBM and Microsoft spend millions on advertising campaigns promoting how AI is changing everything. There is understandable excitement about the potential impacts, positive and negative, that AI will have on every industry. The biggest concern for supply-chain and manufacturing companies, however, should be whether they are ready to take advantage of AI at all.
Artificial Intelligence is an umbrella term for a range of ways that computers can complete human tasks. AI includes something as simple as a series of if-then statements that follow pre-set rules, such as a bot that helps you fill out your taxes the way an accountant would. One subset of AI that generates a lot of buzz is Machine Learning (ML), which describes a computer’s ability not just to perform human tasks, but to learn and improve as it gains ‘experience’ in the form of data. Machine Learning is the aspect of AI with the greatest potential to disrupt industries, but understanding how it works exposes the hidden challenge that manufacturers and other businesses will face in adopting it.
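The distinction can be sketched in a few lines of Python. This is a toy illustration, not production code, and the fraud-flagging scenario and all names in it are hypothetical: a rule-based system applies a fixed, hand-written threshold, while a learning system derives its threshold from example data.

```python
# Toy illustration: rule-based "AI" versus a system that learns from data.

def rule_based_flag(invoice_amount):
    """Pre-set rule: flag any invoice over a hard-coded limit."""
    return invoice_amount > 10_000

def learn_threshold(history):
    """'Learn' a flagging threshold from labeled history:
    pairs of (amount, was_actually_fraudulent)."""
    flagged = [amt for amt, bad in history if bad]
    clean = [amt for amt, bad in history if not bad]
    # Place the threshold midway between the largest clean amount
    # and the smallest fraudulent one seen in the data.
    return (max(clean) + min(flagged)) / 2

history = [(500, False), (2_000, False), (9_000, False),
           (12_000, True), (15_000, True)]
threshold = learn_threshold(history)  # 10500.0 for this data

def learned_flag(invoice_amount):
    return invoice_amount > threshold
```

The rule-based version never changes; the learned version improves (or degrades) with whatever history it is given, which is exactly where data quality enters the picture.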
Like the if-then example, Machine Learning is essentially formulaic in nature, but it is more complex and relies on access to a large amount of reliable data. These algorithms must be designed with goals in mind, and the only way to train them to the point of usefulness is to feed them large data sets to practice and learn from. Therein lies the threat that businesses should be focusing on. If a company is operating with decades of poor-quality data (as many are, thanks to software evolution, human error, changing data practices, mergers and acquisitions, and more), then Machine Learning algorithms won’t be able to help it. In fact, feeding an ML algorithm bad data will produce inaccurate insights, which can damage the business.
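A toy sketch makes the garbage-in-garbage-out problem concrete. The scenario and names below are hypothetical: the same simple learning procedure, fed records with a single mislabeled entry, produces a rule that misses real fraud.

```python
# Toy garbage-in-garbage-out demo: one learning procedure, two data sets.

def learn_threshold(history):
    """Learn a flag threshold from (amount, was_fraudulent) pairs."""
    flagged = [amt for amt, bad in history if bad]
    clean = [amt for amt, bad in history if not bad]
    return (max(clean) + min(flagged)) / 2

good_data = [(500, False), (9_000, False), (12_000, True), (15_000, True)]
# Same records, but a data-entry error marked the 12,000 fraud as clean.
bad_data = [(500, False), (9_000, False), (12_000, False), (15_000, True)]

print(learn_threshold(good_data))  # 10500.0 -> catches the 12,000 case
print(learn_threshold(bad_data))   # 13500.0 -> the 12,000 fraud now slips through
```

The algorithm did exactly what it was told; only the data changed. Scaled up to decades of inconsistent enterprise records, the same effect quietly corrupts every insight the model produces.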
Bad Data Causes Artificial Un-intelligence
Data quality problems have already derailed ML projects in several high-profile cases. Amazon’s secret recruiting bot quickly developed a bias against women because it had been trained on ten years of applicants’ resumes, the majority of which came from men. It learned to downgrade resumes that included the word “women’s,” as in “women’s association of software engineers.” This demonstrates how much depends on the data sets used to train ML algorithms. If an ML algorithm is trained on data created using substandard data practices, it may learn to mimic those practices, reducing the return on investment.
What is concerning is the failure of executives and upper management to recognize and confront this challenge. A recent Forrester survey of CIOs found that just 17% cited the lack of “a well-curated collection of data to train an AI system” as their biggest challenge, although 63% identified it as one of their serious organizational challenges. Even companies that maintain consistent, high-quality data typically have it siloed in different enterprise applications. If a company hasn’t created connections between financials, procurement, supply chain, maintenance, and other systems, it isn’t ready to fully deploy AI.
Cleansing bad data isn’t easy either, because in most cases it hides among good data. There isn’t always a clear way to identify inaccuracies in a massive data set, yet even a small amount of bad data can skew an algorithm. Data cleansing is expensive, and it’s easy to understand why companies have put it off: without a clear goal there is little immediate ROI, and for the most part companies can continue to operate with some bad data in their systems. However, as more mature AI software hits the marketplace, competitors that have invested in data cleansing and good data practices will be poised to accelerate with ML technology, and for those that haven’t, it will be too late to catch up.
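A small sketch shows why bad data is hard to spot row by row. The record layout and field names below are hypothetical: each row looks valid on its own, and only a cross-record check exposes the conflict.

```python
from collections import defaultdict

# Sketch: find part records that share a part number but disagree on the
# unit of measure. Individually, each row passes validation; the
# inconsistency only appears when records are compared against each other.
records = [
    {"part": "BRG-1001", "uom": "EA",  "price": 14.50},
    {"part": "BRG-1001", "uom": "BOX", "price": 14.50},  # conflicting unit
    {"part": "SEAL-204", "uom": "EA",  "price": 2.10},
]

units_by_part = defaultdict(set)
for rec in records:
    units_by_part[rec["part"]].add(rec["uom"])

conflicts = sorted(p for p, units in units_by_part.items() if len(units) > 1)
print(conflicts)  # ['BRG-1001']
```

Real cleansing has to run checks like this across millions of records and dozens of attributes, which is why it is expensive and why automation matters.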
What is Hamiltonian doing to help?
Hamiltonian Systems is focused on deploying our software and expertise to improve data and establish proper data processes. With Kasei, we automate and augment data processes to improve employee efficiency, reduce human error, and maintain high-quality data across all applications. Kasei serves as a digital bridge, managing and verifying data centrally and then migrating it bi-directionally. It makes data cleansing easier by automating most of the manual work, identifying the pieces that require human action, and employing master data management to ensure both speed and accuracy of review. Kasei helps companies prepare for the arrival of machine learning by making sure the data it will need is accurate, accessible, and organized.
Any company adopting AI will need a multi-pronged approach that addresses employee skills and practices, new technology and infrastructure, ethics and social impact, and business strategy. Before any of that is possible, companies need to focus on improving their data, or risk being left behind by competitors that do.