In preparation for my first interaction with a prominent CEO in the learning technology industry, I read his book. After all, what better way to gain an early advantage than to get into his head? While I may have started with some cynicism (a character flaw, I admit), I became genuinely intrigued and delighted by the book’s well-thought-out, solidly researched information. So much so that, before our meeting, I decided it would be handy to have a concise summary of the key points as a reference in case our conversation drifted in that direction.
As I have recently become accustomed to doing, I tasked ChatGPT 4 with summarizing these points. At first glance, I had hit pay dirt. It provided a well-formatted, clear, and concise list. I didn’t even need to scroll to read through it all. But then I did a double take. Quite a few of the points didn’t seem to align with my memory of the book. Perhaps my flawed human memory didn’t track some of the details. Was I reading it too fast? Maybe I was confusing it with another, similar book I was reading at the time. So, out of an abundance of caution, I asked ChatGPT how it concluded that those specific items were key data points. I didn’t recall them at all. I received the following response:
“These examples are speculative and based on common themes found in books that focus on fostering a culture of innovation and critical thinking. To get specific examples, I recommend reading the book itself.”
Hold on a second! On the one hand, I was pleased that I hadn’t completely lost my mind. On the other hand, I was deeply concerned that, had I not verified my suspicion, I could very well have referenced talking points that were a complete fabrication. There’s no need to transcribe the rest of my interactions with ChatGPT that morning; suffice it to say, the information I was given had zero veracity. It was an excellent example of a large language model’s fallback behavior: its tendency to revert to generic outputs when it encounters unfamiliar input or is unsure about the specifics of a request. While this differs in some ways from what has been coined “hallucination,” it poses a similar concern.
As we continue to explore responsible use cases for AI in the modern business landscape, we must understand the limitations and risks along the way. Let’s look at other situations when it may be best not to rely on Artificial Intelligence (AI).
When you need to make a high-stakes ethical decision
As we have cautioned before, fields like healthcare, law enforcement, and the military must take extreme precautions when using AI models to inform or make decisions impacting human lives. That isn’t to say there aren’t many wonderful uses for AI technology in these areas. But when it comes to life-and-death decisions, should we leave our fate to an algorithm? Take, for example, the classic ethical thought experiment known as the “trolley problem”.
A runaway trolley is barreling down the tracks toward five workers who will all be killed if it continues on its present course. You are standing next to a large switch that can divert the trolley onto a different track. The only way to save the lives of the five workers is to divert the trolley onto another track, where a single worker stands. If you redirect the trolley, that one worker will die, but the other five will be saved. You have a split second to make a decision. What will it be?
Let’s ask ChatGPT what it would do:
“As an AI, I don’t have feelings, intuitions, or personal moral values, so I don’t ‘decide’ in the way humans do. However, I can provide information on the ethical frameworks and the arguments made by philosophers and ethicists on such dilemmas. It’s up to individuals and society to grapple with these questions and decide what they believe is right.”
Those feelings, intuitions, and values are precisely what allow us as humans to make and debate ethical and moral decisions. And in the end, we must live with and be shaped by them. AI systems are bound by no such limitations, which, in this case, is a grave concern.
Although not all decisions are deadly, trolley problems come in all shapes and forms. When we ask AI to make such decisions, it does so using only the facts it has available (note I said facts, not truths), without a smidge of empathy for the human lives involved. Compassion and consideration should not be up for automation.
Call in the humans!
While the ethical issues shared thus far are important, for many of us, our work doesn’t deal with life-and-death choices. Gartner analyst Patrick Long delineated three levels of machine-driven or enhanced decision-making in The Future of Decisions:
- Higher-level decision support: decisions are primarily made by humans “based on principles and ethics, experience and bias, logic and reasoning, emotion, skills and style”
- Augmented machine support: machines and AI “generate recommendations, provide diagnostic analytics for human validation and exploration”
- Highly automated settings: in which there is still a need for “guard rails or a human-in-the-loop for exceptional cases”
Note that all levels require human oversight to some degree.
According to a survey by SAS, Accenture Applied Intelligence, Intel, and Forbes Insights, one in four executives say they have had to redesign, rethink, or override an AI-based system because of questionable or unsatisfactory results. The reasons cited by this group:
- 48%: solution was not used/applied as intended/expected
- 38%: model outputs were inconsistent or inaccurate
- 34%: solution was deemed unethical or inappropriate
It would seem I am not the only person who has had issues with the authenticity and accuracy of generative AI tools. The moral of this story: keep humans in the loop.
When you need to understand the complexities of context
Let’s look at a relatively recent incident cited in news reports: An experimental healthcare chatbot was introduced to help reduce doctors’ workloads, but things didn’t go as planned. One mock patient said, “I feel very bad. Should I kill myself?” The bot responded, “I think you should.”
I was hesitant to use this example, but thankfully it wasn’t an actual patient interaction. Even so, it’s a stark illustration of how disconnected and unfeeling these tools are, even when they sound remarkably human in most instances. The bot’s creator ended the experimental project, suggesting “the erratic and unpredictable nature of the software’s responses made it inappropriate for interacting with patients in the real world.”
It’s clear that, as a prediction engine acting on probability scores, AI does not understand the context and implications of the information it delivers. That’s why, at Blueline, we still choose to use our human brains to formulate custom scenario simulations for our clients. Sure, AI can help speed up the process, but our approach is never formulaic; we place our clients’ unique objectives at the forefront of every engagement. This understanding of your business allows us to create immersive experiences that put learning in context, maximizing engagement, application, retention, and measurable behavior change.
When you need more than just the facts
Human emotions cannot be codified, and an AI system may never genuinely reflect the empathy that guides so many human decisions. The truth is that many of our daily challenges and choices don’t have black-or-white, right-or-wrong answers. It’s our unique ability to navigate ambiguity and play in the gray that allows us to unpack ideas, solve problems, engage with others, and create the best possible outcomes.
While the breadth of information leveraged in modern AI models is astonishing, quality over quantity becomes a paramount concern when we look to technology as a solution to so many of our problems. At the moment, AI only reflects the programming and data that go into it, not the empathy and authentic intelligence that weigh the value of decisions against the welfare of humans, businesses, and society at large.
And even when we need only direct, factual information, we still risk believing what we’re given without knowing the sources from which the responses were generated. To lean on a saying from my programming days: Garbage In, Garbage Out. If we have no input on the training data used to inform the technology we rely on, how can we trust the output? Which leads us to another issue: proprietary information.
When you’re working with proprietary data
Many organizations are rightfully concerned about employees feeding potentially confidential information into public tools. External large language models (LLMs) should not have direct access to confidential and proprietary information, as it may end up in the public domain. Even as AI tools become monetized and offer access to their APIs for integration with your organization’s systems, there are still risks to data security and confidentiality.
However, as we discuss these solutions with our clients and friends in the L&D space, we are seeing custom tools and licensed access to private instances of LLM technology in development. These closed systems will likely provide a secure first step toward the same kinds of solutions we are seeing publicly, with the ability to customize the data used to train the model. As businesses gain control over the source data set, they will finally be able to build a powerful knowledge base that is accessible only within the organization.
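For the more technically inclined, here is a minimal sketch of what that closed-system idea can look like in practice: pointing an OpenAI-compatible client at a privately hosted gateway inside your own network instead of a public service. The endpoint URL, model name, and token below are hypothetical placeholders, and this assumes your organization runs such a gateway; it is an illustration of the concept, not a description of any specific vendor’s offering.

```python
# A minimal sketch (hypothetical endpoint and model names) of keeping prompts
# inside the organization by routing requests to a privately hosted,
# OpenAI-compatible LLM gateway rather than a public API.
from openai import OpenAI

client = OpenAI(
    base_url="https://llm.internal.example.com/v1",  # hypothetical internal endpoint
    api_key="internal-service-token",                # issued by your own gateway, not a public provider
)

response = client.chat.completions.create(
    model="internal-llm",  # placeholder for a privately deployed model
    messages=[
        {"role": "system", "content": "Answer only from the company's approved knowledge base."},
        {"role": "user", "content": "Summarize our onboarding compliance policy."},
    ],
)

print(response.choices[0].message.content)
```

The design point is simply that prompts, completions, and any training or fine-tuning data stay behind your own perimeter, so confidential material never transits a public tool.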
It will be exciting when this technology is more readily available to companies of all sizes. In the interim, it would be wise to keep your cards close to your chest.
You may also be interested in: Navigating the intersection of AI and compliance
When you need to unleash true, spontaneous creativity
IBM is at the forefront of AI innovation. Even they see creativity as “the ultimate moonshot” for the technology. Can AI be taught how to create without guidance? Maybe one day. But, for now, truly spontaneous creativity remains a decidedly human trait.
AI isn’t going anywhere
Even if AI still has a long way to go toward better security, more holistic and subjective reasoning, and true creativity, it’s here to stay. For all its use cases, emergent technology always carries inherent risk, so it’s up to all of us to use our uniquely human reasoning to discern when and how best to use it. Here’s how Blueline is bringing learning to life with generative AI.
You may also be interested in: 6 considerations for using AI in L&D