Generative AI and the Future of Problem Solving
Exploring the implications of the generative AI revolution on problem-solving disciplines
TL;DR: What are the implications of the new developments in AI for problem-solving disciplines like design and management consulting?
- Current generative AI can substantially speed up research, analysis, ideation and prototyping, and many practitioners are already using it every day
- We are witnessing the rise of a host of specialized players that claim they can complement or replace humans in many problem-solving tasks
- While the current landscape is one of piecemeal enhancement of problem-solving tasks, new technologies present an opportunity for a more radical redefinition of the problem-solving process.
Meanwhile, from a market perspective, it’s hard to see how a rapidly commodifying technology such as LLMs can provide a moat to any application- or service-level company. Competitive advantage likely has to come from existing customer base, network effects or proprietary data access.
But to what extent can we say that current AIs are capable of problem solving? What would a true problem-solving AI look like?
I argue that the breakthrough will come from taking an agent-based approach. An artificial problem-solving agent that can work on the kind of problems designers face in their day to day needs to be capable of taking very specific input in terms of problem statement; of reasoning inductively, deductively and analogically; as well as having some form of actuator to deploy solutions in the real world. It also needs to be able to concatenate sensing, reasoning and action depending on the situation, as described for example in the Cynefin framework.
Current LLMs are “stochastic parrots”: while not actually reasoning, they can simulate these types of reasoning. This, however, may not matter for most practical purposes. In projects where an 80–20 approach to accuracy suffices, LLMs will do a good job.
This is why companies should shift from piecemeal replacement of problem-solving tasks by AI to rethinking completely how they structure the problem-solving process, and even which problems they try to solve.
What a time to be alive.
Generative AI and its value
Across markets and sectors, generative AI is changing things. McKinsey estimates the potential value created by generative AI at 2.6–4.4 trillion USD per year, in the ballpark of the UK’s GDP.
How is this disruption taking place?
I think there are three key levers of value that generative AI is creating:
- Increasing efficiency and speed — in every task that requires generating text or an image based on an input, e.g. writing code
- Enabling more variations (combination and customization) — e.g. in R&D, with things like protein folding, and in marketing, where messages can be tailored to the individual user.
- Enabling expert ubiquity — Advice from a consultant, physician, psychologist, etc. becomes available anywhere any time
Some use cases are more intuitive than others. While increased efficiency in writing copy and creating images is pretty obvious, it’s less clear just how far generative AI will enable us to produce and deploy more product variations, and what exact form expert ubiquity could take. These question marks are particularly salient when it comes to problem solving.
The McKinsey study quoted above sees problem-solving disciplines, exemplified by strategy, as less exposed to disruption. I think this is due to an underestimation of the second and third levers. As we’ll see, things for problem solvers could change quite a bit.
AI in problem solving
Like everyone else, I’m wondering about how much the latest developments in AI will change my profession — which can be thought of as strategy consulting, design research, product management, depending on the day. I typically summarize this category of professions by simply calling them “problem solving”.
All of these professions function in a similar way:
- Framing the problem: you start with a problem statement, which you often have to reframe or play around with. Sometimes this means answering a straightforward question, but more often than not it is a messy type of problem that cannot simply be solved by following a procedure. Solving it will require some experimentation.
- Modeling the problem: You break down the problem into manageable components, while being aware of the relationship between components — i.e., you pick a framework
- Collecting data — through desk research, interviews, surveys or data analysis. You then classify the information, clarifying the relationship between the data you have and the framework you are applying, or if necessary adjusting the framework.
- Ideating a solution
- Prototyping the solution
- Testing the solution and iterating
The focus for some of the professions above is more on steps 1–4, but it’s important to see these in context. The fact that AI will change the way we can prototype and test solutions might affect the way it makes sense to ideate them.
And of course, as everyone in this type of job will know, there is much more to it than the above — things like stakeholder management, getting to consensus, getting management to take a decision, presenting information, etc. These are absolutely essential parts of the job, but it’s arguably not what you get paid for. No consulting firm sells its presentation skills: it’s “Kano basic” — it goes without saying. But even in these “soft” aspects of problem solving, generative AI might bring some change.
So what are the kind of solutions that we are currently seeing on the market, and how are they changing problem-solving?
Current AI capabilities
Here is a continuously updated list of solutions currently on the market or coming soon that are tackling different stages of problem-solving. The products mentioned below are to be taken as examples, not as endorsements.
Additionally, here is a set of experiments I’ve run on different stages of the problem solving process with GPT4.
So what does the situation look like as of today, from an end to end process perspective?
Here are a few observations:
- GPT4-level AI can take care of parts of problem solving in a way that is comparable to a junior-level professional.
- We have plenty of tools to do most individual steps of the process, mostly powered by ChatGPT, though some are little more than demos.
- Many of the tools use AI in a limited capacity, i.e. to understand natural language input. Once that is done, an action is selected out of a set of hard-coded possibilities (see the sketch after this list). In many cases, this is the only way to get stable, deterministic, debuggable output with the current technology.
- The most crowded space seems to be the one of “user insights” extracted from user feedback. This is unsurprising, as summarizing short text and responding to natural language queries is one of the most intuitive use cases for LLMs.
- Very few tools are trying to integrate the whole process end to end; the ones that do mainly do so from a PM angle. None of the tools can in any meaningful way be thought of as a problem-solving agent. More on this soon.
- Possibly the most original application of AI to the research process is synthetic users. Whether the approach is methodologically sound, however, is highly debated.
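As an aside, the "limited capacity" pattern mentioned in the list above is worth spelling out, because it explains why many of these tools feel so deterministic. Below is a minimal sketch of the idea; the intent labels, handler functions and the llm callable are all invented for illustration.

```python
from typing import Callable

# Sketch of the common "LLM for understanding, hard-coded logic for acting" pattern:
# the model only maps free-form input to one of a few known intents, and everything
# downstream is ordinary, deterministic, debuggable code. All names are illustrative.

ALLOWED_INTENTS = ("summarize_feedback", "cluster_themes", "draft_survey")

def classify_intent(request: str, llm: Callable[[str], str]) -> str:
    """Ask the LLM to pick one intent label; fall back to a default if it strays."""
    prompt = (
        "Classify the request into exactly one of these intents: "
        + ", ".join(ALLOWED_INTENTS)
        + f"\nRequest: {request}\nIntent:"
    )
    answer = llm(prompt).strip()
    return answer if answer in ALLOWED_INTENTS else "summarize_feedback"

# Hard-coded branches stand in for whatever the tool actually does once the intent is known.
HANDLERS = {
    "summarize_feedback": lambda: "summary of user feedback (placeholder)",
    "cluster_themes": lambda: "clustered themes (placeholder)",
    "draft_survey": lambda: "draft survey questions (placeholder)",
}

def handle(request: str, llm: Callable[[str], str]) -> str:
    return HANDLERS[classify_intent(request, llm)]()
```

The LLM's open-endedness is deliberately boxed in: only the classification step is probabilistic, so the rest of the product behaves like conventional software.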
So it looks like what we have is piecemeal solutions to enhance human work in most parts of the problem solving process. Out of the three levers of value created by generative AI that we mentioned before, most solutions seem to work on increasing the problem solver’s speed and efficiency.
I think more could be done. For example:
1. Increasing efficiency and speed
- Automating data collection, clustering etc.
- Cutting down time to project delivery and personnel cost
2. Enabling more variations
- Eliminating path dependence on problem formulations and frameworks adopted early on in the process — every possible variation of an idea can be reasoned through to its ultimate consequences, or even be brought to market
- Generating more options during the idea generation phase: we all know that in brainstorming, quantity comes before quality. Now the quantity of ideas can increase exponentially.
- Enabling the prototyping and testing of an arbitrary number of variations
3. Enabling expert ubiquity
- Facilitating data collection with synthetic subjects or experts
- Disseminating problem-solving expertise and making it on-demand
These are just some of the potential uses that come to mind.
A world in which generative AI’s three levers of value are fully taken advantage of is one in which problem solving looks completely different.
- In terms of process: as we eliminate the need for convergence, our double diamond will start to look like a cone, projecting infinite variations of possible solutions from a single problem statement.
- In terms of scope: AI’s low-marginal cost ubiquity will enable us to apply the problem solving method to a host of new problems large and small. Anyone will be able to deploy an AI to any ultra-specific problem, on-demand.
- In terms of problem-solving as a profession: let’s face it, the human input required will be less than it is today. This is not to say that there are necessarily going to be fewer jobs, but things are definitely going to change.
But what would it take to go one step further? What would a full, end-to-end problem-solving AI look like?
Problem solving agents
Imagine being tasked by a bank with building a solution that helps their clients save more money. Instead of developing a solution yourself, you’ll soon be able to unleash an AI on it. This means that:
- Instead of you having to do the actual work, you’d have to think very carefully about how to set up the problem, with the right constraints and context. Once that is done, the machine would do the rest.
- Instead of you having to collect and analyze the data, the AI would derive insight about the target market and population for you. It would choose a few relevant frameworks, it would read reports, conduct interviews with users as needed, and then try to fit the data into the frameworks, modifying them appropriately, and selecting the subset that best models the most important aspects of the problem.
- Instead of deciding on one solution, the AI would ideate dozens of solution ideas — from acquiring other companies to developing new apps to changing the way in-store service is provided. You could choose to look at them and select a few interesting ones, but the machine could also prune out the less relevant ones based on its utility function, or even go through with all possible solutions, test them all, perhaps with synthetic users, and derive its conclusions.
- The output could contain organizational elements, operational ones, as well as products and services. Many of these could be deployed automatically. For example, tasks could be automatically set up in an ERP, sketches could be drawn, developed and deployed online, ad campaigns could be set up to spread the word. The optimal sequence in which these actions take place — within the budget and time constraints set by the human user — would itself be generated based on best practices and feedback from previous actions.
- Additionally, deploying an arbitrary number of variations would be possible. Instead of deploying a single app, for example, the AI would deploy hundreds of different variations, based on the needs of different types of user.
So how could we make this magic happen?
Let’s think of AI as an agent. An agent senses itself and its environment through sensors, and acts upon it through actuators based on some measure of performance, such as utility maximization.
In order to use a problem-solving agent, we would feed it a problem statement, setting a target state, a goal or a utility function. The agent would then go out and explore the environment, building a model of the problem based on pre-loaded or learned frameworks.
It would then generate options for action, and implement some of them based on its utility function and preset constraints.
Finally, as its actions shape the environment, the agent could collect feedback and learn based on the extent to which the results matched its performance targets.
The agent could be fully autonomous as described above, or we could have humans in the loop in multiple places to provide additional input and direction, and perhaps to give some kind of positive reinforcement to the agent based on qualitative measures of performance (“you got to the wrong solution, but the reasoning was interesting”).
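To make this less abstract, here is a minimal sketch of such an agent loop in Python. It is a toy under heavy assumptions: the class names, the prompts, the flat utility scores and the llm callable are stand-ins, not a description of any real system.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

# A toy problem-solving agent loop: sense -> model -> generate options -> (review) -> act.
# Everything here is illustrative scaffolding, not a real framework.

@dataclass
class ProblemStatement:
    goal: str                                   # target state or utility to maximize
    context: str                                # environment, audience, known facts
    constraints: List[str] = field(default_factory=list)

@dataclass
class Option:
    description: str
    expected_utility: float

class ProblemSolvingAgent:
    def __init__(self, llm: Callable[[str], str],
                 human_review: Optional[Callable[[List[Option]], List[Option]]] = None):
        self.llm = llm
        self.human_review = human_review        # optional human in the loop

    def sense(self, problem: ProblemStatement) -> str:
        # Collect data about the environment (desk research, interviews, APIs...).
        return self.llm(f"Summarize what we know about: {problem.context}")

    def build_model(self, problem: ProblemStatement, observations: str) -> str:
        # Fit the observations into one or more frameworks.
        return self.llm(f"Goal: {problem.goal}\nObservations:\n{observations}\n"
                        "Propose a framework that models this problem.")

    def generate_options(self, problem_model: str, n: int = 5) -> List[Option]:
        ideas = self.llm(f"Using this model:\n{problem_model}\nList {n} candidate solutions.")
        # Toy scoring: in a real agent this is where the utility function would live.
        return [Option(line.strip(), expected_utility=0.5)
                for line in ideas.splitlines() if line.strip()]

    def act(self, option: Option) -> str:
        # Actuators (deploy code, launch a test, call a business API) would go here.
        return f"Deployed: {option.description}"

    def solve(self, problem: ProblemStatement) -> str:
        observations = self.sense(problem)
        problem_model = self.build_model(problem, observations)
        options = self.generate_options(problem_model)
        if self.human_review:
            options = self.human_review(options)    # humans prune or redirect
        best = max(options, key=lambda o: o.expected_utility)
        return self.act(best)
```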
Sounds simple, right?
Of course, it isn’t. There are four critical parts to implementing this model, “the monkeys” of the problem, that are worth zooming into:
- Formulating the problem statement
- Processing information at multiple levels of analysis (i.e. reasoning)
- Identifying the type of solution called for by the problem
- Prototyping and implementing solutions
Let’s examine these, then explore to what extent current technology can address them.
1. Formulating the problem statement
Some may think of prompt engineering as a critical issue. As Oguz Akar eloquently pointed out on HBR, we should rather think about the more general problem of formulating a good problem statement.
Formulating the problem statement in a way that truly captures what we need to solve for, the utility function and the boundaries of the environment, would still mostly be up to a human. Today, this is perhaps the most critical part of problem solving, and this would be the case also for an AI agent. I explored the issue in depth here in the context of human-centered problem solving.
Akar talks about problem diagnosis, decomposition, reframing and constraint design as the key components of a problem statement formulation. I bundle these all together into the challenge mapping exercise, in which we use spatial dimensions to decompose problems, diagnose specific issues or constrain the solution space (top-down) or reframe the problem (bottom up).
An example of reframing from real life: a client may formulate a problem statement as “improving a product experience”. When appropriately questioned, we may find out that the actual end goal for the client is to increase customer satisfaction and loyalty, and the best way to do it is not by manipulating the product, but the experience around it.
A great example of diagnosis mentioned by Akar: apparently, the key to finding a solution to the Exxon Valdez oil spill was identifying the oil’s viscosity in cold water as the specific issue to be solved for.
In practice, rather than leaving problem formulation up to prompt arbitrariness, we should probably think about setting up the agent in one of two ways (a minimal sketch follows this list):
- With some kind of template that forces us to tell the machine what it needs to be told — context, level of analysis, known constraints to the problem space, etc. — an approach shown to some extent, for example, by Board of Innovation’s strategy advice bot.
- Alternatively, the agent could help us run a challenge mapping exercise by asking us questions about the problem as we set it up. This is feasible, for example, with a tool like GPT Engineer.
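For the first option, the template could be as simple as a structured brief that the agent refuses to act on until the human has filled in the fields that matter; the validation questions also hint at the second, more conversational option. A hedged sketch, with all field names invented for illustration:

```python
from dataclasses import dataclass, field
from typing import List

# A minimal problem-statement template: the agent does not start until the human has
# made the end goal, level of analysis and constraints explicit. Fields are illustrative.

@dataclass
class ProblemBrief:
    problem: str                    # e.g. "Help the bank's clients save more money"
    end_goal: str                   # what success actually looks like (the reframe)
    level_of_analysis: str          # e.g. "individual customer", "product line", "organization"
    context: str                    # market, audience, prior attempts
    constraints: List[str] = field(default_factory=list)   # budget, time, regulation
    out_of_scope: List[str] = field(default_factory=list)

    def open_questions(self) -> List[str]:
        """Return the questions the agent should ask back before starting (challenge mapping style)."""
        questions = []
        if not self.end_goal:
            questions.append("What outcome would make this problem worth solving?")
        if not self.level_of_analysis:
            questions.append("At what level should the problem be analyzed?")
        if not self.constraints:
            questions.append("What budget, time or regulatory constraints apply?")
        return questions
```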
Both systems would benefit from a clear ontology of concepts at different levels of analysis — which is not something that seems possible with current transformer architecture, as we’ll see in the next section.
2. Processing and inferring
Spread throughout the problem solving process, including in the problem formulation step discussed before, is the process of reasoning or inferring.
As I have extensively argued elsewhere, the core of what problem solving actually consists of is the creation of a good model of the problem, with the right variables picked out and mapped onto the data we collect. The “ping-pong” between framework and data — up and down the ladder of abstraction — is the very heart of problem solving, out of which solutions emerge.
The overall logical process that takes place is typically called “inference to the best explanation”.
This is the key process that humans use in understanding the world. Not only is it used in problem solving, design, engineering, etc., but according to many it is at the heart of the scientific process. There is even evidence that this is how children at the earliest age learn about the world: they start with an approximate model of what the world is like, e.g. objects that are not visible do not exist, and by “running experiments” over the first couple of years they validate or falsify their assumptions to create a better model of the world.
While we call this process “inference to the best explanation”, as I illustrate extensively here, models are not only used for explanation, but also to categorize concepts, understand their constituent elements, make predictions and decide on a course of action.
For example, a researcher approaching a go-to-market problem might start by modeling the target audience as different personas, and then identifying the key messaging for each persona. So he starts by collecting data about the target audience. He then realizes that, rather than as separate personas, the audience should be thought of as a single group with different mindsets. He will thus adjust his model of the problem to mindset mapping, go ahead and collect more data, and so on.
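One way to picture this ping-pong in code: keep a small set of candidate frameworks, score how well each explains the data collected so far, and let a poor fit trigger more data collection (and possibly a different winner) on the next round. This is purely illustrative; the scoring function stands in for human or LLM-assisted judgment.

```python
from typing import Callable, Dict, List

# Toy "inference to the best explanation": candidate frameworks compete to explain the data.
# A weak best fit sends us back down the ladder of abstraction to collect more data.

def best_explanation(
    data: List[str],
    frameworks: List[str],
    fit: Callable[[str, List[str]], float],   # how well a framework explains the data (0..1)
    collect_more: Callable[[], List[str]],    # e.g. run more interviews or desk research
    threshold: float = 0.7,
    max_rounds: int = 3,
) -> str:
    best = frameworks[0]
    for _ in range(max_rounds):
        scores: Dict[str, float] = {f: fit(f, data) for f in frameworks}
        best = max(scores, key=scores.get)
        if scores[best] >= threshold:
            return best                        # good enough model of the problem
        data = data + collect_more()           # otherwise gather more data and re-score
    return best
```

In the go-to-market example above, the fit of "separate personas" would drop as interviews come in, and "mindset mapping" would win the next round.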
This “vertical conception” of problem solving helps us highlight the logical steps that problem-solvers have to take. These include (a sketch of how each might be posed to an LLM follows the list):
Deductive thinking (Top-down)
- Breakdown of a problem (Given a problem, give a MECE (or not) breakdown of its constituent elements)
- From rule to instance (Given a quadrant in a given framework, produce an instance)
- From rule to example in context (Given a framework and a situation, give me a relevant instance)
Inductive thinking (Bottom-up)
- From set of instances to rule (Given N examples, identify a relevant modeling framework)
- From set of instances, context, to rule/learning
Analogical thinking (same level)
- Rephrasing, summarization
- Cross-domain analogy (e.g. if we imagine the digital world as a space, what are a few spaces we could use as analogies for how the comment section of a new website should be?)
- From instances to instances (e.g. given a set of documents, generate new text with the same style and tone of voice)
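These moves can all be phrased as prompts. Here is a hedged sketch of how each reasoning mode might be posed to a text model; the prompt wording and the llm callable are my own assumptions, not a tested recipe.

```python
from typing import Callable

# Each reasoning mode expressed as a prompt template. The model simulates the move at the
# token level; nothing here guarantees semantically valid inference.

REASONING_PROMPTS = {
    # Deductive (top-down)
    "breakdown": "Break the problem '{problem}' into a MECE set of sub-problems.",
    "rule_to_instance": "Framework: {framework}. Give one concrete instance of the "
                        "'{quadrant}' quadrant in the context of {context}.",
    # Inductive (bottom-up)
    "instances_to_rule": "Here are observations:\n{examples}\n"
                         "Propose a framework that organizes them.",
    # Analogical (same level)
    "cross_domain": "If we imagine {domain} as a physical space, what existing spaces "
                    "could {target} resemble, and why?",
}

def reason(mode: str, llm: Callable[[str], str], **slots: str) -> str:
    """Fill the chosen template and hand it to the model."""
    return llm(REASONING_PROMPTS[mode].format(**slots))
```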
The thought processes mentioned above can be simulated by LLMs with some success (I tested all of the examples of reasoning above with ChatGPT running GPT4, and got interesting results), but of course, as “stochastic parrots”, LLMs cannot be said to actually be thinking in terms of categories and levels of analysis. A more symbolic approach would be required to produce actual inferences at a semantic level, as opposed to a token level. Based on my limited understanding, this is what Yoshua Bengio seems to be working on with his GFlowNets.
3. Identifying the type of problem and the type of solution it calls for
The problem solving process itself does not look the same in all situations. Depending on:
- Time constraints and urgency
- Availability of information about the environment
- The existence of best practices for this specific problem
- The strength of feedback loops in the problem system
the problem-solving process will look more linear or iterative. This was illustrated by Dave Snowden with his seminal Cynefin framework.
So the problem type would change what we consider to be the agent’s output. Depending on the case (a rough routing sketch follows this list):
- The agent could decide to act immediately based on a pre-existing model of the problem, without fine-tuning it with new data
- Or it could generate a number of options for possible courses of action, evaluate their utility and pursue the best option
- Or it could sample a number of low commitment options, get initial feedback from the environment, then proceed with a utility maximizing action
- Or it could rapidly choose an option that has worked in the past based on some built-in heuristic, shape the environment by acting upon it, then sense and respond.
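Here is a rough sketch of how an agent might route a (sub-)problem to one of these four behaviours, loosely inspired by Cynefin’s domains. The classification heuristics and labels are invented for illustration.

```python
from enum import Enum, auto

# Loosely Cynefin-inspired routing: the same agent responds differently depending on urgency,
# available information and whether best practices exist. Heuristics are illustrative only.

class Domain(Enum):
    CLEAR = auto()        # best practice exists: act on the pre-existing model
    COMPLICATED = auto()  # analyzable: generate options, evaluate, pick the best
    COMPLEX = auto()      # probe first: cheap experiments, then commit
    CHAOTIC = auto()      # urgent and opaque: act on a heuristic, then sense and respond

def classify(urgent: bool, info_available: bool, best_practice_exists: bool) -> Domain:
    if urgent and not info_available:
        return Domain.CHAOTIC
    if best_practice_exists:
        return Domain.CLEAR
    if info_available:
        return Domain.COMPLICATED
    return Domain.COMPLEX

def respond(domain: Domain) -> str:
    return {
        Domain.CLEAR: "apply a known solution immediately",
        Domain.COMPLICATED: "generate options, score utility, pursue the best",
        Domain.COMPLEX: "run low-commitment probes, read feedback, then commit",
        Domain.CHAOTIC: "act on a heuristic now, stabilize, then sense and respond",
    }[domain]
```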
Part of this classification of the problem into problem types would of course be done by the human in the problem statement definition stage. Even so, if we imagine the agent solving a complex, multi-stage problem, any sub-problem would require this kind of classification, together with an implicit decision of how many resources to invest in each stage of the solution process.
And let’s add here the obligatory caveat about the risk of connecting somewhat unpredictable AI models to the outside world: it is by no means obvious that this can be done safely as of today.
4. Prototyping and implementing solutions
Most of the outputs described above would obviously require actual actuators, beyond ChatGPT’s default text output. This is to some extent provided by APIs that integrate it with business tools. Additionally, we may want some of the following outputs:
- Code -> deployment of the code
- A CAD model -> sent to a 3D printer
- A Figma file -> developed into code -> deployment of the code
- An API call
- etc.
I won’t analyze these one by one, as they all have their own issues. For some of these, prototyping is relatively easy but deployment may be hard (e.g. code, to some extent). For others, prototyping itself is not trivial (e.g. for an app’s interface).
Some of the most immediate ways to generate actions depend on interacting with a business software environment via API. Of course, once the agent is connected to a company’s ERP, CRM, MES, etc. and is autonomously setting up actions that lead to positive outcomes, we can say we have successfully created artificial problem-solving agents.
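To make the actuator idea concrete, here is a hypothetical sketch of a common interface that different deployment backends could implement, so the agent stays agnostic about whether a solution ships as code, an ERP task or something else. Class names and behaviours are assumptions; defaulting to a dry run echoes the safety caveat above.

```python
from abc import ABC, abstractmethod
from typing import Dict, List

# A hypothetical actuator layer: each backend knows how to turn one kind of solution
# artifact into something deployed in the world. Names and behaviours are illustrative.

class Actuator(ABC):
    @abstractmethod
    def can_handle(self, artifact_type: str) -> bool: ...
    @abstractmethod
    def deploy(self, artifact: str, dry_run: bool = True) -> str: ...

class CodeDeployer(Actuator):
    def can_handle(self, artifact_type: str) -> bool:
        return artifact_type == "code"
    def deploy(self, artifact: str, dry_run: bool = True) -> str:
        # In reality: run tests, open a PR, ship behind a feature flag.
        return "code reviewed only (dry run)" if dry_run else "code deployed"

class ErpTaskCreator(Actuator):
    def can_handle(self, artifact_type: str) -> bool:
        return artifact_type == "erp_task"
    def deploy(self, artifact: str, dry_run: bool = True) -> str:
        # In reality: a call to the company's ERP or CRM API, with human sign-off.
        return "task drafted for approval" if dry_run else "task created in ERP"

def deploy_all(artifacts: Dict[str, str], actuators: List[Actuator], dry_run: bool = True) -> List[str]:
    results = []
    for artifact_type, payload in artifacts.items():
        for actuator in actuators:
            if actuator.can_handle(artifact_type):
                results.append(actuator.deploy(payload, dry_run=dry_run))
    return results
```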
The gaps
In looking at each of the issues above, we have identified some of the technological gaps to fill. All in all, it looks like ChatGPT plus APIs can get us some of the way there. We are waiting to see what implementations of AutoGPT and LangChain could produce. I am very optimistic that, within the current LLM-based paradigm — i.e. one based on stochastic prediction of tokens, not on manipulating concepts — we can still get business-grade solutions (as opposed, e.g., to scientific- or medical-grade) for many of the problems we work on in our day-to-day.
The product that seems to come the closest to this vision is Thoughtworks’s Boba. Even though it takes a co-pilot approach rather than the agent one, it does a great job at covering the first part of the problem-solving process, up to storyboarding ideas.
The other big question looming large is: how can a business be built around problem-solving AIs? Here are some of the critical issues:
- Pace of changing tech — you may start implementing a GPT4-based solution, and then a better technology comes along
- Cost — currently not sustainable for AI companies. Should they try to become profitable, prices would skyrocket
- Rights — the controversy about who should own the rights to AI-generated output based on non-open-source material is far from over
- Regulation — after many authoritative calls to regulate AI, authorities may actually start doing so
The biggest question of all is that of the moat. In the research quoted above, McKinsey sees end user applications as having the highest potential for new entrants. But how can a sustainable business be built on top of largely commodified AI models?
Here are a few possible sources of competitive advantage that come to mind:
- Availability of original training data
- Marketplace & network effects
- Existing customer base
- Ownership of implementation environment
- The ability to actually demonstrate superior performance by having the right evaluation tools
Concluding thoughts
What would a future of semi-autonomous problem-solving agents look like?
- In the first instance, it’s a world in which more problems are solved. Overall, most products’ user experience will be better; most strategies will be more solid, most operations optimized. The lower cost of AI agents as opposed to traditional agencies, consultancies and in-house professionals means that the playing field will likely be leveled for small players. And this, of course, will be just the beginning: as models improve, we’ll be able to tackle not only business problems, but to have swarms of research AIs developing new drugs and uncovering new laws of fundamental physics.
- Every problem that can be solved in multiple variants, will. Mass customization has been around as a concept for a very long time. In many spaces, it was never fully feasible, or for that matter, desirable. With problem-solving AIs, the ability to implement customizations that are more than just cosmetic at almost no cost (at least for software), might change this.
- Let’s be honest, a world of semi-autonomous problem-solving agents could be a world with fewer human problem-solvers. For a while, humans may re-focus their energy on the communicative, emotional and consensus-building aspects of problem-solving and innovation. But there is no reason to think that AI won’t be better at that too at some point. In an experiment, ChatGPT has already been scored as more empathetic to patients than real doctors. We can definitely envision a future in which AI agents will know how to build consensus better than humans.
I think that as more and more analytical, creative and even relational tasks get taken over by AI, the space that will remain exclusively human is the world of intentions. It can’t be “machines all the way down”: AI might write a better love poem than most of us, but it is still up to us to ask an AI to write it. Similarly, as products and strategies get taken over by AI, where humans will keep making the difference is in setting up machines to pursue worthwhile, impactful and humane goals.
The author is a Strategic Design Director at Designit
Cover image: William Blake, The Ancient of Days