The Hidden Fabric of Knowledge

— Part 1 of Modeling in Problem Solving —

Introduction

  • Good projects don’t start with good data, but with good frameworks or theories; the data follows. Some consulting disciplines acknowledge this (management consulting), others less so (design thinking), falling prey to empiricism, on which more later.
  • Good projects are those in which the main framework we end up using is NOT the one we started with, but an elaboration, an iteration. In the very best projects, the evidence is observed through different, even opposing frameworks before conclusions are drawn.
  • So what is a good framework? A good framework is one that is specific enough to enable me to say something interesting about a given situation, but not so specific as to “overfit”, losing any wider applicability. A great, classic example is the endlessly popular BCG matrix. Additionally, I’ve found that taking a framework from one discipline and applying it in a different context often yields interesting, unexpected results. For example, while a value stream map is typically considered an operations framework, its use in service design can both add important layers of information that are missing in traditional blueprints, and systematize the way information is visualized.
  • Consulting projects are powered by rational thought. Thought seems incredibly abstract and shapeless, but I’ve found that most of the thinking we do in problem solving falls into one of three kinds: top-down/breakdown/deduction; bottom-up/clustering/induction; and lateral/metaphorical/parallel thinking. This will strike some as a tautology, and I don’t claim it is fully exhaustive. But as a matter of practice, when I’m stuck on a concept and don’t know how to proceed, the spatial metaphor helps me move on: which way should my thought move? Up, down, or to the side?
  • Across subjects and sectors, there are patterns in reality that, if recognized, can be used as heuristics hinting at how to solve the problem: aggregations of independent, discrete random events will tend toward a normal distribution (the central limit theorem at work); if the events instead build on top of each other or involve feedback loops, you’ll get a long tail and the Pareto principle applies (the simulation after this list illustrates the contrast). Any decision taken in the context of limited resources can be modeled as effort vs. reward, whether it’s us prioritizing an action or a user evaluating whether to download an app. If with limited resources you are looking at multiple actors with different agendas, you’ll likely run into patterns like the tragedy of the commons, prisoner’s dilemmas, etc.
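To make the first of these patterns concrete, here is a minimal simulation in Python (standard library only; the specific parameters are illustrative): summing independent random events yields a symmetric, bell-shaped outcome, while compounding them multiplicatively, a crude stand-in for feedback loops, yields a long right tail.

    import random
    import statistics

    random.seed(42)

    # Additive aggregation: sums of many independent events.
    # The central limit theorem predicts a roughly normal outcome.
    additive = [sum(random.uniform(0, 1) for _ in range(100))
                for _ in range(10_000)]

    # Multiplicative aggregation: each event scales the running total,
    # a crude stand-in for feedback loops and compounding effects.
    multiplicative = []
    for _ in range(10_000):
        total = 1.0
        for _ in range(100):
            total *= random.uniform(0.9, 1.15)
        multiplicative.append(total)

    for name, xs in (("additive", additive), ("multiplicative", multiplicative)):
        print(f"{name:>14}: mean={statistics.mean(xs):8.2f} "
              f"median={statistics.median(xs):8.2f} max={max(xs):10.2f}")
    # Additive: mean and median nearly coincide (symmetric bell curve).
    # Multiplicative: mean sits well above the median (long right tail,
    # Pareto-like concentration in the top outcomes).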

Design Thinking and the empiricist trap

A classic formulation of the Design Thinking methodology
  • The empiricist approach is false: from a purely descriptive perspective, we never simply start with data: any project is approached with pre-existing biases and mental models. Whatever knowledge I have before the project starts will shape how I think about it, from data collection and feature engineering all the way to ideation and recommendations.
  • The empiricist approach is biased: in a more normative sense, if empathizing/data collection is the first thing we do, we will inevitably fall prey to availability bias and just work with whatever dataset we have available. The dataset will likely display sampling bias and will have been sliced in a way that doesn’t necessarily serve our project goal. For example, Priscilla (the protagonist of our running example) may assume that level of income is an important variable, and set out to survey people in different income buckets. A non-empiricist approach would have revealed that other variables should be prioritized instead.
  • The empiricist approach is incremental: to paraphrase Henry Ford’s famous saying, if we start by listening to clients, all we’ll hear is that they want faster horses. If Priscilla starts by interviewing customers, she is likely to hear that people want faster service, fewer bugs, a wider variety of options. In short, no one will say anything truly new. This approach can be useful and should be used in many contexts, but it can quickly turn into a liability if what we are after is true innovation.
  • In the social sciences, searching the data until something statistically significant turns up is called “p-hacking”, and it’s considered a cardinal sin; the simulation after this list shows why.
  • In business, the “data driven organization” orthodoxy can easily translate into a paradigm where all kinds of data are collected and analyzed without a clear strategic framework determining what kind of data we care about, which data is a proxy for which variable, and so on.
  • Much of machine learning as a discipline takes datasets as a given and extracts insight in a bottom-up way, which can leave it disconnected from essential higher-level, contextual knowledge. In 2008 Chris Anderson wrote an article titled “The End of Theory: The Data Deluge Makes the Scientific Method Obsolete”. While some of the wording may itself feel dated, the principle stands today more than ever, especially for disciplines like deep learning: modeling is not done in a top-down, theory-driven way, but in a bottom-up, often black-box way.
  • More generally, this is how we tend to approach our day-to-day problems: too often we think the answer will come if only we collect more information about the problem, when in fact this is just a way of procrastinating the actual work on a solution².
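A quick way to see why p-hacking is a sin: test enough hypotheses against pure noise and some will come out “significant”. A minimal sketch (assuming numpy and scipy are installed; all numbers are illustrative):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(7)
    n_subjects, n_candidates = 100, 100

    outcome = rng.normal(size=n_subjects)                      # pure noise
    predictors = rng.normal(size=(n_candidates, n_subjects))   # also pure noise

    significant = []
    for i in range(n_candidates):
        r, p = stats.pearsonr(predictors[i], outcome)
        if p < 0.05:
            significant.append(i)

    print(f"{len(significant)} of {n_candidates} candidate 'predictors' "
          f"correlate with the outcome at p < 0.05")
    # Expect roughly 5 false positives: with no framework constraining which
    # hypotheses are worth testing, the data will "confirm" something regardless.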

Starting with the a-priori

Introducing the Problem Solving Map: from the part to the whole, from the empirical to the a-priori

Problem solving as recursive reinterpretation

  • Framing the chart: the problem statement, or in other words the intention with which we approach the problem, defines the placement of the chart itself, determining what macro-phenomenon we consider our “whole” in need of explaining, predicting or solving.
  • The space of a-prioris, or the space of frameworks, which contains everything I know before the start of the project: from the most generic ideas about the world (like the fact that a large number of uncorrelated occurrences will tend to be normally distributed), to very specific domain knowledge (e.g. the revenues of the top three players in the chrome-plated steel bars market), as well as any cognitive biases I may be carrying with me.
  • The space of representation, where my framework, operationalized into variables, encounters real-world data. This is the space of modeling, much more on which later; the sketch after this list makes the operationalization concrete.
  • The space of observation, which contains my data, an intentional abstraction of a real-world observation, as well as my variables, a further abstraction or operationalization of my data.
  • And finally, the actual world, where we observe and build things.
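To make the middle layers less abstract, here is a hypothetical sketch of how a framework concept travels through the space of representation down to observed data. The concepts and proxies are examples only; one of them, apps downloaded as a proxy for digital literacy, reappears in Priscilla’s story below.

    from dataclasses import dataclass

    @dataclass
    class Operationalization:
        concept: str    # an a-priori notion from the framework
        variable: str   # the measurable stand-in for that notion
        proxy: str      # the actual data we observe for the variable

    persona_framework = [
        Operationalization("digital literacy", "app_count",
                           "number of apps downloaded"),
        Operationalization("price sensitivity", "discount_share",
                           "share of purchases made on discount"),
    ]

    for op in persona_framework:
        print(f"{op.concept!r} -> variable {op.variable!r}, "
              f"observed via {op.proxy!r}")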
The hermeneutic circle
The hermeneutic loop represented on the Problem Solving Map
  • Either she intervenes at the level of variables, and decides that, while personas are a useful framework, she may have to scrap the demographic variables, and shift from using the number of apps downloaded as a proxy for digital literacy to asking users directly about their habits via a questionnaire.
  • Or she can be more radical and intervene at the level of the framework: personas may not be the way to go after all; she may want to start with mindsets instead, or reframe the problem completely, from user-centered to market-centered, and start with a market sizing and segmentation. If so, she will restart the circle and contextualize the new framework, say mindsets, with new data, for example the number and type of users of a competing app.

Multiple concatenated loops

  • How I formulate the problem affects everything else.
  • The framework I choose affects how I pick the variables and engineer them, and how I sample, slice and dice the data.
  • The way I frame the variables affects how I end up tagging or coding my dataset.
  • A mismatch between data and labels will make me question my variables.
  • The realization that my model requires additional variables, or that some variables have to be dropped (e.g. due to correlation), will make me question my model.
  • If my model underperforms in terms of predictive or explanatory power, I will question my framework (more on this later).
  • And, as we said, if a few frameworks in a row don’t survive contact with real-world data, I might be out to solve the wrong problem. The schematic sketch below makes this nesting explicit.
Interlocking reinterpretation loops in problem solving
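Here is a schematic rendering of these interlocking loops. Every helper function is a stub standing in for human judgment; the nesting, not the implementation, is the point.

    # A schematic of the interlocking reinterpretation loops.
    def solve(problem_statement, frameworks):
        for framework in frameworks:                      # a-priori space
            variables = operationalize(framework)
            while labels_mismatch(variables):             # inner loop: question variables
                variables = revise_variables(variables)
            model = fit(framework, variables)
            if performs_well(model):
                return model                              # otherwise: question framework
        # Outermost loop: no framework survived, so question the problem itself.
        return solve(reframe(problem_statement), frameworks)

    # Illustrative stubs so the sketch runs end to end:
    def operationalize(fw): return [f"{fw}_proxy_v1"]
    def labels_mismatch(vs): return vs[0].endswith("v1")   # force one revision
    def revise_variables(vs): return [v.replace("v1", "v2") for v in vs]
    def fit(fw, vs): return (fw, vs)
    def performs_well(model): return model[0] == "mindsets"  # personas "fails"
    def reframe(p): return f"reframed({p})"

    print(solve("help Priscilla's app grow", ["personas", "mindsets"]))
    # -> ('mindsets', ['mindsets_proxy_v2']): the framework we end with is not
    #    the one we started with, echoing the point made in the introduction.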

The cardinal sin of problem solving is taking the framework for granted and jumping straight to research: it’s a recipe for dull insights and biased, incremental solutions.

The full hermeneutic model of problem solving on the PSM

Notes

  • A preliminary data analysis (exploratory data analysis, or “EDA”) is a common approach in many problems in which the dataset is a given. This is not wrong. However, it’s important to realize that when we explore a dataset, we never do so as a blank slate: depending on the discipline we come from, the techniques we have learned and the problem statement, we’ll either be looking at distributions and data types in given columns, or trying to identify behavioral patterns, and so on. Increasing our awareness of the “lenses” through which we are looking at data is key to reducing bias and stimulating innovation; the toy example after these notes shows two such lenses on the same dataset.
  • Similarly, while frameworks such as Cynefin and OODA loops recommend an “empiricist”, “sensing-first” approach to novel, complex or chaotic situations, we need to realize that, again, this does not happen in a vacuum. The things we’ll look for in any sensing activity, or the set of possible pre-emptive actions to implement in chaotic, unpredictable situations, will depend on what we already know: what have we learned to observe? What is our repertoire of possible actions in this kind of situation? Where does our muscle memory take us? Hence the importance of having a solid “latticework of mental models”, even when modeling as an activity stays mostly implicit; more on this in installment 3.
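As a small illustration of those “lenses”, here are two ways of exploring the same toy dataset (assuming pandas is installed; the data is invented):

    import pandas as pd

    # A toy user-events table (hypothetical data, for illustration only).
    df = pd.DataFrame({
        "user_id": [1, 1, 2, 2, 2, 3],
        "event": ["open", "search", "open", "purchase", "open", "open"],
        "timestamp": pd.to_datetime([
            "2023-01-01 09:00", "2023-01-01 09:05", "2023-01-02 10:00",
            "2023-01-02 10:30", "2023-01-03 08:00", "2023-01-04 12:00",
        ]),
    })

    # Lens 1: a data-quality lens -- types, missingness, distributions.
    print(df.dtypes)
    print(df.isna().mean())

    # Lens 2: a behavioral lens -- the same rows read as user journeys.
    print(df.groupby("user_id")["event"].count())          # activity per user
    print(df.sort_values("timestamp")
            .groupby("user_id")["event"].first())          # each user's entry event
    # Same dataset, different lenses: what we "see" in EDA depends on the
    # framework we bring to it.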
