By Gaurav — 17 May 2025

Problem Solving

The first rule of problem-solving is to diagnose before looking for solutions. Like all rules, it has exceptions: sometimes it is cheaper to try a solution as a way of both diagnosing and solving the problem. For instance, to diagnose allergies, you could give an antihistamine. If the symptoms improve, an allergy is the likely cause of the runny nose.

Diagnosis

There are three parts to diagnosis: 1. Generating explanations, 2. Generating priors for the explanations, and 3. Isolating the cause.

Generating Explanations
There is an art to generating explanations. Crude explanations are rarely helpful. For instance, one could posit that the model is failing because the data is ‘bad’. However, for the discovery to be actionable, you need to articulate what is wrong with the data, e.g., the camera has one or more dead pixels. One way to get to granular hypotheses is the three whys method (also called the five-year-old’s method): ask why enough times to reach a precise enough hypothesis. There are three broad ways of generating explanations:
1. Finding correlations. To generate the candidate set, find variables correlated with the error. Say, for instance, that you are looking to explain why the ETA prediction model is failing. To find explanations, test if the error varies by location, time of day, day of week, etc. It is often useful to test correlations with a continuous rather than a Boolean variable. This allows you to test whether an increase in the source variable goes with an increase in the effect. For instance, if you wanted to ascertain whether occlusion is causing worse performance, you could check whether the greater the occlusion of, say, the traffic light, the worse the performance.
2. Learning from failures. Selecting on the dependent variable is rightly frowned upon. When you select on the dependent variable, even correlation is not guaranteed. However, analyzing failures is the go-to trick for generating explanations. It can be thought of as an inductive reasoning method: we use an example (or examples) to develop a more general hypothesis. The process may work as follows. Start by selecting failures. Sample failures randomly, or pick the worst errors; the worst errors are often the site of the most obvious problems. Next, look at each of these examples closely. For instance, you may want to trace the example through the system. Or you may want to compare the failure with ‘similar’ successful cases to find potential patterns. Say you run an ETA prediction company. Say, when you look at the data with the worst misses, you discover that location pings received a minute apart are hundreds of miles apart. This can then lead you to the diagnosis that your application is installed on multiple devices.
3. Ideation. The first two methods build on existing data, but existing data may not be enough. Often, it is useful to ideate independently about potential causes of failures and then investigate whether you have the right data to triage the problem. Plausibly the most important way to look at the problem is from a systems perspective. Use it to walk through the system and figure out where problems can happen.
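
The first two methods can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of any particular system: the data is made up, and the names (`pearson`, `implausible_jumps`, `abs_error_min`, the 200 km/h threshold) are hypothetical. It correlates error with a continuous occlusion score, picks the worst misses, and flags consecutive location pings that imply an impossible speed.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def implausible_jumps(pings, max_kmh=200.0):
    """Flag consecutive pings whose implied speed exceeds max_kmh.

    pings: list of (timestamp_seconds, lat, lon), sorted by timestamp.
    The 200 km/h default is an arbitrary, hypothetical threshold.
    """
    flagged = []
    for (t1, lat1, lon1), (t2, lat2, lon2) in zip(pings, pings[1:]):
        hours = (t2 - t1) / 3600
        if hours <= 0:
            continue  # skip duplicate or out-of-order timestamps
        speed_kmh = haversine_km(lat1, lon1, lat2, lon2) / hours
        if speed_kmh > max_kmh:
            flagged.append((t1, t2, round(speed_kmh)))
    return flagged


# 1. Finding correlations: does error rise with a continuous occlusion score?
occlusion = [0.0, 0.1, 0.3, 0.5, 0.7, 0.9]          # made-up data
error_rate = [0.02, 0.03, 0.08, 0.15, 0.22, 0.35]   # made-up data
r = pearson(occlusion, error_rate)

# 2. Learning from failures: start with the worst misses.
trips = [
    {"id": "a", "abs_error_min": 2.0},
    {"id": "b", "abs_error_min": 41.0},
    {"id": "c", "abs_error_min": 5.5},
    {"id": "d", "abs_error_min": 18.0},
]
worst = sorted(trips, key=lambda t: t["abs_error_min"], reverse=True)[:2]

# Then look closely at one failure: pings a minute apart, far apart in space.
pings = [(0, 37.77, -122.42), (60, 37.78, -122.41), (120, 34.05, -118.24)]
jumps = implausible_jumps(pings)
```

A high `r` only says the error moves with occlusion; it generates a candidate explanation, which still has to be isolated as the cause. Likewise, a flagged jump is a prompt to look closer (for example, at whether the same account is reporting from several devices), not a diagnosis by itself.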

Continue reading here (PDF).
