What a Software Engineer Can Learn from Sherlock Holmes
May 16, 2022
Among the works of great literature, Sir Arthur Conan Doyle’s Sherlock Holmes stories are some of the most accessible. As a young teenager I devoured them, reading most of the fifty-six short stories and some of the full-length novels. Many years have passed since I last immersed myself in a Sherlock Holmes mystery, but I recently discovered that none other than the great Stephen Fry narrates the entire collection. I immediately dove into the material with relish, listening through every adventure of the mythical detective as told by his faithful biographer.
While listening, I rediscovered a nostalgic love for the genius of Holmes, whose nonchalant manner, coupled with a commanding intellect, inspired me when I was young. But I had another reaction that surprised me. As Fry brought the characters to life with his nuanced and engaging narration, it became clear that Holmes would have made a brilliant software engineer. Following that insight, I found myself drawing lessons from the great detective, which I discuss in this essay. These can be summarized as follows:
- Do not draw conclusions until you have sufficient data
- Construct testable hypotheses when solving problems
- Be a student of your craft
Do Not Draw Conclusions Until You Have Sufficient Data
In Sir Arthur Conan Doyle’s first Holmes novel, A Study in Scarlet, there is a particular line from Holmes that should strike the modern reader as incredibly relevant. Holmes and Watson, newly established roommates, are just beginning to develop that mythological bond with which we are deeply acquainted. On their way to the first crime scene, Watson is surprised to find Holmes highly communicative on a subject entirely unrelated to the case: musical theory. Their exchange goes as follows (for those unfamiliar with Sherlock Holmes, the stories are written in first person, from Watson’s perspective):
“You don't seem to give much thought to the matter in hand,” I said at last, interrupting Holmes' musical disquisition.
“No data yet,” he answered. “It is a capital mistake to theorize before you have all the evidence. It biases the judgment.” (from A Study in Scarlet, Chapter 3, The Lauriston Garden Mystery).
This is an incredible statement by Holmes. I’ve always felt that in the popular imagination, as is the case with many of our fictitious geniuses, Holmes is viewed as a conjurer of truth. However, here he is not merely putting on a show of vain nonchalance. Instead, his “musical disquisition” is a natural outpouring of the disciplined practitioner. Holmes refuses to theorize until he has sufficient data available to him.
Another potent example of this obsession is seen when he actually fails to follow this principle. In The Adventure of the Abbey Grange, a short story from The Return of Sherlock Holmes collection, Holmes and Watson attend to a murder scene in Chislehurst, Kent. At first, the case appears to be a simple one: three burglars broke into a home and, in the course of their crime, killed the homeowner. Later, while about to catch a train back to London, Holmes suddenly and surprisingly changes his mind, causing both him and Watson to miss the ride. He explains to Watson that something about the facts of the case bothered him:
“... the lady's story was complete, the maid's corroboration was sufficient, the detail was fairly exact… But if I had not taken things for granted, if I had examined everything with care which I would have shown had we approached the case de novo and had no cut-and-dried story to warp my mind, would I not then have found something more definite to go upon?” (The Return of Sherlock Holmes: The Adventure of the Abbey Grange).
Here Holmes points out something critical: instead of approaching the case with an unbiased mind, examining the facts for himself, drawing his own conclusions, and only then hearing the opinions of others, he started by listening to the stories of others.
As I contemplated this incredible trait of Holmes, I couldn’t help but think of several key applications in my own world. First, and perhaps most obvious, is Machine Learning (ML). While some may take data volume as a given in our byte-saturated world, it is nonetheless exceedingly easy to apply ML where examples are insufficient, resulting in a model that overfits the few examples it has and generalizes poorly. Second, less sexy, and yet very key, is solving a problem before you deeply understand it. I have countless examples from my own career where a customer or colleague would begin describing a need to me and, before they had finished explaining the situation, I had already tried to apply some kind of technical solution to the problem. Often, these solutions were overly complicated and still insufficient to solve the problem. Third, and perhaps most common, is trying to fix bugs before fully examining the context in which the bug appears. This third situation results in significant time wasted as the wrong threads are pulled and randomness is applied in haphazard “turn-it-off-and-on-again” attempts to fix the issue. As Holmes explains, “... it is an error to argue in front of your data. You find yourself insensibly twisting them round to fit your theories.” (from The Adventure of Wisteria Lodge).
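The first point, about insufficient examples, can be made concrete with a small experiment. The sketch below is only an illustration under stated assumptions: the data is synthetic, the "model" is a one-feature least-squares line, and the names fit_slope and typical_slope_error are invented for this example. It fits the same kind of model many times from a handful of noisy examples and from a few hundred, then compares how far the fitted slope typically lands from the true one.

```python
import random
import statistics


def fit_slope(xs, ys):
    """Ordinary least-squares slope for a single feature."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def typical_slope_error(n_examples, trials=500, true_slope=2.0, noise=1.0, seed=0):
    """Median absolute error of the fitted slope over many synthetic datasets."""
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    errors = []
    for _ in range(trials):
        xs = [rng.uniform(0.0, 1.0) for _ in range(n_examples)]
        ys = [true_slope * x + rng.gauss(0.0, noise) for x in xs]
        errors.append(abs(fit_slope(xs, ys) - true_slope))
    return statistics.median(errors)


# Synthetic comparison: a handful of examples versus a few hundred.
print("typical slope error with   3 examples:", round(typical_slope_error(3), 3))
print("typical slope error with 200 examples:", round(typical_slope_error(200), 3))
```

With only three examples the fitted slope is dominated by noise, which is the statistical version of theorizing before the evidence is in.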
Construct Testable Hypotheses When Solving Problems
My first professional mentor, a wonderful fellow named Clyde, while not a scientist, nonetheless possessed some formal training in the scientific method. He used to talk through simple statistical problems out loud, teaching me to work methodically and to construct hypotheses when solving problems. Then, near the end of an exercise, he would say something like, “so if our theory is correct, we should see some xyz pattern in our behavior when we run abc analysis.”
It is with some delight that I found Holmes to be quite adept at the scientific method as well. Once he had evaluated his data (see the previous section), he would formulate a testable hypothesis. After testing his hypothesis, depending on the results he might either sit back triumphantly or sigh with some sense of reluctant acceptance of failure. In fact, Holmes posits that this rigorous evaluation of hypotheses is the central theme of criminal investigation. He says to Watson, “One should always look for a possible alternative and provide against it. It is the first rule of criminal investigation.” (The Adventure of Black Peter, The Return of Sherlock Holmes).
One of the ways Holmes exercises this principle is through a testable “working hypothesis” (from The Five Orange Pips, The Adventures of Sherlock Holmes). Consider the following statement, in which Holmes approaches an open-ended situation whose timeline of events is not clear:
“Well, now, Watson, let us judge the situation by this new information. We may take it that the letter came out of this strange household and was an invitation to Garcia to carry out some attempt which had already been planned. Who wrote the note? It was someone within the citadel, and it was a woman. Who then but Miss Burnet, the governess? All our reasoning seems to point that way. At any rate, we may take it as a hypothesis and see what consequences it would entail.” (from The Adventure of Wisteria Lodge, His Last Bow).
Here, Holmes takes the uncertainty and opaqueness of the situation and, in order to make progress, constructs a hypothesis and then holds it loosely in the palm of his hand. But notice the phrase “and see what consequences it would entail.” What does he mean by this? The consequences are the results of investigation and experimentation. Holmes is constructing a testable theory that he feels is most worthy of pursuit. Because it is testable, he can therefore prove or disprove the theory as he gathers more facts.
What does this teach software engineers? The parallels are quite strong when we consider the opaqueness of trying to implement software for a business case. In a perfect world there would be no murder and no ill-defined requirements. But what can we do about it? The same thing Holmes does: construct a testable, working hypothesis. This is plainly seen when we consider the difficult-to-master technique of Test-Driven Development (TDD). This approach involves writing tests that fit the requirements before ever writing any production code. As the coder, you define what you expect the behavior of the code to be through the test. That test should and will fail at the beginning, because you have nothing to test with, as yet. However, as you write production code to pass the test, you are naturally pressured to only write the minimal code necessary to “prove” the test as true. Further, the code you write is testable. Think of the consequences of having untestable code. How do you know it’s correct? Can you prove, within reason, that it's correct? If you cannot prove that code will behave as you claim, then you are in the same boat as a prosecutor feeling strongly that someone is a murderer, but only being able to provide circumstantial evidence.
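To make the test-first idea concrete, here is a minimal sketch using Python's built-in unittest module. The business rule (orders of 100.00 or more ship free, everything else pays a flat 5.00 fee) and the names shipping_cost and ShippingCostTest are hypothetical, invented purely for illustration. In a real TDD cycle the test class would be written first and would fail until the few lines of production code were added.

```python
import unittest

# Hypothetical business rule, assumed for this example only.
FREE_SHIPPING_THRESHOLD = 100.00
FLAT_SHIPPING_FEE = 5.00


def shipping_cost(order_total):
    """Minimal production code: just enough to satisfy the tests below."""
    if order_total >= FREE_SHIPPING_THRESHOLD:
        return 0.0
    return FLAT_SHIPPING_FEE


class ShippingCostTest(unittest.TestCase):
    """The working hypothesis, stated as executable expectations."""

    def test_small_order_pays_flat_fee(self):
        self.assertEqual(shipping_cost(40.00), 5.00)

    def test_order_at_threshold_ships_free(self):
        self.assertEqual(shipping_cost(100.00), 0.0)

    def test_large_order_ships_free(self):
        self.assertEqual(shipping_cost(250.00), 0.0)
```

Saved to a file, the tests can be run with python -m unittest. Each test is a small, falsifiable claim about behavior, so a failure points directly at the expectation that was violated.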
This same mindset applies when debugging production issues. Consider, for example, that you have developed some theory as to why some micro-service keeps crashing. In fact, you feel very confident in your theory. How could you disprove it? Well, let’s say you believe it has something to do with the memory available on a pod/cluster/server/whatever. What could you do? Well, argue for the opposite case. If it isn’t the memory, then you should be able to provide some kind of memory intensive process and not see a crash in the same manner as before. Further, you could pull logs from when the crash occurs and then, during your memory-intensive test, watch for the same behavior.
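The comparison described above, checking whether a memory-intensive test reproduces the same failure seen in production, can be sketched in a few lines. Everything here is an assumption made for illustration: the log lines are invented, the log format is hypothetical, and crash_signature and memory_theory_survives are made-up helper names rather than part of any real tool.

```python
import re


def crash_signature(log_lines):
    """Reduce a log to its set of error messages, ignoring volatile details."""
    signature = set()
    for line in log_lines:
        if "ERROR" not in line:
            continue
        message = line.split("ERROR", 1)[1]
        # Replace digits so ids, sizes, and timestamps do not hide a match.
        signature.add(re.sub(r"\d+", "N", message).strip())
    return signature


def memory_theory_survives(production_log, stress_test_log):
    """The theory survives only if the stress test reproduces the same errors."""
    production = crash_signature(production_log)
    reproduced = crash_signature(stress_test_log)
    return bool(production) and production <= reproduced


# Hypothetical log excerpts, invented for this example.
production_log = [
    "2022-05-01T10:00:01 INFO request 8841 accepted",
    "2022-05-01T10:00:02 ERROR worker 17 failed to allocate 512 MB",
    "2022-05-01T10:00:02 ERROR worker 17 exited",
]
stress_test_log = [
    "2022-05-02T09:30:00 INFO starting memory-intensive job",
    "2022-05-02T09:30:05 ERROR worker 3 failed to allocate 2048 MB",
    "2022-05-02T09:30:05 ERROR worker 3 exited",
]
unrelated_log = [
    "2022-05-02T11:00:00 ERROR connection to database timed out",
]

print(memory_theory_survives(production_log, stress_test_log))  # True
print(memory_theory_survives(production_log, unrelated_log))    # False
```

A result of False does not say what the real cause is. It only says the memory theory failed a test it should have passed, which is exactly the kind of consequence Holmes would want to see before committing to it.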
Be a Student of your Craft
Holmes is the consummate student of his field. He lives out the conviction that ideas and art are greater than the philosopher who develops them or the artist who creates them, respectively. To Holmes, it is a privilege to study an interesting case. This call to higher study is reflected in his words to Watson: “I must thank you for it all. I might not have gone but for you, and so have missed the finest study I ever came across: a study in scarlet, eh?” (A Study in Scarlet, Chapter IV: What John Rance Had To Tell). Here Holmes is thanking Watson for being the catalyst to pursue a case that contains interesting and notable points. There are several applications in Holmes’ dedication to studying his craft. First, there is significant value in understanding the historical context in which you work. Second, and a natural extension of the first, core patterns exist in the field and, armed with that historical knowledge, a good student can identify those patterns in their current work. Third, an excellent practitioner acquires specialized knowledge that can be used at a moment’s notice.
Sherlock Holmes has a certain reputation for coldness and even callousness at times. While this is not without truth, Holmes nonetheless gives access to his gifts by indirectly mentoring various younger detectives from Scotland Yard. In one such exchange, Holmes explains the significance of understanding history to a mentee, asking if he’d heard of one “Jonathan Wild,” a “master criminal” from the last century. When the younger detective replied in the negative, and said he preferred more practical knowledge, Holmes replied in the following way:
“... the most practical thing that you ever did in your life would be to shut yourself up for three months and read twelve hours a day at the annals of crime. Everything comes in circles—even Professor Moriarty. Jonathan Wild was the hidden force of the London criminals, to whom he sold his brains and his organization on a fifteen per cent commission. The old wheel turns, and the same spoke comes up. It's all been done before, and will be again” (from The Valley of Fear, Chapter II: Sherlock Holmes Discourses).
There are deeper and broader lessons here applicable to more than just software engineering; each of us would greatly benefit from understanding the context in which we live. But, for the purposes of this moment, let us consider what Holmes is saying. He believes the history of his field of study contains incredibly useful information that can be applied immediately. The value of that knowledge is so high, he recommends to his young mentee that he dedicate significant portions of time to understanding that history. In this particular case, Holmes says the value of this knowledge is seen in understanding the patterns of crime. I will get to patterns soon. But, very briefly, there are some other benefits derived from understanding how history unfolded to the point where you and I are sitting at our computers writing code. Historical knowledge gives you a sense of the future. The success of ML is a powerful example; while history does not exactly predict the future, it can provide a useful heuristic for decision making. Beyond ML, I find myself having to make technical decisions each day that will unfold for good or evil in the future. How am I to choose the right architecture or solve the best problem? History can act as that guiding trend-line which, while not perfect, can nonetheless assist those choices.
Understanding historical context, according to Holmes, has the primary benefit of seeing patterns in criminal behavior. In the case described previously, Holmes is comparing a criminal mastermind of the past century to his own contemporary, the evil genius Professor James Moriarty. In this way, he is showing that the idea of a mastermind criminal is not impossible, but has in fact happened previously. He describes criminal history as a wheel that turns, implying that the same patterns reappear again and again.
However, Holmes does not limit his studies to the broader sweeps and patterns of history. In a more granular application, he has developed an ability to see a wide variety of patterns in his work that intuitively leap out at him. Later on in the very same case described above, Holmes is struck by a seemingly random detail: at the murder scene, a murder which took place in a home surrounded by water, he sees a single dumbbell.
“You will remember, Inspector MacDonald, that I was somewhat struck by the absence of a dumb-bell. I drew your attention to it; but with the pressure of other events you had hardly the time to give it the consideration which would have enabled you to draw deductions from it. When water is near and a weight is missing it is not a very far-fetched supposition that something has been sunk in the water. The idea was at least worth testing; so with the help of Ames, who admitted me to the room, and the crook of Dr. Watson's umbrella, I was able last night to fish up and inspect this bundle.” (from The Valley of Fear, chapter 7, The Solution).
Here the idea is summarized with “When water is near and a weight is missing it is not a very far-fetched supposition that something has been sunk in the water.” Holmes’ knowledge of this basic pattern is so built into his system that it arises as more of an intuition (though a very explicit one) than as a pattern. This should be the goal of a true student.
Patterns within software engineering are numerous and incredibly useful. However, I confess that I have not personally given myself to studying them to the degree that they are an intuitive part of my problem solving expertise. Having a level of familiarity and comfort with the nuances of patterns provides incredible benefits, including the ability to design new systems faster and the ability to troubleshoot existing systems with efficiency. Holmes puts this quite powerfully, when he first explains to Watson that he is a consulting detective and that his skills are sought out by the official police force:
“I am generally able, by the help of my knowledge of the history of crime, to set them straight. There is a strong family resemblance about misdeeds, and if you have all the details of a thousand at your finger ends, it is odd if you can't unravel the thousand and first” (from A Study in Scarlet, Chapter 2: The Science of Deduction).
The third and final way in which Holmes exhorts us to higher standards of studiousness extends naturally from design patterns: develop and utilize specialized knowledge. Besides his extensive knowledge of criminal history, Holmes has invested an enormous amount of time in building up very specific knowledge sets. As he likes to tell Watson, he is able to differentiate one type of cigar from another, based merely on naked eye observation of ash (A Study in Scarlet, Chapter IV: What John Rance Had To Tell). Further, he can tell regions of London apart based on the mud found in these areas (A Study in Scarlet, Chapter II: The Science of Deduction). Holmes, with his years of applied study, says there are three essential traits needed to be a successful crime solver: observation, deduction, and knowledge (from The Sign of the Four, Chapter I: The Science of Deduction). While he is often famous for the first two, it is the third, special knowledge, that completes this trifecta of greatness.
While it probably comes as little surprise that Holmes would endorse the development of specialized knowledge, the current moment in history resists this piece of advice. There is a belief that has permeated not just the field of software engineering but, in fact, much of the world of knowledge workers. It is the belief that I should keep nothing in my own head, but rather I should simply tap into the specialized knowledge of the global network that came about with the rise of the Internet. In other words, why really learn anything yourself, especially something difficult, since the answer is just a few keystrokes away?
I would argue, instead, that developing and keeping specialized knowledge within our own minds creates high-bandwidth communication highways between the various areas that make humans cognitively unique. In other words, when you have developed a library of specialized knowledge, you have immediate, intuitive, and deep access to knowledge building blocks when you are creating, problem solving, meditating, learning, observing, etc. Researcher Felienne Hermans, who has spent years studying the cognitive aspects of how we code, explains that the more intuitive reading code is for a programmer (reading code well is specialized knowledge), the more they can know as they continue to learn. That bears repeating: the more you know, the more you can learn (InfoQ, The Programmer’s Brain). Holmes reinforces this idea. He says, “I have a lot of special knowledge which I apply to the problem, and which facilitates matters wonderfully” (from A Study in Scarlet, Chapter 2: The Science of Deduction). The word “facilitates” describes the power of specialized knowledge wonderfully. When we possess those instantly accessible, highly tuned skills, we can facilitate our work at a higher level than we ever thought possible. On the other hand, moving to the Internet as our augmented memory bank is a responsibility shift that appeals to us, but which comes at the cost of being professional, effective problem solvers.
A study of Sherlock Holmes has been an enriching experience both personally and professionally. Revisiting the world of Holmes and Watson was a wonderfully nostalgic journey. Further, the insights I gleaned, some of which I have elaborated here, were powerful and applicable to my field. A focus on data has given me patience and taught me that truth can only be obtained by avoiding bias in my evaluation of problems. Thinking scientifically is bringing a discipline and efficiency to my work. Being a student of my field is increasing my contextual understanding and helping in the development of specialized knowledge. There are many other lessons to learn from Sherlock Holmes as well: developing powers of concentration, being able to detach from a problem, learning to reason, etc. In the future I hope to work through several additional ideas. Perhaps by that time I will have begun to master some of the other traits of the great detective.
Note: all references can be found at Sherlock-Holmes.es.