Monday, August 26, 2013

Classifying Knowledge for Engineering and Product Development

When designing and building a software product there are many things that individuals or teams know, and there are also things that they don't know. This is pretty obvious. However, this simple distinction misses some of the inherent complexity of what we can or cannot know. Donald Rumsfeld described this larger epistemological concept, which can be quite complex, in a pretty straightforward way:
There are known knowns; there are things we know that we know. There are known unknowns; that is to say, there are things that we now know we don't know. But there are also unknown unknowns – there are things we do not know we don't know.
These three ways of classifying knowledge can be very helpful during product development for both engineering and product teams. Not only do they help to communicate the various types of things that we may or may not know, they also highlight that there are things we don't even know that we don't know, i.e. concepts, tools, processes and best practices that might already exist but that we have never heard of and therefore don't know to look for (which I will explain in more detail below).

1. Known Knowns

"Known knowns" are those things that each of us has previous experience with, or those things that are held as best practices within the industry and that we already know about. Here are some for engineering and product teams:
  • Engineering: Computer languages that are closer to the bare metal of a machine (e.g. C/C++) are generally faster than languages that run on a virtual machine (e.g. Java, .NET, Python, etc.) given the same algorithm and data.
  • Product: When designing conversion pages/screens, each additional step (button click, user input, screen swipe, etc.) between the user's entry point and the objective you want them to accomplish results in some trivial or significant amount of drop-off. This can be easily seen in a conversion funnel by measuring the percentage drop-off between each successive step.
In the engineering case you usually don't need to write the exact same algorithm in multiple languages and measure the execution time; you can be confident that, as a general rule of thumb, a C/C++ implementation will be slightly (or maybe even significantly) faster than a virtual machine based language running the same algorithm. Testing is the only way to be sure this assumption holds, and the results can sometimes show the opposite, but engineers make so many performance and functional assumptions when writing code that validating each and every one is not realistic.

In the product case you can be almost certain that the removal of a single non-essential step will increase the conversion rate by some amount (exactly how much is something that would have to be measured). For example, many conversion flows for social networks include the option to import contacts in order to find your friends, colleagues and acquaintances so that you can connect with or follow them (Facebook, Linkedin and Twitter do this). But in other products it's less about satisfying the core functionality of the product in terms of connections and more about it being a virality growth hack to acquire more users. Importing contacts in the first case is obviously vital to engage and retain users in a social network, but in the second case the step could be removed at the cost of acquiring fewer users (i.e. no virality). Removal of this step would almost certainly increase the overall conversion rate (due to the step no longer causing drop-off), but usually the number of extra users gained through the contact import outweighs the number of users who drop off at the contact import step. Therefore it's usually desirable to leave the contact import step in the conversion flow even though it decreases the overall conversion rate by a slight amount.
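To make the funnel arithmetic concrete, here is a minimal Python sketch. All of the step names, user counts and the `invited_signups_per_user` rate are made-up numbers for illustration, and the virality model (each new user brings in a fixed fraction of additional users, counted for one round only) is a deliberate simplification rather than how any real product measures it.

```python
def step_dropoffs(funnel):
    """Return (step_name, fraction lost since the previous step) for each step after the first."""
    results = []
    for (_, prev_count), (name, count) in zip(funnel, funnel[1:]):
        results.append((name, 1 - count / prev_count))
    return results


def total_signups(visitors, completion_rate, invited_signups_per_user=0.0):
    """Direct signups plus one round of signups invited by those users (assumed model)."""
    direct = visitors * completion_rate
    return direct + direct * invited_signups_per_user


# Hypothetical number of users reaching each step of a signup flow.
funnel = [
    ("landing", 1000),
    ("create_account", 600),
    ("import_contacts", 480),
    ("finished", 432),
]

for name, lost in step_dropoffs(funnel):
    print(f"{name}: {lost:.0%} drop-off")

# Hypothetical comparison: keeping the import step lowers the completion rate
# (43.2% vs. an assumed 50% without it) but each new user invites 0.3 more.
with_import = total_signups(1000, 0.432, invited_signups_per_user=0.3)
without_import = total_signups(1000, 0.50)
print(f"with import: {with_import:.0f}, without import: {without_import:.0f}")
```

With these invented numbers the flow that keeps the import step ends up with more users overall despite its lower conversion rate, which is the trade-off described above. Different numbers can easily flip the result, which is why it has to be measured.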

As a final note, the better you are and the more experience you have, the more "known knowns" you accumulate. Because of this you work more efficiently, since you have access to a wider array of expertise than you otherwise would have had. According to the "10,000-Hour Rule" that Malcolm Gladwell writes about in his book, Outliers, it may take 5 years or more (40 hours/week over 5 years is roughly 10,000 hours) to accumulate enough knowledge to become an expert in any given field.

2. Known Unknowns

"Known unknowns" are those things that we may have some theory for or gut feeling about but don't actually know what the correct approach or answer might be. It might even be that you know that something exists but don't know much else about it. Essentially it's being aware of your own ignorance about something. Many good senior engineers and business analysts face these types of issues on a day-to-day basis and are pretty good at asking questions and listening to others, performing research to uncover the right approach/answer or, if neither is available, experimenting or prototyping until it becomes more evident what the right approach/answer might be. Once this is done "known unknowns" turn into "known knowns", which is obviously a good thing.

For many things "known unknowns" are similar across engineering and product teams. A software engineer or business analyst might know that something exists which could help improve the product (say a new Python library or integration with another product) but not really know much about it. All they would have to do is begin researching it and given sufficient time they will figure out what they need to.

There is of course a specific area where these teams differ. The worlds of engineering and product start to diverge for "known unknowns" when predictability is accounted for. Software systems are inherently predictable: they generate the same output given the same input, and they're built this way to manage complexity. There are things that seem random, but they are usually the result of unexpected user input, actual random number/string generators and/or partial system failures that cause intermittent behaviour. The advantage of predictable systems is that solutions can be re-used across domains. If something has been used once before, chances are it can be used again in a different context, with the benefit of a lower learning curve. There is also the benefit of a tangible "done criteria" for software systems. If a module within an application is supposed to parse a CSV file and insert each row into a database, then it's pretty easy to determine when all the work is done, either by observing the results or by writing some unit/integration tests to verify things. The consequence of this predictability can be seen in all the open-source libraries and tools that are built by one set of engineers and then used by thousands of other engineers for their own applications in a different context.
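The CSV example can be sketched in a few lines of Python to show what a tangible "done criteria" looks like. The `people` table, its columns, the `load_csv` function and the sample data are all hypothetical; an in-memory SQLite database stands in for whatever database a real application would use.

```python
import csv
import io
import sqlite3


def load_csv(csv_text, conn):
    """Insert every data row of csv_text into the people table; return the number of rows inserted."""
    conn.execute("CREATE TABLE IF NOT EXISTS people (name TEXT, email TEXT)")
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [(row["name"], row["email"]) for row in reader]
    conn.executemany("INSERT INTO people (name, email) VALUES (?, ?)", rows)
    conn.commit()
    return len(rows)


def test_load_csv():
    """The work is 'done' when every row of the file can be read back from the database."""
    sample = "name,email\nAda,ada@example.com\nGrace,grace@example.com\n"
    conn = sqlite3.connect(":memory:")
    assert load_csv(sample, conn) == 2
    stored = conn.execute("SELECT name, email FROM people ORDER BY name").fetchall()
    assert stored == [("Ada", "ada@example.com"), ("Grace", "grace@example.com")]


test_load_csv()
```

The same input always produces the same rows, so the test either passes or it doesn't. There is no equivalent assertion a product team can write about how users will react to a new feature.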

For product teams the inherent challenge is that users behave differently and are very hard to predict. Users behave differently across different products, but even more unsettling is that their behaviour changes over time on the same product. Essentially users are unpredictable, and things that have worked in the past may not work in the future. It's analogous to the Law of Shitty Clickthroughs: what works today may not always work tomorrow as users learn and respond to your own product and all the other products that they use. On top of this, it's never as easy as simply mimicking/copying an existing product's features and expecting the same results. Engineering teams can use the same open-source library and be pretty confident that the results will be the same. However, product teams who decide that a given feature might be worth mimicking/copying for their own product are really gambling on the idea that their users are the same type of users as the other product's, and this is rarely the case. For example: if you're building some type of social network you're most likely observing what Twitter, Facebook, Linkedin, Pinterest, etc. are doing, but simply giving your UI a Pinterest type feel or adding hashtags may not produce the results those products had, due to the fundamental difference that their users are not the same as yours.

Having said all of this, engineering teams are often able to turn "known unknowns" into "known knowns" more easily than product teams by researching and spec'ing out the design. Even before anything is built it's usually apparent that the given solution will work with enough engineering time (scalability aside). For product teams it's a constant challenge to start with "known unknowns" and turn them into "known knowns", even after thoroughly researching possible options or coming up with theories of how users might behave. Until the intended product feature addition or change is fully implemented (or, if you're lucky, maybe just some small component of it that users interact with), it's almost impossible to know how users will behave.

Conclusion: For product teams "known unknowns" usually stay that way until you've actually shipped your product and seen how users are using it. It's only at this stage that "known unknowns" become "known knowns".

3. Unknown Unknowns

This is the scary stuff. "Unknown unknowns" are the things you don't even know that you don't know about (actually they're not that scary, because you're completely ignorant of them). They're not on your radar and you don't even know they exist. The dangerous part with "unknown unknowns" is that even with enough time or energy they won't magically turn themselves into "known unknowns". In the engineering case this is like not even knowing that an entire field of academic study has worked on and solved a particular problem with an elegant algorithm that, if you just knew about it, you could use in your application and save you and your team hundreds of hours of working on your own solution. In the product case it's like being completely ignorant that another business somewhere in the world has a competing product that is vastly better than yours, even though you and your team might have done some extensive market research.

So what do you do with "unknown unknowns"? Well here is the only thing Mike Gagnon thinks you can do: 
The best you can do with “unknown unknowns” is be aware that the category exists and maintain an open mind. This way when information presents itself to you, you can cognizantly realize that it was an “unknown unknown” and then you’ll either be in the “known knowns” or “known unknowns” category.
This is why being a prolific reader is so important. It feeds your mind with a ton of new information, ideas, concepts and solutions that you had no idea existed previously. Everything you read, are told about or experience first hand has the opportunity to reveal "unknown unknowns", which at that very moment stop being "unknown unknowns".

My grade twelve English teacher once told our class that we should not spend the rest of our lives reading books that agree with our current belief system, but rather that we should deliberately read ones that disagree with it so that we grow and change for the rest of our lives. I don't think I really had any idea what he was talking about back then, but somehow more than a decade later I still remember those words and I now understand what he meant.

4. Unknown Knowns

This one wasn't mentioned by Donald Rumsfeld but it's worth mentioning briefly. Scholars of logic will have noticed that there are two words with two possibilities each, which translates into 4 possible combinations. Mr. Rumsfeld mentioned three of them and didn't talk about the fourth, but it's worth asking whether or not "unknown knowns" are even possible.

It turns out that the well-known psychoanalytic philosopher Slavoj Žižek extrapolated a fourth category, which he naturally called "unknown knowns". He said that "unknown knowns" are those things that we intentionally refuse to acknowledge that we know. Žižek even wrote an essay on the matter, but it speaks more about politics than the underlying concept of unknowingly knowing something. Essentially it's those things that we actually do know but either vehemently deny or even go so far as to suppress.

The engineering and product teams that I've managed have been highly collaborative and open to exploring new ideas even if that meant redoing and/or throwing out inferior work. So when one person found a solution that could significantly change or alter currently held assumptions, it was always encouraged to bring that to the team. In light of this there was never much incentive to suppress knowledge about something across the team in order to avoid telling the truth even if it hurt to tell it.

There could of course be a case where any one individual would have some incentive to suppress the knowledge they had about something and therefore harbour an unknown known. This is obviously a reality but building the right culture with the right people helps immensely in avoiding this behaviour outright or at the very least discourages it.
