Tacit Knowledge and Underdetermination

Our perceptive equipment lags behind the conceptual skills that we as humans take for granted. As a consequence, linguistic explanations of concepts or feelings are represented by a limited medium. We often can’t provide a thorough linguistic representation of a concept or perceptual judgement to ourselves or to others. As a result, we frequently speak in metaphorical terms to describe feelings and concepts. If it is hard enough to express our own knowledge and thoughts linguistically, it is even harder to express the knowledge of others, especially in written documents.

Brian Marick has an article on Tacit Knowledge that underscores the underdetermination of project documents. The documents are an expression of the ideas and knowledge of experts, but in and of themselves do not provide the complete picture. I particularly liked the paragraph where he describes collaborating to gain tacit knowledge, and looking at a requirements document:

…as a tool for aligning the unexpressed wishes of the experts with the emerging abilities of the product

Discovering Interfaces Continued

James Bach responds:

Testing is an exercise in discovering a lot of things, including interfaces. But it is more than just discovery, it is inquiry. Inquiry is action; discovery is an event. Inquiry is within my control; not so with discovery. But certainly, good testing involves a lot of discovery and a lot of interfacing.

James raises a good point, and helps clarify the “Discovering Interfaces” test technique. Discovery would follow inquiry, and action would follow discovery. Through inquiry, testers can discover testable interfaces, and look at the feasibility of testing through them. The tester can collaborate with developers and other stakeholders to decide if the project warrants testing at the level of a particular interface, or at another layer in the application. If it looks feasible, testers can begin developing test cases against that testable interface.

This is a technique that I would encourage testers to look at when they are working on test automation strategies. Merely looking at the GUI (the most obvious interface to test against), only shows a thin layer of the application. There are other testable interfaces that in many cases may be better test automation candidates.

For more on test interfaces, check out Bret Pettichord’s article: Design for Testability, and Brian Marick’s article: Bypassing the GUI.

John Kordyback on Testing

Testing is an exercise of discovering interfaces.

If this interests you, think about the above statement, and share your conclusions. I’ll post my own thoughts shortly.

(edit – I’ve added a couple of my own thoughts)

From a development perspective, driving out interfaces and making the code more testable will improve the overall design when following test-driven development. The developers drive out the design with unit tests using a tool like JUnit which helps them create interfaces. But JUnit itself uses an interface to exercise areas of the code. Now things get interesting…

From a tester’s perspective, this concept came clear to me when exploring areas of the application behind the GUI layer to write automated tests against. For example, if a developer makes a particular part of the code testable, we can use some sort of tool to write tests and exercise areas of the code. In this case, we find an area we’d like to test, and the developers create an interface for us to test against. If we create fixtures with FIT, we are essentially creating another testable interface. When we use WTR Ruby scripts to drive the browser via the DOM, we are using the DOM interface to test the application. Screen-scraping automated functional testing tools create their own testable interface.

From this perspective, the difference between JUnit or HTTPUnit tests, FIT or WTR tests or some other testing tool is the type of interface they use.

An interesting side-effect of seeking testable interfaces in a product from the code level up: the more we learn about the interfaces, the more we explore the product and the more we learn about it. The more we learn, the more information we gather about the applications strengths and weaknesses. It kind of sounds like testing software doesn’t it?

We can take a cue from our test-driven developer colleagues and how they use interfaces to design better code, and seek to drive out interfaces to help us test the product more thoroughly. This is another view of testing which takes us away from a strict black-box perspective. Think of peeling back the GUI layer (with web applications it can be simple to peel back the browser and start looking at the HTTP layer and down into the code), and look for testable interfaces, or areas that could use them. You might be surprised at what you find, and a new world of testing may be waiting to be discovered.

Test Automation as a Testing Tool

When we think of GUI-level test automation, we usually think of taking some sort of test case, designing a script with a language like Perl, Ruby or a vendor testing tool, and developing the test case programatically from beginning to end. The prevailing view is often that there is some sort of manual test case that we need to repeat, so we automate it in its entirety with a tool. This is a shallow view of test automation as James Bach, Bret Pettichord, Cem Kaner, Brian Marick and others have pointed out. I prefer the term “Computer Assisted Testing” (I believe coined by Cem Kaner) over “test automation” for this reason.

While developing automated tests, an interesting side effect came out in my development efforts. While I was debugging script code, I would run a portion of a test case many times. This would be a sequence of events; not an entire test case. I started to notice behavior changes from build to build when I watched a series of steps play back on my screen. When I would investigate, I found bugs that I may not have discovered doing pure manual testing, or running unattended automated tests. I started keeping snippets of scripts around to aid in exploratory testing activities, and found them very useful as another testing tool.

There are some benefits to automating a sequence of steps and blending this type of testing with manual testing. For example, if a test case has a large number of steps required to get to the area we want to focus testing on, using a script to automate that process helps us get there faster, frees us from distractions and helps us focus on the feature or unit under test. Humans get tired repeating tasks over and over and are prone to error. If precision is needed, computers can be programmed to help us. We can also test the software in a different way using a tool. We can easily control and vary test inputs, and measure what was different from previous test runs when bugs are discovered.

There are certain tasks that a human can do much better than a computer when it comes to testing. As James Bach says, testing is an interactive, cognitive process. Human reasoning and inference simply cannot be programmed into a test script, so the “test automation” notion will not replace a tester. Blending the investigative skills and natural curiousity of a tester with a tool that helps them discover bugs is a great way to focus some test automation efforts.

If you think of your automation efforts as “Computer Assisted Testing”, many more testing possibilities will come to mind. You just might harness technology in a more effective way, and it will show in your testing efforts.

I currently use Ruby as my Computer Assisted Testing tool, with the WTR Controller as my web application testing tool of choice. (cf. A Testing Win Using Ruby)

Iterative Testing (intro)

Edit April 11/2006 – This is a topic I’ll be talking about more in the near future. In the mean time, check out this test debt post which describes an iterative testing challenge.

A few people have asked me recently about how I test on Agile projects, or on iterative projects in general. I’ll post some of my experiences here with the disclaimer that these practices are very much a work-in-progress.

To begin this discussion, I’ll point readers back to this post which is about describing testing activities to business stakeholders. Testing activities are broken up into Business Facing and Technology Facing activities (terms coined by Brian Marick) not just over a release, but through each iteration. These two sets of activities not only guide test exection, but test planning as well. Furthermore, each area of testing activity has collaborative components.

In the coming days, I’ll share some of the techniques that I have been working on.

James Bach on Test Automation

James Bach has an excellent post on his blog about test automation with developer and tester collaboration. Be sure to check out his presentation on Agile Test Automation. It is well worth the read. A collaborative approach in test development is important if the tests are to be useful to the entire team. Tests should not just be useful to the testing team, or specialists who know how to use a proprietary testing tool.

Borrowing from Bret Pettichord’s article “Testers and Developers Think Differently“, pairing good developers who are effective problem solvers and software creators with testers who are effective problem presenters and test idea generators can be a powerful combination. In my own experience working with developers in this way, solutions that the testers need can be quickly developed to meet the unique needs of a project or testing department.

Javan Gargus on Underdetermination

Javan Gargus writes:

I was a bit taken aback by your assertion that the testing team may not have done anything wrong by missing a large defect that was found by a customer. Then, I actually thought about it for a bit. I think I was falling into the trap of considering Testing and Quality Assurance to be the same thing (that is a tricky mindset to avoid!! New testers should have to recite “Testing is not QA” every morning. ). Obviously, the testers are no more culpable than the developers (after all, they wrote the code, so blaming the testers is just passing the buck). But similarly, it isn’t fair to blame the developers either (or even the developer who wrote the module), simply because trying to find blame itself is wrongheaded. It was a failure of the whole team. It could be the result of an architecture problem that wasn’t found, or something that passed a code review, after all.

Clearly, there is still something to learn from this situation – there may be a whole category of defect that you aren’t testing for, as you mention. However, this review of process should be performed by the entire team, not just the testing team, since everyone missed it.

Javan raises some good points here, and I think his initial reaction is a common one. The key to me is that the people should be blamed last – the first thing to evaluate is the process. I think Javan is right on the money when he says that reviews should be performed by the entire team. After all, as Deming said, quality is everyone’s responsibility. What the development team (testers, developers and other stakeholders) should strive to do is to become what I’ve read James Bach call a “self critical community”. This is what has served the Open Source world so well over the years. The people are self critical in a constructive sense, and the process they follow flows from how they interact and create working software.

Underdetermination

How many testing projects have you been on that seemed to be successful only to have a high impact bug be discovered by a customer once the software is in production? Where did the testing team go wrong? I would argue that the testing team didn’t necessarily do anything wrong.

First of all, a good tester knows (as Cem Kaner points out) that it is impossible to completely test a program. Another reason we get surprised is due to underdetermination. The knowledge about the entire system is gathered by testers throughout the life of the project. It is not complete when the requirements are written, and it probably isn’t complete when the project ships. The knowledge can be difficult to obtain, and is based on many aspects not limited to: access to subject experts, the skills of the testers involved and their ability to extract the right information on an ongoing basis. Realizing that you are dealing with a situation where you probably do not have all the information is key. This helps guide your activities and helps you keep an open mind about what you might be missing.

Underdetermination is usually used to describe how scientific theories go far beyond empirical evidence (what we can physically observe and measure), yet they are surprisingly accurate. One example of underdetermination is described by Noam Chomsky. He states that the examples of language that a child has in their environment underdetermines the actual language that they learn to speak. Languages have rule sets and many subtleties that are not accurately represented by the common usage which the child learns from.

Testers regularly face problems of underdetermination. The test plan document underdetermines the actual testing strategies and techniques that will be employed. The testers knowledge of the system underdetermines what the actual system looks like. Often, key facts about the system come in very late in the testing process which can send testing efforts into a tailspin.

Just knowing that the testing activities on a project underdetermine what could possibly be tested is a good start. Test coverage metrics are at best a very blunt measurement. Slavishly sticking to these kinds of numbers, or signing off that testing is only complete when a certain percentage of coverage is complete can be misleading at best, and at worst dangerous.

If testing efforts do fail to catch high impact bugs, there are a couple of things to remember:

  1. It wasn’t necessarily a failure – it is impossible to test everything
  2. Testing knowledge at any given point in a project is underdetermined by what could be tested

If this happens to your testing team, instead of just incorporating this problem as a test case to ensure this bug doesn’t occur again, evaluate *why* it happened. What information were the testers missing, and why were they missing it? How could they have got this information when testing? The chances of this particular bug cropping up again is pretty slim, the chances of one like it popping up in another area of the program are probably much greater than one might initially think.

Instead of evaluating solely on coverage percentages on a project, be self critical of your testing techniques. Realize that the coverage percentages do not really give you much information. They tell you nothing about the tests you haven’t thought to run – and those tests could be significant in number as well as in potential impact. Evaluate what and how you are testing throughout the project, and periodically call in experts from other parts of the system to help you evaluate what you are doing. Think of what you could be missing and realize that you can do a very good job, even without all the information.

The scientific community does quite well even though they frequently only work with a small part of the whole picture. Testers should be able to as well. One interesting side note is that many significant discoveries come about by accident. Use these “testing accidents” to learn more about the system you are testing, the processes you are using, and more importantly what they tell you about *you*, your testing and what you can learn from it.

TDD Pairing – What I Missed

Since this is a training exercise, and I am working with a senior developer who is a good teacher, we spent some time reviewing our first pairing session. To teach me while we paired, the developer had created a situation to see if I could spot a problem. I of course missed it. During our TDD session, my primary focus was on thinking of testing ideas. The developer however was simultaneously thinking of making the code testable, improving the software design, and continuously improving the testability of the code. These three activities he says are the hallmarks of a good design.

Leading me down the garden path in the hopes of teaching me something, he deliberately developed a bad code smell and tried to guide me into seeing it. I was so focussed on generating testing ideas that I missed it. He deliberately made the unit tests awkward and difficult to implement. I trusted his design and took that for granted as a technical issue that I didn’t understand. I didn’t realize that the fact that the tests were onerous and difficult to set up and code was a bad test smell.

The lesson that I learned is that if we can add tests simply, it’s a sign of a good code design. Since the tests were awkward and I was completely dependent on the developer to add them, I needed to be concerned. As a tester, part of my job when pairing in a test-driven development situation is to watch for bad testing smells. Those bad smells in the tests are symptoms that something is wrong with the code. The developer pointed out that when it’s hard to test, it’s time to improve the code. When testability is improved, a byproduct is a better design.

At the end of the day, I was thinking about more test ideas and felt we needed to add much more to the existing design. The developer however realized we were in trouble and needed to refactor the existing code to make it more testable. Lesson learned – I need to watch that we can add tests easily. That is a sign of a good design. It doesn’t take a lot of programming skill to realize that some unit tests are awkward while others are simple and elegant. I’m also fairly confident that testers who don’t program would be able to learn to see the difference quite quickly after spending time with a developer who can demonstrate good and poor unit tests.

Thoughts on product development, management, design, mobile and other topics.