Exploratory Testing

Exploratory Testing is an effective way of thinking about software testing. Skilled testers can use it to devise techniques that find important bugs quickly, provide rapid feedback to the rest of the team, and add diversity to their testing activities.

To learn more about Exploratory Testing, check out James Bach’s site for articles. Cem Kaner has also written about it. Either of those sources can explain it better than I can.

I sometimes get confused looks from some practitioners when I tell them that I’ve found Exploratory Testing to be effective for my own testing. Some of the confusion may come from not knowing exactly what Exploratory Testing is.

I’ve worked as a testing lead on projects and have directed others to do Exploratory Testing to complement automated tests, and have sometimes met with resistance. Those who resisted later told me that they weren’t used to finding so many defects when they tested, but were still uncomfortable doing unscripted testing even though it seemed to be more effective. I repeat what James Bach says, that “…testing is an interactive cognitive activity,” and tell them that I value them for their brains and expertise, that they are smart testers who can add a lot of value, and that I am pleased with the results. It’s rewarding to watch their confidence grow.

Some of the confusion may come from working in an unscripted environment when one is used to following a script. When confidence begins to displace confusion, testers often get more creative, they seem to improvise more when testing, and they seem to want to collaborate and communicate more with the rest of the team. When they are finding important issues quickly, sharing them with the developers and getting rapid positive feedback on their own work, it seems to help build team cohesion as well as individual confidence.

Here is what I think is another source of the confusion. Some people seem to think that you can only do Exploratory Testing on software you haven’t used before. At least that’s the impression I get. If I say I will do Exploratory Testing on each build after the automated test suite runs, some people act puzzled that I would be doing Exploratory Testing on a program I am already familiar with.

As a software tester, I can still explore, inquire and discover when testing software I’m familiar with. If we think about it, that’s often how scientific research works. Scientists deal with the familiar and look for patterns or occurrences that don’t seem to fit known models. Reviewing the familiar to verify that the models still hold true is common. Discovery can readily come from exploring something we are familiar with. A known behavior might change under certain conditions that we haven’t seen yet. We may try a new experiment (test) in a different way that yields results that we haven’t seen before.

Exploratory Testing doesn’t always mean acting as a discoverer working through a program for the first time. Maybe we are thinking of an explorer analogy like Lewis and Clark when we should be thinking of a scientific method analogy. Sometimes Exploratory Testing can be like a voyage of discovery charting the unknown, but it is very often like a scientific experiment where we change variables based on observed behavior.

The pursuit of knowledge and creative problem solving are facilitated with Exploratory Testing. Scripted testing or Directed Observation is a common “best practice” in software testing. How often do we miss problems because we direct what we want to see in a program and miss the obvious? Exploratory Testing is one way to help test with diversity and look at the familiar in new ways.

Tests as Documentation Workshop Notes

At this year’s XP Agile Universe conference, Brian Marick and I co-hosted a workshop on Tests as Documentation. The underlying theme was: How are tests like documentation, and how can we use tests as project documentation? Can we lean on tests to help minimize wasteful documentation?

Since the most up to date information about a product is in the source code itself, how do we translate that into project documentation? In the absence of a tool to traverse the code, translate it and generate documentation, are tests a good place to look? Can we just take the tests we have and use them as documentation, or do we need to design tests a specific way?

We solicited tests from workshop participants, and had some sample tests developed in JUnit with the corresponding Java code, tests developed with test::unit and the corresponding Ruby code, and some FIT tests.

Brian organized the workshop in a format similar to Patterns or Writers’ workshops. This was done to facilitate interaction and to generate many ideas in a constructive way. Groups divided up to look at the tests, and to try to answer the questions from the workshop description. Once the pairs and groups had worked through these questions, they shared their own questions with the group. Here is a summary of some of the questions that were raised:

  1. Should a test be written so that it is understood by a competent practitioner? (Much like skills required to read a requirements document.)
  2. How should customer test documentation differ from unit test documentation?
  3. With regard to programmer tests: Is it a failure if a reader needs to look at the source code in order to understand the tests?
  4. What is a good test suite size?
  5. How do we write tests with an audience in mind?
  6. What is it that tests document?
  7. How should you order tests?
  8. Should project teams establish a common format for the documentation regardless of test type?
  9. How do we document exception cases?

Some of these questions were taken by groups, but not all of them. I encourage anyone who is interested to look at examples that might answer some of them and share them with the community. While discussion within groups and with the room as a whole didn’t provide a lot in the way of answers to these questions, the questions themselves are helpful for the community to think about tests as documentation.

Of the ideas shared, there were some clear standouts for me. These involve considering the reader – something I must admit I haven’t spent enough time thinking of.

The Audience

Brian pointed out an important consideration. When writing any kind of documentation, whether it is an article, a book or project documentation, one needs to write with an audience in mind. When reviewing tests (and as a test writer myself), I notice that I don’t always write tests with an audience in mind. Often I’m thinking more about the test design than about the audience who might be reading the tests. This is an important distinction that we need to think about when writing tests if we want them to be used as documentation. Can we write tests with an audience in mind and still have them be effective tests? Will writing tests with an audience in mind help us write better tests? If we don’t write with an audience in mind, the tests won’t work very well as documentation.

What are We Trying to Say?

Another standout for me was the question: what is it that tests document? We were fortunate to have example tests for people to review. The FIT tests seemed to be easier for non-developers to read, while the developers jumped into the Ruby test::unit and JUnit tests immediately. Some testers who weren’t programmers paired with developers who explained how to read the tests and what the tests were doing. I enjoyed seeing this kind of collaboration, and it got me thinking. More on that later. The point is, if we are writing a document, we need to have something to say. I’m reminded of high school English classes and learning how to develop a good thesis statement, and my teachers telling us we need to find something to say.

Order is Important

Another important point that emerged about tests as documentation was the order of the tests. Thinking of tests as documentation means thinking of the order of the tests not unlike chapters in a book, or paragraphs in a paper. A logical order is important. Without it we can’t get our ideas across clearly to the reader. It is difficult to read something that has jumbled ideas and doesn’t have a consistent, orderly flow.

With regard to the audience, one group identified two different potential audiences among programmers: designers and maintainers. A designer will need a different set of tests than a maintainer. Furthermore, the order of the tests will differ depending on whether the reader is a maintainer or a designer. There are more audiences on the project than programmers, and these audiences may require a different order of tests.

Dealing with the “What is it that tests document?” question, one group felt that different kinds of tests document different things. For example, the unit tests the developers write will document the design requirements while the User Acceptance Tests will document the user requirements. The fact that some developers seemed more at home reading the unit tests, and some testers were more comfortable reading the FIT tests might give some credence to this. They are used to reading different project literature and might be more familiar with one mode over another.

Another important question was: “How do we define tests and explain what they are supposed to do?” If tests should also serve as project documentation and not just exercise the code or describe how to exercise the product in certain ways, the definition of a test will change according to the purpose it serves on a project.

Workshop Conclusions

I’m not sure we developed any firm conclusions from the workshop, though the group generated many excellent ideas. A workshop goal was to look at areas for further study, so we certainly met that. One idea that came up that I’ve been thinking about for a few months is to have meta descriptions in the automated tests that are more verbose. The tests would have program-describing details within the comments. A tool such as JavaDoc or RDoc could be used to generate project documentation from the specially tagged automated test comments. I like this idea, but the maintenance problem is still there. It’s easy for the comments to get out of date, and keeping them current requires duplication of effort.
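As a rough illustration of the meta description idea, here is a minimal sketch that uses Python’s unittest and docstrings in place of JUnit with JavaDoc or test::unit with RDoc. The `Account` class, the test names and the `describe` helper are all hypothetical, invented only for this example. The sketch collects the description attached to each test and prints it as a plain-text outline.

```python
import inspect
import unittest


class Account:
    """Hypothetical class under test, invented for this sketch."""

    def __init__(self, balance=0):
        self.balance = balance

    def withdraw(self, amount):
        if amount > self.balance:
            raise ValueError("insufficient funds")
        self.balance -= amount


class AccountTests(unittest.TestCase):
    """Withdrawing money from an account."""

    def test_withdraw_reduces_balance(self):
        """A withdrawal reduces the balance by the amount withdrawn."""
        account = Account(balance=100)
        account.withdraw(30)
        self.assertEqual(account.balance, 70)

    def test_overdraft_is_rejected(self):
        """A withdrawal larger than the balance is rejected."""
        account = Account(balance=10)
        with self.assertRaises(ValueError):
            account.withdraw(50)


def describe(test_case_class):
    """Build a plain-text outline from a test class and its test docstrings."""
    lines = [inspect.getdoc(test_case_class) or test_case_class.__name__]
    loader = unittest.TestLoader()
    # getTestCaseNames returns the test method names in sorted order.
    for name in loader.getTestCaseNames(test_case_class):
        doc = inspect.getdoc(getattr(test_case_class, name))
        lines.append("  - " + (doc or name + " (undocumented)"))
    return "\n".join(lines)


if __name__ == "__main__":
    print(describe(AccountTests))
```

The maintenance concern applies here too: nothing stops a docstring from drifting away from what its test actually checks, so the generated outline is only as trustworthy as the descriptions people keep up to date.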

Most important to me were the questions raised about the audience, and how to write tests with an audience in mind. It appears that the tests we seem to be writing to date may not necessarily be taken on their own and used as documentation like requirements documents. None of the tests that we looked at sufficiently explained the product. The readers either had to consult the developer or look at the source code. This wasn’t a weakness or shortcoming of the tests, but showed us that tests as documentation is an area that needs more thought and work.

A couple of other very interesting observations were made. One was by a developer who said that you can tell whether tests were generated by Test Driven Development (TDD) or not by reading them. Another idea was that having to consult the source code while reading tests, just to figure out what the program is doing, might be a testing smell. These observations, coupled with the tester/developer collaboration when reading tests, got me thinking in a different direction.

My Thoughts

At the end of the workshop, I found myself less interested in tests serving as documentation to replace requirements documents, project briefs or other project documents. Instead, I started thinking about reading tests as a kind of testing technique. I started to imagine a kind of “literary criticism” technique to use to test our tests. This is an area that is hard to deal with. How thorough is our test coverage? Are our tests good enough? Are we missing anything? How do we know if our tests are doing the job they could be? I see a lot of potential to test our tests by borrowing from literary criticism.

Brian spoke about writer’s workshops as a safe place for writers to have practitioners, peers and colleagues look over each other’s work before they are published. This kind of atmosphere helps writers do better work and is a safe environment to get good constructive criticism before they are published and potentially savaged by the masses if they miss something important. For a “testing the tests” technique, instead of an us-versus-them relationship to simply negatively criticize, we could have test writers’ workshops to critique each other’s tests. The point is to have a safe environment to make the tests (and thereby the product) as solid as they could be before they are open to be potentially “…savaged by the masses,” for example, customers finding problems or faults of omission.

Here are three areas I saw in the workshop that could potentially help in testing the tests:

  1. I saw testers and developers collaborating, and it occurred to me that explaining what you have written (or coded) is one of the best ways of self-critiquing. When explaining how something works to someone else, I find myself noticing holes in my logic. The other person may also spot holes in what has been written. That editor, or second set of eyes, really helps, as pair programming has demonstrated.
  2. I heard expert developers saying they could read *Unit tests and be able to tell immediately whether they were TDD tests or not. TDD tests are richer by nature, they told us, because they are more tightly coupled to the code. I thought that there is potential there for senior developers to read the tests to help critique constructively and find potential weak spots. One could have a developer outside of the pair that has been working read the tests as a type of test audit or editorial review.
  3. The emergence of a possible test smell: “If we have to look at the code to explain the program, are we missing a test?” prompted me to think of the potential for a catalog of test smells that reviewers could draw on. We look for bad “writing smells” using rules of grammar, spelling, etc. We could possibly develop something similar for using this style of review for our tests to complement the work that has already been done in the test automation area. This could involve reading the tests to find “grammatical” errors in the tests.

I still think there is a lot of potential to use tests as documentation, but it isn’t necessarily as simple as taking the tests we seem to be writing today and making them into project documentation in their original form. I encourage developers and testers to look at tests as documentation, and to think about how to use them to possibly replace wasteful documentation.

I learned a lot from the workshop, and it changed my thinking about tests as documentation. I’m personally thinking more about the “test the tests” idea than using tests as project documentation right now.

The Role of a Tester on Agile Projects

I have been dealing with this question for some time now: “What is the role of a tester on Agile projects?” and I’m beginning to wonder if I’m thinking about this in the right way. I’ve been exploring by doing, by thinking and talking to practitioners about whether dedicated testers have a place on Agile teams. However, most of the questions I get asked by practitioners and developers on Agile teams are about dealing with testing on an Agile project right now, not whether a tester should be put on the team. The testers are here, or the need for testers on the team has been prescribed, so they are looking for answers on how to deal with issues they are facing right now.

Has the ship already sailed on the question: “Is there room for dedicated testers on Agile projects?” Is it time to rephrase the question to: “What are roles that dedicated testers have added value with on Agile teams?” followed by “What are some good techniques to deal with the unique team conditions on Agile projects?”

I’m willing to accept that some methodologies may not be compatible with this notion. The question remains, what are testers doing on real-world Agile projects, and what methodologies don’t seem to be amenable to dedicated testers? Of those, are dedicated testers pressured out due to team development philosophy, or are dedicated testers simply not needed? Real-world experience is what we as a community need to keep sharing.

Have the “specialized testers” arrived already, and has the question of whether they should be brought on teams become academic, or are we answering the question by doing? The question will answer itself anyway as time goes on, and experience tends to trump theory alone.

I have been a bit reluctant to put a stake in the ground about the role of testers on Agile projects without more experience myself, but judging by the questions I am getting and the constructive criticism that I have received, I should probably share more of my own experiences. I think it’s time for testers on Agile projects to start talking about techniques and what roles they have filled on Agile teams. From that we can gather a set of values that describe the roles, techniques and mindsets of those who are testers on Agile projects. Answering the question by exploration and doing is much more exciting to me than an academic debate.

Visible Tests and Web Testing with Ruby

Announcing a new project and blogging my notes for the Scripting Web Tests Tutorial which I prepared for XP Agile Universe:

Visible Tests

Automated test results are often difficult to communicate to business stakeholders on a project. People frequently ask me about the visibility of tests on a project saying: “The customer doesn’t understand JUnit tests” or “The JUnit and FIT tests mean very little to the customer. The customer doesn’t usually think in terms of green bars, or table-driven tests like developers and technical testers do. They believe we are testing, but they can’t see the tests run in a way that they relate to. How do we raise the visibility of tests?”

The WTR IE Controller has an advantage in test visibility since the testable interface we use is about as close as we can get to interacting with a program the way an end-user would. Business stakeholders understand testing at the GUI layer because that’s the way they relate to the program. These tests can be played back to the business stakeholders so they can see the tests run in a way they relate to. Technical people on a project team relate to the program at various layers and sometimes we forget about the business problems we are solving. Techies look at the backend and the front-end, while business users usually only see and understand the front-end of the application. The preferred method of interaction with the program can differ between the groups.

Business stakeholders can watch the tests play back on a computer which provides rich, visual feedback. If the test fails, they see it fail. If it passes, they see it pass while exercising the application in a way that they would themselves. While they have faith in the ability of the technical members of the team, and will accept the testing numbers and their word, nothing replaces the assurance they get from seeing it work, and from manipulating the product themselves. With IE Controller tests, the customer can see if the tests pass or fail by watching how the application works in the Internet Explorer browser. Tests can also be designed to mimic business interaction, and provide results logging that non-technical project stakeholders can understand. The tests will demo the business solutions that the customer needs the application for in the first place. These kinds of automated tests help provide more assurance in what the technical members of the team are doing.

At an end-of-iteration meeting, the group can use these tests to give a quick demonstration of the application.

Announcing WATIR

The latest version of the Web Testing with Ruby project has begun under the WATIR project spearheaded by Paul Rogers and Bret Pettichord. I’m excited by the prospect of a more sophisticated Open Source web testing solution using Ruby. I’ve had success with the IE Controller tests I’ve written to date, and look forward to using a more complete solution.

Testing and Numbers

Michael Bolton said this about numbers on testing projects:

In my experience, the trusted people don’t earn their trust by presenting numbers; they earn trust by preventing and mitigating bad outcomes, and by creating and maintaining good ones.

I agree. Lately, I’ve been thinking about how we report numbers on testing projects. I was recently in a meeting of Software Quality Assurance professionals and a phrase kept coming up that bothered me: “What percent complete are you on your project?” I think they mean that they have exercised a certain percentage of test cases on the development project. I don’t feel like I can know what “percent complete” of test cases I am on a project, so I’m uncomfortable giving out a number such as “90% complete.” How can I know what 100% of all test cases on a project are? Cem Kaner’s paper, Impossibility of Complete Testing, shows us how vast the set of possible tests that can be run on a project is.
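To give a feel for the scale, here is a small back-of-the-envelope calculation. It is not taken from the paper. It assumes a hypothetical function that takes two independent 32-bit inputs, and an assumed (and generous) rate of one million tests per second.

```python
SECONDS_PER_YEAR = 60 * 60 * 24 * 365

# Two independent 32-bit inputs: every pairing is a distinct test input.
input_combinations = 2 ** 32 * 2 ** 32

# Assumed execution rate, chosen to be deliberately generous.
tests_per_second = 1_000_000

years = input_combinations / tests_per_second / SECONDS_PER_YEAR

print(f"{input_combinations:,} input combinations")
print(f"about {years:,.0f} years at {tests_per_second:,} tests per second")
```

Under those assumptions, trying every input pair would take more than half a million years, and that is before considering sequences of inputs, timing, or configuration.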

Every project I’ve been on that claimed to have 100% test coverage had a major bug discovered by a customer in the field that required a patch. At best, we had covered all but one of the possible test cases. The test case that would have exposed the customer’s bug was not in our set of test cases, so how could we claim that we had 100% completion? How many more are we missing?

Reporting numbers like this is dangerous in that it can create a false sense of security for testers and project stakeholders alike. If we as testers make measurement claims without looking at the complexity of measurement, we had better be prepared to lose credibility when bugs are found after we report a high “percent complete” number prior to shipping. Worse still, if testers feel that they are 90% complete of all tests on a project, the relentless pursuit of knowledge and test idea generation is easily replaced by apathy.

Cem Kaner points out many variables in measuring testing efforts in this paper: Measurement Issues and Software Testing. Accurate, meaningful measurement of testing activities is not a simple thing, so why the propensity for providing simple numbers?

I look at a testing project as a statistical problem. How many test cases could be in this project if I knew the bounds of the project? Since I don’t usually know the bounds of the entire project, it is difficult to do an accurate statistical analysis using formulas. Instead, I can estimate based on what I know about a project now, and use heuristics to help deal with the vast numbers of possible tests that would need to be covered to get a good percentage. As the project progresses, I learn more about it, and use risk-based techniques to try to mitigate the risk to the customer. I can’t know all the possible test cases at any given time. I may have a number at a particular point in the project, so of the test cases that I know of right now, we may have a percentage of completion. However, there may be a lot of important ones that neither I nor the testing team have thought of. That is why I don’t like to quote numbers of “percent complete” without providing a context, and even then I don’t present just numbers.
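A short sketch may make the point about context concrete. The class, its field names and every figure below are hypothetical, made up for illustration. The percentage is computed only against the test cases known at that moment, and the report line says so.

```python
from dataclasses import dataclass


@dataclass
class ProgressSnapshot:
    """Test progress at one point in a project (all figures hypothetical)."""

    known_cases: int      # test cases identified so far
    executed_cases: int   # of those, how many have been run

    def percent_of_known(self) -> float:
        """Percentage of the *known* cases that have been executed."""
        if self.known_cases == 0:
            return 0.0
        return 100.0 * self.executed_cases / self.known_cases

    def report(self) -> str:
        """A progress line that states its own context."""
        return (
            f"{self.executed_cases} of {self.known_cases} known test cases run "
            f"({self.percent_of_known():.0f}% of known cases; "
            f"the number of unknown cases is not measured)"
        )


# Two weeks later the team has run more tests, yet the percentage has dropped,
# because learning about the product uncovered more cases worth running.
week_4 = ProgressSnapshot(known_cases=200, executed_cases=180)
week_6 = ProgressSnapshot(known_cases=300, executed_cases=240)

print(week_4.report())
print(week_6.report())
```

In this made-up example the figure falls from 90% to 80% even though sixty more tests were run, which is why a bare percentage says little without the denominator and how it was arrived at.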

The Software Quality Assurance school of thought seems to be numbers obsessed these days. I am interested in accurate numbers, but I don’t think we have enough information on testing projects to be using many of the numbers we have conditioned project stakeholders to rely on. Numbers are only part of the picture – we need to realize that positive project outcomes are what are really important to project stakeholders.

Numbers without a context and careful analysis and thought can give project stakeholders a false sense of security. This brings me back to Michael Bolton’s thought: what is more important to a stakeholder? A number, or the opinion of a competent professional? In my experience, the latter outweighs the former. Trust is built by delivering on your word, and helping stakeholders realize the results they need. Numbers may be useful when helping provide information to project stakeholders, but we need to be careful how we use them.

Expressing Interfaces

Mark McSweeny comments on Discovering Interfaces:

While testers work to discover interfaces in programs, developers work to express interfaces. This can be through developing an API, an interface for unit testing, all the way down to the data structures and algorithms you use. Well designed interfaces are testable interfaces, so a tester can tell you whether the interfaces you have expressed are testable or not.

The meeting of the roles through the program is interesting to me. The two meet in the middle of creation and inquiry. Sometimes as testers we forget that there are people behind what we are testing who have poured a lot of effort into the interfaces we are testing. The more positive feedback testers can supply to the developers, the more confidence the developers can gain in their development efforts.

While testers may be satisfied when a program does not fail after they have attempted to make it break, a developer is satisfied when they have solved a problem using technology. Together, there is expression, inquiry, discovery, feedback, communication, and collaboration.

What is interesting with methods such as TDD is that they attempt to combine interface expression and discovery. One attempts to express an interface while also attempting to critique it.
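A minimal sketch of that combination, using Python’s unittest: the tests are written first and express the interface we wish we had, and writing them is also the first critique of it. The `Cart` class and its methods are hypothetical, invented for this example.

```python
import unittest


class CartTests(unittest.TestCase):
    """Written first: each test states how we wish the interface to read."""

    def test_new_cart_is_empty(self):
        self.assertEqual(Cart().total(), 0)

    def test_total_sums_price_times_quantity(self):
        cart = Cart()
        cart.add("pencil", price_cents=50, quantity=3)
        cart.add("notebook", price_cents=400)
        self.assertEqual(cart.total(), 550)


class Cart:
    """Written second: the simplest code that satisfies the tests above."""

    def __init__(self):
        self._lines = []

    def add(self, name, price_cents, quantity=1):
        self._lines.append((name, price_cents, quantity))

    def total(self):
        return sum(price * quantity for _, price, quantity in self._lines)


# Run the tests explicitly rather than through unittest.main().
suite = unittest.defaultTestLoader.loadTestsFromTestCase(CartTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

If calling `add` had felt awkward while writing the second test, that would have been feedback on the interface before any implementation existed.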

Go Flames Go (Edit)

One win away from the Stanley Cup.

Go Flames!!

edit — It was a heartbreaker last night, but Calgary and Canada are proud of the Flames’ amazing playoff run.

In related news, Tim Van Tongeren has a post on his blog about a publishing error where the wrong article ran in a Tampa newspaper. They accidentally published the version about the Lightning losing the cup. Sounds like they needed a second set of eyes in production last night to help with testing.

Simplicity and Reliability

From the National Post:

Systems crash inevitable report: Networks for banks, hospitals, power lines at risk within the next five years

Michael Friscolanti
National Post

June 8, 2004

Computer networks that support Canada’s critical services — from hospitals to banks to power lines — will undoubtedly crash in the next five years, warns a government-commissioned report that says even immediate action cannot stop the inevitable.

Shoddy software has left the Internet and other parts of our telecommunications systems vulnerable to a massive meltdown, the report concludes. No corrective action can avert “a major failure,” but the authors say both the government and the private sector must act quickly if they are to prevent subsequent collapses.

[….]

“With respect to software that has evolved to a high level of complexity, there may exist no single individual who grasps the entire program, let alone one who can keep track of all those who have contributed to its various components,” the report reads. [emphasis added]

I’m reminded of Ralph Johnson speaking on security last week. He noted that secure systems are usually known by one person; they have a simple enough design to be understood by one person (usually the designer). To be secure, a system must be reliable.

Software Design & Testability

I recently attended a talk by Ralph Johnson which I enjoyed.

One theme that Ralph talked about was simplicity. He was talking about security patterns, and how simplicity of design implies knowledge. If a design is knowable, it can be made secure. That led to a thought for me on testability. This is also true of a testable design; if a design is simple, it is knowable and therefore testable.

For example, as a tester one is often asked to look at requirements or user stories to determine if they are testable. If you as a tester can understand them, that’s a good indication that they are testable. If you can’t, how can you (or others) test? In my experience, when no one person understands or knows the project, it is difficult to test. If I need to go around to seven different people to understand the design of a project and none of them understands it in its entirety, chances are the project will be equally difficult to test. Legacy systems suffer from this, but poorly designed new software projects are also difficult to test.

As a tester, if you don’t understand the design, don’t automatically think that you lack the necessary technical skills to understand. The architectural designs should be understandable to any stakeholder involved with the project (once the convention is explained to them). If the design isn’t understandable at a high level by the entire team, this is probably a bad design smell.

Ralph also mentioned that tools such as UML or DFDs serve to express program design to other people. (The opposing view is that these tools are used to generate code.) I agree with Ralph that these tools express design to be communicated to other people. If the architectural diagrams are easily understood, they will not only be testable, but more easily developed and communicated to the business. Tool generated code is extremely difficult to know and to test. People design software, people develop software, people test software, and people use software. The tools are there to help express each of those roles.
