What Is a Good Test Case?

Abstract

Designing good test cases is a complex art. The complexity comes from three sources:
• Test cases help us discover information. Different types of tests are more effective for different classes of information.

• Test cases can be “good” in a variety of ways. No test case will be good in all of them.

• People tend to create test cases according to certain testing styles, such as domain testing or risk-based testing. Good domain tests are different from good risk-based tests.

What’s a Test Case?

Let’s start with the basics. What’s a test case?

IEEE Standard 610 (1990) defines test case as follows:
“(1) A set of test inputs, execution conditions, and expected results developed for a particular objective, such as to exercise a particular program path or to verify compliance with a specific requirement.

“(2) (IEEE Std 829-1983) Documentation specifying inputs, predicted results, and a set of execution conditions for a test item.”

According to Ron Patton (2001, p. 65),
“Test cases are the specific inputs that you’ll try and the procedures that you’ll follow when you test the software.”

Boris Beizer (1995, p. 3) defines a test as “A sequence of one or more subtests executed as a sequence because the outcome and/or final state of one subtest is the input and/or initial state of the next. The word ‘test’ is used to include subtests, tests proper, and test suites.

“A test case specifies the pretest state of the IUT and its environment, the test inputs or conditions, and the expected result. The expected result specifies what the IUT should produce from the test inputs. This specification includes messages generated by the IUT, exceptions, returned values, and resultant state of the IUT and its environment. Test cases may also specify initial and resulting conditions for other objects that constitute the IUT and its environment.”

In practice, many things are referred to as test cases even though they are far from being fully documented.

Brian Marick uses a related term to describe the lightly documented test case, the test idea:

“A test idea is a brief statement of something that should be tested. For example, if you're testing a square root function, one idea for a test would be ‘test a number less than zero’. The idea is to check if the code handles an error case.”
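
For example, here is a minimal sketch of that test idea in C (my_sqrt, its error convention, and the -1.0 sentinel are assumptions for illustration, not Marick’s code):

#include <assert.h>

double my_sqrt(double x);   /* hypothetical function under test;
                               assume it returns -1.0 for invalid input */

void test_sqrt_negative_input(void)
{
    /* Test idea: "test a number less than zero" */
    assert(my_sqrt(-4.0) == -1.0);
}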

In my view, a test case is a question that you ask of the program. The point of running the test is to gain information, for example whether the program will pass or fail the test.

It may or may not be specified in great procedural detail, as long as it is clear what the idea of the test is and how to apply that idea to some specific aspect (a feature, for example) of the product. If documentation is an essential aspect of a test case in your vocabulary, please substitute the term “test idea” for “test case” in everything that follows.

An important implication of defining a test case as a question is that a test case must be reasonably capable of revealing information.

• Under this definition, the scope of test cases changes as the program gets more stable. Early in testing, when anything in the program can be broken, trying the largest “legal” value in a numeric input field is a sensible test. But weeks later, after the program has passed this test several times over several builds, a standalone test of this one field is no longer a test case because there is only a minuscule probability of failure. A more appropriate test case at this point might combine boundaries of ten different variables at the same time or place the boundary in the context of a long-sequence test or a scenario.

• Also, under this definition, metrics that report the number of test cases are meaningless. What do you do with a set of 20 single-variable tests that were interesting a few weeks ago but now should be retired or merged into a combination? Suppose you create a combination test that includes the 20 tests. Should the metric report this as one test, 20 tests, or 21? What about the tests that you run only once? What about the tests that you design and implement but never run because the program design changes in ways that make them uninteresting?

Another implication of the definition is that a test is not necessarily designed to expose a defect. The goal is information. Very often, the information sought involves defects, but not always. (I owe this insight to Marick, 1997.) To assess the value of a test, we should ask how well it provides the information we’re looking for. Here are some examples of information objectives that drive testing:

• Find defects. This is the classic objective of testing. A test is run in order to trigger failures that expose defects. Generally, we look for defects in all interesting parts of the product.

• Maximize bug count. The distinction between this and “find defects” is that total number of bugs is more important than coverage. We might focus narrowly, on only a few high-risk features, if this is the way to find the most bugs in the time available.

• Block premature product releases. This tester stops premature shipment by finding bugs so serious that no one would ship the product until they are fixed. For every release-decision meeting, the tester’s goal is to have new showstopper bugs.

• Help managers make ship / no-ship decisions. Managers are typically concerned with risk in the field. They want to know about coverage (maybe not the simplistic code coverage statistics, but some indicators of how much of the product has been addressed and how much is left), and how important the known problems are. Problems that appear significant on paper but will not lead to customer dissatisfaction are probably not relevant to the ship decision.

• Minimize technical support costs. Working in conjunction with a technical support or help desk group, the test team identifies the issues that lead to calls for support. These are often peripherally related to the product under test--for example, getting the product to work with a specific printer or to import data successfully from a third party database might prevent more calls than a low-frequency, data-corrupting crash.

• Assess conformance to specification. Any claim made in the specification is checked. Program characteristics not addressed in the specification are not (as part of this objective) checked.

• Conform to regulations. If a regulation specifies a certain type of coverage (such as at least one test for every claim made about the product), the test group creates the appropriate tests. If the regulation specifies a style for the specifications or other documentation, the test group probably checks the style. In general, the test group focuses on anything covered by regulation and (in the context of this objective) nothing that is not covered by regulation.

• Minimize safety-related lawsuit risk. Any error that could lead to an accident or injury is of primary interest. Errors that lead to loss of time, loss of data, or data corruption, but that don’t carry a risk of injury or damage to physical things, are out of scope.

• Find safe scenarios for use of the product (find ways to get it to work, in spite of the bugs). Sometimes, all that you’re looking for is one way to do a task that will consistently work--one set of instructions that someone else can follow and that will reliably deliver the benefit it is supposed to provide. In this case, the tester is not looking for bugs. He is trying out, empirically refining, and documenting a way to do a task.

• Assess quality. This is a tricky objective because quality is multi-dimensional. The nature of quality depends on the nature of the product. For example, a computer game that is rock solid but not entertaining is a lousy game. To assess quality--to measure and report back on the level of quality--you probably need a clear definition of the most important quality criteria for this product, and then you need a theory that relates test results to the definition. For example, reliability is not just about the number of bugs in the product. It is (or is often defined as being) about the number of reliability-related failures that can be expected in a period of time or a period of use. (Reliability-related? In measuring reliability, an organization might not care, for example, about misspellings in error messages.) To make this prediction, you need a mathematically and empirically sound model that links test results to reliability. Testing involves gathering the data needed by the model. This might involve extensive work in areas of the product believed to be stable as well as some work in weaker areas. Imagine a reliability model based on counting bugs found (perhaps weighted by some type of severity) per N lines of code or per K hours of testing. Finding the bugs is important. Eliminating duplicates is important. Troubleshooting to make the bug report easier to understand and more likely to be fixed is (in the context of assessment) out of scope.

• Verify correctness of the product. It is impossible to do this by testing. You can prove that the product is not correct or you can demonstrate that you didn’t find any errors in a given period of time using a given testing strategy. However, you can’t test exhaustively, and the product might fail under conditions that you did not test. The best you can do (if you have a solid, credible model) is assessment--test-based estimation of the probability of errors. (See the discussion of reliability, above.)

• Assure quality. Despite the common title, quality assurance, you can’t assure quality by testing. You can’t assure quality by gathering metrics. You can’t assure quality by setting standards. Quality assurance involves building a high quality product and for that, you need skilled people throughout development who have time and motivation and an appropriate balance of direction and creative freedom. This is out of scope for a test organization. It is within scope for the project manager and associated executives. The test organization can certainly help in this process by performing a wide range of technical investigations, but those investigations are not quality assurance.

Given a testing objective, a good test series provides information directly relevant to that objective.

Tests Intended to Expose Defects

Let’s narrow our focus to the test group that has two primary objectives:

• Find bugs that the rest of the development group will consider relevant (worth reporting), and

• Get these bugs fixed.

Even within these objectives, tests can be good in many different ways. For example, we might say that one test is better than another if it is:

• More powerful. I define power in the usual statistical sense as more likely to expose a bug if the bug is there. Note that Test 1 can be more powerful than Test 2 for one type of bug and less powerful than Test 2 for a different type of bug.

• More likely to yield significant (more motivating, more persuasive) results. A problem is significant if a stakeholder with influence would protest if the problem is not fixed. (A stakeholder is a person who is affected by the product. A stakeholder with influence is someone whose preference or opinion might result in change to the product.)

• More credible. A credible test is more likely to be taken as a realistic (or reasonable) set of operations by the programmer or another stakeholder with influence. “Corner case” is an example of a phrase used by programmers to say that a test or bug is non-credible: “No one would do that.” A test case is credible if some (or all) stakeholders agree that it is realistic.

• Representative of events more likely to be encountered by the customer. A population of tests can be designed to be highly credible. Set up your population to reflect actual usage probabilities. The more frequent clusters of activities are more likely to be covered or covered more thoroughly. (I say cluster of activities to suggest that many features are used together and so we might track which combinations of features are used and in what order, and reflect this more specific information in our analysis.) For more details, read Musa's (1998) work on software reliability engineering.

• Easier to evaluate. The question is, did the program pass or fail the test? The tester should be able to determine, quickly and easily, whether the program passed or failed. It is not enough that it is possible to tell whether the program passed or failed. The harder evaluation is, or the longer it takes, the more likely it is that failures will slip through unnoticed. Faced with time-consuming evaluation, the tester will take shortcuts and find less expensive ways to guess whether the program is OK or not. These shortcuts will typically be imperfectly accurate (that is, they may miss obvious bugs or they may flag correct code as erroneous).

• More useful for troubleshooting. For example, high volume automated tests will often crash the system under test without providing much information about the relevant test conditions needed to reproduce the problem. They are not useful for troubleshooting. Tests that are harder to repeat are less useful for troubleshooting. Tests that are harder to perform are less likely to be performed correctly the next time, when you are troubleshooting a failure that was exposed by this test.

• More informative. A test provides value to the extent that we learn from it. In most cases, you learn more from the test that the program fails than from the one that it passes, but the informative test will teach you something (reduce your uncertainty) whether the program passes it or fails.

o For example, if we have already run a test in several builds, and the program reliably passed it each time, we will expect the program to pass this test again. Another "pass" result from the reused test doesn't contribute anything to our mental model of the program.
o The notion of equivalence classes provides another example of information value. Behind any test is a set of tests that are sufficiently similar to it that we think of the other tests as essentially redundant with this one. In traditional jargon, this is the "equivalence class" or the "equivalence partition." If the tests are sufficiently similar, there is little added information to be obtained by running the second one after running the first.
o This criterion is closely related to Karl Popper’s theory of the value of experiments (see Popper, 1992). Good experiments involve risky predictions. The theory predicts something that many people would expect not to be true. Either your favorite theory is false or lots of people are surprised. Popper’s analysis of what makes for good experiments (good tests) is a core belief in a mainstream approach to the philosophy of science. Perhaps the essential consideration here is that the expected value of what you will learn from this test has to be balanced against the opportunity cost of designing and running the test. The time you spend on this test is time you don't have available for some other test or other activity.

• Appropriately complex. A complex test involves many features, or variables, or other attributes of the software under test. Complexity is less desirable when the program has changed in many ways, or when you’re testing many new features at once. If the program has many bugs, a complex test might fail so quickly that you don’t get to run much of it. Test groups that rely primarily on complex tests complain of blocking bugs. A blocking bug causes many tests to fail, preventing the test group from learning the other things about the program that these tests are supposed to expose. Therefore, early in testing, simple tests are desirable. As the program gets more stable, or (as in eXtreme Programming or any evolutionary development lifecycle) as more stable features are incorporated into the program, greater complexity becomes more desirable.

• More likely to help the tester or the programmer develop insight into some aspect of the product, the customer, or the environment. Sometimes, we test to understand the product, to learn how it works or where its risks might be. Later, we might design tests to expose faults, but especially early in testing we are interested in learning what it is and how to test it. Many tests like this are never reused. However, in a test-first design environment, code changes are often made experimentally, with the expectation that the (typically, unit) test suite will alert the programmer to side effects. In such an environment, a test might be designed to flag a performance change, a difference in rounding error, or some other change that is not a defect. An unexpected change in program behavior might alert the programmer that her model of the code or of the impact of her code change is incomplete or wrong, leading her to additional testing and troubleshooting. (Thanks to Ward Cunningham and Brian Marick for suggesting this example.)

Test Styles

Testers tend to design tests in one or more of these common styles:

• Function testing

• Domain testing

• Specification-based testing

• Risk-based testing

• Stress testing

• Regression testing

• User testing

• Scenario testing

• State-model based testing

• High volume automated testing

• Exploratory testing

Bach and I call these "paradigms" of testing because we have seen time and again that one or two of them dominate the thinking of a testing group or a talented tester. An analysis we find intriguing goes like this:

If I was a "scenario tester" (a person who defines testing primarily in terms of application of scenario tests), how would I actually test the program? What makes one scenario test better than another? Why types of problems would I tend to miss, what would be difficult for me to find or interpret, and what would be particularly easy?

Here are thumbnail sketches of the styles, with some thoughts on how test cases are “good” within them.

Function Testing

Test each function / feature / variable in isolation.

Most test groups start with fairly simple function testing but then switch to a different style, often involving the interaction of several functions, once the program passes the mainstream function tests. Within this approach, a good test focuses on a single function and tests it with middle-of-the-road values. We don’t expect the program to fail a test like this, but it will if the algorithm is fundamentally wrong, the build is broken, or a change to some other part of the program has fouled this code. These tests are highly credible and easy to evaluate but not particularly powerful.
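
Continuing the square-root example, a mainstream function test might look like this sketch (the same hypothetical my_sqrt as above; the expected value assumes ordinary mathematical behavior):

#include <assert.h>

double my_sqrt(double x);   /* hypothetical function under test */

void test_sqrt_mainstream(void)
{
    /* A middle-of-the-road value: we expect this to just work. */
    assert(my_sqrt(25.0) == 5.0);
}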

Some test groups spend most of their effort on function tests. For them, testing is complete when every item has been thoroughly tested on its own. In my experience, the tougher function tests look like domain tests and have their strengths.

Domain Testing

The essence of this type of testing is sampling. We reduce a massive set of possible tests to a small group by dividing (partitioning) the set into subsets (subdomains) and picking one or two representatives from each subset.

In domain testing, we focus on variables, initially one variable at a time. To test a given variable, the set includes all the values (including invalid values) that you can imagine being assigned to the variable. Partition the set into subdomains and test at least one representative from each subdomain. Typically, you test with a "best representative", that is, with a value that is at least as likely to expose an error as any other member of the class. Most discussions of domain testing are about input variables whose values can be mapped to the number line; the best representatives of partitions in these cases are typically boundary values. A good set of domain tests for a numeric variable hits every boundary value, including the minimum, the maximum, a value barely below the minimum, and a value barely above the maximum.
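
As a concrete sketch, suppose a quantity field is specified to accept integers from 1 to 999 (the field, the range, and accept_quantity are illustrative assumptions):

#include <assert.h>

int accept_quantity(int value);   /* hypothetical: returns 1 if accepted, 0 if rejected */

void test_quantity_boundaries(void)
{
    assert(accept_quantity(1)    == 1);   /* the minimum */
    assert(accept_quantity(999)  == 1);   /* the maximum */
    assert(accept_quantity(0)    == 0);   /* barely below the minimum */
    assert(accept_quantity(1000) == 0);   /* barely above the maximum */
}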


Whittaker (2003) provides an extensive discussion of the many different types of variables we can analyze in software, including input variables, output variables, results of intermediate calculations, values stored in a file system, and data sent to devices or other programs.

Kaner, Falk & Nguyen (1993) provide a detailed analysis of testing with a variable (printer type, in configuration testing) that can’t be mapped to a number line.

These tests have higher power than tests that don’t use “best representatives” or that skip some of the subdomains (for example, people often skip cases that are expected to lead to error messages).

The first time these tests are run, or after significant relevant changes, these tests carry a lot of information value because boundary / extreme-value errors are common.

Bugs found with these tests are sometimes dismissed, especially when you test extreme values of several variables at the same time. (These tests are called corner cases.) They are not necessarily credible, they don’t necessarily represent what customers will do, and thus they are not necessarily very motivating to stakeholders.

Specification-Based Testing

Check the program against every claim made in a reference document, such as a design specification, a requirements list, a user interface description, a published model, or a user manual.

These tests are highly significant (motivating) in companies that take their specifications seriously. For example, if the specification is part of a contract, conformance to the spec is very important. Similarly, products must conform to their advertisements, and life-critical products must conform to any safety-related specification. Specification-driven tests are often weak, not particularly powerful representatives of the class of tests that could test a given specification item.

Some groups that do specification-based testing focus narrowly on what is written in the document. To them, a good set of tests includes an unambiguous and relevant test for each claim made in the spec.
Other groups look further, for problems in the specification. They find that the most informative tests in a well-specified product are often the ones that explore ambiguities in the spec or examine aspects of the product that were not well-specified.

Risk-Based Testing

Imagine a way the program could fail and then design one or more tests to check whether the program will actually fail in that way.

A “complete” set of risk-based tests would be based on an exhaustive risk list, a list of every way the program could fail.

A good risk-based test is a powerful representative of the class of tests that address a given risk.

To the extent that the tests tie back to significant failures in the field or well known failures in a competitor’s product, a risk-based failure will be highly credible and highly motivating. However, many risk-based tests are dismissed as academic (unlikely to occur in real use). Being able to tie the “risk” (potential failure) you test for to a real failure in the field is very valuable, and makes tests more credible.

Risk-based tests tend to carry high information value because you are testing for a problem that you have some reason to believe might actually exist in the product. We learn a lot whether the program passes the test or fails it.
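
For example, if the imagined failure is an integer overflow when two order quantities are added, a risk-based test probes that risk directly (add_quantities and its error code are illustrative assumptions):

#include <assert.h>
#include <limits.h>

#define QUANTITY_ERROR (-1)

int add_quantities(int a, int b);   /* hypothetical function under test */

void test_overflow_risk(void)
{
    /* If overflow is unhandled, this is where it shows up. */
    assert(add_quantities(INT_MAX, 1) == QUANTITY_ERROR);
}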

Stress Testing

There are a few different definitions of stress testing.

• Under one common definition, you hit the program with a peak burst of activity and see it fail.

• IEEE Standard 610.12-1990 defines it as "Testing conducted to evaluate a system or component at or beyond the limits of its specified requirements with the goal of causing the system to fail."

• A third approach involves driving the program to failure in order to watch how the program fails. For example, if the test involves excessive input, you don’t just test near the specified limits. You keep increasing the size or rate of input until either the program finally fails or you become convinced that further increases won’t yield a failure. The fact that the program eventually fails might not be particularly surprising or motivating. The interesting thinking happens when you see the failure and ask what vulnerabilities have been exposed and which of them might be triggered under less extreme circumstances. Jorgensen (2003) provides a fascinating example of this style of work.

I work from this third definition.
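
In that spirit, a stress test keeps escalating until something gives, as in this sketch (make_input, process_input, and the doubling schedule are illustrative assumptions):

#include <stdio.h>
#include <stdlib.h>

char *make_input(size_t n);                     /* hypothetical input generator */
int process_input(const char *buf, size_t n);   /* hypothetical; returns 0 on failure */

int main(void)
{
    size_t n = 1024;
    for (;;) {
        char *buf = make_input(n);
        int ok = process_input(buf, n);
        free(buf);
        if (!ok) {
            /* The interesting question: HOW did it fail, and could a less
               extreme situation trigger the same vulnerability? */
            printf("Failed at input size %zu\n", n);
            break;
        }
        n *= 2;   /* keep increasing until failure, or until convinced */
    }
    return 0;
}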

These tests have high power.

Some people dismiss stress test results as not representative of customer use, and therefore not credible and not motivating. Another problem with stress testing is that a failure may not be useful unless the test provides good troubleshooting information, or the lead tester is extremely familiar with the application. A good stress test pushes the limit you want to push, and includes enough diagnostic support to make it reasonably easy for you to investigate a failure once you see it. Some testers, such as Alberto Savoia (2000), use stress-like tests to expose failures that are hard to see if the system is not running several tasks concurrently. These failures often show up well within the theoretical limits of the system and so they are more credible and more motivating. They are not necessarily easy to troubleshoot.

Regression Testing

Design, develop, and save tests with the intent of regularly reusing them. Repeat the tests after making changes to the program.

Regression testing is a good place to note that this is not an orthogonal list of test types. You can put domain tests or specification-based tests or any other kind of test into your set of regression tests.

So what’s the difference between these and the others? I’ll answer this by example:

Suppose a tester creates a suite of domain tests and saves them for reuse. Is this domain testing or regression testing?

• I think of it as primarily domain testing if the tester is primarily thinking about partitioning variables and finding good representatives when she creates the tests.

• I think of it as primarily regression testing if the tester is primarily thinking about building a set of reusable tests.

Regression tests may have been powerful, credible, and so on, when they were first designed. However, after a test has been run and passed many times, it’s not likely that the program will fail it the next time, unless there have been major changes or changes in part of the code directly involved with this test. Thus, most of the time, regression tests carry little information value.

A good regression test is designed for reuse. It is adequately documented and maintainable. (For suggestions that improve maintainability of GUI-level tests, see Graham & Fewster, 1999; Kaner, 1998; Pettichord, 2002, and the papers at www.pettichord.com in general). A good regression test is designed to be likely to fail if changes induce errors in the function(s) or area(s) of the program addressed by the regression test.

User Testing

User testing is done by users. Not by testers pretending to be users. Not by secretaries or executives pretending to be testers pretending to be users. By users. People who will make use of the finished product.

User tests might be designed by the users or by testers or by other people (sometimes even by lawyers, who included them as acceptance tests in a contract for custom software). The set of user tests might include boundary tests, stress tests, or any other type of test.

Some user tests are designed in such detail that the user merely executes them and reports whether the program passed or failed them. This is a good way to design tests if your goal is to provide a carefully scripted demonstration of the system, without much opportunity for wrong things to show up as wrong. If your goal is to discover what problems a user will encounter in real use of the system, your task is much more difficult. Beta tests are often described as cheap, effective user tests but in practice they can be quite expensive to administer and they may not yield much information. For some suggestions on beta tests, see Kaner, Falk & Nguyen (1993).

A good user test must allow enough room for cognitive activity by the user while providing enough structure for the user to report the results effectively (in a way that helps readers understand and troubleshoot the problem).

Failures found in user testing are typically credible and motivating. Few users run particularly powerful tests. However, some users run complex scenarios that put the program through its paces.

Scenario Testing

A scenario is a story that describes a hypothetical situation. In testing, you check how the program copes with this hypothetical situation.

The ideal scenario test is credible, motivating, easy to evaluate, and complex.

In practice, many scenarios will be weak in at least one of these attributes, but people will still call them scenarios. The key message of this pattern is that you should keep these four attributes in mind when you design a scenario test and try hard to achieve them.


An important variation of the scenario test involves a harsher test. The story will often involve a sequence, or data values, that would rarely be used by typical users. They might arise, however, out of user error or in the course of an unusual but plausible situation, or in the behavior of a hostile user. Hans Buwalda (2000a, 2000b) calls these "killer soaps" to distinguish them from normal scenarios, which he calls "soap operas." Such scenarios are common in security testing or other forms of stress testing.

In the Rational Unified Process, scenarios come from use cases (Jacobson, Booch, & Rumbaugh, 1999). Scenarios specify actors, roles, business processes, the goal(s) of the actor(s), and events that can occur in the course of attempting to achieve the goal. A scenario is an instantiation of a use case. A simple scenario traces through a single use case, specifying the data values and thus the path taken through the case. A more complex scenario involves concatenation of several use cases, to trace through a given task, end to end. (See also Bittner & Spence, 2003; Cockburn, 2000; Collard, 1999; Constantine & Lockwood, 1999; Wiegers, 1999.) For a cautionary note, see Berger (2001).

However they are derived, good scenario tests have high power the first time they’re run.

Groups vary in how often they run a given scenario test.

• Some groups create a pool of scenario tests as regression tests.

• Others (like me) run a scenario once or a small number of times and then design another scenario rather than sticking with the ones they’ve used before.

Testers often develop scenarios to develop insight into the product. This is especially true early in testing and again late in testing (when the product has stabilized and the tester is trying to understand advanced uses of the product.)

State-Model-Based Testing

In state-model-based testing, you model the visible behavior of the program as a state machine and drive the program through the state transitions, checking for conformance to predictions from the model. This approach to testing is discussed extensively at www.model-based-testing.org. In general, comparisons of software behavior to the model are done using automated tests and so the failures that are found are found easily (easy to evaluate).

In general, state-model-based tests are credible, motivating and easy to troubleshoot. However, state-based testing often involves simplifications, looking at transitions between operational modes rather than states, because there are too many states (El-Far 1995). Some abstractions to operational modes are obvious and credible, but others can seem overbroad or otherwise odd to some stakeholders, thereby reducing the value of the tests. Additionally, if the model is oversimplified, failures exposed by the model can be difficult to troubleshoot (Houghtaling, 2001).

Talking about his experiences in creating state models of software, Harry Robinson (2001) reported that much of the bug-finding happens while doing the modeling, well before the automated tests are coded. Elisabeth Hendrickson (2002) trains testers to work with state models as an exploratory testing tool--her models might never result in automated tests; their value is that they guide the analysis by the tester.

El-Far, Thompson & Mottay (2001) and El-Far (2001) discuss some of the considerations in building a good suite of model-based tests. There are important tradeoffs involving, for example, the level of detail (more detailed models find more bugs but can be much harder to read and maintain). For much more, see the papers at www.model-based-testing.org.
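
As a toy illustration of the mechanics (a deliberately tiny model, not drawn from the cited papers), this sketch models a player with two operational modes and checks each observed mode against the model’s prediction:

#include <assert.h>

typedef enum { STOPPED, PLAYING } Mode;

/* Hypothetical interfaces to the system under test. */
void press_play(void);
void press_stop(void);
Mode observed_mode(void);

/* The model: the next mode depends only on which button was pressed. */
static Mode model_next(int pressed_play) { return pressed_play ? PLAYING : STOPPED; }

void test_walk_transitions(void)
{
    int presses[] = { 1, 0, 0, 1 };   /* one path through the state machine */
    for (int i = 0; i < 4; i++) {
        if (presses[i]) press_play(); else press_stop();
        /* Conformance check: the implementation must match the model. */
        assert(observed_mode() == model_next(presses[i]));
    }
}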

High-Volume Automated Testing

High-volume automated testing involves massive numbers of tests, comparing the results against one or more partial oracles.

• The simplest partial oracle is running versus crashing. If the program crashes, there must be a bug. See Nyman (1998, 2002) for details and experience reports.

• State-model-based testing can be high volume if the stopping rule is based on the results of the tests rather than on a coverage criterion. For the general notion of stochastic state-based testing, see Whittaker (1997). For discussion of state-model-based testing ended by a coverage stopping rule, see Al-Ghafees & Whittaker (2002).

• Jorgensen (2002) provides another example of high-volume testing. He starts with a file that is valid for the application under test. Then he corrupts it in many ways, in many places, feeding the corrupted files to the application. The application rejects most of the bad files and crashes on some. Sometimes, some applications lose control when handling these files. Buffer overruns or other failures allow the tester to take over the application or the machine running the application. Any program that will read any type of data stream can be subject to this type of attack if the tester can modify the data stream before it reaches the program.

• Kaner (2000) describes several other examples of high-volume automated testing approaches. One classic approach repeatedly feeds random data to the application under test and to another application that serves as a reference for comparison, an oracle. Another approach runs an arbitrarily long random sequence of regression tests, tests that the program has shown it can pass one by one. Memory leaks, stack corruption, wild pointers, or other garbage that accumulates over time finally causes failures in these long sequences. Yet another approach attacks the program with long sequences of activity and uses probes (tests built into the program that log warning or failure messages in response to unexpected conditions) to expose problems.

High-volume testing is a diverse grouping. The essence of it is that the structure of this type of testing is designed by a person, but the individual test cases are developed, executed, and interpreted by the computer, which flags suspected failures for human review. The almost-complete automation is what makes it possible to run so many tests.

• The individual tests are often weak. They make up for low power with massive numbers.

• Because the tests are not handcrafted, some tests that expose failures may not be particularly credible or motivating. A skilled tester often works with a failure to imagine a broader or more significant range of circumstances under which the failure might arise, and then crafts a test to prove it.

• Some high-volume test approaches yield failures that are very hard to troubleshoot. It is easy to see that the failure occurred in a given test, but one of the necessary conditions that led to the failure might have been set up thousands of tests before the one that actually failed. Building troubleshooting support into these tests is a design challenge that some test groups have tackled more effectively than others.
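
As a sketch of the random-comparison approach described above (app_under_test and reference_oracle are hypothetical stand-ins for the application under test and its oracle):

#include <stdio.h>
#include <stdlib.h>

long app_under_test(long input);     /* hypothetical program under test */
long reference_oracle(long input);   /* hypothetical trusted reference */

int main(void)
{
    srand(12345);   /* fixed seed so suspected failures can be reproduced */
    for (long i = 0; i < 1000000; i++) {
        long input = rand();
        if (app_under_test(input) != reference_oracle(input)) {
            /* Flag for human review: the computer runs and interprets the
               tests; people investigate the suspects. */
            printf("Mismatch on input %ld (iteration %ld)\n", input, i);
        }
    }
    return 0;
}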

Exploratory Testing

Exploratory testing is “any testing to the extent that the tester actively controls the design of the tests as those tests are performed and uses information gained while testing to design new and better tests” (Bach 2003a). Bach points out that tests span a continuum from purely scripted (the tester does precisely what the script specifies and nothing else) to purely exploratory (none of the tester’s activities are pre-specified and the tester is not required to generate any test documentation beyond bug reports). Any given testing effort falls somewhere on this continuum. Even predominantly pre-scripted testing can be exploratory when performed by a skilled tester. “In the prototypic case (what Bach calls “freestyle exploratory testing”), exploratory testers continually learn about the software they’re testing, the market for the product, the various ways in which the product could fail, the weaknesses of the product (including where problems have been found in the application historically and which developers tend to make which kinds of errors), and the best ways to test the software. At the same time that they’re doing all this learning, exploratory testers also test the software, report the problems they find, advocate for the problems they found to be fixed, and develop new tests based on the information they’ve obtained so far in their learning.” (Tinkham & Kaner, 2003)

An exploratory tester might use any type of test--domain, specification-based, stress, risk-based, any of them. The underlying issue is not what style of testing is best but what is most likely to reveal the information the tester is looking for at the moment.

Exploratory testing is not purely spontaneous. The tester might do extensive research, such as studying competitive products, failure histories of this and analogous products, interviewing programmers and users, reading specifications, and working with the product.

What distinguishes skilled exploratory testing from other approaches and from unskilled exploration, is that in the moments of doing the testing, the person who is doing exploratory testing well is fully engaged in the work, learning and planning as well as running the tests. Test cases are good to the extent that they advance the tester’s knowledge in the direction of his information-seeking goal. Exploratory testing is highly goal-driven, but the goal may change quickly as the tester gains new knowledge.

Concluding Notes

There’s no simple formula or prescription for generating “good” test cases. The space of interesting tests is too complex for this. There are tests that are good for your purposes, for bringing forth the type of information that you’re seeking.

Many test groups, most of the ones that I’ve seen, stick with a few types of tests. They are primarily scenario testers or primarily domain testers, etc. As they get very good at their preferred style(s) of testing, their tests become, in some ways, excellent. Unfortunately, no style yields tests that are excellent in all of the ways we wish for tests. To achieve the broad range of value from our tests, we have to use a broad range of techniques.

Correlation

Correlation? What’s that?

If you think correlation has something to do with the fit of data points to a function curve on a graph, and the word has no meaning to you in the context of LoadRunner, then this document is for you. It explains what correlation in LoadRunner is, why you have to do it, how to do it, and what to do when it goes wrong. If this is the first time that you have used LoadRunner, or if you have been using it a little but are not a guru, then read on.



Introduction

When recording a script, LoadRunner simply listens to the client (browser) talking to the server (web server) and writes it all down. The complete transcript of everything that was said (the dates/times, content, requests, and replies) can be found in the Recording Log (View-> Output Window-> Recording Log). The script is a sort of easier-to-read version of this. The main difference is that the script contains only the client’s communication.

If you imagine that LoadRunner is an impersonator pretending to be the client (browser), the script is LoadRunner’s set of notes telling it what to say to the server to successfully fool it. We want the server to believe that LoadRunner is a real client, and so send it the information requested.

This script contains the hard coded information from the original conversation (browser session) between the client and server. This hard coded information may not be enough to fool the server during replay, however. It may have to be correlated.


What is correlation?

Correlation is where the script is modified so that some of the hard coded values in the script are no longer hard coded. Rather than have LoadRunner send the original value to the server, we may need to send different values.

For example, the original recorded script may have included the server sending the client a session identification number, something to identify the client during that particular session. This session ID was hard coded into the script during recording.

During replay, the server will send LoadRunner a new session ID. We need to capture this value, and incorporate it into the script so we can send it back to the server to correctly identify ourselves for this new session. If we leave the script unmodified, we will send the old hard coded session ID to the server. The server will look at it and think it invalid, or unknown, and so will not send us the pages we have requested. LoadRunner will not have successfully fooled the server into believing it is a client.

Correlation is the capturing of dynamic values passed from the server to the client and back. We save this captured value into a LoadRunner parameter, and then use this parameter in the script in place of the original value. During replay, LoadRunner will now listen to what the server sends it and, when it makes requests of the server, send this new, valid value back, thus fooling the server into believing it is talking to a real client.

Why do I have to correlate?

If you try to replay a script without correlating it first, the script will most likely fail. The requests it sends to the server will not be replied to: either the session ID is invalid, so the server won’t allow you into the site; or it won’t allow you to create new records because they are the same as existing ones; or the server won’t understand your request because it isn’t what it is expecting.

Any value which changes every time you connect to the server is a candidate for correlation. A correlated script will send the server the information it is looking for, and so allow the script to replay. This will allow many Vusers to replay the script many times, and so place load on your server.



What errors mean I have to correlate?

There are no specific errors associated with correlation, but there are errors that can be caused by a value that hasn’t been correlated, for example a session ID. If an invalid session ID is sent to a web server, how that server responds depends on the implementation of that server. It might send a page specifically stating that the session ID is invalid and ask you to log in again. It might send an HTTP 404 Page Not Found error because the requesting user didn’t have permission for the specified page, and so the server couldn’t find the page.

In general, any error message returned from the server that complains about permissions after LoadRunner makes a request can point to a hard coded value that needs to be correlated.


The tools (functions) used to correlate.

In LoadRunner 7.X there are four functions that you can use for correlation. A list of them, along with documentation and examples, can be found in the on-line documentation. From VuGen, go to Help-> Function reference-> Contents-> Web and Wireless Vuser Functions-> Correlation Functions.

The first two functions are essentially the same, and I will talk about them together. The third function, web_reg_save_param, differs in implementation and in the parameters it takes, but does the same job and is used in much the same way. The last function is associated with the first two; it isn’t directly a correlation function, but rather a LoadRunner setting. It will be discussed later in a different section.

web_create_html_param :-

This is the standard correlation function in LoadRunner 6.X and 7.X. This function takes three parameters.

web_create_html_param("Parameter Name", "Left Boundary", "Right Boundary");

Each of these parameters is a pointer to a string. That means that if they are entered as literal text, they need to be enclosed in "quotes". Each parameter is separated by a comma.

Parameter Name:- This is the name of the parameter, or placeholder variable, that LoadRunner will save the captured value into. After successfully capturing the value, the parameter name is used in the script in place of the original value. LoadRunner will identify the parameter / placeholder and substitute the captured value for the placeholder during replay. This name should have no spaces, but apart from that limitation, it is entirely up to you what name you give.

Left Boundary:- This is where we tell LoadRunner how to find the dynamic value that we are looking for. In the Left Boundary we specify the text that will appear to the left of the changing value.

Right Boundary:- This is where we tell LoadRunner how to identify the end of the dynamic value we are looking for. Here we place the text that will appear after the value we are looking for.

web_create_html_param_ex

web_create_html_param_ex("Parameter Name", "Left Boundary", "Right Boundary", "Instance");

This function is the same as web_create_html_param, except that it doesn’t look for the first instance of the boundaries, but rather the nth instance of those boundaries. The first three parameters are the same: name, left, right. The last parameter is a pointer to a string, so it must be enclosed in double quotes; it is the number of the occurrence. If you place the number one here (i.e. "1"), the function behaves exactly like web_create_html_param and looks for the first occurrence. If you put the number three here (i.e. "3"), it will look for the 3rd occurrence of the left and right boundaries, and place what appears in between into the parameter.
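
For example, using boundaries like those in the worked example later in this document, the following call would capture the third occurrence of the value (the parameter name and boundary text here are illustrative):

web_create_html_param_ex("UserSession3", "input type=hidden name=userSession value=", ">", "3");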

web_reg_save_param

web_reg_save_param("Parameter Name", <attribute list>, LAST);

The first thing to note about this function, as different from the web_create_html_param functions, is that the number of parameters it takes can vary. The first one is still the name, but after that there are different attributes that can be used. These attributes can appear in any order because each one identifies itself. For example, the attribute to identify the left boundary is "LB=" followed by the text of the left boundary. I won’t be talking about all of the options for this function; they are listed in the documentation. Please have a look at it. (Help-> Function Reference)

The first parameter is the name, then the list of attributes, then the keyword LAST, which identifies the end of the function. The keyword is not enclosed in quotes; all parameters are. All parameters and keywords are separated by commas.
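
For comparison, the equivalent of the web_create_html_param call built later in this document might be written with attributes like this (the boundary text and ORD value are illustrative):

web_reg_save_param("UserSession",
    "LB=input type=hidden name=userSession value=",
    "RB=>",
    "ORD=1",
    LAST);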



Identifying values to correlate.

So we have the tools, and we know why we need to use them, but how do we know what to use them on? What values in the script need to be correlated? The simplest answer is, “Any value that changes between sessions and is required for the script to replay.”

A hypothetical example: we are logging onto a web site. When we send the server our user name and password, it replies with a session ID that is good for that session. The session ID needs to be correlated for replay. We need to capture this value during replay and use it in the script in place of the hard coded value.

To identify values to correlate, record the script and save it. Open a new script, and record the same actions and business process again. As much as possible, enter the same values in both recordings--for example, user ID, password, and field and edit selections. Save the second script, and then run it with Extended log (Vuser-> Run time settings-> Log-> Extended log; check all three options).

Go to Tools-> Compare with Vuser, and choose the first recorded script. WinDiff will open and display the two scripts side by side. Lines with differences in them will be highlighted in yellow. Differences within a line will be in red.




If WinDiff gives an error here, dismiss the error; WinDiff will be minimized in the task bar. Right click on it, choose Restore, then go to File-> Select Files/Directories and manually select the Action sections for the two scripts.

Differences like "lr_think_time" can be ignored. They are load runner pacing functions, and don’t represent data sent to the server.

Locate the first difference, take note of it, and search the script open in VuGen for that difference. That is the original value, hard coded into the script, that was different in the second script. Highlight it and copy it.



Go to the Recording log and place your cursor at the top. Hit Control F (Ctrl+F) to do a search, and paste in the original value. We are looking for the first occurrence of this value in the recording log. If you don’t find the value in the recording log, check that you are looking in the right script’s recording log. Remember, you have two almost identical scripts here.


If you find the value, scroll up in the log and make sure the value was sent as part of a response from the server. The first header you come across while looking up the log should be prefaced with “Receiving response”. This indicates that the value was sent by the server to the client. If the value first appears as part of a sending request, then the value originated on the client side and doesn’t need to be correlated, but rather parameterized; that is a different topic altogether. The response will have a comment before it that looks like this:

*** [tid=640 Action1 2] Receiving response ( 10/8/2001 12:10:26 )



So, we have a value that is different between subsequent recordings, and it was sent from the server to the client. This value most likely needs to be correlated. If the value you were looking for doesn’t meet these criteria:

1. Different between recordings
2. Originated first on the server and sent to the client

it probably doesn’t need to be correlated.



Now that we know the why and the what, how do we correlate?

Step 1.
After confirming that the first occurrence was part of a received response from the server, we need to figure out where to place the web_create_html_param( ) function. The web_create_html_param statement needs to go immediately before the request that fetches the dynamic value from the server. In order to find this request or URL in the script, we need to replay the script once with Extended log and all three options (in Vuser-> Run time settings-> Log) turned on.

In the recording log, pick out the text that appears immediately before the dynamic value. This text should remain constant no matter how many times you replay the script. Highlight it and copy it. This is the text that will identify to LoadRunner where to find the start of the value we are capturing.




Now, go to the execution log and search for the text that you just copied from the recording log.




You should see a corresponding Action1.c() at the beginning of that line with a number in the brackets. That is the number of the line in the script where you need to put the web_create_html_param( ) function. The function should go right above that line in the script.



So, add a couple of blank lines to your script before the function at that line, and then type in web_create_html_param("UserSession", but give it a name that means more to you than UserSession.



Step 2.
Go back to the execution log, highlight the text to the left of the dynamic value, and copy it. This should be some of the same text we searched for in the execution log.

The amount of text you highlight should be sufficient so that it is unique in this reply from the server. I would suggest copying as much as possible without copying any special characters. These show in the execution log as black squares, and the actual character they represent is uncertain. After selecting a boundary, go to the top of the server’s reply, hit Ctrl+F, and do a search for that boundary. You want to make certain that what you have selected is the first occurrence in the server’s reply. If it isn’t, select more text to make it unique, or consider using the web_create_html_param_ex function, the ORD attribute, or the web_reg_save_param function.

Once you have finalized the static text that represents the left boundary, copy it into the web_create_html_param (or web_reg_save_param) statement. If it contains any carriage returns, place it all on one line. If there are any double-quote characters (") in the text, place the escape character (\) before each one so LoadRunner doesn’t incorrectly think it is the end of the parameter, but rather a character to search for. For example, if the left boundary was 'input type=hidden name=userSession value=' (without the single quotes) and we are using the web_create_html_param statement, then the function we have so far would be:

web_create_html_param("UserSession", "input type=hidden name=userSession value=",


Step 3.
We are now going to tell LoadRunner how to identify the end of the value we are trying to capture. That is the right boundary of what we are looking for. Again, look in the execution log and copy the static text that appears to the right of the dynamic value. For example, let’s say the execution log contained the following:

… userSession value=75893.0884568651DQADHfApHDHfcDtccpfAttcf>…

Then the example so far, to save the number into the parameter UserSession, would be:

web_create_html_param("UserSession", "input type=hidden name=userSession value=", ">");



In choosing a right boundary, make sure you choose enough static text to specify the end of the value. If the boundary you specify appears within the value you are trying to capture, then you will not capture the whole value.

Recap:-
That was a lot of looking through the recording and execution logs and checking of values. Let’s just recap what we have done. We identified a value that we think needs to be correlated. We then identified where in the script to place the statement that will ultimately capture and save the value into a parameter. We then placed the statement, and gave LoadRunner the text strings that appear on either side of the value we are looking for so that it can find it.

The flow of logic is this: the correlation function tells LoadRunner what to look for in the next set of replies from the server. LoadRunner makes a request of the server. The server replies. LoadRunner looks through the replies for the left and right boundaries. If it finds them, what is in between is saved to a parameter of the name specified.

Remember, the parameter can’t have a value till AFTER the next request is executed. The correlation statement only tells LoadRunner what to look for. It doesn’t assign a value to the parameter. Assignment of a value to the parameter doesn’t happen till after LoadRunner makes a request of the server and looks in the reply. If your script contains a case where a correlation statement is immediately followed by a function that attempts to use the parameter, the statement is in the wrong place, and the script will fail.
This is always incorrect:-

web_create_html_param(…);
web_submit_data(… "{Parameter}" …);

There needs to be, in between the two, the request to the server that causes it to reply with the value we are trying to capture.
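
A correct ordering looks like this sketch (the step names and URL are illustrative; the essential point is the request between the two statements):

web_create_html_param("UserSession", "input type=hidden name=userSession value=", ">");
web_url("login_page", "URL=http://myserver/login", LAST);   /* this request triggers the reply containing the value */
web_submit_data("next_step", … "Value={UserSession}" …);    /* by now the parameter holds the captured value */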

Replacing the hard coded value in the script with the parameter.

Once we have created the parameter, the next step is to replace the hard coded occurrences of the value with the parameter. Look through the script for the original value. Where you find it, delete the value and replace it with the parameter name in curly braces. Note: only the value we want replaced is deleted. The characters around it remain.

i.e.

Change :
..... .....userSession=75893.0884568651DQADHfApHDHfcDtccpfAttcf&username=test........
.....


To :
.....
.....userSession={UserSession}&username=test........
.....



At this point, you are ready to run the script to test whether it works, whether it needs further correlation, or whether this correlation needs more work.



Common errors when correlating.

When LoadRunner fails to find the boundaries for a web_create_html_param, it will print a warning message in the execution log like this:-

Warning: No match found for the requested parameter "Name". If the data you want to save exceeds 256 bytes, use web_set_max_html_param_len to increase the parameter size

Firstly, this is a warning, not an error. There are times when you might want to use the web_create_html_param function for purposes other than correlation; these require the function not to cause an error, so it is a warning.

Secondly, the advice the warning message gives is good, but I recommend thinking about it first. Was the value you were trying to capture more than 256 characters long? In the above example it was only 20 characters long. Have a look at the recording log and see how long the original value is. Have a look at the second recording made earlier and see how long it was in that script. Turn on the extended log (Run-Time Settings -> Log -> Extended log -> All data returned from server) and have a look at how long it is in the execution log. If at any time any of these values was close to, say, 200 characters, then yes, add a web_set_max_html_param_len statement to the start of the script to make the maximum longer than 256 characters. If all occurrences were much shorter than the maximum parameter length, then the problem is either that the web_create_html_param is in the wrong place, or that the boundaries are incorrect. Go back and look at the boundaries you have selected and at the placement of the web_create_html_param function. Is it immediately before the statement that causes the server to reply with the data you are looking for?

The Parameter length is longer than the current maximum.
(The web_set_max_html_param_len function)

web_set_max_html_param_len("length");

This statement tells LoadRunner to allow longer matches between the left and right boundaries. When it finds the left boundary, it will look ahead up to the maximum parameter length for the right boundary. This setting is script-wide and takes effect from when it is executed; it only needs to appear in the script once. Having LoadRunner look for longer matches uses more memory and CPU to search through the text returned from the server. For this reason, don't set it too high, or you will make your script less scalable; that is, you will reduce the number of Vusers that can run it on a given machine. Try to have the maximum parameter length no more than 100 characters greater than what you are expecting.
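For example, if the captured value could approach 300 characters, a sketch like the following (the length shown is an assumption) raises the limit before the correlation statement runs:

web_set_max_html_param_len("400"); /* allow captures of up to 400 characters */

web_create_html_param("UserSession", "input type=hidden name=userSession value=", ">");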



Special cases for the boundaries:-

There are some special characters and cases when specifying the boundaries. Double quotes should be preceded by a \ so LoadRunner recognizes them as part of the string to look for. If your text includes any carriage returns that are part of the HTTP, and not just part of the wrap-around in the recording log, these need to be specified as the \r\n character pair. If the \ character is part of the text, it too needs to be preceded by a \ to indicate it is a literal.

Recording log text            Left boundary parameter         Right boundary parameter

Value="57685"                 "Value=\""                      "\""

Value_"\item\"value'7875'     "Value_\"\\item\\\"value'"      "'"

Value=                        "Value=\r\n\""                  "\""
"7898756"

(In the last example the value follows a carriage return in the HTTP, so the left boundary includes \r\n.)
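Applying the first row of the table inside an actual statement (the parameter name is hypothetical) gives:

/* captures 57685 from: Value="57685" */
web_create_html_param("MyValue", "Value=\"", "\"");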


Debug help

Sometimes you want to print out the value that was assigned to a parameter. To do this, use the lr_eval_string function together with the lr_output_message function. For example, to print the value of the parameter to the execution log:

lr_output_message("Value Captured = %s", lr_eval_string("{Name}"));

If you find that the value being substituted is too long, too short, or completely wrong, printing out the value will help identify the changes you need to make to the correlation function. If you have extra characters at the start of the value, add them to the end of the left boundary; if you have extra characters at the end of the value, add them to the start of the right boundary. If you are getting the wrong value altogether, search the recording log for the left boundary and make sure that it is unique and that LoadRunner isn't picking up an earlier occurrence. You can then use the web_create_html_param_ex function, or add to the boundaries to make them unique.
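If the left boundary simply cannot be made unique, one option is the web_reg_save_param form mentioned earlier, whose Ord attribute picks which occurrence to save. A sketch, assuming the value we want is the second match in the reply:

/* save the second occurrence of the boundaries into UserSession */
web_reg_save_param("UserSession",
    "LB=input type=hidden name=userSession value=",
    "RB=>",
    "Ord=2",
    LAST);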



Other Correlation help resources.

The examples in the Function Reference contain a lot of data and examples on how to use these functions. I recommend looking over them.

The Customer Support site has a video for download that goes over correlation. You can get it from

http://support.merc-int.com

After logging in, go to Downloads -> Browse. Enter LoadRunner in the product selection box, select the Mercury Interactive downloads radio button, and click on Retrieve. Under Training, click on the LoadRunner Web Script Correlation Training link.

Siebel 7.x Record and Replay for LoadRunner 8.x


Preparation before record – Internet Explorer Settings

To avoid problems, verify and change the following Internet Explorer settings before you try to record:

1. Enable all ActiveX controls and plug-ins. This option is available in Internet Explorer -> Tools -> Internet Options -> Security -> Custom Level.

2. Enable the "Use HTTP 1.1 through proxy connection" option. This option is available in Internet Explorer -> Tools -> Internet Options -> Advanced, under "HTTP 1.1 settings".

Recording Siebel script with Auto-Correlation option

Correlation is the mechanism by which VuGen saves dynamic values to parameters during record and replay, for use at a later point in the script. For general information about correlation, refer to Problem ID 11806 - What is correlation and how is it done.

For Siebel script, you can instruct VuGen to automatically apply correlation during recording using one of the following methods:

· VuGen Native Siebel Correlation

The native, built-in rules work at a low level, allowing you to debug your script and understand the correlations in depth.

· Siebel Correlation Library

The Siebel correlation library automatically correlates most of the dynamic values, creating a concise script that you can replay easily. Note that this is only available for Siebel 7.7, and the library is distributed by Siebel.

How to record with VuGen Native Siebel Correlation

VuGen's native built-in rules for the Siebel server detect the Siebel server variables and strings, automatically saving them for use at a later point within the script. They are available in VuGen's recording options by default; you do not need to have any additional components installed.

Steps to record with Native Siebel correlation:

1. From the ‘New Multiple Protocol Script’ window, add ‘Siebel-Web’ and click OK.

2. Set the following Recording Options:

a. Internet Protocol: Recording:

· Select 'HTML based script'

· Click on ‘HTML Advanced’ and select the following

i. Script Type: a script containing explicit URLs only

ii. Non HTML-generated elements: Do not record

b. Internet Protocol: Advanced:

i. Clear the ‘Reset context for each action’ option.

ii. Select ‘Support Charset’ then select ‘UTF-8’

c. Internet Protocol: Correlation:

i. Make sure that 'Enable correlation during recording' is selected

ii. Make sure that ‘Siebel’ is selected. You can expand the list to see the details about each rule if you wish.

d. Leave other options as default.

3. Record in the following way:

· Record the login in the vuser_init section

· Record the Business Process in Action1

· Record the logout in the vuser_end section

How to record with Siebel Correlation Library

Siebel has released a correlation library file, ssdtcorr.dll, as part of Siebel Application Server version 7.7. This library is available only through Siebel and can be found in the siebsrvr\bin directory on Windows.

Note: The Siebel Correlation API is supported on Windows 2000 and Windows XP only. This implies that if you correlate the script with this method, you cannot replay the script on a UNIX platform. If you need to run the script on a UNIX platform, use the VuGen Native Siebel Correlation method.

The library file, ssdtcorr.dll, must be available to all machines where a Load Generator, Controller, or Tuning Console resides.

Steps to record with Siebel correlation library:

1. Copy ssdtcorr.dll into the \bin directory of the Controller / Tuning Console and ALL Load Generator machines.

2. From the ‘New Multiple Protocol Script’ window, add ‘Siebel-Web’ and click OK.

3. Set the following Recording Options:

a. Internet Protocol: Recording:

· Select ‘HTML based script’

· Click on ‘HTML Advanced’ and select the following

i. Script Type: a script containing explicit URLs only

ii. Non HTML-generated elements: Do not record
b. Internet Protocol: Advanced:

i. Clear the ‘Reset context for each action’ option.

ii. Select ‘Support Charset’ then select ‘UTF-8’

c. Internet Protocol: Correlation:

i. Delete the default ‘Siebel’ correlation rule

ii. Click on 'Import'; the 'Import Correlation Settings from a file' window opens.

iii. Navigate to the \dat\webrulesdefaultsetting directory, select 'WebSiebel77Correlation.cor' and click 'Open'.

iv. On the ‘Confirm Rule Replacement’ window, select ‘Overwrite’

Note: To revert to the default correlation rules, delete all of the Siebel rules and click 'Use Defaults'.

d. Leave other options as default.

4. Record in the following way:

· Record the login in the vuser_init section

· Record the Business Process in Action1

· Record the logout in the vuser_end section

Note: After using the Siebel Correlation Library, a script may throw errors when run for multiple iterations. This is because the server caches some of the data after the first iteration, and on the second iteration some of that data does not return from the server. The solution is to record the business process twice: first record it into vuser_init, and then into Action. This is actually the way Siebel uses LoadRunner, and it is the recommended way to script Siebel.

Replaying Siebel script – Run-Time Settings

Make sure that “Simulate a new user on each iteration” is not selected in the Browser Emulation options.

Common Replay Errors

Error: “We detected an Error which may have occurred for one or more of the following reasons: We are unable to process your request. This is most likely because you used the browser BACK or REFRESH button to get to this point.”

Diagnosis: An HTTP request has been sent twice to the server. It could be an individual web_url request or part of the resources being downloaded by another request. When the second request arrives, the Siebel 7.x server detects the duplicate and issues the above error.

Example:

The following is a sample HTML-based script. Even though "start.swe3" is a frame within step "start.swe2", you can see that an additional request is generated for "start.swe3" because of the "wait.html" step. On replay, the server may reject the second request, "start.swe3", since it is the same as the HTTP call generated by "start.swe2". This may be due to the SWECount or SWEC.

web_submit_data("start.swe2",

"Action=http://64.242.155.45/callcenter/start.swe",

"Method=POST",

"RecContentType=text/html",

"Referer=http://64.242.155.45/callcenter/start.swe?SWECmd=Start",

"Mode=HTML",

ITEMDATA,

"Name=SWEUserName", "Value=sadmin", ENDITEM,

"Name=SWEPassword", "Value=sadmin", ENDITEM,

"Name=SWENeedContext", "Value=false", ENDITEM,

"Name=SWEFo", "Value=SWEEntryForm", ENDITEM,

"Name=SWETS", "Value=1024549479671", ENDITEM,

"Name=SWECmd", "Value=ExecuteLogin", ENDITEM,

"Name=SWEBID", "Value=-1", ENDITEM,

"Name=SWEC", "Value=0", ENDITEM,

LAST);

web_url("wait.html",

"URL=http://64.242.155.45/callcenter/wait.html",

"TargetFrame=", "Resource=0","RecContentType=text/html","Referer=",

"Snapshot=t6.inf","Mode=HTML",

LAST);

web_url("start.swe3",

"URL=http://64.242.155.45/callcenter/start.swe?SWEFrame=top._swe&_sn={Siebel_sn_body3}&SWECmd=GetCachedFrame&SWEC=1",

"TargetFrame=", "Resource=0",

"RecContentType=text/html",

"Referer=http://64.242.155.45/callcenter/start.swe",

"Mode=HTML",

LAST);

Solutions:

  1. Change the Mode in "start.swe2" to “Mode=HTTP”

The idea behind changing the mode from HTML to HTTP is to avoid parsing the HTML page that is returned by the server, so that resources are not downloaded. This helps avoid multiple downloads of the same request.

If the script still fails on the first iteration, go to step 2. If the script fails on the second iteration onward, go to step 3.

2. Disable the Run-Time Viewer

If the script still fails on the first iteration after the change from step 1, try closing the Run-Time Viewer. This option is in VuGen's Tools -> General Options -> Display tab; clear the "Show Browser during Replay" option. For more information, refer to Problem ID 17234 - Errors in Web replay because of conflict with the runtime browser.

If the problem persists, refer to step 4.

3. Correlate SWECount or SWEC

If you are able to run the first iteration but the script fails on the second iteration onward, you will need to correlate SWECount (7.0.3) or SWEC (7.0.4) from the previous "start.sweXXX" step. For information about correlation, refer to Problem ID 11806 - What is correlation and how is it done.

If the problem persists, refer to step 4.

4. Run the script with the extended log

If none of the above helps, replay the script with the extended log and identify the HTTP request that is being downloaded multiple times. Search for a similar HTTP request being sent earlier in the execution log. Once you locate it, set "Mode=HTTP" so that the resources for that request are not downloaded, and try replaying the script again.
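For illustration, here is what step 1 looks like applied to the sample "start.swe2" request above, abbreviated to the lines that matter; only the Mode argument changes:

web_submit_data("start.swe2",
    "Action=http://64.242.155.45/callcenter/start.swe",
    "Method=POST",
    "RecContentType=text/html",
    "Referer=http://64.242.155.45/callcenter/start.swe?SWECmd=Start",
    "Mode=HTTP",   /* changed from "Mode=HTML" so the reply is not parsed for resources */
    ITEMDATA,
    /* the recorded Name/Value ENDITEM pairs stay exactly as they were */
    LAST);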

Creating DLLs for use with WinRunner

Creating C DLLs

These are the steps to create a DLL that can be loaded and called from WinRunner.

1. Create a new Win32 Dynamic Link Library project, name it, and click OK.

2. On Step 1 of 1, select "An empty DLL project," and click Finish.

3. Click OK in the New Project Information dialog.

4. Select File -> New from the VC++ IDE.

5. Select "C++ Source File," name it, and click OK.

6. Close the newly created C++ source file window.

7. In Windows Explorer, navigate to the project directory and locate the .cpp file you created.

8. Rename the .cpp file to a .c file.

9. Back in the VC++ IDE, select the FileView tab and expand the tree under the Project Files node.

10. Select the Source Files folder in the tree and select the .cpp file you created.

11. Press the Delete key; this will remove that file from the project.

12. Select Project -> Add To Project -> Files from the VC++ IDE menu.

13. Navigate to the project directory if you are not already there, and select the .c file that you renamed above.

14. Select the .c file and click OK. The file will now appear under the Source Files folder.

15. Double-click on the .c file to open it.

16. Create your functions in the following format:

#include "include1.h"
#include "include2.h"
.
.
.
#include "includen.h"

#define EXPORTED __declspec(dllexport)

EXPORTED <return type> <function name>(<type> <param1>,
                                       <type> <param2>,
                                       ...,
                                       <type> <paramn>)
{
    return <value>;
}
.
.
.
EXPORTED <return type> <function name>(<type> <param1>,
                                       <type> <param2>,
                                       ...,
                                       <type> <paramn>)
{
    return <value>;
}

17. Choose Build -> Build <project name>.dll from the VC++ IDE menu.

18. Fix any errors and repeat step 17.

19. Once the DLL has compiled successfully, it will be placed in either a Debug or a Release directory under your project folder, depending on your settings when you built it.

20. To change this setting, select Build -> Set Active Configuration from the VC++ IDE menu, and select the configuration you want from the dialog. Click OK, then rebuild the project (step 17).

21. All the DLL types that you are going to create are loaded and called in the same way in WinRunner. This process is covered once, in a later section.
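To make the template concrete, here is a minimal sketch of a complete .c source file with one exported function (the file and function names are hypothetical):

/* sample.c - one exported function following the format in step 16 */
#define EXPORTED __declspec(dllexport)

EXPORTED int add_numbers(int a, int b)
{
    return a + b;
}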

Creating C++ DLLs

Here are the steps for creating a C++ DLL:

1. Create a new Win32 Dynamic Link Library project, name it, and click OK.

2. On Step 1 of 1, select "An Empty DLL Project," and click Finish.

3. Click OK in the New Project Information dialog.

4. Select File -> New from the VC++ IDE.

5. Select C++ Source File, name it, and click OK.

6. Double-click on the .cpp file to open it.

7. Create your functions in the following format:

#include "include1.h"
#include "include2.h"
.
.
.
#include "includen.h"

#define EXPORTED extern "C" __declspec(dllexport)

EXPORTED <return type> <function name>(<type> <param1>,
                                       <type> <param2>,
                                       ...,
                                       <type> <paramn>)
{
    return <value>;
}
.
.
.
EXPORTED <return type> <function name>(<type> <param1>,
                                       <type> <param2>,
                                       ...,
                                       <type> <paramn>)
{
    return <value>;
}

8. Choose Build -> Build <project name>.dll from the VC++ IDE menu.

9. Fix any errors and repeat step 8.

10. Once the DLL has compiled successfully, it will be placed in either a Debug or a Release directory under your project folder, depending on your settings when you built it.

11. To change this setting, select Build -> Set Active Configuration from the VC++ IDE menu, and select the configuration you want from the dialog. Click OK, then rebuild the project (step 8).

12. All the DLL types that you are going to create are loaded and called in the same way in WinRunner. This process is covered once, in a later section.
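Again, a minimal sketch of a complete .cpp source file (hypothetical names). The extern "C" in the EXPORTED macro is what prevents C++ name mangling, so WinRunner can find the function by its plain name:

// sample.cpp - one exported function following the format in step 7
#define EXPORTED extern "C" __declspec(dllexport)

EXPORTED int multiply_numbers(int a, int b)
{
    return a * b;
}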

Creating MFC DLLs

1. Create a new MFC AppWizard(DLL) project, name it, and click OK.

2. In the MFC AppWizard Step 1 of 1, accept the default settings and click Finish.

3. Click OK in the New Project Information dialog.

4. Select the ClassView tab in the ProjectView and expand the classes tree. You will see a class with a name of the form C<ProjectName>App; expand this branch.

5. You should see the constructor function C<ProjectName>App(); double-click on it.

6. This should open the .cpp file for the project. At the very end of this file add the following definition:

#define EXPORTED extern "C" __declspec( dllexport )

7. Below it, add your functions in the following format:

EXPORTED <return type> <function name>(<type> <param1>,
                                       <type> <param2>,
                                       ...,
                                       <type> <paramn>)
{
    return <value>;
}
.
.
.
EXPORTED <return type> <function name>(<type> <param1>,
                                       <type> <param2>,
                                       ...,
                                       <type> <paramn>)
{
    return <value>;
}

8. You will see the functions appear under the Globals folder in the ClassView tab in the ProjectView.

9. Choose Build -> Build <project name>.dll from the VC++ IDE menu.

10. Fix any errors and repeat step 9.

11. Once the DLL has compiled successfully, it will be placed in either a Debug or a Release directory under your project folder, depending on your settings when you built it.

12. To change this setting, select Build -> Set Active Configuration from the VC++ IDE menu, and select the configuration you want from the dialog. Click OK, then rebuild the project (step 9).

13. All the DLL types that you are going to create are loaded and called in the same way in WinRunner. This process is covered once, in a later section.

Creating MFC Dialog DLLs

1. Create a new MFC AppWizard(DLL) project, name it, and click OK.

2. In the MFC AppWizard Step 1 of 1, accept the default settings and click Finish.

3. Click OK in the New Project Information dialog.

4. Select the ClassView tab in the ProjectView and expand the classes tree. You will see a class with a name of the form C<ProjectName>App; expand this branch also.

5. You should see the constructor function C<ProjectName>App(); double-click on it.

6. This should open the .cpp file for the project. At the very end of this file add the following definition:

#define EXPORTED extern "C" __declspec( dllexport )

7. Switch to the ResourceView tab in the ProjectView.

8. Select Insert -> Resource from the VC++ IDE menu.

9. Select Dialog from the Insert Resource dialog and click New.

10. The Resource Editor will open, showing you the new dialog. Add the controls you want to the dialog, and set the properties of the controls you added.

11. Switch to the ClassView tab in the ProjectView and select View -> ClassWizard from the VC++ IDE menu, or double-click on the dialog you are creating.

12. The Class Wizard should appear with an "Adding a Class" dialog in front of it. Select "Create a New Class" and click OK.

13. In the New Class dialog that comes up, give your new class a name and click OK.

14. In the Class Wizard, change to the Member Variables tab and create new variables for the controls you want to pass information to and from. Do this by selecting the control, clicking Add Variable, typing in the variable name, selecting the variable type, and clicking OK. Do this for each variable you want to create.

15. Switch to the Message Maps tab in the Class Wizard. Select the dialog class from the Object IDs list, then select the WM_PAINT message from the Messages list. Click Add Function, then Edit Code. This should bring up the function body for the OnPaint function.

16. Add the following lines to the OnPaint function so it looks like the following:

void <YourDialogClass>::OnPaint()
{
    CPaintDC dc(this); // device context for painting
    this->BringWindowToTop();
    UpdateData(FALSE);
    // Do not call CDialog::OnPaint() for painting messages
}

17. Select IDOK from the Object IDs list, then select the BN_CLICKED message from the Messages list. Click Add Function, accept the default name, and click Edit Code.

18. Add the line UpdateData(TRUE); to the function, so it looks like this:

void <YourDialogClass>::OnOK()
{
    UpdateData(TRUE);
    CDialog::OnOK();
}

19. When you are done with this, click OK to close the Class Wizard dialog and apply your changes. Your new class should appear in the ProjectView on the ClassView tab.

20. In the tree on the ClassView tab, double-click on the constructor function for the C<ProjectName>App class (see step 5).

21. At the top of the file, along with the other includes, add an include statement for the header file of your dialog class. It should have the same name as the class you created in step 13, with .h appended. If you are unsure of the name, you can look it up on the FileView tab under the Header Files folder.

22. At the very end of the file, after the #define you created in step 6, create a function that looks something like this:

EXPORTED int create_dialog(char* thestring)
{
    AFX_MANAGE_STATE(AfxGetStaticModuleState());
    <YourDialogClass> theDlg;
    theDlg.<member variable> = <value>;
    theDlg.DoModal();
    strcpy(thestring, theDlg.<member variable>); // this will pass the value back to WinRunner
    return 0;
}
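For illustration, here is the same function with the placeholders filled in, assuming the dialog class was named CMyDlg in step 13 and given a CString member m_strVar1 in step 14 (both names hypothetical). It goes after the #define from step 6:

EXPORTED int create_dialog(char* thestring)
{
    AFX_MANAGE_STATE(AfxGetStaticModuleState()); // required at DLL entry points that use MFC
    CMyDlg theDlg;                               // the dialog class from step 13
    theDlg.m_strVar1 = thestring;                // seed the control with the caller's value
    theDlg.DoModal();                            // show the dialog and wait for OK/Cancel
    strcpy(thestring, theDlg.m_strVar1);         // copy the (possibly edited) value back for WinRunner
    return 0;
}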

23. Choose Build -> Build <project name>.dll from the VC++ IDE menu.

24. Fix any errors and repeat step 23.

25. Once the DLL has compiled successfully, it will be placed in either a Debug or a Release directory under your project folder, depending on your settings when you built it.

26. To change this setting, select Build -> Set Active Configuration from the VC++ IDE menu, then select the configuration you want from the dialog. Click OK, then rebuild the project (step 23).

27. All the DLL types that you are going to create are loaded and called in the same way in WinRunner. This process is covered once, in the next section.

Loading and Calling the Above DLLs from WinRunner

Loading and calling DLLs from WinRunner is really very simple; there are only three steps, sketched below.

1. Load the DLL using the command load_dll.

2. Declare the function in the DLL as an external function using the extern function.

3. Call the function as you would any other TSL function.
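A minimal TSL sketch of the three steps, assuming the DLL lives at C:\dlls\sample.dll and exports the hypothetical add_numbers function from the C DLL section:

# load the DLL into WinRunner
load_dll("C:\\dlls\\sample.dll");

# declare the exported function so TSL can call it
extern int add_numbers(in int a, in int b);

# call it like any other TSL function
pause("2 + 3 = " & add_numbers(2, 3));

# unload it when finished (required before recompiling the DLL)
unload_dll("C:\\dlls\\sample.dll");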

As simple as this is, there are some things you need to be aware of.

1. WinRunner has a limited number of variable types: basically, there are string, int, and long. Windows has many more. Two common types that may confuse you are HWND and DWORD. Which WinRunner type do you choose for these? Declare them as long.

2. If you are building a function in a DLL and testing it in WinRunner, make sure you unload the DLL in WinRunner, using the unload_dll function, before you try to recompile the DLL. If you leave the DLL loaded in WinRunner and try to recompile it, you will receive an error message in VC++ that looks like this:

LINK : fatal error LNK1104: cannot open file "Debug/<project name>.DLL"

Error executing link.exe

To resolve this error, step through the unload_dll line in WinRunner, then compile the DLL.

3. Before shipping a DLL, make sure you compile it in Release mode. This will make the DLL much smaller and optimized.