How Code Coverage Works

This section details how Wing tracks code coverage statistics as edits are made, and how it decides which test results have been invalidated by edits to code.

Coverage Data Tracking

Wing tracks code coverage only when running unit tests. It does not collect coverage statistics while debugging tests, or when running or debugging other code.

Each test run collects coverage statistics, keeping track of what code is reached by each test that is run. When the run completes, Wing merges this data into the data collected from earlier test runs. It does this by first removing all coverage data for the tests that were re-run, and then merging the new data into the combined coverage data file. As a result, lines that were reached in an earlier run but are no longer reached by any test are not shown as still being reached.
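The remove-then-merge step can be illustrated with a minimal sketch. The data model here (a dictionary mapping a test id to the set of file and line pairs it reached) and the names merge_run and reached_lines are assumptions made for illustration; they are not Wing's actual data format or API.

```python
from typing import Dict, Set, Tuple

# Hypothetical data model: test id -> set of (filename, line number) reached.
Coverage = Dict[str, Set[Tuple[str, int]]]

def merge_run(combined: Coverage, new_run: Coverage) -> Coverage:
    """Return combined data with entries for re-run tests replaced."""
    # Step 1: drop all existing data for tests that were re-run.
    merged = {test: set(lines) for test, lines in combined.items()
              if test not in new_run}
    # Step 2: merge in the freshly collected data.
    for test, lines in new_run.items():
        merged[test] = set(lines)
    return merged

def reached_lines(data: Coverage) -> Set[Tuple[str, int]]:
    """Union of all lines reached by at least one test."""
    result: Set[Tuple[str, int]] = set()
    for lines in data.values():
        result |= lines
    return result

combined = {
    "test_a": {("mod.py", 1), ("mod.py", 2)},
    "test_b": {("mod.py", 1), ("mod.py", 5)},
}
# test_b was re-run and no longer reaches line 5.
new_run = {"test_b": {("mod.py", 1)}}
merged = merge_run(combined, new_run)
```

In this example, line 5 of mod.py drops out of the reached set after the merge, while the data for test_a, which was not re-run, is kept unchanged.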

Depending on the size of your test suite and code base, the CPU time consumed to merge new coverage data may be noticeable. Wing tries to minimize time spent by deferring some of the processing while tests are still actively running. As a result, coverage data shown in Wing's editor may not update for some time after tests stop running. The delay will depend on the size of your unit test suite and code base; in small code bases it is near zero.

Once coverage data has been collected, Wing tracks existing coverage data as follows:

1) The coverage status of each line is updated immediately during edits. Changed lines are marked as never reached, since no test has reached them in their new form. Lines that follow an edit are tracked upward or downward in the file according to line insertion and deletion.

2) When a file is saved or changed externally, Wing looks at all changes made since coverage data was last collected for that file and makes decisions about which unit test results have been invalidated, as detailed in the next section.
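The line tracking described in step 1 can be sketched as follows. This is only an illustration under assumed names: the Reached mapping and the functions apply_insert, apply_delete, and apply_change are hypothetical and do not reflect how Wing stores coverage internally.

```python
from typing import Dict, Set

# Hypothetical: 1-based line number -> ids of the tests that reached it.
Reached = Dict[int, Set[str]]

def apply_insert(reached: Reached, at_line: int, count: int) -> Reached:
    """Insert `count` new lines before `at_line`; later lines move down."""
    # Inserted lines get no entry, so they count as never reached.
    return {(line + count if line >= at_line else line): tests
            for line, tests in reached.items()}

def apply_delete(reached: Reached, first_line: int, count: int) -> Reached:
    """Delete `count` lines starting at `first_line`; later lines move up."""
    updated: Reached = {}
    for line, tests in reached.items():
        if line < first_line:
            updated[line] = tests
        elif line >= first_line + count:
            updated[line - count] = tests
    return updated

def apply_change(reached: Reached, line: int) -> Reached:
    """Mark an edited line as never reached in its new form."""
    return {n: tests for n, tests in reached.items() if n != line}

reached = {1: {"t1"}, 2: {"t1"}, 3: {"t2"}}
after_insert = apply_insert(reached, at_line=2, count=2)
after_delete = apply_delete(after_insert, first_line=2, count=2)
after_change = apply_change(reached, line=3)
```

Inserting two lines before line 2 moves the old lines 2 and 3 to lines 4 and 5; deleting the same two lines restores the original mapping.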

Test Invalidation

The process of tracking edits and determining which unit test results have been invalidated is relatively complex, in part because edits to some parts of a module's top level do not necessarily invalidate all the tests that imported that module.

The simplest example of this is an edit to a def line: Although any code that imported the module will have visited that line, the change usually only affects tests that actually called that def. Thus Wing invalidates only those tests that reached the first line of code in the function or method.
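A rough sketch of the def-line case follows. It uses the standard ast module to locate the first line of code in the edited function and then looks up which tests reached it. The function first_code_line, the reached mapping, and the test names are hypothetical; this approximates the idea and is not Wing's implementation.

```python
import ast

SOURCE = (
    "def add(a, b):\n"      # line 1: the edited def line
    "    return a + b\n"    # line 2
    "\n"
    "def sub(a, b):\n"      # line 4
    "    return a - b\n"    # line 5
)

def first_code_line(source, def_line):
    """Return the line of the first statement in the def at `def_line`."""
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
                and node.lineno == def_line):
            body = node.body
            # Skip a leading docstring, which coverage does not reach.
            if (len(body) > 1 and isinstance(body[0], ast.Expr)
                    and isinstance(body[0].value, ast.Constant)
                    and isinstance(body[0].value.value, str)):
                body = body[1:]
            return body[0].lineno
    return None

# Hypothetical coverage data: line number -> tests that reached it.
# Both tests import the module, so both reached the def lines.
reached = {
    1: {"test_add", "test_sub"},
    2: {"test_add"},
    4: {"test_add", "test_sub"},
    5: {"test_sub"},
}

body_line = first_code_line(SOURCE, def_line=1)
invalidated = reached.get(body_line, set())
```

Although both tests visited line 1 on import, only test_add entered the body of add, so only that test is invalidated in this sketch.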

Similarly, inserting a new function, method, or class without any call to it does not invalidate any test at all.

The same is usually true for a new import statement. Tests invalidated will be those that previously reached code at points where the newly imported symbol is used, and not all those that reached the scope where the import statement is added.
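As an illustration of the import case, the sketch below finds the lines where a newly imported symbol is used and collects the tests that reached those lines. The helper lines_using, the reached mapping, and the test names are assumptions for this example only, and the scenario is deliberately contrived.

```python
import ast

SOURCE = (
    "import json\n"                   # line 1: newly added import
    "\n"
    "def dump(data):\n"               # line 3
    "    return json.dumps(data)\n"   # line 4: uses the imported symbol
    "\n"
    "def size(data):\n"               # line 6
    "    return len(data)\n"          # line 7
)

def lines_using(source, symbol):
    """Return sorted line numbers where `symbol` appears as a name."""
    return sorted({node.lineno for node in ast.walk(ast.parse(source))
                   if isinstance(node, ast.Name) and node.id == symbol})

# Hypothetical coverage data: line number -> tests that reached it.
reached = {
    3: {"test_dump", "test_size"},
    4: {"test_dump"},
    6: {"test_dump", "test_size"},
    7: {"test_size"},
}

used_at = lines_using(SOURCE, "json")
invalidated = set()
for line in used_at:
    invalidated |= reached.get(line, set())
```

Here test_size reached the module scope where the import was added but never reached code that uses json, so it is left alone.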

Wing also looks at the content of changes made and ignores any that alter only trailing white space or comments.
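One way to detect such cosmetic changes is to compare token streams while skipping comments and blank-line tokens. The sketch below uses the standard tokenize module; the function names are hypothetical, and Wing may well use a different technique.

```python
import io
import tokenize

def significant_tokens(source):
    """Tokenize `source`, dropping comments and non-logical newlines."""
    skip = {tokenize.COMMENT, tokenize.NL}
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    # Compare only token type and text, not positions, so trailing
    # white space (which shifts columns) does not register as a change.
    return [(tok.type, "" if tok.type == tokenize.NEWLINE else tok.string)
            for tok in tokens if tok.type not in skip]

def is_cosmetic_change(old, new):
    """True if `old` and `new` differ only in comments or trailing space."""
    return significant_tokens(old) == significant_tokens(new)

old = "x = 1\ny = 2\n"
commented = "x = 1   # set x\ny = 2  \n"
changed = "x = 1\ny = 3\n"
```

With these inputs, is_cosmetic_change(old, commented) is true and is_cosmetic_change(old, changed) is false. Changes to leading white space are still detected, because indentation tokens are kept.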

When Wing determines that an edit probably does affect tests, it finds the tests to invalidate by looking backward and forward within the scope of the edit for lines at the same indent level, and then invalidates the tests that reached those lines. For an inserted range of lines, this check is done once for the indentation level of the first inserted line and again for the indentation level of the last inserted line.
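The neighbor search can be approximated as in the sketch below. It is one possible reading of the description above, assuming space-only indentation and treating any less-indented line as the end of the scope; the function names, the reached mapping, and the test names are hypothetical.

```python
def indent_of(text):
    """Number of leading spaces (tabs are not handled in this sketch)."""
    return len(text) - len(text.lstrip(" "))

def same_indent_neighbors(source, edited_line):
    """Find the nearest lines before and after `edited_line` (1-based)
    that share its indent, without leaving the enclosing scope."""
    lines = source.splitlines()
    level = indent_of(lines[edited_line - 1])
    found = []
    for step in (-1, 1):  # look backward, then forward
        number = edited_line + step
        while 1 <= number <= len(lines):
            text = lines[number - 1]
            if text.strip():  # ignore blank lines
                indent = indent_of(text)
                if indent < level:
                    break  # left the scope of the edit
                if indent == level:
                    found.append(number)
                    break
            number += step
    return found

SOURCE = (
    "def f(x):\n"           # line 1
    "    if x < 0:\n"       # line 2
    "        return 0\n"    # line 3
    "    y = x + 1\n"       # line 4
    "    z = y * 2\n"       # line 5: the edited line
    "    return z\n"        # line 6
)

# Hypothetical coverage data: line number -> tests that reached it.
reached = {
    2: {"test_negative", "test_positive"},
    3: {"test_negative"},
    4: {"test_positive"},
    6: {"test_positive"},
}

neighbors = same_indent_neighbors(SOURCE, edited_line=5)
invalidated = set()
for line in neighbors:
    invalidated |= reached.get(line, set())
```

In this example the neighbors of the edited line are lines 4 and 6, so only test_positive is invalidated; test_negative returned early and never reached them.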

These heuristics make invalidation of test results much more useful than blindly invalidating large numbers of tests. However, there are cases that Wing may miss as a result of this approach. For example, if a newly added import or a default value for an argument in a def invokes code with global side effects, then Wing may fail to invalidate test results for unit tests that are affected by the change.
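The following contrived example shows how a default argument value can have a global side effect. All names in it are made up for illustration.

```python
REGISTRY = []

def register(name):
    """Record `name` in module-level state and return it."""
    REGISTRY.append(name)
    return name

# The default value is evaluated once, when the def statement runs at
# import time, not when handler() is called.
def handler(kind=register("default")):
    return kind

def count_registered():
    return len(REGISTRY)
```

A test that checks only count_registered() never reaches the body of handler, so a heuristic based on the first line of the function body would not invalidate it. Its result still depends on the default value in the def line, because evaluating that default appended an entry to REGISTRY.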

Even without these optimizations, code coverage cannot possibly determine every factor that impacts test results. In a dynamic language like Python, nearly anything is possible. For example, docstrings are not reached by code coverage at all, but may be read by code in a way that affects the outcome of tests.
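A small, contrived illustration of the docstring case (the function name is made up):

```python
def greet():
    """Hello from the docstring"""
    # The return value is derived from the docstring text above.
    return greet.__doc__.upper()
```

Editing the docstring changes what greet() returns, even though the docstring line itself is never recorded as reached.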

Despite these limitations, Wing's test invalidation capabilities do make it easier and faster to verify whether code edits have introduced any problems, particularly when working with very large test suites.