Using formal methods for sophisticated static code analysis

In this Product How-To design article, Jay Abraham of Mathworks uses the company’s Polyspace code verifier to explain the use formal methods-based static code analysis to ensure high quality and verifiable embedded software.

Software underlies applications in a multitude of industries today. Aircraft, automobiles, industrial machinery, and medical devices all contain specialized software known as embedded software. This software is directly responsible for a variety of critical tasks. This software must be of high quality and must be thoroughly tested to verify it performs as expected.

In such critical systems, even simple operations performed with software can be fraught with risk. For example, consider an algorithm that requires the addition operation. If the underlying 32-bit microcontroller does not have a floating point unit, you need to be careful to avoid overflow conditions. Consider the operation in this code:

on image to enlarge.

The operation c = a + b is prone to risk because of the potential for an overflow. If the summation produces a result greater than 232 -1, the result will not fit in the 32-bit variable c, resulting in an incorrect summation. The microcontroller will not throw a fault or alarm causing the software to behave improperly.

Software developers and verification engineers are ultimately responsible for the quality of the software they produce. Although developers can use manual techniques to quantify the quality of the software, these techniques are not necessarily exhaustive. Automation tools offer more consistent and in some cases more powerful techniques for producing high-quality software, including an automation technique known as static code analysis.

Static Code Analysis

Static code analysis, also known as source code analysis or static analysis, is a software verification activity for analyzing source code for quality and reliability. This analysis enables software developers and testers to identify and diagnose errors such as overflows, divide-by-zero, and illegally dereferenced pointers.

Metrics produced by static code analysis provide a means for measuring and improving software quality. In contrast to other verification techniques, static code analysis can be performed without executing the program, developing test cases, or compiling the software program.

You can view static code analysis as an ultra-powerful code review process where code execution paths are analyzed, variable ranges are understood, and concurrent access paths to variables are known a priori.

Static code analysis tools can be developed by various means, ranging from simple checks, such as code style checks, to sophisticated formal methods. Simple static code analysis techniques can produce metrics that are indicative of the quality of software, or attempt to identify errors in the software program. These simple techniques are very fast, producing results quickly. However, they can also produce false negative and false positives.

With false negatives, the static code analyzer misses a real error in the source code. False positives are incorrectly identified errors. Both situations are problematic to software developers and verification engineers. False negatives produce a false sense of comfort and include bugs in the released product. False positives delay production of the software and result in unnecessary rework that may degrade performance or memory footprint of the software.

The following example shows the application of a simple static code analysis technique. Consider the following code fragment:

on image to enlarge.

Simple static code analysis techniques will quickly spot the divide-by-zero error in the operation b = 12/a. Compilers will identify this error and flag it during compile time. Now consider the following complex code.

on image to enlarge.

Let’s assume that function1 and function2 perform complex algorithmic and mathematical computation. In this situation, a divide-by-zero condition is possible for the operation c = b/a, but simple static code analysis techniques cannot definitely identify if this condition will occur and under what conditions the divide-by-zero might occur. At best they can flag that a divide-by-zero condition is possible, but they will not be able to conclusively state the conditions under which this might occur. If indeed a divide-by-zero will not occur, this would be known as a false positive.