The “Big Thaw” – An Agile Process for Software Certification

To achieve certification, safety-critical systems must demonstrate compliance with domain-specific standards such as DO-178 for commercial avionics. Developing a certified system consists of various interrelated activities that produce outputs (collections of artifacts) as evidence of successful completion. For example, one of the DO-178 verification activities is a traceability analysis; its output is a report showing that each software requirement is implemented in the source code. Conducting the certification-required activities and producing the artifacts demand a major effort, much more than for conventional Quality Assurance on non-safety-critical systems.
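As a rough illustration of what a traceability report captures, the sketch below maps each requirement to the source units that claim to implement it and lists the requirements with no implementation. The requirement IDs, file names, and the functions `trace_report` and `untraced` are hypothetical; real DO-178 traceability data is far richer and is normally produced by dedicated tools.

```python
def trace_report(requirements, trace_links):
    """Map each requirement ID to the sorted source units that trace to it.

    trace_links is a list of (source_unit, requirement_id) pairs.
    Links to unknown requirement IDs are ignored in this sketch.
    """
    report = {req: [] for req in requirements}
    for unit, req in trace_links:
        if req in report:
            report[req].append(unit)
    return {req: sorted(units) for req, units in report.items()}


def untraced(report):
    """Return the requirement IDs that no source unit implements."""
    return sorted(req for req, units in report.items() if not units)


# Hypothetical data, for illustration only.
requirements = ["REQ-1", "REQ-2", "REQ-3"]
trace_links = [
    ("autopilot.adb", "REQ-1"),
    ("sensors.adb", "REQ-1"),
    ("sensors.adb", "REQ-3"),
]

report = trace_report(requirements, trace_links)
missing = untraced(report)
```

Here `missing` identifies the requirements that would have to be resolved before the report could serve as evidence.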

Real-world systems rarely stay unchanged. New requirements arise, bugs may be detected, better hardware may become available. Software providers have needed to deal with these issues since the earliest days, but the challenge is especially acute for safety-certified systems. The basic issue is how to assess the impact of a change. Which activities need to be carried out, and which artifacts are affected? In short, what is needed to re-achieve system certification, and what will it cost?

Traditionally these questions have been difficult to answer, and as a result the general practice for certified systems has been to make few if any changes until a major upgrade is required. This situation is sometimes referred to as the “Big Freeze”. All components and tools are baselined: the application software, the platform (hardware, operating system, peripheral devices), the tools (compilers, linkers, static analysis tools, etc.). Baselining can raise other issues, of course, since support and maintenance costs for the baselined versions of third-party elements will typically be much higher than for the current releases.

A new solution to the “Big Freeze” problem is proposed here, drawing on principles from Agile Development. This approach, which we call the “Big Thaw”, involves continuous (or, more precisely, frequently iterated) performance of the activities that are affected by a change, to keep all artifacts up to date and to preserve the complete system’s certifiability. The approach is based on an analysis of the artifacts’ structure and interdependencies, since the artifacts are the concrete realization of the effect of the activities.
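One way to picture the analysis of artifact interdependencies is as a dependency graph in which a change to one artifact propagates to everything derived from it. The following is a minimal sketch under that assumption; the artifact names and the function `affected_artifacts` are hypothetical and are not taken from any standard or tool.

```python
from collections import deque


def affected_artifacts(depends_on, changed):
    """Return every artifact that transitively depends on a changed one.

    depends_on maps an artifact to the artifacts it is derived from.
    The changed artifacts themselves are not included in the result
    unless the graph contains a cycle leading back to them.
    """
    # Invert the edges: for each artifact, who is derived from it?
    dependents = {}
    for artifact, sources in depends_on.items():
        for source in sources:
            dependents.setdefault(source, set()).add(artifact)

    affected = set()
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, ()):
            if dependent not in affected:
                affected.add(dependent)
                queue.append(dependent)
    return affected


# Hypothetical artifact graph, for illustration only.
depends_on = {
    "source_code": ["low_level_requirements"],
    "test_cases": ["low_level_requirements"],
    "object_code": ["source_code", "compiler"],
    "test_results": ["object_code", "test_cases"],
    "traceability_report": ["low_level_requirements", "source_code"],
}

impacted = affected_artifacts(depends_on, ["compiler"])
```

In this toy graph a compiler upgrade touches only the object code and the test results, which suggests which activities would need to be repeated; a requirements change would reach every artifact in the graph.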

Process- versus product-based certification
Software certification standards fall into two main categories. A process-based standard relies on evidence that the various activities associated with development have been performed successfully. DO-178B [1] is an example of such a standard. Figure 1 shows the three general process categories (Planning, Development, and Integral) and some of the associated outputs. The standard defines a set of ten specific processes in these three categories. Each process is captured by a table that lists its associated objectives; in total there are 66 objectives. For each objective there are specific actions to be performed, and a corresponding output consisting of one or more artifacts. A system safety assessment establishes the level for each software component, ranging from E (no effect on safety) to A (anomalous behavior can cause a catastrophic failure and prevent continued safe flight and landing). The level in turn determines which objectives need to be met.

The underlying logic is that a safe system can only be achieved through the performance of sound software engineering processes, which can in turn be assessed through an evaluation of the various artifacts that they produce. Although this means that the inference of system safety is indirect, DO-178B has proved to be successful in practice. In the nearly twenty years since this standard’s inception, no commercial aircraft fatality has been attributed to DO-178B-certified software.

Standards evolve based on application experience and new technologies, and DO-178B is no exception. A new version, DO-178C [2], is expected to be finalized in the near future. It will address several modern software methodologies – in particular Object-Oriented Technology, Model-Based Design, and Formal Methods – but the overall approach remains strongly process-based.

In contrast, a product-based (or goal-based) certification standard provides a more direct assessment of software safety. Examples are the UK Ministry of Defence Standard 00-56 [3] and the FDA’s Draft Guidance on infusion pump premarket notification submissions [4], which call for an assurance case approach [5]. The developer provides assurance cases consisting of claims concerning the relevant system attributes, arguments justifying those claims, and evidence backing up the arguments. For safety-critical systems the assurance cases are known as safety cases, and the relevant attributes are properties related to the system’s safety. Safety cases are typically hierarchical, with higher-level attributes broken down into lower layers. The safety cases are the certification artifacts; the activities are implicit.
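The claim/argument/evidence hierarchy can be sketched as a small tree structure. The example below is only an illustration: the `Claim` class, the `unsupported` helper, and the sample claims and evidence names are all hypothetical, and real safety cases use much richer notations.

```python
from dataclasses import dataclass, field


@dataclass
class Claim:
    """One node of an assurance case (hypothetical minimal model)."""
    statement: str
    argument: str = ""                              # why the support justifies the claim
    evidence: list = field(default_factory=list)    # names of supporting artifacts
    subclaims: list = field(default_factory=list)   # lower-level claims


def unsupported(claim):
    """Return the statements of leaf claims that have no evidence attached."""
    if not claim.subclaims:
        return [] if claim.evidence else [claim.statement]
    result = []
    for sub in claim.subclaims:
        result.extend(unsupported(sub))
    return result


# Hypothetical safety case, for illustration only.
case = Claim(
    statement="The system is acceptably safe",
    argument="Each identified hazard is mitigated",
    subclaims=[
        Claim(
            statement="Hazard H1 is mitigated",
            argument="Testing exercises the mitigation",
            evidence=["test_report_h1"],
        ),
        Claim(
            statement="Hazard H2 is mitigated",
            argument="Static analysis rules out the fault",
        ),
    ],
)

gaps = unsupported(case)
```

Here `gaps` lists the leaf claims that still lack evidence. When the system changes, the evidence attached to the affected claims is what would need to be regenerated.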

Both process- and product-based approaches result in artifacts that are affected as a system evolves. The Big Thaw approach applies to both.