Fault injection


Fault injection

In software testing, fault injection is a technique for improving the coverage of a test by introducing faults in order to test code paths, in particular error handling code paths, that might otherwise rarely be followed. It is often used with stress testing and is widely considered to be an important part of developing robust software [J. Voas, "Fault Injection for the Masses," Computer, vol. 30, pp. 129-130, 1997.] .

The propagation of a fault through to an observable failure follows a well defined cycle. When executed, a fault may cause an error, which is an invalid state within a system boundary. An error may cause further errors within the system boundary, therefore each new error acts as a fault, or it may propagate to the system boundary and be observable. When error states are observed at the system boundary they are termed failures. This mechanism is termed the fault-error-failure cycle [A. Avizienis, J.-C. Laprie, B. Randell, and C. Landwehr, "Basic Concepts and Taxonomy of Dependable and Secure Computing," Dependable and Secure Computing, vol. 1, pp. 11-33, 2004.] and is a key mechanism in dependability.

History

The technique of fault injection dates back to the 1970s [J. V. Carreira, D. Costa, and S. J. G, "Fault Injection Spot-Checks Computer System Dependability," IEEE Spectrum, pp. 50-55, 1999.] when it was first used to induce faults at a hardware level. This type of fault injection is called Hardware Implemented Fault Injection (HWIFI) and attempts to simulate hardware failures within a system. The first experiments in hardware fault injection involved nothing more than shorting connections on circuit boards and observing the effect on the system (bridging faults). It was used primarily as a test of the dependability of the hardware system. Later specialised hardware was developed to extend this technique, such as devices to bombard specific areas of a circuit board with heavy radiation. It was soon found that faults could be induced by software techniques and that aspects of this technique could be useful for assessing software systems. Collectively these techniques are known as Software Implemented Fault Injection (SWIFI).

Software Implemented Fault Injection

SWIFI techniques can be categorised into two types: Compile-Time Injection and Runtime Injection.

Compile-Time Injection is an injection technique where source code is modified to inject simulated faults into a system. One method is called Code Mutation which changes existing lines of code so that they contain faults. A simple example of this technique could be changing

a = a + 1 to a = a – 1

Code mutation produces faults which are very similar to those unintentionally added by programmers.

A refinement of code mutation is Code Insertion Fault Injection which adds code, rather than modifies existing code. This is usually done through the use of perturbation functions which are simple functions which take an existing value and perturb it via some logic into another value, for example

int pFunc(int value) { return value + 20; } int main(int argc, char * argv [] ) { int a = pFunc(aFunction(atoi(argv [1] ))); if (a > 20) { /* do something */ } else { /* do something else */ } }

In this case pFunc is the perturbation function and it is applied to the return value of the function that has been called introducing a fault into the system.

Runtime Injection techniques use a software trigger to inject a fault into a running software system. Faults can be injected via a number of physical methods and triggers can be implemented in a number of ways, such as: Time base triggers (When the timer reaches a specified time an interrupt is generated and the interrupt handler associated with the timer can inject the fault. ); Interrupt Based Triggers (Hardware exceptions and software trap mechanisms are used to generate an interrupt at a specific place in the system code or on a particular event within the system, for instance access to a specific memory location).

Runtime injection techniques can use a number of different techniques to insert faults into a system via a trigger.

* Corruption of memory space: This technique consists of corrupting RAM, processor registers, and I/O map.
* Syscall interposition techniques: This is concerned with the fault propagation from operating system kernel interfaces to executing systems software. This is done by intercepting operating system calls made by user-level software and injecting faults into them.
* Network Level fault injection: This technique is concerned with the corruption, loss or reordering of network packets at the network interface.

These techniques are often based around the debugging facilities provided by computer processor architectures.

Fault Injection Tools

Although these types of faults can be injected by hand the possibility of introducing an unintended fault is high, so tools exist to parse a program automatically and insert faults.

Research Tools

A number of SWIFI Tools have been developed and a selection of these tools is given here. Six commonly used fault injection tools are Ferrari, FTAPE , Doctor, Orchestra, Xception and Grid-FIT.

* Ferrari (Fault and Error Automatic Real-Time Injection) is based around software traps that inject errors into a system. The traps are activated by either a call to a specific memory location or a timeout. When a trap is called the handler injects a fault into the system. The faults can either be transient or permanent. Research conducted with Ferrari shows that error detection is dependent on the fault type and where the fault is inserted [G. A. Kanawati, N. A. Kanawati, and J. A. Abraham, "FERRARI: A Flexible Software-Based Fault and Error Injection System," IEEE Transactions on Computers, vol. 44, pp. 248, 1995.] .
* FTAPE (Fault Tolerance and Performance Evaluator) can inject faults, not only into memory and registers, but into disk accesses as well. This is achieved by inserting a special disk driver into the system that can inject faults into data sent and received from the disk unit. FTAPE also has a synthetic load unit that can simulate specific amounts of load for robustness testing purposes [T. Tsai and R. Iyer, "FTAPE: A Fault Injection Tool to Measure Fault Tolerance," presented at Computing in aerospace, San Antonio; TX, 1995.] .
* DOCTOR (IntegrateD SOftware Fault InjeCTiOn EnviRonment) allows injection of memory and register faults, as well as network communication faults. It uses a combination of time-out, trap and code modification. Time-out triggers inject transient memory faults and traps inject transient emulated hardware failures, such as register corruption. Code modification is used to inject permanent faults [S. Han, K. G. Shin, and H. A. Rosenberg, "DOCTOR: An IntegrateD SOftware Fault InjeCTiOn EnviRonment for Distributed Real-time Systems," presented at International Computer Performance and Dependability Symposium, Erlangen; Germany, 1995.] .
* Orchestra is a script driven fault injector which is based around Network Level Fault Injection. Its primary use is the evaluation and validation of the fault-tolerance and timing characteristics of distributed protocols. Orchestra was initially developed for the Mach Operating System and uses certain features of this platform to compensate for latencies introduced by the fault injector. It has also been successfully ported to other operating systems [S. Dawson, F. Jahanian, and T. Mitton, "ORCHESTRA: A Probing and Fault Injection Environment for Testing Protocol Implementations," presented at International Computer Performance and Dependability Symposium, Urbana-Champaign, USA, 1996.] .
* Xception is designed to take advantage of the advanced debugging features available on many modern processors. It is written to require no modification of system source and no insertion of software traps, since the processor's exception handling capabilities trigger fault injection. These triggers are based around accesses to specific memory locations. Such accesses could be either for data or fetching instructions. It is therefore possible to accurately reproduce test runs because triggers can be tied to specific events, instead of timeouts [J. V. Carreira, D. Costa, and S. J. G, "Fault Injection Spot-Checks Computer System Dependability," IEEE Spectrum, pp. 50-55, 1999.] .
* Grid-FIT (Grid – Fault Injection Technology) [ [http://wiki.grid-fit.org/ Grid-FIT Web-site] ] is a dependability assessment method and tool for assessing Grid services by fault injection. Grid-FIT is derived from an earlier fault injector WS-FIT [N. Looker, B. Gwynne, J. Xu, and M. Munro, "An Ontology-Based Approach for Determining the Dependability of Service-Oriented Architectures," in the proceedings of the 10th IEEE International Workshop on Object-oriented Real-time Dependable Systems, USA, 2005.] which was targeted towards Java Web Services implemented using Apache Axis transport. Grid-FIT utilises a novel fault injection mechanism that allows network level fault injection to be used to give a level of control similar to Code Insertion fault injection whilst being less invasive [N. Looker, M. Munro, and J. Xu, "A Comparison of Network Level Fault Injection with Code Insertion," in the proceedings of the 29th IEEE International Computer Software and Applications Conference, Scotland, 2005.] .

Commercial Tools

* ExhaustiF is a commercial software tool used for grey box testing based on software fault injection (SWIFI) to improve reliability of software intensive systems. The tool can be used during system integration and system testing phases of any software development lifecycle, complementing other testing tools as well. ExhaustiF is able to inject faults into both software and hardware. When injecting simulated faults in software, ExhaustiF offers the following fault types: Variable Corruption and Procedure Corruption. The catalogue for hardware fault injections includes faults in Memory (I/O, RAM) and CPU (Integer Unit, Floating Unit). There are different versions available for RTEMS/ERC32, RTEMS/Pentium, Linux/Pentium and MS-Windows/Pentium. [ [http://www.exhaustif.es ExhaustiF SWIFI Tool Site] ]

* Xception [ [http://www.xception.org Xception Web Site] ] is a commercial software tool developed by Critical Software SA [ [http://www.criticalsoftware.com Critical Software SA] ] used for black box and white box testing based on software fault injection (SWIFI) and Scan Chain fault injection (SCIFI). Xception allows users to test the robustness of their systems or just part of them, allowing both Software fault injection and Hardware fault injection for a specific set of architectures. The tool has been used in the market since 1999 and has customers in the American, Asian and European markets, especially in the critical market of aerospace and the telecom market. The full Xception™ product family includes: a) The main Xception™ tool, a state-of-the-art leader in Software Implemented Fault Injection (SWIFI) technology; b) The Easy Fault Definition (EFD) and Xtract (Xception Analysis Tool) add-on tools; c) The extended Xception™ tool (eXception), with the fault injection extensions for Scan Chain and pin-level forcing.

Application of Fault Injection

Fault injection can take many forms. In the testing of operating systems for example, fault injection is often performed by a "driver" (kernel-mode software) that intercepts "system calls" (calls into the kernel) and randomly returning a failure for some of the calls. This type of fault injection is useful for testing low level user mode software. For higher level software, various methods inject faults. In managed code, it is common to use instrumentation. Although fault injection can be undertaken by hand a number of fault injection tools exist to automate the process of fault injection [N. Looker, M. Munro, and J. Xu, "Simulating Errors in Web Services," International Journal of Simulation Systems, Science & Technology, vol. 5, 2004.] .

Depending on the complexity of the API for the level where faults are injected, fault injection tests often must be carefully designed to minimise the number of false positives. Even a well designed fault injection test can sometimes produce situations that are impossible in the normal operation of the software. For example, imagine there are two API functions, Commit and PrepareForCommit, such that alone, each of these functions can possibly fail, but if PrepareForCommit is called and succeeds, a subsequent call to Commit is guaranteed to succeed. Now consider the following code:

error = PrepareForCommit(); if (error = SUCCESS) { error = Commit(); assert(error = SUCCESS); }

Often, it will be infeasible for the fault injection implementation to keep track of enough state to make the guarantee that the API functions make. In this example, a fault injection test of the above code might hit the assert, whereas this would never happen in normal operation.

See also

* Bebugging
* Mutation testing

References

External links

* [http://www.cs.colostate.edu/casi/REPORTS/Bieman95.pdf Using Fault Injection to Test Software Recovery Code] by Colorado Advanced Software Institute.
* [http://www.certess.com/product/ Certitude Software from Certess Inc.]


Wikimedia Foundation. 2010.

Look at other dictionaries:

  • Fault (geology) — Part of a series on earthquakes Types Foreshock • Aftershock • Blind thrust Doublet • Interplate • …   Wikipedia

  • Irma's injection — is the name Sigmund Freud gave to a dream of his. It is the dream with which he opens his seminal work The Interpretation of Dreams , and which forms the linchpin of the analysis in that book.Freud describes the dream, which he records as taking… …   Wikipedia

  • Sécurité matérielle des cartes à puce — La sécurité matérielle des cartes à puce et des autres microcontrôleurs est l un des éléments clefs de la sécurité des informations sensibles qu ils manipulent. La littérature scientifique a produit un grand nombre de publications visant à… …   Wikipédia en Français

  • Внесение неисправностей — Внесение неисправностей  метод, используемый в тестировании программного обеспечения. Предполагает искусственное внесение разного рода неисправностей для тестирования отказоустойчивости и, в частности, обработки исключений. Обычно… …   Википедия

  • Fuzz testing — Fuzzing redirects here. For other uses, see Fuzz (disambiguation). Fuzz testing or fuzzing is a software testing technique, often automated or semi automated, that involves providing invalid, unexpected, or random data to the inputs of a computer …   Wikipedia

  • Bebugging — (or Fault Seeding) is a popular software engineering technique used in the 70s to measure test coverage. Known bugs are randomly added to a program source code and the programmer is tasked to find them. The percentage of the known bugs not found… …   Wikipedia

  • Dependability — is a value showing the reliability of a person to others because of his/her integrity, truthfulness, and trustfulness, traits that can encourage someone to depend on him/her. The wider use of this noun is in Systems engineering. Dependability as… …   Wikipedia

  • Cluster manager — A Cluster manager usually is a backend GUI or command line software that runs on one or all cluster nodes (in some cases it runs on a different server or cluster of management servers.) The cluster manager works together with a cluster management …   Wikipedia

  • Codenomicon — Type Privately held company Founded 2001 Headquarters Oulu, Finland Area served worldwide Products …   Wikipedia

  • Software testing — is an empirical investigation conducted to provide stakeholders with information about the quality of the product or service under test [ [http://www.kaner.com/pdfs/ETatQAI.pdf Exploratory Testing] , Cem Kaner, Florida Institute of Technology,… …   Wikipedia


Share the article and excerpts

Direct link
Do a right-click on the link above
and select “Copy Link”

We are using cookies for the best presentation of our site. Continuing to use this site, you agree with this.