WDT in safety standards
With the prevalence of microcontrollers (MCUs) as processing units in safety-related systems (SRS) comes the need for diagnostic measures that will ensure safe operation. IEC 61508-2 specifies self-test supported by hardware (one channel) as one of the recommended diagnostic techniques for processing units. This measure uses special hardware that increases speed and extends the scope of the failure detection, for instance, a watchdog timer (WDT) IC that cyclically monitors the output of a certain bit pattern from the MCU.
The basic functional safety (FS) standard IEC 61508-2 Annex A Table A.10 recommends several diagnostic techniques and measures to control hardware failures in the program sequences of digital devices. Such techniques include a watchdog with a separate time base with or without a time window, as well as a combination of temporal and logical monitoring of program sequences. While each of these has corresponding maximum claimable diagnostic coverage, all these techniques employ WDTs.
This article shows how to implement these diagnostic functions using WDTs. It also compares the program sequence monitoring diagnostic measures in terms of operation and diagnostic coverage when implemented with ADI’s high-performance supervisory circuits with watchdog function.
Low diagnostic coverage
Part 2 of IEC 61508 describes simple watchdogs as external timing elements with a separate time base. Such devices allow the detection of program sequence failures in a computer device, such as an MCU, within a specified interval. This is done through a mechanism that allows one of two outcomes:
- The MCU issues a signal to reset the watchdog before it reaches the timeout
- The watchdog timeout period is reached, so the watchdog issues a reset signal to the MCU
The first outcome occurs when the program sequence is running smoothly, while the second happens when it is not.
Figure 1a shows an example of the watchdog implementation with a separate time base but without a time window using the MAX6814. Notably, MCUs usually have an internal WDT, but it cannot be solely relied on to detect a fault because it is part of the potentially defective MCU, which is a concern when considering common cause failures (CCF).
To address such CCF concerns, a separate WDT is used to ensure the MCU is placed in reset [1, 2]. Through a flowchart, Figure 1b illustrates the behavior of the WDT as embedded in the MCU’s program execution. Before the flow starts, it’s important to set the watchdog timeout period or the WDT’s maximum reset interval. When such a period or interval is defined, the WDT will run upon execution of the program. The MCU must be able to send a signal to the MAX6814’s WDI pin before it reaches timeout, as the device will issue a reset signal to the MCU if the timeout period is reached. When the MCU resets, the system will be placed into a safe state.
Figure 1 Simple watchdog operation showing (a) an example of the watchdog implementation with a separate time base but without a time window and (b) the behavior of the WDT as embedded in the MCU’s program execution. Source: Analog Devices
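The flow in Figure 1b can be approximated with a small simulation. This is only a sketch under stated assumptions: the class name, the 100 ms timeout, and the loop durations are hypothetical values chosen for illustration and are not taken from the MAX6814 data sheet.

```python
class SimpleWatchdog:
    """Simulated external watchdog with its own time base and no time window."""

    def __init__(self, timeout_ms):
        self.timeout_ms = timeout_ms      # hypothetical timeout period
        self.last_kick_ms = 0
        self.reset_asserted = False

    def check(self, now_ms):
        """Assert reset once the timeout elapses without a kick."""
        if now_ms - self.last_kick_ms >= self.timeout_ms:
            self.reset_asserted = True
        return self.reset_asserted

    def kick(self, now_ms):
        """Model a pulse on the watchdog input: restart the timeout counter."""
        if not self.check(now_ms):
            self.last_kick_ms = now_ms


def run_main_loop(wdt, loop_durations_ms):
    """Kick at the end of every loop pass; return True if a reset occurred."""
    now_ms = 0
    for duration in loop_durations_ms:
        now_ms += duration
        wdt.kick(now_ms)
        if wdt.reset_asserted:
            return True
    return False


healthy = run_main_loop(SimpleWatchdog(timeout_ms=100), [40, 40, 40])
stuck = run_main_loop(SimpleWatchdog(timeout_ms=100), [40, 250, 40])
too_fast = run_main_loop(SimpleWatchdog(timeout_ms=100), [40, 1, 1])
print(healthy, stuck, too_fast)
```

In this model, the pass that takes 250 ms triggers a reset, while the passes that finish after only 1 ms go unnoticed, because a watchdog without a time window has no lower timing bound.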
Such a WDT’s timeout period will capture program sequence issues; for example, a program sequence gets stuck in a loop, or an interrupt service routine does not return in time. For instance, only 5 of the 10 subroutines meant to be run on every loop of the software are executed.
However, the WDT’s timeout period will not cover other program sequence issues, such as whether execution of the program took longer or shorter than expected, or whether the program sections were executed in the correct sequence. The next type of WDT addresses this.
Medium diagnostic coverage
Because a separate time window allows the detection of both excessive delays and premature execution, windowed WDTs require the MCU to respond neither earlier nor later than the WDT’s open window. This is also referred to as a valid window specification. Compared to simple watchdogs, a windowed WDT guarantees that all subroutines are executed by the program in a timely manner; otherwise, it will place the MCU into reset [3].
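The valid window can be illustrated with a short simulation. It is a sketch only: the window limits (30 ms and 100 ms) and all names are hypothetical, and a real device asserts reset as soon as the window closes rather than waiting for a late kick, which this model reports at the next kick for simplicity.

```python
class WindowedWatchdog:
    """Simulated windowed watchdog: a kick is valid only inside the open window."""

    def __init__(self, window_open_ms, window_close_ms):
        self.window_open_ms = window_open_ms    # earliest valid kick (hypothetical)
        self.window_close_ms = window_close_ms  # latest valid kick (hypothetical)
        self.last_kick_ms = 0

    def kick(self, now_ms):
        """Classify a kick as 'ok', 'early', or 'late' relative to the last valid kick."""
        elapsed = now_ms - self.last_kick_ms
        if elapsed < self.window_open_ms:
            return "early"
        if elapsed > self.window_close_ms:
            return "late"
        self.last_kick_ms = now_ms
        return "ok"


def first_fault(wdt, loop_durations_ms):
    """Run the loop passes in order and return the first fault, or 'ok' if none."""
    now_ms = 0
    for duration in loop_durations_ms:
        now_ms += duration
        result = wdt.kick(now_ms)
        if result != "ok":
            return result
    return "ok"


on_time = first_fault(WindowedWatchdog(30, 100), [50, 50, 50])
premature = first_fault(WindowedWatchdog(30, 100), [50, 5])
delayed = first_fault(WindowedWatchdog(30, 100), [50, 150])
print(on_time, premature, delayed)
```

The premature case is the one a watchdog without a time window would miss: a loop pass that finishes in 5 ms suggests that part of the program was skipped.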
Figure 2 shows an example implementation of program sequence monitoring using the MAX6753. It comes with a windowed watchdog with external-capacitor-configurable watchdog periods.
Figure 2 Sample implementation of a windowed watchdog operation with external-capacitor-configurable watchdog periods.
Figure 3, on the other hand, shows another implementation using the MAX42500, whose watchdog time settings can be configured through I2C, which reduces the number of external components. The I2C interface also makes it possible to increase fault coverage through a packet error checking (PEC) byte, as shown in Figure 4. The PEC byte increases diagnostic coverage against I2C communication-related failures such as bus errors, stuck-bus conditions, timing problems, and improper configuration.
Figure 3 Another implementation: windowed watchdog through I2C, reducing the number of external components compared to Figure 2. Source: Analog Devices
Figure 4 PEC byte coverage to I2C interface failures, such as bus errors, stuck-bus conditions, timing problems, and improper configuration. Source: Analog Devices
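To show how a PEC byte exposes a corrupted transfer, the sketch below assumes an SMBus-style PEC, which is a CRC-8 with the polynomial x^8 + x^2 + x + 1 computed over every byte of the transaction. Whether the MAX42500 uses exactly this scheme should be confirmed against its data sheet; the device address, register, and data values are hypothetical.

```python
def crc8_pec(data: bytes) -> int:
    """CRC-8 with polynomial 0x07, initial value 0, as used for SMBus PEC."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ 0x07) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc


def build_write_frame(addr7: int, register: int, value: int) -> bytes:
    """Address byte (write), register, data, followed by the PEC byte."""
    frame = bytes([(addr7 << 1) & 0xFE, register, value])
    return frame + bytes([crc8_pec(frame)])


def frame_is_valid(frame: bytes) -> bool:
    """The CRC over a frame that includes its own PEC byte is zero."""
    return crc8_pec(frame) == 0


# Hypothetical address, register, and value, for illustration only.
frame = build_write_frame(addr7=0x30, register=0x10, value=0x5A)
corrupted = bytes([frame[0], frame[1], frame[2] ^ 0x01, frame[3]])
print(frame_is_valid(frame), frame_is_valid(corrupted))
```

A single flipped data bit changes the CRC, so the receiver can reject the frame instead of applying a wrong watchdog setting.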
While watchdogs with a separate time base and time window offer higher diagnostic coverage compared to simple WDTs, they still cannot capture issues concerning whether the software’s subroutines have been executed in the correct sequence. This is what the next type of diagnostic technique addresses.
High diagnostic coverage
Diagnostic techniques involving the combination of temporal and logical monitoring provide high diagnostic coverage to program sequences according to IEC 61508-2. One implementation of this technique involves a windowed watchdog and a capability to check whether the program sequence has been executed in the correct order.
An example can be visualized when the circuit in Figure 2 is combined with the sequence in Figure 5, where the MCU has each of its program routines employing a unique combination of characters and digits. Such unique combinations are then placed in an array each time a routine is executed. After the last routine, the MCU will only kick, or send a reset signal to, the watchdog if all words are correctly set in the array.
Figure 5 Checking the correct logic of the sequence through markers. Source: Analog Devices
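The marker scheme in Figure 5 can be sketched as follows. The routine names and marker strings are hypothetical, and the kick is modeled as a callback; in a real system it would drive the watchdog input, and the windowed watchdog would still enforce the timing.

```python
# Hypothetical markers: one unique word per routine, in the required order.
EXPECTED_MARKERS = ["RD01", "CT02", "WR03"]


def read_inputs(markers):
    markers.append("RD01")


def run_control(markers):
    markers.append("CT02")


def write_outputs(markers):
    markers.append("WR03")


def run_cycle(routines, kick_watchdog):
    """Run one program cycle; kick only if every marker is present and in order."""
    markers = []
    for routine in routines:
        routine(markers)
    if markers == EXPECTED_MARKERS:
        kick_watchdog()
        return True
    return False


kicks = []


def kick():
    kicks.append("kick")


in_order = run_cycle([read_inputs, run_control, write_outputs], kick)
swapped = run_cycle([run_control, read_inputs, write_outputs], kick)
skipped = run_cycle([read_inputs, write_outputs], kick)
print(in_order, swapped, skipped, len(kicks))
```

When a routine is skipped or runs out of order, no kick is sent, so the watchdog times out and resets the MCU.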
Highest diagnostic coverage
In some systems, more diagnostic coverage may be required to capture failures of the MCU, which means that simply sending back a pulse within a time window may not be enough. In such cases, it may be beneficial to require the MCU to perform a more complex task, such as a calculation, to ensure that it’s fully operational. This is where the MAX42500’s challenge/response watchdog can come into play.
In this watchdog mode, there’s a key-value register in the IC that must be read as the starting point of the challenge message. The MCU must use this message to calculate the appropriate response to send back to the watchdog IC, ensuring the watchdog is kicked within the valid window. This type of challenge/response watchdog operates similarly to a simple windowed one, except that the key register is updated rather than the watchdog being refreshed with a rising edge. This is shown in Figure 6. Notably, for the MAX42500’s WDT, the watchdog input is implemented using the I2C, while the watchdog output is the output reset pin.
Figure 6 A challenge/response windowed watchdog example where the MCU reads the challenge message in the IC and calculates an appropriate response to be sent back to the watchdog IC to allow it to be kicked within the valid window. Source: Analog Devices
The MAX42500 contains a linear-feedback shift key (LFSK) register with a polynomial of x^8 + x^6 + x^5 + x^4 + 1 that will shift all bits upward towards the most significant bit (MSB) and insert the calculated bit as the new least significant bit (LSB). With this, the MCU must compute the response in this manner and return it to the register of the MAX42500 through I2C. Notably, such a polynomial is identified as primitive and at the same time, a maximal length feedback polynomial for 8 bits. This ensures that all bit value combinations (1 to 255) are generated by the polynomial, and the order of the numbers is indeed pseudo-random [4][5].
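The response calculation can be sketched as below. The tap positions (bits 7, 5, 4, and 3) are an assumption derived from the polynomial exponents 8, 6, 5, and 4; the exact bit mapping, register names, and I2C transactions should be taken from the MAX42500 data sheet. The watchdog class is a hypothetical stand-in for the device, not its actual interface.

```python
def next_key(key: int) -> int:
    """Shift toward the MSB and insert the XOR of the tapped bits as the new LSB."""
    feedback = ((key >> 7) ^ (key >> 5) ^ (key >> 4) ^ (key >> 3)) & 1
    return ((key << 1) | feedback) & 0xFF


def sequence_length(seed: int) -> int:
    """Count the steps until the register returns to its starting value."""
    key = next_key(seed)
    count = 1
    while key != seed:
        key = next_key(key)
        count += 1
    return count


class ChallengeResponseWatchdog:
    """Hypothetical model of the IC side: accepts only the correct next key."""

    def __init__(self, seed: int = 0x01):
        self.key = seed

    def read_key(self) -> int:
        return self.key

    def write_response(self, response: int) -> bool:
        expected = next_key(self.key)
        if response != expected:
            return False          # wrong answer: treated as a watchdog fault
        self.key = expected       # correct answer: key updated, watchdog refreshed
        return True


wdt = ChallengeResponseWatchdog()
good = wdt.write_response(next_key(wdt.read_key()))
bad = wdt.write_response(wdt.read_key())   # echoing the key back is not enough
print(sequence_length(0x01), good, bad)
```

Under these assumptions the register cycles through all 255 nonzero values before repeating, which matches the maximal-length property described above.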
Such a challenge/response can offer more coverage than the combination of temporal and logical program sequence monitoring, as it shows that the MCU can still do actual calculations. This is as opposed to an MCU just implementing decision-making routines, such as only checking whether the array of words is correct before issuing a signal to reset the watchdog.
Diagnostic coverage claims
The basic functional safety standard defines a maximum claimable diagnostic coverage for each diagnostic measure recommended per block in an SRS. Table 1 lists the program sequence monitoring measures in IEC 61508 that use WDTs, together with the maximum diagnostic coverage considered achievable for each.
Diagnostic Technique/Measure | Maximum DC Considered Achievable |
Watchdog with a separate time base without a time window | Low |
Watchdog with a separate time base and time window | Medium |
Combination of temporal and logical monitoring of program sequences | High |
Table 1 Watchdog-based program sequence monitoring measures according to IEC 61508-2 Annex A Table A.10.
Furthermore, with the existence of different implementations that may not be covered in the standard, a claimed diagnostic coverage can only be validated through fault insertion testing.
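As a rough illustration of how fault insertion results relate to a coverage claim, the sketch below computes diagnostic coverage as the fraction of inserted dangerous faults that the watchdog detected. It assumes every inserted fault carries equal weight, whereas the standard defines diagnostic coverage in terms of failure rates, and it uses the 60%, 90%, and 99% levels that IEC 61508-2 associates with low, medium, and high. The function names and fault counts are hypothetical.

```python
def diagnostic_coverage(detected_dangerous: int, total_dangerous: int) -> float:
    """Fraction of inserted dangerous faults that the diagnostic detected."""
    if total_dangerous <= 0:
        raise ValueError("total_dangerous must be positive")
    return detected_dangerous / total_dangerous


def dc_level(dc: float) -> str:
    """Map a coverage fraction to the low/medium/high levels (60%, 90%, 99%)."""
    if dc >= 0.99:
        return "High"
    if dc >= 0.90:
        return "Medium"
    if dc >= 0.60:
        return "Low"
    return "None"


# Hypothetical campaign: 100 faults inserted, 93 detected by the watchdog.
dc = diagnostic_coverage(detected_dangerous=93, total_dangerous=100)
print(dc, dc_level(dc))
```

The level supported by testing still cannot exceed the maximum that the standard allows for the chosen technique.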
Diagnostic measures using WDTs
This article enumerates three types of diagnostic measures that use WDTs as recommended by IEC 61508-2 to address failures in program sequence. The first type of watchdog, which has a separate time base but without a time window, can be implemented using a simple watchdog. This diagnostic measure can only claim low diagnostic coverage.
On the other hand, the second type of watchdog, which has both a separate time base and a separate time window, can be implemented by a windowed watchdog. This measure can claim a medium diagnostic coverage.
To improve diagnostic coverage to high, one can employ logical monitoring in addition to the usual temporal monitoring using watchdogs. A challenge/response windowed watchdog architecture can further increase diagnostic coverage against program sequence failures with its capability to check an MCU’s computational ability.
Bryan Angelo Borres is a TÜV-certified functional safety engineer who focuses on industrial functional safety. As a senior power applications engineer, he helps component designers and system integrators design functionally safe power products that comply with industrial functional safety standards such as IEC 61508. Bryan is a member of the IEC National Committee of the Philippines to IEC TC65/SC65A and the IEEE Functional Safety Standards Committee. He also has a postgraduate diploma in power electronics and more than seven years of extensive experience in designing efficient and robust power electronics systems.
Christopher Macatangay is a senior product applications engineer supporting the industrial power product line. Since joining Analog Devices in 2015, he has played a key role in enabling customer success through technical support, system validation, and application development for analog and mixed-signal products. Christopher spent six years prior to ADI as a test development engineer at a power supply company, where he focused on the design and implementation of automated test solutions for high-reliability products.
References
[1] “IEC 61508 All Parts, Functional Safety of Electrical/Electronic/Programmable Electronic Safety-Related ” International Electrotechnical Commission, 2010.
[2] “Top Misunderstandings About Functional Safety.” TÜV SÜD,
[3] “Basics of Windowed Watchdog Operation.” Analog Devices, Inc. December
[4] “Pseudo Random Number Generation Using Linear Feedback Shift Registers.” Maxim, June 2010.
[5] Mohammed Abdul Samad AL-khatib and Auqib Hamid Lone. “Acoustic Lightweight Pseudo Random Number Generator based on Cryptographically Secure LFSR.” International Journal of Computer Network and Information Security, Vol. 2, February
The post Program sequence monitoring using watchdog timers appeared first on EDN.