## **Test und Fehlersuche in** komplexen autonomen System Albert Schulz – albert.schulz@accemic.com Tagung "Echtzeit 2019" 21.11.2019





## Outline

- Introduction
- Hardware Trace
- Dynamic analysis
  - Example: safety-critical control system
- Realtime Code Coverage
- Conclusions





## Introduction

- Trend in current systems:
  - autonomous, long runtimes without human interaction
  - Increased complexity  $\rightarrow$  need for multi-core
- Increased chance of bugs, even in post-release code

McKinsey & Company: "Snowballing complexity is causing significant softwarerelated quality issues ... "





© 2019 Accemic Technologies GmbH. All Rights Reserved.

#### **Capers Jones:** ~5% Post-release defects





## Introduction

- Certification is challenging for safety-critical systems
- Software instrumentation helps, but interferes with functional Code • e.g. code-coverage adds additional code for measurements (e.g. gcov) Software tracing techniques with high overhead in time and space (printf)
- Multi-core makes static analysis challenging
- Certified code contains often additional test code Requires memory space and computation time
- Alternatives?





## Hardware Trace

**Trace-Information:** 

- Control-Flow (Branches, Function calls)
- OS-relevant events (context switches),
- Data access (address, data)\*,
- Application-specific events (lightweight instrumentation)

#### **Processors with Hardware-Trace Infrastructure:**

- Infineon Aurix: Emulation Device
- ARM Cortex-A/-M/-R: CoreSight
- Intel x86: IntelPT
- NXP QorIQ P-series, T-series: Debug Assist Module

\*depends on Processor capabilities







ARM<sup>®</sup> CoreSight<sup>®</sup> Architecture Specification Intel<sup>®</sup> 64 and IA-32 Architectures Software Developer's Manual

> T4240R2 Advanced QorlQ Debug and Performance Monitoring **Reference Manual**

#### TC29/7/6/3xED

32-Bit Single-Chip Micocontroller



#### Hardware Trace State-of-the-art: Offline Analysis (e.g. Lauterbach TRACE32)



#### Trace-Buffer limits observation time







some GBit/s



Trace data generation by processor internal hardware structures

Trace data buffer by a few GByte RAM buffer









Trace data processing usually magnitudes slower then generation

for observation result output



## Hardware Trace Processing in Realtime







### **Dynamic Analysis** Non-intrusive Continuous Timing Verification

- Use case: Safety-critical application to control breaks
- Requirement:
  - Ensure Timing Constraint from **pressing the breaks**, until their **activation**
  - Constraint: Should react within 5ms!



#### Implementation

| 17 | void run_task()                                                       |
|----|-----------------------------------------------------------------------|
| 18 | ₽ <b>{</b>                                                            |
| 19 | <pre>float break_angle = read_break_sensor();</pre>                   |
| 20 |                                                                       |
| 21 | int strength;                                                         |
| 22 | <pre>strength = calculate_break_strengh_for_angle(break_angle);</pre> |
| 23 |                                                                       |
| 24 | int motor_control;                                                    |
| 25 | <pre>motor_control = calculate_motor_control_value(strength);</pre>   |
| 26 |                                                                       |
| 27 | if (motor_control == 1) {                                             |
| 28 | activate_breaks();                                                    |
| 29 | - }                                                                   |
| 30 | <pre>else if (motor_control == -1) release_breaks();</pre>            |
| 31 | L}                                                                    |

| 33 | float read break sensor()                                                   |   |  |  |
|----|-----------------------------------------------------------------------------|---|--|--|
| 34 | l (                                                                         | l |  |  |
| 35 | <pre>float sensor_value = rand()%91;</pre>                                  | l |  |  |
| 36 | <pre>return sensor_value;</pre>                                             |   |  |  |
| 37 | -}                                                                          |   |  |  |
| 38 |                                                                             |   |  |  |
| 39 | <pre>int calculate_break_strengh_for_angle(float angle)</pre>               |   |  |  |
| 40 | {                                                                           |   |  |  |
| 41 | <pre>usleep(le3+rand()%5*le3); /// Sleep randomly between 1ms and 5ms</pre> |   |  |  |
| 42 | <pre>float strength = angle/10.0;</pre>                                     |   |  |  |
| 43 | <pre>return round(strength);</pre>                                          |   |  |  |
| 44 | }                                                                           |   |  |  |
| 45 |                                                                             |   |  |  |
| 46 | <pre>int calculate_motor_control_value(int strength)</pre>                  |   |  |  |
| 47 | 1                                                                           |   |  |  |
| 48 | <pre>usleep(le3+rand()%5*le3); /// Sleep randomly between 1ms and 5ms</pre> |   |  |  |
| 49 | <pre>if (strength &gt; 3) return 1;</pre>                                   |   |  |  |
| 50 | <pre>else if (strength == 0) return 0;</pre>                                |   |  |  |
| 51 | else return -1;                                                             |   |  |  |
| 52 | -}                                                                          |   |  |  |

- run\_task() executed periodically every second
- Calculations have variable execution durations
  - simulates dynamic events due to multicore environment
  - Breaks are only activated sometimes, depending on the break angle





### **Dynamic Analysis** Non-intrusive Continuous Timing Verification

- Use case: Safety-critical application to control breaks
- Requirement:
  - Ensure Timing Constraint from **pressing the breaks**, until their **activation**
  - Constraint: Should react within 5ms!



#### Implementation





### **Dynamic Analysis** Non-intrusive Continuous Timing Verification







### Dynamic Analysis Non-intrusive Continuous Code Coverage

Continuous and non-intrusive

- Statement Coverage
- Branch Coverage (EX/NEX)
- Performance measurement (count executed instructions)
- Measured on object code level
- ➢ Measured on release code
- ➢ No instrumentation
- $\succ$  No limitation due to trace buffers

Allows measurements on release-code



| 70 | 10062C: e51b3010 ldr r3, [fp, #-16]                   |
|----|-------------------------------------------------------|
| 70 | 100630: e3530001 cmp r3, #1                           |
| 70 | 100634: 9a000010 bls 10067c <_Z13collatz_depthj+0x68> |
| 61 | 100678: eaffffeb b 10062c <_Z13collatz_depthj+0x18>   |
| 61 | 100638: e51b3010 ldr r3, [fp, #-16]                   |
| 61 | 10063C: e2033001 and r3, r3, #1                       |
| 61 | 100640: e3530000 cmp r3, #0                           |
| 61 | 100644: 0a000005 beq 100660 <_Z13collatz_depthj+0x4c> |
| 16 | 100648: e51b2010 ldr r2, [fp, #-16]                   |
| 16 | 10064C: e1a03002 mov r3, r2                           |
| 16 | 100650: e1a03083 lsl r3, r3, #1                       |
| 16 | 100654: e0833002 add r3, r3, r2                       |
| 16 | 100658: e2833001 add r3, r3, #1                       |
| 16 | 10065C: ea000001 b 100668 <_Z13collatz_depthj+0x54>   |
| 45 | 100660: e51b3010 ldr r3, [fp, #-16]                   |
| 45 | 100664: e1a030a3 lsr r3, r3, #1                       |
| 61 | 100668: e50b3010 str r3, [fp, #-16]                   |
|    | , - , ,                                               |



### Dynamic Analysis Non-intrusive Continuous Code Coverage

| Continuous and non-intrusive                                                  |        | 12             |
|-------------------------------------------------------------------------------|--------|----------------|
| <ul> <li>Statement Coverage</li> </ul>                                        |        | 12             |
| <ul> <li>Branch Coverage (EX/NEX)</li> </ul>                                  | •      | 14<br>15       |
| <ul> <li>Performance measurement<br/>(count executed instructions)</li> </ul> | •      | 16<br>17<br>18 |
|                                                                               | •      | 19<br>20       |
| Measured on object code level                                                 | ►<br>• | 21<br>22       |
| <ul> <li>Measured on release code</li> <li>No instrumentation</li> </ul>      |        |                |
| No limitation due to trace buffers                                            | •      | 23             |
|                                                                               |        |                |

Allows measurements on release-code



24 25

26

27

|       |     | // Compute n-th Fibonacci number using recursion.            |
|-------|-----|--------------------------------------------------------------|
|       |     |                                                              |
|       |     | // – n < 2 does not trigger the else branch.                 |
|       | 275 | unsigned fib(unsigned const n) {                             |
| [+,+] | 275 | return (n < 2)? n : fib(n-2) + fib(n-1);                     |
|       | 275 | }                                                            |
|       |     | <pre>// Unfold Collatz sequence and return its length.</pre> |
|       |     | // – n <= 1 will not execute the while loop at all.          |
|       |     | // – n = 2^k will never trigger the 3*n+1 path.              |
|       | 9   | unsigned collatz_depth(unsigned n) {                         |
|       | 9   | unsigned depth = 0;                                          |
| [+,+] | 70  | while(n > 1) {                                               |
|       | 70  | 10062C: e51b3010 ldr r3, [fp, #-16]                          |
|       | 70  | 100630: e3530001 cmp r3, #1                                  |
| [+,+] | 70  | 100634: 9a000010 bls 10067c <_Z13collatz_depthj+0x68>        |
|       | 61  | 100678: eaffffeb b 10062c <_Z13collatz_depthj+0x18>          |
| [+,+] | 61  | n = (n&1)? 3*n+1 : n/2;                                      |
|       | 61  | 100638: e51b3010 ldr r3, [fp, #-16]                          |
|       | 61  | 10063C: e2033001 and r3, r3, #1                              |
|       | 61  | 100640: e3530000 cmp r3, #0                                  |
| [+,+] | 61  | 100644: 0a000005 beq 100660 <_Z13collatz_depthj+0x4c>        |
|       | 16  | 100648: e51b2010 ldr r2, [fp, #-16]                          |
|       | 16  | 10064C: e1a03002 mov r3, r2                                  |
|       | 16  | 100650: e1a03083 lsl r3, r3, #1                              |
|       | 16  | 100654: e0833002 add r3, r3, r2                              |
|       | 16  | 100658: e2833001 add r3, r3, #1                              |
|       | 16  | 10065C: ea000001 b 100668 <_Z13collatz_depthj+0x54>          |
|       | 45  | 100660: e51b3010 ldr r3, [fp, #-16]                          |
|       | 45  | 100664: e1a030a3 lsr r3, r3, #1                              |
|       | 61  | 100668: e50b3010 str r3, [fp, #-16]                          |
|       | 61  | depth++;                                                     |
|       |     | }                                                            |
|       | 9   | return depth;                                                |
|       | 9   | }                                                            |



## Conclusions

- Novel approach for test and debugging based on hardware trace presented
- New potential due to
  - Non-intrusiveness
  - Higher chance to catch sporadic issues using long-running tests
  - Code coverage on integration and system tests
- post-release defects



With goal of increased product quality, reliability and decrease fatal



## Thanks for your Attention





#### Contact:

Accemic Technologies GmbH Franz-Huber-Str. 39 83088 Kiefersfelden

www.accemic.com

Contact partner: Dr. Alexander Weiss aweiss@accemic.com +49 8033 6039795

Presenter: Albert Schulz aschulz@accemic.com

This work was funded in part by the EU H2020 Project 732016 COEMS and the BMBF Project ARAMIS II (ID 01 IS 16025)





## Backup Slides





# Dynamic Analysis @ Object Code Level

- > Non-intrusive monitoring and unlimited monitoring period (up to hours, days).
- Structural code coverage can be measured at all test levels.
- Measurable statement of the quality of High-Level Requirements.
- Measurable statement of the quality of High-Level Tests.





#### nonitoring period (up to hours, days). ed at all test levels.



#### Dynamic Analysis Object Code Level



Mind trace interface access opportunities: - in your hardware system requirement specifications and - in your buying decisions!

© 2019 Accemic Technologies GmbH. All Rights Reserved.



Hardware-based monitoring infrastructure is integrated in most processors - and already paid by you ...



### Dynamic Analysis Object Code vs. Source Code Level (see also CAST 17)

PROS

- It can demonstrate full code coverage at the object code level.
- It can support more "valid" coverage.
- It is closer to the final airborne software.
- It can be implemented with source code programming language independence.
- It can reduce time-consuming manual analysis.
- No instrumentation is required.
- It can also be used for the objective measurement of the quality of integration and system tests.
- It can reduce the test effort by substituting low-level tests.
- Incomplete requirements and tests are found at the system level



```
Listing 1: Illustrative example, source code in C
```

```
char* pass_fail(char grade) {
     static char msg[2][5] = {"pass", "fail"};
     int pass;
     if (grade=='d'||grade=='f') {
         pass = 0;
     } else if (grade=='a' || grade=='b' ||
         grade == 'c') {
         pass = 1;
     \} else { pass = -1; }
     return pass ? msg[pass] : msg[0];
10 }
```

Listing 2: x86 code compiled with -00

```
/* if (grade == 'd' || grade == 'f') */
   8048439: <u>cmpb</u> $0x64,-0x14(%ebp)
   804843d: je
                   8048445 // jump if grade=='d'
   804843f: <u>cmp</u>b $0x66,-0x14(%ebp)
   8048443: jne
                   804844e // jump if grade!='f'
   8048445: movl $0x0,-0x4(%ebp) // pass:=0
   804844c: jmp
                   8048470 // jump to return
   /* else if (grade=='a'||...||grade=='c') */
   804844e: <u>cmpb</u> $0x61,-0x14(%ebp)
   8048452: je
                   8048460 //jump if grade=='a'
   8048454: cmpb $0x62,-0x14(%ebp)
   8048458: je
                   8048460 //jump if grade=='b'
   804845a: cmpb $0x63,-0x14(%ebp)
   804845e: jne
                   8048469 //jump if grade!='c'
   8048460: movl $0x1,-0x4(%ebp) // pass:=1
16 8048467: jmp
                   8048470
                   (pass:=-1) */
   /* else
18 8048469: movl $0xffffffff,-0x4(%ebp)
   . . .
```

T. Byun, V. Sharma, S. Rayadurgam, S. McCamant, and M. P. Heimdahl, 'Toward Rigorous Object-Code Coverage Criteria', in 201 IEEE 28th International Symposium on Software Reliability Engineering (ISSRE), 2018, vol. 00, pp. 328–338.



### Dynamic Analysis Object Code vs. Source Code Level (see also CAST 17)

CONS

- Source code to object code traceability can be difficult (depending on compiler support).
- Optimizing compiler can use difficult-to-monitor flags to process multi-conditions. (we are working on solutions...)
- Typical tools usually use the source code level.



Listing 1: Illustrative example, source code in C

```
char* pass_fail(char grade) {
     static char msg[2][5] = {"pass", "fail"};
     int pass;
     if (grade=='d'||grade=='f') {
          pass = 0;
     } else if (grade=='a' || grade=='b' ||
          grade == 'c') {
          pass = 1;
     \} else { pass = -1; }
     return pass ? msg[pass] : msg[0];
10 }
```

Listing 3: x86 code compiled with -Os

```
8048455: push %ebp
   8048456: mov $0x804a01c,%eax// %eax:=msg[0]
   804845b: mov %esp,%ebp
   804845d: mov 0x8(%ebp),%edx // %edx:=grade
  /* if (grade == 'd' || grade == 'f') */
  8048460: mov %dl,%cl
                                // %cl:=grade
  // ASCII('d')=0x64, ASCII('f')=0x66,
8 // 'f'^Oxfffd='d', 'd'^Oxfffd='d'
  8048462: and $0xfffffffd,%ecx
10 8048465: <u>cmp</u> $0x64,%cl // 'd', grade
   8048468: je 804847e // %cl=='d'->return
12 /* else if (grade=='a'||...||grade=='c') */
  /* else */
14 804846a: sub $0x61,%edx // %edx=grade-'a'
  804846d: cmp $0x3,%dl // CF=%edx<3?1:0
16 8048470: sbb %eax,%eax // %eax:=CF?-1:0
17 8048472: and $0x2,%eax // %eax:=CF?2:0
18 8048475: dec %eax
                           // %eax:=CF?1:-1
19 /* return pass ? msg[pass] : msg[0]; */
20 // %eax:=5*%eax
21 8048476: imul $0x5,%eax,%eax
22 // %eax:=msg+%eax
23 8048479: add $0x804a01c,%eax
24 804847e: ...
```

T. Byun, V. Sharma, S. Rayadurgam, S. McCamant, and M. P. Heimdahl, 'Toward Rigorous Object-Code Coverage Criteria', in 2017 IEEE 28th International Symposium on Software Reliability Engineering (ISSRE), 2018, vol. 00, pp. 328–338.

