Here are details on the Zephyr RTOS benchmark setup. The test was ported to Zephyr with AI assistance. Zephyr is tested twice: one run with hardware stack protection enabled and one with it disabled. In the test results, the suffix “stack protected” marks the run where hardware stack protection was enabled.
The chart above compares FreeRTOS to other RTOSes in terms of “instructions per cycle”. The data is derived from the FreeRTOS test scores and summarized in the table below. This metric estimates the instruction count needed to perform one test cycle, including all the typical overhead of a running system. Strictly speaking, it reports CPU cycles consumed per test cycle rather than instructions, but the term “IPC” is widely used and understood in the industry for this purpose, so we will stick with it here as well.
| Test Name | Zephyr 4.1.0 stack protected | Zephyr 4.1.0 |
|---|---|---|
| Calibration | 9 | 9 |
| Message Passing | 208 | 208 |
| Synchronization | 125 | 125 |
| Cooperative Scheduling | 330 | 184 |
| Preemptive Scheduling | 798.04 | 442.82 |
The initial set of Zephyr benchmarks was done with a different configuration than the one outlined below. Most importantly, the configuration was missing CONFIG_TIMESLICING=n. The default value of this option in Zephyr is y, which causes roughly 6x higher thread-switching overhead. Below is a chart comparing Zephyr performance with and without MPU stack protection, with timeslicing enabled and disabled. The difference is huge!
Purpose: Measures raw processing throughput without kernel overhead.
Method: A single thread continuously increments a counter in a tight loop.
What It Measures: Maximum achievable loop iterations, establishing a baseline for comparing kernel overhead in other tests.
Purpose: Measures voluntary context switch efficiency.
Method: Five threads at equal priority take turns executing. Each thread increments its counter and yields control.
Kernel Primitives Used:
k_yield() - voluntary context switch
What It Measures: Cost of voluntary context switches when threads explicitly yield the CPU to peers of equal priority.
Purpose: Measures priority-based preemptive context switch efficiency.
Method: Five threads with different priorities (6-10) form a semaphore chain. Thread 0 (lowest priority) triggers thread 1, which triggers thread 2, cascading up to thread 4 (highest priority). Control then cascades back down.
Kernel Primitives Used:
k_sem_init() - semaphore initialization
k_sem_give() - semaphore signal (triggers preemption)
k_sem_take() - semaphore wait
What It Measures: Cost of involuntary (preemptive) context switches driven by priority differences and semaphore signaling.
Purpose: Measures message queue operation overhead.
Method: A single thread sends a message to a queue and immediately receives it back, measuring round-trip message passing cost.
Kernel Primitives Used:
K_MSGQ_DEFINE() - static message queue definition
k_msgq_put() - send message to queue
k_msgq_get() - receive message from queue
What It Measures: Combined cost of enqueueing and dequeueing messages (16-byte messages, queue depth 16).
Purpose: Measures mutex operation overhead.
Method: A single thread repeatedly acquires and releases a mutex, measuring lock/unlock cycle cost.
Kernel Primitives Used:
K_MUTEX_DEFINE() - static mutex definition
k_mutex_lock() - acquire mutex
k_mutex_unlock() - release mutex
What It Measures: Cost of uncontended mutex acquisition and release (no actual contention since single thread).
```
CONFIG_HW_STACK_PROTECTION=n
CONFIG_MAIN_STACK_SIZE=2048
CONFIG_SYSTEM_WORKQUEUE_STACK_SIZE=1024
CONFIG_IDLE_STACK_SIZE=512
CONFIG_NUM_MBOX_ASYNC_MSGS=16
CONFIG_SYS_CLOCK_TICKS_PER_SEC=1000
CONFIG_CONSOLE=n
CONFIG_UART_CONSOLE=n
CONFIG_SERIAL=n
CONFIG_PRINTK=n
CONFIG_THREAD_NAME=n
CONFIG_THREAD_MONITOR=n
CONFIG_DEBUG_INFO=y
CONFIG_ASSERT=n
CONFIG_EXCEPTION_STACK_TRACE=n
CONFIG_SPEED_OPTIMIZATIONS=y
CONFIG_BOOT_BANNER=n
CONFIG_TIMESLICING=n
```
CONFIG_MPU_STACK_GUARD is set to y for the build with stack protection and to n for the plain build.
CONFIG_TIMESLICING=n is included in the published configuration because it was pointed out to me that this option has a huge performance impact.