Question 1

What is the difference between a microcontroller and a microprocessor, and when would you choose one over the other for a product?

Accepted Answer

Microcontroller integrates CPU, RAM, flash, and peripherals (timers, ADC, UART) on one die; microprocessor is just the CPU and needs external memory and peripherals. Choose an MCU for cost-sensitive, low-power, deterministic control tasks (motor control, sensors). Choose an MPU when you need an OS, high compute, lots of RAM, or a rich UI (Linux gateway, camera). Mention BOM cost, board complexity, boot time, and power as decision drivers.

Question 2

Explain what the volatile keyword does in C and give a concrete embedded scenario where omitting it causes a bug.

Accepted Answer

volatile tells the compiler the value can change outside the normal program flow, so it must re-read from memory on every access and not cache it in a register or reorder/eliminate the access. Scenario: polling a status register bit in a while loop, or a flag set inside an ISR and read in main; without volatile the compiler hoists the read out of the loop and you spin forever or never see the flag. Note volatile does NOT provide atomicity or memory ordering across cores.
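A minimal sketch of the ISR-flag scenario. The names `rx_ready`, `rx_byte`, `uart_rx_isr`, and `uart_poll` are hypothetical; on a host build the "ISR" is an ordinary function called by hand.

```c
#include <stdbool.h>
#include <stdint.h>

/* Shared between interrupt context and main. volatile forces a fresh load
 * on every read, so a polling loop cannot spin on a stale register copy. */
static volatile bool rx_ready = false;
static volatile uint8_t rx_byte;

/* Hypothetical ISR body: on real hardware this is reached through the
 * vector table, and data comes from the peripheral's data register. */
void uart_rx_isr(uint8_t data)
{
    rx_byte = data;
    rx_ready = true;
}

/* Returns true and stores the byte if one arrived since the last call.
 * This is a one-byte mailbox: a second byte arriving before the flag is
 * cleared is lost, so a real driver would queue into a ring buffer. */
bool uart_poll(uint8_t *out)
{
    if (!rx_ready) {
        return false;
    }
    *out = rx_byte;
    rx_ready = false;
    return true;
}
```

Dropping `volatile` from `rx_ready` is what lets an optimizing compiler read the flag once and never again inside a `while (!rx_ready)` loop.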

Question 3

Walk me through what happens, step by step, from a hardware interrupt firing to your ISR code executing on an ARM Cortex-M.

Accepted Answer

Peripheral asserts an interrupt line to the NVIC. NVIC checks the interrupt is enabled and its priority beats the current execution priority (and BASEPRI/PRIMASK masks). Cortex-M automatically stacks R0-R3, R12, LR, PC, xPSR onto the current stack. It loads the handler address from the vector table at the exception number's slot, sets LR to an EXC_RETURN magic value, and branches. Your ISR runs, ideally short: clear the interrupt flag, do minimal work, signal a task. On return, the magic LR triggers automatic unstacking and resumes the interrupted code. Mention tail-chaining and late-arrival optimizations.

Question 4

How do you safely share a multi-byte variable, like a 32-bit counter, between an ISR and your main loop on an 8-bit or 16-bit MCU?

Accepted Answer

The read or write is non-atomic on an 8/16-bit core, so an interrupt mid-access causes a torn value. Options: briefly disable interrupts (critical section) around the main-loop access; or use a double-read / read-compare-reread loop for a monotonic counter; or use a hardware atomic if available. Keep the critical section as short as possible to bound interrupt latency. Mark the variable volatile so the compiler does not cache it. On Cortex-M, LDREX/STREX or disabling via PRIMASK are alternatives.
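The double-read option can be sketched as follows, assuming a hypothetical tick ISR `tick_isr` that is the only writer of the counter.

```c
#include <stdint.h>

/* Written only by the tick ISR, read by the main loop. */
static volatile uint32_t tick_count;

/* Hypothetical ISR body: increments the shared counter. */
void tick_isr(void)
{
    tick_count++;
}

/* Read twice and retry until both reads agree. On an 8/16-bit core each
 * read is several loads; if the ISR fires in the middle of one of them,
 * the two copies differ and the loop tries again. */
uint32_t ticks_read(void)
{
    uint32_t a;
    uint32_t b;
    do {
        a = tick_count;
        b = tick_count;
    } while (a != b);
    return a;
}
```

This only works when the ISR rate is low enough that two back-to-back reads can complete without an interrupt between them; otherwise use the short critical section.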

Question 5

Compare SPI, I2C, and UART. Given a design, how do you pick between them?

Accepted Answer

UART: asynchronous, no clock, point-to-point, two signal wires (TX/RX) plus ground, needs matched baud rate, good for logs/modems. I2C: synchronous, two wires (SDA/SCL), multi-drop with 7/10-bit addresses, open-drain with pull-ups, slower (100k/400k/1M), great for many low-speed sensors sharing a bus. SPI: synchronous, full-duplex, three shared wires (SCLK, MOSI, MISO) plus one chip-select per device, fast (tens of MHz), no addressing so wiring grows with devices. Pick by speed, pin count, number of devices, duplex needs, and whether you need addressing versus raw throughput.

Question 6

Describe how I2C clock stretching works and a situation where it causes a hard-to-find bug.

Accepted Answer

A slave that needs more time keeps SCL held low after the master releases it; because SCL is open-drain, the master sees the line still low and must wait until the slave lets go. Bug: some master peripherals (or bit-banged masters) do not support clock stretching, or have buggy stretching, so a slow slave's stretch is misread as a stuck bus or a timeout, causing corrupted transactions. Also a slave can get stuck holding SDA low after a glitch, hanging the bus. Mitigations: check the master's stretching support, add a bus-recovery routine that clocks SCL nine times to free a stuck slave, use timeouts, and verify with a scope/logic analyzer.

Question 7

What is switch bounce, and walk me through both a hardware and a software approach to debouncing.

Accepted Answer

Mechanical contacts make and break several times over a few milliseconds, producing spurious edges. Hardware: RC filter plus Schmitt trigger, or an SR latch with a SPDT switch. Software: sample the pin periodically (e.g., every 5-10 ms) and only register a state change after N consecutive stable samples, or take an edge then ignore further edges for a debounce window. Mention that ISR-on-every-edge is the naive trap; prefer a timer-based sampling approach or debounce inside the ISR plus a lockout.
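The N-consecutive-samples approach might look like this sketch. `debounce_t`, `debounce_update`, and the sample count of 4 are illustrative assumptions; the function is meant to be called from a periodic timer tick with the raw pin level.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEBOUNCE_SAMPLES 4u  /* assumed: 4 samples at 5 ms gives 20 ms */

typedef struct {
    bool    stable;  /* debounced state reported to the application */
    uint8_t count;   /* consecutive samples that disagree with stable */
} debounce_t;

/* Feed one raw sample. Returns true exactly when the debounced state
 * changes, i.e. after DEBOUNCE_SAMPLES consecutive samples at the new level. */
bool debounce_update(debounce_t *d, bool raw)
{
    if (raw == d->stable) {
        d->count = 0;          /* bounce back: restart the count */
        return false;
    }
    if (++d->count < DEBOUNCE_SAMPLES) {
        return false;          /* not stable for long enough yet */
    }
    d->stable = raw;
    d->count = 0;
    return true;
}
```

Because the pin is only sampled on the timer tick, bounce edges between samples never generate interrupts at all.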

Question 8

Explain memory-mapped I/O. How would you write a driver to set a specific bit in a 32-bit peripheral control register in C?

Accepted Answer

Peripheral registers live at fixed physical addresses in the same address space as memory. Access via a volatile pointer: define the register address, cast to volatile uint32_t*, and do read-modify-write: *REG |= (1u << BIT) to set, *REG &= ~(1u << BIT) to clear. Stress volatile so reads/writes are not optimized away. Note read-modify-write is non-atomic, so an ISR touching the same register can corrupt it; use bit-band or set/clear registers (BSRR on STM32) where available. Prefer a struct overlay matching the register map for clean drivers.
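A minimal sketch of the struct overlay and the set/clear helpers. The register block `periph_regs_t` and its layout are hypothetical; real offsets and the base address come from the datasheet.

```c
#include <stdint.h>

/* Hypothetical register map; field order must match the hardware layout. */
typedef struct {
    volatile uint32_t CTRL;
    volatile uint32_t STATUS;
} periph_regs_t;

/* On target the block would be placed at its fixed address, for example
 *   #define PERIPH ((periph_regs_t *)0x40001000u)
 * where 0x40001000 is a made-up address for illustration. */

/* Read-modify-write: not atomic, so guard it if an ISR touches the register. */
static inline void reg_set_bit(volatile uint32_t *reg, unsigned bit)
{
    *reg |= (UINT32_C(1) << bit);
}

static inline void reg_clear_bit(volatile uint32_t *reg, unsigned bit)
{
    *reg &= ~(UINT32_C(1) << bit);
}
```

Taking the register by pointer lets the same helpers be exercised on a host against a plain struct instance.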

Question 9

What is the difference between a preemptive RTOS scheduler and a cooperative scheduler, and what determines task priority assignment?

Accepted Answer

Preemptive: the scheduler can interrupt a running task when a higher-priority task becomes ready, giving bounded response latency but requiring careful synchronization for shared data. Cooperative: tasks run until they yield, simpler and no mid-task preemption races, but a misbehaving task starves others. Priority assignment for hard real-time is typically rate-monotonic (shorter period gets higher priority) or deadline-monotonic (tighter deadline gets higher priority), balanced against avoiding starvation and priority inversion. Mention deadline-driven reasoning and keeping high-priority tasks short.

Question 10

Explain priority inversion and how a priority inheritance mutex solves it.

Accepted Answer

A high-priority task blocks on a mutex held by a low-priority task; a medium-priority task then preempts the low-priority holder, so the high-priority task is indirectly blocked by a medium task that should not outrank it. Priority inheritance: while the low-priority task holds a mutex a higher-priority task wants, it temporarily inherits that higher priority so it runs, releases quickly, and is then restored. Mention the Mars Pathfinder watchdog resets, and alternatives like priority ceiling protocol. Note inheritance mutexes versus plain semaphores.

Question 11

How do you communicate between an ISR and a task safely in an RTOS? What can and cannot be called from interrupt context?

Accepted Answer

Keep ISRs short: do the minimum, then signal a task to do the heavy work (deferred interrupt handling / bottom half). Use ISR-safe APIs like FreeRTOS xQueueSendFromISR or a task notification, and handle the higher-priority-task-woken yield request. You cannot block in an ISR, cannot call APIs that may sleep, and generally avoid malloc, printf, and floating point unless the context saves FPU state. Watch interrupt priority versus the RTOS max-syscall priority threshold.

Question 12

Walk me through the memory layout of a typical bare-metal firmware: where do .text, .data, .bss, the stack, and the heap live, and what does the startup code do before main?

Accepted Answer

.text (code) and .rodata in flash; .data (initialized globals) has its init values in flash but lives in RAM; .bss (zero-initialized globals) in RAM. Startup/reset handler: set the stack pointer, copy .data from flash to RAM, zero .bss, call libc init / constructors, configure clocks, then call main. Stack typically grows down from the top of RAM, heap grows up; collision is the classic stack overflow into heap. The linker script defines the memory regions and section placement.
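The .data copy and .bss zeroing can be sketched as a plain function. On a real target the pointers and lengths come from linker-script symbols whose names vary by toolchain; `startup_init_ram` and its parameters are hypothetical stand-ins.

```c
#include <stddef.h>
#include <stdint.h>

/* Copy initial values of .data from flash to RAM, then zero .bss.
 * Lengths are in 32-bit words, assuming word-aligned sections. */
void startup_init_ram(uint32_t *data_ram, const uint32_t *data_flash,
                      size_t data_words, uint32_t *bss, size_t bss_words)
{
    for (size_t i = 0; i < data_words; i++) {
        data_ram[i] = data_flash[i];   /* initialized globals get their values */
    }
    for (size_t i = 0; i < bss_words; i++) {
        bss[i] = 0u;                   /* zero-initialized globals */
    }
}
```

Until this has run, no initialized global can be trusted, which is why it happens before main.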

Question 13

What are the main differences between using DMA versus interrupt-driven versus polled I/O for moving data from a peripheral?

Accepted Answer

Polled: CPU busy-waits, simple but wastes cycles and scales poorly. Interrupt-driven: CPU does other work and is notified per byte/event, good for moderate rates but per-interrupt overhead hurts at high throughput. DMA: hardware moves data between peripheral and memory without CPU involvement, ideal for high-rate or bulk transfers (ADC streams, SPI displays, audio), with one interrupt at half/full transfer. Tradeoffs: DMA frees the CPU but adds setup complexity, cache-coherency concerns, and bus contention.

Question 14

You have a firmware bug that only appears after several hours of running in the field and never on your bench. How do you debug it?

Accepted Answer

Characterize: what fails, how often, under what conditions (temperature, load, timing). Hypothesize common slow-burn causes: memory leaks/heap fragmentation, stack overflow, counter or timer overflow/wraparound, race conditions, watchdog interactions, ESD/brownout, or accumulating sensor drift. Instrument without changing timing too much: add a circular RAM log / trace buffer, capture the stack high-water mark, log minimum free heap, enable fault handlers that dump registers and the stacked PC. Reproduce by accelerating (stress, raise temperature, speed up time base). Use a debugger with non-stop trace (ITM/SWO, ETM) or a watchdog that captures state. Bisect recent changes.

Question 15

Explain the difference between the low-power sleep modes on a typical MCU and how you decide which to use.

Accepted Answer

Modes trade current against what stays alive and how fast you wake. Sleep/idle: CPU clock gated, peripherals and RAM live, fast wake, microamps-to-milliamps. Stop/deep-sleep: most clocks off, RAM retained, wake from a few sources (RTC, EXTI), longer wake latency, low microamps. Standby/shutdown: most of the chip powered down, RAM usually lost, wake resets the part, nanoamps-to-microamps. Decide by the duty cycle: how long you sleep, how fast you must respond, what state you must retain, and the wake source available. Quantify with average current = (I_active * t_on + I_sleep * t_off) / (t_on + t_off).
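The duty-cycle estimate is a time-weighted average, sketched here with a hypothetical helper `average_current`; any consistent units work.

```c
/* Time-weighted average current over one active/sleep cycle.
 * Result is in the same current unit as the inputs. */
double average_current(double i_active, double t_on,
                       double i_sleep, double t_off)
{
    return (i_active * t_on + i_sleep * t_off) / (t_on + t_off);
}
```

For example, 10 mA for 10 ms followed by 2 uA for 990 ms averages to roughly 0.102 mA, so the active burst dominates and shortening it pays off more than a deeper sleep mode would.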

Question 16

What is a watchdog timer, and how do you use it correctly without masking real bugs?

Accepted Answer

A hardware timer that resets the MCU if not periodically kicked, recovering from hangs/lockups. Correct use: kick it from a single trusted place (e.g., a supervisor task) only after verifying all critical tasks have checked in, so a hung task is actually caught. Anti-pattern: kicking it in an ISR or a tight loop unconditionally, which keeps petting it even when the system is dead. Capture the reset cause and log a watchdog reset so you can diagnose. Consider a windowed watchdog to catch too-fast as well as too-slow. Save state before reset if possible.
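One way to sketch the supervisor pattern is a check-in bitmask. The task names, `wdt_checkin`, `wdt_supervise`, and the stand-in `wdt_kick_hw` are all hypothetical; on target the kick would write the watchdog reload register.

```c
#include <stdbool.h>
#include <stdint.h>

#define TASK_SENSOR  (1u << 0)
#define TASK_COMMS   (1u << 1)
#define TASK_CONTROL (1u << 2)
#define ALL_TASKS    (TASK_SENSOR | TASK_COMMS | TASK_CONTROL)

static volatile uint32_t checkins;
static unsigned kick_count;  /* host stand-in so the kick is observable */

/* Hypothetical hardware kick; replaced by a register write on target. */
static void wdt_kick_hw(void)
{
    kick_count++;
}

/* Each critical task calls this once per loop iteration. On a real RTOS
 * the read-modify-write needs a critical section or an atomic OR. */
void wdt_checkin(uint32_t task_bit)
{
    checkins |= task_bit;
}

/* Called periodically by the supervisor: kick only if everyone reported. */
bool wdt_supervise(void)
{
    if ((checkins & ALL_TASKS) != ALL_TASKS) {
        return false;  /* a task is missing: let the watchdog expire */
    }
    checkins = 0u;
    wdt_kick_hw();
    return true;
}
```

A single hung task now withholds its bit, so the watchdog fires even while the supervisor itself keeps running.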

Question 17

How do you debug a hard fault on an ARM Cortex-M? What information is available and how do you find the offending instruction?

Accepted Answer

In the HardFault handler, inspect the fault status registers: HFSR, CFSR (with UFSR/BFSR/MMFSR sub-fields), and BFAR/MMFAR for the faulting address. Recover the stacked frame: figure out whether MSP or PSP was active (from EXC_RETURN bit), then read the stacked R0-R3, R12, LR, PC, xPSR; the stacked PC points at (or near) the offending instruction. Look up that address in the .map / disassembly. Common causes: null/wild pointer dereference, unaligned access, stack overflow, dividing by zero with trap enabled, calling a function pointer that is null or has a bad LSB (Thumb bit). Use a debugger to break in the handler.
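Recovering the stacked PC can be sketched as below. The helper `fault_stacked_pc` is hypothetical; it takes the two stack pointer values as arguments, which on target a small assembly shim in the HardFault handler would supply.

```c
#include <stdint.h>

/* Word offsets within the basic exception frame pushed by the core. */
enum {
    FRAME_R0, FRAME_R1, FRAME_R2, FRAME_R3,
    FRAME_R12, FRAME_LR, FRAME_PC, FRAME_XPSR
};

/* Bit 2 of EXC_RETURN selects the stack that holds the frame:
 * 0 means MSP was in use, 1 means PSP. */
uint32_t fault_stacked_pc(uint32_t exc_return,
                          const uint32_t *msp, const uint32_t *psp)
{
    const uint32_t *frame = (exc_return & (1u << 2)) ? psp : msp;
    return frame[FRAME_PC];
}
```

The returned address is what you look up in the .map file or disassembly.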

Question 18

Explain the difference between a pointer and the value it points to in C, and what these declarations mean: const char *p, char *const p, const char *const p.

Accepted Answer

A pointer holds an address; dereferencing reads/writes the pointed-to object. const char *p: pointer to const char, you can change p but not *p (data is read-only through this pointer, useful for ROM strings). char *const p: const pointer to char, you cannot change p (it always points at the same place) but can modify *p. const char *const p: cannot change either. Read right-to-left or use the spiral rule. This matters for putting data in flash, for API contracts, and to let the compiler catch accidental writes.
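The three declarations side by side, as a small sketch; the buffers and the function `demo_const_pointers` are illustrative only. The commented-out lines are the ones a compiler rejects.

```c
static char buf_a[] = "abc";
static char buf_b[] = "xyz";

int demo_const_pointers(void)
{
    const char *p1 = buf_a;        /* pointer to const char */
    p1 = buf_b;                    /* OK: the pointer may be re-aimed */
    /* *p1 = 'q'; */               /* error: data is read-only through p1 */

    char *const p2 = buf_a;        /* const pointer to char */
    *p2 = 'A';                     /* OK: the data may change */
    /* p2 = buf_b; */              /* error: the pointer is fixed */

    const char *const p3 = buf_b;  /* neither pointer nor data may change */

    return *p1 == *p3;             /* both point at buf_b, so this is 1 */
}
```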

Question 19

What is endianness, and how does it bite you when sending a multi-byte value over SPI or serializing a struct to a network?

Accepted Answer

Endianness is the byte order of a multi-byte value in memory: little-endian stores the least significant byte first, big-endian the most significant first. It bites when two systems with different endianness exchange raw bytes, or when you cast a byte buffer to a multi-byte type. Fixes: define a wire byte order (network/big-endian is common), serialize/deserialize byte by byte or with htons/ntohl-style helpers, and never memcpy a struct across the wire and assume layout. Also watch struct padding and alignment in addition to endianness.
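Byte-by-byte serialization in a defined wire order could look like this sketch; `put_be32` and `get_be32` are hypothetical helper names. The shifts work on the value, not on memory, so the result is the same on any host endianness.

```c
#include <stdint.h>

/* Write v to out[0..3] in big-endian (network) order. */
void put_be32(uint8_t *out, uint32_t v)
{
    out[0] = (uint8_t)(v >> 24);
    out[1] = (uint8_t)(v >> 16);
    out[2] = (uint8_t)(v >> 8);
    out[3] = (uint8_t)v;
}

/* Rebuild a 32-bit value from four big-endian bytes. */
uint32_t get_be32(const uint8_t *in)
{
    return ((uint32_t)in[0] << 24) |
           ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8)  |
            (uint32_t)in[3];
}
```

Serializing field by field this way also sidesteps struct padding and alignment.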

Question 20

How does an analog-to-digital converter (ADC) work conceptually, and what do resolution, sampling rate, and reference voltage mean for your measurement?

Accepted Answer

An ADC samples an analog voltage and quantizes it to a digital code. Resolution (bits) sets the number of code steps; the LSB voltage = Vref / 2^N, so 12-bit at 3.3V Vref gives ~0.8 mV per step. Vref sets full scale and directly affects accuracy and noise; a noisy or wrong reference scales your reading. Sampling rate must satisfy Nyquist (at least 2x the signal bandwidth) to avoid aliasing, so anti-alias filtering is needed. Also account for input impedance/sample-and-hold settling time, INL/DNL, and oversampling/averaging to gain effective bits.
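The LSB arithmetic as a small integer-only sketch; `adc_code_to_uv` is a hypothetical helper that works in microvolts to avoid floating point.

```c
#include <stdint.h>

/* Convert a raw ADC code to microvolts using LSB = Vref / 2^N.
 * The 64-bit intermediate avoids overflow of code * vref_uv. */
uint32_t adc_code_to_uv(uint32_t code, uint32_t vref_uv, unsigned bits)
{
    return (uint32_t)(((uint64_t)code * vref_uv) >> bits);
}
```

With a 3.3 V reference and 12 bits, one code step comes out at 805 uV after truncation, matching the ~0.8 mV figure, and mid-scale (code 2048) is 1.65 V.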

Question 21

Explain how PWM works and how you would use it to dim an LED versus control a servo or a motor.

Accepted Answer

PWM produces a square wave with a fixed period and a variable duty cycle. For LED dimming, average power scales with duty cycle; pick a frequency above flicker perception (>~200 Hz, often kHz) and the eye integrates it. For a hobby servo, the information is in the pulse width (typically 1-2 ms within a 20 ms / 50 Hz frame), not the average. For a DC motor, PWM into an H-bridge controls average voltage/speed; choose a frequency above audible range to avoid whine and consider inductance. Mention resolution: duty-cycle steps = timer clock / PWM frequency, so raising the PWM frequency at a fixed timer clock leaves fewer steps.
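The timer arithmetic behind both uses can be sketched with three hypothetical helpers; how the counts map onto reload and compare registers is device-specific and not shown.

```c
#include <stdint.h>

/* Timer counts in one PWM period; this is also the number of duty steps. */
uint32_t pwm_period_counts(uint32_t timer_hz, uint32_t pwm_hz)
{
    return timer_hz / pwm_hz;
}

/* Compare value for a duty cycle given in permille (0..1000),
 * as used for LED dimming or motor speed. */
uint32_t pwm_duty_counts(uint32_t period_counts, uint32_t duty_permille)
{
    return (uint32_t)(((uint64_t)period_counts * duty_permille) / 1000u);
}

/* Compare value for an absolute pulse width in microseconds,
 * as used for a hobby servo. */
uint32_t pwm_pulse_counts(uint32_t timer_hz, uint32_t pulse_us)
{
    return (uint32_t)(((uint64_t)timer_hz * pulse_us) / 1000000u);
}
```

With a 1 MHz timer clock, a 50 Hz servo frame is 20000 counts and a 1.5 ms center pulse is 1500 counts.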

Question 22

What is a circular (ring) buffer, why is it ubiquitous in embedded UART drivers, and how do you implement one that is safe between an ISR producer and a main-loop consumer?

Accepted Answer

A fixed-size buffer with head and tail indices that wrap around, giving O(1) enqueue/dequeue with no dynamic allocation, ideal for streaming UART RX/TX. For single-producer (ISR) / single-consumer (main), you can make it lock-free: the ISR only writes the head index, the consumer only writes the tail index, each side reads the other's index. Use a power-of-two size for cheap masking, mark indices volatile, and ensure index updates are atomic on your architecture (or use a memory barrier). Handle full/empty with a spare slot or a separate count.
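A minimal single-producer / single-consumer sketch using the spare-slot convention. The type and function names and the size of 16 are assumptions; the one-byte indices are chosen so each index update is a single store on any core.

```c
#include <stdbool.h>
#include <stdint.h>

#define RB_SIZE 16u              /* must be a power of two */
#define RB_MASK (RB_SIZE - 1u)

typedef struct {
    volatile uint8_t head;           /* written only by the producer (ISR) */
    volatile uint8_t tail;           /* written only by the consumer (main) */
    volatile uint8_t data[RB_SIZE];  /* volatile keeps stores ordered vs. head */
} ringbuf_t;

/* Producer side. Returns false when full; one slot is always left empty
 * so that head == tail unambiguously means "empty". */
bool rb_put(ringbuf_t *rb, uint8_t byte)
{
    uint8_t head = rb->head;
    uint8_t next = (uint8_t)((head + 1u) & RB_MASK);
    if (next == rb->tail) {
        return false;
    }
    rb->data[head] = byte;
    rb->head = next;             /* publish only after the data is stored */
    return true;
}

/* Consumer side. Returns false when empty. */
bool rb_get(ringbuf_t *rb, uint8_t *out)
{
    uint8_t tail = rb->tail;
    if (tail == rb->head) {
        return false;
    }
    *out = rb->data[tail];
    rb->tail = (uint8_t)((tail + 1u) & RB_MASK);
    return true;
}
```

On a multi-core or out-of-order part, volatile alone is not enough and the publish step would need a real memory barrier.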

Question 23

How would you design a robust firmware update (OTA / bootloader) mechanism for a deployed device so a failed update never bricks it?

Accepted Answer

Use a small immutable bootloader plus dual application banks (A/B) or a download-then-swap scheme. Download new image to a spare slot, verify integrity (CRC/hash) and authenticity (signature) before activation, then atomically switch the active bank by flipping a flag. Keep the previous image so a failed boot triggers automatic rollback; use a boot-count/watchdog confirmation handshake so an image must prove it runs before being marked good. Handle power loss at every step (interrupted download, mid-erase). Encrypt the image in transit, never trust unsigned firmware, and protect the bootloader with read-out protection.
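The boot-count confirmation handshake might be sketched like this. `boot_state_t`, `boot_select_bank`, and the limit of 3 tries are assumptions; persisting the state in power-loss-safe storage and verifying the image signature are left out.

```c
#include <stdint.h>

#define MAX_BOOT_TRIES 3u

typedef struct {
    uint8_t active;      /* 0 = bank A, 1 = bank B */
    uint8_t confirmed;   /* set by the application once it proves it runs */
    uint8_t boot_tries;  /* boots attempted since the last bank switch */
} boot_state_t;

/* Run by the bootloader on every reset. Returns the bank to boot and
 * updates the state, which the caller must write back to storage. */
uint8_t boot_select_bank(boot_state_t *s)
{
    if (!s->confirmed) {
        if (s->boot_tries >= MAX_BOOT_TRIES) {
            /* New image never confirmed: roll back to the other bank,
             * which is assumed to hold the previous known-good image. */
            s->active ^= 1u;
            s->confirmed = 1u;
            s->boot_tries = 0u;
        } else {
            s->boot_tries++;
        }
    }
    return s->active;
}
```

The application sets `confirmed` only after its own self-checks pass, so an image that crashes or hangs early is retried a few times and then abandoned.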

Question 24

What is the difference between flash, EEPROM, and RAM in an embedded system, and how does flash wear affect how you store frequently changing data?

Accepted Answer

RAM is volatile, fast, byte-writable, used for runtime data. Flash is non-volatile, holds code and constants, erased in blocks/sectors and written in pages, with limited erase/write endurance (often ~10k-100k cycles). EEPROM is non-volatile and byte-erasable with higher endurance, used for small config. Flash wear: rewriting the same sector repeatedly wears it out, so use wear leveling, journaling, or write-counters, and avoid erase-on-every-update patterns. For frequently changing values use EEPROM, FRAM, an external EEPROM, or an emulated-EEPROM library with rotation across flash pages.

Question 25

Your UART is receiving garbled bytes at higher baud rates but works fine slowly. How do you diagnose and fix it?

Accepted Answer

Garbling that scales with baud usually points to clock/timing or overrun. Check baud-rate error: the peripheral clock and divisor may not produce an accurate baud at high speed (>~2-3% error corrupts framing); verify with a scope on the bit period. Check for RX overrun: at high rates the ISR/DMA may not keep up, so move to DMA or a bigger FIFO/ring buffer. Verify signal integrity: scope the line for rounded edges, reflections, noise, ground bounce, or missing common ground. Confirm framing/parity/stop-bit settings match both ends. Look at framing error flags. Fix by adjusting clock source, enabling DMA, shortening/terminating the cable, or lowering baud.
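The baud-rate error check is simple arithmetic, sketched here assuming a UART with 16x oversampling and an integer-only divisor; many parts add fractional dividers or 8x modes, so check the reference manual. Function names are hypothetical.

```c
#include <stdint.h>

/* Nearest integer divisor for the requested baud (16x oversampling).
 * A result of 0 means the clock is too slow for that baud. */
uint32_t uart_divisor(uint32_t clk_hz, uint32_t baud)
{
    return (clk_hz + 8u * baud) / (16u * baud);
}

/* Signed baud error in tenths of a percent for a given non-zero divisor. */
int32_t uart_baud_error_permille(uint32_t clk_hz, uint32_t baud, uint32_t div)
{
    int64_t actual = (int64_t)clk_hz / (16 * (int64_t)div);
    return (int32_t)(((actual - (int64_t)baud) * 1000) / (int64_t)baud);
}
```

A 16 MHz clock gives about 0.1% error at 9600 baud but about -3.5% at 115200, which is exactly the works-slow, fails-fast pattern in the question.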

Question 26

Explain race conditions in embedded systems and walk me through a critical section. When is disabling interrupts the wrong tool?

Accepted Answer

A race occurs when two contexts (ISR and main, or two tasks) access shared state and the outcome depends on timing, e.g. a non-atomic read-modify-write interrupted halfway. A critical section protects the shared access; disabling interrupts is the simplest on bare metal but must be short because it raises worst-case interrupt latency and can break real-time guarantees. It is the wrong tool when: you have an RTOS (use a mutex/semaphore so you do not block higher-priority ISRs), when the section is long, on multi-core (disabling interrupts on one core does not stop the other; you need a spinlock), or when an ISR is the contender (you cannot mutex against an ISR). Prefer atomics or lock-free designs where possible.

Question 27

Tell me about a time you had to debug a particularly difficult hardware or firmware issue. How did you approach it?

Accepted Answer

STAR. Situation: a specific elusive bug (e.g., intermittent resets in the field, or sensor data corruption). Task: root-cause and fix it without a reliable repro. Action: describe forming hypotheses, isolating variables, instrumenting with a scope/logic analyzer/trace buffer, reading datasheets/errata, and bisecting changes; show you didn't just shotgun-debug. Result: identified the true cause (e.g., a brownout, a missed pull-up, a stack overflow), the fix, and a preventive measure (added monitoring, errata workaround, test). Emphasize what you learned and how you de-risked future designs.

Question 28

Describe a situation where you had to ship firmware under a tight deadline with a known limitation or technical debt. How did you handle the tradeoff?

Accepted Answer

STAR. Situation: a deadline (e.g., a customer demo or production run) with an unfinished or imperfect feature. Task: decide what is safe to ship. Action: assess the risk of the limitation, contain it (feature flag, conservative defaults, extra validation, documented known issue), communicate the tradeoff clearly to stakeholders, and ensure a safe fallback (e.g., OTA path to patch later). Result: shipped on time without compromising safety/reliability, then paid down the debt on a planned schedule. Emphasize transparent communication and that you never hid the risk.

Question 29

How do you measure and reduce interrupt latency in a real-time system, and why does it matter?

Accepted Answer

Interrupt latency is the time from the hardware event to the first instruction of your ISR (and full response includes ISR execution and any deferred task). Contributors: longest period interrupts are disabled (critical sections), higher-or-equal priority ISRs running, context save/stacking time, and on some cores wait states/flash latency. Measure with a GPIO toggle at event and at ISR entry on a scope, or use cycle counters (DWT CYCCNT) and trace. Reduce by minimizing and bounding critical sections, raising the priority of the critical interrupt, keeping ISRs short with deferred processing, avoiding long atomic blocks, and using zero-wait-state RAM for hot handlers. It matters because missing a hard deadline can mean data loss or a safety failure.

Question 30

What is stack overflow in an embedded context, how is it different from heap exhaustion, and how do you detect and prevent it?

Accepted Answer

Stack overflow: the call stack grows past its allotted region (deep recursion, large local arrays, deep ISR nesting) and corrupts adjacent memory, often .bss/heap or another task's stack. Heap exhaustion: malloc fails or fragmentation prevents an allocation. Detection: paint the stack with a known pattern and check the high-water mark; use an MPU guard region or a redzone; enable RTOS stack-overflow hooks; on Cortex-M use the PSPLIM/MSPLIM stack-limit registers (v8-M) or a MemManage fault. Prevention: size stacks from worst-case analysis, avoid recursion and large stack buffers, prefer static allocation, measure high-water marks in test, and avoid dynamic allocation in long-running firmware to dodge fragmentation.
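Stack painting and the high-water check can be sketched against a plain array. The fill pattern and function names are assumptions, and the scan assumes a descending stack, so index 0 is the last word that would ever be used.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_FILL 0xA5A5A5A5u

/* Fill the whole stack region with a known pattern before it is used. */
void stack_paint(uint32_t *base, size_t words)
{
    for (size_t i = 0; i < words; i++) {
        base[i] = STACK_FILL;
    }
}

/* Count untouched words from the low end of the region.
 * The result is the remaining headroom; 0 suggests an overflow. */
size_t stack_unused_words(const uint32_t *base, size_t words)
{
    size_t n = 0;
    while (n < words && base[n] == STACK_FILL) {
        n++;
    }
    return n;
}
```

The check can miss a frame that happens to contain the pattern or that skips over words without writing them, so treat the number as a lower bound on usage.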

Question 31

Why might you avoid dynamic memory allocation (malloc/free) in long-running embedded firmware, and what do you use instead?

Accepted Answer

Problems: heap fragmentation over time can make an allocation fail even with enough total free memory; malloc has non-deterministic timing (bad for real-time); failure handling is awkward in deeply embedded code; and it complicates worst-case memory analysis. Alternatives: static/global allocation sized at build time, fixed-size memory pools / block allocators, stack allocation for short-lived data, and arena allocators. Many standards (e.g., MISRA, automotive/aerospace) restrict or forbid dynamic allocation after initialization. If you must allocate, do it once at startup.
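A fixed-size block pool is short enough to sketch in full. Block size, block count, and the `pool_*` names are assumptions; the free list is threaded through the unused blocks themselves.

```c
#include <stddef.h>
#include <stdint.h>

#define POOL_BLOCK_SIZE 32u
#define POOL_BLOCKS     4u

/* A free block stores the link to the next free block in its own storage. */
typedef union pool_block {
    union pool_block *next;
    uint8_t payload[POOL_BLOCK_SIZE];
} pool_block_t;

static pool_block_t pool_storage[POOL_BLOCKS];
static pool_block_t *pool_free_list;

/* Chain every block onto the free list. Call once at startup. */
void pool_init(void)
{
    pool_free_list = NULL;
    for (size_t i = 0; i < POOL_BLOCKS; i++) {
        pool_storage[i].next = pool_free_list;
        pool_free_list = &pool_storage[i];
    }
}

/* O(1) allocation: pop the head of the free list, or NULL if exhausted. */
void *pool_alloc(void)
{
    pool_block_t *b = pool_free_list;
    if (b != NULL) {
        pool_free_list = b->next;
    }
    return b;
}

/* O(1) release: push the block back. p must have come from pool_alloc. */
void pool_free(void *p)
{
    pool_block_t *b = p;
    b->next = pool_free_list;
    pool_free_list = b;
}
```

Every block is the same size, so the pool cannot fragment and both operations take constant time. If an ISR also uses it, the list updates need a critical section.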

Question 32

Explain what a logic analyzer and an oscilloscope each tell you, and how you would use them together to debug an I2C sensor that returns wrong data.

Accepted Answer

An oscilloscope shows the analog shape of a signal over time: voltage levels, rise/fall times, ringing, noise, and signal integrity. A logic analyzer shows many digital lines as decoded protocol over time: it can decode I2C transactions, addresses, ACK/NACK, and data, great for protocol-level bugs. Workflow: first use the logic analyzer to decode the I2C exchange, confirm the right slave address, register pointer, ACKs, and whether the data bytes match expectations or a NACK appears. If the protocol looks valid but data is wrong, or you see NACKs, switch to the scope to check pull-up strength, rise times, voltage levels, and noise on SDA/SCL. Together they separate protocol/firmware bugs from electrical/integrity bugs.
