# 

# RVMC/NOEL3 A new RISC-V microcontroller CPU core

2025-04-02







- 1 Overview
- 2 NOEL3 configurations
- 3 IP cores ecosystem

- 4 Fault tolerance features
- 5 Software ecosystem



### **RVMC – NOEL3**

- Developing microcontroller as part of RISC-V Hardware/Software
  Ecosystem under the name of RVMC, RISC-V Microcontroller
- RVMC will be available as part of ESA IP core portfolio
- Frontgrade Gaisler will make a commercially available version under product name NOEL3







# **NOEL3 – RISC-V processor IP core**

#### **Characteristics:**

- RISC-V processor core (32-bits)
  - Barrel architecture
- Deterministic, in-order pipeline
- Fault tolerance features
- Small area footprint, suitable also for very small FPGAs
- RISC-V software and tool support, plus <u>our own</u> <u>ecosystem and toolchains</u>

#### **Primary features:**

- RV32IMAFCB
  - Suitable for RTOS
  - Configurable number of threads
  - Internal tightly-coupled memories
- AHB and deterministic bus support



#### Performance

• CoreMark\*/MHz (per thread): 0.9\*\*

\* -O3 -finline-functions --param max-inline-insns-auto=20 --param inline-min-speedup=10 -funswitch-loops -funroll-all-loops -fgcse-after-reload -fpredictivecommoning -fipa-cp-clone -falign-jumps=8 -falign-functions=8 --param=11-cache-line-size=32 --param=11-cache-size=62 -mcmodel=medany -static std=gnu99 -ffast-math -ffreestanding -fno-common -fno-builtin-printf -fno-tree-loop-distribute-patterns -march=r/32i\_zmmul\_zicsr -mabi=ilp32 -nostartfiles -Im -lgcc -T ../riscv64-baremetal/link.ld -I../riscv64-baremetal -I. -DFLAGS\_STR=\""-O3 -finline-functions --param max-inline-insns-auto=20 --param inlinemin-speedup=10 -funswitch-loops -funroll-all-loops -fgcse-after-reload -fpredictive-commoning -fipa-cp-clone -falign-jumps=8 -falign-functions=8 -param=11-cache-line-size=32 --param=11-cache-size=62 -mcmodel=medany -static -std=gnu99 -ffast-math -ffreestanding -fno-common -fno-builtin-printf fno-tree-loop-distribute-patterns -march=r/32i\_zmmul\_zicsr -mabi=ilp32 -nostartfiles -Im -lgcc -T ../riscv64-baremetal/link.ld -Wno-implicit-int -Wno-

implicit-function-declaration

<sup>\*\*</sup> Performace may change



#### **Barrel architecture?**

- Fine-grained multithreading processor
- Only one instruction per thread is executed in the pipeline at the same time
- Number of stages = Number of threads



| Cycle | Thread retiring<br>instruction |  |
|-------|--------------------------------|--|
| 0     | Thread 0                       |  |

- Small area
  - No data dependencies/forwarding, no branch prediction, no speculation
- Easier verification
  - Mostly 1-iteration instructions
  - In-order pipeline, instruction retires before next one starts
- High multithreading performance



#### **Barrel architecture?**

- Fine-grained multithreading processor
- Only one instruction per thread is executed in the pipeline at the same time
- Number of stages = Number of threads



- Small area
  - No data dependencies/forwarding, no branch prediction, no speculation
- Easier verification
  - Mostly 1-iteration instructions
  - In-order pipeline, instruction retires before next one starts
- High multithreading performance

| Cycle | Thread retiring<br>instruction |  |
|-------|--------------------------------|--|
| 0     | Thread 0<br>Thread 1           |  |
| 1     |                                |  |



#### **Barrel architecture?**

- Fine-grained multithreading processor
- Only one instruction per thread is executed in the pipeline at the same time
- Number of stages = Number of threads



- Small area
  - No data dependencies/forwarding, no branch prediction, no speculation
- Easier verification
  - Mostly 1-iteration instructions
  - In-order pipeline, instruction retires before next one starts
- High multithreading performance

| Cycle | Thread retiring<br>instruction |  |
|-------|--------------------------------|--|
| 0     | Thread 0                       |  |
| 1     | Thread 1                       |  |
| 2     | Thread 2                       |  |



#### **Barrel architecture?**

- Fine-grained multithreading processor
- Only one instruction per thread is executed in the pipeline at the same time
- Number of stages = Number of threads



- Small area
  - No data dependencies/forwarding, no branch prediction, no speculation
- Easier verification
  - Mostly 1-iteration instructions
  - In-order pipeline, instruction retires before next one starts
- High multithreading performance

| Cycle | Thread retiring<br>instruction |  |
|-------|--------------------------------|--|
| 0     | Thread 0                       |  |
| 1     | Thread 1                       |  |
| 2     | Thread 2                       |  |
| 3     | Thread 3                       |  |



#### **Barrel architecture?**

- Fine-grained multithreading processor
- Only one instruction per thread is executed in the pipeline at the same time
- Number of stages = Number of threads



- Small area
  - No data dependencies/forwarding, no branch prediction, no speculation
- Easier verification
  - Mostly 1-iteration instructions
  - In-order pipeline, instruction retires before next one starts
- High multithreading performance

| Cycle | Thread retiring<br>instruction |  |
|-------|--------------------------------|--|
| 0     | Thread 0                       |  |
| 1     | Thread 1                       |  |
| 2     | Thread 2                       |  |
| 3     | Thread 3                       |  |
| 4     | Thread 4                       |  |



#### **Barrel architecture?**

- Fine-grained multithreading processor
- Only one instruction per thread is executed in the pipeline at the same time
- Number of stages = Number of threads



- Small area
  - No data dependencies/forwarding, no branch prediction, no speculation
- Easier verification
  - Mostly 1-iteration instructions
  - In-order pipeline, instruction retires before next one starts
- High multithreading performance

| Cycle      | Thread retiring<br>instruction |  |
|------------|--------------------------------|--|
| 0          | Thread 0                       |  |
| 1          | Thread 1                       |  |
| 2          | Thread 2                       |  |
| 3          | Thread 3<br>Thread 4           |  |
| 4          |                                |  |
| 5 Thread 0 |                                |  |



#### **Barrel architecture?**

- Fine-grained multithreading processor
- Only one instruction per thread is executed in the pipeline at the same time
- Number of stages = Number of threads



- Small area
  - No data dependencies/forwarding, no branch prediction, no speculation
- Easier verification
  - Mostly 1-iteration instructions
  - In-order pipeline, instruction retires before next one starts
- High multithreading performance

| Cycle      | Thread retiring<br>instruction |  |
|------------|--------------------------------|--|
| 0          | Thread 0                       |  |
| 1          | Thread 1                       |  |
| 2          | Thread 2                       |  |
| 3          | Thread 3                       |  |
| 4          | Thread 4                       |  |
| 5          | Thread 0                       |  |
| 6 Thread 1 |                                |  |



#### **Barrel architecture?**

- Fine-grained multithreading processor
- Only one instruction per thread is executed in the pipeline at the same time
- Number of stages = Number of threads



- Small area
  - No data dependencies/forwarding, no branch prediction, no speculation
- Easier verification
  - Mostly 1-iteration instructions
  - In-order pipeline, instruction retires before next one starts
- High multithreading performance

| Cycle | Thread retiring<br>instruction |  |
|-------|--------------------------------|--|
| 0     | Thread 0                       |  |
| 1     | Thread 1                       |  |
| 2     | Thread 2                       |  |
| 3     | Thread 3                       |  |
| 4     | Thread 4                       |  |
| 5     | Thread 0                       |  |
| 6     | Thread 1                       |  |
| 7     | Thread 2                       |  |



#### **Barrel architecture?**

- Fine-grained multithreading processor
- Only one instruction per thread is executed in the pipeline at the same time
- Number of stages = Number of threads



- Small area
  - No data dependencies/forwarding, no branch prediction, no speculation
- Easier verification
  - Mostly 1-iteration instructions
  - In-order pipeline, instruction retires before next one starts
- High multithreading performance

| Cycle | Thread retiring<br>instruction |  |
|-------|--------------------------------|--|
| 0     | Thread 0                       |  |
| 1     | Thread 1                       |  |
| 2     | Thread 2                       |  |
| 3     | Thread 3                       |  |
| 4     | Thread 4                       |  |
| 5     | Thread 0                       |  |
| 6     | Thread 1                       |  |
| 7     | Thread 2                       |  |
| 8     | Thread 3                       |  |

# NOEL3 primary features



# **NOEL3 – Primary features**



<u>https://www.gaisler.com/products/noel3</u>



### **NOEL3 – Determinism**

### Why?

- Real-time systems
- System predictability

#### What?

- Accurate, predictable execution time regardless of what other threads are doing
- · Threads do not affect each other execution-wise
- Access to deterministic bus peripherals

### What?

- Accesses to external memory
  - TCMs should suffice
- Interrupts
  - Response time is bounded
  - Possibility to allocate one of the threads to only service interrupts
- AHB accesses

### How?

- TCMs
- Deterministic bus interface
- Time slotted accelerator interface





## **NOEL3** comparison

| Architecture                      | RV32I    | RV32IZmmul | RV32IMAFCBZfhZfa |
|-----------------------------------|----------|------------|------------------|
| Hardware execution threads        | 3+1      | 3+1        | 6+1              |
| Tightly Coupled<br>Memory         | 32+ 8 KB | 256+256 KB | 256+256 KB       |
| Floating point                    | No       | No         | Yes              |
| Multiply/Divide                   | No       | Multiply   | Yes              |
| Atomics                           | No       | No         | Yes              |
| Max frequency                     | 208 MHz  | 202 MHz    | TBD              |
| CoreMark/MHz (for<br>each thread) | 0.35     | 0.9        | TBD              |
| LUTs                              | 1.5k     | 1.8k       | TBD              |
| BRAMs                             | 32       | 134        | TBD              |

Example implementation figures for a Kintex Ultrascale FPGA. The figures have been extracted for the current development stage and are subject to change.

More NOEL3 configurations will be available.

**TCM** – Tightly Coupled Memory. Directly connected to the CPU for low latency and predictable timing

#### Compressed –

RISC-V 16-bits instructions to reduce the size of compiled programs

**Atomics** – RISC-V instructions for atomic operations, crucial for multithreaded and multi-core processing

**Bit Manipulation** – RISC-V instructions to directly manipulate individual bits or groups of bits, improving performance and code density





### **NOEL3 – IP ecosystem**

#### • Memory controllers (with EDAC):

- PROM/IO/SRAM/SDRAM
- SINGLE/DUAL/QUAD SPI
- DDR2/DDR3 SDRAM
- NAND Flash memory controller

#### • Peripherals:

- UART
- SpaceWire (controller/router)
- High Speed Serial Links (SpaceFibre, Wizardlink)
- 32-bit master/target PCI interface
- MIL-STD-1553B
- CAN, CAN FD
- Gbit Ethernet
- FPGA supervisor
- ...and many more, see <u>https://www.gaisler.com/grlib-ip-library</u>

| High Speed<br>Serial Links  | Gbit Ethernet              | MIL-STD-1553B            |
|-----------------------------|----------------------------|--------------------------|
| SpaceWire                   | <b>NOEI3</b>               | CAN/CAN FD               |
| 32-bit PCI<br>master/target | UART                       | NAND flash<br>controller |
| PROM<br>controller          | DDR2/3 SDRAM<br>controller | SPI<br>controller        |
|                             |                            |                          |

SYSTEM-ON-CHIP IP LIBRARY

# Fault tolerance features



# **NOEL3 Fault tolerance**

#### SECDED ECC:

- Protection of memory blocks (tightly coupled memories, register files) using error-correcting codes
- Two means of protection are supported:
  - Native ECC in FPGA fabric (Frontgrade Certus, AMD 7-series/Ultrascale/Versal, Microchip RTG4/Polarfire, NanoXplore)
  - Generic RTL-based ECC (ASIC, other technologies)
- Protection of flip-flops and logic is left to the underlying technology
  - Rad-hard FPGAs and ASIC libraries
  - TMR for SRAM-based FPGAs

#### Scrubbing:

- TCMs hardware scrubber
- Register file hardware scrubber



# Software ecosystem



### **NOEL3 Software ecosystem**

### Hardware debugger:

- RISC-V standard specification
- GRMON4

### **Target software:**

- Planned examples for software to take advantage of deterministic multithreading
- Planned RTEMS
- Planned Zephyr ports







## **NOEL3** Availability

- First implementation of RV32IZmmul running software
- Development continues with planned efforts on verification and feature additions
- Target release of RV32IMAFCBZfhZfa is Q3 2025
- **NOCLB RISC-V®**

• Sign up to our newsletter to receive updates, <u>www.gaisler.com/newsletter</u>