Investigation of real-time capable embedded SMT processor techniques


420 / Accepted / Finalized / Evaluated

Report of research results

Processors in current embedded systems are characterised by a simple architecture with short pipelines and in-order execution. This type of pipeline makes the computation of the worst-case execution time (WCET) easier. On the other hand, current embedded systems place increasing demands on processor performance. Meeting these demands requires hardware components, such as caches and branch predictors, that improve performance. However, these hardware features complicate the computation of the WCET.

Because embedded processors must be low in cost, obtaining as much performance as possible from each resource is desirable. Hence, a viable option is a simultaneous multithreading (SMT) processor, which shares many resources between several threads for a good cost-performance trade-off. In an SMT processor, instructions of several threads can be issued simultaneously to the back-end pipelines, yielding a dynamic, cycle-by-cycle reassignment of pipelines to threads based on the decisions of the scheduler hardware. Multithreading has so far mainly been implemented to increase processor performance by latency hiding; the most prominent examples are the Intel Pentium 4 processor and the IBM Power 5. In the embedded world the use of SMT processors is more difficult: although SMT processors achieve high throughput, the threads in an SMT processor share many resources. As a consequence, the interaction between threads, and hence the execution time of each thread, becomes highly unpredictable, which is undesirable in embedded systems. Moreover, appropriate hardware scheduling techniques should be integrated into the SMT pipeline to ensure WCET analysability of one or more hard real-time threads.
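As a rough illustration of what such a hardware scheduling technique could look like, the following minimal sketch hands out the issue slots of one cycle in strict priority order. The function name, the thread names and the issue width are hypothetical and not taken from the cluster's work, and the model covers issue slots only; shared caches, multi-cycle functional units and other shared resources are ignored.

```python
from typing import Dict, List


def assign_issue_slots(ready: Dict[str, int],
                       priority: List[str],
                       width: int) -> Dict[str, int]:
    """Distribute `width` issue slots for one cycle in strict priority order.

    ready    -- number of instructions each thread could issue this cycle
    priority -- thread names, highest priority first (assumed policy)
    width    -- total issue slots available per cycle (assumed parameter)
    """
    granted: Dict[str, int] = {}
    free = width
    for thread in priority:
        # A thread takes what it can use, limited by the slots still free.
        take = min(ready.get(thread, 0), free)
        granted[thread] = take
        free -= take
    return granted


order = ["hrt", "bg0", "bg1"]  # "hrt" is the hard real-time thread

# The hard real-time thread running alone ...
alone = assign_issue_slots({"hrt": 2}, order, width=4)
# ... and competing with two background threads.
shared = assign_issue_slots({"hrt": 2, "bg0": 3, "bg1": 3}, order, width=4)

# In this simplified model the highest-priority thread is granted the same
# number of slots in both cases; background threads only use the leftovers.
print(alone["hrt"], shared["hrt"], shared["bg0"], shared["bg1"])
```

Under these assumptions the slots granted to the highest-priority thread do not depend on what the other threads request, which is the property that would let its timing be analysed as if it ran alone at the issue stage.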

The aim of this cluster is to propose an SMT architecture for real-time systems that meets the following objectives:

* First, a key point is to provide predictability, or at least low variability, in the execution time of the threads executed on the SMT processor. This objective can be fulfilled by either:

a) Avoiding any penalty due to the use of common resources by different threads, which allows classical WCET estimation techniques to be used, or

b) Upper-bounding the interference in resource use that a thread can suffer from other threads

* Another key point is to keep energy use and power consumption as low as possible (a highly desirable feature in embedded systems). Hence, the distribution of power consumption in an SMT processor should be analysed, and new techniques should be proposed to reduce it.
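Option (b) of the first objective can be illustrated with a simple additive bound. This is only a sketch under stated assumptions: the function and its parameters are hypothetical, and it assumes that each access to a shared resource is delayed by at most a fixed number of cycles and that these delays add up linearly (no timing anomalies).

```python
def wcet_upper_bound(wcet_isolated: int,
                     shared_accesses: int,
                     max_delay_per_access: int) -> int:
    """Bound a thread's WCET on an SMT processor, in cycles.

    wcet_isolated        -- WCET of the thread when it runs alone
    shared_accesses      -- worst-case number of accesses to shared resources
    max_delay_per_access -- largest delay another thread can add to one access
    All three inputs are assumed to come from a separate analysis.
    """
    return wcet_isolated + shared_accesses * max_delay_per_access


# Illustrative numbers only: 10000 cycles in isolation, 400 shared accesses,
# each delayed by at most 3 cycles.
bound = wcet_upper_bound(10_000, 400, 3)
print(bound)  # 11200
```

With a delay of zero per access the bound collapses to the isolated WCET, which corresponds to option (a).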


Research cluster

Requested: € 48900
Granted: € 48900

Requested: € 13200
Granted: € 13200

4 Cluster meetings (2p from UPC, 2p from Augsburg, 2p from Toulouse)

4 x 6 x 1000 = 24000 EUR

Cédric Landet extended visit to Augsburg (travel + 3 months): 5300 EUR

Prof. Sainrat visit to Augsburg: 1400 EUR

Prof. Ungerer's extended visit to UPC (travel + 2 months lodging): 5000 EUR

1 Student fellowship at UPC (12 months): 12 x 1100 = 13200 EUR


Requested: 0 month(s)
Granted: 0 month(s), starting on: Tue, January 1, 1980

VALERO Mateo (UPC) (--member--)
UNGERER Theo (University of Augsburg) (--member--)
SAINRAT Pascal (CNRS) (--member--)
RAMIREZ Alex (UPC) (--member--)
NORDEN Erik (INFINEON)

Francisco Cazorla, Barcelona Supercomputing Center
Cédric Landet, University of Toulouse
Sascha Uhrig, University of Augsburg
Florian Kluge, University of Augsburg


