The embedded enterprise

It is now possible to put together sophisticated and powerful embedded and control systems that are mostly composed of pre-existing, working software. One of the demonstrations we showed at AMD’s recent embedded workshop was a small two-processor (4-core) box running RTLinux and Windows XP. Embedded development teams for RTCore-based products can mirror the decoupling of the RTCore architecture: one team can do the low-level control application and other teams can work on all the non-time-dependent “enterprise” software. A printer controller might be developed by connecting a real-time component under RTCore that talks to sensors and actuators, a database and remote update/maintenance subsystem under Linux, and an operator interface running under Windows XP. These three components can be developed independently by different teams as long as architects have correctly designed the interfaces.

The RTCore component on current Opterons can guarantee either sub-10-microsecond or 1-microsecond worst-case interrupt latency, depending on whether you share the processor core with Linux or dedicate a processor core to real-time. The Linux application could be made web accessible using Apache, for example, with MySQL, Oracle or even DB2 as the backend database. The Windows application can go right to the video, and our experience is that the double buffering of the Linux file system and Windows itself improves performance when running a virtualized Windows kernel. A Sun Java application on the Linux side may connect to C# under Windows on one side and to C++ or C executing in the RTCore environment on the other.

Let’s look at two example applications: a printer and an automobile “infotainment” system. One way to build a printer is as a system [FPGA: RTCore: Management: Operator] where RT software in RTCore manages a controlling FPGA on one side (which handles the sub-microsecond requirements of the hardware) and provides paths for the Linux-based Management and Windows-based Operator subsystems to submit commands and get data. For a lighter infotainment system, perhaps using a single processor, we might need [Devices: RTCore: Display] where the RTCore component acts as a device accelerator, replacing dedicated hardware (such as a smart MPEG) with software, a Windows-based Display manages multimedia, and Linux is used essentially as a BSP.

Security can be easily improved by making the three components cross-check each other. For example, a remote update connecting to the Java-based server under Linux may ask the Windows-based and RTCore-based components to cross-check certificates. An attacker would have to deal with three different operating system environments working together. Reliability can also be improved, e.g. by having RTCore force a reboot of Linux or Windows if progress is not being made.
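The reliability idea can be sketched as a simple progress watchdog. This is only an illustration in Python, not RTCore code: the class name `ProgressWatchdog`, the component names, and the timeout are all hypothetical, and the actual reboot action is left out. Each supervised component reports a heartbeat when it makes progress, and the supervisor lists the components that have gone quiet for too long.

```python
import time


class ProgressWatchdog:
    """Tracks heartbeats from supervised components and reports stalled ones.

    Hypothetical sketch: a real-time supervisor would act on the result
    (for example by restarting the stalled guest); here we only report it.
    """

    def __init__(self, timeout_s, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.clock = clock          # injectable so the logic can be tested
        self.last_beat = {}         # component name -> time of last heartbeat

    def register(self, name):
        # A newly registered component counts as having just made progress.
        self.last_beat[name] = self.clock()

    def heartbeat(self, name):
        # Called by (or on behalf of) a component whenever it makes progress.
        self.last_beat[name] = self.clock()

    def stalled(self):
        # Names of components whose last heartbeat is older than the timeout.
        now = self.clock()
        return sorted(name for name, beat in self.last_beat.items()
                      if now - beat > self.timeout_s)


# Usage with a fake clock: "windows" stops reporting and is flagged.
fake_now = [0.0]
watchdog = ProgressWatchdog(timeout_s=1.0, clock=lambda: fake_now[0])
watchdog.register("linux")
watchdog.register("windows")
fake_now[0] = 0.8
watchdog.heartbeat("linux")
fake_now[0] = 1.5
needs_reboot = watchdog.stalled()
```

The point of the design is that the check runs in the component with the strongest timing guarantees, so a hung general-purpose operating system cannot prevent its own failure from being noticed.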

The underlying theme here is how far this new type of embedded software has moved from the traditional “hand crafted” embedded development process. Programmers have large functional components available to use in development instead of starting from the traditional impoverished low-level embedded environment. The implications for both productivity and creativity are significant.

Speculation on modularity and information theory

One way of defining modules is by what engineers have to know. If a program X consists of components X1 … Xn then we can say Xi is a module if a programmer can modify it without learning “much” about any other module. A second, related measure can be in terms of information hiding: a module essentially encapsulates some data and exports interfaces for manipulating the data. This second definition seems quantifiable. Given a system state S, let Xi(S) be the state of component Xi, essentially all the data that Xi in state S will use to go to the next state. We need Xi(S) to be minimal: it must contain only information that Xi uses to compute its next state and no extra information at all. For example, if we have a poorly designed component that is just a single variable with “set variable” and “get variable” methods, then Xi(S) would just be the contents of the variable. Now if looking at Xj(S) for all the other components Xj would let us know which of the other components last wrote a value to Xi and what that value was, we could recreate Xi(S) from the other components. In other words, we would see that Xi hid nothing. If Xi did something as simple as guard its variable against values outside a certain range, then we could not recreate Xi(S) from the other Xj(S) states. In that case, Xi is modular in the information hiding sense. We could then look at Xi and see how many bits are needed to reproduce that range information outside of Xi. And that count of bits is, in a sense, the measure of “how modular” Xi can be considered to be. And then maybe it’s worth doing the same exercise for the first definition of modularity.
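A toy version of this argument can be written down directly. The sketch below is only one possible reading, with hypothetical names throughout: `PlainCell` is the set/get component that hides nothing, `GuardedCell` silently ignores writes outside its range (one assumed way of “guarding”), and `hidden_bits` is a crude count that assumes writers may send any of a fixed number of distinct values, so each of the two range bounds costs the base-2 logarithm of that number in bits.

```python
import math


class PlainCell:
    """A component that is just a variable with set and get."""

    def __init__(self, value=0):
        self.value = value

    def set(self, v):
        self.value = v

    def get(self):
        return self.value


class GuardedCell(PlainCell):
    """Same interface, but writes outside [lo, hi] are ignored (assumed policy)."""

    def __init__(self, lo, hi, value=0):
        super().__init__(value)
        self.lo, self.hi = lo, hi

    def set(self, v):
        if self.lo <= v <= self.hi:
            self.value = v


def reconstruct_from_writers(writes, initial=0):
    """Guess the cell state from what the other components know:
    the last value any of them wrote."""
    return writes[-1] if writes else initial


def hidden_bits(domain_size):
    """Crude count of bits needed to state a [lo, hi] range outside the
    component: two bounds, each chosen from domain_size possible values."""
    return 2 * math.ceil(math.log2(domain_size))


# The same sequence of writes is applied to both kinds of component.
writes = [5, 300, 7, -1]
plain, guarded = PlainCell(), GuardedCell(0, 255)
for w in writes:
    plain.set(w)
    guarded.set(w)

plain_recoverable = reconstruct_from_writers(writes) == plain.get()
guarded_recoverable = reconstruct_from_writers(writes) == guarded.get()
```

Here the plain cell's state is fully recoverable from the writers' histories, while the guarded cell's is not unless the range is also known outside it. With 16-bit values, `hidden_bits(65536)` puts that extra knowledge at 32 bits under the stated assumptions.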

Multicore and virtualization and real-time provisioning

We added the “reservation” capability to RTCore real-time POSIX threads a couple of years ago. Normally, the client platform operating system shares the processor with RTCore threads as a completely pre-emptible low-priority thread. This works exceptionally well. Worst-case interrupt latencies of under 10 microseconds on Opterons and under 20 microseconds on ARM9s are possible, and this is worst case under load and measured over days or weeks, not the usual lipstick-on-pig numbers common in the industry. For timer-driven threads, we can make these times better by wasting a little compute time using the timer advance feature of RTCore, but for true “every microsecond matters” pedal-to-the-metal hard real-time, processor reservation is the way to go, and the widespread availability of multi-core processors makes the case more compelling. Reservation works by making a processor core disappear from the view of the symmetric multiprocessing (SMP) client operating system so that only real-time threads run there. Reservation on RTCore is a mature technology that has been in use in production systems for years. Interrupts are redirected, cores can be reserved and unreserved dynamically, reservation works for user-space memory-protected real-time threads, and each real-time core runs its own cleanly separated real-time scheduler. The performance improvements are dramatic on true multiprocessors and nearly as dramatic on multi-core. With multi-core processors, we can start treating processor cores as allocatable resources. Consider a 2-core system where Windows XP runs under a virtual machine and real-time threads reserve the second core, executing out of L2 cache. The screen-shot in this note shows XP running while a real-time thread runs the “jitter” test on the second processor.
Because processor reservation is designed within the POSIX API and on the basis of the “decoupled real-time” paradigm at the heart of RTCore, it offers the same interface for SMP and multi-core and has scaled up to multi-core along with the client operating systems.
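The general idea of trading a little compute time for tighter timing can be illustrated with a user-space analogue. The sketch below is not RTCore's timer advance feature or its API: `run_periodic` and its parameters are hypothetical, and ordinary Python on a general-purpose operating system gives no real-time guarantees. It wakes up `advance_ns` before each deadline, spins until the deadline arrives, and records how late each release actually was, which is roughly what a jitter test measures.

```python
import time


def run_periodic(period_ns, advance_ns, iterations):
    """Return the lateness in nanoseconds of each periodic release.

    The loop sleeps until `advance_ns` before the deadline, then busy-waits
    for the remainder, wasting some CPU in exchange for a tighter wakeup.
    """
    lateness = []
    deadline = time.perf_counter_ns() + period_ns
    for _ in range(iterations):
        # Coarse wait: hand the processor back until shortly before the deadline.
        remaining = deadline - advance_ns - time.perf_counter_ns()
        if remaining > 0:
            time.sleep(remaining / 1e9)
        # Fine wait: spin until the deadline itself.
        while time.perf_counter_ns() < deadline:
            pass
        lateness.append(time.perf_counter_ns() - deadline)
        deadline += period_ns
    return lateness


# 1 ms period, waking 200 microseconds early; report the worst observed lateness.
samples = run_periodic(period_ns=1_000_000, advance_ns=200_000, iterations=50)
worst_case_ns = max(samples)
```

Setting `advance_ns` to zero leaves the wakeup entirely to the operating system's sleep, which usually makes the worst case larger. A meaningful worst-case figure needs a loaded system and a much longer run than this.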