Soft real-time and QOS (revised)

[ revised version of an older post]

“Soft real-time” is a perfect example of the “soft design” noted in an earlier post. There are perfectly good ways of characterizing quality of service (QOS) assurances precisely. Doug Jensen proposes one possible definition:

The general case of a deadline (which is a soft deadline) has utility measured in terms of lateness (completion time minus deadline), tardiness (positive lateness), or earliness (negative lateness). Larger positive values of lateness or tardiness represent lower utility, and consequently larger positive values of earliness represent greater utility.

That is, he says a real-time system will have a function

0 <= Utility(Error) <= 1

where Error > 0 for a late computation and Error < 0 for an early one. A hard real-time system will have Utility(Error) = 0 whenever |Error| is greater than some acceptable limit. That’s a good start, and there are other obvious ways to quantify. In practice, utility functions may require history: dropping the fourth frame of video during a one-second interval is different from dropping the first frame during that period.

We might specify 75 frames per second, which means about 13 milliseconds per frame for flicker-free video. Then we could say that the average is x frames per second and that there are no outliers more than n standard deviations from the mean. Or we could require that over any interval of t seconds there will be at most n delays of more than E milliseconds. There will be a big difference between requirements for editing (which may permit no frame delays of more than 100 microseconds) and consumer viewing, which will be a moving target but may allow dropping frames that are more than 2 milliseconds late while not permitting more than 1 frame to be dropped every 2 seconds, or something like that.

Note that one of the distinguishing characteristics of any type of real-time system is that timing is not subject to amortization. Being 10 seconds early and then 10 seconds late does not mean that the system is perfectly on time.
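As a sketch, here is one such time/utility function in Python. The linear decay and the 100 millisecond limit are illustrative choices of mine, not something Jensen specifies:

```python
def utility(error_ms, hard_limit_ms=100.0):
    """A Jensen-style time/utility function (sketch only).

    error_ms: completion time minus deadline, in milliseconds.
              Positive means late (tardiness), negative means early.
    Returns a utility value in [0, 1].
    """
    # Hard limit: zero utility whenever |Error| exceeds the acceptable bound.
    if abs(error_ms) > hard_limit_ms:
        return 0.0
    # Early or on-time completions get full utility in this sketch.
    if error_ms <= 0:
        return 1.0
    # Utility decreases linearly with tardiness.
    return 1.0 - error_ms / hard_limit_ms
```

Other shapes (step functions, exponential decay, penalties for excessive earliness) fit the same framework; the point is only that the function is written down.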

In practice, we rarely see quantitative specifications of real-time behavior in “soft real-time” systems, probably because such specifications would reveal the engineering flaw in most soft real-time systems. In order to make any QOS assurances you need to be able to make hard assurances, and as soon as specifications are written down in any detail at all, this inconvenient problem becomes all too clear. If our specification is that over a one-second period no more than 2 frames will be more than 100 microseconds late, then if the first two frames come in 110 microseconds after deadline, all of the remaining frames in that second must be under 100 microseconds late. The distinguishing characteristic of “soft” real-time, in practice, is that a soft real-time system can tolerate some timing errors before falling back on a more rigid timing requirement. Here’s a summary.

A hard real-time system has firm worst-case timing properties that must always be met to avoid failure.

A soft real-time system is a hard real-time system with some recoverable error modes.

But if we accept the above definition, then designing “soft real-time” systems will be seen to be more demanding than designing hard real-time systems. Specifications like “seems peppy” will no longer be acceptable and the types of sloppy mechanisms that can be shown to reduce the tail of distributions in some circumstances will be less appealing.
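The windowed frame-lateness specification discussed earlier — over any one-second interval, at most 2 frames more than 100 microseconds late — can be sketched as a checker. This is my own illustrative code, not a proposal from the post; the frame log format is an assumption:

```python
def meets_spec(frames, window_s=1.0, max_late=2, threshold_us=100.0):
    """Check a windowed QOS spec: over any window_s-second interval,
    at most max_late frames may be more than threshold_us late.

    frames: list of (timestamp_s, lateness_us) tuples, sorted by time.
    Returns True if the spec is met, False otherwise.
    """
    # Timestamps of the frames that exceeded the lateness threshold.
    late = [t for t, lateness in frames if lateness > threshold_us]
    # Slide a window starting at each late frame and count late frames in it.
    for start in late:
        count = sum(1 for t in late if 0 <= t - start < window_s)
        if count > max_late:
            return False
    return True
```

Note that the check cannot be amortized: a burst of three late frames fails even if the surrounding seconds are perfect, which is exactly the point about timing errors not averaging out.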


Hollywood writers are smarter than programmers

The strike in Hollywood is about royalties. The writers, who are used to getting a share of the royalties on their work, want to keep getting royalties on work shown on the internet. The producers claim that since they give away views on YouTube and other internet sites (free content, you know), they don’t need to pay anything. However, the writers, a suspicious bunch, have realized that the producers are selling ads around the free content – in other words, they are still making money from the creative work of the writers. And the writers want to share in that revenue stream.


Sparc T2 (Niagara 2)

UT architecture seminar today was by Greg Grohoski from Sun – an updated version of his Hot Chips talk. I’m not a big fan of this approach to chip architecture: 8 processors, each with 8 threads. But they are working hard on a real problem – the usual one of high-speed processors waiting around for memory.

It was interesting to see how OS limitations continue to cripple processor design. Cort Dougan’s nice work on memory management showed that careful tuning of the operating system mapping code got software table walk to blow away hardware table walk. But Sun’s chip designers seem not to have had the option of improving the Solaris code – or even of getting rid of horrible errors like the four page tables that might all have to be searched to resolve a page miss! The interrupt hardware had to emulate 20 years of junk, and the nightmarish register windows are still there. Even worse, the chip designers are trying to fairly schedule threads – although I’m not sure whether that is over-reach by chip architects or a limitation of the OS.

So, to me it looks like a huge amount of smarts, effort, and invention compensating for problems that could have been solved in the operating system. One interesting note was that their measurements apparently do not show that doubling the tiny shared L2 cache would make much of a difference. My conjecture is that modern applications are not showing locality of reference – due to a long chain of decisions that look unsmart in retrospect.