Happy new year and validation

Updated below!

Years ago I proposed the following code snippet as a minimal standard for a useful verification method. Still not quite there.

/* you are not expected to understand this */
if(save()){
          load_memory_management(); /* map in new current */
          return;
}else {
         current = new;
        resume();
        panic("RESUME RETURNED! KERNEL BROKENn");
}

Not so complicated, but if you can’t do this, you cannot handle operating systems at all- or any threading.

In response to the comment. This is a paraphrase of Dennis Ritchie’s  Unix switch code and comment.  “Save” saves the registers and the return address for “current” and always returns 0. “Resume” restores the saved registers and return address for current (by doing so, changes stacks) and returns 1 – to that saved return address which is the test of the “if”. So the “if” statement is always evaluated as false when saving and is always evaluated as true when resuming and “resume” never returns to the else.  I always took Ritchie’s comment to be a warning that the code and the previous comments were not easy to understand. The state change here is basic thread switching but it is slippery.

Cell phone handsets and linux and software value

Update at the end.

Two FSMLabs alumni are leading the charge for Linux handsets in two different companies. Jason Whitmire is now GM of cell phones for Wind River. Jason’s blog is here and a Linuxdevices article discusses his first post which concerns the, well, machinations of Google and the two big cell-phone consortia Limo and OHA. I’ve argued earlier that Apple has the advantage of clear mission and business model in this space. They want to create and sell handsets and handset software based on their existing user interface and operating system software. Everyone else in the game has more complicated stories or is constrained to work with bigger players that have more complicated stories. And nobody has any reason to trust anybody else in the game. Apple can just make an alluring product, for which there is undeniable customer demand, and push it through the sales channels – they control the entire process and have something that end users want. If you make a cell phone Linux platform, on the other hand, or some middleware, or something else – you have have to sell operators and/or phone makers on the value proposition. And the standard problem in the embedded market is that there is irrational resistance to paying much for weightless invisible software. The impulse for device makers is to simply treat software as a BOM component and squeeze pricing – no matter what the cost in quality, time to market, or upgrade path. An operating system or a windowing system or an entire platform is not a part like a capacitor that can be graded on one or two indicators and replaced by any other one with a similar grade. These are enormously complex engineering systems that have pervasive effects on performance, battery life, ease of use, scalability and so on. One of the things we like about the enterprise space is that, for the most part, customers make decisions about how much money an end product will make them or save them versus cost, and do not rely on their intuition about how much things should cost. And that’s another advantage for Apple: they know the value of software.

Disclaimer: I don’t even know much about WindRiver’s handset strategy beyond what I read in these articles and certainly do not speak for them on this topic.

Response to question in comments:  Krish Kupathil.  Actually, there are a couple more, but I am not too sure exactly what they are doing. Some of our technical folks who went to WindRiver are probably working on phone architecture – I guess. Alumni of our former offshore group in Pune are also probably working on phones and so are a couple of ex-interns.

Not me – at least on the handest.

formal methods considered harmful and more on soft real-time

[fixed a couple of typos, Dec. 20 2007]

John Regehr writes:

On the other hand, there is plenty of useful work to be done on supporting time sensitive applications (I’ll just avoid saying “soft real-time”) even when no guarantees are made. Let’s take the example of an OS that disables interrupts when any process is executing inside the kernel. This is not hypothetical, I was recently talking to people designing such an OS because they are formally verifying it; they did not believe they were capable of verifying an OS that accepts interrupts more liberally. Anyway, needless to say the average-case response time to interrupts of this OS will suck. Now if we have some breakthrough in verification that permits interrupts to be accepted most places in this kernel, average case responsiveness will improve though worst-case may
not. Is this useless? Of course not, if it makes it possible to watch videos or listen to music or similar, on this OS. For applications of this kind a mathematical guarantee is simply overkill and what we really care about is more like “works almost all of the time in practice, given actual workloads.”

This is perfectly reasonable, in fact, there are many interesting properties of operating systems that are not real-time properties at all if we define real-time properly so that it does not end up meaning “any performance properties.” What’s unreasonable is this idea that crippling a piece of software to make it possible to “prove” that it “works” gains much of anything. That story about the drunk looking for his keys under the light-post is supposed to be a joke, not a design methodology. It’s fine to argue that by simplifying a system one can validate it while not losing too much performance or functionality for the purpose – but all too often the second part of the argument is omitted. What is the point of a system that is provably “correct” at a performance level that assures it cannot be used for its ostensible purpose? Worse, this methodology is often used in order to make it possible to apply “mathematical” methods which have not been ever shown to be more reliable than testing or code reviews or even good luck charms (the claim that “formally validated code is better than other code is, at best, unproven).

The phrase “mathematical guarantee ” is worth some deeper analysis.

  • Mathematical. The types of “hard specifications” that I am advocating do not need to be overly “mathematical” and certainly don’t need to be “formal” to be useful. On the contrary. The “formal methods” advocates have made a fetish of formalization to the detriment of useful engineering specification and that causes people to shy away from attempting to be more precise. What we want from a specification, at minimal, is that it should be falsifiable. Perhaps an incremental approach to adding fast paths to operating system services to reduce interrupt latencies can make the system described above “more responsive” for “standard workload”. If so, we should be able to characterize “more responsive” in some falsifiable way. If the specification is just “there is no wall-clock observable delay on the echoing of typed characters while stress program S runs in the background and a tester types a file in the ed program” or that “programs A,B, and C start within 5 seconds of double clicking the icon” then it is at least a falsifiable specification and is a concrete goal for the software developer. In any case, we can see that “average” is not what is needed, since “average” implies that if the system runs for 1 hour, a 5 minute total stop can be amortized away by 55 minutes of high speed operation. Perhaps what we want is as simple as “the video player on an otherwise unloaded system will not stop more than 10 seconds cumulatively over a one hour period where stops are delays in frame display that are over 1/60th of a second.” But this “soft” real-time requirement carries a real-time property with it – as argued previously.
  • Guarantee. There is a difference between a design guarantee and an evaluation guarantee. A TCP/IP network stack is designed to produce an in-order, reliable, stream of data between sender and receiver. But it is possible that in some situations, under some loads and with some definition of “reliable”, a simple datagram stack can produce the same property. The TCP/IP stack is designed to guarantee a property that the simpler stack may be shown to have in some circumstances. An evaluation guaranty is an attempt to show that a system has some property. One of the most useful ideas in Common Criteria is the evaluation assurance level (EAL) which is supposed to indicate the reliability of such a guaranty. One might find “we ran it a couple of times and it looked good” to be less or more reassuring than “we produced a 300 page document that strenuously exercised the symbol production features of LaTex” but these are both “evaluation guarantees”. Note that design guarantees should be backed up by evaluation guarantees that the design is correct and implemented correctly.

“Real-time software” normally refers to software that offers design guarantees on latency and scheduling times, data rates, or other time sensitive operations. Validating that the design succeeds requires some combination of testing, mathematical proof, and design and code review. But we may also be able to validate real-time properties of code that has no design real-time properties at all. One could take an instance of a LAMP stack (Linux + Apache + MySQL + PHP) and subject it to extensive testing on a particular hardware platform and observe that in practice it satisfies a real-time specification. Suppose our measurements show that all pages of a certain limited complexity will be “served” within 1 second over a 1GB network that is reserved for the LAMP system and where requests are at a rate bounded by R and there are no more than 30 seconds cumulative failure on this property over 1 hour. That’s a falsifiable, concrete, real-time specification – mathematical, even [this is in response to a note from B., a very knowledgeable system designer who may or may not be quotable].