John Regehr’s question below gets to one of the basic problems I see in the field of “formal methods” -Â the general failure of researchers in the field to look at experimental data and the operation of actual systems.Â Â The normal operation of science in, say physics or biology or even analysis of algorithms, involves interplay between the development of theory and practice. Theoretical developments that conflict with experimental data are considered to be problems for the theory and strenuous efforts are made to fix the theory. “Formal methodists”Â have notÂ been, in general, willing to dig into theÂ data enough to follow this approach.Â The question of “liveness” provides a good example.
The informal concepts of “liveness” and “safety”Â seem interesting for understanding programs. The usual definitions [ e.g. from Manna and Pnueli’s “Temporal Verification of Concurrent Programs”] are something like:
- A liveness property says someÂ assertionÂ Â about system state is eventually true – for example, a program terminates or a program advances.
- A safety property says some assertion about system state is neverÂ true – for example, the systemÂ operating system deadlocks.
These are possibly usefulÂ concepts with some intuitive appeal and we should be able to represent these intuitive concepts mathematically. For me, the next step is to look in detail at the data and to try to formalize what engineersÂ find pertinent about these properties in actual systems. Â But 25 years of follow-on research (M&P’s paper is 1981) involves all sorts of wacky but clever and technicallyÂ difficultÂ attempts to find a mathematical interpretation in which “liveness” is a property of infinite state sequences. For example, there is an influential paper by Alpern & Schneider (the tech report is 1986) which uses BÃ¼chi automataÂ and infinite state sequences to model liveness properties.Â Â Motivation from empirical data is entirely absent from that work. I can’t imagine a “liveness” property of an operating system that has no bound. There are properties where we don’t really need to know the bound or whereÂ the bound depends on some partially known factors (e.g. how much memory will be available), but there is still a bound. When programmersÂ say that every process makes progress, the natural liveness property of a time-shared system,Â they mean eventually to be within some time frame. Scheduling processes in the limit is not an interesting property of operating system. Perhaps there are infinite state properties of interest, but I have not seen them described in the literature.
If a mathematical method involves counter-intuitive concepts or strange infinities or recourse to non-euclidean geometries we don’t normally consider part of the operation of programs, but it helps solve a problem then itÂ would beÂ justifiable to use it.Â Nobody runsÂ algorithms in the limit butÂ analyzing them in the limit gives us a rough idea of performance and, in many cases, a good intuition about how real implementations may perform. Here’s Knuth
The O-notation is a big help in approximation work, since it describes briefly a concept that occurs often and it supresses detailed information that is usually irrelevant. Furthermore it can be manipulated algebraically in familiar ways … [ Volume 1, 3rd edition. Page 108]
We can look in vain for credible arguments of this sort in the formal methods literature. Instead all sorts of exotic concepts primarily from meta-mathematics are brought in without justification from the field of study – because formal methodsÂ became its own field of study only superficially related to actual computer engineering.
Real computer systems are Â finite state. Either the system can loop without ever satisfying the assertionÂ or it cannot and must enter a state where theÂ assertion is true. (And calling a loop an “infinite path” only makes something simple into something complicated.) Boundedness isÂ a property of the actual objects we are attempting to model. A reasonable argument for using BÃ¼chi automata or well-founded sets or lattice-theoretical methods would need toÂ show howÂ some properties of software systems that are important to engineers could be represented with suppression of “detailed information that is usually irrelevant” and advantages in algebraic manipulation. Instead, formal methods researchers generally justify these methods on the simplicity of the mathematics needed to formalize the concepts.Â The argument that some unbounded notion of eventuality can be captured by a certain simplified but complete logical construct that only needsÂ 5 axioms and 3 deductive rulesÂ may be an interesting exercise, but it is a result about formal logic, not a result about computer science – unless that notion of eventuality can be shown to be important in actual systems design.
So when I looked at liveness properties of scheduling in the second chapter, the implicit bound is a one of the primary issues to analyze and it is interesting to examine what happens when our high level constraint that “all processes must eventually progress” comes in contact with resource constraints that cost microseconds. Liveness properties remain usefully distinct from safety properties even if we discard unbounded liveness. “Every process eventually makes progress” is a perfectly sensible abstraction of one property of a time-shared operating system and it is different from safety properties like “the system never deadlocks”.Â We can use the bound to transform liveness into safety – “there is no reachable state where more than T nanoseconds have passed since a process has executed for any process P for some T that we do not specify now.” But I think this just reflects the way systems actually work and it is useful for designers to understand that thereÂ may beÂ underlying real-time constraints even for time-shared processors.Â In fact, I believe that many of theÂ most difficultÂ problems of practical system design (e.g. the “fork bomb” and “OOM” in UNIX systems) are harder to solve because our design methods “supress detailed information that is usually irrelevant” at the high level, but do not properly unsuppress this information when we need it.Â [This last paragraph may be particularly obscure, but I will try to clarify it more later]