monads and the delusional basis of "functional programming"

I’m struggling to try to understand something more about functional programming but so far have seen nothing to persuade me that programs should be considered to be mathematical “functions”. We don’t have mathematical objects like integers and reals and functions in programming, we have bit patterns that represent  mathematical objects and other kinds of objects and we have algorithms that implement functions, usually they only approximate those functions. Pretending otherwise creates complexity for no reason.  Programs are mechanisms for producing or modifying bit patterns.  It’s obviously useful to be able to treat, for example, a variable of unsigned integer type as if it were an integer, but it’s important to remember that it’s a bit pattern representing an integer, not an actual integer.

It’s also worth noting that the notion of types in working mathematics is much more fluid and “informal” than the type systems that theoretical computer scientists prefer.  There is no reason to believe that even if types can be organized in some sort of hierarchy, doing so produces any illumination at all.

Russell introduced what is often viewed as his most original contribution to mathematical logic: his theory of types—which in essence tries to distinguish between sets, sets of sets, etc. by considering them to be of different “types”, and then restricts how they can be combined. I must say that I consider types to be something of a hack. And indeed I have always felt that the related idea of “data types” has very much served to hold up the long-term development of  programming languages. (Mathematica, for example, gets great flexibility precisely from avoiding the use of types—even if internally it does use something like them to achieve various practical efficiencies.) – Wolfram.

Current reading

img_20161121_152216238
Wadler’s influential “monads” paper for Haskell. It seems like a classic case of making something simple sound profound and mysterious.

McCarthy’s original LISP paper.  Just terrible.

A problem FSMLabs detected with GPS on London LD4 datacenter has been fixed. 

Dennis Ritchie and Albert Meyer on classification of recursive functions. I wish I could find a full copy of this.

The SEC Consolidated Audit is pushing clock sync requirements, but needs improvement.

The most successful functional language – Mathematica.

A critique of Haskell

A Haskell project at Facebook.

Some real engineering: the gcc optimizers.

Categories as algebra: An essential ingredient in the theory of monoids (via @joelvanderwerf )

 

Understanding Paxos and Distributed Consensus

vase

(minor wording correction and more complaining added 10/2/2016,
 minor edits 10/5/2016)

Multi-proposer Paxos is a very clever and notoriously slippery algorithm for obtaining distributed consensus. In this note I try to explain it clearly and provide a correctness proof that gives some intuition why it works – to the extent that it does work. I am specifically concerned with Multi-Proposer Paxos, the first algorithm discussed in “Paxos Made Simple”.  What is often called “Paxos” involves single Proposer variants which are much simpler and less interesting.

I think this is right – let me know if you spot an error.

Rules for how Paxos works

There is a finite set of processes or network sites, some of which are Proposers and some Acceptors (the sets can intersect). Each proposer has a unique id, confusingly called a sequence number. A proposal is a pair consisting of a proposal value and the sequence number of the Proposer. The goal is to ensure that if two Proposers believe that they have convinced the Acceptors to come to a consensus on a value, they must both agree on the same value, even though they may disagree on sequence number. The most clever part of Paxos is the observation that since we don’t care which value wins, even though we do care that some unique value wins, we can force Proposers to inherit values of the most likely previous proposal.

  1. Proposers can ask Acceptors to approve sequence numbers and to accept proposals which include a value and the Proposer’s sequence number. Acceptors do not have to approve or accept but are limited to approving and accepting what Proposers send them.
  2. When an Acceptor approves a sequence number it:
    1. Promises to not approve any smaller sequence numbers
    2. Promises to not accept any proposals with smaller sequence numbers
    3. Returns to the Proposer the proposal with the highest sequence number it has already accepted, if any.
  3. The Proposer cannot send any proposals or select a value for a proposal until it gets approval for its sequence number from a majority of Acceptors.
  4. Once the Proposer has approval from a majority of Acceptors it must select the value of the proposal with the highest sequence number sent to it during the approval phase (the inherited proposal). If the approval phase did not turn up any accepted proposals, the Proposer can pick any value. In this case the Proposer “originates” the value.
  5. Once the value is selected, the Proposer can never change the value and can only propose the pair of that value and its sequence number – unless it increases its sequence number, abandons the proposal and value, and starts over.
  6. The choice of a new sequence number must preserve the property that each sequence number belongs to only one Proposer, ever.

(see the  code for what this looks like in a simulation)

Why it works intuition

The first thing to see is that individual Acceptors are forced to order their approvals and accepts by sequence number. If an Acceptor has approved j and accepted (i,v) and j>i then we know that it must have approved j after it accepted (i,v). The sequence of operations for that Acceptor must act like this:

Continue reading “Understanding Paxos and Distributed Consensus”

Chang-Maxemchuk atomic broadcast

The Chang-Maxemchuk algorithm (US Patent 4,725,834 ) solves atomic broadcast (and in-order broadcast) problems for distributed networks in a far simpler and more efficient way than some popular alternatives. In fact, the obscurity of this method is hard to understand given the current interest in distributed consensus.

The basic idea is simple algebra. A source site or process broadcasts “data messages” to a list of sites n sites. Data messages are tagged with sequence numbers and each sequence number is associated with exactly one “responsible”destination site so that  n consecutive sequence numbers map to n sites (the entire list).  For example, if the list sites are numbered 0 … n-1, then sequence number q could be mapped to responsible site q mod n.  Sites on the list broadcast numbered acknowledgment messages to all sites on the list and the source. Only the responsible site for sequence number can create an acknowledgment message numbered  i and the responsible site will only create the acknowledgment if it has received data message i and all lower numbered data messages and acknowledgment messages.  As a result, when the source sees acknowledgment message n+i it is assured that all sites have received the data message numbered  and the acknowledgment.

That’s normal operation mode. There is a reformation mode which is used to create  a list after a failure.  Reading the reformation mode description in the original paper is a good education in how to describe standard “leader election” clearly:

Any site that detects a failure or recovery initiates a reformation and is called an originator. It invites other sites in the broadcast group, the slaves, to form a new list. The reformation process can be described in terms of the activities of sites joining and committing a valid list. A valid list satisfies a set of specific requirements, as explained below. When the reformation starts, a site is invited to join a new list and eventually commits to a valid list. When all of the sites in a valid list are committed to this list, the list will be authorized with a token and the reformation terminates. This list becomes the new token list. Multiple originators can exist if more than one site discovers the failure or recovery. During the reformation, it is possible that acknowledged messages from the old token list have been missed by all sites that join a new list.

To guarantee that there is only one new list and that this list has all of the committed messages, the list must be tested before it can be considered a valid list. Specifically, a list becomes valid if it passes the majority test, the sequence test, and the resiliency test.

Majority Test. The majority test requires that a valid list has a majority of the sites in the broadcast group. During the reformation, a site can join only one list. The majority test is necessary to ensure that only one valid list can be formed.

Sequence Test. The sequence test requires that a site only join a list with a higher version number than the list it previously belonged to. The version number of a token list is in the form of (version #, site number). Each site has a unique site number. When a new list is formed, the originator chooses the new version # to be the version # of the last list it has joined plus one. Therefore, token lists have unique version numbers.

The originator always passes the sequence test. If any of the slaves fail the sequence test, it tells the originator its version number. The originator increments the higher version # the next time it tries to form a new list. The combination of the majority and the sequence test ensures that all valid lists have increasing version numbers. This is true because any two valid lists must have at least one site in common, and a site can join a second list only if the second list has a higher version number. Therefore, the version numbers indicate the sequence in which token lists were formed.

This paper was published 1984 and the first Paxos paper was from 1988. In my opinion Paxos is a big step backwards from CM.

 

Time out of joint

Financial trading venues and trading systems operate so quickly and rely on clocks so deeply that events like the one noted in this FINRA report are more common than many understand

The findings stated that the firm transmitted to OATS New Order Reports and related subsequent reports where the timestamp for the related subsequent report occurred prior to the receipt of the order,

In electronic trading such errors are easy to make. Two computer servers split the work in some data center and the clock on one is 10 milliseconds faster than the clock on the second. The faster device sends an order to a market and stamps it with the time. The slower device gets the response from the market and stamps it with the time.

Real time Server One Server Two
12:00 Send order clock=12:00.010 Clock=12:00
12:00.05  Clock=12:00.15 Get confirmation. Clock=12:00.05

In fact, for many trading organizations this is scenario does not even require two servers because their clocks can jump backward.

 

Data base design criteria: ease of use

Regarding ease-of-use, it’s often struck me when reviewing data systems papers that the evaluation sections are full of performance and correctness criteria, but only rarely is there any discussion of how well a system helps its target users achieve their goals: how easy is it to build, maintain, and debug applications?; how easy is it to operate and troubleshoot? Yet in an industry setting, systems that focus on ease of use (even at the expense of some of the other criteria) have tended to do very well. What would happen if a research program put ease of use (how easy is it to achieve the outcome the user desires) as its top evaluation criteria?  Adrian Colyer

Distributed consensus and network reliability

All of the distributed consensus algorithms I have been reviewing recently (Paxos, Raft, Zab, Chang Maxemchuck, Viewstamped, … ) are based on a number of assumptions about the network environment, including the assumption that messages may be lost but are not silently corrupted.  Is that a good assumption? Perhaps:

Real data 1995

Checksums disagree 2000 

Koopman