Zot is a variant of Jot, and this document should be read after reading the description of Jot. Iota and Jot were designed to be as simple as possible, but neither one provides a reasonable strategy for dealing with input or output. Zot is slightly more complicated, but provides a rather natural approach to input, and at least some way of dealing with output. It has some other rather intriguing properties related to universal machines.
Acknowledgments: the work here reports on a wonderful three-way correspondence I've had with (in alphabetical order) Ben Rudiak-Gould and John Tromp. Zot emerged from a discussion of John's work on combinatory logic, (especially his paper on Kolmogorov complexity), and my thinking about input and output follow closely on Ben's approach to input and output in his Lazy K interpreter. My approach to input and output is a little bit different; I am not sure it is better, but I am not sure it is worse, either. John tells me that "zot" means "zany" in Dutch, which seems appropriate enough!
Motivation: Iota reads one well-formed program description and then stops. Jot does the same, but then keeps reading, with the binary digits that follow the program description considered as input:
Iota: 1101010100100101010... ^ignored from here on Jot: 11111100011100100100... ^from here on interpreted as input, but each input can be an arbitrarily complex combinator corresponding to an arbitrarily long sequence of binary digits. Zot: 1101010100100101010... ^from here on each binary digit treated as a separate character available as inputUnlike Iota (though like Jot), Zot keeps consuming input beyond the end of the initial program description; unlike Jot, in Zot, each binary digit beyond the initial program description corresponds to a distinct character in the input stream.
Defining Zot: Zot is syntactically identical to Jot. The only semantic difference is that the values have all been type-lifted using continuations.
Zot Syntax: Semantics: F --> F B [F]([B]) F --> e ^c.cI B --> 0 ^c.c(^f.fSK) B --> 1 ^cL.L(^lR.R(^r.c(lr)))Program descriptions in Zot are identical to Iota programs. In particular, combinators corresponding to the usual CL primitives can be constructed exactly as in Iota:
I = 100 KI = 10100 K = 1010100 S = 101010100If you prefer other universal combinators, such as Rosser's (^f.fKSK) or Fokker's (^f.fS(^xyz.x)), just adjust the value for 0.
Why continuize? The point of the continuization is to allow the semantics for the recursive case to always treat the next binary digit as an argument (i.e., the B in "[F]([B])"). (Unfortunately, I don't know of any good place to learn about continuations on the web; the Unlambda page is not a bad place to start. In any case, everything I have to say about continuations is in these papers.)
Like Jot, every possible sequence of binary digits (including the empty sequence) is well-formed. At any given point in the input, the value of the next binary digit serves as input to the combinator corresponding to the meaning of the prefix string to its left. That is, the meaning of the string "10010" is the meaning of "1001" applied as a functor to the meaning of "0". Similarly, the meaning of "1001" is the meaning of "100" applied to the meaning of "1", and so on.
Program vs. Input: Because each binary digit is treated as input to what comes before, there is a sense in which Zot makes no distinction between "program" and "input". However, we can still impose a distinction between program and input similar to the one in Jot. First, we define a program description in Zot as the smallest initial substring (if any) for which the number of 0's is greater than the number of 1's. This is the same class of strings that constitute well-formed inputs for Iota. Then it makes sense to interpret everything that follows the initial program description as input. In the example given above, the string 1101010100100101010 divides into the program 1101010100100 followed by the input 101010.
Positive Zot: The view just described separating program from input is tidy and familiar. But the line between program and input can be blurred even further. Note that one other small difference between Zot and Jot is that in Jot, the empty string that begins every sequence is interpreted as the identity function, whereas in Zot, this empty string is interpreted as ^c.cI. The role of this function is to undo the type-lifting corresponding to the continuization transformation; in fact, ^c.cI is just the first-order continuization of the identity function. But providing this function is merely a convenience. Here is a variant of Zot in which there is no empty string at all:
Postitive Zot Syntax: Semantics: F --> F B [F]([B]) F --> B [B] B --> 0 ^c.c(^f.fSK) B --> 1 ^cL.L(^lR.R(^r.c(lr)))I call it Positive Zot because the empty string is not allowed. In this language, 1100100 denotes ^c.cI, so starting with this and continuing with any Iota program description produces a string whose meaning is the same as the Iota program. The point is that 1100100 itself is a complete program, since it contains more 0's than 1's and there is no shorter prefix that is a complete program; that means that what we normally think of as the program description is always (necessarily?) treated in Positive Zot as input!
A simple universal machine: Zot has a universal machine that provides an interesting way of distinguishing program from input. In the current context, a universal machine is a combinator U such that if u is a binary string whose meaning [u] = U, then for any string x, the meaning of ux = [ux] = U(x) = [x]. (Here, "ux" is the string-concatenation of u and x; see the paper of John's mentioned above for definitions and discussion.) The idea is that U is universal if feeding it x as input produces the same combinator x would have denoted all by itself. In some sense, then, U behaves as if it were a Turing-complete interpreter in its own right.
One of the interesting properties of Zot is that it has a remarkably simple universal machine: 1100. In order to prove that this is in fact a universal machine, I need to show that for any string x, [x] = [1100x]:
e: ^c.cI e1: [^c.cI] (^cL.L(^lR.R(^r.c(lr)))) = ^L.L(^lR.R(^r.I(lr))) = ^L.L(^lR.R(^r.lr)) e11: [^L.L(^lR.R(^r.lr))] (^cL.L(^lR.R(^r.c(lr)))) = [^cL.L(^lR.R(^r.c(lr)))](^lR.R(^r.lr)) = ^L.L(^lR.R(^r.[^lR.R(^r.lr)](lr))) e110: [^L.L(^lR.R(^r.[^lR.R(^r.lr)](lr)))] (^c.c(^f.fSK)) = [^c.c(^f.fSK)](^lR.R(^r.[^lR.R(^r.lr)](lr))) = [^lR.R(^r.[^lR.R(^r.lr)](lr))](^f.fSK) = ^R.R(^r.[^lR.R(^r.lr)]([^f.fSK]r)) = ^R.R(^r.[^lR.R(^r.lr)](rSK)) e1100: [^R.R(^r.[^lR.R(^r.lr)](rSK))] (^c.c(^f.fSK)) = [^c.c(^f.fSK)](^r.[^lR.R(^r.lr)](rSK)) = [^r.[^lR.R(^r.lr)](rSK)](^f.fSK) = [^lR.R(^r.lr)]([^f.fSK]SK) = [^lR.R(^r.lr)](SSKK) = [^lR.R(^r.lr)](SK(KK)) = [^lR.R(^r.lr)] I = ^R.R(^r.Ir) = ^R.R(^r.r) = ^R.RI = eThus starting a string with 1100 has the same effect as starting it with the empty string e, i.e., with nothing at all.
Actually, it is possible to prove a much stronger claim about 1100: it has no effect at all no matter where it is inserted within the initial complete program. This follows from the following lemma, which I state without proving (the proof is not difficult, I'm just lazy):
And I mean absolutely anywhere, including inside itself. Each of the following strings is related to the preceeding string by insertion of one instance of 1100, and they all mean the same thing (in this case, the identity function):
100 1100100 11100100100 111001110000100 1110011111000000100 11100111111100000000100So here is one property that distinguishes between program in and input: in the program portion of a string, inserting 1100 is guaranteed to not change meaning, but inserting 1100 in the input portion of a string may or may not change meaning (depending on the denotation of the program).
Incidentally, 1100 is a universal machine in Iota too, and can also be placed literally anywhere in an Iota program without disturbing the result.
Input is usually conceived of as a list of values that the program manipulates. John's paper and Ben's Lazy K interpreter provide an elegant implementation of this approach based on a way to represent lists using combinators. I will propose here instead to turn this view on its head: instead of programs manipulating input, I will have input manipulating programs---for me, input is something that happens to a program, not vice-versa. This is certainly in the spirit of continuations, and, I hope, also in the spirit of functional programing.
^f.[^f.I(fK)](fS) = ^f.I((fS)K) = ^f.fSKThus my favorite universal combinator (^f.fSK) turns out to be a sequence of length 2, the sequence <S, K>.
According to this approach, an input stream takes a program as an argument, not the other way around. This is not as radical as it seems, for the following reason: thanks to the definition of a string, if W is a combinator and x is the string <x1, x2, x3, ..., xn>, then
x(W) == (...(((W x1) x2) x3) ... xn)i.e., programs consume their input one character at a time left-to-right. This seems to me to be in the spirit of functional languages.
The way to provide input to a program in Zot is to simply add the binary digits to the end of the program description, exactly as we have been doing above. In other words, the definition of the language itself embodies this notion of input, without needing to first transform the input into a list representation as in John and Ben's systems.
Output: Output is more vexed. Naturally, output will also be in the form of a string as defined above, i.e., output will be a combinator waiting to feed a series of 1's and 0's to whatever combinator we give it. In order to view the output, we need a program that is capable of telling a 1 from a 0. One way to do this is to apply the following combinator to the input character:
Using interrogate, we can define a print function as follows:
(define print ((lambda (x) (x x)) (lambda (self) (lambda (char) (display (((interrogate char) "0") "1")) (self self)))))To print a string, then, simply apply the string to the print function. If we want to print the string <1,0,1,1>, we apply the corresponding combinator to the print function, which is the same as evaluating ((((print 1) 0) 0) 0). No need to worry about providing an end-of-string character: because each string ends with the null string, which is the identity function, printing naturally stops all by itself.
The awkwardness comes from trying to get access to the output. If a program is capable of recognizing when its input has come to an end based on the nature of the input (for instance, if the program models an Iota interpreter), then the program need only evalaute to the desired output string combinator, and the resulting big picture is ((input(program))(print)). In general, however, programs may need to produce output before the end of the input has been reached. It is necessary, then, to provide some way of asking a program whether it has any output ready to be printed. By assumption, the program is still waiting for input, so the program is going to assume that the next input it sees is probably an input character. Therefore what I will suggest doing (until I invent or hear of a better solution) is to designate a special input character called OUTPUT which instructs a program to surrender anything it wants to print. Here's one candidate for output:
To see how this can work in pratice, below is a Javascript machine that takes an arbitrary string x and evaluates [x](output)(print). Conceptually, it reads in a program description, builds the described program, feeds the program any remaining input, then requests (by feeding it the output combinator) that the program give up any output it cares to provide, and prints that output. In the window is a program that reads exactly two characters on the input, ignores the next input (which will be the output combinator just described) and returns a string consisting of the first two inputs in reverse order:
(lambda (c1) (lambda (c2) (lambda (x) (lambda (print) ((print c2) c1)))))
Here is a more robust reverser. It consumes an arbitrary number of binary digits. As soon as it receives the output combinator as input, it returns a string with the previous input in reverse order:
(define reverse ((lambda (x) ((x x) (lambda (x) x))) (lambda (self) (lambda (remainder) (lambda (c) ((((interrogate c) (self self)) (self self)) (lambda (print) (remainder (print c)))))))))
% echo -n "1101000" | cat reverse.zot - | guile -s zot.scm 0001011 %This appends the string 1101000 to the program description before sending it to the interpreter. Note that the output is 0001011, the reverse of the input string.]
This example makes the output combinator seem like an end-of-file marker, but there is an important difference: processing input can continue after receiving the output combinator. Output doesn't mean "there is no more input"; rather, it means "at this point in processing the input, do you have any messages you'd like to send to the outside world?". This allows an appropriately designed system to have orderly dialogs of input and output. In Ben's discussion of Lazy K, he suggests ("the fall from grace") that timing input and output dialogs is somehow incompatible with a pure functional approach, but I think that my approach here provides a principled solution to this problem.