I have just submitted a new paper to ICFP 2013:
Programming and Reasoning with Algebraic Effects and Dependent Types,
One often cited benefit of pure functional programming is that pure code is easier to test and reason about, both formally and informally. However, in order to be useful, programs must interact with the outside world. Haskell solves this problem using monads to capture details of possibly side effecting computations — it provides monads for capturing State, I/O, exceptions, non-determinism, libraries for practical purposes such as CGI and parsing, and many others, as well as monad transformers for combining multiple effects.
Unfortunately, useful as monads are, they do not compose very well. Monad transformers can quickly become unwieldy when there are lots of effects to manage, leading to a temptation in larger programs to combine everything into one coarse-grained state and exception monad. In this paper I describe an alternative approach based on handling algebraic effects, implemented in the Idris programming language. I show how to describe side effecting computations, how to write programs which compose multiple fine-grained effects, and how, using dependent types, we can use this approach to reason about states in effectful programs.
You can find the draft here. As usual, comments are very welcome.
This week I’ve been giving a course on Dependently Typed Functional Programming in Idris at IT University, Copenhagen. Various course materials are available:
- Lecture 1, Introduction, slides, video
- Lecture 2, Embedded DSLs, slides, video
- Lecture 3, Effect management, slides, video
- Lecture 4, Implementing Idris, slides, video
- Example code
- Idris web page
I had an email from a student asking why Idris supported partial functions, saying that it was surprising that a dependently typed language supported partial functions, because types are supposed to be precise specifications and you can’t be sure you have a precise implementation if your functions aren’t total. This is a very good question, and my answer was quite long, so following these academic blogging tips, I’ll reproduce my answer here. The tl;dr version is that, while I agree totality is important and I aim to make all of my programs total (I don’t think I’ve ever wanted to write a function with neither terminates nor produces results occasionally, after all) it’s a bit more complicated than that in practice.
I’m going to use “total” in a fairly informal sense here. By “total function” I mean either:
- A function which terminates for possible inputs, or
- A function which is guaranteed to produce some output before making a recursive call
In particular, this means you can still write an interpreter or operating system, for example. That is, you can write programs which run forever, but at least they achieve something while doing so.
It is a good idea to aim to make all of your functions total, as David Turner argues – when functions are total, you really can believe that “well-typed programs don’t go wrong” because you know you’re going to get an answer eventually (or in the case of coinductive functions, you know you’re going to keep making progress). And, indeed, dependent types can help you achieve that goal because you can use the extra structure the types give you as additional evidence for the termination checker. Here is a nice example.
Then, of course, some programs are really proofs, that you never actually run but which do demonstrate some properties of other functions. In this case, it’s vitally important that they are total, to guarantee that you haven’t missed any cases, or relied on circular reasoning, and so every possible input leads to a proof object.
Having said all that, Idris still supports partial functions, for two main reasons.
Firstly, you can divide programs in general into two categories:
- Programs you haven’t finished writing yet
- Programs which are complete
The first category is significantly larger than the second. Even more so if you replace “Programs” with “Applications”.
In the case that your program isn’t finished yet, you don’t want your type checker shouting at you that it’s not finished, because you already know, but you still might want to test it (sometimes we like to run our programs, as well as read and write them, after all). You might want to try running a program even with an incomplete proof, to help you develop an intuition for how the proof might work, or even to see if there are test cases which might invalidate your hypothesis. Alternatively, you might have a nice natural recursive algorithm which is not obviously structurally recursive (the functional version of quicksort is an obvious example of this) and not want to hold up the development of the rest of your application while you develop the termination proof.
The fact that programs spend most of their lives in an unfinished state is one reason people are interested in gradual types, and why a recent GHC extension allows you to run programs with type errors. You don’t want to release software with type errors (or, in the case of an Idris program with an associated correctness proof, release software with an incomplete proof) but it might be convenient during development.
Secondly, I don’t believe it’s a language’s job to tell a program what level of precision their programs’ types should have. Rather, a language and its features are there to help a programmer do their job. If I want the language to help me write a totally correct program with a precise specification, then I certainly want a totality checker, but if I’m just writing a quick script to do a simple task, maybe I’d rather just get the job done. I would like the library code the script invokes to be total, but I might not be so bothered about the script itself, at first.
Related to this, remember that there’s more you can do with dependent types than prove functional correctness. Just being able to give types to more programs is already a benefit which doesn’t require programs to be total. Dependent types can help you write generic libraries – the Eff library is a developing example of this, as are embedded domain specific languages in general – where the library user is making use of dependent types without necessarily being aware of it.
Idris aims to be a general purpose programming language, rather than a theorem prover, so aims to support all uses of dependent types, or even to support programming with conventional types if that’s all you need, so the default is to support partial functions. If the goal was primarily to be a theorem prover, I think the opposite choice would have been appropriate. Whatever happens, though, functions are always checked for totality, and you can make partial functions a compile-time error by default with the “–total” flag.
If you want complete safety, you certainly have to work for it, and depending on your application that may be important. Sometimes, though, you just want to get things done. In that case, Idris will let you – but at least the type system and totality checker make it clear what assumptions you have made to do so.
I’m busy revising the “How Idris works” paper, and you can find the latest draft here. Any further comments on how to improve it would be most welcome! Abstract:
Many components of a dependently typed programming language are by now well understood, for example the underlying type theory, type checking, unification and evaluation. How to combine these components into a realistic and usable high level language is, however, folklore, discovered anew by successive language implementations. In this paper, I describe the implementation of a new dependently typed functional programming language, IDRIS. IDRIS is intended to be a general purpose programming language and as such provides high level concepts such as implicit syntax, type classes and do notation. I describe the high level language and the underlying type theory, and present a method for elaborating concrete high level syntax with implicit arguments and type classes into a fully explicit type theory. Furthermore, I show how this method, based on a domain specific language embedded in Haskell, facilitates the implementation of new high level language constructs.
I have put slides from my PLPV 2013 invited talk online here. You can also find the example code.
Of course, the slides won’t make as much sense as with the accompanying commentary, but at least you’ll see what I talked about, and maybe you’ll get something from them anyway. In them, you’ll find:
- An introduction to Idris
- Some examples: verified binary arithmetic, lightweight verification of a virtual machine, and
my plan to never have to write a monad transformer again.
- Some notes on compilation issues
As a weekend hack, I’ve been implementing the Whitespace programming language, in Idris. You can find it on github. Whitespace, behind the syntax, is a stack based language with unbounded integers and a single global heap, and control flow implemented using labels, conditional jumps, and subroutine calls. There’s no types (just integers for everything) So what do we gain by implementing it in a language with dependent types? It turns out there’s some interesting things we can do, and problems we have to solve:
- Various operations are only valid if the stack is in the right form (e.g. arithmetic requires at least two items at the top of the stack, and leaves at least one item). We can often statically check that the stack is in the right form before executing such instructions, and at least statically check that the necessary dynamic checks are carried out.
- We need to consider how to represent the stack – a
Stack n is the type of stacks with at least
- There’s a lot of bounds checking of stack references, and as much of this as possible is made static.
- A jump is only valid if the given label is defined. We can statically check this, too, and guarantee that a program we execute has well-defined labels.
- Although whitespace is Turing complete, we can still use the totality checker to verify that executing individual instructions terminates.
If nothing else, we get to see how to use dependent types to make assumptions and invariants explicit in the code. Even if those invariants aren’t used to guarantee total correctness, we at least get the type checker to tell us where we’ve violated them. In fact, I avoided a few errors this way and found a couple of missing or badly specified parts of the whitespace tutorial… Perhaps I’ll write more on this at some point…
I’ve written a draft paper describing the implementation of Idris,
Implementation of a General Purpose Programming Language with Dependent Types, abstract as follows:
Many components of a dependently typed programming language are by now well understood, for example the underlying type theory, type checking, unification and evaluation. How to combine these components into a realistic and usable high level language is, however, folklore, discovered anew by succesive language implementations. In this paper, I describe the implementation of a new dependently typed functional programming language, Idris. Idris is intended to be a general purpose programming language and as such provides high level concepts such as implicit syntax, type classes and do notation. I describe the high level language and the underlying type theory, and present a method for elaborating concrete high level syntax with implicit arguments and type classes into a fully explicit type theory. Furthermore, I show how this method, based on a domain specific language embedded in Haskell, facilitates the implementation of new high level language constructs.
There’s still some polishing to do, but comments on the content, form, and so on, would be most welcome. Enjoy!