As a weekend hack, I’ve been implementing the Whitespace programming language, in Idris. You can find it on github. Whitespace, behind the syntax, is a stack based language with unbounded integers and a single global heap, and control flow implemented using labels, conditional jumps, and subroutine calls. There’s no types (just integers for everything) So what do we gain by implementing it in a language with dependent types? It turns out there’s some interesting things we can do, and problems we have to solve:

- Various operations are only valid if the stack is in the right form (e.g. arithmetic requires at least two items at the top of the stack, and leaves at least one item). We can often statically check that the stack is in the right form before executing such instructions, and at least statically check that the necessary dynamic checks are carried out.
- We need to consider how to represent the stack – a
`Stack n`

is the type of stacks with *at least *`n`

elements.
- There’s a lot of bounds checking of stack references, and as much of this as possible is made static.
- A jump is only valid if the given label is defined. We can statically check this, too, and guarantee that a program we execute has well-defined labels.
- Although whitespace is Turing complete, we can still use the totality checker to verify that executing individual instructions terminates.

If nothing else, we get to see how to use dependent types to make *assumptions* and *invariants* explicit in the code. Even if those invariants aren’t used to guarantee total correctness, we at least get the type checker to tell us where we’ve violated them. In fact, I avoided a few errors this way and found a couple of missing or badly specified parts of the whitespace tutorial… Perhaps I’ll write more on this at some point…

I’ve written a draft paper describing the implementation of Idris,

Implementation of a General Purpose Programming Language with Dependent Types, abstract as follows:

Many components of a dependently typed programming language are by now well understood, for example the underlying type theory, type checking, unification and evaluation. How to combine these components into a realistic and usable high level language is, however, folklore, discovered anew by succesive language implementations. In this paper, I describe the implementation of a new dependently typed functional programming language, Idris. Idris is intended to be a general purpose programming language and as such provides high level concepts such as implicit syntax, type classes and do notation. I describe the high level language and the underlying type theory, and present a method for elaborating concrete high level syntax with implicit arguments and type classes into a fully explicit type theory. Furthermore, I show how this method, based on a domain specific language embedded in Haskell, facilitates the implementation of new high level language constructs.

There’s still some polishing to do, but comments on the content, form, and so on, would be most welcome. Enjoy!

Last Friday I gave a talk at Scottish Theorem Proving in which I sketched how Idris worked.

Here are the slides. There’ll be a paper explaining in more detail at some point fairly soon…

My goal for Idris is to make it a *practical* dependently typed programming language – suitable for general purpose, systems programming, with precise types when you need them for increasing confidence in correctness of your program. One thing that a modern, practical language really needs, however, is a sane, safe, general way of overloading functions. Haskell‘s solution to this, type classes, is by far the neatest I’ve seen, so in the new implementation of Idris, that’s what I’ve done. They work much as you’d expect if you’re used to Haskell. For example, the `Show`

class could look a bit like this:

class Show a where {
show : a -> String;
}

This generates a function `show : Show a => a -> String`

, i.e. a function which will display any `a`

under the constraint that `a`

is an instance of `Show`

. We can declare instances…

instance Show Nat where {
show O = "O";
show (S k) = "s" ++ show k;
}

…and instances can themselves have constraints (and, indeed, be instances for dependent types):

instance Show a => Show (Vect a n) where {
show xs = "[" ++ show' xs ++ "]" where {
show' : Show a => Vect a n -> String;
show' VNil = "";
show' (x :: VNil) = show x;
show' (x :: xs) = show x ++ ", " ++ show' xs;
}
}

As you’d expect, classes can have constraints too:

class Monad (m : Set -> Set) where {
return : a -> m a;
(>>=) : m a -> (a -> m b) -> m b;
}
class Monad m => MonadPlus (m : Set -> Set) where {
mplus : m a -> m a -> m a;
mzero : m a;
}

Note that, in this case, we have to give an explicit kind for the parameter `m`

. If you don’t give this, Idris will assume that the parameter is of kind `Set`

.

The prelude and builtins modules now have a couple of the more obviously useful type classes. However, I’m not going to rush into extending this – we should probably think carefully about what classes we need and how they relate to each other, rather than do things “just because Haskell did it that way”!

### How it works

Type classes classify types according to which operations can be overloaded for them. Internally, therefore, we can represent type classes as a record of functions. For example, for show:

data Show : Set -> Set where
instanceShow : (show : a -> String) -> Show a;
show : Show a -> a -> String;
show (instanceShow s) x = s x;

An instance for a type `a`

is then a value of type `Show a`

; for example for `Nat`

:

showNat : Show Nat;
showNat = instanceShow show' where {
show' O = "O";
show' (S k) = "s" ++ show' k;
}

All the elaborator has to do on seeing a `class`

declaration is to generate the appropriate data type, and on seeing an `instance`

declaration generate the appropriate instance function.

Class constraints are a special kind of implicit argument (much like Agda’s instance arguments). On seeing a function call `show x`

, the elaborator will attempt to resolve the implicit argument first by looking in the local context, then attempting to refine by any global class instances which match, recursively.

### Still to do…

Default definitions of class methods will be very useful, as will a deriving mechanism. The former is easy, and so likely to happen soon. The latter is hard, and so isn’t.

Note also that this mechanism is intended entirely for a safe general overloading mechanism, and many of the type class tricks which are possible in Haskell are unlikely to work (though, at least, Idris classes can have more than one parameter). Such tricks, I believe, ought to be unnecessary anyway since the dependent type system is very flexible. It will, however, be interesting to see what *can* be done that I haven’t thought of…

About a month ago, I started working on a complete reimplementation of Idris, because there are a number of things about the current implementation that have been bothering me for a while. It’s now got to a point where it’s beginning to work (in particular, the standard well-typed interpreter is up and running). You can start playing with it now, and follow progress at the github repository.

I’m usually very reluctant to start something again – especially so with Idris since I’ve been working on it for a few years now. But there’s quite a few problems with the current design. In fact, “design” is too strong a word – I didn’t really know when I started what I wanted the language to do and which features I’d need. Moreover, it’s evolved out of a theorem proving system Ivor that was originally meant to do something else, and it’s getting harder and harder to add the new features I want (e.g. mod cons like `where`

clauses, destructuring `let`

, `case`

, and essential gadgets for developing large systems like separate compilation and a decent module system) without having to jump through hoops.

Pleasingly, the new version is not far off catching up with where the old version was. Basic syntax, type checking with implicit syntax and evaluation are all working (mostly better than before…) There is no `with`

rule yet, but it won’t be long. There’ll be a compiler not long after I’ve decided how to handle IO (I want to keep the IO primitives as primitive as I can so that it will be possible to experiment with different high level representations).

The main, significant, distinction between the previous implementation and this is the type checker. In the new version, the type checker is kept extremely simple – no implicit syntax, no unification, no generating proof obligations or coercions. Instead, we have a core type theory with metavariables, and an EDSL (that is, an embedded domain specific language) called `Elab`

which constructs programs using Coq-style tactics, directed by the high level source text, and filling in metavriables by unification where it can. This idea will be unsurprising to anyone who is following the progress of Epigram.

Making `Elab`

an EDSL has several advantages. Adding a new language construct (e.g. pattern matching, now, or maybe coercions, or type classes, some day…) is simply a matter of writing a sequence of tactics to build that construct in the core type theory. To check a pattern match clause, for example, we elaborate the left hand side inferring types for the pattern variables as we go, and use this information to elaborate the right hand side. Another advantage is that it will be easy to expose the EDSL to an Idris programmer for building domain specific decision procedures (e.g. a Ring Solver would be nice). Finally, making the elaborator do all of the work and keeping the type checker simple means that we can be far more confident that only well-typed programs are accepted.

I’ll go back into my hacking shed now, and I’m staying there until all the programs that worked with the previous version compile. Hopefully this won’t take long :). In the meantime, if you have a go, please let me know what you come up with. In particular, if anything behaves badly, please shout – I want this version to work well!

A new draft, Resource-safe Systems Programming with Embedded Domain Specific Languages, by Edwin Brady and Kevin Hammond, submitted September 2011.

We introduce a new overloading notation that facilitates programming, modularity and reuse in Embedded Domain Specific Languages (EDSLs), and use it to reason about safe resource usage and state management. We separate the structural language constructs from our primitive operations, and show how precisely-typed functions can be lifted into the EDSL. In this way, we implement a generic framework for constructing state-aware EDSLs for systems programming.

As usual, comments and questions are most welcome!

Slides from my invited talk at DTP 2011 are now available.

In this talk, I give a brief introduction to the Idris programming language, and show how it can be used to implement embedded domain specific languages (EDSLs) with strong correctness properties. I introduce a `dsl`

construct which gives these EDSLs a usable (i.e. readable and writable) concrete syntax, and show how it can be used to tackle realistic resource management problems in systems programming.

Code for the resource management EDSL is on github, though I don’t promise much in the way of comments or documentation yet. There is a paper on the way soon which will explain the gory details.

(Usual disclaimer for publishing slides: you really need the talk as well as the slides to understand all of the details, so questions and comments are more than welcome.)