Tuesday, August 14, 2012

GeneralizedNewtypeDeriving is Profoundly Unsafe

It is a well known fact in the Haskell community that GeneralizedNewtypeDeriving as currently implemented is unsafe. What I don't think people generally realize is just how unsafe it is.

Perhaps the most obvious problem comes when GeneralizedNewtypeDeriving is combined with TypeFamilies using this all to common pair of extensions we can start doing stuff that looks innocuous but then suddenly starts to get ever so slightly suspicious The problem comes when we combine the little functions we have built this is really full featured unsafeCoerce and can even break the most holy Haskell's type system enforced invariants try running this stuff in GHCi.

Okay, so the combination of GeneralizedNewtypeDeriving and open type functions is clearly unsound. More interesting is the fact that you don't need TypeFamilies to type unsafe coerce. Just a healthy GADT (note: I believe this is a new presentation). using GADTs we can pass around proofs of type equality and define classes that talk about them by this point you should know what comes next. We just define a type and start committing atrocities Shucks.

Okay, so now 2 of my favorite extensions (TypeFamilies and GADTs) have been shown to be incompatible with GeneralizedNewtypeDeriving. But, we still are not at the end of the story. If you look at the now 5 year old trac for the "GeneralizedNewtypeDeriving is unsafe" bug, you see that it can also be used to break system invariants in code that is otherwise Haskell98. GeneralizedNewtypeDeriving will lay waste to module boundaries, so many of the assumptions made by common Haskell libraries are trivial to break. In fact, I had wanted to be able to demo yet a third way of typing unsafeCoerce using an alternative representation of type equality proofs for this blog post, but have not been able to get it to work in time. I believe that not only may this work, but it is highly possible that Haskell 98 + GeneralizedNewtypeDeriving admits unsafeCoerce, using type classes to simulate higher ranked types. Research for another day.

The moral is that GeneralizedNewtypeDeriving really needs to be fixed. In an ideal world it would re-type check the definitions to ensure no fishy business was going on. Sadly, I suspect this would break large amounts of existing code. Until then, GeneralizedNewtypeDeriving should be treated as on par with unsafeCoerce. It might be correct, but it just as well might cause your machine to catch fire.

PS: I feel a bit guilty posting this on the same day as (the always brilliant) Bob Harper posted one of his anti-Haskell polemics on the unsafe nature of Typeable. At the Oregon Programming Language Summer School this year, several other Haskellers asked me to publish some of the more demonic Haskell code I have written. I have been traveling since then, and will be again in the coming week, but I felt compelled to show off at least some of the stuff I have done.

Monday, April 23, 2012

Memory Regions à la carte

In the previous post we saw how to implement checked exceptions using free monads. The crucial idea was that, so long as the underlying monad was written using the "à la carte" style, we could add exceptions in such a way that

  1. Programms would only compile if all exceptions were caught
  2. We could program exactly as we would in the underlying (free) monad
  3. Type inference was sufficient such that we did not require any type signatures
  4. Client code only required language extensions to write functions with uncaught exceptions

Related to checked exceptions is another kind of fine grained effect control: memory regions. The basic idea is to allow blocks of mutable memory to be accessed in such a way that the type system ensures external purity. This is already achievable in Haskell using the ST monad. But, ST has some clunky-ness. In particular ST does not compose well with other monads.

An alternative to the ST monad is the ST monad transformer. There are problems with the ST transformer though. The one we fix is that it is not well suited for code that involves multiple memory regions. The second problem is that the ST transformer does not play nice with all monads. I strongly suspect this second problem still exists with the code presented here. In fact, our implementation uses STMonadTrans internally, and I am not convinced it can be implemented in terms of ST alone. More about this at the end.

As much as possible we would like to reuse the machinery built for checked exceptions. Lets get started.
Nothing that interesting in the extensions and the API we expose is again very simple With the imports out of the way the first thing we need to do is to define a datatype to express the operations we want our monad to be able to perform. There are three essential operations involving references, so we implement all three of them as a single functor. The one complexity is that reference are quantified over a type, so our data constructors need to be existentially quantified. We this use a GADT. The meat of our implementation is a function that converts a region into a ST monad transformer. We can easily implement this using the removeStep function from the last post Now, we define the public API. The first thing is a type for references in a region The big problem implementing memory regions by stacking ST transformers on top of each other, is that we have no way of knowing which of the various STs is intended when we use newSTRef. We solve this by adding "keys" associated with each region now, when a user wants to use a new region, they will define a function taking a key and returning an appropriate action the remainder of the API is trivial

What can we do with this? Well, lets test it out running it produces the expected results.

*Main> run $ region $ testRegion1
"in region"
we can do some more interesting things which produces the expected
*Main> run $ region $ testRegion2
"set in region 2"
we can even combine regions and exceptions
*Main> run $ region $ testRegion3
"set in region 2 before exception was thrown"
Fine grained effect control. And it didn't cost too much either. Of course, the STMonadTrans library provides support for arrays, but those can easily be added by amending the definition of RG to have more constructors.

Now, we haven't solved the second problem with STMonadTrans, namely, that we can have effects happen unexpected number of times if we are careful about it. This is obvious, since we can define a simple monad transformer as a free monad: and an algebra for running it supporting functions such as now, consider the following function which has no control flow operators the problem is that if we run it

*Main> runM $ region $ unexpected
[1,3,6,10,15,21,28,36,45,55,66,78,91,105,120,136,153,171,190,210,231,253,276,300,325,351,378,406,435,465]
we see that combining mutation with a monad that represent multiple answers produces interesting results. The documentation for STMonadTrans warns that it might break referential transparency--I'm not sure that I understand why this is (how is this different than a StateT over a list?) but, this warning is enough for me to not be willing to advocate the use of these free regions just yet.

Sunday, April 22, 2012

Checked Exception for Free

Monad transformers are a very effective abstraction mechanism. But, they can make code difficult to reason about, and they sometimes suffer from a form of what C++ developers call the "n times m" problem. That is, if you separate implementation in the form of transformer stacks, from interfaces in the form of typeclasses, you potentially need an instance for each typeclass/transformer combination. The GHC developers have generally avoided transformers1. While, it has been suggested that best practice (at least most of the time) is to define custom monads, either by defining your own machinery, or by wrapping a transformer stack in a newtype.

One of my fellow students suggested that "the right way" to do it was to list the operations you wanted as functor, and then to simply use the free monad over this set of operations. This approach is quite elegantly described in Swierstra's Data types à la carte

Given this discusion, I spent a fair chunk of my weekend thinking about just how far one could go with this approach. It is known that most of the common monad transformers (really everything except Cont) can be encode as free monads. We need something a bit more interesting.

I am bit obsessed with fine grain control of side effects. Perhaps, the most widely used system for fine grained effect control is Java's checked exceptions. Checked exceptions in Java get a bad rap, and with reason. In a language without type inference they are clunky to work with. And, the hierarchical model makes it too appealing to But, Haskell does not have these problems. What's more, Haskell is flexible enough that checked exceptions can be implemented as a library. You can download a checked exception library on Hackage today. In this post we will implement a simple checked exceptions library using an interesting variant of the "À la Carte" approach.

Machinery

First thing is to define some of the machinery we will use for building the exceptions library. We need some extensions we also need one of IncoherentInstances or OverlappingInstances. It does not matter which we use for this post. But, for somethings we will want to do with this machinery we need IncoherentInstances

Right now we are only interested in the supporting machinery We need free monads. There is a library for that.

Data types a la carte heavily uses functor coproducts, as will we.

We also need a way to express that a given functor is subsumed by a coproduct. This is the one spot where we have overlapping instances.

Swierstra's model for how we use these is to define algebras over functors, and build evaluators to "run" free monads by folding over those algebras. We will make special use of one such algebra. The void functor is uninhabited. It is, though, an honest functor. In particular, it is the initial object in the category of endofunctors, and the initial object under functor coproducts, making it surprisingly useful. Often we would need to define a typeclass to encode an interesting algebra, but here we can express it trivially foldFree is just like fold over lists, but over free monads.

So far we haven't deviated much from the norm. Now though, we build some unusual machinery. One of the reasons to prefer free monads over monad transformers is to avoid the linear nature of a stack of monad transformers. Sometimes though, a stack is just what we need. We can recover the stack by assuming that the functor inside a free monad is not an arbitrary tree of coproducts, but rather is one long list (at least it has a list like head) The reason for this is that in addition to the algebraic style, sometimes we want to run one functor at a time

That is all the machinery for "A la Carte" programming. Now, we just have to use it.

Checked exceptions

We don't need much this time as far as extensions the API is small and self documenting Now it is easy. We need a functor to represent exceptions. Since, Maybe is the free monad over the functor that has only one constructor taking no parameters, Either is the free monad over the functor that has only one constructor taking a single arbitrary object as a parameter. raising exception is now trivial catching exceptions is a bit more involved, but still rather simple. The idea is to use the removeStep function to "remove" the appropriate Throws functor from the code we are catching we have to assume that this is the front of the functor co-list.

So now you are probably wondering how this works in practice. Lets get a new file with no extensions all of these work without a hitch. Type inference correctly figures them all out (we can't write the type signatures without turning some extensions on, however). The more interesting test is Which handler gets called? Turns out it is always the first (as we would hope). Interestingly enough, it compiles and uses the first even if we use OverlappingInstances instead of IncoherentInstances.

Prelude Main> testExceptions
["test1","test2, expect e1 Ex1","test3, expect e2 Ex2","test4, expect e1 Ex1","test5, first handler Ex1"]

Next up: memory regions.

1. Norman Ramsey Stack Overflow

Free Monad Primer

I plan a set of posts over the next few days on cool things you can do with free monads. This post is an attempt to get all the abstract nonsense out of the way first.

A Brief, Incomplete, and Mostly Wrong History of Programming Languages provides some background
1990 - A committee formed by Simon Peyton-Jones, Paul Hudak, Philip Wadler, Ashton Kutcher, and People for the Ethical Treatment of Animals creates Haskell, a pure, non-strict, functional language. Haskell gets some resistance due to the complexity of using monads to control side effects. Wadler tries to appease critics by explaining that "a monad is a monoid in the category of endofunctors, what's the problem?"

A monoidal category is a category C equipped with an additional operation called the tensor product $\otimes$ which is a bifunctor from $C \times C \to C$, an identity object $I$, such that forall objects $x,y,z \in C$, $x \otimes I \cong I \otimes x \cong x$, and $(x \otimes y) \otimes z \cong x \otimes (y \otimes z)$.

Hask (the category of Haskell types...ignoring non-totality) is a monoidal category with product types (two-tuples) serving as the bifunctor, and the terminal object () serving as the identity object. Conveniently, two-tuples are also products in Hask.

In such a category, a monoid object $m$ is any object equipped with two morphism $$\eta : I \to m$$ $$\mu : m \otimes m \to m$$ such that some diagrams commute. We can express this in Haskell as and the diagrams correspond to three laws this class is of course, equivalent to the usual

We can construct a category where the objects are all the endofunctors over Hask, and the morphisms are all the natural transformations over these functors.

Aside: due to parametricity any function (in a large subset of Haskell) with a type like is a natural transformation.

This category of endofunctors is a monoidal category with functor composition as the bifunctor and the identity functor so, then we should be able to define monoids in this category with the same laws as above. A moment of reflection and we see this is equivalent to which is what the monad class in Haskell should be.

Okay, so in Hask we also have coproducts of the form we also get coproducts in the cateogry of endofunctors Why is this useful? Well every Haskeller knows that given some type we can construct a monoid over that type which we can express perhaps more explitly as that is, as the coproduct of the tensor identity and the tensor product of a and List a. This is the "free monoid".

Lets translate this structure to funtors which is itself equivalent to lets define the monad instance that was easy, now okay, so the hard part is join. How do we define that?

Recall that so and now we have free monads!

Wednesday, February 29, 2012

Haskell Supports First-Class Instances

Haskell’s type classes are one of its most powerful and distinctive features. They are also, at times, rather annoying. One of the most annoying facts about Haskell’s type classes is that they are decidedly not first class. GHC 7.4 brings first class constraints to Haskell, but most of the difficulties come from the lack of first class instance. Not having first class instances has a range of negative consequences.

For example, type class instances are automatically carried over module boundaries. If you don’t think this is a problem you have never had to deal with Control.Monad.Either and Control.Monad.Error in the same program.

The ability to have only a single instance for each class associated with a type makes little theoretical sense. Integers form a monoid under addition. Integers also form a monoid under multiplication. In fact, forming a monoid under two different operations is the crux of the definition of a semi-ring. So, why does it make any sense at all that in Haskell there can be at most one definition of “Monoid” for any type?

Scala’s implicit objects do not suffer from these limitations1. Neither do Coq’s type classes2. There have been various proposals for “first class instances”3, but none have made it into Haskell proper.

Or have they? The remainder of this blog post will show how to get first class instance in GHC 7.4.1. To the best of my knowledge this has not been published by anyone else yet (although Edward Kmett's blog has gotten pretty close4).

First of all we are going to need some language extensions, and some imports for later on:

Other than the truly evil Unsafe.Coerce, there is nothing here that threatening. No Undecidable/Incoherent/CommunistInstances. No kind polymorphism, monomorphism enabling, or data type promoting hogwash. We didn’t even import GHC.Prim, which is pretty amazing considering that without it we can’t use the Constraint keyword.

Following Kmett we define a type for explicit dictionaries

Now is the magic part. We define a function that lets us use an explicit dictionary to do implicit things

This function works, but by itself isn’t very useful. So far to produce a dictionary, you have to already have a type class instance in scope. We need a way to keep instances from taking over and flooding across module boundaries. The standard Haskell solution to this is to define a newtype. So, lets define one newtype to rule them all

Okay, this is very good for proxying types of kind *, but we really should also define

What good is this proxy definition? Well, in GHC newtypes disappear completely at runtime. That means (ignoring type families) that the representation of an instance of some class for Proxy Label Type is the same as it is for Type, so we can define a truly mischievous little function

Lets add some stuff to make working with these things friendlier

And that’s it. We have a fully functional first class instances system. Here is how you use it. Lets define a new version of the Show type class

Now, normally there is only one way to Show a thing. But, our Show' wont be so lame

Some stuff to get the instance:

Now we need a function to test. Luckily, we turned on FlexibleContexts at the beginning

In GHCi we can confirm that it works.

*Main> showInt 3

<interactive>:118:1:
    No instance for (Show' Int)
      arising from a use of `showInt'
    Possible fix: add an instance declaration for (Show' Int)
    In the expression: showInt 3
    In an equation for `it': it = showInt 3
*Main> withDict stupidShowIntDict (showInt 3)
"from StupdiShow"
*Main> withDict normalShowIntDict (showInt 3)
"3"
*Main>

“Big f-ing deal” you say. “That’s what we had newtypes for. This is pretty useless.” Well, I turn to the most popular example of bad legacy design in the Haskell standard library. Monad is not a subtype of Functor, and even worse, many types have nice Monad instances, but no instances of Applicative. We can fix that.

First of all, we need to get to recreate our dictionary making machinery for Proxy1. Disturbingly, the code (up to the numerals) is identical to what we had before

Now we can define some functions to produce Functor and Applicative instances for all monads

So, now when you run into someone who for unknown reasons defined their own list type and only made it a Monad

You don’t need to be stuck. Just define a function so you don’t need to stick type signatures all over the place

And you got your Functor!

*Main> fmap (+1) $ Cons 1 Nil

<interactive>:121:1:
    No instance for (Functor List)
      arising from a use of `fmap'
    Possible fix: add an instance declaration for (Functor List)
    In the expression: fmap (+ 1)
    In the expression: fmap (+ 1) $ Cons 1 Nil
    In an equation for `it': it = fmap (+ 1) $ Cons 1 Nil
*Main> withDict listFunctor $ fmap (+1) $ Cons 1 Nil
Cons 2 Nil

I’m not quite ready to make this my first package on Hackage. The API still needs some kinks worked out. And, I need to do a lot more testing to see if this creates any opportunities for segmentation faults. Still, it is nice to know that passable instances are at least possible in Haskell.

1. Oliveira and Ordesky http://ropas.snu.ac.kr/~bruno/papers/TypeClasses.pdf
2. Sozeau and Oury http://mattam.org/research/publications/First-Class_Type_Classes.pdf
3. Oliveira and Sulzmann http://www.cs.mu.oz.au/~sulzmann/manuscript/objects-unify-type-classes-gadts.ps
4. The Comonad.Reader http://comonad.com/reader/2011/what-constraints-entail-part-1/

Hello World

Hello World!