24 days of Hackage, 2015: day 17: ansi-wl-pprint: avoiding string hacking

Table of contents for the whole series

A table of contents is at the top of the article for day 1.

Day 17

(Reddit discussion)

Today we do the inverse of what we did on day 14, which was parsing from text to an abstract syntax tree. Today we pretty-print an abstract syntax tree to text.

I think that in the wider world of programming, it’s very common to see ad hoc, inflexible solutions to this problem, using string hacking and maybe at best some string-based templates. It’s typically hard to quickly customize indentation styles, expression wrapping, and other such features when using such an ad hoc solution.

In the Haskell community, a better solution is more often used, because of the number of quality libraries out there to help in pretty-printing.

ansi-wl-pprint is one such library, that includes not only a lot of useful convenience combinators, but also provides support for colored terminal output, if you want to use that (you don’t have to).

Update

A commenter on Reddit noted that there’s another version of the pretty-printing library, annotated-wl-pprint. This looks really cool, and here’s a video by David Christiansen on its usage in Idris.

Back to our example expression type

Recall that our expression type is:

-- | Variable in an expression.
type Var = String

-- | Expression.
data Exp
  = N Int          -- ^ number
  | V Var          -- ^ variable
  | Plus Exp Exp   -- ^ sum
  | Times Exp Exp  -- ^ product

Some sample pretty-printed output

Let’s pretty-print a sample expression. (In real life, we’d create a corpus of expressions to pretty-print and verify them, and also write QuickCheck tests to confirm various properties.)

{-# LANGUAGE QuasiQuotes #-}

module SymbolicDifferentiation.AnsiWlPprintSpec where

import qualified SymbolicDifferentiation.Earley as Earley
import qualified SymbolicDifferentiation.AnsiWlPprint as AnsiWlPprint

import Test.Hspec (Spec, hspec, describe, it, shouldBe, shouldSatisfy)

import Data.String.Here (hereLit)

spec :: Spec
spec =
  describe "anti-wl-pprint for symbolic differentiation expression" $ do
    let eString = [hereLit|(x*1) + 2*y + ((3+z) * (4*b)) * (5+d+w)|]
    let ([e], _) = Earley.parses eString
    it eString $ do
      show (AnsiWlPprint.prettyPrint e) `shouldBe`
        [hereLit|x*1 + 2*y + (3 + z)*4*b*(5 + d
+ w)|]

To avoid the hassle of constructing an Exp value by hand, we’re reusing our parser.

Exercise for the reader: write a QuickCheck generator of random expressions, and write a test that verifies that when you roundtrip a random Exp through our AnsiWlPprint.prettyPrint and then back to an Exp through Earley.parses, we get the same Exp back!

The example here formats sums to be space-separated and products to be run together, with minimal parenthesizing and optional line breaking.

The code

The code is brief. The Haddock documentation for ansi-wl-pprint is excellent, so you can check it out for an explanation of the combinators used.

The single fundamental concept behind the library is that there is a data type Doc that represents a “document”, a piece of pretty-printed information, and there is an algebra of combining documents to arrange them in various ways, and then you can use one of a variety of interpreters over the document to perform the final conversion to a string at the end.

module SymbolicDifferentiation.AnsiWlPprint where

import SymbolicDifferentiation.AlphaSyntax (Exp(N, V, Plus, Times))
import Text.PrettyPrint.ANSI.Leijen
       ( Doc
       , int, text, char
       , (</>), (<//>), (<+>), (<>)
       , parens
       )

-- | Very primitive, for illustration only!
--
-- Spaces for sums, but no spaces for products.
-- Soft breaks before operators.
prettyPrint :: Exp -> Doc
prettyPrint e = p 10 e

-- | Pretty-print inside a precedence context to avoid parentheses.
-- Consider + to be 6, * to be 7.
p :: Int -> Exp -> Doc
p _ (N n) = int n
p _ (V v) = text v
p prec (Plus e1 e2) = maybeParens (prec < 7)
  (p 7 e1 </> char '+' <+> p 7 e2)
p prec (Times e1 e2) = maybeParens (prec < 6)
  (p 6 e1 <//> char '*' <> p 6 e2)

maybeParens :: Bool -> Doc -> Doc
maybeParens True = parens
maybeParens False = id

Exercise for the reader: there are many ways to improve the pretty-printer, for example to line up terms to look like

a + b*c
  + d*e
  + f*g

I would probably first transform Exp into a SugaredExp that represents an intermediate abstract syntax of terms and factors, and then write a pretty-printer over that.

Duplication of work?

You might wonder about the duplication of effort in writing the parser and the pretty-printer. Can’t one write a single high-level specification to do both simultaneously? Yes, there are tools for that, but outside the scope of this article.

Conclusion

Pretty-printing is a very important part of a text-based user interface, whether for inspecting generated code or creating good error messages. ansi-wl-pprint is a good library to use for pretty-printing.

All the code

All my code for my article series are at this GitHub repo.

comments powered by Disqus