24 days of Hackage, 2015: day 22: Shake: the dynamic build system

Table of contents for the whole series

A table of contents is at the top of the article for day 1.

Day 22

(Reddit discussion)

make, the venerable build system, is flawed in many ways. For me, the main two problems were

Many improved build systems have come along since make. For example, a couple of years ago, I discovered SCons, the Python-based dynamic build system, and it made my life so much better, because I could treat it as an embeddable library and have tasks call my Python code. I still have SCons programs in use.

However, Shake has come along, and I’m not looking back. It has the virtues of SCons and more. I already use Shake for new build setups and plan to migrate my old SCons builds to Shake the next time they need any significant reworking.

Note that the GHC build is migrating to Shake, so serious dogfooding is going on.

For information about Shake

Instead of repeating some fraction of the extensive and excellent documentation on Shake, I direct you to the Web site, which includes tutorial and reference material. There is also a Google group mailing list. The GitHub repo is always extremely active.

Also, check out creator Neil Mitchell’s blog.

A little example of Shake

I decided to dogfood Shake myself for this post. It turns out that I should have done this earlier. I had stupidly done a repetitive task every day, copying information into a growing table of contents of the whole Days of Hackage series, when I should have written a program to do it automatically.

Here’s an example Shake program that I finally wrote to do it. It is a dynamic build because the table of contents depends on running a process (the Hugo static site generator) and saving off information from that, as well as extracting information out of Markdown files. I extract the day and title from the Markdown source files, and match up the day with the generated HTML URL to create a table of contents.
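Before the full program, here is a minimal base-only sketch of the last step: pairing each extracted (day, title) with a URL and rendering a Markdown list item. It assumes both lists are already in the same order. The names Entry, pairUp, and render, and the sample title and URLs, are hypothetical and exist only for illustration.

```haskell
module Main where

import Text.Printf (printf)

-- | One table-of-contents row: day number, post title, generated URL path.
data Entry = Entry Int String String

-- | Pair each (day, title) with the URL in the same position.
-- Assumes both lists are already sorted the same way (by day).
pairUp :: [(Int, String)] -> [String] -> [Entry]
pairUp = zipWith (\(day, title) url -> Entry day title url)

-- | Render one row as a Markdown list item linking to the post.
render :: Entry -> String
render (Entry day title url) = printf "- Day %d: [%s](/%s/)" day title url

main :: IO ()
main = mapM_ (putStrLn . render) $ pairUp
  [ (1, "A hypothetical first post")
  , (22, "Shake: the dynamic build system")
  ]
  -- Hypothetical URL paths, shaped like the output of the static site generator.
  [ "blog/2015/11/30/hypothetical-day-1"
  , "blog/2015/12/22/hypothetical-day-22"
  ]
```

Because the pairing is purely positional, a missing or extra entry in either list would silently misalign the rows, which is the kind of "stringly typed" risk the real program also accepts.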

Compilation

I compile my Shake programs because that makes things much faster, but you can also run the shake command instead.

Dynamic dependencies

Note that I could have used an oracle to save structured information, but for illustration I saved it into files instead, at the expense of “stringly typed” assumptions about the contents of the files.
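For comparison, here is a rough sketch of what the oracle approach might look like, assuming a recent Shake release in which addOracle uses the RuleResult type family. This is not the code I use: the PostSources type, the post-count target, and the relative content/post directory are all hypothetical.

```haskell
{-# LANGUAGE DeriveDataTypeable, GeneralizedNewtypeDeriving, TypeFamilies #-}
module Main where

import Development.Shake
import Development.Shake.Classes (Binary, Hashable, NFData, Typeable)

-- | Hypothetical oracle question: "which Markdown sources exist?"
newtype PostSources = PostSources ()
  deriving (Show, Typeable, Eq, Hashable, Binary, NFData)

-- | The oracle answers with a typed list of paths, not lines in a file.
type instance RuleResult PostSources = [FilePath]

-- | Pure helper so the rule body stays trivial.
countLine :: [FilePath] -> String
countLine paths = show (length paths) ++ "\n"

main :: IO ()
main = shakeArgs shakeOptions{shakeFiles = "_build"} $ do
  -- Register the oracle; the returned function asks the question.
  getPostSources <- addOracle $ \(PostSources ()) ->
    getDirectoryFiles "content/post" ["*hackage-2015-day-*"]

  want ["_build/post-count"]

  -- This rule reruns only when the oracle's answer changes.
  "_build/post-count" %> \out -> do
    sources <- getPostSources (PostSources ())
    writeFileChanged out (countLine sources)
```

The trade-off is more type-level ceremony in exchange for passing a real list of paths between rules rather than parsing lines back out of an intermediate file.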

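Shake is all about want (the targets to build) and need (what a rule depends on). Some functions, such as getDirectoryFiles and readFileLines, implicitly record a dependency, so no separate need is required. I use writeFileChanged, which writes the file only when the new contents differ, so that rules depending on an unchanged file do not have to rerun.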
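As a minimal sketch of those three pieces together, assuming a hypothetical input file notes.txt in the current directory (the file names and the shout helper are made up for illustration):

```haskell
module Main where

import Data.Char (toUpper)
import Development.Shake

-- | Hypothetical transformation applied to the input file.
shout :: String -> String
shout = map toUpper

main :: IO ()
main = shakeArgs shakeOptions{shakeFiles = "_build"} $ do
  -- want: the target this build should produce.
  want ["_build/notes-upper.txt"]

  -- Rule describing how to build that target.
  "_build/notes-upper.txt" %> \out -> do
    let src = "notes.txt"
    -- need: declare the dependency explicitly before reading the file.
    -- (Shake's readFile' would declare it implicitly instead.)
    need [src]
    contents <- liftIO (readFile src)
    -- Only touches the output when the new contents differ.
    writeFileChanged out (shout contents)
```

Editing notes.txt and rerunning rebuilds the target. Rerunning without changes does nothing.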

I apologize for the non-explicit imports. Shake being an extensive domain-specific language, I decided to just absorb the vocabulary, but I realize it makes this harder to read.

{-# LANGUAGE QuasiQuotes #-}

module Main where

import MultilineRe (multilineRe)

import Development.Shake
import Development.Shake.FilePath

import Text.Regex.PCRE.Heavy (Regex, scan)
import Data.Maybe (listToMaybe)
import Text.Printf (printf)
import Control.Monad (zipWithM)

shakeDir :: FilePath
shakeDir = "_build"

-- | File, sorted by day, containing a line for each Markdown source of a post.
daysSources :: FilePath
daysSources = shakeDir </> "days-sources"

-- | File, sorted by day, containing a line for each generated Day of Hackage post.
daysUrls :: FilePath
daysUrls = shakeDir </> "days-urls"

tocFile :: FilePath
tocFile = shakeDir </> "TOC.md"

-- | Base directory of blog source.
blogDir :: FilePath
blogDir = "/Users/chen/Sync/ConscientiousProgrammer"

-- | Base directory of generated blog.
publicDir :: FilePath
publicDir = "/Users/chen/ConscientiousProgrammer-public"

-- | Generated HTML directory for each post.
urlsGlob :: FilePattern
urlsGlob = "blog/2015/1*/*/*hackage-2015-day-*"

-- | Location of Markdown blog posts.
postDir :: FilePath
postDir = blogDir </> "content/post"

-- | Rely on naming convention here.
postGlob :: FilePattern
postGlob = "*hackage-2015-day-*"

main :: IO ()
main = shakeArgs shakeOptions{shakeFiles=shakeDir} $ do
  want [tocFile]

  tocFile %> \out -> do
    sourcePaths <- readFileLines daysSources
    urls <- readFileLines daysUrls

    -- extractTOCEntry already runs in Action (readFile' tracks each source), so no liftIO.
    toc <- zipWithM extractTOCEntry sourcePaths urls
    writeFileChanged out $ unlines $ map formatTOCEntry toc

  -- Run Hugo to generate a directory for each post.
  daysUrls %> \out -> do
    need [daysSources]
    unit $ cmd (Cwd blogDir) "hugo"
    Stdout stdout <- cmd Shell (Cwd publicDir) "echo" urlsGlob
    writeFileChanged out $ unlines $ words stdout

  daysSources %> \out -> do
    daysFiles <- getDirectoryFiles postDir [postGlob]
    writeFileChanged out $ unlines $ map (postDir </>) daysFiles

-- | A day's entry in the TOC.
data TOCEntry =
  TOCEntry { _day :: Int
           , _title :: String
           , _url :: String
           }
  deriving (Eq, Ord)

dayTitleRegex :: Regex
dayTitleRegex = [multilineRe|^title:.*day\s+(\d+):\s*([^"]+)|]

extractTOCEntry :: FilePath -> String -> Action TOCEntry
extractTOCEntry sourcePath url = do
  text <- readFile' sourcePath
  case listToMaybe (scan dayTitleRegex text) of
    Just (_, [dayString, title]) ->
      return $ TOCEntry (read dayString) title url
    _ ->
      error $ printf "failed to extract day and title from %s" sourcePath

formatTOCEntry :: TOCEntry -> String
formatTOCEntry entry =
  printf "- Day %d: [%s](/%s/)" (_day entry) (_title entry) (_url entry)

More resources

A nice introductory talk by Neil aimed at a non-Haskell audience to try to sell Shake as a killer app:

His talk at Haskell eXchange 2015, Defining your own build system with Shake.

Conclusion

If you need to build stuff, use Shake. I’m very excited by its active development.

All the code

All the code for my article series is at this GitHub repo.
