24 days of Hackage, 2015: day 4: wreq: Web client programming; with notes on lens and operator syntax
Dec 4, 2015 · 11 minute read · Tags: Haskell, Hackage, wreq, JSON, lens, aeson, syntax, Pittsburgh Code and Supply, Standard ML, OCaml, Elm, Elixir, types, domain-specific languages
Table of contents for the whole series
A table of contents is at the top of the article for day 1.
Day 4
In the late 1990s, I eagerly bought the book “Web Client Programming with Perl” and used the LWP library to scrape the Web in automated fashion. I continued doing that into the 2000s. I am happy that nowadays I can just use Haskell for this kind of programming, and do it succinctly too.
Today’s topic is wreq, Bryan O’Sullivan’s high-level library for doing Web client programming, designed specifically for usability.
wreq makes use of the aeson ecosystem for JSON and the lens ecosystem, including lens-aeson, so you may want to check out Ollie’s 2012 Days of Hackage posts on aeson and lens.
Since wreq already has an extensive tutorial and reference documentation, I’m not going to repeat its explanations. Instead, I’m going to give an example of use that should be simple enough to be understood from context, then discuss the issue of using operator syntax in Haskell.
The task
I’m a member of many groups on Meetup. It’s often useful for me to get information using the official Meetup API rather than go around clicking on a Web site or in a mobile app. Why do by hand what I can do much more efficiently and correctly with code?
Here’s a very simplified example of something I might want to do with Meetup. I’ve been active in the Pittsburgh Code and Supply community, which has a Meetup site with a packed calendar of events (it’s on hiatus now in December for the holidays, but is otherwise very active). Maybe I want to find out what upcoming events there are, and search for events of interest according to some criteria. For our toy example here, let’s say I want to find the ten upcoming events and get their names and venue names, and make sure there’s at least one event that has a name and venue name already set up (sometimes, an event is proposed but no venue has been found yet).
A test
Yesterday, on day 3 of this article series, I mentioned that I like using HSpec, so let’s use HSpec.
{-# LANGUAGE OverloadedStrings #-}
import WreqExample (GroupId, eventName, venueName, getMeetupEventInfos)
import Test.Hspec ( Spec, hspec, describe, it
                  , shouldSatisfy, shouldNotSatisfy
                  )
import qualified Data.Text as Text
We are using the text packed Unicode string type, because that’s what wreq uses. OverloadedStrings is a convenient GHC extension that allows string literals in code to be treated as Text values rather than String. Ollie discusses this extension in his 2014 Days of GHC Extensions.
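As a small illustration of what OverloadedStrings changes, here is a sketch that is not part of the example project (the names greeting and plain are made up for this illustration). The same kind of literal can be typed as Text or as String, depending only on its annotation:

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main (main) where

import Data.Text (Text)
import qualified Data.Text as Text

-- With OverloadedStrings, this literal is elaborated via fromString,
-- so it can be a Text without an explicit Text.pack.
greeting :: Text
greeting = "Pittsburgh Code and Supply"

-- The same kind of literal still works as an ordinary String.
plain :: String
plain = "Pittsburgh Code and Supply"

main :: IO ()
main = do
  print (Text.length greeting)         -- number of characters in the Text
  print (greeting == Text.pack plain)  -- both spell the same characters
```

Without the extension, the definition of greeting would be a type error and would need Text.pack around the literal.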
Also, since I’m operating in test-driven development style, I wrote this test first, before writing the WreqExample module: I only wrote the imports for what I need for the test.
spec :: Spec
spec =
  describe "wreq" $ do
    it "there are named, located Pittsburgh Code and Supply events coming up" $ do
      -- Warning! This is a stateful test going out to the Web.
      events <- getMeetupEventInfos pittsburghCodeAndSupplyId
      events `shouldNotSatisfy` null
      events `shouldSatisfy` any
        (\event -> (not . Text.null . eventName) event
                   && (not . Text.null . venueName) event)
pittsburghCodeAndSupplyId :: GroupId
pittsburghCodeAndSupplyId = "13452572"
Module signatures
If Haskell had module signatures, as Standard ML and OCaml do, I would write an explicit signature for the module I intend to implement, and the implementation would have to conform to it. Haskell doesn’t, so the best we can do is operate in a “duck typing” manner at the module level: we rely implicitly on compilation failing when we import a module implementation that does not conform, rather than on matching against an explicit signature that needs no implementation to exist.
Here are the types we need (in a pseudo-syntax as though Haskell had module signatures):
type GroupId -- abstract
type EventInfo -- abstract
-- abstract type accessors
eventName :: EventInfo -> Text
venueName :: EventInfo -> Text
getMeetupEventInfos :: GroupId -> IO [EventInfo]
Implementation
Imports
import Network.Wreq (Options, defaults, param, getWith, asValue, responseBody)
import Data.Text (Text)
import Data.Aeson (Value)
import Control.Lens (view, set, toListOf)
import Data.Aeson.Lens (key, _Array, _String)
Types
-- | Information that we care about from a Meetup event.
data EventInfo =
  EventInfo { eventName :: Text
            , venueName :: Text
            }
  deriving (Show)
-- | A valid Meetup group ID.
type GroupId = Text
The Web client part
Since we’re only making one request, and are not doing any error handling, but letting wreq throw exceptions instead, the Web client part is very brief. The Meetup API allows returning information as JSON.
meetupEventsUrl :: String
meetupEventsUrl = "https://api.meetup.com/2/events"
We perform a GET with query parameters. wreq uses lens as its domain-specific language for creating options for GET, so let’s create a wreq Options value, by setting the parameters one after another using a builder pattern starting with the wreq defaults:
eventsOptions :: GroupId
              -> Options
eventsOptions groupId =
  set (param "page") ["10"] (
    set (param "order") ["time"] (
      set (param "status") ["upcoming"] (
        set (param "group_id") [groupId] (
          set (param "format") ["json"] defaults))))
We begin by going out to the Web to get back a response, whose body is a lazy ByteString:
getMeetupEventInfos :: GroupId -> IO [EventInfo]
getMeetupEventInfos groupId = do
  response <- getWith (eventsOptions groupId) meetupEventsUrl
The JSON part
Then we parse the lazy ByteString response, including the headers and the body, into an untyped JSON object, an aeson Value:
  jsonResponse <- asValue response
More precisely, Value is unityped:
type Object = HashMap Text Value
type Array = Vector Value
data Value = Object !Object
           | Array !Array
           | String !Text
           | Number !Scientific
           | Bool !Bool
           | Null
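To make “unityped” concrete, here is a minimal sketch (the function describeValue is a made-up name, not part of the example project). Every JSON value has the single static type Value, and which case it is can only be discovered by pattern matching at run time:

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main (main) where

import Data.Aeson (Value (..))
import Data.Text (Text)

-- | Name the JSON case a Value falls into; the case is only known at run time.
describeValue :: Value -> Text
describeValue (Object _) = "object"
describeValue (Array _)  = "array"
describeValue (String _) = "string"
describeValue (Number _) = "number"
describeValue (Bool _)   = "boolean"
describeValue Null       = "null"

main :: IO ()
main =
  -- All four list elements have the same static type, Value.
  mapM_ (print . describeValue) [String "wreq", Number 4, Bool True, Null]
```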
The lens part
It was annoying figuring out from the official Meetup API site what fields I needed from the response and what their types were supposed to be. In practice I just saved off JSON from a representative query and looked at some events to see what I wanted. I was told where to find the automatically generated documentation of all the API methods but it was not ideal. A later Day of Hackage will discuss what I did about this problem.
We extract the list of events, using a traversal to get the whole list, which is encoded as a JSON array in the top-level JSON object’s results field:
  let events = toListOf (responseBody
                         . key "results"
                         . _Array . traverse
                        ) jsonResponse
Here we use toListOf from lens with a traversal and a JSON object to pull out everything from that traversal.
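The same idea can be tried without going out to the Web. The following is only a sketch under assumptions: sampleJson is a made-up payload shaped like the results array described above, and it relies on lens-aeson letting a Text that contains JSON be traversed directly, so no responseBody step is involved:

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main (main) where

import Control.Lens (toListOf)
import Data.Aeson (Value)
import Data.Aeson.Lens (key, _Array, _String)
import Data.Text (Text)

-- Hypothetical sample payload, not real Meetup output.
sampleJson :: Text
sampleJson =
  "{\"results\": [{\"name\": \"Haskell night\"}, {\"name\": \"Elm lunch\"}]}"

-- Every element of the "results" array, as raw JSON values.
sampleEvents :: [Value]
sampleEvents = toListOf (key "results" . _Array . traverse) sampleJson

-- One step further: the "name" string of every element.
sampleNames :: [Text]
sampleNames =
  toListOf (key "results" . _Array . traverse . key "name" . _String) sampleJson

main :: IO ()
main = do
  print (length sampleEvents)
  print sampleNames
```

Because traversals compose with the ordinary function composition operator, extending the path by key "name" . _String is all it takes to go from events to event names.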
Finally, since we only want, for each event, its name and its venue’s name (the venue’s name is actually a field in a venue object), we convert each event’s JSON into an EventInfo:
  return (map jsonToEventInfo events)
We again use lens, at the level of an individual event object, to extract what we want from it:
-- | Extract our typed data model from an untyped JSON object.
jsonToEventInfo :: Value -> EventInfo
jsonToEventInfo json =
  EventInfo { eventName = view (key "name" . _String) json
            , venueName = view (key "venue"
                                . key "name" . _String) json
            }
Here we use the view function of lens to apply a lens to the JSON object to pull a field out of it.
And we’re done! We’ve written a script that looks pretty much like what you’d write in Perl or Python. It will also “fail” in similar ways, because we’re basically not using any types at all; even the final result just has strings, which may or may not be empty, whatever that’s supposed to mean. For example, if you try to find a field by a string key that doesn’t exist, the particular code here will just silently give back an empty string. Can we do better? Yes, there are various ways to do better. Stay tuned for a later Day of Hackage.
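To see the silent empty string for yourself, here is a small sketch on a made-up event that has no venue. It also shows preview from lens, which is one modest way to keep the “missing” case visible; this is an illustration only, not necessarily the approach a later article takes:

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main (main) where

import Control.Lens (preview, view)
import Data.Aeson.Lens (key, _String)
import Data.Text (Text)

-- Hypothetical event JSON that has a name but no "venue" object.
eventWithoutVenue :: Text
eventWithoutVenue = "{\"name\": \"Haskell night\"}"

-- view combines all targets monoidally, so zero targets give the empty Text.
venueViaView :: Text
venueViaView = view (key "venue" . key "name" . _String) eventWithoutVenue

-- preview keeps the distinction: Nothing means the field was not there.
venueViaPreview :: Maybe Text
venueViaPreview = preview (key "venue" . key "name" . _String) eventWithoutVenue

-- A field that does exist comes back wrapped in Just.
nameViaPreview :: Maybe Text
nameViaPreview = preview (key "name" . _String) eventWithoutVenue

main :: IO ()
main = do
  print venueViaView
  print venueViaPreview
  print nameViaPreview
```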
Lens operator syntax
If you’ve already used wreq or lens, you may have noticed something strange above: I didn’t use any lens operator syntax. This was deliberate. Although the wreq tutorial gives a little bit of background on lens, the reality is that when some friends who were not experienced lensers or Haskellers asked me how I do Web client programming in Haskell, and I pointed to wreq as being pretty cool, they got immediately stuck on the lens stuff. Looking back at the tutorial, I do see that it jumps straight into operator soup. This is unfortunate. You can immediately use libraries like wreq without having the lens operators memorized already. You have to understand some facts (such as the use of the function composition operator to compose lenses) and have an idea of how the types work out, but one thing you don’t need is the funny operators. I think it’s best to understand how to do things without operators before starting to use them as a convenient shortcut.
For example, an idiomatic way to set the options object, as presented in the “whirlwind tour” section of the wreq tutorial, is:
import Control.Lens ((&), (.~))
eventsOptions :: GroupId
              -> Options
eventsOptions groupId = defaults
  & param "format" .~ ["json"]
  & param "group_id" .~ [groupId]
  & param "status" .~ ["upcoming"]
  & param "order" .~ ["time"]
  & param "page" .~ ["10"]
I don’t like the idea of newcomers to this library just copying and pasting stuff without understanding what it does, or getting the impression that these operators are somehow built into the Haskell language or required for using the library. People really do get these impressions.
I happen to like the reverse function application operator & a lot, although it’s not as suggestive as the exact same operator in many other languages (such as F#, OCaml, Elm, Elixir), where it takes the form of a pipe |> instead, so I feel OK about using it. But the .~ is, I think, not very suggestive to newcomers to lens. Is set lens newValue object so much worse to write or read than object & lens .~ newValue?
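To take some of the mystery out of the operators, here is a sketch that uses only base. The Setter type, set, .~, the Config record, pageL and orderL are all hand-rolled stand-ins written for this illustration; the real lens library’s types are more general. The point is only that .~ can be defined as an ordinary function, and that the two spellings give the same result:

```haskell
module Main (main) where

import Data.Function ((&))
import Data.Functor.Identity (Identity (..))

-- A simplified setter type (an assumption for this sketch, not lens's own).
type Setter s a = (a -> Identity a) -> s -> Identity s

-- | Replace the focused field with a new value.
set :: Setter s a -> a -> s -> s
set l newValue = runIdentity . l (\_ -> Identity newValue)

-- The operator is nothing more than another name for set.
infixr 4 .~
(.~) :: Setter s a -> a -> s -> s
(.~) = set

-- Hypothetical stand-in for an options record.
data Config = Config { page :: Int, order :: String }
  deriving (Eq, Show)

pageL :: Setter Config Int
pageL f c = fmap (\p -> c { page = p }) (f (page c))

orderL :: Setter Config String
orderL f c = fmap (\o -> c { order = o }) (f (order c))

defaultConfig :: Config
defaultConfig = Config { page = 20, order = "none" }

-- Function style: set lens newValue object.
withSet :: Config
withSet = set pageL 10 (set orderL "time" defaultConfig)

-- Operator style: object & lens .~ newValue.
withOperators :: Config
withOperators = defaultConfig & orderL .~ "time" & pageL .~ 10

main :: IO ()
main = do
  print withSet
  print (withSet == withOperators)
```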
(Update of 2015-12-12) Thinking compositionally
One thing that is unfortunately lost if you use pipeline application operators such as & is the compositionality that underlies the power of lenses. So here is a refactoring of eventsOptions that shows how to best think of what we are doing, which is creating a “builder” and applying it:
eventsOptionsRefactored :: GroupId -> Options
eventsOptionsRefactored groupId = builder defaults
  where builder = eventsOptionsBuilder groupId
-- | Recall: type is sugar for GroupId -> (Options -> Options)
eventsOptionsBuilder :: GroupId -> Options -> Options
eventsOptionsBuilder groupId =
  set (param "page") ["10"]
  . set (param "order") ["time"]
  . set (param "status") ["upcoming"]
  . set (param "group_id") [groupId]
  . set (param "format") ["json"]
Note the separation of concerns here: instead of thinking of building an Options object as
- starting with a default
- successively applying an extra setting to it
we think of
- creating an options builder through composition
- applying the builder to the default
Partial application in functional programming is used here to implement the builder pattern: eventsOptionsBuilder takes one argument, and returns an Options transformer of type Options -> Options.
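The payoff of thinking this way is that builders are ordinary functions and compose with other builders before anything is applied. Here is a base-only sketch; Params, setParam, eventsBuilder and upcomingOnly are hypothetical stand-ins for wreq’s Options and lens’s set, invented for this illustration:

```haskell
module Main (main) where

-- Stand-in for wreq's Options: just a list of query parameters.
type Params = [(String, String)]

-- | Set a parameter, replacing any earlier value for the same name.
setParam :: String -> String -> Params -> Params
setParam name value params = (name, value) : filter ((/= name) . fst) params

-- | Partially applied to a group ID, this is a Params -> Params transformer.
-- Composition runs right to left, so "format" is set first.
eventsBuilder :: String -> Params -> Params
eventsBuilder groupId =
    setParam "page" "10"
  . setParam "group_id" groupId
  . setParam "format" "json"

-- | A second, independent builder.
upcomingOnly :: Params -> Params
upcomingOnly = setParam "status" "upcoming"

-- | Builders compose into a bigger builder; only then do we apply it.
upcomingEvents :: String -> Params
upcomingEvents groupId = (upcomingOnly . eventsBuilder groupId) []

main :: IO ()
main = mapM_ print (upcomingEvents "13452572")
```

With the & pipeline style there is no separate value of type Params -> Params to name, test, or reuse; here eventsBuilder and upcomingOnly are both such values.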
Code golf?
To illustrate both the upsides and downsides of using operators (but in this case mostly downsides, I think), here is a code golf version of the entire code:
import Network.Wreq (Options, defaults, param, getWith, asValue, responseBody)
import Data.Text (Text)
import Control.Lens ((&), (.~), (^.), (^..))
import Data.Aeson.Lens (key, _Array, _String)
import Control.Arrow ((>>>), (&&&))
meetupEventsUrl :: String
meetupEventsUrl = "https://api.meetup.com/2/events"
-- | A valid Meetup group ID.
type GroupId = Text
-- | For searching for events in a Meetup group.
eventsOptions :: GroupId
              -> Options
eventsOptions groupId = defaults
  & param "format" .~ ["json"]
  & param "group_id" .~ [groupId]
  & param "status" .~ ["upcoming"]
  & param "order" .~ ["time"]
  & param "page" .~ ["10"]
-- | Code golf version. Don't do this?
getMeetupNameAndVenues :: GroupId -> IO [(Text, Text)]
getMeetupNameAndVenues groupId =
  getWith (eventsOptions groupId) meetupEventsUrl
  >>= asValue
  >>= ((^.. responseBody
            . key "results"
            . _Array . traverse)
       >>> map ((^. key "name" . _String)
                &&& (^. key "venue"
                        . key "name" . _String)
               )
       >>> return
      )
In a way, this looks cool because the piping left to right reads well and naturally, if you know all the operators and are happy with operator sectioning syntax and point-free combinators. But when I showed this to friends who are not so fluent in Haskell, they didn’t like this. Also, note that I made concessions in order to arrange this pipeline. I lost the comments, the intermediate named sub-computations (very useful for finer-grained testing), and even my custom result type (resorting to just tupling). I feel something has been lost by writing in this style even though part of me secretly likes it.
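For readers who have not met the two Control.Arrow combinators used in the pipeline, here is a base-only sketch on a toy problem (shout and shoutExplicit are made-up names). For plain functions, f >>> g means “first f, then g”, and (f &&& g) x is the pair (f x, g x):

```haskell
module Main (main) where

import Control.Arrow ((&&&), (>>>))
import Data.Char (toUpper)

-- | Point-free pipeline: drop empty names, then pair each upper-cased
-- name with its length.
shout :: [String] -> [(String, Int)]
shout = filter (not . null) >>> map (map toUpper &&& length)

-- | The same function with a named argument and no arrow combinators.
shoutExplicit :: [String] -> [(String, Int)]
shoutExplicit names =
  [ (map toUpper name, length name) | name <- names, not (null name) ]

main :: IO ()
main = do
  print (shout ["wreq", "", "lens"])
  print (shout ["wreq", "", "lens"] == shoutExplicit ["wreq", "", "lens"])
```

The explicit version is longer, but each intermediate value has a name that a test or a comment can refer to, which is the trade-off described above.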
An interview with Bryan O’Sullivan
Recently (September 2015), The Haskell Cast interviewed Bryan O’Sullivan. I highly recommend listening to the whole thing. He had stories to tell about how he got into Haskell, how he ended up writing all these libraries, and how he goes about designing them and what his goals are when implementing them. Note that aeson and text, which everyone uses, are his creations. Thank you, Bryan, for all you’ve done for the Haskell community!
Lens resources
Gabriel Gonzalez wrote a lens tutorial that is useful. Thank you, Gabriel, for writing tutorials not only on your own libraries, but for others as well!
Conclusion
For day 4, I presented a tiny example of use of wreq with aeson and lens to perform a simple task of getting information from the Web, and tried to make wreq more accessible by not requiring use of lens operators up front.
All the code
All my code for my article series is at this GitHub repo.