Adventure in parserland – parsing lambda expressions in F# – Part V

-

We are now go­ing to look at a so­lu­tion which is con­cise, ef­fi­cient and gives so­phis­ti­cated er­ror mes­sages. It is also less than 20 lines of code. We’ll be us­ing FParsec.

FParsec is a port of an Haskell li­brary. It is a parser com­bi­na­tor li­brary or, as I like to think of it, an in­ter­nal DSL to build parsers in F#. My usual dis­claimer: I’m not an ex­pert in FParsec. It is likely that, if you are an ex­pert, you can come up with more main­tain­able/​ef­fi­cient/​el­e­gant ver­sion of this parser.

The whole code is be­low:

let ws = " \t\n"
let specialChars = ".)(\\\n"
let pWs = spaces
let pName = manyChars (noneOf (ws + specialChars)) |>> EName
let pExpr, pExprRef = createParserForwardedToRef<Expression, Unit>()
let curry2 f a b = f(a,b)
let pFunction = pchar '\\' >>. pipe2 pName (pchar '.' >>. pExpr) (curry2 Function)
let pApplication = pchar '(' >>. pipe2 pExpr (pWs >>. pExpr) (curry2 Application)
                          .>> pWs .>> pchar ')'
do pExprRef := pFunction <|> pApplication <|> pName
let pExpressions = sepBy pExpr spaces1

This mir­rors pretty closely the gram­mar we are try­ing to parse. A pro­gram is a bunch of ex­pres­sions sep­a­rated by white­spaces.

let pEx­pres­sions = sepBy pExpr spaces1

sepBy is a com­bi­na­tor that takes a parser that de­fines what to parse and a parser that de­fines what the sep­a­ra­tor should be. And the sep­a­ra­tor should be one or more spaces …

pExpr is ei­ther a func­tion, an ap­pli­ca­tion or a name. the op­er­a­tor <|> is a choice com­bi­na­tor that tries each parser in or­der. It tries the right parser just if the left parser fails and it does­n’t con­sume in­put. So it does­n’t back­track. If you need a back­track­ing parser, you’ll need to get ac­quainted with the at­tempt com­bi­na­tor.

do pExprRef := pFunction <|> pApplication <|> pName

A func­tion starts with a ', then comes a name, a dot and an ex­pres­sion.

let pFunction = pchar '\\' >>. pipe2 pName (pchar '.' >>. pExpr)
(curry2 Function)

I know that might look crazy (and maybe it is), but just bear with me. Someone , who I’m not sure I can name, once told me that func­tional pro­gram­ming is great to write code, but ter­ri­ble to read it and de­bug it. The phrase stayed with me as con­tain­ing a grain of truth. In any case, in the code above:

let curry2 f a b = f(a,b)

An ap­pli­ca­tion works sim­i­larly, but dif­fer­ently …

let pApplication = pchar '(' >>. pipe2 pExpr (pWs >>. pExpr) (curry2 Application)
                              .>> pWs .>> pchar ')'

The only dif­fer­ence is that now we have to con­sume the op­tional white­spaces and the )’ at the end. A rule of thumb that I use is to use >>.  to flow the re­sult through and pipeX when I need more than one re­sult.

The last thing is pName, which con­sume chars un­til it finds ei­ther a white­space or a spe­cial char.

let ws = " \t\n"
let specialChars = ".)(\\\n"
let pWs = spaces
let pName = manyChars (noneOf (ws + specialChars)) |>> EName

And there you have it, a lexer, a parser all in 20 lines of code. I don’t like the code that I wrote above much. I’m sure I could re­fine it plenty and it prob­a­bly con­tains some bugs, but it gives an idea of what is pos­si­ble with FParsec.

Tags

1 Comment

Comments

Gert-Jan van der Kamp

2011-09-19T10:49:09Z

LOL it’s just com­i­cal how lit­tle code it FParsec re­quires com­pared to hand-rolling a lexer + parser from scratch. Thanks for this se­ries, lot’s of food for thought!