Adventure in parserland – parsing lambda expressions in F# – Part II

-

The parser starts sim­ple with the fol­low­ing two func­tions to parse ei­ther a string or a file. I use the XXX_Readers_ be­cause I want to lazy read char­ac­ter by char­ac­ter.

let parseString s =
    let reader = new StringReader(s)
    parseTextReader reader
let parseFile fileName =
    let reader = new StreamReader(fileName: string)
    parseTextReader reader

The whole parser is in the fol­low­ing two lines:

let parseTextReader: TextReader -> seq<Expression> =
                    textReaderToLazyList >> tokenStream >> parseExpressions

I need to spec­ify the sig­na­ture oth­er­wise the com­piler gets con­fused : wait, does it take a StringReader or a StreamReader? You bet­ter tell me!

The func­tion is a com­pos­ite of three func­tions ap­plied in se­quence:

  1. Translate a TextReader to a LazyList
  2. Translate a LazyList to a LazyList (lexer)
  3. Translate a LazyList to a LazyList (parser)

My us­age of LazyList as the work­horse for the pro­gram is be­cause I want to match on the head of the stream of chars/​to­kens in a lazy way.

I love it when a pro­gram nat­u­rally de­com­poses in such sim­ple un­der­stand­able pieces. I im­pute some of that to func­tional pro­gram­ming. For one rea­son or an­other, in my 15+ years of ob­ject ori­ented pro­gram­ming, I’ve rarely got to the core of a prob­lem with such im­me­di­acy.

A se­quence of op­er­a­tions likes the above would be lost in a pro­tected over­rid­den im­ple­men­ta­tion of a base class some­where (or some­thing else equally long to pro­nounce). The beauty would be lost some­where in the vast ma­chin­ery re­quired to sup­port it.

In any case, TextReaderToLazyList is a triv­ial gen­er­a­tor func­tion that uses the un­fold func­tion of LazyList to read a char­ac­ter at the time.

let textReaderToLazyList textReader = LazyList.unfold (fun (ts:TextReader) ->
    let ch = ts.Read()
    if ch = -1 then None else Some(char ch, ts)) textReader

The next step is to look at ei­ther the lexer, go­ing bot­tom up, or the parser, go­ing top down.

Tags