Downloading stock prices in F# - Part III - Async loader for prices and divs

It is now time to load our data. There is a bit of un­in­ter­est­ing code to start with, but things get in­ter­est­ing af­ter­ward. Let’s start with func­tions that cre­ate the right URLs to down­load prices and div­i­dends. We’ll talk about splits in the next in­stall­ment.

let commonUrl ticker span =
    @"http://ichart.finance.yahoo.com/table.csv?s=" + ticker + "&a="
    + (span.Start.Month - 1).ToString() + "&b=" + span.Start.Day.ToString() + "&c="
    + span.Start.Year.ToString() + "&d=" + (span.End.Month - 1).ToString() + "&e="
    + span.End.Day.ToString() + "&f=" + span.End.Year.ToString()

let priceUrl ticker span = commonUrl ticker span + "&g=d&ignore=.csv"

let divUrl ticker span = commonUrl ticker span + "&g=v&ignore=.csv"
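As a side note, the same query string can be built with sprintf, which removes the ToString() calls. The following is only a sketch: the Span record is a hypothetical stand-in for the type from Part I, and the names commonUrlSprintf and priceUrlSprintf are mine, not part of the original code.

```fsharp
open System

// Hypothetical stand-in for the Span type from Part I; only Start and End are used here.
type Span = { Start: DateTime; End: DateTime }

// Same query string as commonUrl, built with sprintf.
// %s expects a string and %d an integer, so no ToString() calls are needed.
let commonUrlSprintf (ticker: string) (span: Span) =
    sprintf "http://ichart.finance.yahoo.com/table.csv?s=%s&a=%d&b=%d&c=%d&d=%d&e=%d&f=%d"
        ticker
        (span.Start.Month - 1) span.Start.Day span.Start.Year
        (span.End.Month - 1) span.End.Day span.End.Year

let priceUrlSprintf ticker span = commonUrlSprintf ticker span + "&g=d&ignore=.csv"

// Made-up date range, just to show the resulting URL.
let exampleSpan = { Start = DateTime(2008, 1, 15); End = DateTime(2008, 9, 1) }
printfn "%s" (priceUrlSprintf "MSFT" exampleSpan)
```

A nice property of sprintf is that the format string is type checked: passing a string where %d is expected is a compile-time error.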

We will also need to construct an observation given a comma-delimited line of text. Again, for splits things will be harder.

let parsePrice (line: string) =
    let tokens = line.Split([|','|])
    { Date = DateTime.Parse(tokens.[0]);
      Event = Price ({Open = money (Double.Parse(tokens.[1])) ;
                      High = money (Double.Parse(tokens.[2]));
                      Low = money (Double.Parse(tokens.[3]));
                      Close = money (Double.Parse(tokens.[4]));
                      Volume = volume (Double.Parse(tokens.[5]))})}

let parseDiv (line: string) =
    let tokens = line.Split([|','|])
    let date = DateTime.Parse(tokens.[0])
    let amount = money (Double.Parse(tokens.[1]))
    {Date = date; Event = Div amount}
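One thing worth keeping in mind is that Double.Parse and DateTime.Parse use the current culture by default, so a machine configured with a comma as decimal separator may misread "27.02". Below is a possible culture-invariant variant. It is a sketch under assumptions: the types are simplified stand-ins for the ones from Part I (plain floats instead of money and volume), and the sample line is made up.

```fsharp
open System
open System.Globalization

// Simplified stand-ins for the Part I types (assumption: the real ones wrap values with money/volume).
type PriceData = { Open: float; High: float; Low: float; Close: float; Volume: float }
type Event =
    | Price of PriceData
    | Div of float
type Observation = { Date: DateTime; Event: Event }

// Culture-invariant helpers: "27.02" is read the same way on every machine.
let parseFloat (s: string) = Double.Parse(s, CultureInfo.InvariantCulture)
let parseDate (s: string) = DateTime.Parse(s, CultureInfo.InvariantCulture)

let parsePriceInvariant (line: string) =
    let t = line.Split(',')
    let price =
        { Open = parseFloat t.[1]; High = parseFloat t.[2]; Low = parseFloat t.[3]
          Close = parseFloat t.[4]; Volume = parseFloat t.[5] }
    { Date = parseDate t.[0]; Event = Price price }

// Made-up line in the Date,Open,High,Low,Close,Volume layout the parser assumes.
let sampleObservation = parsePriceInvariant "2008-09-12,27.02,27.71,26.90,27.62,50000000"
printfn "%A" sampleObservation
```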

Nothing noteworthy about this code. We have a couple of other infrastructure pieces before we get to the Async pieces. The next function is recursive. It takes a StringReader and reads lines out of it. For each line it calls a parsing function that takes the line as input and returns an object as output. The function gathers all such objects in the listOfThings list. If you are new to F#, the construct (parseLineFunc line :: listOfThings) means: execute parseLineFunc with argument line, take the result, and create a list that has the result as head and listOfThings as tail.

let rec loadFromLineReader (reader:StringReader) listOfThings parseLineFunc =
    match  reader.ReadLine () with
    | null  -> listOfThings
    | line  -> loadFromLineReader reader (parseLineFunc line::listOfThings) parseLineFunc        

The next function is rather uninteresting. It just converts a string to a StringReader, cuts out the first line (the header) and calls loadFromLineReader.

let loadFromLineString text listOfThings parseLineFunc =
    let reader = new StringReader(text)
    reader.ReadLine ()|> ignore // skip header
    loadFromLineReader reader listOfThings parseLineFunc
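To see the two helpers in action, here is a small self-contained run. The input text and the upper-casing "parser" are made up for illustration; the helpers are repeated only so the snippet runs on its own. Note that each parsed object is put at the head of the list, so the result comes back in the reverse order of the lines.

```fsharp
open System.IO

// The two helpers from the article, repeated so the snippet is self-contained.
let rec loadFromLineReader (reader: StringReader) listOfThings parseLineFunc =
    match reader.ReadLine() with
    | null -> listOfThings
    | line -> loadFromLineReader reader (parseLineFunc line :: listOfThings) parseLineFunc

let loadFromLineString (text: string) listOfThings parseLineFunc =
    let reader = new StringReader(text)
    reader.ReadLine() |> ignore // skip header
    loadFromLineReader reader listOfThings parseLineFunc

// Made-up input: one header line and two data lines.
let sampleText = "Header\nfirst\nsecond"
let parsedLines = loadFromLineString sampleText [] (fun (line: string) -> line.ToUpper())
printfn "%A" parsedLines // ["SECOND"; "FIRST"]
```

If the original order matters, List.rev on the result restores it.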

We now come to the first Async function. But what is an Async function? There are several possible technically correct definitions, such as: it is an instance of the monad pattern, or it is a function that returns an Async object, or it is a way to release your thread to the thread pool. These definitions don’t help me much. I need something intuitive to latch onto.

The way that I personally visualize it is: there are things in the world that are very good at executing certain tasks and like to be hit by multiple parallel requests for these tasks. They’d like me to give them their workload and get out of their way. They’ll call me when they are done with it. These ‘things’ are disk drives, web servers, processors, etc. Async is a way to say: hey, go and do this, call me when you are done.

Now, you can call the asyn­chro­nous APIs di­rectly, or you can use the nice F# lan­guage struc­tures to do it. Let’s do the lat­ter.

let loadWebStringAsync url =
    async {
        let req = WebRequest.Create(url: string)
        use! response = req.AsyncGetResponse()
        use reader = new StreamReader(response.GetResponseStream())
        return! reader.AsyncReadToEnd()}

This function retrieves a web page as a string asynchronously. Notice that even though the code looks rather normal, this function will likely be executed on three different threads. The first thread is the one the caller of the function lives on. AsyncGetResponse causes that thread to be returned to the thread pool while waiting for a response from the web server. Once the response arrives, execution resumes on a different thread until AsyncReadToEnd. That call again returns the executing thread to the thread pool. Another thread from the pool picks up the work when the string has been completely read. The good thing is that none of this is explicitly managed by the programmer. The compiler ‘writes the code’ to make it all happen. You just have to follow a set of simple conventions (i.e. putting exclamation marks in the right place).

The return type of this function is Async<string>, which is something that, when executed, returns a string. I cannot emphasize this enough: always look at the signature of your F# functions. Type inference can be tricky.
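The "when executed" part is easy to miss: calling a function that returns an Async only builds a description of the work. Here is a minimal sketch of that behavior. fakeLoadAsync is a hypothetical stand-in for loadWebStringAsync that needs no network, and the flag exists only to observe when the body runs.

```fsharp
// Flag used to observe when the body of the async block actually runs.
let started = ref false

// Hypothetical stand-in for loadWebStringAsync; it sleeps instead of calling a web server.
let fakeLoadAsync (url: string) : Async<string> =
    async {
        started.Value <- true
        do! Async.Sleep 10
        return "content of " + url }

// Calling the function only creates the Async<string>; the body has not run yet.
let work = fakeLoadAsync "http://example.com"
let startedBeforeRun = started.Value
printfn "started after creating the Async: %b" startedBeforeRun

// Running it executes the body and produces the string.
let loadedText = Async.RunSynchronously work
printfn "started after running it: %b, result: %s" started.Value loadedText
```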

Async is somewhat contagious. If you are calling an Async function, you have to decide whether to propagate the Asyncness to your callers or remove it by executing the function. Often propagating it is the right thing to do, as your callers might want to batch your function with other async ones to be executed together in parallel. Your callers have more information than you do and you don’t want to short-circuit them. The following function propagates asyncness.

let loadFromUrlAsync url parseFunc =
    async {
        let! text = loadWebStringAsync url
        return loadFromLineString text [] parseFunc}
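To illustrate why propagating is useful, here is a sketch of a caller batching several async loads with Async.Parallel. The loader is a hypothetical stand-in that simulates latency instead of hitting a web server, and the tickers and prices are made up.

```fsharp
// Hypothetical async loader that sleeps instead of downloading anything.
let fakeLoadPricesAsync (ticker: string) =
    async {
        do! Async.Sleep 100
        return ticker, [ 1.0; 2.0; 3.0 ] } // made-up prices

// Because the loader returns an Async, the caller decides how to run it:
// here three loads are combined into a single Async and started together.
let batchResults =
    [ "MSFT"; "GOOG"; "AAPL" ]
    |> List.map fakeLoadPricesAsync
    |> Async.Parallel
    |> Async.RunSynchronously

// Async.Parallel returns the results in the same order as the inputs.
for (ticker, prices) in batchResults do
    printfn "%s: %d prices" ticker (List.length prices)
```

Had the loader called Async.RunSynchronously internally, the caller would have been forced to wait for each ticker in turn.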

Let’s see how the func­tions pre­sented to this point com­pose to pro­vide a way to load prices and div­i­dends (splits will be shown af­ter­ward).

let loadPricesAsync ticker span = loadFromUrlAsync (priceUrl ticker span) parsePrice
let loadDivsAsync ticker span = loadFromUrlAsync (divUrl ticker span) parseDiv

This com­po­si­tion of func­tions is very com­mon in func­tional code. You con­struct your build­ing blocks and as­sem­ble them to achieve your fi­nal goal. Functional pro­gram­ming is good at al­most forc­ing you to iden­tify the prim­i­tive blocks in your code. All right, next in line is how to load splits.

Comments

Nice ar­ti­cle.  One po­ten­tial im­prove­ment: why not use sprintf to avoid all those an­noy­ing ToString()s in the com­monUrl func­tion?

Luca Bolognese

2008-09-15T11:59:14Z

You are so very right. My excuse is that the code for the URL function is cut and pasted from some old C# code I have. That is not even an excuse, given that you can do much better in C# as well :)

Luca Bolognese's WebLog

2008-09-19T17:59:39Z

Other parts: Part I - Data mod­el­ing Part II - Html scrap­ing Part III - Async loader for prices and divs

Very nice!  I am learn­ing a lot, please keep it up.
For bet­ter read­abil­ity, I wrote your url func­tions as fol­lows:
let commonHttpQuery ticker span =
 let query = new StringBuilder();
 Printf.bprintf query "s="
 Printf.bprintf query "%s" ticker
 Printf.bprintf query "&a="
 Printf.bprintf query "%d" (span.Start.Month - 1)
 Printf.bprintf query "&b="
 Printf.bprintf query "%d" span.Start.Day
 Printf.bprintf query "&c="
 Printf.bprintf query "%d" span.Start.Year
 Printf.bprintf query "&d="
 Printf.bprintf query "%d" (span.End.Month - 1)
 Printf.bprintf query "&e="
 Printf.bprintf query "%d" span.End.Day
 Printf.bprintf query "&f="
 Printf.bprintf query "%d" span.End.Year
 query.ToString()
This allows the same query to be used in commonUrl and splitUrl:
let commonUrl ticker span =
 let urlString query =
   let urlBuilder = new UriBuilder()
   urlBuilder.Scheme <- "http";
   urlBuilder.Host <- "ichart.finance.yahoo.com"
   urlBuilder.Port <- 80
   urlBuilder.Path <- "table.csv"
   urlBuilder.Query <- query
   urlBuilder.ToString();
 urlString (commonHttpQuery ticker span)
let splitUrl ticker span page =
 let urlString query =
   let urlBuilder = new UriBuilder()
   urlBuilder.Scheme <- "http";
   urlBuilder.Host <- "finance.yahoo.com"
   urlBuilder.Port <- 80
   urlBuilder.Path <- "q/hp"
   urlBuilder.Query <- query
   urlBuilder.ToString();
 urlString (commonHttpQuery ticker span) + sprintf "&g=v&z=66&y=%d" (66 * page)

Nice, thanks. I did­n’t even know url­builder ex­isted.

Error 1: The field, constructor or member 'AsyncGetResponse' is not defined. ???

You have to ref­er­ence the FSharp Powerpack. I’m not post­ing the code yet be­cause I’m work­ing on the UI and want to post every­thing to­gether.
