Monadic parser in JavaScript

May 13th, 2009 by asmyczek

After exploring Haskell for some time, I find myself often adopting functional concepts in my daily work. The exposure to functional programming has even affected the set of tools and frameworks I use. For example, having to parse a custom data format I first tend to search for a Parsec clone implemented in the currently used language. This time it was for JavaScript, but a quick Google search did not reveal any relevant projects. Therefore following is the initial attempt to a probably first general purpose parser library for JavaScript, ‘p4js’.

‘p4js’ is a monadic parser library that provides the basic monadic operators return and bind, but this is maybe a topic for another blog post. Let’s start with a quick introduction (for non Parsec users).

A parser in p4js is an object that provides the basic combinators (parsing functions). The function $P() creates a parser object that can be executed on an input using the parse(input) function. The most basic combinator is $P().item(). It reads one char from the input and pushes it to the result array, for example $P().item().parse(’abc’) will return the array [’a']. If you expect to read a number char, use $P().digit(). This combinator will read the char only if it is a number, otherwise it will throw a parser exception.

Every combinator returns the parser object itself. This enables the chaining of combinators to build more complex parsers. For example the parser

var p = $P().item().item().item();

will return the array [’a', ‘b’, ‘c’] for p.parse(’abs’).
To combine the results of multiple combinators into one value, chain those between the do_() and reduce() actions, for example:

var p = $P().do_().item().item().item().reduce(
    function(result_array) { return result_array.join(''); }
);

p.parse(’abc’); will return [’abc’].The function passed to reduce() is called only with the results of the combinators executed between do_() and reduce(). The result of this function is pushed as one value to the result array. In the parser chain, every do_() action requires a matching reduce(), or one of its customization element(), join() etc.

It gets more exciting with the combinators many() and choice(). many(p) or many1(p) applies parser p on the input as many times as possible. For example $P().many($P().digit()).parse(’123abc’) will return [’1′, ‘2′, ‘3′]. Combining this with a reduce function that converts the result into an integer, we get a parser for natural numbers:

var p = $P().do_().many1($P().digit()).reduce(
    function(r) { return parseInt(r.join('')); }
);

This parser can be quite useful already, so we can register it with the default library calling p.register(’nat’) and use it from now on as $P().nat().

choice(p1, p2, …) tries to apply parsers pn in the argument order and returns the first successful parsed value or throws an exception if non of the parsers succeeded. The following example uses choice() to parse a number with optional decimal points:

var p = $P().do_().nat().choice(
    $P().symbol(".").nat().reduce(
        function(r) { while(r[2] > 1.0) r[2] /= 10; return r[0] + r[2]; }),
    $P().element(0)).register('num');

First we use nat() to read the integer part of the number. Than we try to read a ‘.’ followed by an nat() for the decimal part. In case this succeeds we compute the decimal number and return the result. Otherwise we use $P().element(0) to return the first element of the first nat() call, the integer part.

The full floating number parser I will leave as an exercise for the reader.

The current library provides only a dozen combinators, but it is enough to have quite some fun already. Following are few examples:

  • a CSV parser,
  • a small calculator and
  • the tiny math processor that uses the parser not only to parse the input into expression tree, but also to perform expression evaluation and function derivation on the tree, and to draw function graphs (on browsers that support the canvas tag).

For more details see the source code and additional examples on GitHub.

Posted in JavaScript | Permalink | Trackback

17 Responses

  1. Chris Double

    Nice work. I also did a parser combinator library in JS that you might be interested in looking at:

    http://www.bluishcoder.co.nz/2007/10/javascript-parser-combinators.html

    The comments on that post link to a couple of others as well.

  2. dsc

    Chris Double implemented a packrat parser in js not too far removed in concept from the monadic version you outline. Check it out:
    - http://www.bluishcoder.co.nz/2007/10/javascript-packrat-parser.html
    - http://www.bluishcoder.co.nz/2007/10/javascript-parser-combinators.html
    - code: http://www.double.co.nz/cgi-bin/gitweb.cgi?p=jsparse.git;a=summary

    The comments in those threads are interesting if you’re planning on improving your parser.

    Nice work!

  3. Douglas

    a1,a2,a3
    b1,b2,b3
    c1,”x,y”,c3

    should still be 3×3.

  4. Ajaxian » Monadic Parser in JavaScript

    […] announcement blog post explains some of the motivations behind the project: After exploring Haskell for some time, I find […]

  5. 墨尔本

    This is a very useful script!
    Thanks for sharing.

  6. asmyczek

    Very nice library, thanks Chris. To bad Google did not point me to your project.
    Current CSV demo supports quoted values now, thanks Douglas.

  7. Ajax Girl » Blog Archive » Monadic Parser in JavaScript

    […] announcement blog post explains some of the motivations behind the project: After exploring Haskell for some time, I find […]

  8. Monadic Parser in JavaScript | Free Tutorial 4 All

    […] announcement blog post explains some of the motivations behind the project: After exploring Haskell for some time, I find […]

  9. Monadic Parser in JavaScript | Guilda Blog

    […] announcement blog post explains some of the motivations behind the project: After exploring Haskell for some time, I find […]

  10. Monadic Parser in JavaScript

    […] announcement blog post explains some of the motivations behind the project: After exploring Haskell for some time, I find […]

  11. Tagz | "Mu Dynamics Research Labs » Blog Archive » Monadic parser in JavaScript" | Comments

    […] [upmod] [downmod] Mu Dynamics Research Labs » Blog Archive » Monadic parser in JavaScript (labs.mudynamics.com) 1 points posted 2 days, 11 hours ago by jeethu tags javascript parser […]

  12. Johan Sundström

    The CSV demo parser doesn’t support quoted values with ” in them (quoted as “” — see http://rfc.roxen.com/4180 for the text/csv specs’ full details), though. It’s one of the messier basic formats to get right, so it’s a decent challenge for a parser. :-)

  13. Arni Hermann

    I applaud this effort. I also want to emphasis the word done by Chris Double at http://www.bluishcoder.co.nz, it was really enjoyable watching his efforts at the time.

  14. Mu Dynamics Research Labs » Blog Archive » Fuzzing in JavaScript, an exercise in monadic computation

    […] I mentioned in a previous post, following is an introduction to monadic computation in JavaScript. The intent of this post is to […]

  15. Adrian

    The list of JS attributes for input tags can also be found here: http://www.web-html-php-css-js-script.com/js-script/js-input-element/

  16. EvieIh

    Thank you very much for the article about this post! This is obviously that the writing services will do the essay writing. Thus, that’s a perfect opportunity to buy term papers or custom essay referring to this topic.

  17. Konnie24dl

    When you follow the perfection, you should compose the professional custom comparison essay. However, when do not understand a correct way to do this, the paper writing services would definitely support you.

Leave a Comment

Please note: Comment moderation is enabled and may delay your comment. There is no need to resubmit your comment.