Monadic parser in JavaScript
After exploring Haskell for some time, I find myself often adopting functional concepts in my daily work. The exposure to functional programming has even affected the set of tools and frameworks I use. For example, having to parse a custom data format I first tend to search for a Parsec clone implemented in the currently used language. This time it was for JavaScript, but a quick Google search did not reveal any relevant projects. Therefore following is the initial attempt to a probably first general purpose parser library for JavaScript, ‘p4js’.
‘p4js’ is a monadic parser library that provides the basic monadic operators return and bind, but this is maybe a topic for another blog post. Let’s start with a quick introduction (for non Parsec users).
A parser in p4js is an object that provides the basic combinators (parsing functions). The function $P() creates a parser object that can be executed on an input using the parse(input) function. The most basic combinator is $P().item(). It reads one char from the input and pushes it to the result array, for example $P().item().parse(’abc’) will return the array [’a']. If you expect to read a number char, use $P().digit(). This combinator will read the char only if it is a number, otherwise it will throw a parser exception.
Every combinator returns the parser object itself. This enables the chaining of combinators to build more complex parsers. For example the parser
var p = $P().item().item().item();
will return the array [’a', ‘b’, ‘c’] for p.parse(’abs’).
To combine the results of multiple combinators into one value, chain those between the do_() and reduce() actions, for example:
var p = $P().do_().item().item().item().reduce(
function(result_array) { return result_array.join(''); }
);
p.parse(’abc’); will return [’abc’].The function passed to reduce() is called only with the results of the combinators executed between do_() and reduce(). The result of this function is pushed as one value to the result array. In the parser chain, every do_() action requires a matching reduce(), or one of its customization element(), join() etc.
It gets more exciting with the combinators many() and choice(). many(p) or many1(p) applies parser p on the input as many times as possible. For example $P().many($P().digit()).parse(’123abc’) will return [’1′, ‘2′, ‘3′]. Combining this with a reduce function that converts the result into an integer, we get a parser for natural numbers:
var p = $P().do_().many1($P().digit()).reduce(
function(r) { return parseInt(r.join('')); }
);
This parser can be quite useful already, so we can register it with the default library calling p.register(’nat’) and use it from now on as $P().nat().
choice(p1, p2, …) tries to apply parsers pn in the argument order and returns the first successful parsed value or throws an exception if non of the parsers succeeded. The following example uses choice() to parse a number with optional decimal points:
var p = $P().do_().nat().choice(
$P().symbol(".").nat().reduce(
function(r) { while(r[2] > 1.0) r[2] /= 10; return r[0] + r[2]; }),
$P().element(0)).register('num');
First we use nat() to read the integer part of the number. Than we try to read a ‘.’ followed by an nat() for the decimal part. In case this succeeds we compute the decimal number and return the result. Otherwise we use $P().element(0) to return the first element of the first nat() call, the integer part.
The full floating number parser I will leave as an exercise for the reader.
The current library provides only a dozen combinators, but it is enough to have quite some fun already. Following are few examples:
- a CSV parser,
- a small calculator and
- the tiny math processor that uses the parser not only to parse the input into expression tree, but also to perform expression evaluation and function derivation on the tree, and to draw function graphs (on browsers that support the canvas tag).
For more details see the source code and additional examples on GitHub.
Posted in JavaScript | Permalink | Trackback
May 13th, 2009 at 2:18 pm
Nice work. I also did a parser combinator library in JS that you might be interested in looking at:
http://www.bluishcoder.co.nz/2007/10/javascript-parser-combinators.html
The comments on that post link to a couple of others as well.
May 13th, 2009 at 2:20 pm
Chris Double implemented a packrat parser in js not too far removed in concept from the monadic version you outline. Check it out:
- http://www.bluishcoder.co.nz/2007/10/javascript-packrat-parser.html
- http://www.bluishcoder.co.nz/2007/10/javascript-parser-combinators.html
- code: http://www.double.co.nz/cgi-bin/gitweb.cgi?p=jsparse.git;a=summary
The comments in those threads are interesting if you’re planning on improving your parser.
Nice work!
May 13th, 2009 at 3:55 pm
a1,a2,a3
b1,b2,b3
c1,”x,y”,c3
should still be 3×3.
May 14th, 2009 at 7:33 am
[…] announcement blog post explains some of the motivations behind the project: After exploring Haskell for some time, I find […]
May 14th, 2009 at 7:38 am
This is a very useful script!
Thanks for sharing.
May 14th, 2009 at 9:55 am
Very nice library, thanks Chris. To bad Google did not point me to your project.
Current CSV demo supports quoted values now, thanks Douglas.
May 14th, 2009 at 10:17 am
[…] announcement blog post explains some of the motivations behind the project: After exploring Haskell for some time, I find […]
May 14th, 2009 at 4:34 pm
[…] announcement blog post explains some of the motivations behind the project: After exploring Haskell for some time, I find […]
May 15th, 2009 at 10:02 am
[…] announcement blog post explains some of the motivations behind the project: After exploring Haskell for some time, I find […]
May 15th, 2009 at 11:10 am
[…] announcement blog post explains some of the motivations behind the project: After exploring Haskell for some time, I find […]
May 16th, 2009 at 9:01 am
[…] [upmod] [downmod] Mu Dynamics Research Labs » Blog Archive » Monadic parser in JavaScript (labs.mudynamics.com) 1 points posted 2 days, 11 hours ago by jeethu tags javascript parser […]
May 16th, 2009 at 6:33 pm
The CSV demo parser doesn’t support quoted values with ” in them (quoted as “” — see http://rfc.roxen.com/4180 for the text/csv specs’ full details), though. It’s one of the messier basic formats to get right, so it’s a decent challenge for a parser. :-)
May 29th, 2009 at 4:17 pm
I applaud this effort. I also want to emphasis the word done by Chris Double at http://www.bluishcoder.co.nz, it was really enjoyable watching his efforts at the time.
May 30th, 2009 at 10:37 pm
[…] I mentioned in a previous post, following is an introduction to monadic computation in JavaScript. The intent of this post is to […]
September 4th, 2009 at 4:37 pm
The list of JS attributes for input tags can also be found here: http://www.web-html-php-css-js-script.com/js-script/js-input-element/
January 18th, 2010 at 10:03 am
Thank you very much for the article about this post! This is obviously that the writing services will do the essay writing. Thus, that’s a perfect opportunity to buy term papers or custom essay referring to this topic.
February 2nd, 2010 at 7:59 am
When you follow the perfection, you should compose the professional custom comparison essay. However, when do not understand a correct way to do this, the paper writing services would definitely support you.