Recap
Tackling the problem from the previous post will require us to flesh out a bit more helper functions to get there; which we'll tackle in this post. Let's first try to understand the challenge; which is about the compatibility of input and output function type signatures.
Our base character parser function
pchar has the following signature:
The following combined parser:
C#:
let jma_pchar = pchar 'J' <|> pchar 'M' .>>. pchar 'A'
has the following type signature
C#:
val jma_pchar = Parser<char * char>
i.e. the wrapped result value is compounded tuple
char * char; which is incompatible with the input signature of above the
pchar function; it expects only a single
char as its input parameter.
Composition requires compatible input and output type signatures; which is not possible with the
pchar character parser. What we need is instead to create a
string -> Parser<string> parser and at the same time transform the previous tuple output signature to a single
string value.
That will be goal of this post; to flesh out the functions required to build a
pstring parser.
Steps to construct a pstring parser
A
string parser will be built on a
character parser; but to get there requires us to flesh out more flexibility with the
character parser.
Any character of parsers
Building a powerful combinator parser with our current functionality is not practical; because to specify that a prefix character can be
any letter or
any digit would be too verbose with what we currently have, for example... an
any digit parser would require the following:
C#:
let anyDigit = pchar '0' <|> pchar '1' <|> pchar '2' <|> pchar '3' <|> pchar '4' <|> pchar '5' <|> pchar '6' <|> pchar '7' <|> pchar '8' <|> pchar '9'
Now imagine what an
any letter definition would look like.
To fix this we need a more flexible character combinator; well two of them;
choice and
anyOf:
Choice
This is where the power of combinators starts; using for example the
orElse combinator, we can build even more powerful combinator that is a choice between a
list of
character parsers, rather than just two:
C#:
let choice ps =
List.reduce (<|>) ps
The type signature of the choice function is:
C#:
val choice: Parser<'a> list -> Parser<'a>
It takes a
list of Parsers and reduces (combines it) to a
single Parser.
anyOf
This allows us to define an
anyOf combinator; whose input is a
list of characters; which using choice will be reduced in a
single multi-character parser:
C#:
let anyOf cs =
cs
|> List.map pchar // convert character to a pchar
|> choice // reduce the list of pchar to a single pchar
Usage Example
Let's recreate the previous
anyDigit parser, and also create an
anyLetter parser:
C#:
let pAnyDigit = ['0'..'9']
let pAnyLowerLetter = ['a'..'z']
let pAnyUpperLetter = ['A'..'Z']
let pAnyLetter = anyLowerLetter <|> anyUpperLetter
To parse the previous two prefix characters of our two email addresses, now requires only this:
C#:
[<EntryPoint>]
let main argv =
let parseTwoPrefixes = pAnyLetter .>>. pAnyLetter
let result1 = run parseTwoPrefixes "[email protected]"
printfn "%A" result1
let result2 = run parseTwoPrefixes "[email protected]"
printfn "%A" result2
The output from this is...
Tackling the transformation of the result
To transform the result we need a function that will allows us to transform the wrapped content of a Parser, because the output from running the above parsers is
ParserResult<(char * char) * string> -- a
tuple of two matching characters and a
tuple with the remaining unparsed
string of characters.
What we really need is
ParserResult<string * string> in stead of
ParserResult<(char * char) * string>; meaning the tuple of
(char * char) must be transformed into a
string. The function to do transformations in the functional world is
map; let's build that for our
Parser type.
Map (functor compliance)
C#:
let mapP f p =
let innerF input =
let result = run p input
match result with
| Success (value, remaining) -> Success (f value, remaining) // transform value
| Failed err -> Failed err // propagate error condition
Parser innerF // wrap inner function in a Parser
mapP is a curried function that takes in a value transforming function
f and a parser
p and then has an internal function
innerF that requires some input; a
string; we then run our parser
p on our curried
input, and pattern match; if it's a
Success we apply our transform function
f to the value, otherwise if it
Failed, we propagate the error
err.
The type signature of our
mapP function is:
C#:
val mapP: f:('a -> 'b) -> Parser<'a> -> Parser<'b>
The generic variable
'a refers to value part in the type alias of:
C#:
type Parser<'a> = Parser of (string -> ParserResult<'a * string>)
i.e. our
mapP function allows us to transform the generic
'a parser result portion; the
string which represents the remaining unparsed characters is untouched.
Map operators
We'll create two operators to simplify the use of the
map function:
C#:
let (<!>) = mapP
let (<&>) x f = mapP f x // inverted parameter version of <!> operator
Usage Example
Let's revisit our previous
parseTwoPrefixes parser to transform the result from a tuple of two characters to a
string:
C#:
let parseTwoPrefixes =
let twoLetterParser = pAnyLetter .>>. pAnyLetter
let transform (c1, c2) = String [|c1; c2|]
mapP transform twoLetterParser
// or more succinctly using our map operator
let parseTwoPrefixes = (pAnyLetter .>>. pAnyLetter) <&> fun (c1, c2) -> String [|c1; c2|]
Let's now parse our two emails with this.
C#:
[<EntryPoint>]
let main argv =
let parseTwoPrefixes = (pAnyLetter .>>. pAnyLetter) <&> fun (c1, c2) -> String [|c1; c2|]
let result1 = run parseTwoPrefixes "[email protected]"
printfn "%A" result1
let result2 = run parseTwoPrefixes "[email protected]"
printfn "%A" result2
The output from this is...
Looks good so far; but we're still missing quite a few functions to be able to create a parser that matches a
char list (a
string). In the next post we'll continue adding more combinator functions that are required to achieve our goal of a
pstring parser.