
A magic date input using parser combinators in TypeScript

Written By

Max Tagher


This post was originally published on Medium.

When our product team started designing a date picker to search historical transactions, they realized existing date pickers didn’t quite fit the bill. Most date pickers are designed to pick either a single date or precise date range (usually in the future, e.g. for travel), whereas a common use case for our date picker would be picking a general range of time. The final design incorporated three styles of date entry: a dropdown menu for common queries (“last month”, “this year”), a month-by-month calendar widget, and text fields for keyboard-based input.

The text field based inputs we saw on other websites used rigid formats like MM/DD/YYYY, which we found uncomfortable to use. Entering your birthday in that format isn’t that difficult, but date range queries like July 12 through the end of November (how many days are in November again?) are harder to enter. The MM/DD/YYYY style also requires more information than you would naturally think in: mentally or verbally, you would probably think about date ranges spanning “January through March” — the current year and specific days being implicit — instead of 01/01/2018–03/31/2018.

So instead we decided to build a date input that could handle almost any format a user might type†. You can try out the date input here.

Here’s a sample of the formats we accept, taken from our unit tests:

Copy Code
'1.1.2018'
'1-1-2018'
'1/1/2018'
'1,1,2018'
'1 1 2018'
'1 1' // January 1st of the current year
'Jan 1'
'January 1 2018'
'January 1st 2018'
'January 2nd 2018'
'January 3rd 2018'
'January 4th 2018'
'March 1 18'
'December 1, 18'
'December 1, 19'
'December 1, 00'
'Jan' // January of the current year (Either first or last day, depending on which date input is being used)
'Dec' // If December of the current year is in the future, then December of last year, otherwise December of this year.
'Jan 17' // January 17, current year
'Jan 2017' // January of 2017
'today'
'yesterday'
'TODAY'
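The rule for a bare month like 'Dec' is simple enough to sketch as a pure function. This is a minimal illustration of the behavior described in the comments above, not the code from our parser; the helper name `yearForBareMonth` and its 1-based `month` argument are assumptions made for the example.

```typescript
// Hypothetical helper (not from the original codebase): choose the year for
// an input that names only a month, such as "Dec".
// `month` is 1-based (1 = January, 12 = December).
function yearForBareMonth(month: number, today: Date): number {
  const currentYear = today.getFullYear();
  const currentMonth = today.getMonth() + 1; // Date months are 0-based
  // A month later than the current one has not happened yet this year, so a
  // search over historical transactions most likely means last year's.
  return month > currentMonth ? currentYear - 1 : currentYear;
}

const midJune = new Date(2018, 5, 15); // June 15, 2018 in local time
console.log(yearForBareMonth(12, midJune)); // 2017: December 2018 is still in the future
console.log(yearForBareMonth(1, midJune));  // 2018: January has already happened
console.log(yearForBareMonth(6, midJune));  // 2018: the current month counts as this year
```

Passing `today` in as an argument, rather than calling `new Date()` inside, keeps the function easy to unit test.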

For simple parsers you might turn to a regular expression, but can you imagine writing a regular expression for all that? Or worse: editing one that someone else wrote? Before I learned about parser combinators, that might have been how I approached this task, if I hadn’t written it off altogether. Even stitching together date parsing library code would have been exceedingly complex. But with parser combinators, it’s easy to start small and build up to the complex functionality needed for a great user experience.

Many languages have parser combinator libraries, but for JavaScript/TypeScript we’ll use the Parsimmon library. To get started, let’s look at the “parser” half of parser combinators.

Parsers

In this context, a “parser” is something that can take a string as an input, and either return an output (e.g. a number) or fail. So for example, if we wanted to parse an integer in a string into a JavaScript number, we’d want this behavior:

Copy Code
"abc" -> failure
"123" -> 123
"123!" -> 123 (remaining input: "!")
"!123" -> failure
"123.12" -> 123 (remaining input: ".12")
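To make that behavior concrete before bringing in a library, here is a minimal hand-rolled sketch in plain TypeScript. The `ParseResult` type and `parseInteger` function are names invented for this illustration; Parsimmon's real types and API look different.

```typescript
// A hand-rolled illustration only; Parsimmon's actual result type differs.
type ParseResult<T> =
  | { ok: true; value: T; rest: string } // parsed value plus unconsumed input
  | { ok: false };                       // nothing usable at the start of the input

function parseInteger(input: string): ParseResult<number> {
  // Anchor at the start: digits later in the string do not count.
  const match = /^[0-9]+/.exec(input);
  if (match === null) {
    return { ok: false };
  }
  const digits = match[0];
  return { ok: true, value: Number(digits), rest: input.slice(digits.length) };
}

console.log(parseInteger("abc"));    // failure
console.log(parseInteger("123"));    // value 123, nothing remaining
console.log(parseInteger("123!"));   // value 123, remaining input "!"
console.log(parseInteger("!123"));   // failure
console.log(parseInteger("123.12")); // value 123, remaining input ".12"
```

Returning the remaining input alongside the value is what lets one parser hand off to the next.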

Here’s how we’d write a parser for this using Parsimmon, in TypeScript:

Copy Code
import * as P from 'parsimmon'

const numberParser: P.Parser<number> =
  P.regexp(/[0-9]+/)
   .map(s => Number(s))

1. Create a variable, numberParser, which is a Parser for JavaScript numbers

2. Accept input matching a regular expression, giving us the segment of the string that matches. The matched segment is a string, so at this point we have a Parser<string>.

3. Finally, we’ll use the Parser’s map function to apply JavaScript’s Number constructor to the string we’ve matched so far, giving us a Parser<number>. So if the parser is later applied to the string "123", it will output 123.

Note: this is different from the normal JavaScript map function, which maps one Array to another.

To actually use the parser, use the functions parse or tryParse:

Copy Code
numberParser.parse("123") // {status: true, value: 123}
numberParser.tryParse("123") // 123
numberParser.tryParse("!123") // Exception thrown

That’s a pretty useful parser already! But to parse a month, let’s constrain it to parsing 1 through 12:

Copy Code
const numberMonthParser: P.Parser<number> =
  P.regexp(/[0-9]+/)
   .map(s => Number(s))
   .chain(n => {
     if (n >= 1 && n <= 12) {
       return P.succeed(n)
     } else {
       return P.fail("Month must be between 1 and 12")
     }
   })

With numberParser, we just returned any number we got. With this parser, we grab the value we’ve parsed so far (the integer n) and apply some custom logic to it — in this case constraining it to 1 through 12.

To do this we’re using a new function, chain, which is slightly different than map:

  • Unlike map, the callback passed to chain returns a new Parser, not a regular value
  • Because that callback returns a Parser, it can make the parse fail by returning P.fail

Here we’re using P.succeed to return the same value that came in, but we could return any value we wanted. The P.fail case gives a nice error message if we parsed bad data.
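If the difference between map and chain still feels abstract, it may help to see both written by hand. The sketch below is a simplified model in plain TypeScript, where a parser is just a function from input to a result. The names `Result`, `Parser`, `succeed`, `fail`, `map`, `chain`, and `digits` are stand-ins for illustration and are not Parsimmon's internals.

```typescript
// Hand-rolled stand-ins for illustration; these are not Parsimmon's internals.
type Result<T> =
  | { ok: true; value: T; rest: string }
  | { ok: false; error: string };
type Parser<T> = (input: string) => Result<T>;

// A parser that consumes nothing and always yields `value`.
function succeed<T>(value: T): Parser<T> {
  return input => ({ ok: true, value, rest: input });
}

// A parser that always fails with `error`.
function fail<T>(error: string): Parser<T> {
  return () => ({ ok: false, error });
}

// map: the callback returns a plain value, so it has no way to fail.
function map<A, B>(parser: Parser<A>, f: (a: A) => B): Parser<B> {
  return input => {
    const result = parser(input);
    if (result.ok === false) return result;
    return { ok: true, value: f(result.value), rest: result.rest };
  };
}

// chain: the callback returns another parser, which then runs on the
// remaining input and may itself succeed or fail.
function chain<A, B>(parser: Parser<A>, f: (a: A) => Parser<B>): Parser<B> {
  return input => {
    const result = parser(input);
    if (result.ok === false) return result;
    return f(result.value)(result.rest);
  };
}

// One or more digits at the start of the input.
const digits: Parser<string> = input => {
  const match = /^[0-9]+/.exec(input);
  if (match === null) return { ok: false, error: "Expected digits" };
  return { ok: true, value: match[0], rest: input.slice(match[0].length) };
};

// The same shape as the month parser above: map to a number, then chain a check.
const month: Parser<number> = chain(map(digits, Number), n =>
  n >= 1 && n <= 12 ? succeed(n) : fail<number>("Month must be between 1 and 12")
);

console.log(month("7/4")); // succeeds with 7, leaving "/4"
console.log(month("13"));  // fails: Month must be between 1 and 12
```

In this model, map can only transform a success, while chain gets to decide what happens next.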

That parser lets us parse months in their numeric form. Let’s also make a parser for the full names of months:

Copy Code
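// The index signature lets us look up arbitrary user input under strict type checking
const monthNames: { [name: string]: number } = {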
  january: 1,
  february: 2,
  march: 3,
  april: 4,
  may: 5,
  june: 6,
  july: 7,
  august: 8,
  september: 9,
  october: 10,
  november: 11,
  december: 12
};
const namedMonthParser: P.Parser<number> = P.letters.chain(s => {
  const n = monthNames[s.toLowerCase()];
  if (n) {
    return P.succeed(n);
  } else {
    return P.fail(`${s} is not a valid month`);
  }
});

So this lets us parse months formatted as 1–12 with one parser, and months with their full name spelled out with another parser. But what if we want to try either parser on a string? For that, we need combinators.

Combinators

Combinators are utility functions that take multiple parsers as input and return a new parser; they “combine” parsers. A simple combinator is alt, which takes a list of parsers and tries each of them until one succeeds. For example, here’s how we’d try to parse a month as a number (1–12) or a name (September):

Copy Code
const monthParser: P.Parser<number> = P.alt(numberMonthParser, namedMonthParser)

When we combined two Parser<number>s using alt, we got a Parser<number> out! This “parsers all the way down” model makes it very easy to compose and reuse code.

Another useful combinator from Parsimmon is seq (short for “sequence”). It takes a list of parsers and runs all of them in order, returning the results in an array. Here’s how we might parse two months separated by a hyphen:

Copy Code
// The result is a pair of months, so the type is a two-element tuple
const monthRangeParser: P.Parser<[number, number]> =
  P.seq(monthParser, P.string("-"), monthParser).map(
    ([firstMonth, _hyphen, secondMonth]): [number, number] => {
      return [firstMonth, secondMonth];
    }
  );
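To show that there is no magic in these combinators, here is a hand-rolled sketch of alt and seq in plain TypeScript, using a simplified model where a parser is a function from input to a result. Everything here (`Result`, `Parser`, `alt`, `seq3`, `literal`, and the two month parsers) is written for this illustration and is not how Parsimmon is implemented; `seq3` is fixed at three parsers to keep the types simple.

```typescript
// Hand-rolled stand-ins for illustration; these are not Parsimmon's internals.
type Result<T> =
  | { ok: true; value: T; rest: string }
  | { ok: false; error: string };
type Parser<T> = (input: string) => Result<T>;

// alt: try each parser on the same input; the first success wins.
function alt<T>(...parsers: Parser<T>[]): Parser<T> {
  return input => {
    for (const parser of parsers) {
      const result = parser(input);
      if (result.ok === true) return result;
    }
    return { ok: false, error: "No alternative matched" };
  };
}

// seq3: run three parsers in order, feeding each the input the previous
// one left over, and collect the three values in a tuple.
function seq3<A, B, C>(pa: Parser<A>, pb: Parser<B>, pc: Parser<C>): Parser<[A, B, C]> {
  return input => {
    const a = pa(input);
    if (a.ok === false) return a;
    const b = pb(a.rest);
    if (b.ok === false) return b;
    const c = pc(b.rest);
    if (c.ok === false) return c;
    return { ok: true, value: [a.value, b.value, c.value], rest: c.rest };
  };
}

// Matches an exact string at the start of the input.
function literal(text: string): Parser<string> {
  return input => {
    if (input.slice(0, text.length) !== text) {
      return { ok: false, error: `Expected "${text}"` };
    }
    return { ok: true, value: text, rest: input.slice(text.length) };
  };
}

const numberMonth: Parser<number> = input => {
  const match = /^[0-9]+/.exec(input);
  if (match === null) return { ok: false, error: "Expected digits" };
  const n = Number(match[0]);
  if (n < 1 || n > 12) return { ok: false, error: "Month must be between 1 and 12" };
  return { ok: true, value: n, rest: input.slice(match[0].length) };
};

const monthNameList = [
  "january", "february", "march", "april", "may", "june",
  "july", "august", "september", "october", "november", "december"
];

const namedMonth: Parser<number> = input => {
  const match = /^[a-zA-Z]+/.exec(input);
  if (match === null) return { ok: false, error: "Expected a month name" };
  const index = monthNameList.indexOf(match[0].toLowerCase());
  if (index === -1) return { ok: false, error: `${match[0]} is not a valid month` };
  return { ok: true, value: index + 1, rest: input.slice(match[0].length) };
};

const month = alt(numberMonth, namedMonth);
const monthRange = seq3(month, literal("-"), month);

console.log(monthRange("3-September")); // succeeds with [3, "-", 9]
console.log(monthRange("March-13"));    // fails: neither alternative accepts "13"
```

Because alt and seq3 both return a Parser, their results can be passed straight into other combinators.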

With this foundation, you’re pretty much ready to go out and start writing parser combinators of your own — you can get very far with the simple combinators above, and much of what Parsimmon offers consists of convenience functions built from these same tools. But to continue with our story, here are some of the more interesting parsers needed for our date parser:

Day-of-the-month parser

This parser parses a numeric day of the month (1–31), with an optional suffix for 1st, 2nd, 3rd, etc.

Copy Code
// Parse a day of the month (1–31)
const dayOfMonthParser: P.Parser<number> = P.regexp(/[0-9]+/)
  .map(s => Number(s))
  .chain(n => {
    return numberDaySuffixParser // See next function
      .fallback("") // Falling back to a value and not using it makes the suffix parser optional
      .chain(() => {
        if (n > 0 && n <= 31) {
          return P.succeed(n);
        } else {
          return P.fail("Day must be between 1 and 31");
        }
      });
  });
// Accept suffixes like 1st, 2nd, etc.
// Note: For our own convenience we'll accept invalid names like 1nd or 4st
// (If you were implementing something like a programming language,
// you'd want to be more strict)
const numberDaySuffixParser: P.Parser<string> = P.alt(
  P.string("st"),
  P.string("nd"),
  P.string("rd"),
  P.string("th")
);
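For comparison, here is a rough sketch of what the stricter check mentioned in the comment could look like. It is not part of our date parser; `expectedDaySuffix` and `hasCorrectSuffix` are hypothetical helpers written in plain TypeScript for illustration.

```typescript
type DaySuffix = "st" | "nd" | "rd" | "th";

// The English ordinal suffix for a positive whole number.
function expectedDaySuffix(day: number): DaySuffix {
  const lastTwoDigits = day % 100;
  // 11, 12 and 13 are the exceptions: 11th, 12th, 13th.
  if (lastTwoDigits >= 11 && lastTwoDigits <= 13) return "th";
  switch (day % 10) {
    case 1: return "st";
    case 2: return "nd";
    case 3: return "rd";
    default: return "th";
  }
}

// True only when the text is digits followed by the matching suffix.
function hasCorrectSuffix(text: string): boolean {
  const match = /^([0-9]+)(st|nd|rd|th)$/.exec(text);
  return match !== null && match[2] === expectedDaySuffix(Number(match[1]));
}

console.log(hasCorrectSuffix("1st"));  // true
console.log(hasCorrectSuffix("22nd")); // true
console.log(hasCorrectSuffix("11th")); // true
console.log(hasCorrectSuffix("4st"));  // false
```

For a search box, rejecting "4st" would only frustrate the user, which is why the lenient version above is the one we want.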

2- or 4-digit year parser

Copy Code
// Parse a 2 or 4 digit year, using custom logic to convert digits
// like "14" to 2014, and digits like "70" to 1970:
const yearParser: P.Parser<number> = P.regexp(/[0-9]+/)
  .map(s => Number(s))
  .chain(n => {
    if (n > 999 && n <= 9999) {
      return P.succeed(n);
    } else if (n > 30 && n <= 99) {
      return P.succeed(1900 + n);
    } else if (n >= 0 && n <= 30) {
      return P.succeed(2000 + n);
    } else {
      return P.fail("Invalid year");
    }
  });
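The year logic is easiest to check when pulled out into a pure function. The sketch below assumes a pivot of 30, so two-digit values up to 30 land in the 2000s and 31 through 99 land in the 1900s. The function name `expandYear`, the `pivot` parameter, and the choice to return null for rejected input are assumptions made for this example.

```typescript
// Hypothetical helper: expand a parsed number into a full year,
// or return null when it is neither a 2-digit nor a 4-digit year.
function expandYear(n: number, pivot = 30): number | null {
  if (n >= 1000 && n <= 9999) return n;      // already a 4-digit year
  if (n >= 0 && n <= pivot) return 2000 + n; // "14" means 2014
  if (n > pivot && n <= 99) return 1900 + n; // "70" means 1970
  return null;                               // e.g. a 3-digit input like 123
}

console.log(expandYear(14));   // 2014
console.log(expandYear(70));   // 1970
console.log(expandYear(99));   // 1999
console.log(expandYear(2018)); // 2018
console.log(expandYear(123));  // null
```

A fixed pivot is a simplification: a longer-lived application might prefer to derive it from the current year.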

Separator

Copy Code
// Parses 0 or more separator characters between words
// Again, we're fine with 
// (and even want to accept, for greatest possible compatibility)
// inputs like "1 , 2" or "1//3", so we accept 
// arbitrarily many separators using `many`
const separatorParser: P.Parser<string[]> = P.oneOf(",-/ .").many();

Once we’ve built up parsers like month and year, we can combine them together to parse complete dates. For example, this is how our codebase accepts the “month year” format (e.g. Jan 2018, 1 18, September 19, etc.):

Copy Code
// We pass an argument to the parser to tell it to return
// the first or last day of a month,
// so a text field for the starting day in a date range
// can ask for the first day of the month,
// and the text field for the ending day can ask for the last day
export enum DayDefault {
  Start,
  End
}
const monthYearParser = (dayDefault: DayDefault): P.Parser<Day> => {
  return P.seq(monthParser, separatorParser, yearParser)
          .map(([aMonth, _s, aYear]) => {
            const firstDay = new Day(aYear, aMonth, 1);
            return dayDefault === DayDefault.Start
              ? firstDay
              : firstDay.toMonth().toLastDay();
          });
};
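Our Day and Month classes are internal, but the "first or last day of the month" step can be approximated with JavaScript's built-in Date. The sketch below is an assumption about equivalent behavior, not our actual implementation. The names `lastDayOfMonth` and `defaultDay` are hypothetical, and a string union stands in for the DayDefault enum.

```typescript
// Stand-in for the DayDefault enum above, written as a string union.
type DayDefaultChoice = "start" | "end";

// `month` is 1-based (1 = January). Day 0 of the following month is the last
// day of this month, and a 1-based month is already the 0-based index of the
// following month, so no extra arithmetic is needed.
function lastDayOfMonth(year: number, month: number): number {
  return new Date(year, month, 0).getDate();
}

// Build a local-time Date for the first or last day of the given month.
function defaultDay(year: number, month: number, choice: DayDefaultChoice): Date {
  const day = choice === "start" ? 1 : lastDayOfMonth(year, month);
  return new Date(year, month - 1, day);
}

console.log(lastDayOfMonth(2018, 11)); // 30: the answer to "how many days are in November again?"
console.log(lastDayOfMonth(2020, 2));  // 29: leap years are handled by Date
console.log(defaultDay(2018, 11, "end").getDate());   // 30
console.log(defaultDay(2018, 11, "start").getDate()); // 1
```

This is what lets a user type "Nov" in the end-date field without remembering how long November is.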

Using that pattern, we can easily build parsers for all the different formats we want to accept, then try them all with alt:

Copy Code
export const megaParser = (dayDefault: DayDefault): P.Parser<Day> => {
  return P.alt(
    wordDayParser, // e.g. "today", "yesterday", etc.
    monthDayYearParser, // e.g. "March 19th, 1992", "1 1 01", etc.
    monthDayParser, // e.g. "Dec 12", "1 1", etc.
    monthYearParser(dayDefault), // e.g. "March 1992", "1 01", etc.
    justMonthParser(dayDefault) // e.g. "March", "2", etc.
  );
};

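(Note that the order matters: alt commits to the first parser that succeeds, and any input that monthDayYear accepts begins with text that monthDay would also accept, so the longer format has to be tried first.)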

Benefits of Parser Combinators

So now you know how to use parser combinators. But let me pitch the benefits of this solution a little more:

1. Each unit of parsing is testable. You can test your parser for short months (“Feb”) separately from your parser for numeric months (“2”), separately from your parser for all month formats (“Sep” or “September” or “9”), separately from your parser for complete variants like “Month/Day/Year”. Even if you aren’t unit testing, you can try all these functions in the REPL, which is enormously useful for developing and debugging.

2. Each unit of parsing is composable. You can combine them later in different ways. For example, our monthParser is used in 4 different higher level formats: month/day/year, month/day, month/year, and as a standalone month.

3. Parsers neatly abstract. I can use a parser like P.seq(monthParser, P.string('-'), yearParser) without thinking about all the different formats of month being accepted. I can also add new accepted month formats later (e.g. the sometimes-used abbreviation “Sept”), and all code parsing months will be updated.

4. Parser combinators have the full power of your programming language. You can do whatever logic you want in them. For example, our date parser checks the current month to determine if an input like “December” means December of this or last year. You can also parse more complicated formats, like HTML, that regular expressions can’t.

5. The code is, to my eyes, highly readable. Here’s a quick reminder of what even basic regular expressions look like:
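As one illustration, here is a regular expression that accepts only all-numeric month/day/year dates. It is a hypothetical pattern written for this comparison, not something from our codebase, and it covers just one of the formats listed at the top of this post.

```typescript
// Hypothetical pattern: numeric month, day and 2- or 4-digit year,
// separated by a slash, dot, hyphen or space.
const numericDate =
  /^(0?[1-9]|1[0-2])[\/.\- ](0?[1-9]|[12][0-9]|3[01])[\/.\- ]([0-9]{2}|[0-9]{4})$/;

console.log(numericDate.test("1/1/2018"));       // true
console.log(numericDate.test("12-31-18"));       // true
console.log(numericDate.test("13/1/2018"));      // false: there is no month 13
console.log(numericDate.test("January 1 2018")); // false: month names are not handled
```

Extending this pattern to month names, ordinal suffixes, and words like "today" would make it considerably harder to read.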

Where to go from here

  • Read the documentation of the Parsimmon library
  • You can check out our complete date parsing code here. It has a few dependencies on internal model classes we’ve made (Month, Day), so it won’t run, but the code is still fairly readable.
  • Write your own parser combinators!

Interested in applying cool programming techniques like this to create the best possible user experience? Join us at Mercury; we’re hiring developers for TypeScript, React, Haskell and Nix. Apply at mercury.com/jobs.

† Our product is primarily used in the United States, so we don't support international date formats, but that would be a good addition.


Notes
Written by

Max Tagher is the co-founder and CTO of Mercury.
