March 26, 2005

iso8601

Parsing W3C's ISO 8601 Date/Times in JavaScript

Many standards use the W3C's subset of the ISO 8601 standard to specify a date and time unambiguously. After a very brief search I couldn't find a simple JavaScript function to parse a string in this form, so here's one I wrote for the job. It should accept any correctly formatted date/time in that style.

Date.prototype.setISO8601 = function (string) {
    // The backslash is doubled because inside a string literal "\." collapses
    // to ".", which matches any character and can swallow a timezone sign.
    var regexp = "([0-9]{4})(-([0-9]{2})(-([0-9]{2})" +
        "(T([0-9]{2}):([0-9]{2})(:([0-9]{2})(\\.([0-9]+))?)?" +
        "(Z|(([-+])([0-9]{2}):([0-9]{2})))?)?)?)?";
    var d = String(string).match(new RegExp(regexp));
    if (!d) { throw new Error("Not an ISO 8601 date/time: " + string); }

    var offset = 0;
    var date = new Date(d[1], 0, 1);

    if (d[3]) { date.setMonth(d[3] - 1); }
    if (d[5]) { date.setDate(d[5]); }
    if (d[7]) { date.setHours(d[7]); }
    if (d[8]) { date.setMinutes(d[8]); }
    if (d[10]) { date.setSeconds(d[10]); }
    if (d[12]) { date.setMilliseconds(Number("0." + d[12]) * 1000); }
    if (d[14]) {
        // Offset in minutes, negated so that adding it converts to UTC.
        offset = (Number(d[16]) * 60) + Number(d[17]);
        offset *= ((d[15] == '-') ? 1 : -1);
    }

    // The fields above were set in local time, so undo the local timezone too.
    offset -= date.getTimezoneOffset();
    var time = (Number(date) + (offset * 60 * 1000));
    this.setTime(Number(time));
};

To use it you'll first have to create a Date object and then invoke the method. The usage mirrors the setTime method provided by the standard JavaScript Date object.

var date = new Date();
date.setISO8601("2005-03-26T19:51:34Z");

Simple. I guess this would fit nicely into a "standard hack" category, if there were such a thing.
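
For comparison, the same parse can be sketched as a standalone function that returns a new Date instead of changing an existing one. This is only a sketch under assumptions: the name parseW3CDate is hypothetical, it builds the timestamp with Date.UTC rather than the local-time setters, it returns null when the string does not match, and like the method above it treats a time without a timezone designator as UTC.

```javascript
// Hypothetical helper, not part of the original function: parses the W3C
// subset of ISO 8601 and returns a new Date, or null if nothing matches.
function parseW3CDate(text) {
    // Groups: 1 year, 2 month, 3 day, 4 hours, 5 minutes, 6 seconds,
    // 7 fraction, 8 offset sign, 9 offset hours, 10 offset minutes.
    var re = /^(\d{4})(?:-(\d{2})(?:-(\d{2})(?:T(\d{2}):(\d{2})(?::(\d{2})(?:\.(\d+))?)?(?:Z|([-+])(\d{2}):(\d{2}))?)?)?)?$/;
    var m = re.exec(String(text));
    if (!m) { return null; }

    var ms = m[7] ? Math.round(Number("0." + m[7]) * 1000) : 0;

    // Read the fields as UTC first; a missing month or day falls back to
    // January and the 1st, and missing time fields fall back to zero.
    var utc = Date.UTC(
        Number(m[1]),
        m[2] ? Number(m[2]) - 1 : 0,
        m[3] ? Number(m[3]) : 1,
        Number(m[4] || 0),
        Number(m[5] || 0),
        Number(m[6] || 0),
        ms
    );

    if (m[8]) {
        var minutes = Number(m[9]) * 60 + Number(m[10]);
        // A "+01:00" wall-clock time is one hour ahead of UTC, so subtract.
        utc -= (m[8] === "+" ? 1 : -1) * minutes * 60000;
    }
    return new Date(utc);
}

var parsed = parseW3CDate("1986-04-26T01:23:58+03:00");
// parsed.getTime() equals Date.UTC(1986, 3, 25, 22, 23, 58)
```

Writing the pattern as a regex literal avoids having to double the backslashes, and anchoring it with ^ and $ means trailing junk makes the whole parse fail rather than being silently ignored.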

Update: Having just tripped onto BST I noticed the timezones weren't coming out correctly so I've fixed up the function and it should be right now. Working with timeshifts when your system is set to UTC can make it all too easy to miss something.

While I was at it, I also needed to go the other way, so here is the reverse function, toISO8601String.

Date.prototype.toISO8601String = function (format, offset) {
    /* accepted values for the format [1-6]:
     1 Year:
       YYYY (eg 1997)
     2 Year and month:
       YYYY-MM (eg 1997-07)
     3 Complete date:
       YYYY-MM-DD (eg 1997-07-16)
     4 Complete date plus hours and minutes:
       YYYY-MM-DDThh:mmTZD (eg 1997-07-16T19:20+01:00)
     5 Complete date plus hours, minutes and seconds:
       YYYY-MM-DDThh:mm:ssTZD (eg 1997-07-16T19:20:30+01:00)
     6 Complete date plus hours, minutes, seconds and a decimal
       fraction of a second
       YYYY-MM-DDThh:mm:ss.sTZD (eg 1997-07-16T19:20:30.45+01:00)
    */
    if (!format) { format = 6; }
    var date;
    if (!offset) {
        offset = 'Z';
        date = this;
    } else {
        var d = offset.match(/^([-+])([0-9]{2}):([0-9]{2})$/);
        if (!d) { throw new Error("Invalid timezone offset: " + offset); }
        var offsetnum = (Number(d[2]) * 60) + Number(d[3]);
        offsetnum *= ((d[1] == '-') ? -1 : 1);
        // Shift by the offset so the UTC getters below return local fields.
        date = new Date(Number(this) + (offsetnum * 60000));
    }

    var zeropad = function (num) { return ((num < 10) ? '0' : '') + num; }

    var str = "";
    str += date.getUTCFullYear();
    if (format > 1) { str += "-" + zeropad(date.getUTCMonth() + 1); }
    if (format > 2) { str += "-" + zeropad(date.getUTCDate()); }
    if (format > 3) {
        str += "T" + zeropad(date.getUTCHours()) +
               ":" + zeropad(date.getUTCMinutes());
    }
    if (format > 5) {
        var secs = Number(date.getUTCSeconds() + "." +
                   ((date.getUTCMilliseconds() < 100) ? '0' : '') +
                   zeropad(date.getUTCMilliseconds()));
        str += ":" + zeropad(secs);
    } else if (format > 4) { str += ":" + zeropad(date.getUTCSeconds()); }

    if (format > 3) { str += offset; }
    return str;
}

This function takes two arguments, both optional. The first describes the format the resulting string should take, i.e. how many components to include. It is an integer between 1 and 6, with the meanings listed in the comment block above, and defaults to 6. The second is a timezone offset of the form +HH:MM or -HH:MM. If it is not specified, the time is output in UTC, marked with the Z character.
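
The offset handling is the least obvious part, so here is a minimal sketch of just that step. The helper name formatWithOffset is hypothetical and it only covers format 5 (date, hours, minutes and seconds); the idea is to shift the instant by the offset and then read the UTC fields, which at that point hold the wall-clock time in the requested zone.

```javascript
// Hypothetical helper, not from the original function: formats a Date as
// YYYY-MM-DDThh:mm:ssTZD for a given "+HH:MM" or "-HH:MM" offset.
function formatWithOffset(date, offset) {
    var m = /^([-+])(\d{2}):(\d{2})$/.exec(offset);
    if (!m) { throw new Error("Invalid timezone offset: " + offset); }

    // Offset in minutes, positive for zones ahead of UTC.
    var minutes = (Number(m[2]) * 60 + Number(m[3])) * (m[1] === "-" ? -1 : 1);

    // Shift the instant, then read the UTC getters as local wall-clock fields.
    var shifted = new Date(date.getTime() + minutes * 60000);

    function pad(n) { return (n < 10 ? "0" : "") + n; }

    return shifted.getUTCFullYear() +
        "-" + pad(shifted.getUTCMonth() + 1) +
        "-" + pad(shifted.getUTCDate()) +
        "T" + pad(shifted.getUTCHours()) +
        ":" + pad(shifted.getUTCMinutes()) +
        ":" + pad(shifted.getUTCSeconds()) +
        offset;
}

var instant = new Date(Date.UTC(1997, 6, 16, 18, 20, 30));
var plusOne = formatWithOffset(instant, "+01:00");   // "1997-07-16T19:20:30+01:00"
var minusFive = formatWithOffset(instant, "-05:00"); // "1997-07-16T13:20:30-05:00"
```

Note the sign is the opposite of the one used when parsing: parsing subtracts the offset to get back to UTC, formatting adds it to get from UTC to local time.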

15 Comments:

Erik Arvidsson said...

I think you missed the milliseconds part.

2005-03-28T17:05:30.5433Z+01:00

28 March, 2005 16:07  
Paul Sowden said...

I didn't even notice that you could specify fractions of seconds. I've fixed up the regex so now it should accept anything that conforms to the W3C's note.

I've also given the returning function, toISO8601String, a little love to bring it up to speed.

Guess the next step would be to accept more than just the W3C's subset of the standard, but I only really posted the code as an aside.

28 March, 2005 20:31  
Britt said...

I am trying to use this in Firefox and get an error, "string has no properties", from the line

var d = string.match(new RegExp(regexp));

any ideas?

05 July, 2005 21:52  
Anonymous said...

How is this code licensed?

23 July, 2005 04:01  
Paul Sowden said...

Sorry, I've been meaning to put a copyright statement on my site for a while now. This code is available under the AFL. This should mean you can use it pretty much any way you want.

23 July, 2005 05:32  
boogs said...

Thanks Paul,

Quite useful. Just what I was looking for.

19 September, 2005 04:45  
Anonymous said...

Hi Paul,

Thanks for writing this useful function!

I am running into an issue when the date does not contain milliseconds. Here is the example:

var d = new Date();
d.setISO8601( "2005-03-26T19:51:34Z-0400");
alert( d + " str, 2005-03-26T19:51:34Z-0400" );

It appears that offset is not captured when millis are missing.

Alex

05 April, 2006 16:13  
Paul Sowden said...

These functions have now been absorbed into the Dojo Toolkit's dojo.date module, with the added advantage that they are actively maintained. If you don't fancy using the whole of Dojo, it's simple enough to copy them out of the file for your own wicked ways.

06 April, 2006 20:59  
Grauw said...

Hey,

Some comments on the regular expression:

"([0-9]{4})(-([0-9]{2})(-([0-9]{2})([T ]([0-9]{2}):([0-9]{2})(:([0-9]{2})(\.([0-9]+))?)?(Z|(([-+])([0-9]{2}):([0-9]{2})))?)?)?)?";

By using (?: you can have anonymous groups, so that their values don’t turn up in the match results. E.g. if the first part is "([0-9]{4})(?:-([0-9]{2}), then you can use ‘if (d[2])’ instead of ‘if (d[3])’.

You can also replace [0-9] with \d, it’s shorter, but I suppose it’s a matter of preference. Also, I prefer to use \d\d instead of \d{2}, I think it’s slightly more clear.

Furthermore, I don’t think you’re handling a ‘Z’ suffix right now (which is a shorthand for +00:00)?

Finally, the - and : separators are optional (but the . for milliseconds is not). And the minutes are, too, and so is the time zone (unfortunately).

This leads me to the following regular expression:

(\d\d\d\d)(?:-?(\d\d)(?:-?(\d\d)(?:[T ](\d\d)(?::?(\d\d)(?::?(\d\d)(?:\.(\d+))?)?)?(?:Z|(?:([-+])(\d\d)(?::?(\d\d))?)?)?)?)?)?

I think with that, you should have pretty much covered it :) . Try fiddling with it in the JS console:

"2006-05-19 15:30:22.01+01:00".match(/(\d\d\d\d)(?:-?(\d\d)(?:-?(\d\d)(?:[T ](\d\d)(?::?(\d\d)(?::?(\d\d)(?:\.(\d+))?)?)?(?:Z|(?:([-+])(\d\d)(?::?(\d\d))?)?)?)?)?)?/)

Also, what happens if the date doesn’t match at all, and is something nonsensical like "abcdef", does it degrade gracefully?

And what happens when the time is ‘24:00’? This means ‘the end of the day’ (as opposed to ‘00:00’ which is the beginning). As the JS Date object probably isn’t able to make such a distinction, I think this case should be converted to 00:00 of the next day.


~Grauw

20 April, 2006 22:53  
Grauw said...

Oh, I forgot, there’s a [T ] instead of T in there as well, because the ISO8601 standard allows to use a space instead of a T when the advantages (human-readability) outweigh the disadvantages (possible misunderstanding).


~Grauw

20 April, 2006 22:57  
Anonymous said...

This function is in use by Dashcode, part of the official OS X 10.4.6 developer pack. To see it, create a new project, select "Blog Widget" -> "View Source" -> scroll down.

:) This one little function is in over 500 000 computers.

03 September, 2006 00:23  
Anonymous said...

I want to parse the date format "2/3/2007" in xsd format. I want to know how to use your function and the calling technique.

Regards,
Sarbashish

07 February, 2007 23:31  
Anonymous said...

If setISO8601 used the UTC functions, it could be simpler in code and test.

For UTC,
new Date(Date.UTC(Y, M', D, ...))

The rest should be obvious.

Google : merlyn date javascript

This box is inconveniently small. and the preview too narrow.

Preview said "February 11, 2007 11:49 AM" but it was really 19:47 here - 19:47Z

11 February, 2007 19:52  
The_Decryptor said...

Found a bug with the regex (I think).

If you specify a time with seconds (and without milliseconds) and a timezone offset, it screws up and gives the wrong time.

1986-04-26T01:23:58.00+03:00 will work

1986-04-26T01:23:58+03:00 won't work

Can replicate it quite easily (try a date with a offset, and try one pre-corrected to UTC, run them through and the times will differ by a few hours)

17 February, 2007 22:37  
skierpage said...

I'm sure you know this, but readers like me googling for '"regular expression" iso8601' should be aware this doesn't handle dates way in the past or dates after 9999 AD. BCE dates require a leading minus. Dates after 9999 AD require a leading +. With leading + or - you can have four or more digits in the year. So I think you need a leading +- test and allow 4 or more digits for the year:
([+-]?)(\d{4,} ...

19 February, 2007 22:54  
