March 26, 2005

iso8601

Parsing W3C's ISO 8601 Date/Times in JavaScript

Many standards all over the place use the W3C's subset of the ISO 8601 standard to specify a date and time unambiguously. After a very brief search I couldn't find a simple JavaScript function to parse a string in this form, so here's one I wrote for the job. It should be bullet-proof and accept any correctly formatted date/time in the correct style.

Date.prototype.setISO8601 = function (string) {
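    // Note the doubled backslash: inside a string literal "\." collapses to
    // ".", which would let the fraction group swallow the timezone sign.
    var regexp = "([0-9]{4})(-([0-9]{2})(-([0-9]{2})" +
        "(T([0-9]{2}):([0-9]{2})(:([0-9]{2})(\\.([0-9]+))?)?" +
        "(Z|(([-+])([0-9]{2}):([0-9]{2})))?)?)?)?";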
    var d = string.match(new RegExp(regexp));

    var offset = 0;
    var date = new Date(d[1], 0, 1);

    if (d[3]) { date.setMonth(d[3] - 1); }
    if (d[5]) { date.setDate(d[5]); }
    if (d[7]) { date.setHours(d[7]); }
    if (d[8]) { date.setMinutes(d[8]); }
    if (d[10]) { date.setSeconds(d[10]); }
    if (d[12]) { date.setMilliseconds(Number("0." + d[12]) * 1000); }
    if (d[14]) {
        offset = (Number(d[16]) * 60) + Number(d[17]);
        offset *= ((d[15] == '-') ? 1 : -1);
    }

    offset -= date.getTimezoneOffset();
    var time = (Number(date) + (offset * 60 * 1000));
    this.setTime(Number(time));
}

To use it you'll first have to create a Date object and then invoke the method. The usage mirrors the setTime method provided by the standard JavaScript Date object.

var date = new Date();
date.setISO8601("2005-03-26T19:51:34Z");

Simple. I guess this would fit nicely into a "standard hack" category, if there were such a thing.

Update: Having just tripped onto BST I noticed the timezones weren't coming out correctly so I've fixed up the function and it should be right now. Working with timeshifts when your system is set to UTC can make it all too easy to miss something.
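The timezone arithmetic is easier to check when the fields are assembled with Date.UTC instead of the local-time setters, because the system timezone then never enters the calculation. Here is a minimal sketch of that idea. The name parseW3CDateTime is hypothetical and is not part of the code above; the regular expression is anchored and uses non-capturing groups, which is also an assumption of this sketch rather than how setISO8601 is written.

```javascript
// Hypothetical standalone helper, not part of the original code.
// Returns milliseconds since the epoch, or null if the string does not match.
function parseW3CDateTime(str) {
    var re = /^(\d{4})(?:-(\d{2})(?:-(\d{2})(?:T(\d{2}):(\d{2})(?::(\d{2})(?:\.(\d+))?)?(Z|([-+])(\d{2}):(\d{2}))?)?)?)?$/;
    var m = (typeof str === "string") ? str.match(re) : null;
    if (!m) { return null; }

    // Fraction of a second to milliseconds, e.g. "45" -> 450.
    var ms = m[7] ? Math.round(Number("0." + m[7]) * 1000) : 0;

    // Treat the fields as if they were UTC first. Missing fields default to
    // the start of the period. (Date.UTC maps years 0-99 to 1900-1999.)
    var time = Date.UTC(
        Number(m[1]),
        m[2] ? Number(m[2]) - 1 : 0,
        m[3] ? Number(m[3]) : 1,
        m[4] ? Number(m[4]) : 0,
        m[5] ? Number(m[5]) : 0,
        m[6] ? Number(m[6]) : 0,
        ms
    );

    // Then subtract the zone offset to reach the real UTC instant:
    // 19:20+01:00 is 18:20 in UTC.
    if (m[9]) {
        var offset = Number(m[10]) * 60 + Number(m[11]); // minutes
        if (m[9] === "-") { offset = -offset; }
        time -= offset * 60 * 1000;
    }
    return time;
}
```

The result can be handed straight to the Date constructor, for example new Date(parseW3CDateTime("2005-03-26T19:51:34Z")). Returning null for a non-matching string is a design choice of this sketch, so the caller decides how to handle bad input.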

While I was at it I needed to go back again so here is the reverse function, toISO8601String.

Date.prototype.toISO8601String = function (format, offset) {
    /* accepted values for the format [1-6]:
     1 Year:
       YYYY (eg 1997)
     2 Year and month:
       YYYY-MM (eg 1997-07)
     3 Complete date:
       YYYY-MM-DD (eg 1997-07-16)
     4 Complete date plus hours and minutes:
       YYYY-MM-DDThh:mmTZD (eg 1997-07-16T19:20+01:00)
     5 Complete date plus hours, minutes and seconds:
       YYYY-MM-DDThh:mm:ssTZD (eg 1997-07-16T19:20:30+01:00)
     6 Complete date plus hours, minutes, seconds and a decimal
       fraction of a second
       YYYY-MM-DDThh:mm:ss.sTZD (eg 1997-07-16T19:20:30.45+01:00)
    */
    if (!format) { var format = 6; }
    if (!offset) {
        var offset = 'Z';
        var date = this;
    } else {
        var d = offset.match(/([-+])([0-9]{2}):([0-9]{2})/);
        var offsetnum = (Number(d[2]) * 60) + Number(d[3]);
        offsetnum *= ((d[1] == '-') ? -1 : 1);
        var date = new Date(Number(Number(this) + (offsetnum * 60000)));
    }

    var zeropad = function (num) { return ((num < 10) ? '0' : '') + num; }

    var str = "";
    str += date.getUTCFullYear();
    if (format > 1) { str += "-" + zeropad(date.getUTCMonth() + 1); }
    if (format > 2) { str += "-" + zeropad(date.getUTCDate()); }
    if (format > 3) {
        str += "T" + zeropad(date.getUTCHours()) +
               ":" + zeropad(date.getUTCMinutes());
    }
    if (format > 5) {
        var secs = Number(date.getUTCSeconds() + "." +
                   ((date.getUTCMilliseconds() < 100) ? '0' : '') +
                   zeropad(date.getUTCMilliseconds()));
        str += ":" + zeropad(secs);
    } else if (format > 4) { str += ":" + zeropad(date.getUTCSeconds()); }

    if (format > 3) { str += offset; }
    return str;
}

This function takes two arguments, both optional. The first describes the format the resulting string should take, i.e. how many components to include. This is an integer between 1 and 6, with the meanings listed above in the comment block. The second argument is a timezone offset of the form +HH:MM or -HH:MM. If it is not specified, the time is written in UTC and marked with the Z character.
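The offset handling relies on one trick: shift the instant by the offset, then read the UTC fields, which now hold the wall-clock time in that zone. Here is a minimal sketch of just that step. The helper name formatWithOffset is hypothetical, and it always emits the seconds-level format rather than supporting the six levels above.

```javascript
// Hypothetical helper illustrating the offset trick; not the original function.
function formatWithOffset(date, offset) {
    var pad = function (n) { return (n < 10 ? "0" : "") + n; };
    var shifted = date;

    if (offset && offset !== "Z") {
        var m = offset.match(/^([-+])(\d{2}):(\d{2})$/);
        if (!m) { throw new Error("offset must look like +HH:MM or -HH:MM"); }
        var minutes = Number(m[2]) * 60 + Number(m[3]);
        if (m[1] === "-") { minutes = -minutes; }
        // Shift the instant so the UTC getters return the wall-clock
        // time in the requested zone.
        shifted = new Date(date.getTime() + minutes * 60000);
    } else {
        offset = "Z";
    }

    return shifted.getUTCFullYear() +
        "-" + pad(shifted.getUTCMonth() + 1) +
        "-" + pad(shifted.getUTCDate()) +
        "T" + pad(shifted.getUTCHours()) +
        ":" + pad(shifted.getUTCMinutes()) +
        ":" + pad(shifted.getUTCSeconds()) +
        offset;
}

var sample = new Date(Date.UTC(1997, 6, 16, 18, 20, 30));
var asUTC = formatWithOffset(sample);            // "1997-07-16T18:20:30Z"
var asCET = formatWithOffset(sample, "+01:00");  // "1997-07-16T19:20:30+01:00"
```

Both strings name the same instant; only the wall-clock fields and the suffix differ.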

19 Comments:

Anonymous Erik Arvidsson said...

I think you missed the milliseconds part.

2005-03-28T17:05:30.5433Z+01:00

28 March, 2005 16:07  
Blogger Paul Sowden said...

I didn't even notice that you could specify fractions of seconds. I've fixed up the regex so now it should accept anything that conforms to the W3C's note.

I've also given the returning function, toISO8601String, a little love to bring it up to speed.

Guess the next step would be to accept more than just the W3C's subset of the standard, but I only really posted the code as an aside.

28 March, 2005 20:31  
Blogger Britt said...

I am trying to use this in Firefox and get an error - Error ``string has no properties'' from the line

var d = string.match(new RegExp(regexp));

any ideas?

05 July, 2005 21:52  
Anonymous Anonymous said...

How is this code licensed?

23 July, 2005 04:01  
Blogger Paul Sowden said...

Sorry, I've been meaning to put a copyright statement on my site for a while now. This code is available under the AFL. This should mean you can use it pretty much any way you want.

23 July, 2005 05:32  
Blogger boogs said...

Thanks Paul,

Quite useful. Just what I was looking for.

19 September, 2005 04:45  
Anonymous Anonymous said...

Hi Paul,

Thanks for writing this useful function!

I am running into an issue when the date does not contain milliseconds. Here is the example:

var d = new Date();
d.setISO8601( "2005-03-26T19:51:34Z-0400");
alert( d + " str, 2005-03-26T19:51:34Z-0400" );

It appears that offset is not captured when millis are missing.

Alex

05 April, 2006 16:13  
Blogger Paul Sowden said...

These functions have now been absorbed into the Dojo Toolkit's dojo.date module, with the added advantage that they are actively maintained. If you don't fancy using the whole of Dojo, it's simple enough to copy them out of the file for your own wicked ways.

06 April, 2006 20:59  
Blogger Grauw said...

Hey,

Some comments on the regular expression:

"([0-9]{4})(-([0-9]{2})(-([0-9]{2})([T ]([0-9]{2}):([0-9]{2})(:([0-9]{2})(\.([0-9]+))?)?(Z|(([-+])([0-9]{2}):([0-9]{2})))?)?)?)?";

By using (?: you can have anonymous groups, so that their values don’t turn up in the match results. E.g. if the first part is "([0-9]{4})(?:-([0-9]{2}), then you can use ‘if (d[2])’ instead of ‘if (d[3])’.

You can also replace [0-9] with \d, it’s shorter, but I suppose it’s a matter of preference. Also, I prefer to use \d\d instead of \d{2}, I think it’s slightly more clear.

Furthermore, I don’t think you’re handling a ‘Z’ suffix right now (which is a shorthand for +00:00)?

Finally, the - and : separators are optional (but the . for milliseconds is not). And the minutes are, too, and so is the time zone (unfortunately).

This leads me to the following regular expression:

(\d\d\d\d)(?:-?(\d\d)(?:-?(\d\d)(?:[T ](\d\d)(?::?(\d\d)(?::?(\d\d)(?:\.(\d+))?)?)?(?:Z|(?:([-+])(\d\d)(?::?(\d\d))?)?)?)?)?)?

I think with that, you should have pretty much covered it :) . Try fiddling with it in the JS console:

"2006-05-19 15:30:22.01+01:00".match(/(\d\d\d\d)(?:-?(\d\d)(?:-?(\d\d)(?:[T ](\d\d)(?::?(\d\d)(?::?(\d\d)(?:\.(\d+))?)?)?(?:Z|(?:([-+])(\d\d)(?::?(\d\d))?)?)?)?)?)?/)

Also, what happens if the date doesn’t match at all, and is something nonsensical like "abcdef", does it degrade gracefully?

And what happens when the time is ‘24:00’? This means ‘the end of the day’ (as opposed to ‘00:00’ which is the beginning). As the JS Date object probably isn’t able to make such a distinction, I think this case should be converted to 00:00 of the next day.


~Grauw

20 April, 2006 22:53  
Blogger Grauw said...

Oh, I forgot, there’s a [T ] instead of T in there as well, because the ISO8601 standard allows to use a space instead of a T when the advantages (human-readability) outweigh the disadvantages (possible misunderstanding).


~Grauw

20 April, 2006 22:57  
Anonymous Anonymous said...

This function is in use by Dashcode, part of the official OS X 10.4.6 developer pack. To see it, create a new project, select "Blog Widget" -> "View Source" -> scroll down.

:) This one little function is in over 500 000 computers.

03 September, 2006 00:23  
Anonymous Anonymous said...

I want to parse the date format "2/3/2007" in xsd format. I want to know how to use your function and the calling technique.

Regards,
Sarbashish

07 February, 2007 23:31  
Anonymous Anonymous said...

If setISO8601 used the UTC functions, it could be simpler in code and test.

For UTC,
new Date(Date.UTC(Y, M', D, ...))

The rest should be obvious.

Google : merlyn date javascript

This box is inconveniently small. and the preview too narrow.

Preview said "February 11, 2007 11:49 AM" but it was really 19:47 here - 19:47Z

11 February, 2007 19:52  
Anonymous The_Decryptor said...

Found a bug with the regex (I think)

If you specify a time with seconds (and without milliseconds) and a timezone offset, it screws up and gives the wrong time

1986-04-26T01:23:58.00+03:00 will work

1986-04-26T01:23:58+03:00 won't work

Can replicate it quite easily (try a date with an offset, and try one pre-corrected to UTC, run them through and the times will differ by a few hours)

17 February, 2007 22:37  
Blogger skierpage said...

I'm sure you know this, but readers like me googling for '"regular expression" iso8601' should be aware this doesn't handle dates way in the past or dates after 9999 AD. BCE dates require a leading minus. Dates after 9999 AD require a leading +. With leading + or - you can have four or more digits in the year. So I think you need a leading +- test and allow 4 or more digits for the year:
([+-]?)(\d{4,} ...

19 February, 2007 22:54  
Blogger Rich said...

If your string is known to be in W3C Zulu format, the following one-liner parser will work everywhere Date.UTC does:

zulu=function(s){
return Date.UTC.apply(window, s.split(/\D+/).map( function(n,i){
return Number(n)-(i==1?1:0)
}));
}

For example, you could alias the Date class as:

Zulu=function(s){return new Date(zulu(s))}

I use a very similar approach for full-blown W3C Date I/O in my Time class. There's just a few extra lines to extract the time zone before splitting, then (carefully) adding the zone hour to s[3] and the signed zone minutes to s[4] before applying the array to Date.UTC.

For non-Mozillas, you'll need to add a mapper to the Array class. Here's mine:

if(!Array.map) Array.map=function(o,f,t){
if(this!=window){t=f;f=o;o=this} if(!t)t=window;
if(typeof f!="function")throw new TypeError();
if(typeof o.length!="number")return f.call(t,o,0,o);
var l=o.length, r=new Array(l), i;
if(typeof o=="string"){
for(i=0;i < l;i++)r[i]=f.call(t,o.charAt(i),i,o);
return r.join('');
}
for(i=0;i < l;i++)if(i in o)r[i]=f.call(t,o[i],i,o);
return r
}
if(!Array.prototype.map) Array.prototype.map = Array.map;

01 June, 2007 13:21  
Blogger Olivier A said...

The_Decryptor: you are right, the function contains a sly bug. The backslash before the dot for the milliseconds part should be escaped!: \\.

19 October, 2007 11:40  
Blogger Jonathan said...

"I'm sure you know this, but readers like me googling for '"regular expression" iso8601' should be aware this doesn't handle dates way in the past or dates after 9999 AD. BCE dates require a leading minus."

They also require converting to the proleptic Gregorian calendar for dates before 1582, which is a less than obvious conversion...

http://www.tondering.dk/claus/cal/node4.html

28 December, 2007 18:08  
