Archives Posts
December 17th, 2009 by kangax
This blog now lives at perfectionkills.com. No more weird, freakishly long URLs! The original RSS feed now points to the new domain, so things should just keep working there.
The old address (thinkweb2.com/projects/prototype) redirects all requests to the new one, but if you link to or reference this blog anywhere, please update the address.
If you stumble upon any problems, I’d be glad to help.
August 23rd, 2009 by kangax
I am reading the Regular Expressions Cookbook by Jan Goyvaerts and Steven Levithan. It’s a truly excellent book on the subject, with an incredible level of attention to detail. I am only halfway through the book, but have already learned a few things about regular expressions, both general and JavaScript-related ones.
One thing I noticed missing in the book was a mention of whitespace character class (\s) discrepancies in current ECMAScript implementations. The Cookbook rightly explains that \s in JavaScript matches any character defined as whitespace by the Unicode standard. What it fails to mention is how horribly this rule is actually implemented in modern browsers. While most of the implementations correctly handle ASCII whitespace characters, such as U+0020 (Space), U+000B (Vertical Tab) and U+000A (Line Feed), there’s much more chaos beyond the ASCII range, particularly at U+2000 (EN QUAD) and above.
In practice, such non-conformance can lead to surprising results when implementing something like a trim function. If trim were to rely on \s, then it could miss quite common characters like U+00A0 (No-Break Space). In fact, the trim used in jQuery or Prototype uses exactly that, the standard whitespace character class (\s), and so fails with any of these troublesome characters in non-conforming browsers. One of the solutions, of course, is to replace \s with a custom character class, e.g.: [\u0009\u000A\u000B\u000C\u000D\u0020\u00A0\u1680\u180E\u2000\u2001\u2002\u2003\u2004\u2005\u2006\u2007\u2008\u2009\u200A\u202F\u205F\u3000\u2028\u2029]
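As a rough illustration of the custom character class approach, here is a minimal sketch of a trim that avoids \s altogether. The names `WHITESPACE`, `LEADING_OR_TRAILING` and `trim` are made up for this example and are not taken from jQuery or Prototype; the code points are copied from the class listed above.

```javascript
// Explicit whitespace class, copied from the list above.
// Built as a string so it can be reused on both ends of the pattern.
var WHITESPACE = '[\\u0009\\u000A\\u000B\\u000C\\u000D\\u0020\\u00A0\\u1680\\u180E' +
  '\\u2000\\u2001\\u2002\\u2003\\u2004\\u2005\\u2006\\u2007\\u2008\\u2009\\u200A' +
  '\\u202F\\u205F\\u3000\\u2028\\u2029]';

// Matches a run of whitespace at the start or at the end of a string.
var LEADING_OR_TRAILING = new RegExp('^' + WHITESPACE + '+|' + WHITESPACE + '+$', 'g');

// Hypothetical helper: strips leading and trailing whitespace
// without depending on the browser's own definition of \s.
function trim(str) {
  return String(str).replace(LEADING_OR_TRAILING, '');
}
```

Calling `trim('\u00A0 hello \u3000')` should then give `'hello'` regardless of how the host implements \s, since only explicit code points are matched. Whitespace inside the string is left untouched.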
This topic comes up once in a while on comp.lang.javascript and there have been some efforts to document these discrepancies. I wanted to make a simple table of modern browsers’ compliance and used a test once provided by Richard Cornford (also available online for anyone to try out).
Here’s a table demonstrating the above-mentioned deviations. It’s good to see Safari 4+ and Chrome 2+ conforming to the specs fully. Hopefully, upcoming versions of Firefox will also take care of the remaining “failures”.
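The kind of check involved can be sketched in a few lines. This is not Richard Cornford's actual test, only an assumed equivalent: it runs /\s/ against each code point from the character class above and records whether it matched. `CODE_POINTS`, `toHex` and `checkWhitespaceClass` are hypothetical names.

```javascript
// Code points taken from the custom character class above, sorted.
var CODE_POINTS = [
  0x0009, 0x000A, 0x000B, 0x000C, 0x000D, 0x0020, 0x00A0, 0x1680, 0x180E,
  0x2000, 0x2001, 0x2002, 0x2003, 0x2004, 0x2005, 0x2006, 0x2007, 0x2008,
  0x2009, 0x200A, 0x2028, 0x2029, 0x202F, 0x205F, 0x3000
];

// Formats a number as a zero-padded code point label, e.g. 9 -> "U+0009".
function toHex(codePoint) {
  var hex = codePoint.toString(16).toUpperCase();
  while (hex.length < 4) {
    hex = '0' + hex;
  }
  return 'U+' + hex;
}

// Tests \s against every code point and reports which ones it matches.
function checkWhitespaceClass() {
  var results = [];
  for (var i = 0; i < CODE_POINTS.length; i++) {
    // Named `character` rather than `char`, which is a future reserved word in ES3.
    var character = String.fromCharCode(CODE_POINTS[i]);
    results.push({
      codePoint: toHex(CODE_POINTS[i]),
      pass: /\s/.test(character)
    });
  }
  return results;
}
```

Each entry of the returned array corresponds to one row of a compliance table: `pass` is `true` when the engine treats that code point as whitespace and `false` otherwise. The outcome depends entirely on the engine running it.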
| Code point / Browser | Firefox 2-3.5 | Safari 2.0-3.2.1 | Safari 4 | Opera 9.25, 9.64 | Opera 10 | IE 6-8 | Chrome 2-3 | Konqueror 4.2.2 |
|---|---|---|---|---|---|---|---|---|
| (0x0009) [ASCII Tab] | PASS | PASS | PASS | PASS | PASS | PASS | PASS | PASS |
| (0x000A) [ASCII Line Feed] | PASS | PASS | PASS | PASS | PASS | PASS | PASS | PASS |
| (0x000B) [ASCII Vertical Tab] | PASS | PASS | PASS | PASS | PASS | PASS | PASS | FAIL |
| (0x000C) [ASCII Form Feed] | PASS | PASS | PASS | PASS | PASS | PASS | PASS | PASS |
| (0x000D) [ASCII Carriage Return] | PASS | PASS | PASS | PASS | PASS | PASS | PASS | PASS |
| (0x0020) SPACE | PASS | PASS | PASS | PASS | PASS | PASS | PASS | PASS |
| (0x00A0) NO-BREAK SPACE | PASS | FAIL | PASS | PASS | PASS | FAIL | PASS | FAIL |
| (0x1680) OGHAM SPACE MARK | FAIL | FAIL | PASS | PASS | FAIL | FAIL | PASS | FAIL |
| (0x180E) MONGOLIAN VOWEL SEPARATOR | FAIL | FAIL | PASS | FAIL | FAIL | FAIL | PASS | FAIL |
| (0x2000) EN QUAD | PASS | FAIL | PASS | PASS | PASS | FAIL | PASS | FAIL |
| (0x2001) EM QUAD | PASS | FAIL | PASS | PASS | PASS | FAIL | PASS | FAIL |
| (0x2002) EN SPACE | PASS | FAIL | PASS | PASS | PASS | FAIL | PASS | FAIL |
| (0x2003) EM SPACE | PASS | FAIL | PASS | PASS | PASS | FAIL | PASS | FAIL |
| (0x2004) THREE-PER-EM SPACE | PASS | FAIL | PASS | PASS | PASS | FAIL | PASS | FAIL |
| (0x2005) FOUR-PER-EM SPACE | PASS | FAIL | PASS | PASS | PASS | FAIL | PASS | FAIL |
| (0x2006) SIX-PER-EM SPACE | PASS | FAIL | PASS | PASS | PASS | FAIL | PASS | FAIL |
| (0x2007) FIGURE SPACE | PASS | FAIL | PASS | PASS | PASS | FAIL | PASS | FAIL |
| (0x2008) PUNCTUATION SPACE | PASS | FAIL | PASS | PASS | PASS | FAIL | PASS | FAIL |
| (0x2009) THIN SPACE | PASS | FAIL | PASS | PASS | PASS | FAIL | PASS | FAIL |
| (0x200A) HAIR SPACE | PASS | FAIL | PASS | PASS | PASS | FAIL | PASS | FAIL |
| (0x2028) LINE SEPARATOR | PASS | FAIL | PASS | PASS | PASS | FAIL | PASS | FAIL |
| (0x2029) PARAGRAPH SEPARATOR | PASS | FAIL | PASS | PASS | PASS | FAIL | PASS | FAIL |
| (0x202F) NARROW NO-BREAK SPACE | FAIL | FAIL | PASS | PASS | PASS | FAIL | PASS | FAIL |
| (0x205F) MEDIUM MATHEMATICAL SPACE | FAIL | FAIL | PASS | FAIL | FAIL | FAIL | PASS | FAIL |
| (0x3000) IDEOGRAPHIC SPACE | PASS | FAIL | PASS | PASS | PASS | FAIL | PASS | FAIL |
Tests for Firefox, Safari and Opera were performed on Mac OS X (10.5.8); tests for IE and Chrome on Windows XP Pro SP2 (via VMWare); and tests for Konqueror on Ubuntu 9.04 (via VMWare).
Edit (28/09/2009)
Clarified operating systems (and their versions) used for testing; aligned characters in the table by code point; updated Opera to 10RC, added Chrome 3 to results, combined FF columns into one, since they are identical; sorted table by code point. Thanks to Dr J R Stockton and Luke Smith for suggestions.
Edit (04/09/2009)
Updated Opera 10RC to Opera 10 (thanks to Garrett Smith for the test); tested and updated the table with results for Safari 2.x and older 3.x versions; fixed a bug in the testcase where the `char` identifier (one of the future reserved words as per ES3) would prevent script parsing in Safari 2.x.
June 15th, 2009 by kangax
After a couple of months, I have finally finished an article on named function expressions. It’s meant to clear up some of the common misconceptions I’ve seen on the web, take an in-depth look at cross-browser quirks and explain how to safely “work around” them. Naturally, it also explains what named function expressions are good for and how to “take advantage” of them in your applications.
If you have suggestions or find any mistakes, please let me know. Hope you like it.
http://yura.thinkweb2.com/named-function-expressions/