Perfection Kills

by kangax

Exploring Javascript by example

Sputniktests web runner

Intro

Sputniktests is an ECMA-262 conformance test suite made by Google. For those who don't know, ECMA-262 is the standard behind well-known implementations such as JScript and JavaScript. It is what describes the ECMAScript language.

Ever since Sputniktests was released a few months ago, I have wanted to see how well various browsers conform to the standard. Unfortunately, this wasn't easy to do. The test suite could only be executed by running a Python script and passing it an executable of an implementation such as V8 or Rhino. There was no way to simply check the conformance of an arbitrary browser, especially one whose implementation can't be run separately.

I realized that a "web runner" for Sputniktests would be a useful thing to have and made one. In the end, it was a fun little exercise that made me understand the ECMAScript language just a bit better.

Sputniktests screenshot

The web runner is merely a wrapper around the original test suite, adapted to run in a browser environment. Its job is to execute tests sequentially and log any errors/failures in the process. When done, it reports the elapsed time and the number of errors.

Why it doesn't always matter

Unlike something like the Acid tests, Sputniktests is not immediately useful. Passing it fully does not necessarily make a browser more capable than another one that fails more tests. Many failures in modern browsers are rather insignificant from a practical point of view and might not even affect any real-world applications.

But there's still huge value in a conformance test suite like this. By testing every single detail of an ECMAScript implementation, Sputniktests could help minimize regressions, both functional and performance ones. It could serve as an excellent foundation for creating a new ECMAScript implementation. And last, but not least, it could help browser implementors find actual valid bugs in browser engines.

There's an important point to understand regarding test suite failures: not all of them can — or even should — be fixed, and here's why:

Proprietary extensions

It is well known that the specification allows implementations to introduce proprietary extensions. JScript and JavaScript (tm) have been doing this for years. JScript's conditional compilation and JavaScript's getters/setters demonstrate it very well. Another famous example is the way function declarations are treated in statements.

The point here is simple. A failure in Sputniktests can be the result of a proprietary extension and might not even be considered a bug.

ECMAScript 5th edition

Another cause of "valid failures" might be the next edition of ECMAScript, currently a draft. Some browsers have already started implementing parts of it and might fail to comply with the 3rd edition that Sputniktests checks against. For-in handling is a good example of such a "misunderstanding".

Backwards compatibility

Finally, there's always the beloved backwards compatibility to keep in mind. It might not be possible to fix an otherwise valid bug/deviation due to this wonderful constraint.

How runner works

The runner works very simply. First, a queue of tests is initialized and populated with all of the 5000+ tests. Then, a table of tests to ignore is initialized; it is later used to skip certain conflicting or complex tests. Finally, the runner starts picking tests from the queue, pausing for a short interval between them to keep the UI responsive during this rather intensive process. The interval defaults to 50ms and can be changed on the main screen before starting the test suite.
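The scheduling described above can be sketched roughly as follows. This is not the runner's actual source: createRunner, the shape of a test ({ name, run }), the ignored lookup table and the injectable schedule function are all hypothetical names chosen for illustration. The scheduler is injectable only so that the sketch can be exercised outside of a browser.

```javascript
// A minimal sketch of a queue-driven runner; all names here are hypothetical.
function createRunner(options) {
  var queue = options.tests.slice();            // tests waiting to run
  var ignored = options.ignored || {};          // lookup table: test name -> true
  var interval = options.interval == null ? 50 : options.interval;
  var schedule = options.schedule || function (fn, ms) { setTimeout(fn, ms); };
  var stats = { run: 0, skipped: 0, errors: 0 };

  function next() {
    if (queue.length === 0) {
      if (options.onDone) options.onDone(stats);
      return;
    }
    var test = queue.shift();
    if (ignored[test.name] === true) {
      stats.skipped++;
    } else {
      stats.run++;
      try {
        test.run();
      } catch (e) {
        stats.errors++;
      }
    }
    // Yield between tests so that the page stays responsive.
    schedule(next, interval);
  }

  return { start: next, stats: stats };
}
```

Calling createRunner({ tests: tests, ignored: ignored }).start() would then walk the queue one test at a time, waiting 50ms between tests unless another interval is given.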

For every test, the runner creates a new iframe, inserts it into the current document and writes a script element into it. This keeps tests isolated from each other, so that one test can’t affect the environment of the next one. Once the script is executed, metadata is printed to the screen: the name of the current test, the total number of errors/failures, elapsed time, etc. The iframe is then removed.

Before adding the actual test script to an iframe, the runner first injects a complementary script into it. That script defines global $ERROR, $FAIL and $PRINT functions and simply proxies them to the same-named functions of the main (parent) document. When these functions are called, they write their output to the log area of the main document.
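The proxying idea might look something like the sketch below. The real runner injects a script into the iframe; this version instead uses a hypothetical installProxies helper that takes the child and parent global objects as arguments, so it can be tried with plain objects as well.

```javascript
// Hypothetical helper: expose $ERROR, $FAIL and $PRINT on a child global
// object and forward every call to the same-named function on the parent.
function installProxies(childGlobal, parentGlobal) {
  var names = ['$ERROR', '$FAIL', '$PRINT'];
  for (var i = 0; i < names.length; i++) {
    // The closure captures the current name for each proxy.
    (function (name) {
      childGlobal[name] = function (message) {
        return parentGlobal[name](message);
      };
    })(names[i]);
  }
}
```

In a browser this could presumably be called as installProxies(iframe.contentWindow, window) before the test script is written into the iframe.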

Browser comparison

So how do modern and not so modern browsers stand up against the standard? Here's a comparison table (note that a lower score is better, and that the score represents the total number of errors and failures):

Sputniktests results chart

We can see a few interesting things here:

  • Surprisingly, Opera 9.64 is the winner. Even stranger is that Opera 10 has some serious regressions and falls far behind, joining the ancient Safari 2.x
  • I was expecting Safari 4 to beat Firefox 3.5 (or 3.7), but it doesn't even compare with Firefox 2.x
  • Firefox 3.7 (currently alpha) performs 1 point worse than Firefox 3.5
  • It's amusing to see the Internet Explorer results. The latest and greatest 8th version is practically identical to IE 5.5 (!!!). This hints at how fast bugs are being fixed in JScript.
  • Chrome 4 does surprisingly poorly (ending up between Firefox 3 and Firefox 2). I thought it would beat everyone else, considering that Sputniktests was originally developed to help Chrome conform to the standard.
  • Out of all the latest browsers (not considering the regressed Opera 10), Konqueror gets the poorest score and probably needs to work on its compliance in the near future.

Notable deviations

Here are some of the bugs and quirks I noticed in a few browsers. Each is accompanied by a short explanation.

1) for (var prop in null) { }
for (var prop in undefined) { }

These statements should actually result in a TypeError, and the explanation is pretty simple. During evaluation, the internal ToObject method is applied to the expression on the right-hand side of in. This internal method is the one that throws a TypeError when given a null or undefined value.

You might be wondering if ToObject is used anywhere else and has similar consequences? It does. Roughly, in 3 cases:

  • foo[bar]
  • with (foo) …
  • for (bar in foo) …

When foo evaluates to null or undefined, a TypeError is inevitable in any of these cases. Most browsers, however, throw an error for the first two statements but not for the last one. This is arguably a more useful behavior, even though it is technically not ECMA-compliant.

Note that 5th edition of ECMAScript actually changes “for-in” to do exactly what most of the browsers currently do — not throw TypeError, but instead proceed as if foo was an empty object.
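The difference between property access and for-in can be observed with a small probe such as the one below. The throwsTypeError helper is a hypothetical name introduced for this sketch, and the with statement is left out so that the snippet also runs as strict code.

```javascript
// Returns true when calling fn throws a TypeError.
function throwsTypeError(fn) {
  try {
    fn();
  } catch (e) {
    return e instanceof TypeError;
  }
  return false;
}

// foo[bar] with foo === null: ToObject(null) throws.
var propertyAccessThrows = throwsTypeError(function () {
  var foo = null;
  return foo['bar'];
});

// for-in over null and undefined: throws only under strict 3rd edition rules.
var forInThrows = throwsTypeError(function () {
  for (var prop in null) { }
  for (var prop2 in undefined) { }
});
```

An engine that follows the 3rd edition to the letter would set both variables to true. An engine with the 5th edition behavior leaves forInThrows as false.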

2) Number('\u00A0') === 0

When Number is called as a function, it performs type conversion. String-to-number conversion is expressed in a rather involved algorithm, but one of its simplest rules is that when a string is empty or consists only of whitespace characters, the result is 0. This means that both Number('') and Number(' ') should evaluate to 0.

Some browsers, however, fail to comply in regards to the notion of whitespace character. Passing plain U+0020 does the job, but U+00A0 (and a whole slew of other ones) often doesn’t. Instead, NaN is returned for what should really be a 0.

3) parseFloat("\u205F -1.1")

A similar bug exists in the handling of whitespace characters by parseFloat. The spec explains that any leading whitespace in the input string is ignored, so something like parseFloat(' 2.5 ') should result in 2.5. And again, some implementations fail with rarer whitespace characters, such as U+205F or U+1680. Interestingly, only Opera is fully conforming here. Firefox and WebKit both fail one way or another.
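Both whitespace deviations (Number and parseFloat) can be probed in one go. The sketch below uses a hypothetical checkWhitespace helper and only the code points mentioned above; the list is partial and is not the one Sputniktests uses.

```javascript
// Code points taken from the text above; a partial list, for illustration.
var whitespaceCodePoints = [0x0020, 0x00A0, 0x1680, 0x205F];

// Reports whether one character is treated as whitespace by both functions.
function checkWhitespace(codePoint) {
  var ch = String.fromCharCode(codePoint);
  return {
    codePoint: codePoint,
    // A string made only of whitespace should convert to 0, not NaN.
    numberOk: Number(ch) === 0,
    // Leading whitespace should be skipped before the number is parsed.
    parseFloatOk: parseFloat(ch + '-1.1') === -1.1
  };
}

var whitespaceResults = [];
for (var i = 0; i < whitespaceCodePoints.length; i++) {
  whitespaceResults.push(checkWhitespace(whitespaceCodePoints[i]));
}
```

Any entry in whitespaceResults with numberOk or parseFloatOk set to false points at a character the engine does not recognize as whitespace.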

4) Error.prototype.message

This one looks like a real bug in WebKit. WebKit throws “Unknown error” when merely attempting to access Error.prototype.message. Sputniktests actually managed to mess up here as well: test suite asserts that the property is an empty string, whereas specs say that Error.prototype.message is an implementation-dependent string (which means that it could as well be “foo-bar_BaZ”). Sputniktests need to check type of a property — typeof Error.prototype.message == 'string', and WebKit needs to stop throwing error.

5) EvalError and other xxxError ones are non-enumerable global properties

This one seems like a rather insignificant deviation. The built-in properties of the global object are specified to be non-enumerable (that is, to have the {DontEnum} internal attribute set). However, at least WebKit enumerates all of the global EvalError, RangeError, SyntaxError, etc.
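A quick way to see which of these globals an engine enumerates is sketched below. The enumerableErrorGlobals function is a hypothetical name, and the global object is obtained through Function('return this')(), which is just one of several ways to reach it.

```javascript
// Reach the global object without relying on the value of `this` here.
var globalObject = Function('return this')();

var errorNames = ['EvalError', 'RangeError', 'ReferenceError',
                  'SyntaxError', 'TypeError', 'URIError'];

// Collects the names that show up as enumerable properties of the global
// object; in a conforming engine the list should stay empty.
function enumerableErrorGlobals() {
  var found = [];
  for (var i = 0; i < errorNames.length; i++) {
    if (Object.prototype.propertyIsEnumerable.call(globalObject, errorNames[i])) {
      found.push(errorNames[i]);
    }
  }
  return found;
}
```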

6) [[Construct]] and .prototype of built-in objects.

There’s a whole slew of failures in Firefox due to built-in objects having what they shouldn’t have — prototype property and [[Construct]] method. To remind you, [[Construct]] is an internal method that’s called when applying new operator to an object — usually a function. It is basically what makes certain objects “constructable”, and what every native function object has intrinsically. The failing built-ins are global methods like parseInt, isNaN, encodeURI, as well as properties of Object.prototype, Array.prototype, and so on. To quote specs:

“None of the built-in functions described in this section shall initially have a prototype property unless otherwise specified in the description of a particular function”

and:

“None of the built-in functions described in this section shall implement the internal [[Construct]] method unless otherwise specified in the description of a particular function.”
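Both requirements can be checked from script. The sketch below uses two hypothetical helpers, lacksConstruct and hasOwnPrototype, and a small sample of built-ins taken from the list above; it is not how Sputniktests itself performs the check.

```javascript
// Returns true if `new fn()` is rejected with a TypeError, which is what
// should happen when fn has no internal [[Construct]] method.
function lacksConstruct(fn) {
  try {
    new fn();
  } catch (e) {
    return e instanceof TypeError;
  }
  return false;
}

// Returns true if fn carries its own prototype property.
function hasOwnPrototype(fn) {
  return Object.prototype.hasOwnProperty.call(fn, 'prototype');
}

// A small sample of built-in functions to probe.
var sampleBuiltins = [parseInt, isNaN, encodeURI,
                      Array.prototype.push, Object.prototype.hasOwnProperty];

var nonConformingBuiltins = [];
for (var i = 0; i < sampleBuiltins.length; i++) {
  if (!lacksConstruct(sampleBuiltins[i]) || hasOwnPrototype(sampleBuiltins[i])) {
    nonConformingBuiltins.push(sampleBuiltins[i]);
  }
}
```

In a conforming engine nonConformingBuiltins stays empty, whereas an ordinary user-defined function has both a prototype property and [[Construct]].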

7) typeof new RegExp() === 'function'

This is probably one of the most famous WebKit deviations. As you might know, a large number of browsers make regex objects callable. Callable regular expressions make it possible to replace /(a|b)/.exec('a') with simply /(a|b)/('a'). I’m not sure where this non-standard behavior originates, but it’s probably still kept around for backwards compatibility.

Interestingly, regex objects in WebKit seem to actually implement internal [[Call]] method. As per specs, any native object that implements [[Call]] should return “function” when applied typeof to, so WebKit merely follows the standard here. However, this little addition results in a side effect: regex objects are being reported as functions — typeof /x/ == 'function'.

Older Firefox (e.g. Firefox 2), by the way, behaves just like WebKit here.
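Since typeof can report "function" for regex objects in these engines, code that needs to tell the two apart cannot rely on typeof alone. One possible workaround, not taken from Sputniktests or the runner, is to inspect the internal [[Class]] through Object.prototype.toString. The isRegExp name below is hypothetical.

```javascript
// Detects regex objects by their internal [[Class]] instead of typeof,
// so the answer does not depend on whether regexes are callable.
function isRegExp(value) {
  return Object.prototype.toString.call(value) === '[object RegExp]';
}
```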

8) new RegExp(undefined)

Another bug in Firefox is the way RegExp constructor treats pattern of undefined value. Specs mandate that when undefined, pattern should simply become an empty string (i.e. functionally identical to new RegExp('')). WebKit and Opera do just that, but Firefox converts undefined into its string representation — “undefined”, making regex behave as if it was created literally via /undefined/.

9) "".search() and "--undefined--".search()

This one is related to a previous bug. The purpose of String.prototype.search is to find offset within the string where a given pattern matches. As usual, all is nice and well, until we start dealing with non-trivial input values.

When given a non-regex object as its first argument, String.prototype.search should pass it to new RegExp(). This means that "".search() is functionally identical to "".search(new RegExp()), where new RegExp is applied to an undefined value. The expression essentially matches an empty regex against an empty string. The result of "".search(), quite obviously, should be 0, since an empty regex (i.e. nothingness) matches at the very first position of the empty string it’s applied to.

Firefox, however, erroneously makes /undefined/ out of new RegExp(), and fails to match empty string at 0th position. For the very same reasons, it returns 2 in "--undefined--".search(), instead of correct 0.
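The last two deviations are easy to reproduce together. The snippet below is only a sketch with hypothetical variable names; the values in the comments are the ones the specification calls for, as explained above.

```javascript
// An undefined pattern should behave like an empty one.
var emptyPattern = new RegExp(undefined);

var searchChecks = {
  // The empty pattern matches the empty string.
  matchesEmptyString: emptyPattern.test(''),                // true
  // No argument means an empty pattern, which matches at position 0.
  searchEmpty: ''.search(),                                 // 0
  searchDashed: '--undefined--'.search(),                   // 0
  // What the buggy behavior amounts to: a literal /undefined/ pattern.
  searchLiteral: '--undefined--'.search(/undefined/)        // 2
};
```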

10) "foo".substring(0, undefined);

Another weird quirk in Firefox is the way it handles the second argument (the ending position) of String.prototype.substring. The spec clearly states that when it is undefined, the position is considered to be the end of the string. For example, "foobar".substring(0, 2) should return "fo", but "foobar".substring(0) should return "foobar", since the end position is considered to be at the end of the string.

Firefox does this partially right, producing proper result when argument is missing — "foobar".substring(0) === "foobar", but somehow fails to do the same, when passing undefined value explicitly — "foobar".substring(0, undefined) === "".
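The three cases can be put side by side. The variable names are made up for this sketch, and the comments show the values required by the spec.

```javascript
var sample = 'foobar';

var substringChecks = {
  explicitEnd: sample.substring(0, 2),           // "fo"
  missingEnd: sample.substring(0),               // "foobar"
  // An explicit undefined should behave exactly like a missing argument.
  undefinedEnd: sample.substring(0, undefined)   // "foobar"
};
```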

11) Line terminators in regex literals

An interesting quirk present in both Firefox and Opera, but not in WebKit, is related to regular expression literals. The spec makes it clear that regex literals are not allowed to have line terminators in their bodies, not even when escaped with a backslash. Firefox and Opera, however, seem to be perfectly fine with line terminators as long as those are escaped: eval("/\\\u000A/") results in an invalid regex literal that looks like:

/\
/
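Whether an engine rejects such a literal can be checked without executing it. The sketch below swaps eval for new Function, which only has to parse the source; isSyntaxError is a hypothetical helper name.

```javascript
// Returns true when the given source text fails to parse.
function isSyntaxError(source) {
  try {
    new Function(source);
  } catch (e) {
    return e instanceof SyntaxError;
  }
  return false;
}

// A backslash followed by a real line feed inside a regex literal.
var escapedLineFeedRejected = isSyntaxError('/\\\u000A/;');

// For comparison: the two characters "\n" inside a regex literal are fine.
var escapedLetterNRejected = isSyntaxError('/\\n/;');
```

A conforming engine sets escapedLineFeedRejected to true and escapedLetterNRejected to false.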

Test suite errors and oversights

Sputniktests is a truly outstanding effort. I’m amazed at the amount of work that was put into it. However, the project is still in its infancy, and there are clearly some things that could be done better.

What struck me as inconsistent and harmful is the way Sputniktests declares variables: sometimes using proper declarations (var foo = 'bar'), other times using undeclared assignments (foo = 'bar'). Undeclared assignment is a very bad practice, and there’s no reason to rely on it here or anywhere. It would be nice to see this changed in future versions.

Other inconsistencies are with usage of $PRINT function. Sometimes it’s used to log additional information about tests, but not always.

There are cases when tests rely on compliance of other components and, as a result, give false positives. For example, a test for function expression in for-in statement assumes that prototype property of a function is enumerable:


for (x in function __func(){return 0;}){
  if (x == "prototype")
    var __reached = 1;
}
if (__reached !== 1) {
  $ERROR('#2: function expession inside of for-in expression is allowed');
}

Per specification, prototype property of function object is in fact enumerable (it only has {DontDelete} attribute set on). But Firefox, for example, makes prototype non-enumerable and so fails this test. It fails it erroneously because function expression in for-in statements — what this test is actually supposed to ensure — is allowed in Firefox just fine.
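One way to decouple this check from property enumerability is to verify only that the statement parses and runs. The function below is a hypothetical rewrite written for this article, not something taken from Sputniktests.

```javascript
// Hypothetical variant of the test: it passes as long as a function
// expression is accepted inside for-in, no matter which properties
// (if any) the loop happens to enumerate.
function forInAcceptsFunctionExpression() {
  try {
    new Function('for (var x in function __func() { return 0; }) { }')();
    return true;
  } catch (e) {
    return false;
  }
}
```

With such a check, a non-enumerable prototype property would have to be caught by a separate, dedicated test.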

A similar case of false results happens when testing for Array.prototype compliance. Array.prototype should itself be an array object; its internal [[Class]] should be that of all array objects — “Array”. The test, unfortunately, checks this compliance by deleting Array.prototype.toString, then calling toString on Array.prototype, letting Object.prototype.toString propagate through and ensuring that [[Class]] of Array.prototype is “Array”.


delete Array.prototype.toString;
if (Array.prototype.toString() !== "[object " + "Array" + "]") {
  $ERROR(/* ... */);
}

Clients that have non-deletable Array.prototype.toString fail this test even with fully conforming Array.prototype.

It might be safer to use call here, but then clients with non-conforming call could result in false positives as well:


// Is Array.prototype's [[Class]] an "Array"?
if (Object.prototype.toString.call(Array.prototype) !== "[object Array]") {
  $ERROR(/* ... */);
}

It is, of course, very hard to avoid these false positives. We can only guess which things are more likely to be compliant. We can also ignore these errors: if certain environment fails one test due to non-conformance of unrelated component, that component should simply be fixed as well.

The test suite has some minor inconsistencies: missing semicolons here and there, or extra ones (after statements). There are superfluous !(... == ...) used instead of (... != ...), as well as if (... == true) instead of if (...). I also noticed a few missing conformance checks.

I have no doubt all these annoyances will be gone in the future.

Future work

Having an extensive compliance test suite can really help modern browsers achieve even better conformance. I hope we'll see some of the bugs revealed through Sputniktests fixed in the near future. I hope we'll also see fewer regressions, if browser implementors integrate it into their existing test suites. I also hope Sputniktests can help people learn and understand ECMAScript better.

The web runner is published on GitHub, so that anyone can contribute easily. There are many more things we can improve. I can think of additional features like running separate sections of tests or even individual ones, being able to see test contents right in the browser, or making it possible to pause/resume test suite execution.

Any comments, corrections and suggestions are, as always, very welcome.

And finally, I would like to, once again, thank the Sputniktests team for their outstanding efforts to help move the web forward.
