Perfection kills

Exploring Javascript by example

Archives Posts

How ECMAScript 5 still does not allow to subclass an array

July 15th, 2010 by kangax

Subclassing an array in Javascript has never been a trivial task, at least for a certain meaning of “subclassing an array”. Curiously, the new edition of the language, ECMAScript 5, still does not make it possible to fully subclass an array.

Not everything is lost though, and there are a few ways in which ECMAScript 5 brings this task closer to the ideal. However, a few fundamental issues still prevent true array subclassing.

Let’s talk about that.

Today we’ll take a look at what it means to subclass an array, what some of the existing implementations and workarounds are, and which drawbacks those implementations have. We’ll see what ECMAScript 5 brings to the table, and what the fundamental issues with subclassing are. We’ll also talk about alternative approaches to subclassing an array, such as using wrappers, and get to know their limitations.

But first, what does it mean to subclass an array? And why do we even need it?

Why subclass an array?

We can define “subclassing an array” as the process of creating an object which inherits from the native Array object (has Array.prototype in its prototype chain) and behaves similarly (or identically) to a native array.

The last point about behavior similar to native array is actually very important, as we’ll see later on. Having “subclass” of array could be thought of as being able to create an array object, but an object which would inherit not directly from Array, but from another object, and only then from Array.

In other words, we want behavior similar to this:

var sub = new SubArray(1, 2, 3);
sub; // [1, 2, 3]
 
sub.length; // 3
sub[1]; // 2
 
sub.push(4);
sub; // [1, 2, 3, 4]
 
// etc.
 
sub instanceof SubArray; // true
sub instanceof Array; // true

Note how SubArray constructor creates a sub object identical in its behavior to array (object has “length” property, numeric “0”, “1”, “2” properties, and inherits Array.prototype.* methods). At the same time, it is SubArray that a sub object directly inherits from, not Array.

So what exactly is the purpose of doing all this? Why subclass an array in such way?

There are usually two reasons:

1. Avoid pollution of global Array

Javascript prototypal nature makes it easy to extend all array objects with custom methods. Instead of assigning to direct properties of array objects, it’s much easier and more efficient to assign to array’s “prototype” object (the one that’s usually accessed via Array.prototype).

Array.prototype.last = function () {
  return this[this.length-1];
};
// ...
[1, 2, 3].last(); // 3

However, extending Array.prototype comes at a price, and that price is the chance of collisions. When scripts coexist with other scripts in an application, it’s important for those scripts not to conflict with each other. Extending Array.prototype, while tempting and seemingly useful, unfortunately isn’t very safe in a diverse environment. Different scripts can end up defining same-named methods, but with different behavior. Such a scenario often leads to inconsistent behavior and hard-to-track errors.

Collisions can happen not only with user-defined code, but also with proprietary methods implemented by environment itself (e.g. Array.prototype.indexOf from JavaScript 1.6, before it was standardized by ES5) or from future standards (e.g. Array.prototype.map, Array.prototype.reduce, etc. — now all part of ES5).
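To make the risk concrete, here is a minimal sketch of such a collision. Both “scripts” and their differing ideas of what last should do are hypothetical, invented for illustration; only Array.prototype itself comes from the language.

```javascript
// Script A (hypothetical): last() returns the final element.
Array.prototype.last = function () {
  return this[this.length - 1];
};
var beforeCollision = [1, 2, 3].last();

// Script B (hypothetical), loaded later: last(n) returns the final n elements
// as a new array, silently replacing script A's method.
Array.prototype.last = function (n) {
  return this.slice(-(n || 1));
};

// Script A's code still calls last() the old way, but now receives an array.
var afterCollision = [1, 2, 3].last();

// Remove the demo method so the rest of the page is unaffected.
delete Array.prototype.last;
```

Here beforeCollision is the number 3, while afterCollision is a one-element array. Nothing throws, which is what makes such bugs hard to track.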

Using a constructor function other than the native Array, but with the same behavior, would make it possible to avoid such collisions. Instead of extending Array.prototype, another object would be extended (say, SubArray.prototype) and then used to initialize (sub)array objects. Any third party code which depends on methods from Array.prototype would still be able to safely use them.

2. Create data structures naturally inheriting from array

Another reason to subclass an array is to be able to create data structures, which naturally inherit from array; such as Stack, List, Queue, Set, etc. While there are certainly valid use cases for these structures, in this article I will instead focus on the first aspect — reducing chance of collisions. It is somewhat more relevant in context of cross-browser scripting.

Naive approach

Creating objects that inherit from other objects is more or less straightforward in Javascript. We can use the well-known clone method:

function clone(obj) {
  function F() { }
  F.prototype = obj;
  return new F();
}

and then set-up inheritance like this:

function Child() { }
Child.prototype = clone(Parent.prototype);

clone might look confusing, but all it does is create an object with another object as nearest ancestor in its prototype chain. It uses intermediate function to avoid executing “parent” constructor. In this example, new Child creates an object with Child.prototype as first object in the prototype chain, Parent.prototype — second, and so on. To visualize, the prototype chain here looks like this:

new Child()
    |
    | [[Prototype]]
    |
    v
Child.prototype
    |
    | [[Prototype]]
    |
    v
Parent.prototype
    |
    | [[Prototype]]
    |
    v
Object.prototype
    |
    | [[Prototype]]
    |
    v
   null
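The chain in the diagram can be checked directly. This is a small sketch that assumes an ES5 environment (it relies on Object.getPrototypeOf); Parent and Child are empty placeholder constructors, as in the snippet above.

```javascript
function clone(obj) {
  function F() { }
  F.prototype = obj;
  return new F();
}

function Parent() { }
function Child() { }
Child.prototype = clone(Parent.prototype);

// Walk the [[Prototype]] links of an instance and collect every object on the way.
var chain = [];
var current = Object.getPrototypeOf(new Child());
while (current !== null) {
  chain.push(current);
  current = Object.getPrototypeOf(current);
}
// chain now holds Child.prototype, Parent.prototype and Object.prototype, in that order
```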

Using the clone method is exactly what a person attempts when trying to subclass an array for the first time:

function SubArray() {
  // Take any arguments passed to constructor and add them to an instance
  this.push.apply(this, arguments);
}
SubArray.prototype = clone(Array.prototype);
 
var sub = new SubArray(1, 2, 3);

The approach seems reasonable. After all, the goal is to create an object that inherits from Array, so there’s no reason the tried-and-true clone wouldn’t work. Or is there? As with a few other things in Javascript, it’s not as trivial as it seems.

Problems with naive approach

So what exactly is wrong with subclassing array using clone method? Let’s take a look at how previously declared SubArray function behaves. We’ll be using native array object alongside, for comparison.

var arr = new Array(1, 2, 3);
var sub = new SubArray(1, 2, 3);
 
arr.length; // 3
sub.length; // 0 (in IE<8)
 
arr.length = 2;
sub.length = 2;
 
arr; // [1, 2]
sub; // [1, 2, 3]
 
arr[10] = 'foo';
sub[10] = 'foo';
 
arr.length; // 11
sub.length; // 2

There’s clearly some kind of inconsistency here, even without counting the bug in IE<8. But what is this strange relation between length and numeric properties in an array? And why doesn’t the subclassed array behave identically? To understand this, we need to look into what array objects in Javascript really are.

Special nature of arrays

It turns out that arrays in Javascript are almost like plain Object objects, except for one little difference in behavior. The crux of this difference is summarized concisely in one paragraph of specification (15.4):

Array objects give special treatment to a certain class of property names. A property name P (in the form of a string value) is an array index if and only if ToString(ToUint32(P)) is equal to P and ToUint32(P) is not equal to 2^32 – 1. Every Array object has a length property whose value is always a nonnegative integer less than 2^32. The value of the length property is numerically greater than the name of every property whose name is an array index; whenever a property of an Array object is created or changed, other properties are adjusted as necessary to maintain this invariant. Specifically, whenever a property is added whose name is an array index, the length property is changed, if necessary, to be one more than the numeric value of that array index; and whenever the length property is changed, every property whose name is an array index whose value is not smaller than the new length is automatically deleted. This constraint applies only to properties of the Array object itself and is unaffected by length or array index properties that may be inherited from its prototype.

For those allergic to the condensed language of ECMA-262, here’s a short summary.

Array objects treat “numeric” properties in a special way. Whenever such a property is added or changed, the value of the array’s “length” property is adjusted as well, in such a way that it is always at least one more than the greatest numeric (own) property of the array. Similarly, when the “length” property is changed, numeric properties are adjusted accordingly (but only those that are not smaller than the new value of “length”; these are deleted).
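The “array index” definition from the quoted paragraph can be written down as a small predicate. This is only a sketch: isArrayIndex is a name made up here, and `>>> 0` is used as a stand-in for the specification’s ToUint32.

```javascript
// Returns true if a property name counts as an array index per section 15.4.
function isArrayIndex(name) {
  var key = String(name);
  var asUint32 = key >>> 0; // stand-in for ToUint32
  return String(asUint32) === key && asUint32 !== 4294967295; // 2^32 - 1 is excluded
}

var checks = {
  '0': isArrayIndex('0'),                   // true
  '10': isArrayIndex('10'),                 // true
  '01': isArrayIndex('01'),                 // false: ToString(ToUint32('01')) is '1'
  '-1': isArrayIndex('-1'),                 // false
  '1.5': isArrayIndex('1.5'),               // false
  '4294967295': isArrayIndex('4294967295')  // false: equals 2^32 - 1
};

// Only genuine array indices affect length.
var arr = [];
arr['01'] = 'a';                      // ordinary property, length is not touched
var lengthAfterOrdinary = arr.length; // 0
arr['1'] = 'b';                       // array index, length is adjusted
var lengthAfterIndex = arr.length;    // 2
```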

We have already seen relation between numeric properties and length in the previous example, but let’s take a look at it again, step by step:

1) When array object is created, its “length” property is set to a value one more than the largest index of an array.

  var arr = ['x', 'y', 'z'];
  arr.length; // 3 (1 greater than largest index of an array — 2 in this case)
 
  arr = ['foo'];
  arr.length; // 1 (1 greater than largest index of an array — 0 in this case)

2) When numeric properties change, so does “length” change — to maintain the relationship of being 1 greater than the largest index.

var arr = ['x', 'y'];
arr.length; // 2, as expected
 
arr[2] = 'z'; // add another numeric property (2) larger than the largest existing one (1)
arr.length; // 3 — length is changed to be 1 greater than (new) largest index (2)

3) When the “length” property is decreased, numeric properties are adjusted so that the greatest index is smaller than the new value of “length”; increasing it leaves them untouched.

var arr = ['x', 'y', 'z'];
arr.length = 2;
 
arr; // ['x', 'y'] — note how last element (z) is deleted, because being at 2nd index, 
     //              it doesn't satisfy criteria of largest index being 1 less than length
 
arr.length = 4;
 
arr; // ['x', 'y'] — "increasing" length doesn't affect numeric properties...
 
arr.join(); // "x,y,," ...but has consequences visible in other cases, such as when using `Array.prototype.join`
 
arr.push('z');
arr; // ['x', 'y', undefined, undefined, 'z'] — ...or when using `Array.prototype.push`

Now you know the “special” nature of Array objects in Javascript, which is in the relationship between “length” and numeric properties. One little detail we haven’t looked at is that array’s “length” property MUST always have a value of non-negative integer less than 2^32. Whenever this condition is violated, a RangeError is thrown:

var arr = [];
arr.length = Math.pow(2, 32); // RangeError
 
arr.length; // 0 (length is still 0, as it initially was)
 
arr.length = Math.pow(2, 32) - 1; // set length to maximum allowed value
 
arr.length++; // RangeError (when setting length explicitly)
arr.push(1); // RangeError (or when setting length implicitly)

Function objects and [[Construct]]

It should start to make sense why there are discrepancies in behavior of objects created via SubArray and Array functions. Even though SubArray creates an object that inherits from Array.prototype, that object completely lacks array’s special behavior. The SubArray instance is nothing more than a plain Object object (as if it was created via an object literal — { }).

But why does SubArray create an Object object and not an Array object? The core of this issue is in the way functions work in ECMAScript.

When new operator is applied to an object — as in new SubArray — that object’s internal [[Construct]] method is called. In our case, it is [[Construct]] of SubArray function. SubArray — being a native function — has [[Construct]] that’s specified to create a plain Object object, and invoke corresponding function providing newly created object as this value. Any native function, including SubArray, should create an Object object and return it as a result.

Now it’s worth mentioning that it’s possible to override the return value of [[Construct]] by explicitly returning a non-primitive value from the constructor function:

function SubArray() {
  this.push.apply(this, arguments);
  return []; // return array object explicitly
}

In that case, however, the returned object does NOT inherit from the constructor’s “prototype” (SubArray.prototype in this case); nor is the constructor function invoked with that object as its this value:

var sub = new SubArray(1, 2, 3);
 
// Object doesn't have 1, 2, 3, as constructor was never called with `this` value referencing returned object
sub; // []
 
// SubArray is not in the prototype chain of returned object
sub instanceof SubArray; // false

As you can see, creating an object that inherits from Array.prototype is only part of the story. The biggest issue is to preserve the special relation of length and numeric properties. This is why using regular clone approach is not quite up to the task.
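The behavior of [[Construct]] described above can be approximated in plain Javascript. The following is a rough sketch rather than the specification’s algorithm: construct, Point and Fake are hypothetical names, and Object.create (ES5) stands in for the internal object creation step.

```javascript
// Rough emulation of what `new F(args...)` does for a function written in Javascript.
function construct(F, args) {
  var proto = F.prototype;
  var protoIsObject = (typeof proto === 'object' && proto !== null) || typeof proto === 'function';
  // Step 1: a plain object is created; the constructor has no say in its kind.
  var obj = Object.create(protoIsObject ? proto : Object.prototype);
  // Step 2: the function is called with that object as `this`.
  var result = F.apply(obj, args);
  // Step 3: an explicitly returned object replaces the one created in step 1.
  var resultIsObject = (typeof result === 'object' && result !== null) || typeof result === 'function';
  return resultIsObject ? result : obj;
}

function Point(x) { this.x = x; }
function Fake() { this.x = 1; return []; }

var point = construct(Point, [5]); // plain object inheriting from Point.prototype
var fake = construct(Fake, []);    // the returned array; Fake.prototype is not in its chain
```

Nowhere in these steps can the function ask for an Array object to be created in step 1, and the object returned in step 3 never passes through step 1 at all. That is the whole dilemma in miniature.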

The importance of array special behavior

A reasonable question at this point is: “Why does array special behavior matter?” Why would we want to preserve the relationship between length and numeric properties when subclassing an array? It turns out that the consequences of a proper length are visible not only when working with length directly, but also indirectly, when performing other tasks via Array.prototype.* methods.

Take for example Array.prototype.push — a method to append items to the end of array. To determine from which position to start inserting elements into, push retrieves a value of array’s “length”. If length is not preserved properly, elements are inserted at the wrong location:

var arr = ['x', 'y'];
arr.length = 5;
arr.push('z'); // 'z' is inserted at 5th index, since that is what the value of "length" is
arr; // ['x', 'y', undefined, undefined, undefined, 'z']

Take another method — Array.prototype.join. Used to return a representation of an array by concatenating all elements with a separator, Array.prototype.join also uses length property to determine when to stop concatenating values:

var arr = ['x', 'y'];
arr.join(); // "x,y"
arr.length = 5;
arr.join(); // "x,y,,,"

Same goes for Array.prototype.concat — method used to produce a new array by concatenating values passed to concat:

var arr = ['x'];
arr.length = 3;
arr.concat('y'); // ['x', undefined, undefined, 'y']

Finally, the special behavior is often cleverly exploited in other situations, such as to “clear” an array (i.e. delete all of its numeric properties):

var arr = [1, 2, 3];
arr.length = 0;
arr; // [] — setting length to 0 effectively removes all numeric properties (elements) of an array
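The flip side is what these methods do to an object whose length is not kept in sync. A small sketch, using a plain object (named fake here purely for illustration) in place of a naively subclassed array:

```javascript
// A plain object with a "length" that nothing keeps up to date.
var fake = { length: 0 };
fake[0] = 'x';
fake[1] = 'y'; // length is still 0

// push trusts length, so it writes at index 0 and overwrites 'x'.
Array.prototype.push.call(fake, 'z');

var firstItem = fake[0];                      // 'z'
var lengthAfterPush = fake.length;            // 1
var joined = Array.prototype.join.call(fake); // 'z', since join stops at length
```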

Existing solutions

Now that we’re familiar with the theory, let’s see what the situation is with subclassing arrays in practice. There have been a few attempts in the past, with various levels of “success”. Here are a couple of the most popular ones:

Andrea Giammarchi solution

One of the recent implementations is Stack, by Andrea Giammarchi, which looks like this:

var Stack = (function(){ // (C) Andrea Giammarchi - Mit Style License
 
  function Stack(length) {
    if (arguments.length === 1 && typeof length === "number") {
      this.length = -1 < length && length === length << 1 >> 1 ? length : this.push(length);
    }
    else if (arguments.length) {
      this.push.apply(this, arguments);
    }
  };
 
  function Array() { };
  Array.prototype = [];
 
  Stack.prototype = new Array;
  Stack.prototype.length = 0;
  Stack.prototype.toString = function () {
    return this.slice(0).toString();
  };  
 
  Stack.prototype.constructor = Stack;
  return Stack;
})();

It’s an interesting solution, which mainly works around IE<8 bug with Array.prototype.push and length property. However, as should be obvious by now, it doesn’t really solve the problem of maintaining relation between length and numeric properties:

var stack = new Stack('x', 'y');
stack.length;           // 2
 
// so far so good
 
stack.push('z');
stack.length;           // 3
 
// still good
 
stack[3] = 'foo';
stack.length;           // 3
 
// not good anymore (length should have been changed to 4)
 
stack.length = 2;
stack[2];               // 'z'
 
// still not good (element at 2nd index should have been deleted)

Dean Edwards solution

Another popular solution is by Dean Edwards. This one takes a completely different approach — instead of creating an object that inherits from Array.prototype, an actual Array constructor is “borrowed” from the context of another iframe.

// create an <iframe>
var iframe = document.createElement("iframe");
iframe.style.display = "none";
document.body.appendChild(iframe);
 
// write a script into the <iframe> and steal its Array object
frames[frames.length - 1].document.write(
  "<script>parent.Array2 = Array;<\/script>"
);

The reason this “works” is that browsers create a separate execution environment for each frame in a document. Each such environment has its own set of both built-in and host objects. Built-in objects include the global Array constructor, among others. The Array object of one iframe is different from the Array object of another iframe. They also don’t have any kind of hierarchical relation:

// assuming that SubArray was borrowed from another iframe
 
var sub = new SubArray(1, 2, 3);
 
sub instanceof SubArray; // true
sub instanceof Array; // false
sub instanceof Object; // false

Notice how sub is reported as NOT an instance of Array, and NOT an instance of Object. This is because neither Array, nor Object are anywhere in the prototype chain of sub object. Instead, prototype chain consists of SubArray.prototype, followed by <Object from another iframe>.prototype:

new SubArray()
    |
    | [[Prototype]]
    |
    v
<another iframe>.Array.prototype
    |
    | [[Prototype]]
    |
    v
<another iframe>.Object.prototype
    |
    | [[Prototype]]
    |
    v
   null

This brings us to one “consideration” with this approach — difficulties determining the nature of an object derived from such iframe. It’s no longer possible to determine that an object is an array using instanceof or constructor checks [1]:

  // is this object an array?
 
  sub instanceof Array; // false
  sub.constructor === Array; // false

It is, however, still possible to use [[Class]] check (we’ll talk about [[Class]] later on):

  Object.prototype.toString.call(sub) === '[object Array]'; // true
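The same cross-context behavior can be observed outside a browser. The sketch below assumes Node.js and uses its vm module as a stand-in for an iframe; this is a substitution made for illustration, not part of Dean Edwards’ technique.

```javascript
var vm = require('vm');

// Evaluate `Array` in a fresh global context, much like reaching into another frame.
var ForeignArray = vm.runInNewContext('Array');
var sub = new ForeignArray(1, 2, 3);

var isInstance = sub instanceof Array;                // false: different Array.prototype
var sameConstructor = sub.constructor === Array;      // false
var classCheck = Object.prototype.toString.call(sub); // '[object Array]'

// Real array semantics are preserved.
sub[10] = 'foo';
var lengthAfterWrite = sub.length;                    // 11
```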

Another, more inherent, downside of this approach is that it doesn’t work in non-browser environments (or, more precisely, in any environment without support for iframes). This problem is likely to become even bigger, given that server-side Javascript implementations are rising quite fast.

Finally, it was reported that Array borrowing can cause a mixed content warning in IE6, among a few other minor issues.

Other than that, iframe-based array “subclassing” is free of downsides of solutions like Stack, since we’re dealing with real array objects, and so proper length/indices relation.

ECMAScript 5 accessors to the rescue

But let’s talk about ECMAScript 5, which as I mentioned in the beginning, brings something that helps with subclassing arrays. This “something” is actually nothing but property accessors. These useful language constructs have been present in some popular implementations (SpiderMonkey, JavaScriptCore, and others) as a non-standard extension for quite a while now. They are now standardized by the new edition of the language.

Using accessors, it’s rather trivial to create an Object object with special length/indices relation — relation that’s identical to that of Array objects! And since we already know how to create an object with Array.prototype in its prototype chain, combining these two aspects would allow for a complete emulation of arrays.

There’s one little detail about implementation. Since ECMAScript (including last, 5th version) doesn’t provide any catch-all (aka __noSuchMethod__) mechanism, it’s not possible to change value of length property of an object when numeric property is modified; in other words, we can’t intercept the moment when ‘0’, ‘1’, ‘2’, ‘15’, etc. properties are being set. However, accessors allow us to intercept any read access of length property and return proper value, depending on which numeric properties object has at that moment. This is all we really need.

Here’s an implementation of it, at about 45 lines of code:

var makeSubArray = (function(){
 
  var MAX_UINT32 = Math.pow(2, 32) - 1, // 2^32 - 1, which is not a valid array index
      hasOwnProperty = Object.prototype.hasOwnProperty;

  function ToUint32(value) {
    return value >>> 0;
  }

  function getMaxIndexProperty(object) {
    var maxIndex = -1, index, isValidProperty;

    for (var prop in object) {

      index = ToUint32(prop);
      isValidProperty = (
        String(index) === prop && 
        index !== MAX_UINT32 && 
        hasOwnProperty.call(object, prop));

      // compare as numbers; comparing the string keys would order '10' before '9'
      if (isValidProperty && index > maxIndex) {
        maxIndex = index;
      }
    }
    return maxIndex;
  }
 
  return function(methods) {
    var length = 0;
    methods = methods || { };
 
    methods.length = {
      get: function() {
        var maxIndexProperty = +getMaxIndexProperty(this);
        return Math.max(length, maxIndexProperty + 1);
      },
      set: function(value) {
        var constrainedValue = ToUint32(value);
        if (constrainedValue !== +value) {
          throw new RangeError();
        }
        for (var i = constrainedValue, len = this.length; i < len; i++) {
          delete this[i];
        }
        length = constrainedValue;
      }
    };
    methods.toString = {
      value: Array.prototype.join
    };
    return Object.create(Array.prototype, methods);
  };
})();

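We can now create “sub arrays” via the makeSubArray function. It accepts one argument: an object of property descriptors (in the format expected by Object.create) describing the methods to add to the returned “sub array”. Note that Object.create defines these as own properties of the returned object, which itself inherits directly from Array.prototype.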

var subMethods = {
  last: {
    value: function() {
      return this[this.length - 1];
    }
  }
};
var sub = makeSubArray(subMethods);
var sub2 = makeSubArray(subMethods);
// etc.

We can also hide this factory method behind a constructor, to make it similar to Array’s one:

var SubArray = (function() {
  var methods = { 
    last: { 
      value: function() {
        return this[this.length - 1];
      } 
    }
  };
  return function() {
    var arr = makeSubArray(methods);
    if (arguments.length === 1) {
      arr.length = arguments[0];
    }
    else {
      arr.push.apply(arr, arguments);
    }
    return arr;
  };
})();

And then use it as you would use regular Array constructor:

var sub = new SubArray(1, 2, 3);
 
sub.length; // 3
sub; // [1, 2, 3]
 
sub.length = 1;
sub; // [1]
 
sub[10] = 'x';
sub.push(1);

You can find this version of SubArray together with unit tests in the GitHub repository. For brevity, I made this implementation mainly take care of the length/indices relation; certain methods (e.g. concat) do not behave identically to Array’s and need to be implemented accordingly.

[[Class]] limitations

The implementation we have just seen — the one utilizing property accessors — is great. It doesn’t require any host objects (such as iframes); it preserves relation between length and numeric properties; it even disallows out-of-range values for length or indices. All it requires is support for ES5 (or even just Object.create method).

But the dramatic title of this post is not there just for fun. There’s one little detail we’re missing in this otherwise complete implementation. And that detail is proper [[Class]] value — something that ECMAScript still doesn’t give full control over.

I wrote about [[Class]] before, when explaining how to detect arrays. In a nutshell, [[Class]] is an internal property of objects in ECMAScript. Its value is never exposed directly, but can still be inspected using certain methods (e.g. Object.prototype.toString). The usefulness of [[Class]] is that it makes it possible to detect the type of an object without relying on the instanceof operator or checking the object’s constructor, both of which fail to detect objects from other contexts (e.g. iframes), as we’ve seen earlier.
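A common way to read [[Class]] is to slice it out of the string produced by Object.prototype.toString. The helper name classOf below is an assumption made for this sketch, and the last entry relies on ES5’s Object.create.

```javascript
// Extracts "Array" from "[object Array]", "Date" from "[object Date]", and so on.
function classOf(value) {
  return Object.prototype.toString.call(value).slice(8, -1);
}

var results = [
  classOf([]),                            // 'Array'
  classOf({}),                            // 'Object'
  classOf(new Date()),                    // 'Date'
  classOf(/x/),                           // 'RegExp'
  classOf(Object.create(Array.prototype)) // 'Object': inheriting from Array.prototype is not enough
];
```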

Now, since objects created by makeSubArray are nothing but plain Object objects (only with special length getters/setters), their [[Class]] is also that of “Object” not an “Array”! We’ve taken care of length/indices relation, we’ve set up Array.prototype inheritance, but there’s no way to change object’s [[Class]] value. And so this solution can not claim to be complete.

Does [[Class]] matter?

You might be wondering what the actual implications are of these pseudo-array objects having a [[Class]] of “Object” rather than “Array”. Do we even care? Well, for one, there’s an issue with object detection. Ironically, the solution I proposed to detect arrays relies on [[Class]], and so would fall short with objects like these.

// assuming that `sub` is a pseudo-array
Object.prototype.toString.call(sub) === '[object Array]'; // false

Another, probably more important, implication is that some of the methods in ECMAScript actually rely on the [[Class]] value. For example, the well-known Function.prototype.apply accepts an array as its second argument (as well as an arguments object). Section 15.3.4.3 of ES3 says: “if argArray is neither an array nor an arguments object (see 10.1.8), a TypeError exception is thrown”. What this means is that in an implementation following this wording, passing a pseudo-array object as the second argument to apply throws a TypeError. (The 5th edition relaxes this step to accept any array-like object with a valid length, so the error is specific to engines that keep the ES3 check.) Under the ES3 rule, apply doesn’t know or care if an object inherits from Array.prototype; neither does it care about the object implementing special length/indices behavior. All it cares about is that the object is of the proper type, a type that we, unfortunately, cannot emulate.

// assuming that `sub` is a pseudo-array
someFunction.apply(this, sub); // TypeError

There’s some vagueness in specification on this matter. For example, in Date.prototype.setTime spec says “If the this value is not a Date object, throw a TypeError exception.”, but in Date.prototype.getTime, it uses [[Class]] rather than just “not a Date object” — “If the this value is not an object whose [[Class]] property is “Date”, throw a TypeError exception”.

It’s probably safe to assume that these 2 phrases — “Date object” and “object with [[Class]] of ‘Date’” — have identical meaning. Ditto for “Array object” and “object with [[Class]] of ‘Array’”, as well as others.

Function.prototype.apply is not the only method sensitive to [[Class]] of an object. Array.prototype.concat, for example, follows different algorithm based on whether an object is an array or not (in other words — whether it has [[Class]] of “Array” or not).

// array ([[Class]] == "Array")
var arr = ['x', 'y'];
 
// object with numeric properties ([[Class]] == "Object")
var obj = { '0': 'x', '1': 'y' };
 
[1,2,3].concat(arr); // [1, 2, 3, 'x', 'y']
[1,2,3].concat(obj); // [1, 2, 3, { '0': 'x', '1': 'y' }]

As you can see, array values are “flattened”, whereas non-array ones are left as is. It is certainly possible to give these pseudo-arrays a custom implementation of concat (and “fix” any other of the Array.prototype.* methods), but the problem with Function.prototype.apply cannot be solved from script wherever that ES3 check is enforced.

It’s worth mentioning that another downside of accessor-based pseudo-array approach is performance. I haven’t done any tests, but it’s pretty clear that an implementation which has to enumerate over all numeric properties on every access of length property is not going to perform well. This is why I can’t recommend this solution for anything other than educational purposes.

Wrappers. Direct property injection.

Realizing a somewhat futile nature of subclassing arrays in Javascript often makes alternative solutions look very attractive. One of such solutions is using wrappers. Wrapper approach avoids setting up inheritance or emulating length/indices relation. Instead, a factory-like function can create a plain Array object, and then augment it directly with any custom methods. Since returned object is an Array one, it maintains proper length/indices relation, as well as [[Class]] of “Array”. It also inherits from Array.prototype, naturally.

function makeSubArray() {
  var arr = [ ];
  arr.push.apply(arr, arguments);
  arr.last = function() { 
    return this[this.length - 1];
  };
  return arr;
}
 
var sub = makeSubArray(1, 2, 3);
sub instanceof Array; // true
 
sub.length; // 3
sub.last(); // 3

While direct extension of an array object is a beautifully simple solution, it’s not without downsides. The main disadvantage is that on each invocation of the factory function, an array needs to be extended with N methods. The time it takes to set up an array is no longer constant (as it would be if the methods lived on SubArray.prototype), but is directly proportional to the number of methods that need to be added.
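With more than one method, direct extension typically turns into a copy loop, which makes the per-array cost visible. A sketch under assumptions: the methods object, the helper name makeExtendedArray and the extra first method are hypothetical.

```javascript
var methods = {
  first: function () { return this[0]; },
  last: function () { return this[this.length - 1]; }
};

function makeExtendedArray() {
  var arr = [];
  arr.push.apply(arr, arguments);
  // One assignment per method, repeated for every array that is created.
  for (var name in methods) {
    if (Object.prototype.hasOwnProperty.call(methods, name)) {
      arr[name] = methods[name];
    }
  }
  return arr;
}

var sub = makeExtendedArray(1, 2, 3);
var ownKeys = Object.keys(sub); // ['0', '1', '2', 'first', 'last']
```

Note that the copied methods are own, enumerable properties of each array, so they also show up in for-in loops over it. That is a side effect worth keeping in mind with this variant.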

Wrappers. Prototype chain injection.

To overcome the problem of “N methods”, another variation of wrappers can be used — the one in which object’s prototype chain is augmented, rather than object itself. Let’s see how this could be done:

function SubArray() { }
SubArray.prototype = new Array;
SubArray.prototype.last = function() {
  return this[this.length - 1];
};
 
function makeSubArray() {
  var arr = [ ];
  arr.push.apply(arr, arguments);
  arr.__proto__ = SubArray.prototype;
  return arr;
}

The idea is simple. When makeSubArray function is executed, two things happen: 1) an array object is created and is populated with any passed arguments; 2) object’s prototype chain is augmented in such way so that next object is SubArray.prototype, not original Array.prototype. The augmentation of prototype chain is done via non-standard __proto__ property.

But what happens in the makeSubArray function is of course only half of the story. To make sure that the object has Array.prototype in its prototype chain, we need to make SubArray.prototype inherit from it. This is exactly what’s being done on the second line of this snippet (SubArray.prototype = new Array). The prototype chain of an object returned from makeSubArray now looks like this:

makeSubArray()
    |
    | [[Prototype]]
    |
    v
SubArray.prototype
    |
    | [[Prototype]]
    |
    v
Array.prototype
    |
    | [[Prototype]]
    |
    v
Object.prototype
    |
    | [[Prototype]]
    |
    v
   null

And because returned object is actually an Array, not an Object one, we also get length/indices relation as well as proper [[Class]] value. In fact, we can go even further and move initialization logic into SubArray constructor itself:

function SubArray() {
  var arr = [ ];
  arr.push.apply(arr, arguments);
  arr.__proto__ = SubArray.prototype;
  return arr;
}
SubArray.prototype = new Array;
SubArray.prototype.last = function() {
  return this[this.length - 1];
};
 
var sub = new SubArray(1, 2, 3);
 
sub instanceof SubArray; // true
sub instanceof Array; // true

Even though augmenting the prototype chain is a more performant solution, there’s a clear downside: it relies on the non-standard __proto__ property. ECMAScript, unfortunately, provides no way to set the [[Prototype]] of an existing object (the internal property referencing the immediate ancestor in its prototype chain), not even in the 5th edition. Even though __proto__ is supported by a rather large number of implementations, it is far from universally available.
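One way to live with that limitation is to detect __proto__ support once and fall back to direct extension where it is missing. This is a sketch combining the two wrapper variants above; canSetProto and augment are hypothetical names, and the detection trick is an assumption rather than an established recipe.

```javascript
// Detect whether assigning to __proto__ really changes the prototype chain.
var canSetProto = (function () {
  var probe = [];
  probe.__proto__ = { marker: true };
  return probe.marker === true;
})();

function augment(arr) {
  if (canSetProto) {
    arr.__proto__ = SubArray.prototype; // prototype chain injection
  } else {
    // Fallback: direct property injection, one copy per method.
    for (var name in SubArray.prototype) {
      if (Object.prototype.hasOwnProperty.call(SubArray.prototype, name)) {
        arr[name] = SubArray.prototype[name];
      }
    }
  }
  return arr;
}

function SubArray() {
  var arr = [];
  arr.push.apply(arr, arguments);
  return augment(arr);
}
SubArray.prototype = new Array;
SubArray.prototype.last = function () {
  return this[this.length - 1];
};

var sub = new SubArray(1, 2, 3);
```

In the fallback branch the result is still a real array with a working last method, but sub instanceof SubArray is false there, since the prototype chain was never touched.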

Summary

So here it is; all the fun intricacies of subclassing arrays in Javascript.

We’ve seen that, contrary to what it might seem, actual inheritance is far from the only aspect of subclassing arrays in Javascript. Arrays differ from regular objects by having a special length/indices relation; this relation is important and has nothing to do with the prototype chain of an object. Arrays also have a special [[Class]] value of “Array”, which is rather important too and isn’t inherited either, and it’s not possible to change the [[Class]] value of an object, not even in ECMAScript 5. We looked at different ways to “subclass” an array, starting from borrowing Array constructors from other contexts, and ending with augmentation of the prototype chain. We examined benefits and downsides of each of those solutions.

What we haven’t touched upon is the performance metrics of each of the implementations — perhaps a good topic for another discussion.

On this note, I leave you with a table summarizing pros/cons of the above mentioned techniques.

Technique                       | Proper [[Class]] | length/indices | Uses native objects only | Requires ES3 only
Stack (Andrea Giammarchi)       | No               | No             | Yes                      | Yes
IFrame borrowing (Dean Edwards) | Yes              | Yes            | No                       | Yes
Accessors                       | No               | Yes            | Yes                      | No
Direct extension                | Yes              | Yes            | Yes                      | Yes
Prototype extension             | Yes              | Yes            | Yes                      | No

[1] Whether this endeavor is something worth pursuing is a topic for another discussion

P.S. Big thanks to John David Dalton for reviewing this article and giving useful suggestions.


JScript and DOM changes in IE9 preview 3

June 24th, 2010 by kangax

The 3rd preview of IE9 was released yesterday, with some amazing additions, like the canvas element and extensive ES5 support. I’ve been digging through it a little to see what has changed and what hasn’t, mainly looking at JScript and DOM. I posted some of the findings on twitter, but want to also list them here, as it’s not very convenient to share code snippets in 140 characters. Referencing it all in one place will hopefully make it easier for the IE team to find and fix these deficiencies.

ECMAScript 5 and JScript

The big news is that IE9pre3 has (almost) full support for ES5. By “full support”, I mean that it implements the majority of the new API, such as Object.create, Object.defineProperty, String.prototype.trim, Array.isArray, Date.now, and many other additions. As of now, IE9 implements the largest number of new methods; even more than the latest Chrome, Safari and Firefox. Unbelievable, isn’t it? :)

screenshot of es5 compatibility table

You can see the results in this compatibility table (note that it lists results of mere “existence” testing, not any kind of conformance).

What’s missing is strict mode, which actually isn’t implemented in any of the browsers yet.

Some of the things I noticed:

ES5 Object.getPrototypeOf on host objects seems to lie, always returning null instead of proper value of [[Prototype]]:

  Object.getPrototypeOf(document.body); // null
  Object.getPrototypeOf(document); // null
  Object.getPrototypeOf(alert); // null
  Object.getPrototypeOf(document.childNodes); // null

This doesn’t happen in other browsers that implement Object.create at the moment, such as latest Chrome, WebKit or Firefox. In Chrome, for example:

  Object.getPrototypeOf(document.body) === HTMLBodyElement.prototype; // true
  Object.getPrototypeOf(document) === HTMLDocument.prototype; // true
  Object.getPrototypeOf(alert) === Function.prototype; // true
  Object.getPrototypeOf(document.childNodes) === NodeList.prototype; // true

… and so on.

Interestingly, bound functions in IE9pre3 are represented as “function(){ [native code] }”, similar to host objects:

  var bound = (function f(x, y){ return this; }).bind({ x: 1 });
  bound + ''; // "function(){ [native code] }"
 
  // compare to
 
  alert + ''; // "function alert(){ [native code] }"

Note how the function representation includes neither the identifier (f), nor the parameters (x and y), nor any representation of the function body (return this;). This of course proves once again that relying on function decompilation is NOT a good idea.
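
To make the risk concrete, here is a sketch of the kind of helper that breaks. naiveParamNames is a hypothetical function written only for this illustration; it parses the string representation of a function, which is exactly the practice being warned against.

```javascript
// Hypothetical helper that depends on function decompilation.
// Shown as an example of what NOT to rely on.
function naiveParamNames(fn) {
  var match = /^function[^(]*\(([^)]*)\)/.exec(String(fn));
  if (!match || !match[1]) {
    return [];
  }
  return match[1].split(/\s*,\s*/);
}

var original = function f(x, y) { return this; };
var bound = original.bind({ x: 1 });

// Works only where the engine happens to return the source text.
var fromOriginal = naiveParamNames(original);

// Finds nothing where bound functions are represented
// with a "[native code]" placeholder and no parameter list.
var fromBound = naiveParamNames(bound);
```

In engines that return source text for regular functions and a placeholder for bound ones, fromOriginal holds the two parameter names while fromBound is empty, even though both functions accept the same arguments.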

Whitespace character class (as in /\s/) still doesn’t match majority of whitespace characters (as defined by specs). These include “U+00A0”, “U+2000” to “U+200A”, “U+3000”, etc. The test is available here. Curiously, ES5 String.prototype.trim seems to “understand” those characters as whitespace very well, producing empty string — as expected — for something like '\u00A0'.trim().
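
A check along these lines can be sketched as follows. This is not the test linked above, just a rough approximation; the list of characters is my reading of what ES5 treats as whitespace and line terminators, and checkWhitespace and toCodePointLabel are hypothetical names.

```javascript
// Characters that ES5 classifies as WhiteSpace or LineTerminator
// (assumed list, limited to the Basic Multilingual Plane).
var whitespace = [
  '\u0009', '\u000B', '\u000C', '\u0020', '\u00A0', '\uFEFF',
  '\u000A', '\u000D', '\u2028', '\u2029',
  '\u1680', '\u2000', '\u2001', '\u2002', '\u2003', '\u2004', '\u2005',
  '\u2006', '\u2007', '\u2008', '\u2009', '\u200A', '\u202F', '\u205F',
  '\u3000'
];

// Formats a single character as a "U+XXXX" label.
function toCodePointLabel(ch) {
  var hex = ch.charCodeAt(0).toString(16).toUpperCase();
  return 'U+' + ('0000' + hex).slice(-4);
}

// Collects the characters that /\s/ fails to match and,
// separately, the ones that trim() fails to strip.
function checkWhitespace(chars) {
  var report = { regexMisses: [], trimMisses: [] };
  for (var i = 0; i < chars.length; i++) {
    var ch = chars[i];
    if (!/\s/.test(ch)) {
      report.regexMisses.push(toCodePointLabel(ch));
    }
    if (ch.trim() !== '') {
      report.trimMisses.push(toCodePointLabel(ch));
    }
  }
  return report;
}

var report = checkWhitespace(whitespace);
```

In a conforming implementation both lists come back empty; the behavior described above would show up as a populated regexMisses next to an empty trimMisses.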

It was nice to see that ES5 Array.isArray is about 20 times faster than custom implementation, such as this one:

  function isArray(o) {
    return Object.prototype.toString.call(o) === "[object Array]";
  }

The difference in speed is similar to other browsers that implement this method.
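
Given that difference, one reasonable pattern is to prefer the native method and fall back to the [[Class]] check only where it is missing. A minimal sketch, with isArray as a local name chosen for illustration:

```javascript
// Use the native ES5 method when it exists; otherwise fall back
// to the slower [[Class]] check.
var isArray = typeof Array.isArray === 'function'
  ? Array.isArray
  : function (o) {
      return Object.prototype.toString.call(o) === '[object Array]';
    };

isArray([]);              // true
isArray({ length: 0 });   // false
```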

An infamous, 10+ year-old JScript NFE bug, which I described at length before, is finally fixed:

  var f = function g() { return f === g; };
  typeof g; // "undefined"
 
  f(); // true

The [[Class]] of an arguments object is now “Arguments”, just as ES5 specifies:

  var args = (function(){ return arguments; })();
  Object.prototype.toString.call(args); // "[object Arguments]"

DOM

Unfortunately, the entire host objects infrastructure still looks very similar to the one from IE8. Host objects don’t inherit from Object.prototype, don’t report proper typeof, and don’t even have basic properties like “length” or “prototype”, which all function objects must have:

  alert instanceof Object; // false
  typeof alert; // "object"
  alert.length; // undefined

Because they don’t inherit from Object.prototype, we don’t have any of Object.prototype methods, naturally:

  alert.toString; // undefined
  alert.constructor; // undefined
  alert.hasOwnProperty; // undefined

Object.prototype is not the only object host methods fail to inherit from. In majority of modern browsers, host objects also inherit from Function.prototype and so have Function.prototype methods like call and apply. This doesn’t happen in IE9pre3.

  alert instanceof Function; // false
  document.createElement instanceof Function; // false
 
  alert.call; // undefined

Curiously, call and apply are present on some host objects, but they are still not inherited from Function.prototype:

  typeof document.createElement.call; // "function"
  document.createElement.call === Function.prototype.call; // false

Host objects’ [[Class]] is far from ideal as well. IE9pre3 actually violates ES5, which says that objects implementing [[Call]] (or in other words — are callable) should have [[Class]] of “Function” — even if they are host objects. In IE9pre3, alert is a callable host object, yet it reports its [[Class]] as “Object” not “Function”. Not good.

  Object.prototype.toString.call(alert); // "[object Object]"
  Object.prototype.toString.call(document.createElement); // "[object Object]"

IE9pre3 still messes up DOM objects’ attributes and properties, although not as badly as earlier versions:

  var el = document.createElement('p');
  el.setAttribute('x', 'y');
  el.x; // 'y'
 
  el.foobarbaz = 'moo';
  el.hasAttribute('foobarbaz'); // true
  el.getAttribute('foobarbaz'); // 'moo'

Some old, humorous bugs can still be seen in IE9pre3, such as typeof reporting “string” for certain host methods:

  typeof Option.create; // "string"
  typeof Image.create; // "string"
  typeof document.childNodes.item; // "string"

Undeclared assignments still throw an error when an element with the same id is present in the DOM, but no longer when only an element with the same name is present (as was the case in previous versions):

  <div id="foo"></div>
  <a name="bar"></a>
  ...
  <script>
    foo = function(){ /* ... */ }; // Error
    bar = function(){ /* ... */ }; // no Error
  </script>

Similarly to IE8, only Element and specific element type interfaces (HTMLDivElement, HTMLScriptElement, HTMLSpanElement, etc.) are exposed as same-named global properties. Node and HTMLElement are still missing, and element’s prototype chain most likely still looks like this:

  document.createElement('div');
    |
    | [[Prototype]]
    v
  HTMLDivElement.prototype
    |
    | [[Prototype]]
    v
  Element.prototype
    |
    | [[Prototype]]
    v
  null

…rather than what can be seen in almost all other modern browsers:

  document.createElement('div');
    |
    | [[Prototype]]
    v
  HTMLDivElement.prototype
    |
    | [[Prototype]]
    v
  HTMLElement.prototype
    |
    | [[Prototype]]
    v
  Element.prototype
    |
    | [[Prototype]]
    v
  Node.prototype
    |
    | [[Prototype]]
    v
  Object.prototype
    |
    | [[Prototype]]
    v
  null

getComputedStyle from DOM Level 2 is still missing; mysteriously, though, its value is null rather than undefined. The property actually exists on the object, but holds null. Hopefully this is just a placeholder, and the proper method will be added before the final release.

  document.defaultView.getComputedStyle; // null
  'getComputedStyle' in document.defaultView; // true
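
A null placeholder like this means that an `in` check alone gives a false positive. One defensive alternative is to look at the typeof result and rule out null explicitly. The sketch below uses plain stand-in objects rather than a real window, isHostMethod is a name picked for illustration, and the "unknown" branch covers a typeof result that older IE versions are known to report for some host members.

```javascript
// Sketch of a more defensive host method test.
function isHostMethod(object, name) {
  var type = typeof object[name];
  return type === 'function' ||
    (type === 'object' && object[name] !== null) ||
    type === 'unknown';
}

// Stand-ins for document.defaultView in different environments.
var placeholderView = { getComputedStyle: null };
var workingView = { getComputedStyle: function () { return {}; } };

'getComputedStyle' in placeholderView;             // true (false positive)
isHostMethod(placeholderView, 'getComputedStyle'); // false
isHostMethod(workingView, 'getComputedStyle');     // true
```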

Array.prototype.slice can now convert certain host objects (e.g. NodeList’s) to arrays — something that majority of modern browsers have been able to do for quite a while:

  Array.prototype.slice.call(document.childNodes) instanceof Array; // true
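
Earlier IE versions throw when slice is applied to a host object, so code that has to run there usually needs a fallback. A rough sketch, with toArray as a hypothetical helper name and a plain array-like object standing in for a NodeList:

```javascript
// Converts an array-like object to a real array.
// Tries slice first and copies element by element if that throws.
function toArray(list) {
  try {
    return Array.prototype.slice.call(list);
  } catch (e) {
    var result = [];
    for (var i = 0, len = list.length; i < len; i++) {
      result[i] = list[i];
    }
    return result;
  }
}

var arrayLike = { 0: 'a', 1: 'b', length: 2 };
var converted = toArray(arrayLike);

converted instanceof Array; // true
converted.join(',');        // "a,b"
```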

That’s it for now.

Unfortunately, I don’t have much time to look into these things extensively, at the moment. There might be more updates on twitter.

As always, any corrections, suggestions, and additions are much appreciated.


`instanceof` considered harmful (or how to write a robust `isArray`)

January 10th, 2009 by kangax

Checking types in Javascript is well known as a pretty unreliable process.
Good old typeof operator is often useless when it comes to certain types of values:

typeof null; // "object"
typeof []; // "object"

People often expect to see something like “null” in the former check and something like “array” in the latter one.
Fortunately, checking for null is not that hard, despite useless typeof, and is usually accomplished by strict-comparing value to null:

value === null;

Checking for arrays, on the other hand, is a somewhat tricky business. There are usually two schools of thought – using instanceof operator (or checking object’s constructor property) and the-duck-typing way – checking for presence (or types) of certain set of properties (which are known to be present in array objects).

Obviously, both ways have their pros and cons.

1) `instanceof` operator / `constructor` property

instanceof operator essentially checks whether anything from left-hand object’s prototype chain is the same object as what’s referenced by prototype property of right-hand object. It sounds somewhat complicated but is easily understood from a simple example:

var arr = []; 
arr instanceof Array; // true

This statement returns `true` because Array.prototype (being a prototype property of a right-hand object) references the same object as an internal [[Prototype]] of left-hand object ([[Prototype]] is “visible” via arr.__proto__ in clients that have __proto__ extension). An alternative constructor check, which I mentioned earlier, would usually look like:

var arr = []; 
arr.constructor == Array; // true
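
The prototype chain walk that instanceof performs can be approximated by hand. The sketch below is only an illustration of the description above, not the exact algorithm from the specification; isInstance is a hypothetical name, and it assumes ES5 Object.getPrototypeOf is available.

```javascript
// Rough re-implementation of "object instanceof Constructor".
function isInstance(object, Constructor) {
  // Primitives have no prototype chain to walk; instanceof gives false.
  if (object === null ||
      (typeof object !== 'object' && typeof object !== 'function')) {
    return false;
  }
  var target = Constructor.prototype;
  var proto = Object.getPrototypeOf(object);
  while (proto !== null) {
    if (proto === target) {
      return true;
    }
    proto = Object.getPrototypeOf(proto);
  }
  return false;
}

isInstance([], Array);  // true
isInstance([], Object); // true, Object.prototype is further up the chain
isInstance({}, Array);  // false
```

Note that the comparison is by object identity, which is what makes the check sensitive to where the array was created.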

Both instanceof and constructor look very innocent and seem like great ways to check if an object is an array. If I remember correctly, latest jQuery is using constructor:

An excerpt from jQuery (rev. 5917):

... 
isArray: function( arr ) {
  return !!arr && arr.constructor == Array;
} 
...

The problems arise when it comes to scripting in multi-frame DOM environments. In a nutshell, Array objects created within one iframe do not share [[Prototype]]’s with arrays created within another iframe. Their constructors are different objects and so both instanceof and constructor checks fail:

var iframe = document.createElement('iframe'); 
document.body.appendChild(iframe); 
var xArray = window.frames[window.frames.length-1].Array;
var arr = new xArray(1,2,3); // [1,2,3]  
 
// Boom! 
arr instanceof Array; // false  
 
// Boom! 
arr.constructor === Array; // false

This “problem” was mentioned by Crockford as far back as 2003. Doug suggested trying duck-typing and checking the type of one of the Array.prototype methods – e.g.:

typeof myArray.sort == 'function'

Exactly for these reasons Javascript authors often resort to a second approach:

2) Duck-typing

We’ve been using it in Prototype.js for quite some time now. Dean Edwards was using it in his base2, last time I looked at it.

An excerpt from Prototype.js (v. 1.6.0.3):

function isArray(object) {
  return object != null && typeof object === "object" &&
    'splice' in object && 'join' in object; 
}

While “fixing” the multi-frame “problem”, this naive approach falls short in some trivial cases. If you were ever to have an object with splice and join properties, Object.isArray would obviously detect that object as being an Array:

var testee = { splice: 1, join: 2 }; 
Object.isArray(testee); // true

Back in June, I was reading ECMA-262 specs and noticed that there was an easy way to get value of an internal [[Class]] property that every native object has. Object.prototype.toString was defined like so:



Object.prototype.toString( )
When the toString method is called, the following steps are taken:
1. Get the [[Class]] property of this object.
2. Compute a string value by concatenating the three strings "[object ", Result (1), and "]".
3. Return Result (2).

Contrary to Function.prototype.toString which is implementation dependent and is NOT recommended to be relied upon, Object.prototype.toString has a clearly defined behavior for all native objects.



15.3.4.2 Function.prototype.toString()
An implementation-dependent representation of the function is returned. This representation has the syntax of a FunctionDeclaration. Note in particular that the use and placement of white space, line terminators, and semicolons within the representation string is implementation-dependent.

Just as a fun exercise, I wrote a simple __getClass method, put it into an “experimental” folder and forgot about it : )

function __getClass(object) {
  return Object.prototype.toString.call(object)
    .match(/^\[object\s(.*)\]$/)[1]; 
}

A couple of weeks ago, though, someone created a ticket for Prototype.js – proposing an Object.isDate method. An implementation used constructor check and so was vulnerable to cross-frame issues. This is when I remembered about getClass and its possible usage in isArray, isDate and other similar methods.
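
As a sketch of what such methods could look like when built on [[Class]]: the helper below is a variation on __getClass that slices the string instead of matching it, and the names getClass, isDate and isRegExp are chosen only for this illustration. This is not the implementation from the ticket.

```javascript
// "[object Date]" -> "Date": drop the 8-character "[object "
// prefix and the trailing "]".
function getClass(object) {
  return Object.prototype.toString.call(object).slice(8, -1);
}

function isDate(o) {
  return getClass(o) === 'Date';
}

function isRegExp(o) {
  return getClass(o) === 'RegExp';
}

isDate(new Date());                      // true
isDate({ getTime: function () {} });     // false, duck-typing would be fooled
isRegExp(/x/);                           // true
```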

Specs mention that:



15.4.2.1 new Array([ item0[, item1 [,...]]])

The [[Class]] property of the newly constructed object is set to “Array”.


This means that creating isArray function could not be simpler than:

function isArray(o) {
  return Object.prototype.toString.call(o) === '[object Array]'; 
}

The solution is not dependent on frames (since it checks internal [[Class]]) and is more robust than duck-typing approach. I have tested it on a handful of browsers (including some archaic and mobile ones) and was happy to find that all of them are indeed compliant in this regard.
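
The same comparison can be reproduced outside a browser. The sketch below assumes a Node.js environment, where a context created by the built-in vm module plays the role of the iframe: it has its own Array constructor and its own Array.prototype.

```javascript
// Assumption: Node.js. A separate vm context stands in for an iframe.
var vm = require('vm');
var foreign = vm.runInNewContext('[1, 2, 3]');

function isArray(o) {
  return Object.prototype.toString.call(o) === '[object Array]';
}

foreign instanceof Array;      // false
foreign.constructor === Array; // false
isArray(foreign);              // true
isArray([1, 2, 3]);            // true
```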

Let’s hope this little “trick” serves as a remedy to cross-frame issues that authors struggle to find workarounds for : )

Happy new year!
