Perfection kills

Exploring Javascript by example

Archives Posts

Tag is not an element. Or is it?

June 1st, 2010 by kangax

It’s interesting how widely some misconceptions spread around. The one I noticed recently is the “issue” of elements vs. tags. The problem is that people say tags when they mean elements, and do it so often that it’s not clear if the distinction is still relevant.

Or if anyone even cares anymore.

Elements vs. tags

If you look at section 3 of HTML 4.01, “on SGML and HTML”, there’s an explicit note about elements not being tags. In HTML 4.01, <p>foo bar</p> is an element, not a tag. An element consists of a start tag, content, and an end tag. In case of <p>foo bar</p>, <p> is a start tag, foo bar is content, and </p> is an end tag.

In other words, elements consist of tags.
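
To make the split concrete, here is a minimal sketch that breaks a markup string into start tags, text content, and end tags. The `tokenize` helper is hypothetical and deliberately naive (it ignores comments, attribute values containing ">", and optional tags); it only illustrates that one element is written with several tokens.

```javascript
// Hypothetical helper, not part of any specification or library:
// splits markup into start tags, end tags, and text content.
function tokenize(markup) {
  var tokens = [];
  // Either a tag (optionally starting with "/") or a run of text
  var pattern = /<(\/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*>|([^<]+)/g;
  var match;
  while ((match = pattern.exec(markup)) !== null) {
    if (match[3] !== undefined) {
      tokens.push({ type: 'text', value: match[3] });
    } else {
      tokens.push({
        type: match[1] ? 'endTag' : 'startTag',
        name: match[2].toLowerCase()
      });
    }
  }
  return tokens;
}

var tokens = tokenize('<p>foo bar</p>');
// tokens[0] is a start tag, tokens[1] is text, tokens[2] is an end tag;
// all three together make up a single P element
```

Three tokens, one element: the start tag and the end tag are only the delimiters.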

Optional tags

The distinction between tags and elements becomes slightly less clear once we start dealing with elements that have optional tags, as defined by HTML 4.01. For example, <p> or <td> elements don’t have to have end tags. They could very well exist without them. When a parser finds <p>foo bar in markup, it still creates an element. There’s no end </p> tag, but the parser doesn’t really need it; the start <p> tag already denotes what kind of element it is.

  <p>foo bar
 
  <tr>
    <td>baz
    <td>qux
  </tr>

But that’s not all.

Some elements have an empty content model, which means that they can’t have any content at all. When an element is not allowed to have any content, it’s called an empty element. End tags are not merely optional in such elements; they must be omitted completely. These, unfortunately, are not some obscure elements, but very useful ones like <br>, <link>, <img>, <input>, <meta> and a few others.

What’s interesting is that <br> is still an element, only an element that consists of a start tag only. It’s just that its content and end tag must never be present. The fact that <br>, <img> or other empty elements consist of start tags only makes things rather confusing.

And we’re not even talking about elements with both tags optional — <html>, <head>, <body>. Those could exist without any visible traces at all, and are only created based on the context.

  <html>
    <!-- 
            There's no HEAD start tag, no HEAD end tag, and no HEAD content here. 
            Yet, HEAD element is still created implicitly.
            This happens because content model of HTML element is defined as `head, body`, 
            which means that both elements should be present in HTML element in that order. 
            As soon as BODY start tag is found, even if HEAD tags are not present, 
            HEAD element is created automatically.  -->
    <body>
    ...
    </body>
  </html>

Which confusion?

So which practical implications does this confusion actually have?

For one, saying something like “insert an image after a <p> tag” ranges from “wrong” to “ambiguous”, since we can’t insert anything but a chunk of text after a <p> tag, and a <p> tag can be either a start one (<p>) or an end one (</p>). In this case, a better way would be to say “insert an <img> tag after a start <p> tag”:

  <p>
    <img ...> <!-- IMG tag is inserted after a start P tag -->
    ...
  </p>

in which case <img> element would become a child of <p> element. Or we could say — “insert an <img> tag after an end <p> tag”:

  <p>
    ...
  </p>
  <img ...> <!-- IMG tag is inserted after an end P tag -->

in which case <img> element would be a sibling following <p> one.

Of course, most of the time, what people really mean by “insert an image after a <P> tag” is the second version. It’s just that “element” is accidentally replaced with “tag”. An even better way, and one that avoids mentioning tags in the first place, is to say “insert an <IMG> element after a <P> element”. This version leaves no room for incorrect interpretation.

Global confusion

What’s interesting about all this is not so much the finer points of difference between tags and elements, but just how widely this misconception prevails. Google search returns 480,000 results for “div tag”, but only 137,000 for “div element”. For an empty element, such as img, the difference is even scarier — “img tag” returns 959,000 results, while “img element” only 48,200. An element is confused for a tag everywhere, from blogs, articles, and mailing lists to books, references, and frameworks.

Pedantry or an important distinction?

Once you start thinking about the distinction, edges become somewhat blurry. Are all of the examples above really wrong?

When describing “image_tag”, Ruby on Rails documentation says “Returns an html image tag …”. The returned string — “<img …>” — can actually very well be considered an image (start) tag. Yes, the string represents an element, but since an element is empty, it’s also a string that consists of <img> tag only, and so can probably be called an “image” tag.

At the same time, “javascript_include_tag” already crosses the line of correctness. Its description still says “Returns an html script tag”, yet it returns a string that can only be considered an element, “<script type="text/javascript" src="…"></script>”, since there’s now a start tag, content (empty), and an end tag.

w3schools is just plain wrong [1], saying things like “The <div> tag defines a division or a section in an HTML document.” or “The <div> tag is often used to group block-elements to format them with styles.”. Tags do not define division, they represent elements, and it is elements that have certain semantic meaning; in this case — division.

In some of the popular articles, we can find phrases like “… the nearer ancestor of our <footer> tag is the <body> tag …”, in which case it’s pretty clear that “tag” is not the right word at all; tags cannot be ancestors, but elements can.

However, saying that “browser supports <video> tag” is technically not wrong, since browsers that support the <video> element can most definitely parse and understand <video> tags as well (it is by recognizing video tags that they are able to create video elements in the DOM).

Speaking of DOM…

What about DOM?

Before I knew the difference between tags and elements, I would always think in terms of tags when talking about HTML, and in terms of elements when talking about DOM. It just made sense that HTML, being markup language, consists of tags, while HTML DOM — or rather, the document available for scripting — is a tree-like structure consisting of elements, and other kinds of nodes. I knew that browser parses HTML markup (and so tags), and then creates a tree-like structure to represent a document, in which case tags essentially become elements. The fact that elements are not just kinds of nodes, but are also chunks of text in markup seemed very strange when I first found out about it.

It seems that this is exactly how most of the people think about tags vs. elements. Tags exist in HTML (text), and elements – in document (DOM). This would explain why tags prevail in discussions about HTML, or markup in general; and why elements are mostly mentioned in context of scripting, rendering, etc.

Nevertheless, I believe that keeping terminology straight is important. Things should be called as they really are, to avoid the ambiguity that we’ve seen in the previous example. A method named something like forEachTag should not iterate over each element, and vice-versa; technical discussions, articles, and documentation should really strive to use proper terms.

What now?

The attempts at demystification were already made in the past, yet the effect is barely visible. So I wonder — why? Is it too unintuitive to speak in terms of elements in context of HTML, or is this a lack of explanation and exposure of the subject? Does the distinction even matter? Or does it matter in technical discussions only? Does it make sense to distinguish these two entities, or should we just try to infer the exact meaning based on the context, as it seems to be done right now? Are we all simply used to the word “tag”, and don’t care about the difference most of the time?

What do you think?

[1] …which is not surprising, considering the amount of other misconceptions on that site, such as classifying HTML comments as tags.



What’s wrong with extending the DOM

April 5th, 2010 by kangax

I was recently surprised to find out how little the topic of DOM extensions is covered on the web. What’s disturbing is that downsides of this seemingly useful practice don’t seem to be well known, except in certain secluded circles. The lack of information could well explain why there are scripts and libraries built today that still fall into this trap. I’d like to explain why extending DOM is generally a bad idea, by showing some of the problems associated with it. We’ll also look at possible alternatives to this harmful exercise.

But first of all, what exactly is DOM extension? And how does it all work?

How DOM extension works

DOM extension is simply the process of adding custom methods/properties to DOM objects. Custom properties are those that don’t exist in a particular implementation. And what are the DOM objects? These are host objects implementing Element, Event, Document, or any of dozens of other DOM interfaces. During extension, methods/properties can be added to objects directly, or to their prototypes (but only in environments that have proper support for it).

The most commonly extended objects are probably DOM elements (those that implement Element interface), popularized by Javascript libraries like Prototype and Mootools. Event objects (those that implement Event interface), and documents (Document interface) are often extended as well.

In an environment that exposes the prototype of Element objects, an example of DOM extension would look something like this:

  Element.prototype.hide = function() {
    this.style.display = 'none';
  };
  ...
  var element = document.createElement('p');
 
  element.style.display; // ''
  element.hide();
  element.style.display; // 'none'

As you can see, “hide” function is first assigned to a hide property of Element.prototype. It is then invoked directly on an element, and element’s “display” style is set to “none”.

The reason this “works” is because object referred to by Element.prototype is actually one of the objects in prototype chain of P element. When hide property is resolved on it, it’s searched throughout the prototype chain until found on this Element.prototype object.

In fact, if we were to examine prototype chain of P element in some of the modern browsers, it would usually look like this:

  // "^" denotes connection between objects in prototype chain
 
  document.createElement('p');
    ^
  HTMLParagraphElement.prototype
    ^
  HTMLElement.prototype
    ^
  Element.prototype
    ^
  Node.prototype
    ^
  Object.prototype
    ^
  null

Note how the nearest ancestor in the prototype chain of P element is object referred to by HTMLParagraphElement.prototype. This is an object specific to type of an element. For P element, it’s HTMLParagraphElement.prototype; for DIV element, it’s HTMLDivElement.prototype; for A element, it’s HTMLAnchorElement.prototype, and so on.

But why such strange names, you might ask?

These names actually correspond to interfaces defined in DOM Level 2 HTML Specification. That same specification also defines inheritance between those interfaces. It says, for example, that “… HTMLParagraphElement interface have all properties and functions of the HTMLElement interface …” (source) and that “… HTMLElement interface have all properties and functions of the Element interface …” (source), and so on.

Quite obviously, if we were to create a property on “prototype object” of paragraph element, that property would not be available on, say, anchor element:

  HTMLParagraphElement.prototype.hide = function() {
    this.style.display = 'none';
  };
  ...
  typeof document.createElement('a').hide; // "undefined"
  typeof document.createElement('p').hide; // "function"

This is because the anchor element’s prototype chain never includes the object referred to by HTMLParagraphElement.prototype, but instead includes the one referred to by HTMLAnchorElement.prototype. To “fix” this, we can assign to a property of an object positioned further in the prototype chain, such as the one referred to by HTMLElement.prototype, Element.prototype or Node.prototype.
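
The lookup rules themselves are plain Javascript and can be tried outside a browser. The sketch below imitates the chain with ordinary constructors; FakeNode, FakeElement, FakeHTMLElement, FakeParagraph and FakeAnchor are hypothetical stand-ins, not the real host objects, so it demonstrates only the inheritance, not any DOM behavior.

```javascript
// Hypothetical stand-ins for DOM interfaces (NOT real host objects)
function FakeNode() {}

function FakeElement() {}
FakeElement.prototype = Object.create(FakeNode.prototype);

function FakeHTMLElement() {}
FakeHTMLElement.prototype = Object.create(FakeElement.prototype);

function FakeParagraph() {}
FakeParagraph.prototype = Object.create(FakeHTMLElement.prototype);

function FakeAnchor() {}
FakeAnchor.prototype = Object.create(FakeHTMLElement.prototype);

// Defined on the paragraph-specific prototype only
FakeParagraph.prototype.hide = function () { return 'hidden'; };

// Defined further up the chain, so every element type inherits it
FakeElement.prototype.show = function () { return 'shown'; };

var p = new FakeParagraph();
var a = new FakeAnchor();

typeof p.hide; // "function"
typeof a.hide; // "undefined"
typeof a.show; // "function"

// The anchor's chain passes through FakeElement.prototype,
// but never through FakeParagraph.prototype
FakeElement.prototype.isPrototypeOf(a);   // true
FakeParagraph.prototype.isPrototypeOf(a); // false
```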

Similarly, creating a property on Element.prototype would not make it available on all nodes, but only on nodes of element type. If we wanted to have property on all nodes (e.g. text nodes, comment nodes, etc.), we would need to assign to property of Node.prototype instead. And speaking of text and comment nodes, this is how interface inheritance usually looks for them:

  document.createTextNode('foo'); // < Text.prototype < CharacterData.prototype < Node.prototype
  document.createComment('bar'); // < Comment.prototype < CharacterData.prototype < Node.prototype

Now, it’s important to understand that exposure of these DOM object prototypes is not guaranteed. DOM Level 2 specification merely defines interfaces, and inheritance between those interfaces. It does not state that there should exist global Element property, referencing object that’s a prototype of all objects implementing Element interface. Neither does it state that there should exist global Node property, referencing object that’s a prototype of all objects implementing Node interface.

Internet Explorer 7 (and below) is an example of such environment; it does not expose global Node, Element, HTMLElement, HTMLParagraphElement, or other properties. Another such browser is Safari 2.x (and most likely Safari 1.x).

So what can we do in environments that don’t expose these global “prototype” objects? A workaround is to extend DOM objects directly:

  var element = document.createElement('p');
  ...
  element.hide = function() {
    this.style.display = 'none'; 
  };
  ...
  element.style.display; // ''
  element.hide();
  element.style.display; // 'none'

What went wrong?

Being able to extend DOM elements through prototype objects sounds amazing. We are taking advantage of Javascript’s prototypal nature, and scripting DOM becomes very object-oriented. In fact, DOM extension seemed so temptingly useful that a few years ago, the Prototype Javascript library made it an essential part of its architecture. But what hides behind this seemingly innocuous practice is a huge load of trouble. As we’ll see in a moment, when it comes to cross-browser scripting, the downsides of this approach far outweigh any benefits. DOM extension is one of the biggest mistakes Prototype.js has ever made.

So what are these problems?

Lack of specification

As I have already mentioned, exposure of “prototype objects” is not part of any specification. DOM Level 2 merely defines interfaces and their inheritance relations. In order for an implementation to conform to DOM Level 2 fully, there’s no need to expose those global Node, Element, HTMLElement, etc. objects. Neither is there a requirement to expose them in any other way. Given that there’s always a possibility to extend DOM objects manually, this doesn’t seem like a big issue. But the truth is that manual extension is a rather slow and inconvenient process (as we will see shortly). And the fact that fast, “prototype object”-based extension is merely somewhat of a de-facto standard among a few browsers makes this practice unreliable when it comes to future adoption or portability across non-conventional platforms (e.g. mobile devices).

Host objects have no rules

The next problem with DOM extension is that DOM objects are host objects, and host objects are the worst bunch. By specification (ECMA-262 3rd ed.), host objects are allowed to do things no other objects can even dream of. To quote the relevant section [8.6.2]:

Host objects may implement these internal methods with any implementation-dependent behaviour, or it may be that a host object implements only some internal methods and not others.

The internal methods the specification talks about are [[Get]], [[Put]], [[Delete]], etc. Note how it says that internal methods behavior is implementation-dependent. What this means is that it’s absolutely normal for a host object to throw an error on invocation of, say, the [[Get]] method. And unfortunately, this isn’t just a theory. In Internet Explorer, we can easily observe exactly this, an example of host object [[Get]] throwing an error:

  document.createElement('p').offsetParent; // "Unspecified error."
  new ActiveXObject("MSXML2.XMLHTTP").send; // "Object doesn't support this property or method"

Extending DOM objects is kind of like walking in a minefield. By definition, you are working with something that’s allowed to behave in an unpredictable and completely erratic way. And not only can things blow up; there’s also a possibility of silent failures, which is an even worse scenario. Examples of erratic behavior are applet, object and embed elements, which in certain cases throw errors on assignment of properties. A similar disaster happens with XML nodes:

  var xmlDoc = new ActiveXObject("Microsoft.XMLDOM");
  xmlDoc.loadXML('<foo>bar</foo>');
  xmlDoc.firstChild.foo = 'bar'; // "Object doesn't support this property or method"

There are other cases of failures in IE, such as document.styleSheets[99999] throwing “Invalid procedure call or argument” or document.createElement('p').filters throwing “Member not found.” exceptions. But not only MSHTML DOM is the problem. Trying to overwrite “target” property of event object in Mozilla throws TypeError, complaining that property has only a getter (meaning that it’s readonly and can not be set). Doing same thing in WebKit, results in silent failure, where “target” continues to refer to original object after assignment.

When creating API for working with event objects, there’s now a need to consider all of these readonly properties, instead of focusing on concise and descriptive names.
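
Both outcomes described above for a readonly “target” (a thrown TypeError versus a silent failure) can be imitated with an ordinary object. This is only a sketch under assumptions: `fakeEvent` is a hypothetical stand-in with a getter-only property, and real host objects are free to behave differently.

```javascript
// Hypothetical stand-in for an event object: `target` has a getter but
// no setter, which is one way a readonly property can be modelled.
var fakeEvent = {};
Object.defineProperty(fakeEvent, 'target', {
  get: function () { return 'original'; },
  enumerable: true,
  configurable: false
});

// Reflect.set reports the failed write by returning false, without throwing.
// This mirrors the "silent failure" outcome.
var wasSet = Reflect.set(fakeEvent, 'target', 'replacement');

// In strict code, the same write throws a TypeError.
// This mirrors the "property has only a getter" outcome.
var strictError = null;
try {
  (function () {
    'use strict';
    fakeEvent.target = 'replacement';
  })();
} catch (e) {
  strictError = e;
}

fakeEvent.target; // still 'original' in both cases
```

Either way the write does not happen, so an API built on top of such objects has to route around every readonly name.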

A good rule of thumb is to avoid touching host objects as much as possible. Trying to base architecture on something that—by definition—can behave so sporadically is hardly a good idea.

Chance of collisions

API based on DOM element extensions is hard to scale. It’s hard to scale for developers of the library (when adding new or changing core API methods), and for library users (when adding domain-specific extensions). The root of the issue is a likely chance of collisions. DOM implementations in popular browsers usually all have proprietary APIs. What’s worse is that these APIs are not static, but constantly change as new browser versions come out. Some parts get deprecated; others are added or modified. As a result, the set of properties and methods present on DOM objects is somewhat of a moving target.

Given huge amount of environments in use today, it becomes impossible to tell if certain property is not already part of some DOM. And if it is, can it be overwritten? Or will it throw error when attempting to do so? Remember that it’s a host object! And if we can quietly overwrite it, how would it affect other parts of DOM? Would everything still work as expected? If everything is fine in one version of such browser, is there a guarantee that next version doesn’t introduce same-named property? The list of questions goes on.

Some examples of proprietary extensions that broke Prototype are wrap property on textareas in IE (colliding with Element#wrap method), and select method on form control elements in Opera (colliding with Element#select method). Even though both of these cases are documented, having to remember these little exceptions is annoying.

Proprietary extensions are not the only problem. HTML5 brings new methods and properties to the table. And most of the popular browsers have already started implementing them. At some point, WebForms defined replace property on input elements, which Opera decided to add to their browser. And once again, it broke Prototype, due to conflict with Element#replace method.

But wait, there’s more!

Due to long-standing DOM Level 0 tradition, there’s this “convenient” way to access form controls off of form elements, simply by their name. What this means is that instead of using standard elements collection, you can access form control like this:

  <form action="">
    <input name="foo">
  </form>
  ...
  <script type="text/javascript">
    document.forms[0].foo; // non-standard access
    // compare to
    document.forms[0].elements.foo; // standard access
  </script>

So, say you extend form elements with login method, which for example checks validation and submits login form. If you also happen to have form control with “login” name (which is pretty likely, if you ask me), what happens next is not pretty:

  <form action="">
    <input name="login">
    ...
  </form>
  ...
  <script type="text/javascript">
    HTMLFormElement.prototype.login = function(){ 
      return 'logging in'; 
    };
    ...
    $(myForm).login(); // boom!
    // $(myForm).login references input element, not `login` method
  </script>

Every named form control shadows properties inherited through prototype chain. The chance of collisions and unexpected errors on form elements is even higher.
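
The shadowing itself follows from ordinary property lookup and can be approximated without a browser. In the sketch below, `fakeFormPrototype` and `fakeForm` are hypothetical plain objects, and the named control is modelled as an own property; real browsers expose named controls through host-object lookup, so treat this as an approximation of the effect, not of the mechanism.

```javascript
// Hypothetical stand-in for HTMLFormElement.prototype with a custom method
var fakeFormPrototype = {
  login: function () { return 'logging in'; }
};

// Hypothetical stand-in for a form element inheriting from it
var fakeForm = Object.create(fakeFormPrototype);
var typeBefore = typeof fakeForm.login; // "function"

// Model a form control named "login" as an own property of the form
fakeForm.login = { tagName: 'INPUT', name: 'login' };
var typeAfter = typeof fakeForm.login; // "object"

// The inherited method is still there, but it is no longer reachable by name
var callFailed = false;
try {
  fakeForm.login();
} catch (e) {
  callFailed = e instanceof TypeError;
}
```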

Situation is somewhat similar with named form elements, where they can be accessed directly off document by their names:

  <form name="foo">
    ...
  </form>
  ...
  <script type="text/javascript">
    document.foo; // [object HTMLFormElement]
  </script>

When extending document objects, there’s now an additional risk of form names conflicting with extensions. And what if script is running in legacy applications with tons of rusty HTML, where changing/removing such names is not a trivial task?

Employing some kind of prefixing strategy can alleviate the problem. But will probably also bring extra noise.

Not modifying objects you don’t own is an ultimate recipe for avoiding collisions. Breaking this rule already got Prototype into trouble, when it overwrote document.getElementsByClassName with own, custom implementation. Following it also means playing nice with other scripts, running in the same environment—no matter if they modify DOM objects or not.

Performance overhead

As we’ve seen before, browsers that don’t support element extensions (like IE 6, 7, Safari 2.x, etc.) require manual object extension. The problem is that manual extension is slow, inconvenient and doesn’t scale. It’s slow because the object needs to be extended with what’s often a large number of methods/properties. And ironically, these browsers are the slowest ones around. It’s inconvenient because the object needs to be first extended in order to be operated on. So instead of document.createElement('p').hide(), you would need to do something like $(document.createElement('p')).hide(). This, by the way, is one of the most common stumbling blocks for beginners of Prototype. Finally, manual extension doesn’t scale well because adding API methods affects performance pretty much linearly. If there are 100 methods on Element.prototype, there have to be 100 assignments made to the element in question; if there are 200 methods, there have to be 200 assignments made to the element, and so on.
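
A rough sketch of what such manual extension involves is shown below. It is not Prototype’s actual implementation: `ElementMethods`, `extend`, the `_extended` flag and the counter are hypothetical names, and a plain object stands in for a DOM element. The point is only that every method costs one assignment per element.

```javascript
// Hypothetical method table, standing in for a library's element API
var ElementMethods = {
  hide: function () { this.style.display = 'none'; return this; },
  show: function () { this.style.display = ''; return this; }
};

// Counts property assignments, to make the linear cost visible
var assignmentCount = 0;

// Hypothetical helper: copies every method onto the object itself
function extend(element) {
  if (element._extended) {
    return element; // already extended, skip the copying
  }
  for (var name in ElementMethods) {
    if (Object.prototype.hasOwnProperty.call(ElementMethods, name)) {
      element[name] = ElementMethods[name];
      assignmentCount++;
    }
  }
  element._extended = true;
  return element;
}

// A plain object standing in for a DOM element
var fakeElement = { style: { display: '' } };

extend(fakeElement).hide();
fakeElement.style.display; // 'none'
assignmentCount;           // 2, one assignment per method
```

The flag avoids repeating the work for the same object, but every new element still pays the full price once.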

Another performance hit is with event objects. Prototype follows a similar approach with events and extends them with a certain set of methods. Unfortunately, some events in browsers (mousemove, mouseover, mouseout, resize, to name a few) can fire literally dozens of times per second. Extending each one of them is an incredibly expensive process. And what for? Just to invoke what could be a single method on an event object?

Finally, once you start extending elements, library API most likely needs to return extended elements everywhere. As a result, querying methods like $$ could end up extending every single element in a query. It’s easy to imagine the performance overhead of such a process when we’re talking about hundreds or thousands of elements.

IE DOM is a mess

As shown in previous section, manual DOM extension is a mess. But manual DOM extension in IE is even worse, and here’s why.

We all know that in IE, circular references between host and native objects leak, and are best avoided. But adding methods to DOM elements is a first step towards creation of such circular references. And since older versions of IE don’t expose “object prototypes”, there’s not much to do but extend elements directly. Circular references and leaks are almost inevitable. And in fact, Prototype suffered from them for most of its lifetime.

Another problem is the way IE DOM maps properties and attributes to each other. The fact that attributes are in the same namespace as properties increases the chance of collisions and all kinds of unexpected inconsistencies. What happens if an element has a custom “show” attribute and is then extended by Prototype? You’ll be surprised, but the “show” attribute would get overwritten by Prototype’s Element#show method. extendedElement.getAttribute('show') would return a reference to a function, not the value of the “show” attribute. Similarly, extendedElement.hasAttribute('hide') would say “true”, even if there was never a custom “hide” attribute on the element. Note that IE<8 lacks hasAttribute, but we could still see the attribute/property conflict: typeof extendedElement.attributes['show'] != "undefined".

Finally, one of the lesser-known downsides is the fact that adding properties to DOM elements causes reflow in IE, so mere extension of element becomes a quite expensive operation. This actually makes sense, given the deficient mapping of attributes and properties in its DOM.

Bonus: browser bugs

If everything we’ve been over so far is not enough (in which case, you’re probably a masochist), here are a couple more bugs to top it all off.

In some versions of Safari 3.x, there’s a bug where navigating to a previous page via back button wipes off all host object extensions. Unfortunately, the bug is undetectable, so to work around the issue, Prototype has to do something horrible. It sniffs browser for that version of WebKit, and explicitly disables bfcache by attaching “unload” event listener to window. Disabled bfcache means that browser has to re-fetch page when navigating via back/forward buttons, instead of restoring page from the cached state.

Another bug is with HTMLObjectElement.prototype and HTMLAppletElement.prototype in IE8, and the way object and applet elements don’t inherit from those prototype objects. You can assign to a property of HTMLObjectElement.prototype, but that property is never “resolved” on object element. Ditto for applets. As a result, those elements always have to be extended manually, which is another overhead.

IE8 also exposes only a subset of prototype objects, when compared to other popular implementations. For example, there’s HTMLParagraphElement.prototype (as well as other type-specific ones), and Element.prototype, but no HTMLElement (and so HTMLElement.prototype) or Node (and so Node.prototype). Element.prototype in IE8 also doesn’t inherit from Object.prototype. These are not bugs, per se, but are something to keep in mind nevertheless: there’s nothing good about trying to extend a non-existent Node, for example.

Wrappers to the rescue

One of the most common alternatives to this whole mess of DOM extension is object wrappers. This is the approach jQuery has taken from the start, and few other libraries followed later on. The idea is simple. Instead of extending elements or events directly, create a wrapper around them, and delegate methods accordingly. No collisions, no need to deal with host objects madness, easier to manage leaks and operate in dysfunctional MSHTML DOM, better performance, saner maintenance and painless scaling.
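
A minimal sketch of the wrapper idea follows. It is not jQuery’s implementation; `ElementWrapper` and `wrap` are hypothetical names, and a plain object stands in for a DOM element so the example runs anywhere.

```javascript
// Hypothetical wrapper: holds a reference to the element
// instead of adding anything to it
function ElementWrapper(element) {
  this.element = element;
}

// Methods live on the wrapper's own prototype, which we fully control
ElementWrapper.prototype.hide = function () {
  this.element.style.display = 'none';
  return this; // allows chaining
};

ElementWrapper.prototype.show = function () {
  this.element.style.display = '';
  return this;
};

function wrap(element) {
  return new ElementWrapper(element);
}

// A plain object standing in for a DOM element
var wrappedTarget = { style: { display: '' } };

wrap(wrappedTarget).hide();
wrappedTarget.style.display; // 'none'
'hide' in wrappedTarget;     // false, the element itself is untouched
```

However many methods the API grows to, creating a wrapper is a single object allocation, and nothing is ever written to the host object except what a method explicitly sets.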

And you still avoid procedural approach.

Prototype 2.0

The good news is that the Prototype mistake is something that’s going away in the next major version of the library. As far as I know, all core developers understand the problems mentioned above, and that the wrapper approach is the saner way to move forward. I’m not sure what the plans are in other DOM-extending libraries like Mootools. From what I can see they are already using wrappers with events, but still extend elements. I’m certainly hoping they move away from this madness in the near future.

Controlled environments

So far we looked at DOM extension from the point of view of cross-browser scripting library. In that context, it’s clear how troublesome this idea really is. But what about controlled environments? When script is only run in one or two environments, such as those based on Gecko, WebKit or any other modern non-MSHTML DOM. Perhaps it’s an intranet application, that’s accessed through certain browsers. Or a desktop, WebKit-based app.

In that case, the situation is definitely better. Let’s look at the points listed above.

Lack of specification becomes somewhat irrelevant, as there’s no need to worry about compatibility with other platforms, or future editions. Most of the non-MSHTML DOM environments expose DOM object prototypes for quite a while, and are unlikely to drop it in a near future. There’s still a possibility for change, however.

Point about host objects unreliability also loses its weight, since host objects in Gecko or WebKit -based DOMs are much, much saner than those in MSHTML DOM. But they are still host objects, and so should be treated with care. Besides, there are readonly properties covered before, which could easily cripple the flexibility of API.

The point about collisions still holds weight. These environments support non-standard form controls access, have proprietary API, and are constantly implementing new HTML5 features. Modifying objects you don’t own is still a wicked idea and can lead to hard-to-find bugs and inconsistencies.

Performance overhead is practically non-existent, as these DOMs support prototype-based DOM extension. Performance can actually be even better compared to, say, the wrappers approach, as there’s no need to create any additional objects in order to invoke methods (or access properties) off DOM objects.

Extending DOM in a controlled environment sure seems like a perfectly healthy thing to do. But even though the main remaining problem is collisions, I would still advise employing wrappers instead. It’s a safer way to move forward, and will save you from maintenance overhead in the future.

Afterword

Hopefully, you can now clearly see all the truth behind what looks like an elegant approach. Next time you design a Javascript framework, just say no to DOM extensions. Say no, and save yourself from all the trouble of maintaining a cumbersome API and suffering unnecessary performance overheads. If, on the other hand, you’re considering employing a Javascript library that extends DOM, stop for a second, and ask yourself if you’re willing to take the risk. Is the elusive convenience of DOM extension really worth all the trouble?


onload=function(){} considered harmful

December 9th, 2009 by kangax

Harmful pattern

There seems to be a new pattern appearing on the web — attaching window load listener through undeclared assignment:

  onload = function(){
    /* ... */
  };

I’d like to explain why it’s a good idea to avoid it.

A conventional approach to perform this task is to explicitly assign to window.onload property. That is, not counting other means like DOM L2 methods — addEventListener (as well as proprietary attachEvent), or intrinsic event attributes — <body onload="...">:

  window.onload = function(){
    /* ... */
  };

How does it work?

A tempting “short” version takes advantage of Javascript loose nature with regards to variable declarations. In Javascript, assigning to undeclared variable actually creates a property on a Global Object — global property. Since Global Object in browsers is usually a window object (or at least it often behaves that way), undeclared assignment essentially results in creation of property on window. As long as Global Object and window are the same entity, window.onload = ... and onload = ... should have identical results. At least, that’s how it is in theory, and in practice there are more implications, as you will see later on.
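
The effect is easy to observe in any environment, not only in browsers. The sketch below uses `globalThis` as a stand-in for `window`, and `hypotheticalGlobalFlag` is a made-up name. The assignment is run through the Function constructor, whose body is always compiled as non-strict global code, so the result does not depend on how the surrounding script is loaded.

```javascript
// The body contains an assignment to an identifier that was never declared
var assignUndeclared = new Function('hypotheticalGlobalFlag = 42;');

var existedBefore = 'hypotheticalGlobalFlag' in globalThis; // false

assignUndeclared();

// The assignment created a property on the Global Object
var existsAfter = 'hypotheticalGlobalFlag' in globalThis;   // true
var observedValue = globalThis.hypotheticalGlobalFlag;      // 42

// Properties created this way can be deleted again; clean up after the demo
delete globalThis.hypotheticalGlobalFlag;
```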

So if two are identical, why would we ever prefer longer version?

Because shorter one relies on undeclared assignment.

Who cares?

Undeclared assignments have been frowned upon for a long time, and rightfully so. Global variables declared locally are hard to maintain and generally cause confusion. It’s not always clear whether such assignments are intentional or simply an oversight. This is why validators like JSLint have been emitting warnings when encountering them.

MSHTML peculiarities

Another reason to avoid undeclared assignments is due to rather destructive behavior of MSHTML DOM. When undeclared assignment happens in IE, an obscure error is thrown if identifier is named as id or name of one of the elements in a document:

  <p id="foo"></p>
  <form name="bar" action=""><p></p></form>
 
  <script type="text/javascript">
    try {
      foo = 1;
    } 
    catch(e) {
      document.write(e); // TypeError: Object doesn't support this property or method
    }
    try {
      bar = 1;
    } 
    catch(e) {
      document.write(e); // ReferenceError: Illegal assignment
    }
  </script>

Note that plain variable declarations in global scope, or explicit assignments have no such problems:

  <p id="foo"></p>
  <form name="bar" action=""><p></p></form>
 
  <script type="text/javascript">
    var foo = 1; // declares (and initializes) global `foo` variable
    window.foo = 1; // assigns to a "foo" property of `window` object
    this.foo = 1; // assigns to a "foo" property of Global Object
  </script>

But “onload” is different!

Technically speaking, the case with onload = function(){ } can be considered an exception. After all, the intention to create a global onload property is rather clear there. It’s also unlikely that there will be an element with such an id/name (although, you never know!). There is, however, another problem: the strict mode of ECMA-262 5th edition, the standard for an upcoming version of the ECMAScript language, officially approved only a few days ago.

Strict what?

The premise of strict mode is to provide a higher security level for an ECMAScript program (or part of it): avoid features that are considered error-prone or inefficient; employ stricter error checking; provide increased performance. One such “stricter error check” happens to be the one with undeclared assignments, which simply throw an error:

  "use strict";
  onload = function(){ // ReferenceError
    /* ... */
  };
  window.onload = function(){ // Works as expected
    /* ... */
  };

Now, it’s worth mentioning that not all browsers would throw error. Some of them (e.g. WebKit) actually have properties corresponding to event handlers — such as “onload” — declared implicitly, before script execution occurs. Those that don’t — such as Firefox or Opera — would miserably fail.

  "onload" in window; // true in WebKit, but not Firefox or Opera
  window.onload; // `null` in WebKit, `undefined` in Firefox or Opera
  onload; // `null` in WebKit, ReferenceError in Firefox or Opera
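
Both behaviors can be reproduced outside a browser. In the sketch below, `strictAssign` and `hypotheticalOnload` are made-up names and `globalThis` stands in for `window`; the Function constructor is used so that the strict code runs in global scope. Pre-creating the property imitates an implementation that declares event handler properties before script execution.

```javascript
// Hypothetical helper: runs `name = function(){}` as strict global code
// and reports whether the assignment succeeded
function strictAssign(name) {
  var assign = new Function('"use strict"; ' + name + ' = function () {};');
  try {
    assign();
    return 'ok';
  } catch (e) {
    return e.name;
  }
}

// No such global property yet: strict mode refuses to create one
var withoutProperty = strictAssign('hypotheticalOnload'); // 'ReferenceError'

// Imitate an environment where the property already exists (as `null`)
globalThis.hypotheticalOnload = null;

// Now the identifier resolves, so this is an ordinary assignment
var withProperty = strictAssign('hypotheticalOnload');    // 'ok'

// Clean up after the demo
delete globalThis.hypotheticalOnload;
```

The same line of code either works or throws depending on what the environment happened to define beforehand, which is a good reason not to rely on it.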

Does it really matter?

It’s good to understand that strict mode is not a requirement, but is merely an option. It is there to provide stricter rules for those who need it, and are willing to cope with (and enjoy) consequences. So if you are planning to make your code “strict”, don’t forget to avoid undeclared assignments — even as innocent-looking as “onload” ones.
