Extending built-in native objects. Evil or not?
A few days ago, Nick Morgan asked my opinion on extending native objects. The question came up while trying to answer: “why doesn’t underscore.js extend built-ins”? Why doesn’t it define all those Array methods (like map, forEach, every) on Array.prototype? Why does it put them under the _ “namespace” (_.each, _.map, _.every, etc.)? Is it because extending built-in natives is evil? Or is it not? The thread quickly filled with conflicting ideas…
I often see this confusion about extending things in Javascript.
There’s a big difference between extending native built-in objects and extending host objects. I tried to explain what’s wrong with extending host objects in a blog post, a while back. Now, if you look at the list of problems with extending host objects it’s easy to see that most of them don’t really apply to native, built-in objects.
To avoid any confusion, by native, built-in objects I’m talking about objects and methods introduced in ES5 — Array.prototype extensions (forEach, map, reduce, etc.), Object extensions (Object.create, Object.keys, etc.), Function.prototype.bind, String.prototype.trim, JSON.*, etc. These are the things that are shimmed most often. And the question is — is it OK to extend native, built-in objects with these standardized methods?
Well, let’s quickly go over problems with host objects extension:
Host vs. Native
- “Lack of specification” doesn’t apply here, as long as the methods being shimmed are part of ES5 (or ES3). ES5 is a standard. There’s a publicly available specification. Implementing ES5 methods according to spec is doable (except for certain edge-ish cases).
- “Host objects have no rules” doesn’t apply either. These are native objects we’re dealing with, and the semantics of native objects are very well defined in those same ECMA-262 specifications. What this means in practice is that unless we’re dealing with faulty implementations, adding a bind method to Function.prototype should just work. There’s no uncertainty about Function.prototype throwing an error on extension, or silently ignoring our command (after all, the spec says: “The initial value of the [[Extensible]] internal property of the Function prototype object is true”). Ditto for other objects.
- “Chance of collisions” is non-existent as well. Since the methods that are being shimmed are part of a standard, and we’re shimming them according to the standard, there’s no chance of collisions of any sort. Either the implementation has those methods, or it doesn’t. If it doesn’t, the methods are shimmed. That’s it.
- “Performance overhead” not only doesn’t exist, but the opposite could actually be true. It’s likely that [].forEach(...) will be faster than _.forEach([], ...), but even if it isn’t, there should certainly be no performance hit with the former version. Contrary to DOM objects that might not have their [[Prototype]]s exposed for public extension, there’s no need to manually extend arrays, objects and strings with these methods. Conceptually, there’s no performance overhead there.
- “IE DOM is a mess” doesn’t apply. We’re not dealing with the DOM. And native objects are extension-friendly in IE, as far as I know.
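The points above boil down to a feature test followed by a plain assignment. Here is a minimal sketch using String.prototype.trim as the example. The name trimFallback is hypothetical, and the whitespace character class is an approximation of the spec's definition, so treat this as an illustration of the pattern rather than a verified compliant shim:

```javascript
// Fallback kept in its own variable so it can be exercised directly,
// even in environments that already provide a native trim.
var trimFallback = function () {
  "use strict"; // keeps `this` from being coerced to the global object
  if (this === null || this === undefined) {
    throw new TypeError("String.prototype.trim called on null or undefined");
  }
  // ASSUMPTION: this character class approximates ES5 whitespace and
  // line terminators; old engines differ in what \s matches.
  return String(this).replace(/^[\s\uFEFF\xA0]+|[\s\uFEFF\xA0]+$/g, "");
};

// Either the implementation has the method, or it doesn't.
// Only in the latter case is anything added.
if (!String.prototype.trim) {
  String.prototype.trim = trimFallback;
}
```

Because the assignment is guarded, a native implementation is never overwritten, which is what keeps the "chance of collisions" argument valid.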
So what do we have?
Well, it looks like properly extending native objects — unlike host ones — is actually not all that bad. This is of course considering that we’re talking about standard objects and methods. Extending native built-ins with custom methods immediately makes “collision” problem apparent. It violates “don’t modify objects you don’t own” principle, and makes code not future-proof.
Downsides
Are there any downsides?
Well, for one, there are cases when certain scripts mess up native objects/methods in a non-compliant way. Kind of like what Prototype.js does with some of its methods (e.g. Array.prototype.map or Array.prototype.reverse; standard-compliance is planned for future releases, as far as I know). If the shim adds standard-compliant methods, and the application expects those methods to be non-compliant (but script/library-specific), then there could obviously be problems.
Second, as I mentioned above, while we know that native objects are free for extension, there’s always a risk of running into an oddball environment which doesn’t conform to spec. Keeping methods on a standalone (user-defined) object can avoid such scenarios. Whether this could be considered an issue depends on how paranoid you are.
Finally, you have to be careful when shimming methods that are not universally shimmable. Like Object.create, which had a very popular non-compliant shim floating around for a while. The method was defined directly on an Object, but silently failed to do anything useful with second argument — a set of property descriptors. Adding cross-browser support for property descriptors is a rather complicated endeavor, which is why defining such methods on a standalone object could save you some trouble (you could just implement a subset of Object.create functionality and call it a day).
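To make the "implement a subset on a standalone object" option concrete, here is a rough sketch. The util object and the createObject name are hypothetical and not taken from any library. The function covers only the first argument of Object.create and refuses a second argument loudly instead of ignoring it:

```javascript
// Hypothetical standalone utility object (not part of any real library).
var util = {};

// Subset of Object.create: prototype argument only, no property descriptors.
util.createObject = function (proto) {
  if (arguments.length > 1 && arguments[1] !== undefined) {
    // Failing loudly avoids the "silently does nothing useful" trap.
    throw new Error("util.createObject: property descriptors are not supported");
  }
  if (proto === null || (typeof proto !== "object" && typeof proto !== "function")) {
    // A null prototype cannot be emulated with the constructor trick below.
    throw new TypeError("util.createObject: prototype must be a non-null object");
  }
  function Temp() {}
  Temp.prototype = proto;
  return new Temp();
};

// Usage
var base = { greet: function () { return "hi"; } };
var child = util.createObject(base);
```

Since the method does not live on Object, nobody can mistake it for the standard one or feature-detect it by accident.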
Don’t forget that writing proper, compliant shims is hard. When in doubt, use standalone object. When the method you’re shimming is part of the unfinished spec, use standalone object. Only when you’re certain about method compliance and method is part of the finished, future-proof specification, is it safe to shim native object directly.
Enumerability
Another interesting, but likely insignificant, difference is the enumerability of shimmed methods. Unless methods are added using the ES5 additions that allow property enumerability to be specified (Object.defineProperty or Object.defineProperties), they end up being enumerable:
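if (!Array.prototype.map) {
  // plain assignment creates an enumerable property
  Array.prototype.map = function() { /* ... */ };
}
Object.keys(Array.prototype); // ["map"]

// can be worked around:
if (!Array.prototype.map) {
  // enumerable defaults to false with defineProperty;
  // writable and configurable are set to match built-in methods
  Object.defineProperty(Array.prototype, 'map', {
    value: function() { /* ... */ },
    writable: true,
    configurable: true
  });
}
Object.keys(Array.prototype); // []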
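Where enumerability tends to show up in practice is for-in loops over arrays. The sketch below compares the two approaches with a made-up method name, hypotheticalShim, chosen so that it cannot clash with a real built-in. It cleans up after itself and needs an ES5 environment for the second half:

```javascript
// 1. Plain assignment: the method is enumerable and leaks into for-in.
Array.prototype.hypotheticalShim = function () {};

var seenWithAssignment = [];
for (var key in ["a", "b"]) {
  seenWithAssignment.push(key);
}
// seenWithAssignment now includes "hypotheticalShim" next to the indices

delete Array.prototype.hypotheticalShim;

// 2. Object.defineProperty: the method stays out of for-in.
Object.defineProperty(Array.prototype, "hypotheticalShim", {
  value: function () {},
  enumerable: false,
  writable: true,
  configurable: true // configurable so that it can be removed again below
});

var seenWithDefineProperty = [];
for (var key2 in ["a", "b"]) {
  seenWithDefineProperty.push(key2);
}
// seenWithDefineProperty contains only the indices "0" and "1"

delete Array.prototype.hypotheticalShim; // clean up
```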
Underscore.js and API consistency
Getting back to underscore.js, I see an important aspect of consistency. Underscore adds not only standard methods like map, reduce and trim, but also its own, custom ones — values, extend, clone, etc. By adding map, reduce, and trim to standalone object, it keeps its API consistent.
I’d like to also mention that I do extend Array.prototype in fabric.js with methods like forEach, map, every. I make sure those methods are spec-compliant, and I take a risk of conflicts with libraries that shim methods in non-compliant way. Methods that are non-standard, on the other hand, are defined under standalone utility object. I’m not worried much about inconsistency, since — unlike in underscore.js — there’s only a handful of shimmed methods.
So there you have it. It should now be clear that extending native built-ins is definitely not as risky as messing with host objects. Do it carefully, follow the spec closely, and use your reasonable judgement. For spec-compliant shims, MDN is a good place to start (but don’t trust it blindly either, as there were cases of non-compliance there as well).
WebReflection said on Aug 8, 2011 @ 12:15
#1 I keep following a “shim what you need” approach, which is both consistent and not prone to conflicts.
(function () {
  // I always have my private scope
  var indexOf = [].indexOf || function (v) {
    for (var i = this.length; i-- && this[i] !== v;);
    return i;
  };
}());
It does not matter if it’s ES5 compliant if the reason I am using indexOf is only to avoid duplicates in some stack or to check whether an array already contains some value.
It is also minifier friendly since:
var i = indexOf.call(arr, value);
// will be
var i=d.call(a,v);
// which is shorter than
var i=a.indexOf(v);
If anybody wants to bring performance into it, I find full shims extremely slow compared with ad-hoc ones, so even if .call may be slightly slower than a direct property invocation, performance is gained through the simpler logic.
Do I want to shim everything? It depends, but in projects where many libraries may be involved, each of them will shim internally or try to shim the full spec, potentially bringing in code size and logic that is never used. This is the worst downside I know of regarding native prototype pollution.
WebReflection said on Aug 8, 2011 @ 12:28
#2 P.S. indexOf was only an example; I do the same with defineProperty or any other ES5 method, considering that the way I use them is enough for what I need. Things are different when it comes to framework creation, since third parties would expect more “full spec” behavior there.
It would be really nice if we could create a “shim and download only what you need” service, similar to what I did here some time ago, except all major libraries should agree on using this service rather than reinventing the wheel for each framework.
David Bruant said on Aug 8, 2011 @ 12:31
#3 Excellent article! Thanks!
I’ve been thinking about ES5 shims and enumerability and I’d like to say a word on that.
As it turns out, as of August 2011 the only web browsers with relevant market share that would need Array.prototype methods to be shimmed are IE6/7/8 (source: http://kangax.github.com/es5-compat-table/ and http://gs.statcounter.com/), which also turn out not to have Object.defineProperty (IE8 doesn’t count since it only supports it on DOM objects). So we’re in a situation where browsers with Object.defineProperty don’t need to be shimmed and some without it do. The enumerability problem is actually not solvable.
Corey Ballou said on Aug 8, 2011 @ 12:49
#4 Good read. Care to weigh in on the Sugar.JS implementation? It got a fair amount of flak in HN comments recently and was being directly compared to underscore.js in its utilitarian purpose.
Garrett said on Aug 8, 2011 @ 13:16
#5 WebReflection’s code (comment 1) shows another problem, this time with the Native First Dual Approach.
The Native First Dual Approach (NFD) pattern uses an adapter function that runs native code where supported and user-defined code where not supported. NFD fails when the native method has different results from yours. This can happen when the native implementations are buggy or when your implementation is buggy.
Trying WebReflection’s code, notice the different results depending on if [].indexOf is present:
var a = [1,2,3,4,1];
a.indexOf(1);
Results “0” where natively supported, “4” everywhere else. It is also missing fromIndex, so not suitable as a shim. WebReflection keeps this method private, and so even though it is buggy, he may have control over things such as input arguments, so having always unique arrays as input, or never having a fromIndex.
Not only is writing compliant shims difficult and complex, but defining your own object and using NFD carries many of the same problems.
Kangax brings up this problem with “there’s also a risk of running into an oddball environment that doesn’t conform to the spec”. However, defining your methods on your own (user-defined) object won’t inherently avoid native problems. New features tend to be buggy. ES5 JSON and Date methods were wildly inconsistent and buggy when first implemented in WebKit and Gecko, both major implementations. Before using new methods, test to make sure they actually work, and where native support is absent or buggy, use your fallback.
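One way to read the advice above is an NFD adapter that only trusts the native method after a few cheap sanity checks. The following is a sketch under assumptions: all helper names are hypothetical, the checks are examples rather than an exhaustive compliance test, and the fallback aims to follow the ES5 indexOf behavior (search from the front, honor fromIndex, clamp large negative values) without claiming full spec compliance:

```javascript
// Fallback: searches from the start and honors fromIndex.
function indexOfFallback(array, value, fromIndex) {
  var length = array.length >>> 0;
  var i = Number(fromIndex) || 0;            // undefined and NaN become 0
  i = i < 0 ? Math.ceil(i) : Math.floor(i);  // truncate toward zero
  if (i < 0) {
    i = Math.max(length + i, 0);             // clamp, so huge negatives start at 0
  }
  for (; i < length; i++) {
    if (i in array && array[i] === value) {  // skip holes in sparse arrays
      return i;
    }
  }
  return -1;
}

// Run once: does the native method exist and pass a few known cases?
function nativeIndexOfLooksSane() {
  if (typeof Array.prototype.indexOf !== "function") {
    return false;
  }
  var sample = [1, 2, 3, 4, 1];
  return sample.indexOf(1) === 0 &&
         sample.indexOf(1, 1) === 4 &&
         sample.indexOf(1, -1) === 4 &&
         sample.indexOf(5) === -1;
}

var useNative = nativeIndexOfLooksSane();

// Adapter kept on a standalone function; nothing is added to Array.prototype.
function indexOf(array, value, fromIndex) {
  return useNative
    ? Array.prototype.indexOf.call(array, value, fromIndex)
    : indexOfFallback(array, value, fromIndex);
}
```

With the [1,2,3,4,1] array from this comment, both paths agree on the first match, which is the property the private snippet in comment 1 does not have.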
Lea Verou said on Aug 8, 2011 @ 14:23
#6 Thank you so much for writing this. The information and opinions in this article are invaluable for my upcoming JSConf EU talk (“Polyfilling the gaps”).
Gianni "gf3" Chiappetta said on Aug 8, 2011 @ 16:12
#7 It’s nice to hear someone else say it. Let’s use Javascript as it was intended to be used.
Quildreen Motta said on Aug 8, 2011 @ 18:46
#8 Ah, nice write up :3
Recently I started a library for extending the core javascript library, and had a bit of a hard time deciding between extending the native objects directly or using a separate object. I even considered jDalton’s sandboxing approach, but in the end decided on the standalone object because it was a library and I really didn’t want performance overhead or conflicts with other people’s libraries.
But since I think that using two different objects to treat the same kind of data (and also because I much prefer functional style to OO-style), I went with keeping everything in a stand-alone object, but allowing people to easily unpack those methods as generic functions inside a native constructor, or as own methods inside the [[Prototype]], or as generic functions in the global object. This way I have to worry less about conflicts because it’s up to the user to define how they’ll use it :3
See: https://github.com/killdream/Black/blob/master/src/core.js#L17-L38
Still… I’m not sure what would be the problems of not having those properties enumerable (unless you care about knowing the properties that are in the prototypes at run-time).
@WebReflect
I still can’t understand libraries that provide fallbacks to ES5 methods, when those libraries have nothing to do with providing fallbacks to ES5 methods. It just feels wrong to me, and leads to lots of duplicated dead code =/ Yes, you gain on “ease of use”, but I’d argue that for a library that intends to be used with other libraries, letting the user provide the fallbacks himself (with something like es5-shim) makes much more sense.
Christopher Hunt said on Aug 8, 2011 @ 20:49
#9 I think that it depends on *where* you extend native objects (and non-native objects for that matter). If you’re doing the extension within your own application then I would say that is fine. However, if you are doing it within a library then exercise extreme caution. Don’t forget also that by assigning symbols like ‘_’ and ‘$’ you’re extending the native browser Window object. You’ll notice that libraries are generally cautious in their approach to this and do it only when the property is not already declared.
John-David Dalton said on Aug 8, 2011 @ 21:01
#10 @kangax I’d like to emphasize again:
The Array#indexOf shim in fabric.js fails to correctly support a negative from argument, so a large negative value could lock up the browser.
WebReflection said on Aug 8, 2011 @ 21:37
#11 @Garrett I wonder if you actually read my comment or stopped at the raw snippet. I re-cite myself:
It does not matter if it’s ES5 compliant if the reason I am using indexOf is only to avoid duplicates in some stack or to check whether an array already contains some value.
Everything you said about NFD makes sense for frameworks that would like to be adopted/used out there but if I partially shim for my own purpose in my own private scope, exactly what I have done there, nobody, even you, can say that indexOf for what I need is wrong.
The only exception would be indeed a buggy implementation but if we have to detect “fully specs compliant” every single native method rather than file bugs all our apps will take seconds or minutes before they can work cross browser ( included mobile ).
@Quildreen Motta as already explained to Garrett, I was not suggesting bringing partial shims into libraries; I was suggesting partial shims for private usage only, gaining performance in terms of logic and final code size.
Hope it’s more clear now, Regards.
WebReflection said on Aug 8, 2011 @ 21:40
#12 @Garrett, @Quildreen Motta, just in case it’s still not clear, this is a perfectly valid example of that shim. No extra code needed, no need for fromIndex, no need to fully shim indexOf. Cheers
Brandon Benvie said on Aug 9, 2011 @ 0:00
#13 What about running your code in a sandbox, or a reverse sandbox really? Bootstrapping your loader so they’re loaded in an iframe’s context with references to the main window. This gives you the freedom to modify native objects that are only seen by code you control. It also allows some modification of host objects as long as you’re prepared to deal with any host weirdness.
WebReflection said on Aug 9, 2011 @ 1:20
#14 @Brandon Benvie if you extend Array.prototype in a sandbox and then just .concat() it in the global thread, it will return a copy of the current global thread [[Class]], which is the Array constructor without the extended prototype. @John-David Dalton can give you many more details about this and many other related problems, since he is the main alchemist behind the fuse.js project, a framework basically based on sandboxes.
Brandon Benvie said on Aug 9, 2011 @ 4:15
#15 I’ve looked at fuse before and I was wondering about kind of inverting it. All the sandbox stuff I’ve seen so far revolves around exporting natives out of an iframe (or faking them in some other manner, which fuse has a few methods for).
I’ve been thinking about simply running ALL of a module’s code inside an iframe, and its interaction with the main window would be in attaching event listeners, creating/removing/moving DOM elements, and setting properties on DOM elements. Perhaps even keeping element creation inside the sandbox and then adopting into the main DOM when needed.
In this case most of your work is being done directly in your own private namespace without needing to fidget with stolen external natives.
WebReflection said on Aug 9, 2011 @ 4:42
#16 In that case have a look at Elsewhere which seems to be closer to what you are looking for
I don’t post the link here or @jdalton blames me for spam :P
look for “webreflection elsewhere” in your favorite search Gengine ;-)
Brandon Benvie said on Aug 9, 2011 @ 6:14
#17 Ha that’s very close to what I was thinking, though a bit more involved. The basic implementation I’m playing with does little more than create an iframe and provide a function to attach a script element to it. This script would just be the entry point/controller for whatever stuff you’d normally do in the main window. The idea is this is a trusted script (it’s your own stuff) so it doesn’t need any sort of boxing or protection. You get pretty much transparent access to the main window but in your own context and it (so far it seems) requires very little alteration to existing stuff.
I’ll disclaimer that by saying I’m not playing with old versions of IE so I haven’t seen any of what that may throw at me. I’m using it now in order to augment SVGElement related stuff which of course doesn’t apply to old IE.
The overarching idea here is that modifying DOM/native prototypes is a very attractive idea and I really would like to be able to work directly with them if/when it doesn’t break stuff. With Node it’s no problem as controlling context is easy.
Ywg said on Aug 9, 2011 @ 6:51
#18 @WebReflection: “It would be really nice if we could create a “shim and download only what you need” service, similar to what I did here some time ago, except all major libraries should agree on using this service rather than reinventing the wheel for each framework.”
http://xkcd.com/927/
Quildreen Motta said on Aug 9, 2011 @ 8:53
#19 @WebReflection
I can see your point, but the actual implementation of #indexOf or whatever method was not really what I was arguing about. @Ywg got what I was trying to get at.
See, we have a full ECMAScript 5 spec that defines all that functionality (Array iteration and stuff), so if you write a DOM library, why do you need to define your very own Array#indexOf, Array#forEach, Array#whatever? That stuff is standard and implemented in most browsers. I think library authors should just use them and ask the user to provide a fallback if they want to run the library in a non ES5-compliant environment.
Otherwise, if you include valentine, moustache and insert-some-other-microlibs here, you’ll have at least 5 different implementations of #indexOf, 5 different implementations of #forEach and so on. These bytes count when you’re sending them over the wire =/
Garrett said on Aug 9, 2011 @ 11:53
#20 WebReflection: No, you made your point; your private method does what you need it to do. That’s why I wrote: “WebReflection keeps this method private, and so even though it is buggy, he may have control over things such as input arguments, so having always unique arrays as input, or never having a fromIndex”
Just offering the other viewpoint here: When user input makes it down into those methods, watch out! Pretty soon users are managing to pass in non-unique arrays, like [1,2,3,4,1], and when those make it into your method the result depends on the browser.
Garrett said on Aug 9, 2011 @ 11:57
#21 No “ES5 Array Extras” as a microlib, huh? That would get it working in older IE versions sans repetition or variation of the same methods in different libs.
Nick Morgan (skilldrick) said on Aug 9, 2011 @ 12:16
#22 Thanks so much for writing this. When I was answering that question, I was looking around for a decent write-up of extending native built-ins. I kept coming across your article on extending host objects, which is obviously a very different matter. It’s great that we have this article now, as a fairly definitive explanation of the pros and cons.
WebReflection said on Aug 9, 2011 @ 21:28
#23 @Ywg ah ah ah, precisely :D
Generally speaking what I mean is that I haven’t seen a widely adopted solution to the problem yet. There are few libraries that come almost for free.
As an example, I haven’t seen such an approach anywhere:
// define ES5 dependencies
(function (checkList) {
  var load = [], m;
  while (checkList.length) {
    // request a shim file only for methods that are missing
    (m = checkList.pop()) in Array.prototype || $LAB.script(
      "http://whatever.cdn.com/es5/fullspecs/Array.prototype." + m + ".js"
    );
  }
}("indexOf forEach".split(" ")));
// be sure dependencies have been loaded
$LAB.wait();
// load main application file
$LAB.script("main.js").wait(function () {
  // if necessary do stuff/initialize here
});
Rather than this, every library, tiny or big, tries to be dependency-free, and while I like avoiding dependencies, when it comes to ES5 shims I don’t think the dependency-free approach brings any advantage.
It must be said at that point every site would depend on $LAB.js so … well, I don’t think that’s gonna happen :-)
lifesinger said on Aug 9, 2011 @ 22:51
#24 I recommend this tiny shim module: https://github.com/seajs/dew/tree/master/src/es5-safe
It has full test cases: http://seajs.github.com/dew/src/es5-safe/test/runner.html
Compared to the kriskowal/es5-shim module, es5-safe.js only contains the safe parts of ES5 shims, and it is more robust and elegant for old browsers.
Best regards!
Lime said on Aug 10, 2011 @ 22:42
#27 Unless it is in the standard, I avoid native `prototype` extension at all costs.
The issue then arises of having tons of nested functions:
alert(reduce(map(string.split(" "),function(){}),function(){}))
You lose all the capabilities of chaining when nesting functions, and the code gets really difficult to follow.
It would be really nice if you had chaining of scoped properties/methods and you could avoid property/prototype collision altogether.
UnlimitJS allows for chaining with native JavaScript objects without extending objects’ prototypes. It defines one property that all methods/functions can work off of.
It is cross-browser: IE6+, Firefox, Opera, Chrome and Safari.
1. for in loops are safe to use
2. Chaining is super easy obj[fun]()[fun]()
You can watch the project at github.
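For readers wondering what chaining without prototype extension can look like in general, here is a rough sketch built on a plain wrapper object. To be clear, this is not UnlimitJS's API (which works differently, through a single shared property); chain, pipe, map and reduce below are hypothetical standalone helpers written only for this illustration:

```javascript
// Wraps a value so that standalone functions can be applied in sequence.
function chain(value) {
  return {
    // Calls fn(value, ...extraArgs) and wraps the result again.
    pipe: function (fn) {
      var extra = Array.prototype.slice.call(arguments, 1);
      return chain(fn.apply(null, [value].concat(extra)));
    },
    // Unwraps the final result.
    value: function () {
      return value;
    }
  };
}

// Standalone utilities; nothing is added to Array.prototype.
function map(array, fn) {
  var result = [];
  for (var i = 0; i < array.length; i++) {
    result.push(fn(array[i], i));
  }
  return result;
}

function reduce(array, fn, initial) {
  var acc = initial;
  for (var i = 0; i < array.length; i++) {
    acc = fn(acc, array[i], i);
  }
  return acc;
}

// Reads left to right instead of inside out.
var totalLength = chain("extending native objects".split(" "))
  .pipe(map, function (word) { return word.length; })
  .pipe(reduce, function (sum, n) { return sum + n; }, 0)
  .value();
```

The nested form of the same computation would be reduce(map(words, ...), ..., 0), which is the readability problem described in this comment.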
Luke said on Aug 27, 2011 @ 19:40
#31 Not sure whether I’d call this a “downside” of extending native objects (or in this case, primitives), but it is definitely a quirk to be aware of.