Perfection Kills

by kangax

Exploring Javascript by example

← back 2442 words

Refactoring Javascript with kratko.js

Understanding the concept of code refactoring is one of the best things you can do to become a better programmer [1].

It all started a couple of weeks ago when I — once again — looked at the growing, stinky mess that my code has become. It’s a few thousand LOC app with various interactive widgets and controls all on one page. Even though a lot of functionality was encapsulated in these separately-defined widgets, the main “dashboard” object — a bootstrapper, so to speak — grew like a weed and became quite an abomination.

It was time for refactoring.

Every now and then I would clean things up. And so there I was, once again, trying to figure out where to start. One method was whopping 70 lines long. Another few of them — just below that giant — were doing one specific task and would be better off refactored into a separate module/widget/”class”/object. What are they doing in bootstrapper anyway? I scrolled a little further and witnessed more mess — long methods, spread-out functionality and a 4-argument function sneaking somewhere in between. 4 arguments? That’s just scandalous.

But where to start? Should I just randomly choose a thread to pull? If only I could see the stats for this dashboard object. Some kind of tool to help analyze method length of the entire thing. This way I could start with the messiest part. I could also see overall picture. Heck, I could even use a tool like this on a future project — just to get a feel of how bad things really are.

Well, a thing like that is totally doable, isn’t it?!

And so that how kratko.js was born; a simple tool to help you refactor Javascript.

kratko.js

There are 3 aspects of refactoring I’m usually interested in:

  • Which objects/”classes” have too many methods?
  • Which methods do too many things?
  • Which methods have too many arguments?

Answering these questions and refactoring according to that usually takes care of most of the mess. Objects become more autonomous, by doing only a specific task. Methods become smaller by doing less and having the right level of abstraction. And there’s no more long-argument functions, order of which no one can ever remember.

This is exactly what kratko.js tries to provide.

kratko.js takes an object and collects statistics about its methods and arguments. It can then display that information in a table, allowing to sort by method or argument length. It also provides such numbers as average method/arguments length, and max/min method/arguments length.

It turns out to be pretty useful.

Kratko and TableViewer

To keep things simple and modular, kratko is made of 2 objects: Kratko which has getStatsFor method, and a TableViewer which could take stats object and display it in an interactive HTML table. If you don’t care to view results in a fancy HTML table, you can just use Kratko.getStatsFor passing it an object to inspect and receiving an object with elaborate stats for it. The whole thing is a tiny script — about 170 lines long. You can use it in the browser or on the server. All that’s required is environment that supports function decompilation (in other words, where Function.prototype.toString returns more or less exact representation of a function).

TableViewer, on the other hand, is a layer on top of Kratko and provides an outlook of these stats. What’s really useful, though, is that TableViewer is also interactive. You can sort a table by method or argument length, and view actual method contents.

getStatsFor signature

The signature of Kratko.getStatsFor is pretty straight-forward. Besides having average and min/max numbers, it also stores signatures of each method — just in case you’d like to iterate over it and do something other than getting an average.


{
  methods: {
    name: {
      methodString: ...,
      length: ...,
      argsLength: ...
    }
  },

  totalMethodLength: ...,

  minMethodLength: ...,
  maxMethodLength: ...,

  totalArgsLength: ...,

  minArgsLength: ...,
  maxArgsLength: ...,

  numMethods: ...
}

So, for example, an object with 2 methods — foo and bar:


var obj = {
  foo: function(x, y) {
    return x + y;
  },
  bar: function(a, b, c, d, e, f, g) {
    if (true) {
      // some comment
      alert(123);
    }
  }
};

..would have this kind of representation:


{
  methods: {
    foo: {
      methodString: 'function (x, y) {\n  return x + y;\n}',
      length: 1,
      argsLength: 2
    },
    bar: {
      methodString: 'function (a, b, c, d, e, f, g) {\n  if (true) {\n    //some comment\n    alert(123);\n  }}',
      length: 3,
      argsLength: 7
    }
  },

  totalMethodLength: 4,
  minMethodLength: 1,
  maxMethodLength: 3,

  totalArgsLength: 9,
  minArgsLength: 2,
  maxArgsLength: 7,

  numMethods: 2
}

Bookmarklet

The way I’ve been using kratko.js so far is via simple bookmarklet (up-to-date source). Open this bookmarklet on the page of a project you’re working on; tell it which object to inspect (e.g. SomeWidget.prototype, dashboard, etc.) and a stats table will appear. Now you can sort through the methods, and get a nice outlook on what’s going on.

The bookmarklet and the actual kratko.js script should work on most modern browsers, but I’ve only tested it on Chrome (13) and Firefox (4). I haven’t worked on making it cross-browser, but I also haven’t used anything fancy, so it should be pretty portable.

Distribution graph

After working with kratko.js for a bit, I realized that average method length doesn’t always tell me much. My “class” had 100+ methods. Even though some of them were larger than 50 lines long — and so were unacceptable — the average would still come out to be somewhere along 15-20, due to a large number of short, 1-5 -line methods.

I needed a way to quickly see total method size distribution, and so I added a little graph for that. It groups methods by length, showing which ones prevail and where:

Graph of <code>TableViewer.prototype</code>

I consider the maximum acceptable length of a method in Javascript to be ~20-25 lines long. Anything more than that is usually messy. It takes longer to figure out what's going on, and it's likely a sign of either not enough abstraction, or a method doing too many things (whereas, ideally, it should be doing just one thing — and as clear as possible).

Having said that, distribution graph is really an excellent indication of how "clean" an object is. You obviously want to stay as much on the left side of it as possible.

Here's an example [2] (of a "bad" graph):

As you can see, those lonely 85 and 56 -liners on the right have to go. There's also that little nasty bunch in the 28-43 area. There aren't many of them, thankfully. But they should probably be taken care of as well. Once that's done, the graph would be mainly concentrated in the left 1-20 area. And that would make things much cleaner.

Code style

The elephant in the room at this point is the code style. How can we base our assessment on a line count, when it could vary so much based on code style? Even though kratko.js does strip comments and empty lines [3], there's still room for deviations. Some prefer single-line statement, others follow more explicit Allman style:


  if (smth) { smthElse }

  // vs.

  if (smth) {
    smthElse
  }

  // vs.

  if (smth)
  {
    smthElse
  }

That’s exactly why it’s hard to have any absolute assessments. Obviously, it’s important to understand approximate optimal method length rather than follow any exact requirements. Decide for yourself which range you’d like to stay in, based on code style and personal preference (some people like to keep methods even smaller — under 10 lines, for example). Kratko is simply there to show you an overall picture, and guide you through the process of refactoring.

Nested functions

One last interesting point I’d like to go over is that of nested functions. When I started to refactor some of the longer methods, I noticed that many of them are long not due to lack of abstraction but due to presence of other, nested functions. Defining a small function right within another one is a pretty common thing in Javascript. It allows to keep things DRY and to create necessary abstractions.

Here’s a simple example:


function sort(arr1, arr2) {

  function byFirst(a, b) { return a[0] - b[0] }

  // ...

  if (smth) {
    arr1.sort(byFirst);
  }
  else {
    arr2.sort(byFirst);
  }
}

Even more importantly, it allows to scope certain functionality to one place. When an object has logic that requires more than 50 lines, you can either define few methods right on that object, or create one method with few “helper” functions defined right within it.

There’s certainly nothing inherently wrong in doing so. Those are the same nicely-chunked functions, only defined not on an object but within “parent” method. The advantage of such style is more explicit scoping, and the downside is that it could look messy and unclear to someone not familiar with such concepts as hoisting and functions within functions:


// Potential problem #1 is that inner function can look confusing

function foo() {
  function bar() { }
}

// Potential problem #2 is that hoisted function declaration (if any) can look confusing

function foo() {
  bar();

  // ... some other code

  function bar() { }
}

When there’s more than one nested function, and these nested functions aren’t just one-liners, there’s clearly an additional cognitive burden involved in method reading. When reading a method, you now need to distill actual logic from these auxilary function declarations. Grouping all declarations at top (or bottom) of the function can help, but there’s still a slight overhead.

There’s this one function in Prototype.js which I noticed demonstrates this really well. The function is 76 lines long, so please bear with me. I’m inserting it untouched just to show how nested functions — especially scattered ones — could make it harder to read the method and follow its logic. It’s a good thing Prototype.js has always kept its variable/function names as explicit and descriptive as possible — something I’ve always loved about it — or reading this function would be even harder:


function (methods) {
  var F = Prototype.BrowserFeatures, T = Element.Methods.ByTag;

  if (!methods) {
    Object.extend(Form, Form.Methods);
    Object.extend(Form.Element, Form.Element.Methods);
    Object.extend(Element.Methods.ByTag, {
      "FORM":     Object.clone(Form.Methods),
      "INPUT":    Object.clone(Form.Element.Methods),
      "SELECT":   Object.clone(Form.Element.Methods),
      "TEXTAREA": Object.clone(Form.Element.Methods)
    });
  }

  if (arguments.length == 2) {
    var tagName = methods;
    methods = arguments[1];
  }

  if (!tagName) Object.extend(Element.Methods, methods || { });
  else {
    if (Object.isArray(tagName)) tagName.each(extend);
    else extend(tagName);
  }

  function extend(tagName) {
    tagName = tagName.toUpperCase();
    if (!Element.Methods.ByTag[tagName])
      Element.Methods.ByTag[tagName] = { };
    Object.extend(Element.Methods.ByTag[tagName], methods);
  }

  function copy(methods, destination, onlyIfAbsent) {
    onlyIfAbsent = onlyIfAbsent || false;
    for (var property in methods) {
      var value = methods[property];
      if (!Object.isFunction(value)) continue;
      if (!onlyIfAbsent || !(property in destination))
        destination[property] = value.methodize();
    }
  }

  function findDOMClass(tagName) {
    var klass;
    var trans = {
      "OPTGROUP": "OptGroup", "TEXTAREA": "TextArea", "P": "Paragraph",
      "FIELDSET": "FieldSet", "UL": "UList", "OL": "OList", "DL": "DList",
      "DIR": "Directory", "H1": "Heading", "H2": "Heading", "H3": "Heading",
      "H4": "Heading", "H5": "Heading", "H6": "Heading", "Q": "Quote",
      "INS": "Mod", "DEL": "Mod", "A": "Anchor", "IMG": "Image", "CAPTION":
      "TableCaption", "COL": "TableCol", "COLGROUP": "TableCol", "THEAD":
      "TableSection", "TFOOT": "TableSection", "TBODY": "TableSection", "TR":
      "TableRow", "TH": "TableCell", "TD": "TableCell", "FRAMESET":
      "FrameSet", "IFRAME": "IFrame"
    };
    if (trans[tagName]) klass = 'HTML' + trans[tagName] + 'Element';
    if (window[klass]) return window[klass];
    klass = 'HTML' + tagName + 'Element';
    if (window[klass]) return window[klass];
    klass = 'HTML' + tagName.capitalize() + 'Element';
    if (window[klass]) return window[klass];

    var element = document.createElement(tagName),
        proto = element['__proto__'] || element.constructor.prototype;

    element = null;
    return proto;
  }

  var elementPrototype = window.HTMLElement ? HTMLElement.prototype :
   Element.prototype;

  if (F.ElementExtensions) {
    copy(Element.Methods, elementPrototype);
    copy(Element.Methods.Simulated, elementPrototype, true);
  }

  if (F.SpecificElementExtensions) {
    for (var tag in Element.Methods.ByTag) {
      var klass = findDOMClass(tag);
      if (Object.isUndefined(klass)) continue;
      copy(T[tag], klass.prototype);
    }
  }

  Object.extend(Element, Element.Methods);
  delete Element.ByTag;

  if (Element.extend.refresh) Element.extend.refresh();
  Element.cache = { };
}

It’s also worth mentioning that “problem” with nested function can easily be taken care of by moving these auxilary functions into a closure:


var foo = {

  bar: (function(){

    function aux1() { }
    function aux2() { }

    return function() {
      // ...
      aux1();
      aux2();
    };

  })()
}

kratko.js obviously doesn’t know about such “hidden” functions, and so bar now “looks” like this:


function() {
  // ...
  aux1();
  aux2();
}

All of these problems are arguably non-issues, so apply your own judgement and preference when dealing with this kind of things. If you see a method that’s long due to nested functions, and you’re OK with nested functions, move on to something else. The reason I’m bringing this up is to show once again that kratko.js measurements are not absolute and should be taken into consideration rather then followed strictly.

What does future hold?

I was pondering what else could a tool like kratko.js do. Some of the ideas that came to my mind:

  • Measure method width (to catch methods that are wider than 100/80/72 characters long, depending on your preference)

  • Count statements rather than lines; this could make both of these snippets equal to 1 line long, rather than 10-to-1:

    
    if
    (
    true
    )
    {
    alert
    (
    1
    )
    }
    
    
    // vs.
    
    
    if (true) alert(1)
    
  • Syntax highlight the code (of the methods)

  • Support navigating through objects (e.g. to get to jQuery.fn from jQuery or to fabric.Element.prototype from fabric.Element)

  • Integration with other services (e.g. pre-commit hook, pre-production quality check, etc.)

  • <insert your idea here>

Source, unit tests

kratko.js is licensed under MIT and is free for any kind of use (just leave attribution, please, and contribute if/what you can). The source is on github. So are unit tests. If you want to change, fix, or suggest something — I welcome any pull requests.

Use it to make your code better. Have fun!

[1] If you don’t have “Refactoring” or “Clean Code” on a bookshelf, do yourself a favor and go get them now. This will be totally worth it.

[2] This “bad” graph is actually of my fabric.Element.prototype. Embarassing, I know. Will be fixed soon.

[3] Note that multiline comments stripping is not very bulletproof and can have false results in certain (rare) cases. it does work well most of the time, though.

Did you like this? Donations are welcome

comments powered by Disqus