sometimes nothin' can be a real cool hand


Complexian. Simian's little brother.


With a bit of moral support from Simon Harris, the author of Simian, I've put together a tool called Complexian. It measures NPath complexity in your Java code, and warns you if it gets too high. The internals have been born from the same seeds of design as Simian, so it runs blindingly fast.

NPath complexity counts the distinct acyclic paths through a method, which makes it roughly the number of unit tests you would need to cover every path. Keeping the number low means that you can cover many more of the paths through your system with unit tests. That can save valuable time and money as projects grow larger and run longer.
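
To make the link between paths and tests concrete, here is a small sketch. The class and method names are made up for illustration, and the count of 8 assumes the usual NPath rule that sequential statements multiply their path counts. Three independent `if` statements give 2 × 2 × 2 = 8 paths, and the helper confirms that by driving the method with every input combination.

```java
import java.util.HashSet;
import java.util.Set;

class NPathDemo {
    // Three independent if statements in sequence: 2 * 2 * 2 = 8 acyclic paths.
    static String describe(boolean a, boolean b, boolean c) {
        StringBuilder path = new StringBuilder();
        if (a) { path.append('A'); }
        if (b) { path.append('B'); }
        if (c) { path.append('C'); }
        return path.toString();
    }

    // Calls describe() with every combination of inputs and counts the
    // distinct paths taken, using the returned string as a path signature.
    static int countDistinctPaths() {
        Set<String> seen = new HashSet<>();
        boolean[] values = {false, true};
        for (boolean a : values) {
            for (boolean b : values) {
                for (boolean c : values) {
                    seen.add(describe(a, b, c));
                }
            }
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println(countDistinctPaths()); // prints 8
    }
}
```

Covering every path through `describe` would take eight test cases, one for each combination.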

You can already use Checkstyle to check NPath complexity, but here are a few reasons you might prefer Complexian.

  1. It's significantly faster. Complexian can check the 1.2 million shared lines of code in JDK1.5 in under 15 seconds.
  2. Output is ordered so the “worst” offenders stand out.
  3. You get a summary of the total system, and how offensive the violations are in relation to it.
  4. Checkstyle has an overflow bug, so you don’t get the real story. (We’ve submitted a patch, though, so this will go away eventually.)
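
On the overflow point: the sketch below is a general illustration of how a path count can outgrow a 32-bit `int`. It is not a reproduction of Checkstyle's bug or of Complexian's internals, and the class and method names are hypothetical. It counts the paths for `n` sequential simple `if` statements twice, once in a plain `int` that wraps silently and once in a `long` guarded by `Math.multiplyExact`, which throws `ArithmeticException` instead of wrapping.

```java
class PathCountOverflow {
    // Path count for n sequential simple ifs, kept in a 32-bit int.
    // The multiplication wraps silently once the value passes Integer.MAX_VALUE.
    static int countWithInt(int n) {
        int paths = 1;
        for (int i = 0; i < n; i++) {
            paths *= 2;
        }
        return paths;
    }

    // Same count kept in a long. Math.multiplyExact throws ArithmeticException
    // on overflow rather than returning a wrapped value.
    static long countChecked(int n) {
        long paths = 1;
        for (int i = 0; i < n; i++) {
            paths = Math.multiplyExact(paths, 2L);
        }
        return paths;
    }

    public static void main(String[] args) {
        System.out.println(countWithInt(31)); // prints -2147483648
        System.out.println(countChecked(31)); // prints 2147483648
    }
}
```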

The most important reason for me is that complexity is treated like a second-class citizen when it's bundled in with other Checkstyle checks. Too many times have I seen either the specific check turned off, or Checkstyle itself turned off, because developers didn’t like some formatting rule. It's important enough to me that I want that decision to be much more obvious, and treated with more consideration. If producing a tool that deals with the issue directly helps to highlight it, then I'm happy with the outcome.

You can see more details about Complexian and download it from the resources section of this site. Please send any feedback you have to me.

Comments

marty,
can you explain a bit about the threshold level? A level of 100: what does this mean exactly in your example, and how are the scores calculated? Additionally, how would you decide a good score from a bad score? Are these scoring attributes a Simian thing?

jeff

The threshold is the level at which Complexian starts reporting. Low values are better. NPath complexity is the total number of combinations of paths possible through your method.

The number grows exponentially with the number of branching and looping constructs you place in sequence, so having a few of them in your method will easily make it quite large. Setting a threshold of 100 means "Please tell me about all of the methods whose complexity is greater than 100".
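
As a rough sketch of what that threshold does (this is not Complexian's actual code, and the method names and scores are invented for illustration), the report boils down to keeping the methods that score above the threshold and listing the worst first:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ThresholdReport {
    // Returns the names of methods whose score is strictly greater than
    // the threshold, ordered from highest score to lowest.
    static List<String> offenders(Map<String, Long> scores, long threshold) {
        List<Map.Entry<String, Long>> over = new ArrayList<>();
        for (Map.Entry<String, Long> entry : scores.entrySet()) {
            if (entry.getValue() > threshold) {
                over.add(entry);
            }
        }
        over.sort((x, y) -> Long.compare(y.getValue(), x.getValue()));

        List<String> names = new ArrayList<>();
        for (Map.Entry<String, Long> entry : over) {
            names.add(entry.getKey());
        }
        return names;
    }

    public static void main(String[] args) {
        // Hypothetical scores, invented for this example.
        Map<String, Long> scores = new LinkedHashMap<>();
        scores.put("getName", 1L);        // a plain getter: one path
        scores.put("parseHeader", 128L);  // e.g. seven sequential ifs: 2^7
        scores.put("validate", 24L);
        scores.put("render", 4096L);
        System.out.println(offenders(scores, 100)); // prints [render, parseHeader]
    }
}
```

Seven simple `if` statements in a row are already enough to push a method past 100, since 2^7 = 128.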

The default of 100 is totally arbitrary. I'd suggest refactoring once a method reaches the 20s or 30s. Single-digit values are excellent. A getter or setter would usually score 1, for example, as it has only one possible path.

I'll add some more detail to the resources page to describe exactly how the calculations work a bit later.
