Aug 1, 2008

Offline code reviews are more efficient

I've been listening to the stackoverflow podcast by Jeff Atwood and Joel Spolsky. As with Joel's blog and columns, I find him alternately insightful and infuriatingly naive.

Anyway, the recent stackoverflow podcast #15 contains a segment on code reviews in which both Jeff and Joel agree that interactive face-to-face code reviews are more effective than offline reviews in which reviewers read the code and send comments to the author. I have extensive experience with both and I strongly disagree.

This has actually been studied, and there is good, researched hard data showing that offline code reviews are just about as good as group- or meeting-based code reviews. Meetings find 4% more defects on average, which is statistically significant, but take anywhere from 50% to 1200% longer to get there. That is a case of diminishing returns to say the least, which makes meetings less efficient. A good summary of these findings can be found in the free book Best Kept Secrets of Peer Code Review, specifically the "Brand New Information" chapter (note: you can ignore the sales pitch chapters at the end if you like--no disclaimer necessary: I don't own stock in the company or use their products).
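
To make the efficiency argument concrete, here is a rough back-of-the-envelope sketch. It assumes the 4% and 50%-1200% figures can be applied directly to a single review, which is my simplification and not something the book claims, and the function name is made up for illustration.

```python
def relative_efficiency(extra_defects_pct, extra_time_pct):
    """Defects found per unit of time in a meeting review, relative to an
    offline review of the same code (1.0 would mean equally efficient).

    Hypothetical helper: assumes the percentages scale one review linearly.
    """
    defects_ratio = 1 + extra_defects_pct / 100  # e.g. 4% more defects -> 1.04
    time_ratio = 1 + extra_time_pct / 100        # e.g. 50% longer -> 1.5
    return defects_ratio / time_ratio


# Figures quoted above: 4% more defects, 50% to 1200% more time.
best_case = relative_efficiency(4, 50)
worst_case = relative_efficiency(4, 1200)

print(f"best case:  {best_case:.2f}x the defects per hour of an offline review")
print(f"worst case: {worst_case:.2f}x the defects per hour of an offline review")
```

Under those assumptions, even the most favorable end of the range leaves the meeting finding roughly 0.69 times as many defects per hour as the offline review, and the unfavorable end drops to about 0.08 times.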

Given Joel's earlier podcast rant about people on the internet blogging things based on anecdotes without research and data, I find this kind of ironic. Yet it is understandable: meeting-based code reviews do feel a lot better emotionally than offline reviews; there is less opportunity for misunderstanding and most people do enjoy the social interaction. As good engineers, though, we should recognize that what feels good isn't always the best for us, and do the right thing.

To be fair, Jeff and Joel are really lauding the learning that comes from swapping information and tips during discussions, which has nothing to do with the code defect rate or efficiency. However, further reading of the literature shows that group-based code reviews tend to find few new defects, and those they do find tend to be surface-level in nature.

The book theorizes--and this is borne out by my own experience and that of my trusted coworkers--that really understanding code and algorithms at a sufficient level of depth takes time and concentration that is nigh impossible to achieve in a social setting.

I don't want to completely discount the learning aspect. If you have less experienced developers then it does help to train them in code reviews. However, you should do this consciously with the understanding of the productivity hit your more senior employees are taking. Of course, there's nothing stopping you from doing offline reviews and then reviewing results with junior developers.

What I've seen work is having primarily offline reviews with comments sent back to the author (and tracked in a system), and then having face-to-face (or voice-to-voice) meetings to clarify if necessary. This gets the benefits of concentrated brain cycles from the reviewer while maintaining human contact and communication where needed. In addition, some percentage of code can be targeted for meeting-style reviews to keep the learning benefits Jeff and Joel care about. Along with good code review guidelines and coding conventions, this process can scale to larger (50+) teams with many smaller code reviews a day. It is also very effective for geographically distributed teams.
