Wednesday, December 3, 2014

Why is it difficult to understand code?

If you are asked to read and understand software that spans tens of thousands of lines of code, it is impossible to do. Even if you spend months, you may be able to fit some pieces together, but you will not end up getting the hang of what the code does.

This is the real bane of software. In fact, this is true even for someone who comes back after a long gap and ends up staring at their own code.

This is true no matter how much annotation, commenting, or documentation there is.

Code commenting and documentation are only a small step in the direction of understanding one’s code. The reason is that code changes drastically during its early phases. Almost every part of the code is volatile, like molten lava. Needs change, perceptions change, how you solve something changes. It is hard to keep the documentation and comments in line with the code changes, and they will almost always be erroneous.

Coding a large piece of software is like carrying a model in your head that has several moving parts. It is like how Einstein would have carried his view of the universe, with all the lingo of general relativity and the associated visuals, albeit on a smaller scale. No other person can understand the code unless they get this model right, no matter how much documentation is out there.

Over time, if the product survives, portions of the code stabilize. If the design decisions and the perception of customer needs are reasonably accurate, the initial core of the code remains somewhat intact. This is really the time to ‘document’ the code. This is when the model in the programmer’s head can be laid out in diagrams, words, comments, and algorithms so that it makes sense to others who may come in to maintain or enhance it.

I would say the initial period of development of anything should be done by a team sitting next to each other. Until the code reaches the ‘inner core’ level of stability, there is no need to document anything. Everyone simply codes. Everyone talks and listens to one another and grows the model in their heads.

Teams spread across locations require specs to be developed and integration, etc., to be planned. This is like stirring already muddy water. If there is a choice, never have your initial development spread across teams. This is because, unless two brains are in sync in the language they speak and in how they visualize the model, there is always a gap to be filled, time to be wasted, mistakes to be made, and frustrations to build up. And getting two brains in sync is not a question of skill but a question of attitude. Hard to get.

Even if there is documentation after reaching stability, it is nothing like having the original team around. If you happen to lose key members, make sure they record video lectures that capture the ‘model’ that was visualized, the key algorithms used, the key decisions, and the assumptions made all along. This can help later developers latch on to the context when looking at the code.

2 comments:

  1. Good analysis, and good recommendations, especially the one about the video lectures. Although I disagree with the bit about completely eschewing documentation in the beginning. As with many other problems, enforcement or the lack of it accounts for a lot. "We'll document it eventually" isn't all that different from what happens with every new start; in fact, that tends to be the exact sentiment. The issue is that it's pretty common for the "eventually" to never arrive.

    I've often wondered about a similar scheme for capturing things with a recording, let's say at the end of the day, where the audience "receiving" it is otherwise completely uninvolved in the development process, i.e., someone who's not actually a programmer building that "inner core". Whether it can be explained to someone so uninvolved is a good gauge for whether the explanation is a good one or not. The way it could work is that the developers individually take 10 minutes near the end of the day to make a recording. The developer is given a transcription of their recording to be reviewed and edited first thing in the morning. The textual revisions are then sent back to the receiver who has read-only access to the code and is in charge of curating and integrating them into the overall documentation they are developing in a parallel process. The recordings are destroyed. The whole thing should take each developer no more than an hour per day (hard limit; expected and preferred to be nearer to 20-40 minutes), recording time included, and the developers shouldn't spend time writing any high-level documentation otherwise before that reasonable stability of the "inner core" has been reached.

    The developers' video/audio documentation probably *shouldn't* be recorded at the very end of the day, as that may rush things and cause a sort of careless, "get it over with" attitude. Instead it should probably happen an hour or more before quitting time, and the time in between be used to do planning for the next day and a lot of quick and satisfying low-hanging-fruit sort of things.

    Of course, the worst thing would be to treat this as a process to be slavishly followed and as a talisman guaranteed to produce favorable results, regardless of whether it's actually working or not. A good indicator of whether it's working or not would be whether the time between the point at which things are set aside to begin recording and the point at which you're going home amounts to a lot of fluff.

  2. I feel that when you just start to code a product whose specs change every day, the focus will be more on getting to a point of 'clarity' about the product and on code that reflects the current state of thinking about what the product should be doing. It is definitely a good practice to record and archive things. It is the hard way, and most engineers may start to see it as an optional thing over time. This is because we solve something which is not the reality. I see many books which preach self-discipline in the guise of best practices, and that is another way.

    If you want a hard way, I feel the compiler should not compile unless you supply documentation of some sort that is also seen by others in the team, or by the neutral person you are talking about :-) That's really when a developer who is constantly fighting to make things work with the code will reluctantly start putting out some sort of notes. It is a question of where we draw the line. In a constantly changing universe like a startup of two, or of a bunch of people, with vagueness about what the product should and should not do, it is very hard to do documentation. Probably the only thing I can think of is the self-discipline way, where I put in adequate comments while checking in to help myself when I look at the code again.

    Another way is to think and write code in very small reusable functions that do not need a great deal of documentation to understand. I am quite excited about thinking in this direction. I guess functional programming is a promising area, from what little I know of it.

    The more I think of this, the more it sounds to me like Github needs to have an easy way to record video snippets when a person checks in code, and to archive those videos. Maybe it would help a lot in bringing people who come in later up to speed on what has happened, and also spare developers the pain of typing things...???


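
The second comment's "hard way" idea, refusing to build until some documentation exists, can be sketched roughly as follows. This is only an illustration under stated assumptions: it assumes Python source and treats "documentation" as nothing more than the presence of a docstring. The function names `undocumented_definitions` and `may_build`, and the `SAMPLE` source, are hypothetical and not taken from any real tool. A real gate would probably run as a check-in hook or build step and would need a much richer notion of what counts as useful documentation.

```python
import ast
from typing import List


def undocumented_definitions(source: str) -> List[str]:
    """Return the sorted names of functions and classes in `source` that have no docstring."""
    tree = ast.parse(source)
    missing = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            if ast.get_docstring(node) is None:
                missing.append(node.name)
    return sorted(missing)


def may_build(source: str) -> bool:
    """Hypothetical gate: allow the 'build' only when nothing is undocumented."""
    return not undocumented_definitions(source)


# Hypothetical sample source: one documented function, one undocumented.
SAMPLE = '''
def area(width, height):
    """Return the area of a rectangle."""
    return width * height

def perimeter(width, height):
    return 2 * (width + height)
'''

if __name__ == "__main__":
    print(undocumented_definitions(SAMPLE))  # prints ['perimeter']
    print(may_build(SAMPLE))                 # prints False
```

A check like this only proves that some text exists, not that it conveys the model in the developer's head, so it is best seen as a nudge and not as a substitute for the discipline discussed in the comments.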