Continuing on from the previous blog entry, The Futility of Commenting Code, I’d like to address the dissenting comment of Sandman, who claims to have only seen the kind of caricature comments I made up in posts about commenting. Well, here goes reality.
A Real Example
Consider the following real example, which I chose because I knew the source for Apache projects was available online, and suspecting Apache Commons would be less heavily checked than more popular projects. I dove into the e-mail package, because it’s the first one I actually use. I found this example right away:
- source for
org.apache.commons.mail.ByteArrayDataSource
(revision 782475)
pieces of which I repeat here:
/** * Create a datasource from a String. * * @param data A String. * @param aType A String. * @throws IOException IOException */ public ByteArrayDataSource(String data, String aType) throws IOException { this.type = aType; try { baos = new ByteArrayOutputStream(); // Assumption that the string contains only ASCII // characters! Else just pass in a charset into this // constructor and use it in getBytes(). baos.write(data.getBytes("iso-8859-1")); baos.flush(); baos.close(); } catch (UnsupportedEncodingException uex) { throw new IOException("The Character Encoding is not supported."); } finally { if (baos != null) { baos.close(); } } }
I wasn’t talking about API comments, but these display the same problem. This public constructor is documented with “Create a datasource from a String”, but in fact, there are two string parameters, both documented as “a String”. That’s what the delcaration says, so this is exactly the kind of redundant comment I was talking about.
Next, consider the one code comment. It starts on the wrong foot, with “Assumption that the string contains only ASCII characters!”. If you look at the method call, data.getBytes("iso-8859-1")
, you’ll see that it’s actually more general than documented, in that it’ll work for any ISO-8859-1 characters (aka Latin1). The second part of the comment, “Else just pass in a charset into this constructor and use it in getBytes()” makes no sense, because the bytes are hard coded, and there is no char encoding argument to the constructor.
Furthermore, the catch block (with its throw) should just be removed. It’s catching an UnsupportedEncodingException
, which extends IOException
, then throwing a fresh IOException
. It should just be removed, at which point an unsupported encoding throws an unsupported encoding exception; you don’t even need to change the signature, because unsupported encoding exceptions are kinds of I/O exceptions. There are two problems with the code as is. First, the the IOException reports an unhelpful message; the unsupported encoding exception has the info on what went wrong in its message. Second, it’s returning a more general type, making it harder for catchers to do the right thing. You might also consider the fact that the original stack trace is lost a problem.
Another instance of (almost) commenting the language is later in the same file:
try { int length = 0; byte[] buffer = new byte[ByteArrayDataSource.BUFFER_SIZE]; bis = new BufferedInputStream(aIs); baos = new ByteArrayOutputStream(); osWriter = new BufferedOutputStream(baos); //Write the InputData to OutputStream while ((length = bis.read(buffer)) != -1) { osWriter.write(buffer, 0, length); } osWriter.flush(); osWriter.close(); } ...
I’d argue comments like “Write the InputData to OutputStream” are useless, because this is the idiom for buffered writes.
Aside on Bad Code
Maybe you think you should always buffer streams. In this case, that’s wrong. Both bufferings simply cause twice as many assignments as necessary.
Buffering the input stream is useless because you’re already reading into the byte array
buffer
.Buffering the output stream is useless, because you’re writing to a byte array output stream
baos
.Furthermore, you don’t need to close or flush a byte array output stream, because both are no-ops (see the
ByteArrayOutputStream
javadoc).Finally, the naming is wonky, because
osWriter
is not a writer, it’s an output stream.
This isn’t an isolated case. Other files in this package have similar problems with doc.
Another Example
While we’re picking on Apache Commons, I checked out another package I’ve used, fileUpload.
public class DiskFileUpload extends FileUploadBase { // ------------------------------------ Data members /** * The factory to use to create new form items. */ private DefaultFileItemFactory fileItemFactory; // ------------------------------------ Constructors
That’s the kind of pre-formatted useless stuff I was talking about. We know what a member variable and constructor look like in Java. There weren’t any other code comments in that file.
In that same package, the fileupload.ParameterParser
class has some, though, and I’m guessing they’re of the kind that others mentioned as “good” comments, such as:
... // Trim leading white spaces while ((i1 < i2) && (Character.isWhitespace(chars[i1]))) { i1++; } ...
I’d argue that perhaps a method called trimLeadingWhiteSpace()
implemented the same way would be clearer. But if you’re not going to define new methods, I’d say this kind of comment helps. But always verify that the code does what it says it does; don’t take the comment’s word (or method name’s word) for it.
I couldn’t leave that file without commenting on their return-only-at-the-end-of-a-function style:
String result = null; if (i2 > i1) { result = new String(chars, i1, i2 - i1); } return result;
I have no idea why people do it, but as you can see in this case, it just makes the code hard to follow.
More Examples
I thought only a couple wouldn’t be convincing. So here’s some links and examples:
public boolean isFull() { // size() is synchronized return (size() == maxSize()); }
Documenting what’s clear from the code. (And nice section divider comments.)
// Return the parser we already created (if any) if (parser != null) { return (parser); } ... } catch (RuntimeException e) { // rethrow, after logging log.error(e.getMessage(), e); throw e; }
Yes, that’s what return
, log
, and throw
do.
public LogFactoryImpl() { super(); initDiagnostics(); // method on this object
Yes, it really says “method on this object”.
There’s lots more, so I’ll leave finding them as an exercise.
It’s Not All Bad
There were useful code comments in those packages that explained how the code corresponded to an algorithm in Knuth, or how complex invariants were being preserved in complex loops. I’m really not saying don’t comment code at all.
October 19, 2009 at 3:42 pm |
A place I worked had a Quality Assurance group, who I’m certain were all failed programmers. There quality assurance was to assure that programs had lots of comments, we didn’t do certain things that had been decided were impossible to maintain and everything was tidy. Unfortunately many couldn’t read the code well and made up for it by requiring lots of comments.
October 22, 2009 at 9:03 am |
Some of these comments seem like pseudocode. “the function here should do so and so”. It’s been gradually replaced with real code but the pseudocode-comments remain.