CodeSmell Name: Variable Name Same as Type
Type: Code
Problem: In self-documenting code, the variable name should convey information about the contents, usage, and/or purpose of the variable. Naming the variable as the type, or a trivial derivative of the type, guarantees that the code will not be optimally self-documenting.
Context: This code smell occurs most often when using interfaces that require the creation of lots of little intermediate objects in pursuit of a greater goal.
Example: Consider the following JDBC code:
Connection con = DriverManager?.getConnection(url,"",""); String sqlString = "SELECT * FROM user_table WHERE user_type = 'Admin'"; ResultSet rs = con.createStatement().executeQuery(sqlString); while (rs.next()) { rs.getString("name"); rs.getString("authorization_level"); }In order to determine which information is being accessed, it is necessary to read the entire query. Compare with the following:
Connection hrDatabaseConnection = DriverManager?.getConnection(url, "", ""); String fetchAdministratorsQuery = "SELECT * FROM user_table WHERE user_type = 'Admin'"; ResultSet administratorRecords = hrDatabaseConnection.createStatement().executeQuery(fetchAdministratorsQuery); while (administratorRecords.next()) { administratorRecords.getString("name"); administratorRecords.getString("authorization_level"); }The intention of every line is now apparent simply from the names of the variables involved in that line. Self-documenting code takes the place of comments, and the representational space of the variable names is not wasted in pointless redundancy.
Analysis: Naming your variables according to their type requires that someone seeking to understand the variable's semantics must track it throughout its life cycle. Incorporating semantic information about the variable into its name provides the reader with easy and automatic access to your intentions for the variable.
This code smell is popular because it is much easier to name your variables "resultSet" and "connection" than "userQueryResults" and "backupDatabaseConnection". Particularly when an interface requires the creation of many small objects, none of them terribly significant to the overall goal, it can be very frustrating to try to find useful names for every variable.
Nevertheless, while "resultSet" and "connection" may be perfectly clear to you now, and reading the whole method may make them clear again when you revisit them in six months, well-designed variable names can increase local access to information (so you only need to look at the lines you care about) without the redundancy of comments. If the "resultSet" actually holds user records, calling it "userRecordResultSet" avoids the need to look up to the SQL query string (named, of course, "userRecordQueryString", not "sqlQueryString") to determine this.
Criticism
I have never found long names for things like record sets to particularly useful. If anything, they tend to bloat up the code because they tend to be referenced often. There is a rule-of-thumb that says the more a variable is used, the shorter it should be. I agree with this. Thus, "rs" perfectly fine by me. I don't mind iterator indexes such as "i" either as long as there are not two active at the same time. If there are two active at the same time, such as "row" and "column", then more descriptive names perhaps are warranted. I can't say that what bothers me or doesn't bother me will be the same for everybody, but "rs" and "i" do not bother me when reading others' code. I would prefer they spend the time on better comments. The long one reads like English without pronouns: too explicit and verbose. Pronouns and abbreviations used right make English more natural and concise in my opinion. And please use "admin" instead of "administrator". --top
Agreeing with top - at least for local scoping, longer names simply make it more difficult to absorb the code and its structure. For non-local scoping, where name by type is non-obvious, I'd be more inclined to choose longer variable names in order to enable global searches for access to that variable. If a piece of code is too long to absorb, and names like 'i' become difficult to follow, then the whole local block needs to be refactored.
Rs is fair enough (though I think that there is little difference between rs and administratorsResultSet or something of the sort, so why not take the extra effort) when we have small, simple methods, however (and I deal with A LOT of code which has variables in developer-ese) when dealing with two of the same type in a method - e.g. ResultSet res and ResultSet rs (res1, res2) or String str and String s, it becomes virtually impossible to tell what is going on. Any time I spend trying to figure out what a variable is is wasted time. If a method needs to be modified for any reason, then it is likely that the meaning of the variables will change too, leading to more wasted time renaming the variables anyway. If the meaning of the variable is going to be housed in comments, then that meaning is the least likely to ever be kept updated unlike the variable name.
How about a vote:
Short Version: 4
Long Version: 1
Examples in the Literature:
Examples in Practice:
Exceptions:
When looping through employees, to print them, I see nothing wrong with:
void print(Collection<Employee> employees) { for (Employee employee : employees) { System.out.println(employee.getName()); } }(Ignoring that the "System.out" line itself is a really nasty and bad way to do that. For this example, we'll assume that a real program would do something more sensible -- but still using the 'employee' variable. ;-)