Clean Code
Any fool can write code that a computer can understand. Good programmers write code that humans can understand. — Martin Fowler
Clean code is not a luxury — in a maintenance context, code that humans cannot understand becomes progressively more expensive to modify. This lecture covers the practices and tools that keep code readable, maintainable, and trustworthy over time.
What is Quality Code?
Quality code is characterised by four properties:
- Correct: matches its technical specification, does what it claims to do
- Robust: handles edge cases, unexpected inputs, and failures gracefully
- Readable: easy to understand without prior context; the next developer can comprehend it quickly
- Efficient: uses resources appropriately — not over-engineered, not wasteful
Software entropy: an evolving system increases its complexity unless work is done to reduce it. — Meir Lehman
Shipping first-time code is like going into debt. A little debt speeds development so long as it is paid back promptly with refactoring. The danger occurs when the debt is not repaid. Every minute spent on code that is not quite right counts as interest on that debt. — Ward Cunningham
Short-term shortcuts accumulate over time. Unrepaid technical debt slows all future development — every new feature must be built on top of an increasingly unstable foundation.
Naming Conventions
Names are the primary communication channel between the author and future readers. A well-named variable, method, or class eliminates the need for a comment.
- Names should be self-descriptive and pronounceable; avoid abbreviations and obscure acronyms
- Variables and fields: nouns or noun phrases describing what they hold
- Methods: verbs or verb phrases, e.g. getUser(), validateInput(), computeTotal()
- Booleans: questions that answer true/false, e.g. isValid(), hasPermission(), areEqual()
- Constants: SCREAMING_SNAKE_CASE for named constants; never embed magic numbers or strings in logic
// Bad: what is 'd'? what does '4' mean? what is 'u.s'?
int d = 86400;
if (u.s == 4) { ... }
// Good: self-explanatory, no comment needed
static final int SECONDS_PER_DAY = 86_400;
if (user.status == UserStatus.SUSPENDED) { ... }
Comments
Comments are always a failure. — Robert C. Martin (“Uncle Bob”), Clean Code
Comments lie — they go out of sync with the code they describe. They age badly: code is refactored but comments are forgotten. They are not refactorable: renaming a method does not rename comments that mention it.
A comment often signals that the code failed to:
- Choose a good name for a variable or method
- Extract logic into a well-named helper
- Create the right abstraction
Some comments remain worthwhile:
- Javadoc (/** ... */) for public APIs: documents intent and contracts, not implementation
- Algorithm citations: when using a non-obvious algorithm, cite the paper or source (future maintainers need to understand the why)
- Legal and license headers — required by many organizations
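As a minimal sketch of both ideas (a well-named helper replacing an explanatory comment, and Javadoc that states a contract), consider the following. The Employee record, the isEligibleForFullBenefits method, and the age threshold are hypothetical names and values invented for this illustration; records assume Java 16 or later.

```java
class CommentDemo {
    public static void main(String[] args) {
        Employee employee = new Employee(true, 67);

        // Before: a comment has to explain what the condition means.
        // Check whether the employee is eligible for full benefits.
        if (employee.hourly() && employee.age() > 65) {
            System.out.println("eligible");
        }

        // After: the method name carries the meaning, so no comment is needed.
        if (employee.isEligibleForFullBenefits()) {
            System.out.println("eligible");
        }
    }
}

/** Hypothetical domain type, invented for this illustration. */
record Employee(boolean hourly, int age) {

    // Assumed threshold; a real rule would come from the specification.
    private static final int FULL_BENEFITS_MIN_AGE = 65;

    /**
     * Returns whether this employee qualifies for full benefits.
     *
     * <p>Contract: only hourly employees older than {@value #FULL_BENEFITS_MIN_AGE}
     * qualify. The Javadoc states intent, not how the check is implemented.
     */
    boolean isEligibleForFullBenefits() {
        return hourly && age > FULL_BENEFITS_MIN_AGE;
    }
}
```

If the eligibility rule changes, only the helper and its Javadoc need updating; the call site cannot drift out of sync the way a free-standing comment can.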
Code Layout
Good layout makes a file easier to navigate before a single line of logic is read:
- File size: aim for ~200 lines; 500 is a hard warning sign
- Line length: 80–120 characters per line — beyond that, the logic is probably too complex
- Indentation and spacing: consistent style, enforced by tooling (not by convention)
- Reading order: code should flow top-to-bottom like a newspaper — high-level concept first, details lower
A well-organized class reads like a news article: the headline (class name and purpose) at the top, the most important concepts next, and implementation details at the bottom. A reader should be able to stop reading at any point and have understood the most important parts.
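One possible way to apply the newspaper layout is sketched below. The InvoiceSummary class, its method names, and the tax rate are assumptions made up for this example; amounts are kept in integer cents to avoid floating-point rounding.

```java
import java.util.List;

/** Hypothetical example: summarizes invoice lines given in cents. */
class InvoiceSummary {

    private static final long ASSUMED_TAX_PERCENT = 20; // hypothetical rate

    /** Headline: the entry point tells the whole story in three steps. */
    static String summarize(List<Long> lineAmountsInCents) {
        long subtotal = sumOf(lineAmountsInCents);
        long tax = taxOn(subtotal);
        return describe(subtotal, tax);
    }

    // Details follow, in the order they are first used above.

    private static long sumOf(List<Long> amountsInCents) {
        return amountsInCents.stream().mapToLong(Long::longValue).sum();
    }

    private static long taxOn(long subtotalInCents) {
        return subtotalInCents * ASSUMED_TAX_PERCENT / 100;
    }

    private static String describe(long subtotal, long tax) {
        return "subtotal=" + subtotal + " tax=" + tax + " total=" + (subtotal + tax);
    }

    public static void main(String[] args) {
        // prints "subtotal=3000 tax=600 total=3600"
        System.out.println(summarize(List.of(1000L, 2000L)));
    }
}
```

A reader who stops after summarize already knows what the class does; the private helpers only matter to someone who needs the details.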
Principles
KISS — Keep It Simple, Stupid
If you can’t explain it simply, you don’t understand it well enough. — Albert Einstein
- Use the simplest logical approach that works
- Avoid layers of abstraction for their own sake
- Simpler code is easier to read, test, debug, and maintain
- Complexity is a debt paid by every future reader
DRY — Don’t Repeat Yourself
Also known as DIE (Duplication Is Evil).
- Every piece of knowledge should have a single, authoritative representation in the codebase
- Code duplication means multiple places to fix the same bug
- Duplication is the root cause of many maintenance problems: a fix in one copy is forgotten in the other
Factor out common logic into methods, constants, or classes. Tools like PMD’s CPD (Copy/Paste Detector) can find duplication automatically.
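The following sketch shows what a single authoritative representation can look like in practice. The Registration class, its methods, and the minimum-length rule are hypothetical; the point is only that both operations share one definition of a valid username.

```java
class Registration {

    private static final int MIN_USERNAME_LENGTH = 3; // assumed rule for this sketch

    // The single authoritative definition of a valid username.
    // Before the refactoring, signUp and rename each carried their own copy of this check.
    static boolean isValidUsername(String name) {
        return name != null && name.strip().length() >= MIN_USERNAME_LENGTH;
    }

    static String signUp(String name) {
        requireValidUsername(name);
        return "created:" + name.strip();
    }

    static String rename(String oldName, String newName) {
        requireValidUsername(newName);
        return "renamed:" + oldName + "->" + newName.strip();
    }

    private static void requireValidUsername(String name) {
        if (!isValidUsername(name)) {
            throw new IllegalArgumentException("invalid username: " + name);
        }
    }

    public static void main(String[] args) {
        System.out.println(signUp(" alice "));      // prints "created:alice"
        System.out.println(rename("alice", "bob")); // prints "renamed:alice->bob"
    }
}
```

If the minimum length changes, the fix happens in one place and both operations pick it up.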
YAGNI — You Aren’t Gonna Need It
From Extreme Programming (XP): prefer the simplest thing that could possibly work.
- Do not build features you think you might need in the future
- Unused code is dead weight: it must be read, tested, and maintained, but serves no user
- Speculative generality is one of the most common code smells in maintenance-heavy codebases
SOLID
The five SOLID principles provide a framework for designing classes and modules that are easy to change and extend:
| Principle | Name | Meaning |
|---|---|---|
| S | Single Responsibility | A class should have exactly one reason to change |
| O | Open/Closed | Open for extension, closed for modification — add behaviour by adding code, not by changing existing code |
| L | Liskov Substitution | A subclass must be usable wherever its parent class is used without surprising the caller |
| I | Interface Segregation | Many small, specific interfaces are better than one large general-purpose interface |
| D | Dependency Inversion | Depend on abstractions (interfaces), not concrete implementations |
Violating SOLID typically leads to tightly coupled, hard-to-test classes that break whenever a related component changes.
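To make two of the principles concrete, here is a small sketch of Dependency Inversion and Open/Closed working together. All type names (Notifier, ConsoleNotifier, RecordingNotifier, OrderService) are invented for this example and do not come from any library.

```java
import java.util.ArrayList;
import java.util.List;

class SolidDemo {
    public static void main(String[] args) {
        // prints "To a@example.com: Order 42 confirmed"
        new OrderService(new ConsoleNotifier()).confirm("a@example.com", 42);
    }
}

/** The abstraction that high-level code depends on. */
interface Notifier {
    void send(String recipient, String message);
}

/** One concrete implementation; adding another requires no change to OrderService. */
class ConsoleNotifier implements Notifier {
    @Override
    public void send(String recipient, String message) {
        System.out.println("To " + recipient + ": " + message);
    }
}

/** A test double that records messages instead of delivering them. */
class RecordingNotifier implements Notifier {
    final List<String> sent = new ArrayList<>();

    @Override
    public void send(String recipient, String message) {
        sent.add(recipient + "|" + message);
    }
}

/** High-level policy: knows only the Notifier interface, never a concrete class. */
class OrderService {
    private final Notifier notifier;

    OrderService(Notifier notifier) {
        this.notifier = notifier;
    }

    void confirm(String customerEmail, int orderId) {
        notifier.send(customerEmail, "Order " + orderId + " confirmed");
    }
}
```

Because OrderService receives its Notifier through the constructor, a test can pass in RecordingNotifier and inspect what was sent without any real delivery taking place.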
Law of Demeter
“Don’t talk to strangers.” A method should only call:
- Methods on itself (this)
- Methods on objects it created
- Methods on objects passed to it as parameters
- Methods on its direct fields
Violating the Law of Demeter produces train wrecks: chains like order.getCustomer().getAddress().getCity() couple the caller to the entire object graph and make refactoring painful. The fix is to ask the object for what you need, not to dig into its structure.
// Violation — caller knows too much about internal structure
String city = order.getCustomer().getAddress().getCity();
// Better — delegation through the chain
String city = order.getCustomerCity();
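A fuller sketch of the delegation might look like this. The Order, Customer, and Address classes are minimal stand-ins written for this example; each object talks only to its own direct field.

```java
class DemeterDemo {
    public static void main(String[] args) {
        Order order = new Order(new Customer(new Address("Lyon")));
        // The caller asks the order and never sees Customer or Address.
        System.out.println(order.getCustomerCity()); // prints "Lyon"
    }
}

class Address {
    private final String city;

    Address(String city) {
        this.city = city;
    }

    String getCity() {
        return city;
    }
}

class Customer {
    private final Address address;

    Customer(Address address) {
        this.address = address;
    }

    // Customer talks only to its direct field.
    String getCity() {
        return address.getCity();
    }
}

class Order {
    private final Customer customer;

    Order(Customer customer) {
        this.customer = customer;
    }

    // Order talks only to its direct field and hides how a city is stored.
    String getCustomerCity() {
        return customer.getCity();
    }
}
```

If Address later splits the city into several fields, only Customer has to change; callers of getCustomerCity are unaffected.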
Code Smells
Code smells are patterns that signal a likely design problem. They don’t always indicate a bug, but they indicate code that will become harder to maintain over time:
| Smell | Description |
|---|---|
| Long Method | Methods longer than 20-30 lines usually have multiple responsibilities |
| Large Class | Classes that do too much; violates Single Responsibility |
| Duplicate Code | The same logic appears in more than one place |
| Long Parameter List | More than 2-3 parameters is a sign of missing abstraction |
| Feature Envy | A method uses data from another class more than its own |
| Dead Code | Code that is never called — dead weight, misleads readers |
| Magic Numbers/Strings | Literal values with no explanation embedded in logic |
| Deeply Nested Conditionals | More than 2-3 nesting levels; use early returns or extracted methods |
A single code smell is a minor concern. Multiple smells in the same class signal a class that has lost cohesion and will become increasingly expensive to modify. Address smells incrementally during normal development; do not wait for a dedicated cleanup sprint.
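As an illustration of the last row of the table, the sketch below rewrites a deeply nested conditional with early returns. The Shipping class and its quote methods are hypothetical; both versions are kept side by side so their behaviour can be compared.

```java
class Shipping {

    // Before: three levels of nesting; the normal path is buried in the middle.
    static String quoteNested(String country, int weightKg, boolean express) {
        if (country != null) {
            if (weightKg > 0) {
                if (express) {
                    return "express:" + country;
                } else {
                    return "standard:" + country;
                }
            } else {
                return "invalid weight";
            }
        } else {
            return "missing country";
        }
    }

    // After: guard clauses reject bad input first, leaving one flat normal path.
    static String quote(String country, int weightKg, boolean express) {
        if (country == null) {
            return "missing country";
        }
        if (weightKg <= 0) {
            return "invalid weight";
        }
        return (express ? "express:" : "standard:") + country;
    }

    public static void main(String[] args) {
        System.out.println(quote("FR", 5, true)); // prints "express:FR"
        System.out.println(quote("FR", 0, true)); // prints "invalid weight"
    }
}
```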
Java Tooling
Good practices need tool support. The Java ecosystem provides a mature set of tools that integrate into build systems (Maven, Gradle) and CI pipelines.
Checkstyle
Enforces coding style rules: naming conventions, import ordering, Javadoc completeness, maximum line length, indentation.
- Configurable rule sets (Google Java Style, Sun Coding Conventions, or custom)
- Integrates with Maven, Gradle, IntelliJ IDEA, Eclipse, VS Code
- Fast: runs on source code before compilation
<!-- Maven: add to pom.xml -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-checkstyle-plugin</artifactId>
  <version>3.4.0</version>
  <configuration>
    <configLocation>google_checks.xml</configLocation>
    <failsOnError>true</failsOnError>
    <consoleOutput>true</consoleOutput>
  </configuration>
  <executions>
    <execution>
      <id>validate</id>
      <phase>validate</phase>
      <goals><goal>check</goal></goals>
    </execution>
  </executions>
</plugin>
Run: mvn checkstyle:check
PMD
Static source code analyzer that finds potential bugs, unused code, overly complex methods, empty catch blocks, and naming violations. Its CPD (Copy/Paste Detector) sub-tool finds duplicated code.
# Analyze source code
pmd check -d src/main/java -R rulesets/java/quickstart.xml -f text
# Find duplicate code (CPD)
pmd cpd --minimum-tokens 100 --dir src/main/java
SpotBugs
Analyzes compiled bytecode (.class files) rather than source code. This allows it to detect a different class of bugs: null pointer dereferences, resource leaks, incorrect equals/hashCode implementations, infinite loops, and security vulnerabilities.
SpotBugs is the maintained successor to FindBugs.
<plugin>
  <groupId>com.github.spotbugs</groupId>
  <artifactId>spotbugs-maven-plugin</artifactId>
  <version>4.8.6.4</version>
  <configuration>
    <effort>Max</effort>
    <threshold>Low</threshold>
  </configuration>
</plugin>
Run: mvn spotbugs:check
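To illustrate one of the bug classes listed above, the sketch below shows an incorrect equals/hashCode pair and a corrected one. BrokenPoint and Point are hypothetical classes written for this example; which warning a given SpotBugs version reports for BrokenPoint is not shown here.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class EqualsHashCodeDemo {
    public static void main(String[] args) {
        Set<Point> points = new HashSet<>();
        points.add(new Point(1, 2));
        // true: equal points produce the same hash code
        System.out.println(points.contains(new Point(1, 2)));

        Set<BrokenPoint> broken = new HashSet<>();
        broken.add(new BrokenPoint(1, 2));
        // Usually false: equal objects almost always get different identity hash codes.
        System.out.println(broken.contains(new BrokenPoint(1, 2)));
    }
}

/** Defines equals() but inherits Object.hashCode(), which breaks hash-based collections. */
final class BrokenPoint {
    private final int x;
    private final int y;

    BrokenPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object other) {
        if (!(other instanceof BrokenPoint)) {
            return false;
        }
        BrokenPoint that = (BrokenPoint) other;
        return x == that.x && y == that.y;
    }
    // hashCode() is deliberately missing for the illustration.
}

/** Corrected version: equals() and hashCode() use the same fields. */
final class Point {
    private final int x;
    private final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object other) {
        if (!(other instanceof Point)) {
            return false;
        }
        Point that = (Point) other;
        return x == that.x && y == that.y;
    }

    @Override
    public int hashCode() {
        return Objects.hash(x, y);
    }
}
```

The defect is invisible in a casual source review because equals() itself is correct; it only surfaces when instances are stored in a HashSet or used as HashMap keys.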
JaCoCo — Code Coverage
Measures which lines, branches, and methods are actually executed by your test suite. Generates HTML and XML reports.
<plugin>
  <groupId>org.jacoco</groupId>
  <artifactId>jacoco-maven-plugin</artifactId>
  <version>0.8.11</version>
  <executions>
    <execution>
      <goals><goal>prepare-agent</goal></goals>
    </execution>
    <execution>
      <id>report</id>
      <phase>test</phase>
      <goals><goal>report</goal></goals>
    </execution>
  </executions>
</plugin>
Run: mvn test (coverage runs automatically) → HTML report in target/site/jacoco/index.html
100% line coverage does not mean bug-free code. A test that executes a line without asserting anything about the result contributes to coverage without validating behaviour. Coverage is a necessary but not sufficient condition for quality.
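The gap between coverage and validation can be sketched without any test framework. In the example below, the Discount class contains a deliberate bug, and the two test-like methods (hypothetical names, plain Java rather than JUnit) execute exactly the same lines; only one of them checks the result.

```java
class CoverageDemo {

    // Executes every line of Discount.apply but asserts nothing, so it always passes.
    static boolean weakTestPasses() {
        Discount.apply(1000, 10);
        return true;
    }

    // Same line coverage, but it checks the result and therefore exposes the bug.
    static boolean strongTestPasses() {
        return Discount.apply(1000, 10) == 900;
    }

    public static void main(String[] args) {
        System.out.println("weak test passes:   " + weakTestPasses());   // true
        System.out.println("strong test passes: " + strongTestPasses()); // false
    }
}

class Discount {
    // Deliberate bug for the illustration: the discount is added instead of subtracted.
    static int apply(int priceCents, int percent) {
        return priceCents + priceCents * percent / 100;
    }
}
```

A coverage tool would report full line coverage of Discount.apply for either test, which is why coverage numbers need to be read alongside the quality of the assertions.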
SonarQube
A unified quality management platform that aggregates results from static analysis, coverage, duplication detection, and security scanning into a single dashboard.
- Runs its own analyzers and can import reports from Checkstyle, PMD, SpotBugs, and JaCoCo
- Tracks metrics over time: technical debt trend, code smell growth, coverage evolution
- Quality Gate: a configurable threshold (e.g., “coverage > 80%, no new critical bugs”) that can fail a CI pipeline
- Available as a cloud service (SonarCloud) or self-hosted
Build Automation Pipeline (GitLab CI)
Putting it all together — an example pipeline that enforces quality automatically on every push:
stages:
  - lint
  - build
  - test
  - quality

job:checkstyle:
  stage: lint
  script: mvn checkstyle:check

job:build:
  stage: build
  script: mvn compile

job:test:
  stage: test
  script: mvn test
  # Keep compiled classes and the JaCoCo output so the quality job can use them
  artifacts:
    paths:
      - target/

job:quality:
  stage: quality
  script:
    - mvn spotbugs:check
    # JaCoCo report is generated automatically during 'mvn test' (bound to test phase)
    - mvn sonar:sonar -Dsonar.projectKey=my-project
  when: on_success
  dependencies:
    - job:test
Putting Checkstyle in the lint stage means style violations are caught before compilation even starts, which gives the fastest possible feedback. SpotBugs and SonarQube run after tests: SpotBugs needs the compiled classes, and SonarQube needs the JaCoCo coverage data. Order your pipeline stages to surface the cheapest checks first.