Clean Code

Any fool can write code that a computer can understand. Good programmers write code that humans can understand. — Martin Fowler

Clean code is not a luxury — in a maintenance context, code that humans cannot understand becomes progressively more expensive to modify. This lecture covers the practices and tools that keep code readable, maintainable, and trustworthy over time.

What is Quality Code?

Quality code is characterised by four properties:

Correct: matches its technical specification, does what it claims to do
Robust: handles edge cases, unexpected inputs, and failures gracefully
Readable: easy to understand without prior context; the next developer can comprehend it quickly
Efficient: uses resources appropriately — not over-engineered, not wasteful

Software entropy: an evolving system increases its complexity unless work is done to reduce it. — Meir Lehman

Shipping first-time code is like going into debt. A little debt speeds development so long as it is paid back promptly with refactoring. The danger occurs when the debt is not repaid. Every minute spent on code that is not quite right counts as interest on that debt. — Ward Cunningham

Short-term shortcuts accumulate over time. Unrepaid technical debt slows all future development — every new feature must be built on top of an increasingly unstable foundation.

Naming Conventions

Names are the primary communication channel between the author and future readers. A well-named variable, method, or class eliminates the need for a comment.

Names should be auto-descriptive and pronounceable — avoid abbreviations and acronyms
Variables and fields: nouns or noun phrases describing what they hold
Methods: verbs or verb phrases — getUser(), validateInput(), computeTotal()
Booleans: questions that answer true/false — isValid(), hasPermission(), areEqual()
Constants: SCREAMING_SNAKE_CASE for named constants; never embed magic numbers or strings in logic

// Bad: what is 'd'? what does '4' mean? what is 'u.s'?
int d = 86400;
if (u.s == 4) { ... }

// Good: self-explanatory, no comment needed
final int SECONDS_PER_DAY = 86400;
if (user.status == UserStatus.SUSPENDED) { ... }
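The fragment above leaves `user` and `UserStatus` undefined. A minimal runnable sketch of the same conventions might look like the following. The `User` record, the `UserStatus` values, and `convertDaysToSeconds` are hypothetical names introduced for illustration, and records assume Java 16 or later.

```java
// Hypothetical types: User, UserStatus and convertDaysToSeconds are assumptions
// made only to complete the naming example.
final class NamingExample {

    enum UserStatus { ACTIVE, SUSPENDED, DELETED }

    // Noun for the type, nouns for the fields it holds.
    record User(String name, UserStatus status) {
        // Boolean method phrased as a question with a true/false answer.
        boolean isSuspended() {
            return status == UserStatus.SUSPENDED;
        }
    }

    // Named constant instead of the magic number 86400.
    static final int SECONDS_PER_DAY = 86_400;

    // Verb phrase for the method, descriptive noun for the parameter.
    static long convertDaysToSeconds(int dayCount) {
        return (long) dayCount * SECONDS_PER_DAY;
    }

    public static void main(String[] args) {
        User user = new User("ada", UserStatus.SUSPENDED);
        System.out.println(user.isSuspended());      // prints true
        System.out.println(convertDaysToSeconds(2)); // prints 172800
    }
}
```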

Comments

Comments are always failures. — Robert C. Martin (“Uncle Bob”), Clean Code

Comments lie — they go out of sync with the code they describe. They age badly: code is refactored but comments are forgotten. They are not refactorable: renaming a method does not rename comments that mention it.

A comment often signals that the code failed to:

Choose a good name for a variable or method
Extract logic into a well-named helper
Create the right abstraction
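As a rough illustration of the second point, the sketch below shows a condition that needs a comment and the same condition extracted into a well-named helper. The `Employee` record and the eligibility rule are hypothetical, invented only for this example, and records assume Java 16 or later.

```java
// Hypothetical example: Employee and the eligibility rule are illustrative assumptions.
final class BenefitsExample {

    record Employee(boolean hourly, int age) {}

    static final int FULL_BENEFITS_MIN_AGE = 65;

    // Before: a comment has to explain what the condition means.
    static String describeBefore(Employee employee) {
        // check whether the employee is eligible for full benefits
        if (employee.hourly() && employee.age() >= FULL_BENEFITS_MIN_AGE) {
            return "full benefits";
        }
        return "standard benefits";
    }

    // After: the helper's name carries the explanation, and a rename
    // refactoring updates every call site along with it.
    static String describeAfter(Employee employee) {
        if (isEligibleForFullBenefits(employee)) {
            return "full benefits";
        }
        return "standard benefits";
    }

    static boolean isEligibleForFullBenefits(Employee employee) {
        return employee.hourly() && employee.age() >= FULL_BENEFITS_MIN_AGE;
    }

    public static void main(String[] args) {
        Employee employee = new Employee(true, 67);
        System.out.println(describeBefore(employee)); // prints full benefits
        System.out.println(describeAfter(employee));  // prints full benefits
    }
}
```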

When comments ARE acceptable

Javadoc (/** ... */) for public APIs — documents intent and contracts, not implementation
Algorithm citations — when using a non-obvious algorithm, cite the paper or source (future maintainers need to understand the why)
Legal and license headers — required by many organizations
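For the Javadoc case, a comment of the acceptable kind could look like the sketch below: it states the contract (inputs, result, failure mode) and says nothing about how the loop works. The `Statistics` class and its `mean` method are hypothetical and exist only for this illustration.

```java
import java.util.List;

// Hypothetical utility class used only to illustrate contract-style Javadoc.
final class Statistics {

    private Statistics() {}

    /**
     * Returns the arithmetic mean of the given values.
     *
     * @param values the values to average; must not be null or empty
     * @return the sum of the values divided by their count
     * @throws IllegalArgumentException if {@code values} is null or empty
     */
    public static double mean(List<Double> values) {
        if (values == null || values.isEmpty()) {
            throw new IllegalArgumentException("values must not be null or empty");
        }
        double sum = 0.0;
        for (double value : values) {
            sum += value;
        }
        return sum / values.size();
    }

    public static void main(String[] args) {
        System.out.println(mean(List.of(1.0, 2.0, 3.0))); // prints 2.0
    }
}
```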

Code Layout

Good layout makes a file easier to navigate before a single line of logic is read:

File size: aim for ~200 lines; 500 is a hard warning sign
Line length: 80–120 characters per line — beyond that, the logic is probably too complex
Indentation and spacing: consistent style, enforced by tooling (not by convention)
Reading order: code should flow top-to-bottom like a newspaper — high-level concept first, details lower

The Newspaper Metaphor

A well-organized class reads like a news article: the headline (class name and purpose) at the top, the most important concepts next, and implementation details at the bottom. A reader should be able to stop reading at any point and have understood the most important parts.
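One possible shape for such a class is sketched below: the entry point comes first and reads like a summary, and the helpers follow in the order in which it mentions them. `InvoiceReport`, its tax rate, and its output format are assumptions made for the example.

```java
import java.util.List;

// Hypothetical class: names, tax rate and output format are illustrative assumptions.
final class InvoiceReport {

    private static final int TAX_PERCENT = 20;

    private InvoiceReport() {}

    // Headline: the entry point reads as a summary of the whole computation.
    static String render(List<Long> lineAmountsInCents) {
        long subtotal = sum(lineAmountsInCents);
        long tax = taxOn(subtotal);
        return format(subtotal, tax);
    }

    // Details: helpers appear below, in the order render() calls them.
    private static long sum(List<Long> amountsInCents) {
        long total = 0;
        for (long amount : amountsInCents) {
            total += amount;
        }
        return total;
    }

    private static long taxOn(long subtotalInCents) {
        return subtotalInCents * TAX_PERCENT / 100;
    }

    private static String format(long subtotal, long tax) {
        return "subtotal=" + subtotal + " tax=" + tax + " total=" + (subtotal + tax);
    }

    public static void main(String[] args) {
        // prints subtotal=15000 tax=3000 total=18000
        System.out.println(render(List.of(10_000L, 5_000L)));
    }
}
```

A reader who stops after `render` already knows what the class does; the private helpers only matter to someone who needs the details.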

Principles

KISS — Keep It Simple, Stupid

If you can’t explain it simply, you don’t understand it well enough. — commonly attributed to Albert Einstein

Use the simplest logical approach that works
Avoid layers of abstraction for their own sake
Simpler code is easier to read, test, debug, and maintain
Complexity is a debt paid by every future reader

DRY — Don’t Repeat Yourself

Also known as DIE (Duplication Is Evil).

Every piece of knowledge should have a single, authoritative representation in the codebase
Code duplication means multiple places to fix the same bug
Duplication is the root cause of many maintenance problems: a fix in one copy is forgotten in the other

Factor out duplication: extract common logic into methods, constants, or classes. Tools like PMD’s CPD (Copy/Paste Detector) can find duplication automatically.
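A small sketch of the idea, assuming a hypothetical registration module in which users and teams share the same naming rule: the rule lives in one method and one constant, so a fix or a changed limit applies to every caller at once. It assumes Java 11 or later for `String.isBlank`.

```java
// Hypothetical example: the registration rules and names are illustrative assumptions.
final class Registration {

    static final int MAX_NAME_LENGTH = 50;

    private Registration() {}

    // Single authoritative representation of the "valid name" rule.
    static boolean isValidName(String name) {
        return name != null && !name.isBlank() && name.length() <= MAX_NAME_LENGTH;
    }

    // Both callers reuse the rule instead of copying the condition.
    static String registerUser(String userName) {
        if (!isValidName(userName)) {
            throw new IllegalArgumentException("invalid user name");
        }
        return "user:" + userName.trim();
    }

    static String registerTeam(String teamName) {
        if (!isValidName(teamName)) {
            throw new IllegalArgumentException("invalid team name");
        }
        return "team:" + teamName.trim();
    }

    public static void main(String[] args) {
        System.out.println(registerUser(" ada "));  // prints user:ada
        System.out.println(registerTeam("kernel")); // prints team:kernel
    }
}
```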

YAGNI — You Aren’t Gonna Need It

From Extreme Programming (XP): prefer the simplest thing that could possibly work.

Do not build features you think you might need in the future
Unused code is dead weight: it must be read, tested, and maintained, but serves no user
Speculative generality is one of the most common code smells in maintenance-heavy codebases

SOLID

The five SOLID principles provide a framework for designing classes and modules that are easy to change and extend:

| Principle | Name | Meaning |
|---|---|---|
| S | Single Responsibility | A class should have exactly one reason to change |
| O | Open/Closed | Open for extension, closed for modification: add behaviour by adding code, not by changing existing code |
| L | Liskov Substitution | A subclass must be usable wherever its parent class is used without surprising the caller |
| I | Interface Segregation | Many small, specific interfaces are better than one large general-purpose interface |
| D | Dependency Inversion | Depend on abstractions (interfaces), not concrete implementations |

Violating SOLID typically leads to tightly coupled, hard-to-test classes that break whenever a related component changes.
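To make a few of these principles concrete, here is a minimal sketch covering Single Responsibility, Open/Closed, and Dependency Inversion. All type names (`DiscountPolicy`, `PriceCalculator`, and so on) and the discount rules are hypothetical, chosen only for illustration.

```java
// Hypothetical example: all type names and discount rules are illustrative assumptions.
final class CheckoutExample {

    // Dependency Inversion: the calculator depends on this abstraction only.
    interface DiscountPolicy {
        long discountCents(long subtotalCents);
    }

    // Open/Closed: new behaviour arrives as a new implementation,
    // without editing PriceCalculator.
    static final class NoDiscount implements DiscountPolicy {
        @Override
        public long discountCents(long subtotalCents) {
            return 0;
        }
    }

    static final class PercentageDiscount implements DiscountPolicy {
        private final int percent;

        PercentageDiscount(int percent) {
            this.percent = percent;
        }

        @Override
        public long discountCents(long subtotalCents) {
            return subtotalCents * percent / 100;
        }
    }

    // Single Responsibility: this class only combines a subtotal with a policy.
    static final class PriceCalculator {
        private final DiscountPolicy discountPolicy;

        PriceCalculator(DiscountPolicy discountPolicy) {
            this.discountPolicy = discountPolicy;
        }

        long totalCents(long subtotalCents) {
            return subtotalCents - discountPolicy.discountCents(subtotalCents);
        }
    }

    public static void main(String[] args) {
        PriceCalculator regular = new PriceCalculator(new NoDiscount());
        PriceCalculator sale = new PriceCalculator(new PercentageDiscount(10));
        System.out.println(regular.totalCents(20_000)); // prints 20000
        System.out.println(sale.totalCents(20_000));    // prints 18000
    }
}
```

Because `PriceCalculator` receives its policy through the constructor, a test can pass in a stub policy without touching any concrete discount class.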

Law of Demeter

“Don’t talk to strangers.” A method should only call:

Methods on itself (this)
Methods on objects it created
Methods on objects passed to it as parameters
Methods on its direct fields

Violating the Law of Demeter produces train wrecks: chains like order.getCustomer().getAddress().getCity() couple the caller to the entire object graph and make refactoring painful. The fix is to ask the object for what you need, not to dig into its structure.

// Violation — caller knows too much about internal structure
String city = order.getCustomer().getAddress().getCity();

// Better — delegation through the chain
String city = order.getCustomerCity();
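The "better" call only works if each class delegates to its direct collaborator. A minimal sketch of that delegation, assuming hypothetical `Order`, `Customer`, and `Address` types (records need Java 16 or later):

```java
// Hypothetical types: Order, Customer and Address are assumptions used to show delegation.
final class DemeterExample {

    record Address(String city) {}

    static final class Customer {
        private final Address address;

        Customer(Address address) {
            this.address = address;
        }

        // Customer answers the question itself instead of handing out its Address.
        String getCity() {
            return address.city();
        }
    }

    static final class Order {
        private final Customer customer;

        Order(Customer customer) {
            this.customer = customer;
        }

        // Order talks only to its direct field; callers never see Customer or Address.
        String getCustomerCity() {
            return customer.getCity();
        }
    }

    public static void main(String[] args) {
        Order order = new Order(new Customer(new Address("Lyon")));
        System.out.println(order.getCustomerCity()); // prints Lyon
    }
}
```

If `Customer` later stores several addresses, only `Customer.getCity()` changes; code that calls `order.getCustomerCity()` is unaffected.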

Code Smells

Code smells are patterns that signal a likely design problem. They don’t always indicate a bug, but they indicate code that will become harder to maintain over time:

| Smell | Description |
|---|---|
| Long Method | Methods longer than 20-30 lines usually have multiple responsibilities |
| Large Class | Classes that do too much; violates Single Responsibility |
| Duplicate Code | The same logic appears in more than one place |
| Long Parameter List | More than 2-3 parameters is a sign of missing abstraction |
| Feature Envy | A method uses data from another class more than its own |
| Dead Code | Code that is never called: dead weight, misleads readers |
| Magic Numbers/Strings | Literal values with no explanation embedded in logic |
| Deeply Nested Conditionals | More than 2-3 nesting levels; use early returns or extracted methods |
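As one example of removing a smell, the sketch below flattens a deeply nested conditional with early returns (guard clauses). The `Order` record and the status messages are hypothetical; both methods are intended to return the same result for every input.

```java
// Hypothetical example: Order and the status messages are illustrative assumptions.
final class ShippingExample {

    record Order(boolean paid, boolean inStock, String address) {}

    // Before: three nesting levels hide the single success path.
    static String shipNested(Order order) {
        if (order != null) {
            if (order.paid()) {
                if (order.inStock()) {
                    return "shipped to " + order.address();
                } else {
                    return "out of stock";
                }
            } else {
                return "payment required";
            }
        } else {
            return "no order";
        }
    }

    // After: each guard clause rejects one failure case and returns early,
    // leaving the success path unindented at the end.
    static String shipWithGuards(Order order) {
        if (order == null) {
            return "no order";
        }
        if (!order.paid()) {
            return "payment required";
        }
        if (!order.inStock()) {
            return "out of stock";
        }
        return "shipped to " + order.address();
    }

    public static void main(String[] args) {
        Order order = new Order(true, true, "12 Main St");
        System.out.println(shipNested(order));     // prints shipped to 12 Main St
        System.out.println(shipWithGuards(order)); // prints shipped to 12 Main St
    }
}
```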

Code smells compound over time

A single code smell is a minor concern. Multiple smells in the same class signal a class that has lost cohesion and will become steadily more expensive to modify. Address smells incrementally during normal development; do not wait for a dedicated cleanup sprint.

Java Tooling

Good practices need tool support. The Java ecosystem provides a mature set of tools that integrate into build systems (Maven, Gradle) and CI pipelines.

Checkstyle

Enforces coding style rules: naming conventions, import ordering, Javadoc completeness, maximum line length, indentation.

Configurable rule sets (Google Java Style, Sun Coding Conventions, or custom)
Integrates with Maven, Gradle, IntelliJ IDEA, Eclipse, VS Code
Fast: runs on source code before compilation

<!-- Maven: add to pom.xml -->
<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-checkstyle-plugin</artifactId>
    <version>3.4.0</version>
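    <configuration>
        <configLocation>google_checks.xml</configLocation>
        <!-- google_checks.xml reports at "warning" severity; lower the failure threshold (default: error) so violations fail the build -->
        <violationSeverity>warning</violationSeverity>
        <failsOnError>true</failsOnError>
        <consoleOutput>true</consoleOutput>
    </configuration>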
    <executions>
        <execution>
            <id>validate</id>
            <phase>validate</phase>
            <goals><goal>check</goal></goals>
        </execution>
    </executions>
</plugin>

Run: mvn checkstyle:check

PMD

Static source code analyzer that finds potential bugs, unused code, overly complex methods, empty catch blocks, and naming violations. Its CPD (Copy/Paste Detector) sub-tool finds duplicated code.

# Analyze source code
pmd check -d src/main/java -R rulesets/java/quickstart.xml -f text

# Find duplicate code (CPD)
pmd cpd --minimum-tokens 100 --dir src/main/java

SpotBugs

Analyzes compiled bytecode (.class files) rather than source code. This allows it to detect a different class of bugs: null pointer dereferences, resource leaks, incorrect equals/hashCode implementations, infinite loops, and security vulnerabilities.

SpotBugs is the maintained successor to FindBugs.

<plugin>
    <groupId>com.github.spotbugs</groupId>
    <artifactId>spotbugs-maven-plugin</artifactId>
    <version>4.8.6.4</version>
    <configuration>
        <effort>Max</effort>
        <threshold>Low</threshold>
    </configuration>
</plugin>

Run: mvn spotbugs:check

JaCoCo — Code Coverage

Measures which lines, branches, and methods are actually executed by your test suite. Generates HTML and XML reports.

<plugin>
    <groupId>org.jacoco</groupId>
    <artifactId>jacoco-maven-plugin</artifactId>
    <version>0.8.11</version>
    <executions>
        <execution>
            <goals><goal>prepare-agent</goal></goals>
        </execution>
        <execution>
            <id>report</id>
            <phase>test</phase>
            <goals><goal>report</goal></goals>
        </execution>
    </executions>
</plugin>

Run: mvn test (coverage runs automatically) → HTML report in target/site/jacoco/index.html

Coverage is not Quality

100% line coverage does not mean bug-free code. A test that executes a line without asserting anything about the result contributes to coverage without validating behaviour. Coverage is a necessary but not sufficient condition for quality.
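The difference can be shown with a deliberately buggy method and two checks that cover it equally. To stay dependency-free, the sketch uses hand-rolled test methods that return a boolean in place of a real test framework; `applyDiscount` and both test methods are hypothetical.

```java
// Hypothetical example: plain methods stand in for framework tests to avoid dependencies.
final class CoverageExample {

    // Deliberately buggy: the discount should be subtracted, not added.
    static int applyDiscount(int price, int discount) {
        return price + discount;
    }

    // Executes every line of applyDiscount, so line coverage is 100%,
    // but nothing about the result is checked: the bug goes unnoticed.
    static boolean testWithoutAssertion() {
        applyDiscount(100, 20);
        return true;
    }

    // Same coverage, but the result is compared with the expected value,
    // so the bug makes this check fail.
    static boolean testWithAssertion() {
        return applyDiscount(100, 20) == 80;
    }

    public static void main(String[] args) {
        System.out.println("without assertion passes: " + testWithoutAssertion()); // true
        System.out.println("with assertion passes: " + testWithAssertion());       // false
    }
}
```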

SonarQube

A unified quality management platform that aggregates results from static analysis, coverage, duplication detection, and security scanning into a single dashboard.

Can import Checkstyle, PMD, and SpotBugs findings and JaCoCo coverage reports alongside the results of its own analyzers
Tracks metrics over time: technical debt trend, code smell growth, coverage evolution
Quality Gate: a configurable threshold (e.g., “coverage > 80%, no new critical bugs”) that can fail a CI pipeline
Available as a cloud service (SonarCloud) or self-hosted

Build Automation Pipeline (GitLab CI)

Putting it all together — an example pipeline that enforces quality automatically on every push:

stages:
    - lint
    - build
    - test
    - quality

job:checkstyle:
    stage: lint
    script: mvn checkstyle:check

job:build:
    stage: build
    script: mvn compile -DskipTests

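job:test:
    stage: test
    script: mvn test
    artifacts:
        # Keep target/ (compiled classes, JaCoCo report) for the quality stage;
        # each job otherwise starts from a fresh checkout
        paths:
            - target/

job:quality:
    stage: quality
    script:
        - mvn spotbugs:check
        # JaCoCo report is generated automatically during 'mvn test' (bound to test phase)
        - mvn sonar:sonar -Dsonar.projectKey=my-project
    when: on_success
    dependencies:
        - job:test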

Fail fast, fix fast

Putting Checkstyle in the lint stage means style violations are caught before compilation even starts, which is the fastest feedback this pipeline can give. SpotBugs runs later because it needs compiled bytecode, and SonarQube runs after the tests so that the JaCoCo coverage report is available to it. Order your pipeline stages to surface the cheapest checks first.