Java Janitor Jim - "Integrity by Design" through Ensuring "Illegal States are Unrepresentable" - Part 1
Seeking simple Java-style solutions to enhance system integrity while reducing client validation boilerplate - V2026.01.24
tl;dr
I wanted a simple pattern for preventing a class from being instantiated in an invalid state, or from mutating into one.
Why? Because it vastly reduces the amount and complexity of reasoning required for use at client call-sites.
Think of it as “integrity by design”, a complement to the “integrity by default” effort undertaken by the Java architects, detailed here.
This article discusses the design and implementation of a record pattern, very similar to the one I designed and implemented for Scala’s case class several years ago, which provides the “integrity by design” guarantees by ensuring that only valid record instances can be observed.
This pattern is also trivially cross-applicable to Java classes.
The Problem - (Re)Validation Everywhere
When a method, function/lambda/closure (FLC), or type (a.k.a. class, interface, enum, or record) is defined, the weaker its contractual guarantees (design-by-contract [DbC] preconditions, postconditions, and invariants), the greater the number of its possible states.
As the number of possible states increases, the surface area and burden for all related tests in the TDD framework grow, and so does the validation required of clients.
While weaker contractual guarantees might provide more flexibility, they eventually introduce increased validation complexity into any code that depends on them.
The trade-off can be summarized as…
Increased flexibility leads to increased complexity.
All of this increased complexity leads to leaky state (undesirable, invalid, and/or insane states), fragile code bases where touching one part of the code breaks seemingly unrelated parts, and an increase in runtime exceptions rather than the much preferred compile-time errors.
Leaky state threatens the stability and robustness of production environments, and is considered especially dangerous in Enterprise IT systems.
It also expands the surface area for AI/LLM hallucinations.
What if we could find a better trade-off? A trade-off more in alignment with the DRY (Don’t Repeat Yourself) principle? That increases the likelihood of catching AI/LLM errors at compile time rather than at runtime after shipping code to customers?
Simple, Naive, and Wrong
Before diving into the proposed design solution, consider an example.
Class NamePartsMethodNaive
Here’s a relatively common static methods pattern found in legacy Java code bases.
public class NamePartsMethodNaive {
  public static String capitalizeNameParts(
    String first,
    String last,
    String middle
  ) {
    return "%s%s %s".formatted(
      capitalize(first),
      middle.isEmpty()
        ? ""
        : " " + capitalize(middle),
      capitalize(last));
  }

  private static String capitalize(String namePart) {
    return namePart.substring(0, 1).toUpperCase() +
      (namePart.length() > 1
        ? namePart.substring(1)
        : "");
  }
}
Notice how it doesn’t guard its String parameters against being null, empty, blank, or containing non-letters; i.e., invalid or insane states.
What’s an example of an insane state?
This, which will throw a NullPointerException:
var fullNameInsaneStateNpe =
  NamePartsMethodNaive
    .capitalizeNameParts(" ", null, "-14");
Or perhaps even worse, this, which doesn’t throw anything and acts as if all is okay, returning an incoherent String value:
var fullNameInsaneStateHuh =
  NamePartsMethodNaive
    .capitalizeNameParts(" ", " ", " ");
An empty String in a required position is yet another failure mode: capitalize("") throws a StringIndexOutOfBoundsException from substring(0, 1).
While the code for fullNameInsaneStateHuh might work today, capitalizeNameParts() is quite fragile in the face of any client that experiences a change in design or implementation requirements. It fails to constrain the invalid states sent to it, regardless of whether it responds by throwing an exception or, worse, by returning an incoherent and therefore invalid value.
Said simply, this implementation requires that all validation be forced into every client of this class and its methods.
And the primary way the client will discover the need for said guards and protection is reactively, i.e., with runtime exceptions and weird failures. Those might occur in bits of code nowhere near where this method returns the erroneous results; think of undesirable data appearing in the UI/UX and/or in downstream reports.
The more of these potential invalid-state failures that can be moved from runtime checks to compile-time checks, the better for system operation, robustness, and future refactoring and test requirements.
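To make the cost concrete, here is a minimal sketch of what “validation forced into every client” tends to look like. The two client classes (ReportClientSketch and BadgeClientSketch) and their fallback behavior are hypothetical, invented purely for illustration; NaiveNamesSketch is a compact stand-in that mirrors the logic of NamePartsMethodNaive.

```java
// Compact stand-in for NamePartsMethodNaive (same logic, still no guards).
class NaiveNamesSketch {
  static String capitalizeNameParts(String first, String last, String middle) {
    return "%s%s %s".formatted(
        capitalize(first),
        middle.isEmpty() ? "" : " " + capitalize(middle),
        capitalize(last));
  }

  private static String capitalize(String namePart) {
    return namePart.substring(0, 1).toUpperCase() + namePart.substring(1);
  }
}

// Hypothetical client #1: carries its own complete copy of the guard.
class ReportClientSketch {
  static String header(String first, String last, String middle) {
    if (!isUsable(first) || !isUsable(last)) {
      return "(name unavailable)";
    }
    return NaiveNamesSketch.capitalizeNameParts(
        first, last, isUsable(middle) ? middle : "");
  }

  private static boolean isUsable(String namePart) {
    return namePart != null
        && !namePart.isBlank()
        && namePart.chars().allMatch(Character::isLetter);
  }
}

// Hypothetical client #2: an inline guard that has already drifted;
// it forgets the letters-only check.
class BadgeClientSketch {
  static String label(String first, String last) {
    if (first == null || first.isBlank() || last == null || last.isBlank()) {
      throw new IllegalArgumentException("first and last are required");
    }
    return NaiveNamesSketch.capitalizeNameParts(first, last, "");
  }
}

class ClientGuardDemo {
  public static void main(String[] args) {
    System.out.println(ReportClientSketch.header("jim", "smith", null));
    System.out.println(ReportClientSketch.header("-14", "smith", ""));
    // The drifted guard lets the non-letter value through.
    System.out.println(BadgeClientSketch.label("-14", "smith"));
  }
}
```

Each call-site owns a slightly different copy of the same rules, and nothing in the compiler notices when one copy falls behind.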
Boilerplate Infestation
So, what might it look like to make a first pass at reifying this method to better guard against possible insane states?
Class NamePartsMethodGuarded
public class NamePartsMethodGuarded {
  public static String capitalizeNameParts(
    String first,
    String last,
    String middle
  ) {
    //TODO: while handling international accents, this must be expanded to
    //  handle appropriately placed apostrophes, hyphens, and spaces
    if ((first == null) || first.isBlank() ||
      !first.chars().allMatch(Character::isLetter)
    ) {
      throw new IllegalArgumentException(
        "first must not be null, blank, or contain any non-letter " +
          "characters");
    }
    if ((last == null) || last.isBlank() ||
      !last.chars().allMatch(Character::isLetter)
    ) {
      throw new IllegalArgumentException(
        "last must not be null, blank, or contain any non-letter " +
          "characters");
    }
    var firstResolved = capitalize(first);
    var lastResolved = capitalize(last);
    String middleResolved = null;
    if ((middle == null) || middle.isBlank()) {
      middleResolved = "";
    } else {
      if (!middle.chars().allMatch(Character::isLetter)) {
        throw new IllegalArgumentException("middle must not contain " +
          "any non-letter characters");
      }
      middleResolved = capitalize(middle);
    }
    return "%s%s %s".formatted(
      firstResolved,
      middleResolved.isEmpty()
        ? ""
        : " " + middleResolved,
      lastResolved);
  }

  private static String capitalize(String namePart) {
    //implicitly assumes all the validation has been correctly completed,
    //  i.e. the parameter namePart cannot be null, blank, or contain any
    //  non-letter characters
    return namePart.substring(0, 1).toUpperCase() +
      (namePart.length() > 1
        ? namePart.substring(1)
        : "");
  }
}
That sure is a whole lot of boilerplate. While this example could be further DRYed, the nuances of the validation logic make it challenging to justify the ROI.
At least, most of the insane states have been constrained and funneled into more effective runtime exceptions.
While this is indeed an improvement, what if we move some or all of the constraints into compile-time errors?
Insane States, Take a (Compile-time) Hike
If we combine the intentions behind DbC’s (Eiffel’s lovely class state validation model) concept of preconditions, postconditions, and invariants, with OOP’s types and encapsulation, and then mathematically compose (explicitly not OOP’s “reusability” bias) that with FP’s immutability, function purity, and ADTs (Algebraic Data Types), we discover a nice balance in the trade-offs.
The benefits should be a dramatic decrease in the number of unwanted insane states, a bias in the codebase towards more robust, resilient, and composable (a.k.a. refactorable) code, and a shift of most errors to compile-time, resulting in fewer, or even no, runtime exceptions.
The first step is to define types that better reflect the code’s intention. And Java has provided that in an excellent “immutable and totally transparent data encapsulator”, the record.
Given the above example as context, we could define the following helper type:
record NamePartCapitalizedUnsafe (singular)
@org.jspecify.annotations.NullMarked
public record NamePartCapitalizedUnsafe(
  String namePart
) {
  public NamePartCapitalizedUnsafe {
    //TODO: while handling international accents, this must be expanded to
    //  handle appropriately placed apostrophes, hyphens, and spaces
    if (namePart.isBlank() ||
      !namePart.chars().allMatch(Character::isLetter)
    ) {
      throw new IllegalArgumentException(
        "namePart must not be blank or contain any non-letter characters");
    }
    namePart = capitalize(namePart);
  }

  private static String capitalize(String namePart) {
    //implicitly assumes all the validation has been correctly completed,
    //  i.e. the parameter namePart cannot be blank or contain any non-letter
    //  characters
    return namePart.substring(0, 1).toUpperCase() +
      (namePart.length() > 1
        ? namePart.substring(1)
        : "");
  }
}
The following assumptions are made regarding the above code:
JSpecify’s @NullMarked as the default contract nullability context. This entirely removes the need for null checks (but if null is still required, see the @Nullable annotation provided by JSpecify) on the rest of the record’s methods
The (shallow) immutability as defined and expected by default when using a Java record. This enables the safe use of these (record) types with concurrency
The canonical constructor (written here in its compact form) ensures that only validated instantiations occur, otherwise throwing an IllegalArgumentException
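As a rough illustration of those assumptions from the client’s side, the sketch below exercises a trimmed-down copy of the record. NamePartSketch and NamePartSketchDemo are hypothetical names, and the JSpecify annotation is left out so that the block compiles with the JDK alone.

```java
import java.util.List;

// Trimmed-down copy of the record above (annotation omitted on purpose).
record NamePartSketch(String namePart) {
  NamePartSketch {
    if (namePart.isBlank() || !namePart.chars().allMatch(Character::isLetter)) {
      throw new IllegalArgumentException(
          "namePart must not be blank or contain any non-letter characters");
    }
    // Reassigning the compact-constructor parameter sets the stored value.
    namePart = namePart.substring(0, 1).toUpperCase() + namePart.substring(1);
  }
}

class NamePartSketchDemo {
  // Returns true when the record refuses to instantiate for the given input.
  static boolean isRejected(String candidate) {
    try {
      new NamePartSketch(candidate);
      return false;
    } catch (IllegalArgumentException expected) {
      return true;
    }
  }

  public static void main(String[] args) {
    // A successfully constructed instance is always valid and capitalized.
    System.out.println(new NamePartSketch("jim").namePart());
    for (var candidate : List.of("", " ", "-14", "o'neil")) {
      System.out.println("'" + candidate + "' rejected: " + isRejected(candidate));
    }
  }
}
```

Any code that receives a NamePartSketch can rely on it holding a non-blank, letters-only, capitalized value; there is no other way for an instance to exist.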
And now we can revisit the original example, and pare it down to something much simpler and DRYer:
record NamePartsCapitalizedUnsafe (plural)
import java.util.Optional;

@org.jspecify.annotations.NullMarked
public record NamePartsCapitalizedUnsafe(
  NamePartCapitalizedUnsafe namePartCapitalizedFirst,
  NamePartCapitalizedUnsafe namePartCapitalizedLast,
  Optional<NamePartCapitalizedUnsafe> namePartCapitalizedMiddle
) {
  public String fullName() {
    return "%s%s %s".formatted(
      namePartCapitalizedFirst.namePart(),
      namePartCapitalizedMiddle
        .map(v ->
          " " + v.namePart())
        .orElse(""),
      namePartCapitalizedLast.namePart());
  }
}
The following assumptions are made regarding the above code:
The only way to provide values to the constructor is via types that have stringent instantiation restrictions, ensuring they are in an already validated state. This results in no additional validation needed for this type
The encoding of “optional” or “default” is explicitly made at the compile-time type level using Optional, as opposed to the runtime value level (via being assigned null, an empty, a blank, or a sentinel value)
What Has Been Gained?
We now have a design that prevents the definition or instantiation of invalid and/or insane data states.
Whenever either record type, NamePartsCapitalizedUnsafe (plural) or NamePartCapitalizedUnsafe (singular), appears as a property, as a parameter to a method, and/or as a method’s return type, it trivially conveys that a specific contract has already been successfully enforced on that instance.
How?
Because all invalid instances are prevented from being successfully instantiated by the record’s canonical constructor.
The more stringent the contract, the less validation and testing is required by any client for either or both of these “simplified” types.
There are several benefits to this pattern:
Moving a whole category of run-time errors to compile-time checks
Significantly reducing the TDD total surface area to just this type
Reducing the overhead of validation in both clients and method or FLC signatures
Amplifying design intention while minimizing implementation decisions
Reducing the Undesirable Error-by-Thrown-Exception
Unfortunately, we are still using an exception-driven approach to reject invalid states. This is referred to as the Error-by-Thrown-Exception model. Hence, the “Unsafe” suffix was appended to the record’s name.
For more composable code, it is desirable to surface invalid states via the Error-by-Returned-Value model.
These two approaches are not mutually exclusive. It is possible, and indeed required, to provide and implement both. Then, the client can choose the preferred error model based on their unique context.
To do this, we need to move the validation logic out of the canonical constructor and into a static method, invalidate(), taking the same parameters as the record constructor.
And then, instead of throwing the resulting exception, we return it in a non-empty Optional.
Returning an empty Optional means the parameter(s) are safe to use to instantiate the record instance.
record NamePartCapitalizedSafe
import java.util.Optional;

@org.jspecify.annotations.NullMarked
public record NamePartCapitalizedSafe(
  String namePart
) {
  public static Optional<IllegalArgumentException> invalidate(
    String namePart
  ) {
    //TODO: while handling international accents, this must be expanded to
    //  handle appropriately placed apostrophes, hyphens, and spaces
    return !namePart.isBlank() &&
      namePart.chars().allMatch(Character::isLetter)
        ? Optional.empty()
        : Optional.of(new IllegalArgumentException(
          "namePart must not be blank or contain any non-letter characters"));
  }

  public NamePartCapitalizedSafe {
    //the constructor parameter must be passed along to the validation
    invalidate(namePart)
      .ifPresent(illegalArgumentException -> {
        throw illegalArgumentException;
      });
    namePart = capitalize(namePart);
  }

  private static String capitalize(String namePart) {
    //implicitly assumes all the validation has been correctly completed,
    //  i.e. the parameter namePart cannot be blank or contain any non-letter
    //  characters
    return namePart.substring(0, 1).toUpperCase() +
      (namePart.length() > 1
        ? namePart.substring(1)
        : "");
  }
}
Improved, But Inefficient
While we now have the desired Error-by-Returned-Value model implemented, it is inefficient because the validation will be executed twice if a client calls the invalidate() method prior to constructing the record: once by the client, and once more by the constructor.
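The repeated work can be made visible with a small, instrumented sketch. NamePartCountedSketch, its VALIDATION_RUNS counter, and the tryCreate() helper are all hypothetical and exist only to demonstrate the point; the JSpecify annotation is again omitted so the block compiles with the JDK alone.

```java
import java.util.Optional;
import java.util.concurrent.atomic.AtomicInteger;

// Instrumented variant of the record above; the counter is illustration only.
record NamePartCountedSketch(String namePart) {
  static final AtomicInteger VALIDATION_RUNS = new AtomicInteger();

  static Optional<IllegalArgumentException> invalidate(String namePart) {
    VALIDATION_RUNS.incrementAndGet();
    return !namePart.isBlank() && namePart.chars().allMatch(Character::isLetter)
        ? Optional.empty()
        : Optional.of(new IllegalArgumentException(
            "namePart must not be blank or contain any non-letter characters"));
  }

  NamePartCountedSketch {
    invalidate(namePart).ifPresent(e -> {
      throw e;
    });
    namePart = namePart.substring(0, 1).toUpperCase() + namePart.substring(1);
  }
}

class DoubleValidationDemo {
  // Error-by-Returned-Value call-site: check first, construct only when safe.
  static Optional<NamePartCountedSketch> tryCreate(String candidate) {
    return NamePartCountedSketch.invalidate(candidate).isPresent()
        ? Optional.empty()
        : Optional.of(new NamePartCountedSketch(candidate));
  }

  public static void main(String[] args) {
    var created = tryCreate("jim");
    System.out.println(
        created.map(NamePartCountedSketch::namePart).orElse("(rejected)"));
    // Valid input is validated by the client AND again by the constructor.
    System.out.println(
        "validation runs: " + NamePartCountedSketch.VALIDATION_RUNS.get());
  }
}
```

For a valid input the counter ends at 2, which is the duplicated effort described above; an invalid input stops after the client’s own check.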
To prevent this, we can introduce a factory method, from(), which takes the same parameters as the record constructor.
However, we face a challenge. The from() method must return one of two mutually exclusive values: either the unthrown Exception instance or the validated record instance.
This is handled by the Either<L, R> class. You can think of it as…
A pair of Optionals where one, only one, and exactly one of the two Optionals is non-empty: Pair<Optional<L>, Optional<R>>
An Optional where the empty side is defined to also return a (possibly differently typed) value
From deus-ex-java’s Either javadoc:
Represents an immutable value of one of two possible types (a disjoint union). An instance of Either is (constructor enforced via preconditions) to be well-defined and for whichever side is defined, the left or right, the value is guaranteed to be not-null.
A common use of Either is as an alternative to Optional for dealing with possibly erred or missing values. In this usage, Optional.isEmpty() is replaced with Either.getLeft() which, unlike Optional.isEmpty(), can contain useful information, like a descriptive error message. Either.getRight() takes the place of Optional.get().
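For readers who have not met the idea before, the following is a minimal sketch of a disjoint union. It is explicitly not the deus-ex-java implementation: EitherSketch, its nested Left and Right records, and the fold() method are assumptions made for illustration, written with a sealed interface (Java 17 or later).

```java
import java.util.Objects;
import java.util.function.Function;

// Hypothetical, minimal disjoint union; NOT the deus-ex-java Either.
sealed interface EitherSketch<L, R> {
  // Collapses whichever side is defined into a single result type.
  <T> T fold(Function<? super L, ? extends T> onLeft,
             Function<? super R, ? extends T> onRight);

  record Left<L, R>(L value) implements EitherSketch<L, R> {
    public Left {
      Objects.requireNonNull(value, "left value must not be null");
    }

    @Override
    public <T> T fold(Function<? super L, ? extends T> onLeft,
                      Function<? super R, ? extends T> onRight) {
      return onLeft.apply(value);
    }
  }

  record Right<L, R>(R value) implements EitherSketch<L, R> {
    public Right {
      Objects.requireNonNull(value, "right value must not be null");
    }

    @Override
    public <T> T fold(Function<? super L, ? extends T> onLeft,
                      Function<? super R, ? extends T> onRight) {
      return onRight.apply(value);
    }
  }

  static <L, R> EitherSketch<L, R> left(L value) {
    return new Left<>(value);
  }

  static <L, R> EitherSketch<L, R> right(R value) {
    return new Right<>(value);
  }
}

class EitherSketchDemo {
  // Left carries the reason for rejection; Right carries the usable value.
  static EitherSketch<String, String> capitalized(String namePart) {
    if (namePart.isBlank() || !namePart.chars().allMatch(Character::isLetter)) {
      return EitherSketch.left("rejected: '" + namePart + "'");
    }
    return EitherSketch.right(
        namePart.substring(0, 1).toUpperCase() + namePart.substring(1));
  }

  static String describe(String namePart) {
    return capitalized(namePart).fold(
        reason -> "Left(" + reason + ")",
        value -> "Right(" + value + ")");
  }

  public static void main(String[] args) {
    System.out.println(describe("jim")); // Right(Jim)
    System.out.println(describe("-14")); // Left(rejected: '-14')
  }
}
```

Because exactly one side can exist, the caller is forced to say what happens in both cases, which is the property an empty Optional cannot offer on its own.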
I plan to eventually write an article discussing Optional, Either, and the entire class of algebraic data types (ADTs) upon which the mathematical composition (as opposed to OOP’s focus on reusability) is based.
Refactoring to Use Either
The refactored code is now efficient, as a call to the from() factory method is able to ensure that the invalidate() static method is called exactly once prior to providing a validated instance.
import java.util.Optional;
//Either is the deus-ex-java class quoted above (its import is omitted here)

@org.jspecify.annotations.NullMarked
public record NamePartCapitalizedEither(
  String namePart
) {
  public static Optional<IllegalArgumentException> invalidate(
    String namePart
  ) {
    //TODO: while handling international accents, this must be expanded to
    //  handle appropriately placed apostrophes, hyphens, and spaces
    return !namePart.isBlank() &&
      namePart.chars().allMatch(Character::isLetter)
        ? Optional.empty()
        : Optional.of(new IllegalArgumentException(
          "namePart must not be blank or contain any non-letter characters"));
  }

  public static Either<IllegalArgumentException, NamePartCapitalizedEither>
  from(
    String namePart
  ) {
    //the constructor runs invalidate() exactly once; a rejection is caught
    //  here and returned as a value instead of being thrown to the client
    try {
      return Either.right(new NamePartCapitalizedEither(namePart));
    } catch (IllegalArgumentException illegalArgumentException) {
      return Either.left(illegalArgumentException);
    }
  }

  public NamePartCapitalizedEither {
    //the constructor parameter must be passed along to the validation
    invalidate(namePart)
      .ifPresent(illegalArgumentException -> {
        throw illegalArgumentException;
      });
    namePart = capitalize(namePart);
  }

  private static String capitalize(String namePart) {
    //implicitly assumes all the validation has been correctly completed,
    //  i.e. the parameter namePart cannot be blank or contain any non-letter
    //  characters
    return namePart.substring(0, 1).toUpperCase() +
      (namePart.length() > 1
        ? namePart.substring(1)
        : "");
  }
}
As you can see, this has turned into quite a bit of “boilerplate”.
This is the trade-off required to eliminate the “boilerplague” which infected clients who depended upon the original version, NamePartsMethodNaive.
Even though we have refactored the NamePartCapitalized* (singular) record several times, the NamePartsCapitalized* (plural) record remained essentially unaffected.
This is a great indication that the refactoring impacts were effectively contained; i.e., they didn’t propagate out to the clients.
import java.util.Optional;

@org.jspecify.annotations.NullMarked
public record NamePartsCapitalizedEither(
  NamePartCapitalizedEither namePartCapitalizedFirst,
  NamePartCapitalizedEither namePartCapitalizedLast,
  Optional<NamePartCapitalizedEither> namePartCapitalizedMiddle
) {
  public String fullName() {
    return "%s%s %s".formatted(
      namePartCapitalizedFirst.namePart(),
      namePartCapitalizedMiddle
        .map(v ->
          " " + v.namePart())
        .orElse(""),
      namePartCapitalizedLast.namePart());
  }
}
We now have all of the components necessary to compose a more advanced example of “ensuring illegal states are unrepresentable”.
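As a hedged illustration of how a client call-site might look once a from() factory exists, the sketch below composes two validated name parts into a full name with no try/catch and no re-validation by the client. EitherLite, NamePartLite, and FullNameLite are hypothetical, pared-down stand-ins; EitherLite in particular is not deus-ex-java’s Either, and its flatMap() and fold() methods are assumptions made for this sketch.

```java
import java.util.Optional;
import java.util.function.Function;

// Hypothetical stand-in for Either, reduced to what this sketch needs.
sealed interface EitherLite<L, R> {
  <T> EitherLite<L, T> flatMap(Function<? super R, EitherLite<L, T>> next);

  <T> T fold(Function<? super L, ? extends T> onLeft,
             Function<? super R, ? extends T> onRight);

  record Left<L, R>(L value) implements EitherLite<L, R> {
    public <T> EitherLite<L, T> flatMap(Function<? super R, EitherLite<L, T>> next) {
      return new Left<>(value); // short-circuit: keep the first error
    }

    public <T> T fold(Function<? super L, ? extends T> onLeft,
                      Function<? super R, ? extends T> onRight) {
      return onLeft.apply(value);
    }
  }

  record Right<L, R>(R value) implements EitherLite<L, R> {
    public <T> EitherLite<L, T> flatMap(Function<? super R, EitherLite<L, T>> next) {
      return next.apply(value);
    }

    public <T> T fold(Function<? super L, ? extends T> onLeft,
                      Function<? super R, ? extends T> onRight) {
      return onRight.apply(value);
    }
  }
}

record NamePartLite(String namePart) {
  NamePartLite {
    if (namePart.isBlank() || !namePart.chars().allMatch(Character::isLetter)) {
      throw new IllegalArgumentException("invalid name part: '" + namePart + "'");
    }
    namePart = namePart.substring(0, 1).toUpperCase() + namePart.substring(1);
  }

  // Validation runs once, inside the constructor; a rejection becomes a value.
  static EitherLite<IllegalArgumentException, NamePartLite> from(String namePart) {
    try {
      return new EitherLite.Right<>(new NamePartLite(namePart));
    } catch (IllegalArgumentException e) {
      return new EitherLite.Left<>(e);
    }
  }
}

record FullNameLite(
    NamePartLite first, NamePartLite last, Optional<NamePartLite> middle) {
  String fullName() {
    return "%s%s %s".formatted(
        first.namePart(),
        middle.map(v -> " " + v.namePart()).orElse(""),
        last.namePart());
  }

  // Only already-validated parts can reach the constructor.
  static EitherLite<IllegalArgumentException, FullNameLite> from(
      String first, String last) {
    return NamePartLite.from(first).flatMap(f ->
        NamePartLite.from(last).flatMap(l ->
            new EitherLite.Right<>(new FullNameLite(f, l, Optional.empty()))));
  }
}

class FullNameLiteDemo {
  static String render(String first, String last) {
    return FullNameLite.from(first, last).fold(
        error -> "rejected: " + error.getMessage(),
        name -> name.fullName());
  }

  public static void main(String[] args) {
    System.out.println(render("jim", "smith"));
    System.out.println(render("jim", "-14"));
  }
}
```

The call-site never inspects a String for validity; it only decides what to do with each of the two possible outcomes.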
However…
Next Article - Part 2
To meet my target of keeping this article readable in under 10 minutes, I have pushed the remaining content into an additional future article, [“Integrity by Design” through Ensuring “Illegal States are Unrepresentable” - Part 2].
This additional article will reify and abstract this pattern to make it even more general.
It will focus on a more complex example of “encoding a geospatial 2D polygon” via a bottom-up design that will emergently produce the desired effects while virtually eliminating any client validation.



