Layering Abstractions

                        June 10, 2021

                Layering Abstractions

                        Layering Abstractions
When designing software systems,
we care about flexibility
as requirements change.
One might break that down
into two rough parts.
First, flexibility across systems—i.e.,
when the way that two or more systems interacts
needs to change, 
one can adapt without significantly changing
any individual component.
A lot has been written about this,
including the Unix Philosophy,
SOLID (some of which overlaps this article’s content),
modular programming,
and software contracts.
Second, which I will focus on in this article,
is flexibility within a component.
For this article, a “component” 
is a single deployable binary,
which could be a web service,
or a background job,
or a data processing pipeline.
By flexibility within a component,
I mean that 
changes to requirements—such as
the semantic interpretation of its output,
the incorporation of a new input,
or an efficiency improvement—don’t
require major redesigns.
Sometimes it’s inevitable,
but ideally the amount of code 
that needs to change
should be somehow proportional
to the desired impact of the requirement.
The benefits should justify the effort.
One approach 
to designing a component for flexibility
is what I informally think of as
“layering abstractions.”
A software component is an onion.
The outer layers get weathered,
bruised, 
and eventually peeled off,
but they protect the inner layers.
Then, breaking with the metaphor,
the core rarely needs to change,
and can be reused 
as the core of another onion.
The amount of work required for new applications
depends on how many layers need to be modified
or rebuilt to support it.
For me, this approach works well
in the context of a mathematically inclined system,
where the layers usually correspond to:¹
Interfacing with the outside world
 : ! ! ! ! ! ! 
 : ! ! ! ! ! ! 
 : --------- r-'
 : --------- : !`.
 : --------- : ! !`.
 : ========= n ! ! !`.-.-.-.-.-.-.
             '-n ! ! ! ! ! ! ! ! :
               '-n ! ! ! ! ! ! ! :
                 '-n ! ! ! ! ! ! :
                  .'-n --------- :
                .' ! ----------- :
              .' ! ! ----------- :
Applying hyper-specific business logic 
(hacks and patches)
  .-.-.-.-.-.' ! ! ! ----------- :
 : ! ! ! ! ! ! ! ! ============= :
 : ! ! ! ! ! ! ! r-'
 : ! ! ! ! ! ! r-'
 : --------- r-'
 : --------- : !`.
 : --------- : ! !`.
 : ========= n ! ! !`.-.-.-.-.-.-.
             '-n ! ! ! ! ! ! ! ! :
               '-n ! ! ! ! ! ! ! :
                 '-n ! ! ! ! ! ! :
                  .'-n --------- :
                .' ! ----------- :
              .' ! ! ----------- :
Generally applicable business logic
  .-.-.-.-.-.' ! ! ! ----------- :
 : ! ! ! ! ! ! ! ! ============= :
 : ! ! ! ! ! ! ! r-'
 : ! ! ! ! ! ! r-'
 : --------- r-'
 : --------- : !`.
 : --------- : ! !`.
 : ========= n ! ! !`.-.-.-.-.-.-.
             '-n ! ! ! ! ! ! ! ! :
               '-n ! ! ! ! ! ! ! :
                 '-n ! ! ! ! ! ! :
                  .'-n --------- :
                .' ! ----------- :
              .' ! ! ----------- :
Interface to mathematical model
  .-.-.-.-.-.' ! ! ! ----------- :
 : ! ! ! ! ! ! ! ! ============= :
 : ! ! ! ! ! ! ! r-'
 : ! ! ! ! ! ! r-'
 : --------- r-'
 : --------- : !`.
 : --------- : ! !`.
 : ========= n ! ! !`.-.-.-.-.-.-.
             '-n ! ! ! ! ! ! ! ! :
               '-n ! ! ! ! ! ! ! :
                 '-n ! ! ! ! ! ! :
                   '-n --------- :
                    .'---------- :
                  .'! ---------- :
                .'! ! ---------- :
              .'! ! ! ========== :
              ! ! ! r-'         ::
Raw algorithm ! ! r-'           ::
or model      ! r-'             ::

Here’s an example I worked on recently,
with the domain of datacenters replaced
with the domain of libraries.
In this problem you manage a library system
and you want to make sure there are enough
children’s books among the different branches.
You know where the books are now,
and you have some idea of where the demand for books is,
and you want to make sure that some branches
get priority over other branches,²
but that there is a minimum supply of books
in every branch.
This problem naturally comes with caveats
that you only learn along the way.
For one, we also care about particular categories
within “children’s books,”
such as cardboard books,
classics,
new releases,
books about animals, etc.
Categories might overlap,
and we want enough of each category at each branch,
not just total book count.
And there are always exceptions:
Goodnight Moon is required at every library.
We also need to be aware of which books
are currently checked out 
and can’t be redistributed yet.
The non-layered approach would pick an algorithm, 
and then expose the algorithm to the details
of the children’s books and categories,
branches,
checked-out books,
and exceptional cases.
A layered approach 
would pick a general algorithm
at the bottom layer,
and higher layers would shape or simplify
domain specificity and data to fit the algorithm’s interface.
For example,
whatever algorithm we choose,
the lowest level should know nothing about books.
Instead “books” would be a bucket 
of integer-valued “supply” of a generic resource bucket.
And the demand for books in various branches 
would be similar 
(I’m using a second type for name clarity later).
In Java, modeling everything with just interfaces
for brevity:
interface Supply<B, L> {
  B bucket();
  int quantity();
}

interface Demand<B, L> {
  B bucket();
  int quantity();
}

Here, crucially, a B represents not just a single item,
but a “bucket” of items that can include
the branch where the books are currently,
the category of books—which could be a singleton,
like the “Goodnight Moon” category—or
other details we haven’t thought of yet.
Notably, this setup assumes buckets don’t overlap.
If categories overlap, then a bucket would have to
correspond to a multiset of category labels,
which would be simple to model for supply,
but might be difficult to model for demand.
Prioritization rules would be provided to the lowest
level as a comparison of buckets,
and minimum thresholds are similar.
interface DemandPrioritization<B> implements Comparator<B> {
  @Override
  int comparePriority(B left, B right);
}

interface SupplyBuffer<B> {
  int minimumBuffer(B bucket);
}

The core algorithm would then operate with this interface
interface SupplyReallocation<B> {
  Collection<Supply<B>> reallocate(
    Collection<Supply<B>> supply,
    Collection<Demand<B>> demand,
    DemandPrioritization<B> priority,
    SupplyBuffer<B> minimumSupply);
}

The caller of this code (still in the same component) 
is required to define the buckets as concrete types,
make choices about how special cases are represented
(such as a Goodnight Moon bucket),
and implement the two interfaces required.
For example, minimumSupply might be provided
by a class that is backed by the data pulled
from a database table,
or even hard coded until “real” data is available.
Imagine a twist:
there are some books in “sets.”
If you move one book,
you have to move the whole set.
Where would we make this change?
We could redefine the buckets
so that the unit of supply
is a “book or set.”
This would require no changes to the
hard, algorithmic part.
Alternatively,
if the rules for how books may move around
get sufficiently complex,
or the buckets suffer combinatorial explosion,
one could expand the algorithm
to be more general by
accepting as input a “reallocation impact rule”
that, according to some limited interface,
defines impact of book movement. E.g.,
interface ReallocationChoice<B> {
  B sourceBin();
  B targetBin();
  int quantity();
}

interface ReallocationEffect<B> {
  Collection<ReallocationChoice<B>> impactOfChoice(ReallocationChoice<B> choice);  
}

This should have well-defined behavior like
throwing an exception if the choice 
is incompatible with the required effects.
Of course, not all algorithm implementation choices
would be compatible with such an extension.
But the point is that by generalizing the domain slightly,
you admit more flexibility for changes,
and they often require only changes to
how the data in the outer layers is reorganized
and passed to the core.
And, as in mathematics,
sometimes generalizing the problem slightly
points to a more obvious, efficient solution.
At the layer just above the core algorithm,
we can split the business processing into more layers.
Typically I like to think of these higher layers
as split between long-term and short-term business logic.
The long-term logic might be organized like these interfaces.
interface BookBranchQty {
  Book book();
  Branch branch();
  int quantity();
}

interface BookReallocator {
  Collection<BookBranchQty> reallocate(Collection<BookBranchQty> currentDistribution);
}

Here the details about books, branches,
and such are made explicit,
as well as being aware of things like
checked-out books vs not-checked-out books, if needed.
In the above, 
BookReallocator might wrap (via a builder) 
the additional data needed to group BookBranchQty into the right Supply buckets, along with
a DemandPrioritization instance to pass to
SupplyReallocation::reallocate.
And crucially, “hacks”  meant to deal
with short term situations
should live outside of BookReallocator,
and only be applied to the data passed to it.
In some sense,
that means that you’re designing
both the BookReallocator
and the deeper SupplyReallocation instance
to be flexible enough that “hacks” 
can be injected via the normal interface.
One might also say that makes them “not hacks,”
but your mileage may vary on that.
Say a beloved children’s book author
donates her private collection to your library system.
We can inject new data to BookReallocator
representing the donation as its own branch,
with zero demand or buffer at that branch.
Say some books are suddenly deemed too racy for children,
and they need to be phased out quickly,
but we want to make sure we replenish
enough books before pulling the bad books out of circulation.
Add a new fake branch called “The Dump”,
with infinite demand for the bad book,
and gets highest priority.
And, of course,
the core algorithm is generally useful!
You could apply it to other kinds of supply/demand
problems related to your library.
Perhaps reallocating volunteer hours or funding dollars.
And those, likely,
would come with their own peculiar hacks and business logic.

Stairs ASCII art from baustein.eu. ↩

One might assert that libraries, as a public resource,
  should prioritize underserved communities. ↩

                        Don't miss what's next. Subscribe to Halfspace: