Alphabetization Soup

Monday, January 2, 2023

Something I've seen come up—typically in code review—in the past year is the alphabetization of different constructs within source code.

While alphabetization can be used to provide consistency within a codebase, over-application of it has the potential to actually make code worse.

Why alphabetize?

Let me preface this with the caveat that alphabetizing constructs in your code should be the computer's job. If you are making the stylistic choice to sort things alphabetically and don't have this as an "autofix" option in your formatter or linter, perhaps consider forgoing this until your tooling can handle it for you.

Now that we know that your computer will be doing all the drudgery of reordering constructs on your behalf, let's talk about why you might want to do this in the first place.

Alphabetizing your code can lead to greater consistency throughout the codebase. If certain constructs are always inserted in the same order, it removes the possibility that two different developers will make a different choice about where an addition will go. This can also help keep diffs clean, as additions will follow a deterministic order.

Another cited benefit is that alphabetizing can reduce the cognitive overhead for developers. If a list of items is alphabetized, it requires little thought to know where to insert a new item.

Case studies

Let's take a look at some different scenarios that I've seen out in the wild and how alphabetization plays out for each of them.

Object fields

Suppose we have the following Customer type:

interface Customer {
  id: CustomerId;
  username: string;
  fullName: string;
  age: number;
  address: Address;
}

As it stands, we have the fields arranged in a meaningful order:

The customer's ID comes first, as that is what we use to uniquely identify the record within the system
Next we have the customer's username and full name, which are both additional identifiers
Finally we have some ancillary metadata, like the customer's age and address

The important thing here is that the author has the agency to organize the fields in a way that helps encode additional meaning into the source code.

Of course, there is some subjectivity here. Who's to say that address shouldn't come before age?

In this case, I think the grouping of similar fields is more important than their ordering with respect to each other.

If we were to arrange the fields within the Customer type alphabetically, it would look like this:

interface Customer {
  address: Address;
  age: number;
  fullName: string;
  id: CustomerId;
  username: string;
}

While this ordering may seem fine (although I still maintain that it's weird to not have id come first), there are other cases where sorting fields alphabetically just doesn't make sense.

Consider this common definition of a Rectangle object that you might see in graphics programming:

interface Rectangle {
  width: number;
  height: number;
}

It is standard to talk about rectangles as "width by height", so by keeping the fields in the same order within the code we're keeping things aligned with the real world.

If we were to sort these fields alphabetically, we'd end up with this backwards ordering:

interface Rectangle {
  height: number;
  width: number;
}

There are also situations where the ordering of fields within an object has actual implications at runtime. When using repr(C) in Rust, the order of the fields determines their order in memory:

#[repr(C)]
struct ThreeInts {
    some_int: i16,
    just_an_int: i8,
    another_int: i32,
}

If we were to sort the struct fields alphabetically it would change the memory layout of the ThreeInts struct, and potentially break compatibility with an external program:

#[repr(C)]
struct ThreeInts {
    another_int: i32,
    just_an_int: i8,
    some_int: i16,
}
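
To make the layout change concrete, here is a minimal sketch that measures where each field lands in both versions of the struct. The type names Declared and Alphabetized and the helper functions are hypothetical, introduced only for this illustration, and the offsets in the comments assume a typical target where i32 is 4-byte aligned.

```rust
// Field order as originally declared.
#[repr(C)]
#[derive(Default)]
struct Declared {
    some_int: i16,
    just_an_int: i8,
    another_int: i32,
}

// The same fields after alphabetical sorting.
#[repr(C)]
#[derive(Default)]
struct Alphabetized {
    another_int: i32,
    just_an_int: i8,
    some_int: i16,
}

/// Byte offset of a field, computed from the struct's and the field's addresses.
fn offset_of_field<S, F>(base: &S, field: &F) -> usize {
    field as *const F as usize - base as *const S as usize
}

/// Offsets of (some_int, just_an_int, another_int) in the declared order.
fn declared_offsets() -> [usize; 3] {
    let v = Declared::default();
    [
        offset_of_field(&v, &v.some_int),
        offset_of_field(&v, &v.just_an_int),
        offset_of_field(&v, &v.another_int),
    ]
}

/// Offsets of the same three fields after sorting.
fn alphabetized_offsets() -> [usize; 3] {
    let v = Alphabetized::default();
    [
        offset_of_field(&v, &v.some_int),
        offset_of_field(&v, &v.just_an_int),
        offset_of_field(&v, &v.another_int),
    ]
}

fn main() {
    // Assuming 4-byte alignment for i32: [0, 2, 4]
    println!("declared:     {:?}", declared_offsets());
    // Assuming 4-byte alignment for i32: [6, 4, 0]
    println!("alphabetized: {:?}", alphabetized_offsets());
}
```

Both structs occupy the same number of bytes under that assumption, but every field sits at a different offset, so an external program reading some_int at offset 0 would now be reading the first half of another_int.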

Conclusion: Don't alphabetize fields in objects.

Import statements

Let's take a look at another scenario: import statements.

In most modern tooling, import statements are, by and large, written by your editor. You start typing the name of a function you want to use, and your editor will helpfully suggest an import for you and insert it into the import list.

Because the computer is already the primary author of the import statements in the first place, it makes sense for it to also be in charge of keeping them tidy.

It's worth calling out that the ordering of import statements is not always strictly alphabetical.

In JavaScript and TypeScript, for example, it's common to sort relative imports after absolute imports:

import { fromAnotherExternalPackage } from 'another-external-package';
import { fromExternalPackage } from 'external-package';
import fs from 'node:fs';
import { fromSuperParent } from '../../super-parent';
import { fromParent } from '../parent';
import { fromSibling } from './sibling';

Similarly, a common idiom in Rust is to create three separate groups of imports:

std, core, and alloc
external crates
self, super, and crate imports

Each group is delineated by a blank line, and imports are then sorted alphabetically within each group:

use alloc::alloc::Layout;
use core::f32;
use std::sync::Arc;

use broker::database::PooledConnection;
use chrono::Utc;
use juniper::{FieldError, FieldResult};
use uuid::Uuid;

use super::schema::{Context, Payload};
use super::update::convert_publish_payload;
use crate::models::Event;

These rules are automatable with rustfmt, meaning the author doesn't need to spend any time thinking about how to order the imports.

This is a good example of how code can be alphabetized while still retaining meaning. The categorization of imports into these separated groups improves the readability of the code by making them easily distinguishable at a glance.
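
The "group first, then alphabetize" rule can be sketched in a few lines. This is only an illustration of the idea and not how rustfmt is implemented: ImportGroup, group_of, and sorted_imports are hypothetical names, and the sketch uses a plain lexicographic comparison within each group, so it would place crate before super, unlike the example above.

```rust
/// The three import groups from the idiom above, in the order they appear.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum ImportGroup {
    Std,      // std, core, and alloc
    External, // external crates
    Local,    // self, super, and crate
}

/// Classify an import path by its first segment.
fn group_of(path: &str) -> ImportGroup {
    match path.split("::").next().unwrap_or(path) {
        "std" | "core" | "alloc" => ImportGroup::Std,
        "self" | "super" | "crate" => ImportGroup::Local,
        _ => ImportGroup::External,
    }
}

/// Return the paths ordered by group, then lexicographically within a group.
fn sorted_imports<'a>(paths: &[&'a str]) -> Vec<&'a str> {
    let mut sorted = paths.to_vec();
    sorted.sort_by(|a, b| group_of(a).cmp(&group_of(b)).then_with(|| a.cmp(b)));
    sorted
}

fn main() {
    let paths = [
        "uuid::Uuid",
        "crate::models::Event",
        "std::sync::Arc",
        "chrono::Utc",
        "core::f32",
        "alloc::alloc::Layout",
    ];

    // Print the sorted imports with a blank line between groups.
    let mut previous: Option<ImportGroup> = None;
    for path in sorted_imports(&paths) {
        let group = group_of(path);
        if previous.is_some() && previous != Some(group) {
            println!();
        }
        println!("use {};", path);
        previous = Some(group);
    }
}
```

The group acts as the primary sort key and the path as the tiebreaker, which is why the result stays deterministic while still keeping related imports together.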

Conclusion: Alphabetize your imports!

Wrapping up

This post is not intended to be an exhaustive look at all of the different scenarios and whether alphabetization does or doesn't make sense for each one.

Rather, I wanted to provide some examples of how to weigh the pros and cons of alphabetization in different contexts that can be extended to different situations.

Ultimately, my position is that it's useful to be able to imbue code with additional meaning through the intentional grouping and ordering of constructs, and that this takes precedence over consistency for consistency's sake or reducing some already minimal cognitive overhead.

There are some cases, like imports, where alphabetization is a no-brainer, but I think these cases are more the exception than the rule.

#programming #code-style

Marshall Bowers

Conjurer of code. Devourer of art. Pursuer of æsthetics.