Language Games: Wittgenstein and DDD

2023/06/05

Categories: programming Tags: language

Overview

Imagine standing in a hallway in a music conservatory. It’s lined with small practice rooms, each one containing its own piano. From a room to your left, you hear Brahms Op 118 No. 5, Romance in F Major, a gentle and perhaps melancholic piece, and from a room to your right you hear Ligeti’s piano etude No. 4, Fanfares, a frantic excursion into madness. Sometimes gathering requirements feels like being stuck in this hallway, hearing two very different things at once and trying to make sense of them and unify them.

As programmers we’re trained to find commonalities and create abstractions that unify related things. However, this can backfire: sometimes the inherent messiness of natural language misleads us into thinking there’s a sound overarching abstraction when there’s not. Attempting to unify disconnected or only-somewhat-overlapping uses of a word leads to brittle abstractions that make a codebase hard to maintain. Two stakeholders may use the same words – “order”, “customer”, “product” – but they may mean such different things by them that they’d be better off as different words.

In this post, I explore the relationship between Wittgenstein’s philosophy of language and Domain Driven Design (DDD). I argue that requirements gathering and domain modeling in the DDD style are enhanced by Wittgenstein’s concept of Language Games, which are a person-focused complement to Bounded Contexts. Before we begin writing software, we must play Language Games.

Intuition: An Alien in a Restaurant

You’re an alien who just recently arrived on Earth and you’ve been hired to write a comprehensive software solution for a restaurant. It needs to track who’s working what hours and when, whether they’re running low on raw ingredients, whether any pots/pans/stoves/etc. need to be replaced, finances, customers' orders at different tables…

You start by talking to the manager. She uses the word “order” a lot – “we just placed an order for 1,000 bananas from our produce supplier” and “we have a recurring order for steak from our meat supplier”.

You’re figuring it out: an order is associated with a supplier, and can be recurring or one-time. It has a date it was placed, and a date it was completed.

Then you talk to a waiter. He informs you that every order is associated with a table. You find this interesting. You go back to the manager and ask her what table the order for 1,000 bananas was for. She laughs but you don’t understand what’s so funny about trying to write an accurate type definition. You walk back to the waiter and ask if orders are ever recurring.

“Do you mean, like, there’s a regular customer who orders the same thing every time they come in?”

“No,” you say, “more like, does table 4 ever get a chicken pot pie every Tuesday morning at 4am?”

The waiter laughs and he then goes on to talk about this physics book he’s reading and how there’s a grand cosmic order underpinning everything and something about four fundamental forces of nature but you interrupt and ask if “grand cosmic order” is shorthand for really big orders of bananas or something.

The manager hears you talking and orders you both to get back to work and order by date the receipts for the banana orders because the orderliness of the grand cosmic order depends on a customer’s ability to order a meal.

And it hits you: you simply can’t unify the multiple meanings of “order”.
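
To make the alien's predicament concrete, here is a minimal sketch of the single type it is groping toward. The class and field names (`Order`, `supplier`, `table_number`, and so on) are my own assumptions for illustration; they only capture the manager's and the waiter's meanings, and the type is already awkward.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Order:
    """Hypothetical 'one Order for everyone' type; all field names are assumptions."""

    # The manager's meaning: a purchase from a supplier.
    supplier: Optional[str] = None
    recurring: Optional[bool] = None
    placed_on: Optional[date] = None
    completed_on: Optional[date] = None
    # The waiter's meaning: a request from a table.
    table_number: Optional[int] = None


def table_for(order: Order) -> int:
    """Every caller has to guess which meaning of 'order' it was handed."""
    if order.table_number is None:
        raise ValueError("this order has no table; it is probably a supplier order")
    return order.table_number


# The date is arbitrary example data.
banana_order = Order(supplier="produce supplier", recurring=False, placed_on=date(2023, 1, 2))
table_4_order = Order(table_number=4)
```

Asking `table_for(banana_order)` raises a `ValueError`, which is the type system's version of the manager laughing: every field is optional because no field belongs to every meaning.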

Bounded Contexts

Every piece of software lives in a domain and solves a problem in that domain. Programmers always start off as aliens in that domain, and before they can write useful code, they need to understand the domain. Arguably it’s more difficult for a programmer than an actual alien because the words being used in the domain often have commonplace everyday meanings that only somewhat accurately match the meaning in the domain.

Domain modeling is the act of taking messy natural language and formalizing it by encoding it in the restrictive grammar given to you by your programming language. Because natural language is so permissive and liberal, encoding it in a programming language is not an act of mere translation, but a strenuous and creative act that entails discovering abstractions and crystallizing ideas.

The core principle of Domain Driven Design is constructing a Ubiquitous Language (UL), which is a shared language between stakeholders and developers. A good UL occupies a Goldilocks Zone of being precise enough that it can be written directly as code, but abstract enough that it doesn’t contain any implementation details related to infrastructure considerations like database platforms.
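
As a rough illustration of that Goldilocks Zone, here is a sketch in which the manager's sentence ("we just placed an order for 1,000 bananas from our produce supplier") maps almost word for word onto code, while storage stays behind an interface. All names here (`SupplierOrder`, `SupplierOrderBook`, `place_order`, `InMemoryOrderBook`) are hypothetical and chosen for this example only.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class SupplierOrder:
    supplier: str
    item: str
    quantity: int


class SupplierOrderBook(Protocol):
    """Where orders get recorded. The domain code does not know or care how."""

    def record(self, order: SupplierOrder) -> None: ...


def place_order(book: SupplierOrderBook, supplier: str, item: str, quantity: int) -> SupplierOrder:
    """Reads like the stakeholder's sentence; mentions no database or platform."""
    if quantity <= 0:
        raise ValueError("an order must ask for at least one item")
    order = SupplierOrder(supplier, item, quantity)
    book.record(order)
    return order


class InMemoryOrderBook:
    """Stand-in for real infrastructure, which would live outside the domain model."""

    def __init__(self) -> None:
        self.orders: list[SupplierOrder] = []

    def record(self, order: SupplierOrder) -> None:
        self.orders.append(order)


book = InMemoryOrderBook()
banana_order = place_order(book, supplier="produce supplier", item="bananas", quantity=1000)
```

Swapping `InMemoryOrderBook` for something backed by a real database would not change `place_order` or the vocabulary the manager sees.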

Constructing a UL is especially difficult for large domains since they are often composed of multiple sub-domains, each with its own dialect. In the restaurant domain above, the manager and waiter clearly mean different things by “order”. These “regional dialects” in the domain are called “Bounded Contexts”, which are internally consistent subsets of the Ubiquitous Language. Quoting Martin Fowler:

As you try to model a larger domain, it gets progressively harder to build a single unified model. Different groups of people will use subtly different vocabularies in different parts of a large organization. The precision of modeling rapidly runs into this, often leading to a lot of confusion. Typically this confusion focuses on the central concepts of the domain… In those younger days we were advised to build a unified model of the entire business, but DDD recognizes that we’ve learned that “total unification of the domain model for a large system will not be feasible or cost-effective” [1]. So instead DDD divides up a large system into Bounded Contexts, each of which can have a unified model - essentially a way of structuring MultipleCanonicalModels.

Now that we’re armed with the concept of Bounded Contexts, let’s revisit the restaurant domain. What is an order? It can be at least one of these four things:

  1. The request for food a customer places with a waiter.
  2. The request for a recurring shipment of ingredients that the restaurant places with a supplier.
  3. A command issued by a boss to an employee.
  4. The sequence with which ingredients must be combined to make a dish (like order of operations).

Imagine striving for “total unification of the domain model” and creating an abstraction that encompassed all of these meanings. It would be exquisitely painful to work with. We’re better off breaking these into separate Bounded Contexts:

  1. Meal ordering context
  2. Food procuring context
  3. Employee management context
  4. Cooking context

Inside these Bounded Contexts, the word “order” has precisely one meaning, and a very specific one.
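
One way this could look in code is sketched below. In a real codebase each context would probably be its own module or package; here plain classes stand in as namespaces so the example fits in one file. The context names follow the list above, but every field name is an assumption made for illustration.

```python
from dataclasses import dataclass


class MealOrdering:
    @dataclass(frozen=True)
    class Order:
        """A customer's request for food, taken by a waiter."""
        table_number: int
        dishes: tuple[str, ...]


class FoodProcuring:
    @dataclass(frozen=True)
    class Order:
        """A request for ingredients, placed with a supplier."""
        supplier: str
        item: str
        quantity: int
        recurring: bool


class EmployeeManagement:
    @dataclass(frozen=True)
    class Order:
        """A command issued by a boss to an employee."""
        issued_by: str
        issued_to: str
        task: str


class Cooking:
    @dataclass(frozen=True)
    class Order:
        """The sequence in which steps must happen to make a dish."""
        dish: str
        steps: tuple[str, ...]


def table_for(order: MealOrdering.Order) -> int:
    # No None checks: in this context every order has a table.
    return order.table_number


dinner = MealOrdering.Order(table_number=4, dishes=("chicken pot pie",))
steak = FoodProcuring.Order(supplier="meat supplier", item="steak", quantity=1, recurring=True)
```

Four classes share the name `Order`, yet none has an optional field, and a type checker can flag `table_for(steak)` before the alien ever has to ask the manager which table the steak was for.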

Language Games

In 1933, three years before Alan Turing invented the Turing Machine and long before DDD existed and programmers were talking about ULs, Ludwig Wittgenstein critically examined the relationship between philosophy and natural language. Quoting a summary of his work:

[To Wittgenstein] philosophy serves, first, as critique of language. It is through analyzing language’s illusive power that the philosopher can expose the traps of meaningless philosophical formulations. This means that what was formerly thought of as a philosophical problem may now dissolve “and this simply means that the philosophical problems should completely disappear” (PI 133).

Replace “philosophy” with “domain modeling” and it sounds like Wittgenstein is making the same argument against “total unification of the domain model” that Fowler is making. Trying to unify all uses of “order” and find the Platonically ideal abstraction that underpins all of them is a fruitless and hopeless philosophical exercise. We can dissolve it with Bounded Contexts.

A good philosopher takes a question that’s plagued intellectuals since ancient times (like “what is the nature of good”) and dissolves it by disentangling the messiness of natural language. A good programmer takes a question that’s plagued software architects (like “what is the abstract class for ‘order’”) and dissolves it by disentangling the messiness of natural language.

One of Wittgenstein’s key insights was that a word is given meaning by its use, not by what it might reference. A common viewpoint is that a word is given meaning by referencing some external object. So for Plato, “chair” has meaning because there is a Platonic Ideal of a chair, which is kinda like an abstract class from which all specific chairs (office chairs, bar stools, armchairs, chaises) inherit. As programmers, we’re trained to think Platonically – what is the underlying abstraction or abstract class that connects all the uses of the words together in some grand cosmic order of meaning?

But for Wittgenstein, there is no single underlying meaning for each word. Instead, the meaning of a word is its use, and because a word can be used in many ways, it has many meanings. There is not one overarching meaning, but instead the meanings resemble each other like family members. You’re 50% of each parent and 25% of each grandparent and so on, and your sibling is 50% of you on average, and your cousin is 12.5% of you on average. But there is no essence of your family lineage. There’s you, your siblings, parents, grandparents, cousins, etc., but there’s no pure sequence of DNA that defines what it means to be a member of your family. Quoting a summary of Wittgenstein:

There is no reason to look, as we have done traditionally—and dogmatically—for one, essential core in which the meaning of a word is located and which is, therefore, common to all uses of that word. We should, instead, travel with the word’s uses through “a complicated network of similarities overlapping and criss-crossing” (PI 66).
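
Family resemblance can be loosely modeled with sets. In the sketch below, each use of “order” gets a handful of features; the features are invented by me purely to illustrate the shape of the idea, and are not a serious analysis of the word. Some pairs of uses overlap, but nothing is common to all of them.

```python
from itertools import combinations

# Hypothetical feature sets for four uses of "order" (illustrative only).
uses = {
    "meal order": {"asks for something", "lists items", "is paid for"},
    "supplier order": {"asks for something", "lists items", "is paid for", "can recur"},
    "boss's order": {"asks for something", "expects obedience"},
    "order of steps": {"lists items", "fixes a sequence"},
}

# The "essential core" would be whatever every use has in common.
shared_by_all = set.intersection(*uses.values())

# The "criss-crossing" similarities are the pairwise overlaps.
overlaps = {(a, b): uses[a] & uses[b] for a, b in combinations(uses, 2)}

print(shared_by_all)  # prints set()
for pair, common in overlaps.items():
    print(pair, sorted(common))
```

With these made-up features, the meal order and the supplier order share three traits, the boss's order shares one with each of them, and the boss's order shares nothing with the order of steps. An abstract base class would have to be built from `shared_by_all`, which is empty.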

The meaning of a word is its use in a context, and Wittgenstein formalizes the notion of context with his idea of Language Games. I understand “Language Game” as something like a behavioral context where language is used, and where the behaviors attached to the language give it meaning. Every time we communicate, we play Language Games. Flirting and business meetings are different Language Games, so we’d expect “shut up” to have different meanings in them. While flirting, “shut up” is said coyly and playfully; it means “I like what you said, tell me more”. In a business meeting, it’s said inappropriately and sternly; it means “stop talking.”

Ordering a meal at a restaurant and managing food inventory are two different Language Games, so we’d expect “order” to have different meanings in them. And while these meanings resemble each other, there is no essential core meaning that unifies the different usages. Brian Greene says “grand cosmic order”; the restaurant manager says “an order for 500lbs of ground beef”; the waiter says “order at table 4”; the DBA says “this ‘order by’ is taking too long”; the drill sergeant says “obey my orders”; and the programmer says “ah I see even though these words have family resemblances there is no way to unify their meanings and I will not fall into the trap of trying to write an abstract class that underpins all of them.”

Our job as programmers is to identify the Language Games people are playing and make the different types of games explicit. When we do this, we construct a Bounded Context.

Let’s revisit the four meanings of “order”, but this time focusing on the Language Games:

  1. Ordering food as a customer at a restaurant / taking a customer’s order
  2. Placing recurring orders for food from a supplier
  3. Managing your employees / talking to your boss
  4. Distributed cooking in a kitchen with multiple chefs

These are similar to the Bounded Contexts above, but for me, thinking about them as Language Games emphasizes real people and real world interactions while Bounded Contexts emphasize entities and commands.

The Value of Language Games

Given that we already have Bounded Contexts as an idea, why even bother thinking about Language Games?

A Bounded Context is a subset of a domain model; it’s static, type-safe, and precise. A Language Game is grounded in a relationship between people; it’s fluid, ambiguous, and organic.

If we’re writing software to automate complex processes between teams, we need to first understand the people we’re writing software for and the way they communicate with others. Bounded Contexts help us partition a domain model into consistent and locally coherent smaller models, but they don’t help us understand how people communicate. They are the result of understanding how people communicate and rigorously formalizing it into a type system. So before we can encode a Bounded Context in the type system, we need to first become fluent in the Language Games people are playing.

Software design often falls flat because teams over-prioritize technical matters and under-prioritize social and interpersonal matters. One of the great value-adds of DDD is the importance it places on a shared understanding of the domain with stakeholders: software design starts with clear communication and iterative pointed questions. But DDD, like any methodology, is incomplete by itself, and adding Language Games and Wittgenstein’s philosophy of language augments our understanding of requirements gathering. Bounded Contexts, as a domain modeling tool, are technically focused; Language Games, as a concept of social interaction, are interpersonally focused.

In Closing: Dialect Continuums and Dialect Leveling

Before Portuguese, Spanish, French, and Italian had formal language standards, there was a Dialect Continuum across Western Europe. Any two nearby towns had mutually intelligible dialects with slight differences. But over greater geographical distances, these differences compounded until the languages were no longer mutually intelligible. There was no essential/real/standard/legitimate dialect that you could point to as “Spanish” or “Italian”; instead, there was a fluid continuum of dialects from one country to another.

Today we take for granted that languages have standards and that minus accents or a handful of regional expressions/idioms we can understand other speakers of our language across great distances. People in Spain can talk to people in Mexico; someone from Boston can talk to someone from Texas. But without powerful nations forcefully imposing standards, and technology enabling the spread and enforcement of these standards, we’d have no reason to expect people who live great distances from each other to be able to communicate. In fact, we’d expect the opposite: languages naturally drift and dialects transform into distinct and unrecognizable languages. For example, Spanish, French, English, Irish, German, Russian, Hindi, Sanskrit, Bengali, and Greek all descend from one language: Proto-Indo-European. This blows my mind every time I think about it.

Through a process called Dialect Leveling, different dialects are combined into a single standardized language. Often, this process is intentional, laborious, forceful, and political – very political. It took hundreds of years to standardize German.

When we start gathering requirements for a new project, we find ourselves in a situation like Western Europe before modern Romance languages existed. There’s a dialect continuum across the company; everyone uses words a little differently. People on the same team can understand each other. People on related teams, like networking and software development, can understand each other with some effort. But people on unrelated teams – consider a heads-down programmer and a rock star salesperson – are likely to have such different backgrounds and mindsets that they misunderstand each other quite a bit.

Our job during domain modeling is to perform something like dialect leveling. Just like dialect leveling across Europe yielded Spanish, Italian, French, etc., our dialect leveling yields Bounded Contexts. But the first step is to immerse ourselves in each dialect and learn the Language Games people are playing.