The Behavior-First Mindset

Part 6 of 9 in the series: Unit Testing — A Behavior-First Approach

This post explores the mindset shift from code-first to behavior-first testing. While it's part of a weekly series, you can read it independently if you're familiar with basic test patterns.

Reading Time: ~9 minutes

Context for New Readers

The core idea is simple: the way you think about testing affects the code you write. Most developers start by thinking about code — the classes, methods, and implementation details. But that leads to brittle tests. The behavior-first mindset flips this around: you start with what the system should do, not how it does it.

The Mindset Shift: From Code to Behavior

When I first started writing tests, I thought about testing in terms of code coverage. I'd look at a class and think: "What methods does this class have? I need to test each one." This is code-first thinking — starting with the implementation and working backwards.

This approach has a fundamental flaw: it couples your tests to your implementation. When you refactor code, your tests break — not because the behavior changed, but because the implementation changed.

Behavior-first thinking flips this around. Instead of starting with the code, you start with the behavior: "What should this system do?" Then you write tests that verify those behaviors, regardless of how they're implemented.

Code-First vs. Behavior-First: An Example

Code-First Approach — tests method calls, internal properties, helper methods:

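import { describe, it, expect, vi } from 'vitest';

describe('Task', () => {
  it('should call calculatePriorityBetween with correct parameters', () => {
    // task, belowTask, and aboveTask are assumed to be created in setup not shown here.
    // The spy ties this test to an internal helper rather than to the resulting order.
    const spy = vi.spyOn(task, 'calculatePriorityBetween');
    task.prioritize({ below: belowTask, above: aboveTask });
    expect(spy).toHaveBeenCalledWith(belowTask.priorityIndex, aboveTask.priorityIndex);
  });
});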

Behavior-First Approach — tests outcomes and value:

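import { describe, test } from 'vitest';
// Task and TaskTestDsl come from the project under test; their imports are omitted here.

describe('Feature: Prioritize Task', () => {
  test('Scenario: Task is prioritized after another task', () => {
    const testDsl = TaskTestDsl();

    // Setup: the DSL builds three tasks in the order A, B, C.
    const taskA = testDsl.generate.task().withId('task-a').prioritizedToTopOfList().build();
    const taskB = testDsl.generate.task().withId('task-b').prioritizedAfter(taskA).build();
    const taskC = testDsl.generate.task().withId('task-c').prioritizedAfter(taskB).build();

    // Execution: the API under test is called directly, not through the DSL.
    taskC.prioritize({ below: taskA, above: 'END_OF_LIST' });

    // Assertion: only the resulting order is checked.
    testDsl.assert
      .tasks([taskA, taskB, taskC])
      .sortWith(Task.PRIORITY_COMPARATOR, (c) =>
        c.toExpectedOrder([taskA, taskC, taskB])
      );
  });
});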

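When we refactor (extract utilities, change algorithms, reorganize code), the behavior-first test keeps passing as long as the resulting order is unchanged. The code-first test fails as soon as calculatePriorityBetween is renamed, inlined, or called with different arguments, even though the behavior is identical.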
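To make that claim concrete, here is a minimal sketch in plain TypeScript, with no test runner. The Task shape, the Prioritize signature, and both implementations are hypothetical stand-ins, not the series' actual code. The sketch moves a task with two different algorithms, then looks at both the observable order and one internal detail.

```typescript
// Hypothetical, simplified task model (an assumption for this sketch).
type Task = { id: string; priorityIndex: number };
type Prioritize = (tasks: Task[], moving: Task, after: Task) => void;

const byPriority = (x: Task, y: Task): number => x.priorityIndex - y.priorityIndex;

// Implementation A: give the moved task an index between its new neighbors.
const prioritizeByMidpoint: Prioritize = (tasks, moving, after) => {
  const others = tasks.filter((t) => t !== moving).sort(byPriority);
  const next: Task | undefined = others[others.indexOf(after) + 1];
  moving.priorityIndex = next
    ? (after.priorityIndex + next.priorityIndex) / 2
    : after.priorityIndex + 1;
};

// Implementation B (the "refactor"): rebuild the order and renumber every task.
const prioritizeByRenumbering: Prioritize = (tasks, moving, after) => {
  const others = tasks.filter((t) => t !== moving).sort(byPriority);
  others.splice(others.indexOf(after) + 1, 0, moving);
  others.forEach((t, i) => {
    t.priorityIndex = i;
  });
};

// Builds tasks A, B, C in that order and moves C directly after A.
function runScenario(prioritize: Prioritize): { order: string[]; movedIndex: number } {
  const a: Task = { id: 'task-a', priorityIndex: 1 };
  const b: Task = { id: 'task-b', priorityIndex: 2 };
  const c: Task = { id: 'task-c', priorityIndex: 3 };
  const tasks = [a, b, c];
  prioritize(tasks, c, a);
  return {
    order: [...tasks].sort(byPriority).map((t) => t.id), // observable behavior
    movedIndex: c.priorityIndex, // implementation detail
  };
}

const midpoint = runScenario(prioritizeByMidpoint);
const renumbered = runScenario(prioritizeByRenumbering);

console.log(midpoint.order, renumbered.order); // same order from both implementations
console.log(midpoint.movedIndex, renumbered.movedIndex); // the stored index differs
```

An assertion on order passes for both implementations. An assertion on movedIndex, or a spy on whichever helper computes it, passes for only one of them.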

Why TDD is Powerful: Design Pressure on the API

The most powerful aspect of TDD isn't the tests themselves — it's the design pressure that writing tests first creates. When you write a test before implementing code, you're forced to design the API from the user's perspective.

Notice something important: we use the Test DSL for setup and assertions, but we don't abstract the execution itself. The line taskC.prioritize({ below: taskA, above: 'END_OF_LIST' }) is written directly in the test. By keeping the API call visible, the API under test becomes the center of attention. If it's awkward, it stands out immediately.
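As a rough illustration of that split, the sketch below uses hypothetical stand-ins: a tiny Task class, a TaskBuilder, and an expectOrder helper. None of these are the series' real TaskTestDsl, and the meaning given to below and above here is an assumption made only for this example. Setup and assertion sit behind helpers, while the call being designed stays in plain sight.

```typescript
// Hypothetical stand-in for the class under test.
class Task {
  static readonly PRIORITY_COMPARATOR = (x: Task, y: Task): number =>
    x.priorityIndex - y.priorityIndex;

  constructor(readonly id: string, public priorityIndex: number) {}

  // Assumed semantics for this sketch: land below `below` and above `above`.
  prioritize(position: { below: Task; above: Task }): void {
    this.priorityIndex =
      (position.below.priorityIndex + position.above.priorityIndex) / 2;
  }
}

// Setup helper: hides how priority indexes are chosen.
class TaskBuilder {
  private id = 'task';
  private priorityIndex = 0;

  withId(id: string): this {
    this.id = id;
    return this;
  }
  prioritizedToTopOfList(): this {
    this.priorityIndex = 0;
    return this;
  }
  prioritizedAfter(other: Task): this {
    this.priorityIndex = other.priorityIndex + 1;
    return this;
  }
  build(): Task {
    return new Task(this.id, this.priorityIndex);
  }
}

// Assertion helper: compares the sorted order, never the raw indexes.
function expectOrder(tasks: Task[], expected: Task[]): void {
  const actual = [...tasks].sort(Task.PRIORITY_COMPARATOR).map((t) => t.id).join(', ');
  const wanted = expected.map((t) => t.id).join(', ');
  if (actual !== wanted) {
    throw new Error(`expected order [${wanted}] but got [${actual}]`);
  }
}

// Setup goes through the builder.
const taskA = new TaskBuilder().withId('task-a').prioritizedToTopOfList().build();
const taskB = new TaskBuilder().withId('task-b').prioritizedAfter(taskA).build();
const taskC = new TaskBuilder().withId('task-c').prioritizedAfter(taskB).build();

// Execution is a direct call, so an awkward signature is visible right here.
taskC.prioritize({ below: taskA, above: taskB });

// Assertion goes through the helper.
expectOrder([taskA, taskB, taskC], [taskA, taskC, taskB]);
```

If the prioritize call needed five arguments or a confusing options object, it would show up on that one unabstracted line, which is the design pressure described above.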

TDD and Code Coverage: A Side Effect, Not a Goal

When you practice TDD properly, you naturally get close to 100% code coverage, because each piece of production code is written only to make a failing test pass. But coverage isn't the goal; it's a side effect of organized, methodical testing.

The Future of Engineering: Agents and Leads

I believe the future of software engineering looks like this: AI agents write most of the code, and human developers act as "Leads" who supervise, review, and guide.

I've found a huge amount of overlap between leading small teams of developers and managing AI agents. Everything in this blog series first helped me, as an architect and lead developer, to review, manage, and work alongside humans. These patterns worked so well with human teams because I never had to open a test file and figure out how a particular team member likes to mock or spy on imported modules. If I see a commit tagged with refactor:, I don't expect to see updated tests. If I see an updated test, I ask: what requirement changed? This made collaboration and review dramatically easier.
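The refactor: rule of thumb above is simple enough to sketch as a check. The Commit shape, the file-naming pattern, and the function names below are assumptions for illustration; real commit data would have to come from git or a code host, which this sketch does not cover.

```typescript
// Hypothetical commit shape; in practice this data would come from git or a code host.
type Commit = { message: string; changedFiles: string[] };

// Assumption: test files follow the common *.test.* / *.spec.* naming convention.
const isTestFile = (path: string): boolean => /\.(test|spec)\.[cm]?[jt]sx?$/.test(path);

// Assumption: refactors are tagged "refactor:" or "refactor(scope):".
const isRefactor = (message: string): boolean => /^refactor(\([^)]*\))?:/.test(message);

// Returns refactor commits that also changed tests, which deserve a closer look.
function refactorsTouchingTests(commits: Commit[]): Commit[] {
  return commits.filter((c) => isRefactor(c.message) && c.changedFiles.some(isTestFile));
}

const flagged = refactorsTouchingTests([
  { message: 'refactor: extract priority utilities', changedFiles: ['src/task.ts', 'src/priority.ts'] },
  { message: 'refactor: inline helper', changedFiles: ['src/task.ts', 'src/task.test.ts'] },
  { message: 'feat: prioritize to end of list', changedFiles: ['src/task.ts', 'src/task.test.ts'] },
]);

console.log(flagged.map((c) => c.message)); // only the second commit is flagged
```

A flagged commit is not automatically wrong. It is a prompt to ask the same question as in review: what requirement changed?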

With a shared culture around these patterns, tests become communication. They've helped my team think more from the customer's perspective. And when someone doesn't know what to test, it's usually a sign that we haven't spent enough time defining the behaviors our customers need to solve their problems.

As I work more with AI, I believe that all developers will be elevated to the role of Lead, and the ability to review code contributions will become the bottleneck. My day used to look like checking in with my team, reviewing any open PRs, and then diving into my own IC work. The daily or every-other-day cadence of project check-ins has become one where an agent has something ready to review every 10 or 20 minutes. Having patterns and standards as anchors has helped immensely. Without them, review at that pace would be impossible.

This is why behavior-first testing and TDD are more important than ever. When you're reviewing a PR from an AI agent, the tests tell you what behavior was intended, how the API was designed, and whether the implementation is correct. Without good tests, reviewing AI-generated code is like reviewing code written by a developer you've never met. But with behavior-first tests, you can quickly understand what the code does, why it exists, and whether it's correct.

What's Next

In the next post, we'll explore how tests serve as PR documentation — how reading tests first in a pull request helps reviewers understand intent, see what changed, and identify what didn't.