Buckaroo: Conflict-Driven Dependency Resolution for C++

How Buckaroo's dependency resolver borrows ideas from SAT solvers — using conflict-driven clause learning to prune the search space and resolve C++ packages blazingly fast.

March 1, 2019·6 min read

Dependency resolution is one of those problems that looks simple until it isn't. Pick versions. Satisfy constraints. Done — right?

Not quite. For languages like C++, where the ecosystem is vast, builds are slow, and transitive dependency graphs can run deep, a naïve resolver quickly becomes unusable. Buckaroo — a decentralised package manager for C++ — tackles this with a resolver that borrows a key idea from the world of SAT solvers: Conflict-Driven Clause Learning (CDCL).

What is Buckaroo?

Buckaroo is a decentralised package manager for C++ (and friends). Instead of relying on a central registry, packages live directly in source control — GitHub, BitBucket, GitLab, or any hosted Git/HTTP endpoint. There is no publish step. A manifest file describes dependencies, which point to further manifests, forming a dependency graph that Buckaroo traverses directly over Git and HTTP.

A typical workflow looks like:

$ buckaroo init
$ buckaroo add github.com/buckaroo-pm/boost-thread@branch=master
$ buck run :my-app

Key features include fully reproducible builds, private and public dependencies to avoid diamond-dependency hell, semantic versioning support, and the ability to depend directly on Git branches — what the project calls "live at head".

The Resolution Problem

At its core, dependency resolution is a constraint-satisfaction problem:

Each package identifier is a variable
Each version is a possible value for that variable
Version constraints form an equation that must be satisfied

This is equivalent to a boolean satisfiability (SAT) problem, which is NP-complete in the general case. But Buckaroo faces an even harder variant: it must resolve with partial information.

Because package manifests live inside source control, the resolver has to fetch them on demand. To reveal the entire search space upfront would mean downloading every version of every package in the transitive closure of dependencies. For real-world projects, that could take days. So the resolver must make decisions without knowing all the constraints — and sometimes it will guess wrong.

How the Resolver Works

The resolver delegates all fetching to a component called the Source Explorer, which exposes two operations:

Fetch the list of available versions for a package
Fetch the manifest for a specific package version

Given these, the resolver proceeds as a backtracking search.

The Happy Path

Suppose a project depends on A@1.x and B@2.x. The resolver picks a constraint to explore — say A@1.x — and selects the latest candidate: A@1.0.

A@1.0's manifest declares a dependency on B@2.0. The resolver intersects the constraints:

B@2.x ∩ B@2.0  ⇔  B@2.0

One candidate remains. B@2.0 is fetched, has no further dependencies, and the resolution is complete.

When Things Go Wrong: Backtracking

Now suppose A@1.1 — the newest version — depends on B@3.0. The constraint intersection becomes:

B@2.x ∩ B@3.0  ⇔  Ø

An empty set. There is no version of B that satisfies both constraints simultaneously. The resolver backtracks: A@1.1 is deselected, and A@1.0 is tried instead. A@1.0 depends on B@2.0, which satisfies B@2.x, and resolution succeeds.

Simple enough. But in a large graph, naïve backtracking can revisit the same dead ends repeatedly. This is where CDCL comes in.

Conflict-Driven Clause Learning

CDCL is a technique from the SAT-solving world. When the solver reaches a contradiction, instead of simply backtracking one step, it analyses the conflict to derive a new constraint (a "learned clause") that rules out entire families of bad assignments in one go.

Buckaroo applies the same principle to package resolution.

Unsatisfiable Cores

When the resolver discovers that B@2.x ∩ B@3.0 = Ø, it doesn't just forget this fact after backtracking. It records the unsatisfiable core — the minimal set of clauses that together produce the contradiction:

B@3.0 was introduced because A@1.1 was selected
B@2.x was a top-level constraint
Therefore: the combination {B@2.x, A@1.1} is unsatisfiable

Any future resolution path that would re-introduce this combination can be skipped immediately, without re-fetching any manifests or exploring any sub-trees. The resolver learns from its mistakes.

This is precisely the mechanism that makes Buckaroo fast on complex graphs: conflicts are not just obstacles to route around, they are information that prunes future work.

Breadth-First vs Depth-First — Neither

A natural follow-up question: should the resolver explore breadth-first or depth-first?

Buckaroo benchmarked both strategies and found no clear winner — in pathological cases, the wrong choice can turn a seconds-long resolution into one that takes hours.

So Buckaroo takes a different approach: it prioritises constraints most likely to fail. A version constraint like @1.2.3 is typically more restrictive than a branch constraint like @branch=main, so it is explored earlier. The key insight is that checking whether a set of already-known clauses forms an unsatisfiable core is a purely local, cheap operation — no network calls required. Network round-trips only happen when a new manifest needs to be fetched. So before issuing any new fetch, the resolver checks whether the current path is already doomed by a previously learned core and prunes it immediately if so. Early conflict discovery means more paths are eliminated locally, before any expensive I/O is ever triggered.

Manifest Deduplication

One further optimisation: from the resolver's perspective, two package versions that produce identical manifests are interchangeable. Buckaroo groups versions by their manifest hash to avoid redundant work. If libfoo@1.2.0 and libfoo@1.2.1 have the same manifest, the resolver only needs to explore that dependency graph once.

Why This Matters for C++

Most language package managers operate in an environment where fetching metadata is cheap — a single HTTP call to a central registry. In that world, even a slow resolver is fast enough.

C++ is different. There is no central index. Manifests are embedded in Git history. Fetching a manifest means a network round-trip to a Git host, potentially cloning partial history. In this environment, every unnecessary manifest fetch is expensive, and the ability to prune the search space via CDCL is not a micro-optimisation — it is what makes the tool usable at scale.

Summary

Concept	Role in Buckaroo
Source Explorer	Lazily fetches package versions and manifests on demand
Backtracking search	Core resolution strategy when initial guesses fail
Constraint intersection	Merges version constraints from multiple dependents
CDCL / Unsatisfiable cores	Derives learned constraints from conflicts to prune future paths
Conflict-first ordering	Prioritises likely-to-fail constraints to maximise early pruning
Manifest deduplication	Groups identical manifests to avoid redundant sub-tree exploration

Buckaroo's resolver is a reminder that dependency resolution, done seriously, is a hard combinatorial problem — and that techniques from the SAT-solving literature translate surprisingly well to the package management domain. If you work in C++ and have been burned by slow or broken dependency resolution, it is worth understanding how Buckaroo approaches the problem differently.

⊛

Jun 1, 2019·4 min read

buckaroo.pm - Find C++ packages

A companion website for the Buckaroo package manager — browse ~400 C++ packages from GitHub, explore the dependency graph, and find what you need instantly with precompiled fuzzy search.

c++package-management search

⊘

Sep 10, 2018·7 min read

How to Find a #deadbeef: Fetching Arbitrary Git Commits

Fetching a specific commit hash from a Git repository is surprisingly non-trivial. Here's why — and the progressive strategy that makes it fast.

git package-management c++

Feb 28, 2026·8 min read

Perfect Hash Functions: Collision-Free by Design

A deep dive into perfect hash functions — what they are, why they matter, how to find one efficiently with parallel search, and how they compare to JavaScript's built-in Map.

algorithms data-structures typescript