directed acyclic graph

  • directed acyclic graph
  • directed acyclic graphs
  • DAGs
  • DAG
Image of Author
June 6, 2023 (last updated August 30, 2023)

A graph is merely a collection of edges (aka, lines, arrows) and vertices (aka, dots, squares, circles). If you draw some boxes and arrows on a whiteboard, you probably just drew a graph.

A directed graph means each edge points in one direction. I.e., each edge has an orientation from one vertex to another. From this you can get the notion of a path. That is, you can start at a vertex, follow an edge to the next vertex, then another edge to another vertex, etc.

A directed acyclic graph means there are no loops. There are a few way to say this, for example, no vertex can reach itself via a (non-trivial) path.

You will often see this referred to as a DAG.

Merkle DAGs

Merkle DAGs are used in multiple protocols, including git and IPFS. In fact, IPFS has a great resource for learning about DAGs in general, and Merkle DAGs in particular: ProtoSchool: Merkle DAGs

Topological ordering

Topological ordering is the minimal property of a DAG. You can think of edges in a DAG as a 'before than' relation. You can construct a sequential ordering (aka, a list) such that the 'before than' relation is respected, i.e., for each edge, the starting vertex is prior in the list to the ending vertex. Any sequence that satisfies this is a topological ordering of the DAG (no uniqueness guarantees for arbitrary DAGs).

Why are DAGs important

In my opinion, DAGs are important because they are in a goldilocks zone of stricture. Out of the box, a DAG has only one property: it has to have at least one topological ordering. This is a relatively weak requirement. This means the initial scope of applicability is wide. I.e., a lot of things could be represented as a DAG. At the same time, this property is provably conducive to sequential ordering. I.e., there is such a thing as a "correct way to represent a DAG as a list". For computers in particular, this is a nice feature of DAGs.

Causal order

I think one of the most powerful and existential features of pure DAGs is their ability to represent an intuitive notion of causation. The 'before than' relation could be construed as 'caused by' as well.

Time is more strict than causation. Let me explain. Let's say I had different times t1, t2, and t3. I then claim that, t1 < t3, and t2 < t3. Do you have an intuition about the relationship between t1 and t2? I do! My intuition is that one is before the other. That is, either, t1 < t2 < t3, or t2 < t1 < t3. The implication is that time is a total order. Now replace t with e. That is, instead of different times, you had different events. Also, < means "causes". That is, e1 < e3 means e1 caused e3. What are your intuition about the relationship between e1 and e2, now? I have no intuitions. Neither e1 nor e2 need to have caused each other. They can both have caused e3 without any causal relationship between them. If I run into a friend on the street, the causes could be unrelated (e.g., I'm walking my dog, and they are going out to eat). I do not expect the fact that I'm walking my dog to have any affect on my friend going out to eat. The implication is that causation is a partial order.

The fact that DAGs can represent temporal relations is already a particularly useful application. To then also be able to represent the weaker ordering of causation makes it incredibly useful.

When you think about causation, and particularly causation modelling, representing as a DAG is something to keep in mind.