Martin Fowler gives a great introduction to the idea of event sourcing when he writes:
"The fundamental idea of Event Sourcing is that of ensuring every change to the state of an application is captured in an event object, and that these event objects are themselves stored in the sequence they were applied for the same lifetime as the application state itself."
You can think of it as storing events instead of storing state.
For example, say you have a user blob. You could store the state in a typical manner, in a typical database.
{
"id": 1,
"name": "Bob",
"age": "50"
}
That data had to get there somehow. It was through user (or admin) edits. Those edits can be captured as events. (The following is contrived and likely algorithmically problematic, but hopefully it quickly captures the idea.)
[
{ type: 'new', id: 1 },
{ type: 'modify', id: 1, data: { 'age': '50' } },
{ type: 'modify', id: 1, data: { 'name': 'Bob' } }
]
You can calculate the same state from both.
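To make "calculate the same state" concrete, here is a minimal sketch that folds the contrived event list above into a state object. The function names applyEvent and replay are made up for illustration, and the merge rule for 'modify' (shallow-merge data into the state) is an assumption about how those events are meant to behave.

```javascript
// Hypothetical reducer: applies one event to the current state.
function applyEvent(state, event) {
  switch (event.type) {
    case 'new':
      // Start a fresh record that only knows its id.
      return { id: event.id };
    case 'modify':
      // Assumed semantics: shallow-merge the changed fields into the state.
      return { ...state, ...event.data };
    default:
      throw new Error(`unknown event type: ${event.type}`);
  }
}

// Replaying the log in order rebuilds the state.
function replay(events, initialState = null) {
  return events.reduce(applyEvent, initialState);
}

const events = [
  { type: 'new', id: 1 },
  { type: 'modify', id: 1, data: { age: '50' } },
  { type: 'modify', id: 1, data: { name: 'Bob' } },
];

const user = replay(events);
console.log(user);
```

The resulting object has the same id, name, and age as the stored blob; only the order in which the keys were added differs.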
Databases are often referred to as "the source of truth", aka, "the source". But how "the source" is shaped is what is being manipulated here. I like thinking of it as event-sourcing versus state-sourcing (though I also find both phrases vague).
Tradeoffs
One potential upside of event sourcing is that you get an event log "for free": your event log is your data source. For example, in the event-sourced data above, you can see that age was added before name. Maybe that caused a bug in your system. You would not have found that in a state-sourcing system without something like an additional event log.
One potential downside of an event-sourced system is that state management becomes more complicated. Recomputing state from the entire log on every read becomes infeasible at scale, so systems typically maintain snapshots of state. For example, every day you can compute a snapshot and then use it as the base state for the next day.
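The snapshot idea can be sketched as follows. This is only an illustration under simplifying assumptions: a single in-memory log for one user, and a snapshot shape ({ state, version }) and function name (takeSnapshot) invented here. The version field records how many events the snapshot already covers, so the next run only replays what came after.

```javascript
// Same hypothetical reducer as in the contrived example.
function applyEvent(state, event) {
  switch (event.type) {
    case 'new':
      return { id: event.id };
    case 'modify':
      return { ...state, ...event.data };
    default:
      throw new Error(`unknown event type: ${event.type}`);
  }
}

// Hypothetical snapshot: the state plus the number of events folded into it.
function takeSnapshot(log, previous = { state: null, version: 0 }) {
  // Only replay events recorded after the previous snapshot.
  const newEvents = log.slice(previous.version);
  return {
    state: newEvents.reduce(applyEvent, previous.state),
    version: log.length,
  };
}

const log = [
  { type: 'new', id: 1 },
  { type: 'modify', id: 1, data: { age: '50' } },
];

// For example, the daily job computes a snapshot...
const snapshot = takeSnapshot(log);

// ...and the next day's state starts from it instead of from the first event.
log.push({ type: 'modify', id: 1, data: { name: 'Bob' } });
const current = takeSnapshot(log, snapshot);
```

The tradeoff is that the snapshot is now a second piece of state to store and keep consistent with the log, which is the extra complexity described above.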
The what-when tension
There is a symmetry here. On the one hand, there is a state source with a secondary event system, and on the other hand, an event source with a secondary state system. I'm currently thinking of it as a when/what distinction. An event captures 'when' trivially, and 'what' non-trivially. State captures 'what' trivially, and 'when' non-trivially.
The past-present tension
Another way of looking at it is event-sourcing gives primacy to past-truth in that the history of what happened is preserved first, and deriving the present state is the secondary concern. Meanwhile state-sourcing gives primacy to present-truth in that the current state is preserved first, and deriving the past states is the secondary concern.
The question then becomes: What do you want to derive? And what do you want to have "for free"? Do you want to have the present and derive the past? Or do you want to have the past and derive the present? I suspect different applications and user experiences will have different answers to these questions.
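As a rough sketch of "having the past and deriving the present", the contrived event list can be replayed up to any point to recover what the user looked like at that moment. The helper name stateAfter is hypothetical, and the reducer carries the same assumed merge semantics as the contrived example.

```javascript
// Same hypothetical reducer as in the contrived example.
function applyEvent(state, event) {
  switch (event.type) {
    case 'new':
      return { id: event.id };
    case 'modify':
      return { ...state, ...event.data };
    default:
      throw new Error(`unknown event type: ${event.type}`);
  }
}

const events = [
  { type: 'new', id: 1 },
  { type: 'modify', id: 1, data: { age: '50' } },
  { type: 'modify', id: 1, data: { name: 'Bob' } },
];

// Hypothetical helper: the state after the first `count` events.
function stateAfter(log, count) {
  return log.slice(0, count).reduce(applyEvent, null);
}

// Every past state, one per event, ending with the present.
const history = events.map((_, i) => stateAfter(events, i + 1));
console.log(history);
```

With only the final state blob stored, none of the intermediate entries in history could be recovered without a separate log.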
The space-time tension?
This one gets a little loose, but events are temporal and data objects are spatial, so perhaps there is a space-time tension here as well. Objects are things in space. Events are moments in time. This is not space in the computer storage sense, but in a physical analogy sense. You must store both the state and the events as objects in computer storage. But philosophically it feels like what is being stored is the thing versus the time-slice. I can't yet tell how useful this analogy is.