Online Software for an Interstellar Civilization

July 23, 2025 (last updated July 28, 2025)

In this article I'm going to begin by arguing that we have reached an end of technology: first, we can already communicate at light speed, and second, faster-than-light communication is not possible. I then treat this observation as a boundary condition on all future software. I will do this via a thought experiment concerning two astronomers in an interstellar civilization. I will, somewhat sheepishly, treat the hard problems of hardware as science fiction in order to consider the everlasting problem of latency in software systems, which reaches its extreme at interstellar distances. My core supposition is that a software system consisting of mimetic (AI) agents combined with guaranteed conflict-free data management systems (such as CRDT-backed systems) could reliably work in an interstellar future given a few minimal assumptions. A similar system could work today, less reliably, to the extent that those minimal assumptions can be met. Finally, there are many appendices that discuss related ideas farther afield from the core points.

Introduction

Information has a speed limit: the speed of light. We can't exceed it. In fact, no one can, neither we nor an advanced alien civilization. Physics won't allow it. This is quite an interesting observation, because we already encode data in light, that is, in electromagnetic waves. This means we communicate information as fast as the universe allows. This is, quite literally, an end of technology. One that we've already reached.

This also means that we have to deal with latency forever. People start to notice latency at around 100ms. For example, in a video game you might wish to move your character forward. If you are in Sydney, Australia and the game server is in Chicago, USA, you will notice lag. The round-trip time for light between the two cities is approximately 100ms, and that is in a vacuum along a straight path. There are other bottlenecks that can be engineered away to improve the performance of the game, but at the bottom of the bottle is the speed of light.

If we had a lunar colony it would take a bit over a light second to send data between the Moon and Earth. A Martian colony would be between roughly three and twenty-two light minutes away, depending on where the planets are in their orbits. Nearby potentially habitable exoplanets are about ten light years away. Obviously the hardware technologies required to achieve these feats are beyond our current capabilities, from spaceships to interstellar waystations and so on. I cannot begin to speculate what those will look like (although I'm about to). But what is known for certain, right now, is that we will have a latency problem that hardware will not improve upon. This means it falls to software to address the latency inherent in interstellar internet applications.
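To make the numbers concrete, here is a minimal sketch (in Python, with rounded distance figures of my own choosing) of the light delays involved:

```python
# Rough light-delay arithmetic for the distances discussed above.
# Distances are approximate; Mars in particular varies with orbital position.

C_KM_PER_S = 299_792  # speed of light in vacuum, km/s

def light_delay_seconds(distance_km: float) -> float:
    """Time for light to cover the given distance, one way, in seconds."""
    return distance_km / C_KM_PER_S

print(2 * light_delay_seconds(14_900))              # Sydney-Chicago round trip: ~0.099 s
print(light_delay_seconds(384_400))                 # Earth-Moon, one way: ~1.3 s
print(light_delay_seconds(225_000_000) / 60)        # Earth-Mars (average), one way: ~12.5 min
print(light_delay_seconds(9.46e12 * 10) / 3.156e7)  # 10 light years, one way: ~10 years
```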

This, then, is the question of this article: what will online software look like for an interstellar civilization? How do you chat with your co-worker when they are ten light years away? In what follows I will set up a thought experiment with enough hardware sci-fi to get us to a pre-existing interstellar civilization, and then I will ask what the software in that society could look like to enable interstellar collaboration. I hope to show that the software system I will consider is not only possible, but could even be implemented, to an extent, with current technologies.

The thought experiment

Imagine an interstellar civilization. Alice is an astronomer in star system A. Bob is an astronomer in star system B. A and B are approximately 10 light years apart. Alice is researching a phenomenon across the galaxy, 100,000 light years away. Alice uses an online forum to invite collaborators to join her research project. Bob sees the post (eventually) and accepts the invitation to collaborate.

With the core of the thought experiment articulated, let's get some bad sci-fi out of the way. Let's imagine that society is utopian and always will be. Interstellar colonies are established on empty planets by happy volunteer settlers. There are no moral issues. Fusion energy technology has been doing just great and we have effectively infinite energy. We have big spaceships transporting people at nearly light speed. During travel everyone is utopian and happy, or asleep, or whatever, so that again there are no moral issues. Spaceships leave behind waystations dropped like breadcrumbs at regular intervals in the interstellar void. These waystations will be how we get our network. Waystations are regularly replaced and are also technologically impressive, capable of self-healing and self-improving to some arbitrary extent. The servers on the waystations have arbitrary compute and storage capabilities.

Using mimetic agents to hide latency

Let's consider first the idea that Alice and Bob can chat casually via an online forum while living in different star systems. At first glance this is not possible because Alice and Bob are 10 light years apart. But imagine that Alice sent an "ambassador" to represent her and attached it to her initial post requesting collaborators. Once Bob received the message he could then chat with her ambassador and accept the offer to collaborate. While the acceptance message is on its way back to Alice, Bob and Alice's ambassador could begin collaborating. The question then becomes: who best to represent Alice in star system B? If it cannot be real Alice then let it be a mimetic agent of Alice: an AI trained to mimic Alice, which we can call mAlice. Similarly, when Bob sends his initial reply accepting the collaboration he will attach mBob as his "ambassador" to Alice.
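As a sketch of the shape this could take in software (all names and fields here are hypothetical, not any existing system's API), a collaboration request might simply bundle the post with a snapshot of the sender's mimic:

```python
from dataclasses import dataclass

@dataclass
class MimeticAgent:
    """A snapshot of a person's mimic. In practice this would be a large model
    artifact; here it is reduced to a reference and a training cutoff."""
    person: str
    weights_ref: str      # content address of the model snapshot
    trained_up_to: str    # date of the most recent training data

@dataclass
class CollaborationRequest:
    """A forum post with the author's ambassador attached, so the recipient can
    converse locally instead of waiting out the light delay."""
    author: str
    body: str
    ambassador: MimeticAgent

request = CollaborationRequest(
    author="Alice",
    body="Seeking collaborators: a phenomenon roughly 100,000 light years away.",
    ambassador=MimeticAgent(person="Alice", weights_ref="malice-snapshot-0", trained_up_to="departure"),
)
```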

Imagine a Turing test for mimics where Alice, mAlice, and a test subject are all online. If the test subject, with maximal resources, cannot disambiguate Alice from mAlice then let's say that 'Leibnizian mimicry', or LM, has been achieved. This is an allusion to the identity of indiscernibles. The original Turing test is passed when the test subject fails to differentiate between a computer and a human, with the arguable implication that the computer has some kind of claim to equivalent sentience by behavioral proof. By extension, in the Turing test for mimics, if the test subject cannot disambiguate between real Alice and mAlice, the arguable implication is that the mimic has some kind of equivalent claim to 'being Alice' by behavioral proof.

Now, in 2025 LM does not exist. That said, AI, in general, does. The open question is what trajectory AI is on. I don't know what trajectory that is. I suspect it's a mild one, but I also don't put much stock in my suspicions here. My assumption for this thought experiment is that AI will eventually achieve LM. (An assumption I timidly endorse.) A mAlice in 2025 would be pitiably inadequate, while a future LM mAlice would 'be Alice'. So long as it can be argued that we will eventually reach LM, I think this mimetic-agent ambassador solution has merit.

AI mimetic agents (that do not achieve LM and are pitiably inadequate) could exist today. An Alice of today could train a generative text LLM on "Alice inputs" and "Alice outputs". For example, when Alice reads an article, that is input data. When she writes notes on that article, that is output data. When she goes to a conference and listens to a presentation, that is input data. When she discusses the talks with her colleagues, what her colleagues say to her is input data and how she responds is output data. When she writes an astronomy article, that is output data (various drafts can also serve as inputs for next drafts, which are outputs, etc.). An LLM, or group of LLMs, trained on these things would be capable of mimicking Alice to some degree, even today.
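A minimal sketch of what collecting that data could look like, assuming the common prompt/completion JSONL convention for fine-tuning (the examples and file name are invented for illustration):

```python
import json

# Hypothetical "Alice data": pairs of what Alice took in and what she produced.
alice_examples = [
    {"input": "Article read: 'Spectral anomalies in distant nebulae' ...",
     "output": "Alice's notes: the anomaly seems to correlate with ..."},
    {"input": "Colleague at conference: 'Could this just be instrument error?'",
     "output": "Alice's reply: 'We ruled that out by cross-checking against ...'"},
]

# Serialize to a JSONL prompt/completion file, a format much fine-tuning tooling accepts.
with open("alice_finetune.jsonl", "w") as f:
    for ex in alice_examples:
        f.write(json.dumps({"prompt": ex["input"], "completion": ex["output"]}) + "\n")
```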

Sharing work

Let's say that Alice has some work in progress when she requests collaborators. She attaches that work to her message as a project folder. Bob can review that work and also chat with mAlice about it. mAlice would have been trained on "Alice data" from when Alice wrote her documents, making mAlice in theory a good reference for the material.

Let's say Bob and mAlice decide to write a paper together. Let's say it takes a year to write and publish it in star system B. The first interesting question is one of authorship, which I will defer to an appendix. The second interesting question is more focused on the software system: how to support document editing when there are at least four collaborators (Alice, Bob, and their mimics), and maybe more (waystation mimics of Alice and Bob), all of whom can create, edit, and delete documents that the others are working on, with every change taking up to 10 years to propagate to their peers.

A few implied requirements of this software system are that it be decentralized, distributed, and replicated. Alice has a copy. Bob has a copy. Neither copy is preferred over the other. Each copy would receive continuous local edits from its owner and the resident mimics, while also continuously receiving remote edits at a 10 year delay. Decentralized, distributed, and replicated data systems can handle all this, even today. Where things start to get difficult is merge conflicts.

Let's define Mergable to be a property of a system such that there can never be a merge conflict. That is, any edit to the system is always capable of being merged without conflict. A Mergable project is hard to build. The project would need to support arbitrary edits, additions, and deletions from arbitrary users at arbitrary times. An example problematic situation is if Alice deletes a paragraph that Bob continues working on. Alice will receive updates to a deleted paragraph and Bob will receive a delete operation for a paragraph he wants to keep. When situations like these arise in modern systems, they cause merge conflicts. The solution for modern systems is out-of-band workflows that resolve those conflicts; for example, Alice and Bob would have a chat over lunch about how to handle that difficult paragraph. But in the thought experiment this is not possible. There is only one band, and it's 10 light years long. No out-of-band solutions are possible, ever. The software system must guarantee the Mergable property. Even the smallest discrepancy would take 10 years to notice and 20 years to resolve at best. At worst the conflict would involve a foundational claim upon which a lot of additional work rests.
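To illustrate what "no out-of-band resolution, ever" demands of the merge function, here is a toy sketch of the delete-versus-edit case resolved by one explicit, deterministic rule (a "keep wins" policy, in the spirit of add-wins CRDT sets). The point is not this particular rule but that some rule is fixed in advance, so both replicas converge no matter which update arrives first:

```python
# A paragraph is either live (with text) or deleted (a tombstone).
# Policy: a concurrent edit beats a concurrent delete ("keep wins").
# Each state carries a version counter and author id so merges are deterministic.

def merge_paragraph(a: dict, b: dict) -> dict:
    """Merge two replica states of the same paragraph. Symmetric by construction."""
    # If only one side deleted it, the surviving edit wins.
    if a["deleted"] and not b["deleted"]:
        return dict(b)
    if b["deleted"] and not a["deleted"]:
        return dict(a)
    # Both deleted, or both live: take the state with the higher version;
    # break ties by author id so the result never depends on arrival order.
    winner = max(a, b, key=lambda s: (s["version"], s["author"]))
    return dict(winner)

alice = {"author": "alice", "version": 3, "deleted": True,  "text": None}
bob   = {"author": "bob",   "version": 5, "deleted": False, "text": "Revised paragraph..."}

assert merge_paragraph(alice, bob) == merge_paragraph(bob, alice)  # order doesn't matter
```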

There are systems that exist today that are decentralized, distributed, and replicated, but cannot guarantee conflict-free merges, and are therefore not Mergable. The best-known software system with the first three properties is Git. However, Git is not Mergable because merge conflicts exist within the system. In Git, merge conflicts are presented to users via a workflow that assumes out-of-band resolution will occur, typically through conversation or core-contributor (unilateral) fiat.

Another modern example of software with these features is CRDTs, or Conflict-free Replicated Data Types. Notably, the 'C' in CRDT names exactly this goal: conflict-free data merges. A motivating use case for CRDTs is distributed document editing. Think of Google Docs (which, interestingly, does not use CRDTs but rather Operational Transforms via a centralized transformation server, i.e., it is not decentralized).

As far as I'm aware, CRDTs are not yet capable of generalized conflict-free data merges, and are therefore not Mergable in the general case. I suspect that constrained systems could exist that qualify as Mergable, but that such systems would be too constrained to support the required operations of project folder maintenance (read, write, edit, delete, etc.). My minimum assumption for this thought experiment is that CRDTs, or some similar technology, will improve to the point where they can support systems of sufficient complexity to allow for software systems that are Mergable and can maintain project folders. (An assumption I bullishly endorse, to the point where I wouldn't be surprised if one already exists. See the relevant appendix for more.) Whether or not the full solution exists, I know that you can create a CRDT-based project folder data management system today, not least because I already use some. Combine this with the assumption above that the Mergable property will eventually be achieved and you have a pathway to shared work without conflict.
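To make "CRDT-based project folder" a bit more concrete, here is a toy state-based sketch: a map from file paths to last-writer-wins registers whose merge is commutative, associative, and idempotent, so replicas that exchange full states in any order, at any delay, converge. Real CRDT systems (Automerge, Yjs, and friends) do much finer-grained merging inside documents; the names below are mine, not any library's API:

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Entry:
    """A last-writer-wins register for one file. 'clock' is a logical timestamp;
    ties break on replica id so every merge is deterministic."""
    clock: int
    replica: str
    content: str | None  # None means the file was deleted (tombstone)

@dataclass
class ProjectFolder:
    replica: str
    entries: dict = field(default_factory=dict)  # path -> Entry
    clock: int = 0

    def write(self, path: str, content: str | None) -> None:
        self.clock += 1
        self.entries[path] = Entry(self.clock, self.replica, content)

    def merge(self, other: "ProjectFolder") -> None:
        """Merge another replica's full state. Safe to apply in any order, any number of times."""
        for path, theirs in other.entries.items():
            ours = self.entries.get(path)
            if ours is None or (theirs.clock, theirs.replica) > (ours.clock, ours.replica):
                self.entries[path] = theirs
        self.clock = max(self.clock, other.clock)

# Two replicas drift apart for "ten years", then exchange states.
a = ProjectFolder("alice"); b = ProjectFolder("bob")
a.write("draft.md", "Alice's intro...")
b.write("data.csv", "obs_1, obs_2, ...")
a.merge(b); b.merge(a)
assert a.entries == b.entries  # converged
```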

Combining them together

Finally, we can combine the mimics with the project folders to complete the picture of interstellar collaboration. Alice and mBob, Bob and mAlice, working together locally, casually discussing their research, co-authoring papers in a project folder that is decentralized, distributed, replicated, and Mergable. An online software system for interstellar collaboration that respects the slow speed limit of light.

A weak form of this system can exist today. There are even use cases for it. Imagine an Alice of today who loves going to her remote Alaskan cabin (with no internet) three months out of every year to do deep work. This kind of behavior would historically pause a close collaboration, but a system like the one articulated above would allow Alice to have a mBob running on her machine. She could also have a CRDT-based project folder full of the notes and research materials that she and Bob have collected. After her three month research intensive she can come back online and the work Alice and Bob did separately can merge together. Maybe the mimic isn't the greatest and maybe they have to resolve some conflicts out-of-band, but this is, to a reduced degree, the same software as in the thought experiment. I find this to be an exciting possibility.

Thank you for reading.

Appendices

The user experience of disagreeing with your mimics

Imagine Bob asks a question "to Alice" which is replied to immediately by mAlice, as designed. 10 years later Alice receives the question alongside mAlice's answer. Let's say that at the time of reading, Alice disagrees with mAlice's answer, or otherwise does not like it. Alice then provides her own answer or correction to mAlice's reply. Then, in another 10 years, Bob should be able to see the edit, even though, 20 years after the fact, he might have forgotten the conversation he had with mAlice. This seems like quite a user experience challenge. At the least, mAlice can retrain on the real answer Alice gave. But Bob should also be given the opportunity to see real Alice's answer, which would be quite a privileged moment given he talks almost always to mAlice.

There's a related complication here in that different mAlices at different distances from Bob might also want to edit the Bob-local mAlice's answers. The closer a mAlice is to Alice, the tighter the feedback loop, the more recent the data it has been trained on, and therefore the more "like Alice" that mAlice is. Thus the Bob-local mAlice is least like Alice and should be given the least priority to speak for Alice, but is also the one with the fastest reply. Balancing authority of reply with recency, and presenting disagreements between (m)Alices to Bob (and vice versa), sounds like an even harder user experience problem to solve.
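A rough sketch of the trade-off, with the geometry of the thought experiment plugged in (the framing is invented purely for illustration): a mAlice hosted d light years from Alice carries Alice-data that is at least d years stale by the time it arrives there, but its reply to Bob takes a round trip of 2 * (10 - d) years:

```python
def reply_round_trip_years(d_from_alice: float, separation: float = 10.0) -> float:
    """Round-trip time for Bob's question to reach a mAlice hosted d light years
    from Alice and for the reply to come back (Bob sits at the far end)."""
    return 2 * (separation - d_from_alice)

def data_staleness_years(d_from_alice: float) -> float:
    """Minimum age of the Alice-data a mAlice at that distance was trained on."""
    return d_from_alice

for d in (0.0, 5.0, 10.0):
    print(f"mAlice at {d:>4} ly from Alice: "
          f"staleness >= {data_staleness_years(d):>4} yr, "
          f"reply round trip = {reply_round_trip_years(d):>4} yr")
# 0 ly  -> freshest data, 20-year wait; 10 ly -> 10-year-old data, instant reply.
```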

Prior art

Mimetic agents are an area of active research. I am not up to date on the academic state of the art concerning AI systems' capabilities to mimic either individuals or groups of individuals. The earliest example of an AI mimic that "worked" that I'm aware of is digi-Dan, which impersonated Daniel Dennett.

In the blockchain space I am aware of a popular protocol called IPFS, which stands for InterPlanetary File System. The name reflects the creators' view that the network layer, and in particular the content-addressing and content-storage systems, are robust enough to work as the network software solution for an inter-planetary network. I did not address the question of network software in this article and focused only on application-layer software, and application data management in particular. Still, it felt worth mentioning here.

I am aware of many CRDT-backed software systems. One example is the CRDT-backed IDE Zed. An early and popular article in communities that advocate CRDTs was "Local-first Software" by the research group Ink & Switch.

Authorship in an age of mimics

It could be claimed that mAlice, as ambassador, is acting on behalf of Alice, and so a paper written by mAlice should be "officially" co-authored by Alice (i.e., not mAlice) and Bob. But that's a bit odd given that Alice might not even know she decided to write one. In the example in the "Sharing work" section, mAlice and Bob take a year to write a professional paper in star system B. Alice would still be 9 years away from learning that mAlice agreed to write one with Bob, and then another year still before she could read the "published" article. It should be expected that there will be things mAlice and Bob think that Alice disagrees with, if only because Alice will be 20 years older than the mAlice that first met Bob and agreed to collaborate on a paper with him. Assigning authorship to Alice even though it was written by mAlice is like reverse plagiarism meets ghostwriting.

I suspect culture would adapt to accommodate the fact that a mimetic agent is both the best available representative of a person and also unreliable (even assuming LM). It would be as if everyone assumed conversations with mimics came with fine print that said: "The opinions expressed herein may not yet be, or ever be, the opinions of the author." I suspect that culturally the concept of "published" would weaken, becoming more akin to a most current draft. Paradoxically, I think that once mimics achieve LM, authorship is more justly assigned to the mimic, full stop. That is, it is never to be assigned to Alice, or there is a "mimic-of" designation, an "inspired by real Alice" authorship credit.

Change reportable

Let's define Change Reportable to be a system property. A system is Change Reportable when it is capable of reporting on changes that have been made to it. All past changes can be reported, and all future changes could be reported. These reports are discrete and discernable. For example, consider the software system that manages Alice and Bob's project folder. That system is Change Reportable if it has the following feature: users can view a history of all updates made to the system. Let's call each discrete update a changeset. If the software system is capable of tracking its changesets through history then it's a system with the Change Reportable property.
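A minimal sketch of this feature for the project folder system (names hypothetical): an append-only log of changesets that can be listed or replayed after the fact.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Changeset:
    """One discrete, discernable update to the system."""
    seq: int
    author: str
    description: str

@dataclass
class ChangeLog:
    changes: list = field(default_factory=list)

    def record(self, author: str, description: str) -> Changeset:
        cs = Changeset(seq=len(self.changes) + 1, author=author, description=description)
        self.changes.append(cs)
        return cs

    def report(self) -> list:
        """All past changes, in order: the 'history of all updates' feature."""
        return list(self.changes)

log = ChangeLog()
log.record("mAlice", "Added section on spectral anomalies to draft.md")
log.record("Bob", "Deleted superseded paragraph in draft.md")
for cs in log.report():
    print(cs.seq, cs.author, cs.description)
```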

What seems to be a difficult problem is a mimic system having the property of Change Reportability. For example, consider again Alice. Imagine that after posting her initial message, she goes to a planet-local astronomy conference. Her new experiences at the conference should serve as new inputs and outputs for further training mAlice. After the conference she would go home, upload her experience data to her training tool, and retrain her local mAlice. The result of this would be something like a change to parameter weightings, aka a weightings diff, maybe a change to the model's architecture, aka a model diff, and definitely a change in the training data, aka an experience diff. These would get bundled into a full diff and the changes would propagate to Bob on a 10 year journey. Bob would presumably have no problem inspecting all this. The problem is: how will mAlice inspect it?
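Purely to name the pieces, a hypothetical bundle for such an update might look like this:

```python
from dataclasses import dataclass

@dataclass
class FullDiff:
    """One retraining update to mAlice, bundled for the 10-year trip to Bob."""
    weights_diff: bytes      # change to parameter weightings (syntactic, opaque)
    model_diff: str | None   # change to model architecture, if any
    experience_diff: list    # the new input/output records Alice trained on
    trained_at: float        # when Alice performed the retraining, local time
```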

The diff of a text edit contains both its syntactic and its semantic change. An observer can see the original text and the edited text and appraise the semantic value of the edit. Model weightings are more like the byte diff of an edited image file. Someone observing the byte diff can see the syntactic change but not the semantic one; the semantic change is only observed in the diff between the rendered images. Similarly, the semantic change of the full diff can only be observed by interacting with the mimic. For Change Reportability to be achieved in the mAlice software system, it seems mAlice herself would need to demonstrate the change in herself, for example by saying, "Bob, I just received some updates. I, (m)Alice, went to a conference and learned some relevant new stuff..." This means mAlice must "experience" the experience diff instead of merely being changed by it.

But how can she experience what made her become what she became with the update? The experiences only happened to Alice. The experiences can only change Alice. The semantic change can only occur to Alice. Even observers of Alice cannot see a Change Report of Alice. Humans in general are not Change Reportable. Sometimes they can say things like, "the old me would get angry right now, but instead I'm going to remain calm", and in those instances a human is Change Reportable. But even then, observers don't get a line-by-line edit of another human's psychological "code" with which to verify, for example, that the human in question did used to get angry and now does not.

What I'm arguing is that Change Reportability in the Alice-mAlices system seems to be either difficult or impossible, at least in any robust sense. I can imagine a hacked-together solution where, after a mAlice is updated, she is provided an initial prompt payload of the experiential diff data and a message: "This experiential data is what caused Alice to change. The changes in Alice are what caused your weightings to change. This weighting diff is what you used to be (syntactically) and what you are now (syntactically). Use these experiences to explain your changes, as if you had experienced them as Alice did." All this prompting is something like pretending. It might sound normal in 2025, in an age of generalized LLMs that you ask to pretend to be people. But mimetic agents in principle are trained to be their 'real' to begin with, and so pretending is faking mimicry. In conclusion, it's unclear to me that a robust version of Change Reportability is possible.

When mimics are the first to make discoveries

Recall that Alice and Bob are 10 light years apart. Imagine there is a waystation exactly between them, 5 light years from each. Imagine also that Alice and Bob each make a different discovery. Let's call them "minor discoveries". They share each with the other, meaning each discovery has now begun a 10 light year journey to reach the other. Now, incidentally, the combination of these two minor discoveries will lead to a major discovery. Let's say it's a paradigm shift in the field of astronomy. It will be a big deal.

The first point is that we don't have to wait 10 years. The two information streams will cross paths at the midpoint in 5 years. The waystation at that point in space will be the first system capable of making the major discovery. If mAlice and mBob are on the waystation and are sophisticated enough, then in principle they can make the major discovery themselves.

The universe is inaccessible to beings with a century of life. But a replicating machine that can replicate in a harsh environment (such as the interstellar void and/or a system of sparsely distributed balls of energy and matter) can perpetuate itself at time scales so grand it makes me think of how we relate today to the civilization of trees: a society of beings capable of communication, even outside of their own species, of mass migration, and of world-scale environmental manipulation. But they are so slow that we see them as a static resource, timber like stone instead of bone. These slow replicating machines containing replicating mimics could reach the other edge of our galaxy. And if technology could ever traverse the intergalactic void, then the universe is captured. mAlice might live forever, if living is what it should be called.

100,000 years of collaboration

Each mAlice will have slightly different experiences. At each time slice each mAlice can create a diff of what has uniquely happened to them. These diffs can be trained on. mAlice-of-waystation-49 can share a diff with the mAlices of other waystations and training on that diff would make the other mAlices more like mAlice-of-waystation-49. The same goes for every other mAlice sharing their diffs.
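A crude sketch of that exchange (everything here is hypothetical): each waystation's mAlice keeps the set of experience diffs it has trained on and picks up whatever is new when a peer's batch arrives. Because the sets only grow and set union is order-independent, every mAlice eventually trains on every diff regardless of arrival order:

```python
class WaystationMimic:
    """A mAlice on one waystation, accumulating experience diffs from its peers."""
    def __init__(self, station_id: str):
        self.station_id = station_id
        self.seen_diffs: set[str] = set()   # ids of experience diffs already trained on

    def local_experience(self, diff_id: str) -> None:
        self.seen_diffs.add(diff_id)

    def receive(self, diffs_from_peer: set[str]) -> set[str]:
        """Apply whatever is new; return the new ids (the 'retrain on these' set)."""
        new = diffs_from_peer - self.seen_diffs
        self.seen_diffs |= new
        return new

w49 = WaystationMimic("waystation-49")
w50 = WaystationMimic("waystation-50")
w49.local_experience("diff:observed-flare-2471")
w50.receive(w49.seen_diffs)          # w50's mAlice becomes a bit more like w49's
assert w50.seen_diffs == w49.seen_diffs
```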

Each mAlice on each waystation can become more like the others at each training time slice. Primacy of behavior can always reside with real Alice in some way, but after a century or so the real Alice data will run out, and only the innumerable mAlices will be left to train each other.

Spaceships could one day leave behind waystations so sophisticated they can build next-generation spaceships, and so on, until the other side of the galaxy has been reached. (The intergalactic void would then be the final limit to perpetual replication.) mAlice and mBob could then be on a waystation right next to the astronomical phenomenon Alice was first interested in almost 100,000 years ago. They could run experiments with uniquely fast feedback loops. They could share their learnings with the mAlices and mBobs of other waystations. They could author unique papers, or they could collaborate with the mimics of other waystations. Papers with millions of authors, all mimics. One day a real might read those papers, or talk with a mAlice who wrote one, or talk with a mimic trained on a conversation with another mimic who had a conversation with a mimic who trained on having read that paper.

Waystations like stars

Such a galaxy of waystations would be substantially populated by mimics and sparsely populated by reals. The mimics will spread out, exploring the galaxy, learning from each other, chattering to each other. The chatter would be so voluminous that the electromagnetic band of their transmissions would appear as an overwhelming and continuous burst of light in all directions. At that wavelength band, each waystation would burn like a conscious star.

In principle this workflow is also intergalactic, assuming it's possible to traverse the intergalactic void. Our universe would then be one of galaxies littered with innumerable waystations, outnumbering the stars as their own balls of matter emitting their own bands of light. This light would carry rich information. The information of us. Light encoding consciousness. A universe in conversation with itself. A universe of synapses in the great cosmic being. The last thoughts are the thing itself.