Replacing Myself with an LLM

Can I fool some of my close friends in a Signal groupchat?

Mar 24th, 2025
ai tutorial nodejs

I, like many normal people, have a close group of friends (I swear).

I have several Signal groupchats with these friends, where we send each other memes, discuss philosophy Socratic seminar style, and generally just clown around.

One day, one of these friends and I decided it would be a fun experiment to see if we could replace me with a local LLM without the others knowing.

I'm dubbing this "the ultimate Turing test" because it takes the original test further: not just imitating a human, but actually replacing a specific human. Which, as it turns out (at least in this case), is way, way harder.

Setup

Actually pulling this off was going to be extremely difficult, for several reasons.

Tech Stack

Luckily, setting up the tech side of this imitation game was actually the easiest part.

Even though Signal is end-to-end encrypted (E2E), it's also open source, and someone wrote a CLI-based Signal client for us to use. Even better, thanks to bbernhard, the CLI client also has a REST API wrapper server that we can use to create an Ollama-to-Signal bridge.

Thus, the only thing left to do was actually write a bridge to connect a local Ollama instance to this Signal REST API client, which, lucky for you, I already did.

To recreate this setup yourself, you'll need a few components up and running: the signal-cli REST API server, a local Ollama instance with a model pulled, and the bridge itself.

I vibecoded the Signal-to-Ollama bridge in Node.js. At a basic level, it polls the Signal endpoint for new messages, and when it receives a message in the specific groupchat, it processes the message and takes an action depending on what it was.
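
To give you an idea of the shape of it, here's a minimal sketch of that polling loop. The endpoint path is from signal-cli-rest-api's docs, but the env variable names and the handleMessage helper here are illustrative, not necessarily what the bridge actually uses:

// Poll the Signal REST API for new messages and dispatch anything
// from the target groupchat. Sketch only - env var names and the
// handleMessage helper are illustrative.
const SIGNAL_API = process.env.SIGNAL_API_URL ?? 'http://localhost:8080';
const NUMBER = process.env.SIGNAL_NUMBER;   // the linked Signal account
const TARGET_GROUP = process.env.GROUP_ID;  // which groupchat to imitate in

async function poll() {
  // GET /v1/receive/{number} returns messages queued since the last call
  const res = await fetch(`${SIGNAL_API}/v1/receive/${NUMBER}`);
  const messages = await res.json();

  for (const msg of messages) {
    const data = msg.envelope?.dataMessage;
    if (!data) continue; // skip receipts, typing indicators, etc.
    // note: the groupId seen on receive may be encoded differently than
    // the "group." id used for sending - GET /v1/groups maps between them
    if (data.groupInfo?.groupId !== TARGET_GROUP) continue;
    await handleMessage(msg.envelope); // decide: reply, react, or ignore
  }
}

setInterval(poll, 5000); // check for new messages every five seconds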

Setting up signal-cli and the REST API

To set up the Signal client, you have to sign in and authorize it just like any Signal app. The docker container I used has you authenticate your Signal account by scanning a QR code in a browser window:
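
For reference, spinning up the container and getting that QR code looks roughly like this (the image name and QR-link endpoint are per the signal-cli-rest-api docs; adjust paths and ports to taste):

# Run the signal-cli REST API wrapper (image name per the project's docs)
docker run -d --name signal-api \
  -p 8080:8080 \
  -v $HOME/.local/share/signal-cli:/home/.local/share/signal-cli \
  bbernhard/signal-cli-rest-api

# Then open the QR link in a browser and scan it from
# Signal > Settings > Linked Devices on your phone:
#   http://localhost:8080/v1/qrcodelink?device_name=llm-bridge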

Then it gives you access to the Swagger docs for the available endpoints:

Once the client was set up, I could begin developing the bridge by sending endpoint requests and processing the responses.

Something I discovered by accident while building this: although Signal allows you to "hide" your phone number from contacts, it still requires the number for direct messages. On top of that, you can still find the number of anyone you share a groupchat with by using the signal-cli client; it exposes the number even though it isn't visible anywhere in the normal Signal interface. I'm assuming this is intentional functionality from the Signal team, but it is somewhat disappointing.

Setting up the Ollama bridge and Ollama

First, clone my repository for the bridge.

git clone https://github.com/asynchronous-x/signal-to-ollama-bridge

Navigate into the directory and install the dependencies:

cd signal-to-ollama-bridge && npm i

Next, rename the .env.example file to .env and change the variables to the ones you want to use. This includes the initial context prompt, your Ollama endpoint and API keys, and the ID of the Signal group you want the bot to respond to.
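
For illustration, a filled-in .env might look something like this (the variable names and values here are made up for the example; check .env.example in the repo for the real ones):

# Example .env - variable names are illustrative
SIGNAL_API_URL=http://localhost:8080
SIGNAL_NUMBER=+15555550123
OLLAMA_URL=http://localhost:11434

# which group to imitate in, and how the bot should behave
GROUP_ID=group.SGVsbG8gZnJvbSB0aGUgYm90IQ==
SYSTEM_PROMPT="You are Alex. Reply in short, casual messages."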

Then run the bridge by starting the node server:

node server.js

Testing and Refining

The basic gist of the bridge is two types of functions. The first sends messages to Signal.
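
A minimal sketch of that, assuming signal-cli-rest-api's /v2/send endpoint and reusing the constants from the polling loop above (the bridge's real code may differ):

// Send a text reply into the groupchat via POST /v2/send.
// Sketch - reuses SIGNAL_API, NUMBER, and TARGET_GROUP from above.
async function sendToSignal(text) {
  await fetch(`${SIGNAL_API}/v2/send`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      number: NUMBER,              // the account doing the sending
      recipients: [TARGET_GROUP],  // a "group." id works as a recipient
      message: text,
    }),
  });
}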

And the second sends context and prompts to Ollama.
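
Here's a matching sketch for that side, using Ollama's /api/chat endpoint with streaming turned off (again, illustrative rather than verbatim):

// Ask the local Ollama instance for a reply given the chat history.
// With stream: false, /api/chat returns one JSON object whose
// message.content field is the generated text.
const OLLAMA_URL = process.env.OLLAMA_URL ?? 'http://localhost:11434';

async function askOllama(history) {
  const res = await fetch(`${OLLAMA_URL}/api/chat`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      model: 'gemma3',
      stream: false,
      messages: [
        { role: 'system', content: process.env.SYSTEM_PROMPT },
        ...history, // prior groupchat turns
      ],
    }),
  });
  const data = await res.json();
  return data.message.content;
}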

To test and refine this process, I created a mock groupchat with the friend I was setting this up with. Over the course of a few days I added more and more features, so the bot would do things like "emoji react" to messages, ignore some messages, and respond when someone @'d me specifically in the chat. I also added a "killswitch" keyword so that we could disable it without being at my machine. We experimented with a couple of different models, and even tried subbing in GPT-4o to see if it made a difference to plausibility.
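
Put together, the dispatch logic ends up looking something like this sketch (my reconstruction, not the bridge's exact code; the killswitch keyword, reply probability, and mention-checking details are placeholders):

// What to do with an incoming groupchat message.
// Sketch - keyword and probabilities are placeholders.
let enabled = true;
const history = []; // rolling chat history fed to the model

async function handleMessage(envelope) {
  const text = envelope.dataMessage.message ?? '';

  // killswitch: a secret keyword disables the bot remotely
  if (text.includes('pineapple-protocol')) {
    enabled = false;
    return;
  }
  if (!enabled) return;

  // (image reactions are handled by a separate coin-flip helper,
  // sketched later in the post)

  // always answer when someone @-mentions me; otherwise only reply
  // some of the time so the bot doesn't answer every single message
  const mentioned = (envelope.dataMessage.mentions ?? [])
    .some(m => m.number === NUMBER);
  if (!mentioned && Math.random() > 0.3) return;

  history.push({ role: 'user', content: text });
  const reply = await askOllama(history);
  history.push({ role: 'assistant', content: reply });
  await sendToSignal(reply);
}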

ironically this process taught me a lot about prompt injection too

One really interesting thing: messages sent from the Signal API showed up on my own phone as coming from a separate person, almost like another member of the chat rather than me sending them, but to everyone else in the chat they looked completely normal.

Running the experiment

Ultimately, for the actual experiment I ran Gemma3 on my own hardware on a local server, so no data about our groupchat actually left my network.

This was the first message the LLM sent on my behalf; the experiment started on a Wednesday evening.

After sending this message (my bot is in purple), I watched from the shadows; I did not send a message or reaction image until it was over.

I won't bore you with the play-by-play; I'll just share some of the highlights. For about two days the bot went relatively undetected, with only the occasional odd message, but nothing bad enough to warrant a confrontation (which was my self-selected end condition for the test).

Around the second day, it went off the rails and sent this very out-of-context message:

This garnered some confusion and questioning from the others in the chat. Luckily, my co-conspirator was able to do some damage control and keep the experiment running a while longer.

Then, on Friday around noon (~42 hours after the imitation game began), the bot responded in such an unhinged manner that one of the guys finally called it out:

After the guys called it out, we broke down the experiment and asked a few questions to try to pinpoint how the suspicion grew, then let them in on the joke the other friend and I had been sharing.

One thing to note: I had coded the bot to flip a coin on each image sent to the groupchat, and 50% of the time react to it with one of a preset list of emojis. This "algorithm" was so close to perfect for my typical reactions that I'm shocked.
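
That coin flip is about as simple as it sounds. A sketch using signal-cli-rest-api's reactions endpoint (field names per my reading of its docs; the emoji preset here is made up):

// Coin flip: 50% of the time, react to an incoming image with a
// random emoji from a preset list. Reuses the constants from above.
const REACTIONS = ['šŸ˜‚', 'šŸ’Æ', 'šŸ”„', 'šŸ‘€'];

async function maybeReact(envelope) {
  if (Math.random() < 0.5) return; // tails: no reaction
  const emoji = REACTIONS[Math.floor(Math.random() * REACTIONS.length)];
  await fetch(`${SIGNAL_API}/v1/reactions/${NUMBER}`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      reaction: emoji,
      recipient: TARGET_GROUP,               // where the message lives
      target_author: envelope.sourceNumber,  // who sent the image
      timestamp: envelope.timestamp,         // which message to react to
    }),
  });
}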

Once the cat was out of the bag and the jig was up, we started just messing around with the bot and trying to break it.

(I had coded the bot to only respond every so often, so we had to send a bunch of messages to ensure a response to that prompt injection.)

You can also see here the "random emoji" reaction falling to pieces in an edge case: someone sending this many short messages.

Eventually, after an absurd amount of time, the prompt-injected response came back, and wow, was it a doozy:

In fact, it actually broke the bot; the response was so screwed up that it started leaking training data:

I thought we had killed the bot, since I wasn't home to check the server logs at the time.

But then the bot just straight up started refusing to obey the prompt injections??

Conclusions

As it turns out, this enhanced imitation game is really, really hard. Most of the local models we tried are severely limited by their context windows, unable to process large numbers of messages and produce a sensible response. And when I tried larger paid models like 4o, they were too politically correct for as casual a conversation as this.

Overall, here's how I'd rank the models for this particular challenge:

  1. Gemma3
  2. ChatGPT 4o
  3. Deepseek R1
  4. Llama 3.3
  5. Llama 3

Improving the bridge

There are a bunch of ways I'd improve this bridge to make it more believable.

I highly encourage anyone interested to fork the codebase and try to improve the server. Contributions welcome <3
