Replacing Myself with an LLM

Can I fool some of my close friends in a Signal groupchat?

Mar 24th, 2025
ai tutorial nodejs

I, like many normal people, have a close group of friends (I swear).

I have several Signal groupchats with these friends, where we send each other memes, discuss philosophy Socratic seminar style, and generally just clown around.

One day, one of these friends and I decided it would be a fun experiment to see if we could replace me with a local LLM without the others knowing.

I'm dubbing this "the ultimate Turing test" because it takes the original test further: not just imitating a human, but actually replacing a specific human. Which, as it turns out (at least in this case), is way, way harder.

Setup

Actually pulling this off was going to be extremely difficult, for several reasons.

Tech Stack

Luckily, setting up the tech side of this imitation game was actually the easiest part.

Even though Signal is end-to-end encrypted (E2E), it's also open source, and someone wrote a CLI-based Signal client for us to use. Even better, thanks to bbernhard, the CLI client also has a REST API wrapper server that we can use to create an Ollama-to-Signal bridge.

Thus, the only thing left to do was actually write a bridge to connect a local Ollama instance to this Signal REST API client, which, lucky for you, I already did.

To recreate this setup yourself, you'll need a few components up and running: the signal-cli REST API server, a local Ollama instance with a model pulled, and the bridge itself.

I vibecoded the Signal-to-Ollama bridge in Node.js. At a basic level, it polls the Signal endpoint for new messages, and when it receives a message in the specific groupchat, it processes the message and takes an action depending on what it was.
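
To give you an idea of the shape of it, here's a minimal sketch of that polling loop. The endpoint path is from signal-cli-rest-api's docs, but the env variable names and the handleMessage helper here are illustrative, not necessarily what the bridge actually uses:

// Poll the Signal REST API for new messages and dispatch anything
// from the target groupchat. Sketch only - env var names and the
// handleMessage helper are illustrative.
const SIGNAL_API = process.env.SIGNAL_API_URL ?? 'http://localhost:8080';
const NUMBER = process.env.SIGNAL_NUMBER;   // the linked Signal account
const TARGET_GROUP = process.env.GROUP_ID;  // which groupchat to imitate in

async function poll() {
  // GET /v1/receive/{number} returns messages queued since the last call
  const res = await fetch(`${SIGNAL_API}/v1/receive/${NUMBER}`);
  const messages = await res.json();

  for (const msg of messages) {
    const data = msg.envelope?.dataMessage;
    if (!data) continue; // skip receipts, typing indicators, etc.
    // note: the groupId seen on receive may be encoded differently than
    // the "group." id used for sending - GET /v1/groups maps between them
    if (data.groupInfo?.groupId !== TARGET_GROUP) continue;
    await handleMessage(msg.envelope); // decide: reply, react, or ignore
  }
}

setInterval(poll, 5000); // check for new messages every five seconds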

Setting up signal-cli and the REST API

To set up the Signal client, you have to sign in and authorize it just like any Signal app. The docker container I used has you authenticate your Signal account by scanning a QR code in a browser window:
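
For reference, spinning up the container and getting that QR code looks roughly like this (the image name and QR-link endpoint are per the signal-cli-rest-api docs; adjust paths and ports to taste):

# Run the signal-cli REST API wrapper (image name per the project's docs)
docker run -d --name signal-api \
  -p 8080:8080 \
  -v $HOME/.local/share/signal-cli:/home/.local/share/signal-cli \
  bbernhard/signal-cli-rest-api

# Then open the QR link in a browser and scan it from
# Signal > Settings > Linked Devices on your phone:
#   http://localhost:8080/v1/qrcodelink?device_name=llm-bridge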

Then it gives you access to the Swagger docs for the available endpoints:

Once the client was set up, I could begin developing the bridge by sending endpoint requests and processing the responses.

Something I discovered by accident while building this: although Signal allows you to "hide" your phone number from contacts, it still requires the number for direct messages. On top of that, you can still find the number of anyone you share a groupchat with by using the signal-cli client; it exposes the number even though it isn't visible anywhere in the normal Signal interface. I'm assuming this is intentional functionality from the Signal team, but it is somewhat disappointing.

Setting up the Ollama bridge and Ollama

First, clone my repository for the bridge.

git clone https://github.com/asynchronous-x/signal-to-ollama-bridge

Navigate into the directory and install the dependencies:

cd signal-to-ollama-bridge && npm i

Next, rename the .env.example file to .env and change the variables to the ones you want to use. This includes the initial context prompt, your Ollama endpoint and API keys, and the ID of the Signal group you want the bot to respond to.
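
For illustration, a filled-in .env might look something like this (the variable names and values here are made up for the example; check .env.example in the repo for the real ones):

# Example .env - variable names are illustrative
SIGNAL_API_URL=http://localhost:8080
SIGNAL_NUMBER=+15555550123
OLLAMA_URL=http://localhost:11434

# which group to imitate in, and how the bot should behave
GROUP_ID=group.SGVsbG8gZnJvbSB0aGUgYm90IQ==
SYSTEM_PROMPT="You are Alex. Reply in short, casual messages."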

Then run the bridge by starting the node server:

node server.js

Testing and Refining

The basic gist of the bridge is two types of functions. The first sends messages to Signal.
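
A minimal sketch of that, assuming signal-cli-rest-api's /v2/send endpoint and reusing the constants from the polling loop above (the bridge's real code may differ):

// Send a text reply into the groupchat via POST /v2/send.
// Sketch - reuses SIGNAL_API, NUMBER, and TARGET_GROUP from above.
async function sendToSignal(text) {
  await fetch(`${SIGNAL_API}/v2/send`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      number: NUMBER,              // the account doing the sending
      recipients: [TARGET_GROUP],  // a "group." id works as a recipient
      message: text,
    }),
  });
}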

And the second sends context and prompts to Ollama.
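
Here's a matching sketch for that side, using Ollama's /api/chat endpoint with streaming turned off (again, illustrative rather than verbatim):

// Ask the local Ollama instance for a reply given the chat history.
// With stream: false, /api/chat returns one JSON object whose
// message.content field is the generated text.
const OLLAMA_URL = process.env.OLLAMA_URL ?? 'http://localhost:11434';

async function askOllama(history) {
  const res = await fetch(`${OLLAMA_URL}/api/chat`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      model: 'gemma3',
      stream: false,
      messages: [
        { role: 'system', content: process.env.SYSTEM_PROMPT },
        ...history, // prior groupchat turns
      ],
    }),
  });
  const data = await res.json();
  return data.message.content;
}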

To test and refine this process, I created a mock groupchat with the friend I was setting this up with. Over the course of a few days I added more and more features, so the bot would do things like "emoji react" to messages, ignore some messages, and respond when someone @'d me specifically in the chat. I also added a "killswitch" keyword so that we could disable it without being at my machine. We experimented with a couple of different models, and even tried subbing in GPT-4o to see if it made a difference to plausibility.
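
Put together, the dispatch logic ends up looking something like this sketch (my reconstruction, not the bridge's exact code; the killswitch keyword, reply probability, and mention-checking details are placeholders):

// What to do with an incoming groupchat message.
// Sketch - keyword and probabilities are placeholders.
let enabled = true;
const history = []; // rolling chat history fed to the model

async function handleMessage(envelope) {
  const text = envelope.dataMessage.message ?? '';

  // killswitch: a secret keyword disables the bot remotely
  if (text.includes('pineapple-protocol')) {
    enabled = false;
    return;
  }
  if (!enabled) return;

  // (image reactions are handled by a separate coin-flip helper,
  // sketched later in the post)

  // always answer when someone @-mentions me; otherwise only reply
  // some of the time so the bot doesn't answer every single message
  const mentioned = (envelope.dataMessage.mentions ?? [])
    .some(m => m.number === NUMBER);
  if (!mentioned && Math.random() > 0.3) return;

  history.push({ role: 'user', content: text });
  const reply = await askOllama(history);
  history.push({ role: 'assistant', content: reply });
  await sendToSignal(reply);
}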

ironically this process taught me a lot about prompt injection too

One really interesting thing: messages sent from the Signal API showed up on my own phone as coming from a separate person, almost like another member of the chat rather than me sending them, but to everyone else in the chat they looked completely normal.

Running the experiment

Ultimately, for the actual experiment I ran Gemma3 on my own hardware on a local server, so no data about our groupchat actually left my network.

This was the first message the LLM sent on my behalf; the experiment started on a Wednesday evening.

After sending this message (my bot is in purple), I watched from the shadows; I did not send a message or reaction image until it was over.

I won't bore you with the play-by-play; I'll just share some of the highlights. For about two days the bot went relatively undetected, with only the occasional odd message, but nothing bad enough to warrant a confrontation (which was my self-selected end condition for the test).

Around the second day, it went off the rails and sent this very out-of-context message:

This garnered some confusion and questioning from the others in the chat. Luckily, my co-conspirator was able to do some damage control and keep the experiment running a while longer.

Then, on Friday around noon (~42 hours after the imitation game began), the bot responded in such an unhinged manner that one of the guys finally called it out:

After the guys called it out, we broke down the experiment and asked a few questions to try to pinpoint how the suspicion grew, then let them in on the joke the other friend and I had been sharing.

One thing to note: I had coded the bot to flip a coin on each image sent to the groupchat, and 50% of the time react to it with one of a preset list of emojis. This "algorithm" was so close to perfect for my typical reactions that I'm shocked.
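
That coin flip is about as simple as it sounds. A sketch using signal-cli-rest-api's reactions endpoint (field names per my reading of its docs; the emoji preset here is made up):

// Coin flip: 50% of the time, react to an incoming image with a
// random emoji from a preset list. Reuses the constants from above.
const REACTIONS = ['šŸ˜‚', 'šŸ’Æ', 'šŸ”„', 'šŸ‘€'];

async function maybeReact(envelope) {
  if (Math.random() < 0.5) return; // tails: no reaction
  const emoji = REACTIONS[Math.floor(Math.random() * REACTIONS.length)];
  await fetch(`${SIGNAL_API}/v1/reactions/${NUMBER}`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      reaction: emoji,
      recipient: TARGET_GROUP,               // where the message lives
      target_author: envelope.sourceNumber,  // who sent the image
      timestamp: envelope.timestamp,         // which message to react to
    }),
  });
}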

Once the cat was out of the bag and the jig was up, we started just messing around with the bot and trying to break it.

(I had coded the bot to only respond every so often, so we had to send a bunch of messages to ensure a response to that prompt injection.)

You can also see here the "random emoji" reaction falling to pieces in an edge case: someone sending this many short messages.

Eventually, after an absurd amount of time, the prompt-injected response came back, and wow, was it a doozy:

In fact, it actually broke the bot; the response was so screwed up that it started leaking training data:

I thought we had killed the bot, since I wasn't home to check the server logs at the time.

But then the bot just straight up started refusing to obey the prompt injections??

Conclusions

As it turns out, this enhanced imitation game is really, really hard. Most of the local models we tried are severely limited by their context windows, unable to process large numbers of messages and produce a sensible response. And when I tried larger paid models like 4o, they were too politically correct for as casual a conversation as this.

Overall, here's how I'd rank the models for this particular challenge:

  1. Gemma3
  2. ChatGPT 4o
  3. Deepseek R1
  4. Llama 3.3
  5. Llama 3

Improving the bridge

There are a bunch of ways I'd improve this bridge to make it more believable.

I highly encourage anyone interested to fork the codebase and try to improve the server. Contributions welcome <3
