Remember how I turned my old gaming laptop into a private LLM server with a Phoenix web interface? Maybe you remember how I was hindered by the fact that at the end of the day, my machine is a museum piece and barely runs models over 8GB?
Well, now I’m having the last laugh, because it turns out there’s a revolution occurring in the Large Language Model world, and it has the power to soup up the weak-sauce quantized (read: lobotomized) models that my aging system can barely run into true AGI (or at least, make them a bit better).
As always, if you don’t care about the why and just want the how, I’ve put a link to the complete code down at the bottom of the article. If you are interested in the why…let’s talk about the Model Context Protocol (MCP).
What is the Model Context Protocol and Why Should You Care?
The standard LLM chat interfaces of yesteryear (literally: last year) were typically just glorified text boxes – you type something in, the model spits text back out. That’s how I built my LLM interface in my last blog post.
But what if the LLM could do more than just respond with text? What if it could use the Google API, post something on X for you or use Retrieval Augmented Generation (RAG) to beef up your prompt? And most importantly, what if it knew when it should use these tools?
All of these abilities have been around for a while now, but there was no standard protocol for the models themselves to ask for and receive that external context. For example, over on Revelry AI, our first RAG implementation was pretty manual: we used a vector database to semantically search for relevant information *before* we sent anything to the LLM.
But what if the LLM could decide to do that for us by choosing from a menu of available tools?
Enter MCP (Model Context Protocol)
The Model Context Protocol provides a universal way to communicate between tools and models. This concept builds on the idea of AI “agents”, which have been around for a while. However, integrating each tool with each LLM usually required a bespoke, custom solution for every combination. MCP aims to solve this problem by establishing itself as a de-facto method of model/tool communication.
At its core, MCP is simply a way to plug’n’play data sources into the LLM: sources the LLM *chooses* to use or not, based on context.
Instead of the LLM trying to hallucinate answers about things it can’t know (like current data or your personal files), it can ask your application for that information directly using the tools available via MCP.
Hermes MCP: Elixir’s Bridge to Structured LLM Communication
You know that sigh of relief when you find exactly the dependency you want on GitHub? That was me finding `hermes_mcp`, an Elixir package that provides a clean, idiomatic way to work with the protocol. It’s new, it’s under development and *it works*.

Getting started with `hermes_mcp` is straightforward. First, I added it to my mix.exs dependencies:
def deps do
  [
    # ... other deps,
    {:hermes_mcp, "~> 0.1.0"}
  ]
end
Getting started beyond that, however, was slightly challenging. There’s **fairly** decent documentation over on hexdocs, but I hit some snags along the way – so let me show you what I came up with, so you don’t have to wallow in despair as I did.
1. Getting MCP servers running and accessible
First question: What MCP Servers providing which tools should I integrate?
I’m lazy – since I’d already been tinkering with Cursor’s capabilities, I already had a couple of MCP servers installed, including one for searching and injecting Hex PM docs as context.
This consists of a TypeScript server that offers two tools: the ability to fetch and embed Hex documentation (using Ollama), and the ability to search that documentation using Retrieval Augmented Generation.
So…sorted. Conceptually, the next step is this: we have to run this TypeScript server, offering these tools, locally on the machine. Then (if we can get it set up) we’ll use MCP to let our model and tools communicate over STDIO. The model will know what tools it has available, and the results of those ‘tool calls’ will be fed back to the model.
MCP supports two kinds of transport layers, which handle the actual communication between clients and servers: STDIO (Standard Input/Output), which is ideal for talking to processes running locally, and HTTP with Server-Sent Events (SSE). Both use JSON-RPC 2.0 as a common language.
`hermes_mcp` allows for both, and that means I don’t have to think about the transport layer beyond the initial set-up. Thank god. I’ll just be using STDIO for my purposes since everything is running locally.
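Just to make that concrete, here is roughly what a single tool invocation looks like on the wire: a JSON-RPC 2.0 `tools/call` request, shown as the Elixir map it gets encoded from. The method and params shape come from the MCP spec; the tool name and arguments are illustrative, and the exact envelope `hermes_mcp` builds may differ slightly.
# A tools/call request before JSON encoding; the id ties the response back to it.
%{
  "jsonrpc" => "2.0",
  "id" => 1,
  "method" => "tools/call",
  "params" => %{
    "name" => "search",
    "arguments" => %{"query" => "GenServer.call/3 timeout"}
  }
}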
However, if your LLM client runs in the cloud, or if you prefer to manage your tool servers separately, running them in a cloud environment is simple enough – many MCP servers are conveniently packaged as Docker images, which makes it straightforward to deploy them on the cloud.
Most servers natively communicate over STDIO, but if you want to make them accessible via HTTP/SSE, it’s easy enough to wrap the servers as discussed over here in a Fly Forum post.
So let’s go already! Here’s our application supervision tree.
# A structured supervision tree for MCP components
def start(_type, _args) do
  children = [
    {LlmInterface.MCPSupervisor, []}
  ]

  opts = [strategy: :one_for_one, name: LlmInterface.Supervisor]
  Supervisor.start_link(children, opts)
end
and let’s take a closer look at what that `LlmInterface.MCPSupervisor` starts up:
children =
  [
    # Group 1: Transports and Servers first
    Supervisor.child_spec(
      {Hermes.Transport.STDIO,
       [
         name: @hex_docs_mcp.hexdocs_mcp_transport_name,
         client: @hex_docs_mcp.hexdocs_mcp_client_name,
         command: "npx",
         args: ["-y", "hexdocs-mcp@0.2.0"]
       ]},
      id: :hexdocs_mcp_transport
    ),
    # ...other transports and servers

    # Group 2: Clients afterwards
    Supervisor.child_spec(
      {Hermes.Client,
       [
         name: @hex_docs_mcp.hexdocs_mcp_client_name,
         transport: [
           layer: Hermes.Transport.STDIO,
           name: @hex_docs_mcp.hexdocs_mcp_transport_name
         ],
         client_info: %{
           "name" => "LlmInterfaceWeb",
           "version" => "1.0.0"
         }
       ]},
      id: :hexdocs_mcp_client
    ),
    # ...other clients

    # Group 3: Tools Registry - starts after all clients are ready
    {LlmInterface.MCPTools,
     [
       clients: [
         {@hex_docs_mcp.hexdocs_mcp_prefix, @hex_docs_mcp.hexdocs_mcp_client_name}
         # ... other clients
       ]
     ]}
  ]
Now you might think that looks a little strange, and you’re not wrong. The first group handles setting up the server(s) and the transport(s), and our second group starts up the actual clients, using those transports.
Interestingly, the transport layer is implemented as its own process and added to the client – rather than just being configured within the client (which seems like overkill for STDIO). The transport could theoretically be implemented as functions within the client – and this would be perfectly fine (after all, the transport is just ‘how the client connects to the server’). But you gotta draw your boundaries somewhere! SSE or STDIO: all clients interact with their servers through a transport layer, regardless of the underlying communication mechanism. It’s the Elixir way: why solve a problem with one process when you can solve it with three?
We’re using the `:rest_for_one` strategy inside `LlmInterface.MCPSupervisor` (the top-level application supervisor above stays `:one_for_one`) – this is important because of the order in which the processes are set up (transports and servers first, then clients). If a process fails, all processes started after it (in order) are restarted, which (should) maintain the proper dependency chain during recovery.
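To make the ordering concrete, here’s a minimal sketch of how that supervisor might be wired up. The module body is my reconstruction rather than a copy from the repo; the children are the list we just walked through.
defmodule LlmInterface.MCPSupervisor do
  use Supervisor

  def start_link(opts) do
    Supervisor.start_link(__MODULE__, opts, name: __MODULE__)
  end

  @impl true
  def init(_opts) do
    children = [
      # Group 1: transports/servers, Group 2: clients, Group 3: MCPTools -
      # exactly the list shown above
    ]

    # :rest_for_one - if a transport dies, everything started after it
    # (its client, the tools registry) is restarted too, in order.
    Supervisor.init(children, strategy: :rest_for_one)
  end
end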
The third thing we start up in our supervision tree is the MCP Tools GenServer – and this is the place where we’ll actually do the business of ‘discovering’ and then ‘calling’ the tools from our servers.
2. MCPTools GenServer
These servers provide capabilities to our application when they start up. All tools are capabilities, but not all capabilities are tools – ‘capabilities’ encompasses the full scope of what an MCP server advertises it can support or handle (including tools).
The thing is, we don’t know that off the jump. Sure, you could (and I did) hard-code the tools and the inputs that you get from `Hermes.Client.list_tools(client, opts \\ [])` so you know what to offer the LLM.
Here’s what a tool specification looks like, as returned by `hermes_mcp` (I’ve just included one tool rather than both for brevity):
%Hermes.MCP.Response{
  result: %{
    "tools" => [
      %{
        "inputSchema" => %{
          "$schema" => "http://json-schema.org/draft-07/schema#",
          "additionalProperties" => false,
          "properties" => %{
            "force" => %{
              "default" => false,
              "description" => "Force re-fetch even if embeddings already exist",
              "type" => "boolean"
            },
            "packageName" => %{
              "description" => "The Hex package name to fetch (required)",
              "type" => "string"
            },
            "version" => %{
              "description" => "Optional package version, defaults to latest",
              "type" => "string"
            }
          },
          "required" => ["packageName"],
          "type" => "object"
        },
        "name" => "fetch"
      }
    ]
  },
  id: "req_GDiDHc3JVUS4oglPFB0=",
  is_error: false
}
What would be great is if we could ask the servers about their capabilities and tools once they’ve started up, and then maintain a record of what each server can do. That means holding state, and that means a GenServer!
Our GenServer is going to maintain two key pieces of state:
– `tools`: A list of all available tools for the LLM to use.
– `tool_map`: A mapping from tool names to their implementing clients.
This way, we can fetch what we need to know about the server(s) first and dynamically refresh them if needed.
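Here’s a simplified sketch of what that GenServer might look like. The `{prefix, client}` tuples are the ones handed in from the supervision tree above; the prefixed tool names and the exact return shape of `list_tools/1` are assumptions on my part, so treat it as illustrative rather than a copy of the real module.
defmodule LlmInterface.MCPTools do
  use GenServer

  # execute_tool_call/1 (shown in the next section) also lives in this module.

  def start_link(opts), do: GenServer.start_link(__MODULE__, opts, name: __MODULE__)

  @impl true
  def init(opts) do
    clients = Keyword.fetch!(opts, :clients)
    # Defer discovery so init/1 returns quickly and the clients have a
    # chance to finish their MCP handshake first.
    {:ok, %{clients: clients, tools: [], tool_map: %{}}, {:continue, :discover}}
  end

  @impl true
  def handle_continue(:discover, state) do
    {tools, tool_map} =
      Enum.reduce(state.clients, {[], %{}}, fn {prefix, client}, {tools_acc, map_acc} ->
        # Assuming list_tools/1 returns the same {:ok, %Hermes.MCP.Response{}}
        # shape we saw above.
        case Hermes.Client.list_tools(client) do
          {:ok, %Hermes.MCP.Response{result: %{"tools" => client_tools}}} ->
            new_map =
              Enum.into(client_tools, map_acc, fn %{"name" => name} ->
                {"#{prefix}_#{name}", {client, name}}
              end)

            {tools_acc ++ client_tools, new_map}

          _ ->
            {tools_acc, map_acc}
        end
      end)

    {:noreply, %{state | tools: tools, tool_map: tool_map}}
  end

  @impl true
  def handle_call({:get_tool_info, name}, _from, state) do
    case Map.fetch(state.tool_map, name) do
      {:ok, {client, tool_name}} -> {:reply, {:ok, client, tool_name}, state}
      :error -> {:reply, :error, state}
    end
  end
end
The keys of `tool_map` are what we advertise to the LLM; `execute_tool_call/1`, below, routes a call back through that map to the right client.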
3. Executing Tool Calls
The `execute_tool_call/1` function provides an interface for the LiveView to interact with MCP tools:
def execute_tool_call(%{"name" => name, "arguments" => arguments}) do
  IO.puts("Executing tool: #{name} with arguments: #{inspect(arguments)}")

  case GenServer.call(__MODULE__, {:get_tool_info, name}) do
    {:ok, client, tool_name} ->
      case Hermes.Client.call_tool(client, tool_name, arguments) do
        {:ok, %Hermes.MCP.Response{result: result, is_error: false}} ->
          result

        {:ok, %Hermes.MCP.Response{result: error_info, is_error: true}} ->
          %{error: "MCP error: #{inspect(error_info)}"}

        {:error, error} ->
          %{error: "Tool execution failed: #{inspect(error)}"}
      end

    :error ->
      %{error: "Unsupported tool: #{name}"}
  end
end
This function:
1. Looks up which client is responsible for handling a particular tool
2. Extracts the tool name
3. Delegates the call to the appropriate client
4. Handles and formats various error conditions
Hey presto! I can dynamically read the capabilities of each server and provide them to the LLM.
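For a rough sense of how this gets used from the chat side, here’s the kind of call the LiveView makes when the model emits a tool call. `append_tool_result/2` is a hypothetical stand-in for however you feed the output back into the conversation, and the tool name is simply whatever key ended up in the tool map.
# The model asked for the Hex docs of :phoenix, so it emitted this tool call.
tool_call = %{"name" => "fetch", "arguments" => %{"packageName" => "phoenix"}}

socket =
  case LlmInterface.MCPTools.execute_tool_call(tool_call) do
    %{error: reason} ->
      # Feed the failure back so the model can apologise or try something else.
      append_tool_result(socket, "Tool failed: #{reason}")

    result ->
      # Hand the raw result back to the model as extra context for its next turn.
      append_tool_result(socket, inspect(result))
  end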
API Key Related Setbacks
After getting the basic MCP tools working, I naturally wanted to expand my model’s capabilities with some real-world data sources. What good is an agent if it can’t actually interact with the world?
My ideal candidates were the Google Maps API (because that’s a fun way to put an LLM through its paces) and the Brave Search API (because it’s free under a certain amount of usage and my model will no longer think it’s 1969).
I’m not exactly sure why – either I did something wrong (likely) or the library has a bug (unlikely) – but this is where things started to go wrong. I was facing maddeningly cryptic JavaScript errors, and things just weren’t working. I basically re-wrote the entire way my app worked, but nothing fixed my issue.
Eventually though, I nailed it down: I had trouble specifically running TypeScript MCP servers that required an API key set in an env variable. It worked roughly a third of the time, which makes me suspect it’s a race condition due to the way I was starting things up.
But both these servers could also be run using Docker (always the solution). I ended up running the three servers and my application, and everything worked flawlessly. If somebody reading this knows or suspects why this is happening: please let me know!
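For reference, switching a transport over to Docker is just a matter of changing the command the STDIO transport spawns. The image name and the `@brave_search` attribute below are illustrative (check the server’s README for the real image and required env vars); `-e BRAVE_API_KEY` forwards the key from the host environment into the container.
# Same pattern as the hexdocs transport above, but spawning a Docker container
# instead of npx.
Supervisor.child_spec(
  {Hermes.Transport.STDIO,
   [
     name: @brave_search.transport_name,
     client: @brave_search.client_name,
     command: "docker",
     args: ["run", "-i", "--rm", "-e", "BRAVE_API_KEY", "mcp/brave-search"]
   ]},
  id: :brave_search_transport
)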
Implementing Your Own MCP Clients
You’re not limited to the clients I’ve included. Creating your own MCP client involves:
1. Setting up the transport layer (STDIO or HTTP)
2. Defining the client configuration
3. Implementing the tool handlers
And there already exist so, so many servers, with more available every day. Check out this list!
Conclusion: Taking Your LLM Server to the Next Level
Is my LLM still a lobotomized model which can barely keep the thread of a conversation? Yes. But by integrating MCP, I’ve fundamentally changed the relationship it has with information.
Before:
After:
Rather than trying to make my model know everything, I’ve given it something far more valuable: the ability to know how to find things out. In a strange way, I’ve essentially made it more human – I don’t know what the capital of Belize is, but I do know where to find out. And now so does my poor hardware-constrained LLM!
Link to the Code
It’s Belmopan, by the way