Upskilling Week workshop on Building a production-grade AI agent with Python

It’s hacking time!

Yesterday I delivered a workshop at the Upskilling Week organised by Mobile WorldCapital Barcelona, as a collaboration through PyBCN and delivered at Barcelona Activa’s Cibernàrium.

My purpose was to show how to apply the good old fashioned software engineering good practices to an agentic application. For this purpose, we built an agent with a raw ReAct loop capable of converting natural language questions about weather into queries against an SQLite database containing historical data from the Meteobeguda weather station.

So questions like “Are we currently experiencing a heatwave?” or “What is the longest streak of tropical nights on record?” can be answered by the agent by using a tool capable of running read-only queries against the database.

After building this agent, we reimplemented several parts of it in order to show how to make the application more robust:

We rebuilt the main loop as a LangGraph StateGraph.
We built an independent service hosting the tools as an MCP via FastMCP.
We added callbacks to Langfuse in order to register traces of agent calls.
We added a set of evals in Langfuse order to assess agent performance.

The whole workshop was setup as a series of TDD steps separated into git branches by which, after solving one part of the workshop and getting pytest green, one could checkout the branch with the next part of the workshop, with new failing tests and a new task to code.

We really insisted in model agnosticity and the need to provide a free experience (both in the sense of free speech and free beer), so workshop defaulted to the usage of small LLMs that run reasonably on a CPU via Ollama, while allowing participants to bring their own model from a third-party provider. We used Docker containers as a means of running a self-hosted local Langfuse deployment, and as a quick way to run Ollama locally for those who did not wish to install it directly on their devices.

Some takeways

A summary of some of the take-home ideas that were central to the workshop.

Demistify AI agents: at the core, an agent is just a loop involving calls to an LLM and calls to a list of provided tools.
Model agnosticity: think of your LLM as a commodity. Your application code should depend as little as possible on the choice of LLM and LLM provider. Avoid vendor lock-in.
Good software engineering practices are as relevant as ever: in the age of coding assistants, ensuring that your code is tested, robust and maintainable is as important as ever, and forcing coding agents to use methodologies such as TDD is a great way to get good software out of them.
I am personally very happy about how many people came to the workshop; I had a great time preparing and delivering it.

Upskilling Week workshop on Building a production-grade AI agent with Python

It’s hacking time!

Some takeways

Links