Building AI Agents with Pydantic AI

Working a bit in all areas of the data spectrum, but specializing in Data Science and ML Engineering. Wouldn't call myself a Python expert. Thanks to the PyDay organisers for putting everything together.

This applies to pretty much any trending tech. Seen it for Big Data, AI, etc.

"LLM" is a very loose definition: it depends on the size of the neural network and on how plausible the generated text is. "Agent" comes from reinforcement learning.

A simple starting picture. This can become as complicated as necessary: sequences of actions, loops...

**Why a good example?**

- Real-world PDFs with messy data
- Structured but inconsistent formatting
- Perfect for extraction tasks

Two-agent architecture. The Extractor is responsible for reading the whole paper and getting information about authors and affiliations. The Resolver is responsible for normalising affiliations into standardised names and raising issues. Automatic author affiliation is, by the way, a real problem.

A large part of the interaction with LLMs revolves around validating their outputs. Pydantic does modern Python by leveraging type hints. There is a very nice pattern: model your inputs and outputs as Pydantic models, and then building an API on top is almost automatic.
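To make the validation point concrete, here is a small sketch using plain Pydantic (v2 API). The `Author` model and the `parse_author` helper are hypothetical names, and the JSON in the usage note is made-up sample data standing in for raw LLM output.

```python
from typing import Optional

from pydantic import BaseModel, Field, ValidationError


class Author(BaseModel):
    # Type hints double as the validation schema.
    name: str = Field(min_length=1)
    affiliations: list[str] = Field(default_factory=list)


def parse_author(raw_json: str) -> Optional[Author]:
    """Return a validated Author, or None if the text does not fit the schema."""
    try:
        return Author.model_validate_json(raw_json)
    except ValidationError:
        # Malformed JSON, missing fields and wrong types all end up here.
        return None
```

For instance, `parse_author('{"name": "Jane Doe", "affiliations": ["Example University"]}')` returns an `Author`, while an empty name or a non-list `affiliations` value returns `None`.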

Being model agnostic is important because of the speed at which the state of the art changes. Do not get married to a provider or a particular LLM.

See Enric's workshop earlier for LangGraph. These are all really useful tools, each with its own use case.

Alberto Cámara

AI is like high school sex

A glossary of terms

Today we will be working with arXiv papers

Our Challenge: Parse Author Data

Why use Pydantic?

Pydantic AI

Alternatives Comparison

Let's code!

https://github.com/ber2/2025-pyday-multiagents-pydantic-ai

Possible Extensions

Resources

Thanks