On the importance of data preprocessing ownership

4 June 2026 | data preprocessing, data analysis, model selection, ML, AI engineering, retrieval, vector dbs

A quick thought on three parallels that I have recently observed. We keep pushing the boundaries of what is possible and what can be automated, but:

These are still true in the AI era, by the way, although now the task is often done much faster by an AI agent. Besides that, is there another parallel in the time of AI engineering?

I can think of at least one:

Picture of a bunch of documents in need of sorting, by Wesley Tingey on Unsplash.
