A Guide to Andrej Karpathy's AutoResearch

Link: https://www.datacamp.com/tutorial/guide-to-autoresearch

Source: DataCamp, 2026

This practitioner-oriented tutorial walks through the autoresearch system step by step: how it is structured, how to set it up, and how to adapt its design principles to your own domain. DataCamp's framing is useful for readers who want to move from understanding what autoresearch does to running something like it. The tutorial covers the three-file architecture in depth: prepare.py (human-managed data utilities), train.py (the target that the agent edits), and program.md (the research direction, set by humans and iterated over time). The fixed 5-minute experiment budget, which is central to making all results comparable, gets particular attention.
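To make the fixed-budget idea concrete, here is a minimal sketch of a loop that stops on wall-clock time rather than on a step count. It is not taken from the autoresearch repository or the tutorial: run_with_budget, step_fn, and the injectable clock parameter are hypothetical names chosen for illustration.

```python
import time


def run_with_budget(step_fn, budget_seconds, clock=time.monotonic):
    """Call step_fn(step_index) repeatedly until the time budget is used up.

    Hypothetical helper, not part of autoresearch. `clock` is injectable so
    the loop can be exercised without actually waiting.
    Returns the number of steps that were started within the budget.
    """
    deadline = clock() + budget_seconds
    steps = 0
    while clock() < deadline:
        step_fn(steps)
        steps += 1
    return steps


if __name__ == "__main__":
    # A 5-minute budget would be budget_seconds=300; a fraction of a second
    # keeps this demo quick.
    n = run_with_budget(lambda i: time.sleep(0.01), budget_seconds=0.1)
    print(f"completed {n} steps within the budget")
```

Because every run gets the same amount of wall-clock time, a change that makes each step slower pays for it in fewer steps, which is one way a fixed budget keeps results comparable.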

What makes this tutorial worth reading beyond the GitHub README is the "how do I adapt this?" section. It addresses the obvious next question for anyone not running ML experiments: can this pattern (a human writes strategy in Markdown, an agent iterates on the implementation, the evaluation budget is fixed, and diffs stay reviewable) apply to other tasks? The tutorial's answer is yes, with guidance on the analogues in different domains. The key design constraints that make autoresearch trustworthy (narrow scope, bounded diffs, fixed evaluation, human-readable logs) translate to other contexts.
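The pattern described above can be sketched in a domain-neutral way. The following is an illustrative outline only, under stated assumptions: iterate, propose, evaluate, and Trial are hypothetical names, the metric is assumed to be "lower is better", and the keep-only-if-improved rule is an assumption for illustration rather than something this summary attributes to the tutorial.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    """One human-readable log entry per experiment (hypothetical structure)."""
    index: int
    description: str
    score: float
    kept: bool


def iterate(baseline, propose, evaluate, n_trials):
    """Propose a change, evaluate it, and keep it only if the score improves.

    Assumptions (not from the source): lower scores are better, and
    `evaluate` always runs under the same fixed budget so scores are
    comparable across trials.
    """
    best = baseline
    best_score = evaluate(best)
    log = []
    for i in range(n_trials):
        candidate, description = propose(best, i)  # the agent's bounded change
        score = evaluate(candidate)                # fixed evaluation
        kept = score < best_score
        if kept:
            best, best_score = candidate, score
        log.append(Trial(i, description, score, kept))
    return best, best_score, log


if __name__ == "__main__":
    # Toy stand-in: the "implementation" is a number, the metric is its
    # distance from 3, and each proposal subtracts 2.
    best, score, log = iterate(
        baseline=10.0,
        propose=lambda current, i: (current - 2, "subtract 2"),
        evaluate=lambda x: abs(x - 3),
        n_trials=5,
    )
    for trial in log:
        print(trial)
```

In another domain, propose would be whatever bounded edit the agent is allowed to make and evaluate would be that domain's fixed benchmark; the log is what a human reviews afterwards.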

This is the right starting point if you want to understand autoresearch well enough to build something inspired by it, rather than just reading about what Karpathy did. It is lighter than the GitHub repository and heavier than the press coverage, sitting at the practical-implementation level that is most useful for engineers and technically minded founders evaluating whether and how to apply the pattern to their own systems.