my thesis on AI
My attempt to piece everything together… to understand the present and make predictions about the future.
April 2025
Key pieces
1. Deep learning works. As the "effective compute" used in deep learning scales across orders of magnitude (OOMs), performance consistently goes up. Effective-compute OOMs are driven by (1) raw compute (FLOPs), (2) algorithmic advances, and (3) "unhobbling" gains (such as access to the internet, tools, or agentic workflows) [1]. This is consistent with The Bitter Lesson [5] of 70 years of AI research: general methods that leverage computation (specifically, search and learning) are ultimately the most effective, and by a large margin. Sam Altman observes that "the intelligence of an AI model roughly equals the log of the resources used to train and run it" [9]. (A toy numerical sketch of this log-linear relationship follows this list.)
2. Three scaling paradigms exist: (1) pre-training, (2) chain-of-thought, and (3) best-of-N. Pre-training on a large corpus of data is what the original ChatGPT (GPT-3.5) was based on; chain-of-thought unlocked performance (especially in computationally intensive domains) by scaling inference-time compute, allowing the LLM to "reason" or "think" instead of blurting out the first answer that comes to mind. Finally, we can generate a large number of answers in parallel and pick the majority winner [6]. Each axis can be scaled independently. (A minimal best-of-N sketch follows this list.)
3. We are running out of training data, but synthetic data and self-play may break through the data wall, particularly in closed-form domains such as coding and mathematics, where self-play has enabled rapid gains (OpenAI's o3, for instance, outperformed >99% of competitive programmers on Codeforces). However, it is unclear whether LLM performance will scale in "reward-sparse" domains (such as creativity, writing, and humor), although releases such as GPT-4.5 and GPT-4o Image Generation continue to push the state of the art.
4. Reinforcement Learning from Human Feedback (RLHF) has been crucial in aligning AI with human preferences while ensuring safety. RLHF has been both a safety mechanism and a product innovation for current LLMs [1], instilling important basics such as instruction-following and helpfulness, generating human-preferred output, and providing safety guardrails (such as refusing dangerous requests). (A sketch of the preference loss at RLHF's core follows this list.)
5. However, it is unclear whether RLHF will be enough to align more powerful systems, especially systems optimized for long-horizon, outcome-based objectives in a world in which humans can no longer understand the AI's outputs (already, most humans are unable to verify the accuracy of PhD-level STEM solutions, for instance). New paradigms, such as a student-teacher approach in which weaker RLHF'd models supervise stronger models, have shown some promise.
6. Have we already achieved AGI? It depends on the definition - and everyone has a different one. What is undeniable is that current state-of-the-art AI models are very capable. Gemini 2.5 Pro and OpenAI's o3 have an estimated IQ above 130 (vs. a human baseline of 100); o3 scored 87.5% on ARC-AGI (vs. an 85% human baseline); o3 scored in the >99th percentile on Codeforces (competitive programming); and o3 scored 87.7% on GPQA Diamond (PhD-level STEM questions on which domain experts achieve 81%). (Ioannis' note: my definition of AGI is above-average human-level performance across reasoning (ARC-AGI), mathematics, and written language, so I believe o3 is AGI.)
7. However, significant "unhobbling" is still required to derive full utility from these models. Models currently lack reliable access to tools (such as calculators, web search, and control of a web browser or computer), lack integration with the human workflows that provide context (access to our thoughts, calendars, emails, and files), and suffer from limited context windows. (A minimal sketch of the tool-use loop follows this list.)
8. The present is increasingly agentic; the future will be even more so. AI agents are already showing their promise: OpenAI's Deep Research is replacing the first layer of primary research, OpenAI's Operator and Manus perform repetitive tasks (although they are not generally capable yet), and coding agents such as Cursor, Windsurf, v0, and Lovable translate natural-language prompts into full-fledged apps. A substantial amount of code today is written by AI agents instead of humans.
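On point 1, here is a toy numerical sketch of the claimed log-linear relationship between effective compute and performance. The power-law form and the constants are illustrative assumptions, not a fitted scaling law:

```python
# Toy illustration of log-linear scaling. Assume loss follows a power law
# L(C) = a * C**(-b) in effective compute C; the constants a and b are
# made up for illustration, not fitted to any real model.
a, b = 10.0, 0.05

def loss(compute_flops: float) -> float:
    return a * compute_flops ** (-b)

# Each order of magnitude (OOM) of compute buys the same multiplicative
# improvement: L(10 * C) / L(C) = 10**(-b), independent of C - that is,
# performance improves linearly in log(compute), matching Altman's
# observation above.
for oom in range(20, 27):  # 1e20 .. 1e26 FLOPs
    c = 10.0 ** oom
    print(f"compute = 1e{oom} FLOPs -> loss = {loss(c):.3f}")
```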
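On point 2, a minimal best-of-N sketch: sample N answers and take the majority vote. The `sample_answer` stub is a hypothetical stand-in for a real LLM call sampled at temperature > 0:

```python
import random
from collections import Counter

def sample_answer(prompt: str) -> str:
    # Stub: a noisy "model" that answers correctly 60% of the time.
    return "42" if random.random() < 0.6 else random.choice(["41", "43"])

def best_of_n(prompt: str, n: int = 32) -> str:
    # Sample n candidate answers and return the most common one.
    answers = [sample_answer(prompt) for _ in range(n)]
    winner, _count = Counter(answers).most_common(1)[0]
    return winner

print(best_of_n("What is 6 * 7?"))  # the majority vote almost always lands on "42"
```

Even a model that is right only 60% of the time becomes reliable under majority voting, which is why spending more inference-time compute buys better answers.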
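On point 4, RLHF pipelines typically begin by training a reward model on human preference pairs with a Bradley-Terry-style pairwise loss. A minimal sketch, with placeholder scores standing in for a learned reward model:

```python
import math

def preference_loss(r_chosen: float, r_rejected: float) -> float:
    # Pairwise preference loss: -log(sigmoid(r_chosen - r_rejected)).
    # Minimizing it pushes the reward model to score the human-preferred
    # answer above the rejected one.
    sigmoid = 1.0 / (1.0 + math.exp(-(r_chosen - r_rejected)))
    return -math.log(sigmoid)

print(preference_loss(2.0, -1.0))  # ~0.05: scores agree with the human label
print(preference_loss(-1.0, 2.0))  # ~3.05: scores disagree, large loss
```

The policy model is then optimized against this learned reward (e.g., with PPO), which is how instruction-following and refusals get instilled.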
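On point 7, a minimal sketch of the tool-use loop that "unhobbling" refers to: the model either answers or requests a tool call, and the tool's result is appended back into its context. `call_llm` and the tool registry here are hypothetical stubs, not a real API:

```python
def call_llm(messages: list[dict]) -> dict:
    # Stub: pretend the model first asks for a calculator, then answers.
    if not any(m["role"] == "tool" for m in messages):
        return {"tool": "calculator", "input": "17 * 23"}
    return {"answer": "17 * 23 = 391"}

# Toy tool registry; eval() is unsafe outside a toy example like this one.
TOOLS = {"calculator": lambda expr: str(eval(expr))}

def agent(question: str) -> str:
    messages = [{"role": "user", "content": question}]
    while True:
        reply = call_llm(messages)
        if "answer" in reply:
            return reply["answer"]
        result = TOOLS[reply["tool"]](reply["input"])
        messages.append({"role": "tool", "content": result})

print(agent("What is 17 * 23?"))
```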
Predictions for the future
1. AGI will commoditize capability and prioritize creativity, judgment, and agency. Similar to how washing machines freed up time in the household, or how farming allows chefs to focus on creating dishes instead of watering radishes, AGI will free up time in our professional lives by automating routine tasks, allowing us to uplevel into management and ownership. We will become tastemakers, creating companies and managing teams of AI agents. The barrier to entry to entrepreneurship and management will be agency, creativity, and judgment, not capability.
2. Abundance will result as the prices of goods and services fall and demand rises to a new equilibrium. Everyone will have a lawyer, an accountant, and a software engineer to work on their custom projects. AI will struggle to move from 99% automation to 100%, implying that humans will remain in the loop for a while as managers and owners (and will reap the benefits of AI-agent orchestration).
3. Capital will be paramount in the post-AGI world. The price of labor will fall as cheaper, more capable AI substitutes for automatable roles; the rewards to capital, equity, and ownership will magnify as humans uplevel into management and AI-agent orchestration. The ability to generate better results by spending more compute (scaling inference-time compute, best-of-N approaches) also points to the importance of capital post-AGI.
4. The future will be turbulent as our social order is reconfigured. Workers in industries that are disrupted faster by AGI will feel the pressure sooner to uplevel into management or change industries; this transition will be painful and, tragically, some may be left behind. The government may step in with increased social welfare and universal basic income programs.
5. It will take years for the effects of AI to propagate through the economy. Human inertia, regulatory and legislative hurdles, and data-privacy concerns will all have to be resolved. It took many years for the effects of the internet, cloud, and mobile to be felt; I expect AI to follow a similar (but accelerated) adoption curve.
6. Large (or effectively infinite) context sizes will unlock massive use cases. Models are currently limited by their context windows; upcoming releases will feature large (5M+ token) contexts capable of holding entire codebases, company databases, and other relevant data.
7. Startups will become smaller due to productivity gains - and we will see a single-person unicorn. Startups such as Cursor and Lovable are already generating $XXM+ in revenue with very small teams; this trend will continue as models progress.
8. AI models will become scientific innovators. Scientific knowledge today is so vast and specialized that many cross-field connections that could lead to discoveries go unnoticed; AI is better placed to find these connections and push science forward.
9. Open source and closed source will diverge, and closed source will win. The best models today are all closed-source (even if by a small margin); this divergence will widen as proprietary algorithms, compute resources, and the data ownership of companies such as Google, xAI, and OpenAI translate into enduring competitive advantages.
10. Product will matter. Raw model capability is important, but the way it is exposed to end consumers may be even more critical. Products such as Google's NotebookLM, OpenAI's "Ghibli moment" with 4o Image Generation, and OpenAI's Deep Research all captured the world's attention, creating delightful consumer interactions and generating signups in the process. I expect this to be a more meaningful differentiator among foundation models than benchmark results moving forward.
11. The foundation model layer will become commoditized; the majority of wrappers will be swallowed. Value will accrue to foundation model players with vibrant ecosystems and niche B2B vertical agents. Increasingly capable models will steamroll many in the application layer, especially in the basic-capability B2C space (PDF uploads, image editing, and so on). Foundation models will be commoditized, but ecosystems will maintain value (Gemini within Google Suite, ChatGPT as a first mover with XXXM MAUs, Grok within X).
Questions I'm pondering
- What is the role of humans in a post-AGI world? Do we model AGI as an increase in capital (per Jason Crawford [2]), which implies human prosperity through ownership and management, or as an increase in labor (per Matthew Barnett [7] at Epoch AI), which implies sub-subsistence wages?
- How will society organize post-AGI? The end of the hunter-gatherer and agricultural eras saw fundamental shifts in the way society was structured; will there be a shift of similar magnitude post-AGI? If people do not need to work, will they be able to derive meaning in their lives? Will we see universal basic income (UBI)? Will inequality be bridged or exacerbated?
- What will be the interpersonal impact of powerful AI models? As AI is shown to be more capable in traditionally "human" domains such as teaching, therapy, and healthcare, how will interpersonal human relationships be impacted? Will we spend most of our time talking to AI instead of other humans?
- Where will AI value accrue? Will foundation models become fully commoditized? Will wrappers survive, or will they be swallowed by ever more capable foundation models or built-on-demand custom software? Do competitive advantages even exist? What is the future of SaaS?
- How will we align future models? RLHF will break down when we cease to understand the outputs; can we train RLHF'd weaker models to supervise stronger models? What happens when models are architected to pursue long time-horizon tasks? What is p(doom)?
- What will the pace of progress be on non-closed-form tasks such as writing? Closed-form domains such as coding and mathematics benefit from self-play; how will the top AI writers, poets and artists fare against the top human ones?
- What is the role of the government in the race to AGI/ASI? Will projects move to state funding and control (per Leopold Aschenbrenner [1])? How big a role will espionage play (especially given the relative ease of stealing a model's weights)?
- How long will open-source foundation model companies continue to publish their techniques? Meta's Llama and DeepSeek's published research is immediately copied by closed-source model companies; how much longer will this continue?
Inspirations & footnotes
- "Situational Awareness" by Leopold Aschenbrenner. This piece really opened my eyes to the exponential rate of improvement and development we are facing with AI, and the risks and rewards along the way.
- "The future of humanity is in management" by Jason Crawford. A compelling piece in the post-AGI predictions debate. Jason argues that AGI is best modeled by an increase in capital rather than an increase in labor, leading to human prosperity through management and ownership.
- "By default, capital will matter more than ever after AGI" by L Rudolf L on lesswrong. Argues that social mobility will decrease and capital/ownership will matter more than ever in the post-AGI world.
- "The Zero-Day Flaw in AI Companies" by Aidan McLaughlin. Who will win - foundation models or wrappers? "General companies will steamroll narrow companies. Wrapper companies will outmaneuver model companies. Everyone is doomed…" says Aidan.
- "The Bitter Lesson" by Rich Sutton. General methods (specifically search and learning) that leverage computation are ultimately the most effective. Specialized models get left behind.
- "One Useful Thing" by Ethan Mollick. My favorite day-to-day AI writer.
- "AGI could drive wages below subsistence level" by Matthew Barnett. Matthew argues that AGI can (should?) be modeled as an increase in labor, driving down human wages to subsistence level.
- "Machines of Loving Grace" by Dario Amodei. Also, the inspiration behind the design of this website.
- "Three Observations" by Sam Altman.