What’s “Advanced Reasoning”? We saw this with the GPT-o1 rollout, but what does it mean, and how does it work? This new model design thinks before it answers, breaking down problems into steps, double-checking its work and delivering more accurate results. Aimed at developers, it’s smarter than most Humans, with a 120 IQ, making it ideal for building AI systems and apps. But interestingly, its design is even more like our own brains.
AI Professional – Fiona Passantino, late December, 2024
What’s different about this model?
The next breakthrough in system cognition came with the launch of ChatGPT-o1. Until that time, you would query a model, and it would spit back an answer, right or wrong. It tended to guess, get it wrong, be corrected by the Human operator, guess again, get refined and corrected again, and waste everyone’s time in the process.[i]
GPT-o1, rolled out in September 2024, introduced chain-of-thought reasoning. This means the system takes more time to think through a query before returning an answer, running a baked-in series of audits and filters before releasing it. The revision, prompt refinement and pushback for accuracy are done internally, by the Mixture of Experts in its own brain.
The difference is that before it delivers that first-best answer, it re-runs the process back through its parameters using different, adjacent Narrow models to provide perspective, check for errors and test for accuracy. This means it takes a bit longer to produce an answer – up to 45 seconds – but the likelihood of accuracy is higher in the end, saving time in the long run.
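The “answer, then cross-check with adjacent models” idea can be sketched in a few lines of Python. This is a toy illustration only, not OpenAI’s actual architecture: the three “expert” functions and the majority-vote rule are invented for this sketch, with one deliberately flawed expert standing in for a hallucinated first guess.

```python
from collections import Counter

def multiply_direct(a, b):
    """One 'expert': straightforward multiplication."""
    return a * b

def multiply_by_addition(a, b):
    """A second 'expert': repeated addition, a different route to the same answer."""
    total = 0
    for _ in range(b):
        total += a
    return total

def multiply_buggy(a, b):
    """A flawed 'expert' standing in for a hallucinated first guess."""
    return a * b - 10

def consensus_answer(a, b, experts):
    """Run every expert on the same problem and return the majority answer."""
    votes = Counter(expert(a, b) for expert in experts)
    answer, _count = votes.most_common(1)[0]
    return answer

experts = [multiply_direct, multiply_by_addition, multiply_buggy]
print(consensus_answer(17, 24, experts))  # the two correct routes outvote the flawed one: 408
```

Running several independent routes to the same answer costs more time per query – exactly the trade-off described above – but the consensus is more likely to be right than any single first guess.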
Chain of Thought Reasoning
What’s going on in the AI brain? The answer is Chain-of-Thought (CoT) reasoning.[ii] It’s a technique that enables models to dissect complex queries and tackle them step by step, much like we might when faced with a challenging dilemma. It’s a new pathway mechanism built into the model itself, developed using reinforcement learning. In short, an Advanced Reasoning model is designed to break a complex problem into bite-sized pieces and work through them one at a time before committing to a final answer. This does not mean it no longer hallucinates; while AI “creativity” is improving, it remains a problem, and one that requires constant Human intervention.
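To make the step-by-step idea concrete, here is a minimal sketch of chain-of-thought decomposition. The word problem and the step labels are invented for this illustration; the point is that each intermediate result is written down and carried forward, rather than the solver jumping straight to a final number.

```python
def solve_step_by_step():
    """Solve a toy word problem by recording every intermediate step.

    Problem: a train travels 60 km/h for 2 hours, then 80 km/h for 3 hours.
    What is its average speed over the whole trip?
    """
    steps = []

    first_leg = 60 * 2
    steps.append(f"Step 1: distance in first leg = 60 * 2 = {first_leg} km")

    second_leg = 80 * 3
    steps.append(f"Step 2: distance in second leg = 80 * 3 = {second_leg} km")

    total_distance = first_leg + second_leg
    steps.append(f"Step 3: total distance = {total_distance} km")

    total_time = 2 + 3
    steps.append(f"Step 4: total time = {total_time} h")

    average_speed = total_distance / total_time
    steps.append(f"Step 5: average speed = {total_distance} / {total_time} = {average_speed} km/h")

    return steps, average_speed

steps, answer = solve_step_by_step()
print("\n".join(steps))  # the reasoning trace, step by step
print(answer)            # 72.0
```

A model answering “70 km/h” in one leap (the naive average of 60 and 80) is exactly the kind of quick, unchecked guess the stepwise trace is designed to catch.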
What’s interesting about this approach is that the AI brain is starting to function more and more like our own. As children and teenagers, we respond to requests quickly, blurting out answers based on our first-best internal guesses, often without fact-checking on the inside first. Answers are “right” or “wrong”, and we are quick to judge “good” or “bad”. It’s easier and more satisfying. We may get creative in our responses and not care much about the consequences. We are impatient, after all, and want to get back to our Play-Doh-scapes or TikTok feeds.
As we grow older and, hopefully, somewhat wiser, more of our internal fact-checking and filters kick in. Has our response been road-tested internally for empathy, accuracy, relevance and appropriateness for the environment? Are we right, or is our brain hallucinating? A challenge might come in that requires a technical answer.
How Humans Run Inference
A typical Human would run inference via other, neighboring areas of the brain that are intuitive, emotional, visual-spatial or historical. We might blend our reply to offer another perspective. We might blend in ethics, feasibility and judgment. There’s very little in the world that’s “right” or “wrong”, “good” or “bad”, as there is always another way of seeing things, and we are old enough to know that most hard facts about the world are entirely subjective. We might not answer as quickly, but our replies are more considered, more accurate, and take a broader view.
The o1 model is not necessarily built for consumers, but for developers, or insiders. We non-technical users might not even notice the difference when we run standard inference, asking how to best shampoo our cat or translate a recipe for Sicilian Cassata into our native language.
But the GPT-o1 model scores an astonishing 10 IQ points higher than its predecessor, GPT-4o, rolled out just four months earlier.[iii] With a score of 120 on the standard Mensa Norway test, a full 20 points above the average Human, GPT-o1 surpasses around 91% of the population in raw cognitive ability. This is enormously helpful when developing applications or creating the synthetic data used to train tomorrow’s models.
Need help with AI Integration?
Reach out to me for advice – I have a few nice tricks up my sleeve to help guide you on your way, as well as a few “insiders’ links” I can share to get you that free trial version you need to get started.
About Fiona Passantino
Fiona helps empower working Humans with AI integration, leadership and communication, maximizing connection, engagement and creativity to bring more joy and inspiration into the workplace. A passionate keynote speaker, trainer, facilitator and coach, she is a prolific content producer, host of the podcast “Working Humans” and award-winning author of the “Comic Books for Executives” series. Her latest book is “The AI-Powered Professional”.
[i] Glaiel (2023) “Can GPT-4 *Actually* Write Code?” Substack. Accessed June 19, 2023. http://tylerglaiel.substack.com/p/can-gpt-4-actually-write-code
[ii] Kerner (2024) “OpenAI o1 explained: Everything you need to know” TechTarget. Accessed October 4, 2024. https://www.techtarget.com/whatis/feature/OpenAI-o1-explained-Everything-you-need-to-know
[iii] Şimşek (2024) “What is New ChatGPT o1 and Its Features” StockIMG.AI. Accessed October 4, 2024. https://stockimg.ai/blog/ai-and-technology/what-is-new-gpt-o1