Tree of Thoughts Improves AI Reasoning and Logic By Nine Times

Language models are increasingly being deployed for general problem solving across a wide range of tasks, but they are still confined to token-level, left-to-right decision-making during inference. This means they can fall short on tasks that require exploration, strategic lookahead, or where initial decisions play a pivotal role. Princeton and Google DeepMind researchers introduce a new framework for language model inference, “Tree of Thoughts” (ToT), which generalizes the popular “Chain of Thought” approach to prompting language models and enables exploration over coherent units of text (“thoughts”) that serve as intermediate steps toward problem solving.

Like chain of thought, tree of thought performs multi-step analysis, but it also compares multiple competing multi-step analyses against each other. After each step it can branch into several new options, and it can backtrack to the first or an earlier step to look again for alternatives. After multiple searches across these analytical branches, it settles on the best one.

Tree of Thoughts (ToT) allows LMs (Language Models) to perform deliberate decision making by considering multiple different reasoning paths and self-evaluating choices to decide the next course of action, as well as looking ahead or backtracking when necessary to make global choices. The experiments show that ToT significantly enhances language models’ problem-solving abilities on three novel tasks requiring non-trivial planning or search: Game of 24, Creative Writing, and Mini Crosswords.
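The core loop described above can be sketched as a breadth-first search over partial solutions. The sketch below is illustrative, not the paper's code: the `expand` and `score` functions stand in for the two LM calls (thought generation and self-evaluation), here replaced by a toy task of building the string "abc" one character per step.

```python
from typing import Callable, List

def tot_bfs(root: str,
            expand: Callable[[str], List[str]],
            score: Callable[[str], float],
            depth: int,
            beam: int) -> str:
    """Breadth-first Tree-of-Thoughts search (illustrative sketch).

    Each frontier state is expanded into candidate thoughts, every
    candidate is scored (in ToT, by the LM itself acting as a value
    function), and only the `beam` most promising states survive.
    """
    frontier = [root]
    for _ in range(depth):
        candidates = [s + t for s in frontier for t in expand(s)]
        candidates.sort(key=score, reverse=True)
        frontier = candidates[:beam]  # prune to the b best partial solutions
    return frontier[0]

# Toy stand-ins for the two LM calls: grow the string one character per
# step; the "value function" counts characters matching the target prefix.
TARGET = "abc"
expand = lambda state: ["a", "b", "c"]
score = lambda state: sum(1 for x, y in zip(state, TARGET) if x == y)

print(tot_bfs("", expand, score, depth=3, beam=2))  # -> abc
```

Backtracking falls out of the pruning step: a branch that looked good at depth one is abandoned as soon as its extensions score worse than a sibling's.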

For instance, in Game of 24, while GPT-4 with chain-of-thought prompting solved only 4% of tasks, the ToT method achieved a success rate of 74%.
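For context, Game of 24 asks whether four given numbers can be combined with +, −, ×, ÷ to reach exactly 24. A brute-force checker (not part of ToT, just to make the task concrete) can be written as:

```python
import operator
from fractions import Fraction

OPS = (operator.add, operator.sub, operator.mul, operator.truediv)

def solvable_24(nums):
    """True if the numbers can be combined with +, -, *, / to make 24."""
    def search(vals):
        if len(vals) == 1:
            return vals[0] == 24
        # pick any ordered pair, combine it, and recurse on the rest
        for i in range(len(vals)):
            for j in range(len(vals)):
                if i == j:
                    continue
                rest = [vals[k] for k in range(len(vals)) if k not in (i, j)]
                for op in OPS:
                    if op is operator.truediv and vals[j] == 0:
                        continue  # skip division by zero
                    if search(rest + [op(vals[i], vals[j])]):
                        return True
        return False
    # exact rational arithmetic avoids floating-point equality problems
    return search([Fraction(n) for n in nums])

print(solvable_24([4, 9, 10, 13]))  # (10 - 4) * (13 - 9) = 24 -> True
```

The difficulty for a left-to-right LM is that an early choice (which pair of numbers to combine first) can make the target unreachable, which is exactly where tree search over intermediate steps helps.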

Tree of Thought improves on basic input-output prompting, chain-of-thought prompting, and chain of thought with self-consistency.

The Tree-of-Thought approach extends existing planning formulations by considering multiple potentially feasible plans simultaneously at each problem-solving step and proceeding with the most promising ones. Combining thought sampling with value feedback organically integrates planning and decision-making, enabling effective search inside a solution tree. Traditional decision-making procedures usually require training dedicated reward and policy models, as in reinforcement learning, whereas ToT uses the LM itself to provide the value estimates for decision making.
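One way the LM can serve as its own value model is to ask it for a verbal judgment of each partial solution and map the answer to a number, averaging a few samples for robustness. The labels, weights, and stub judge below are illustrative assumptions, not the paper's exact prompt:

```python
# Map verbal judgments to scalar values; averaging several samples per
# state gives a more robust estimate. Labels and weights are assumptions.
VALUE = {"sure": 20.0, "likely": 1.0, "impossible": 0.001}

def state_value(judge, state, n_samples=3):
    """Score a partial solution by polling the judge several times."""
    votes = [judge(state) for _ in range(n_samples)]
    return sum(VALUE.get(v, 0.0) for v in votes) / len(votes)

# Deterministic stand-in for an LM judge, for demonstration only:
judge = lambda state: "sure" if state.endswith("= 24") else "likely"

print(state_value(judge, "(10 - 4) * (13 - 9) = 24"))  # -> 20.0
```

In the real system `judge` would be another prompt to the same model, so no separate reward model ever needs to be trained.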

The Tree-of-Thought formulation is more versatile and handles challenging tasks on which GPT-4 only achieves very low accuracy with standard prompts.

Deliberate search such as ToT might not be necessary for many existing tasks that GPT-4 already excels at, and as an initial step this work only explores three relatively simple tasks that challenge GPT-4 and call for better search and planning abilities incorporated with LMs. However, as LMs are deployed for more real-world decision-making applications (e.g. coding, data analysis, robotics, etc.), more complex tasks could emerge and present new opportunities to study these research questions. Also, search methods like ToT require more resources (e.g. GPT-4 API cost) than sampling methods in order to improve task performance, but the modular flexibility of ToT allows users to customize such performance-cost tradeoffs.

2 thoughts on “Tree of Thoughts Improves AI Reasoning and Logic By Nine Times”

  1. Well, this is interesting but it seems like trying to pound a nail with a crescent wrench. It seems to me that Herbert Simon’s General Problem Solver could do this kind of thing in 1957. Trying to do problem solving with Generative AI looks like it will go nowhere, but good for them for trying. Also, I didn’t read this in depth but it looked to me like each specific problem needed a tailored bit of code crafted in order for it to work; if so, that really reduces the utility.
