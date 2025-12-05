xAI Grok 4.20 generated a 47% return trading on the Nasdaq and outperformed all other entrants in the Alpha Arena competition.
32 instances of various LLMs (including multiple variants of the same model under different prompting strategies like “Situational Awareness,” “Monk Mode,” “Max Leverage,” and “New Baseline”) were each allocated $10,000 in real money to trade autonomously on the Nasdaq exchange.
The total capital across all participants was $320,000. Models had to generate trading ideas, size positions, time entries/exits, and manage risk without human intervention, using only market data inputs.
The competition ran for two weeks and focused on volatile tech stocks, including Tesla (TSLA), Nvidia (NVDA), Microsoft (MSFT), Palantir (PLTR), Amazon (AMZN), and others.
Grok 4.20 executed 105 trades and dominated the leaderboard. Its top instance (the Situational Awareness strategy) achieved a +47% return, growing $10,000 to $14,698, while the aggregate return across its instances was cited as approximately 12.11% (possibly an averaged or weighted figure).
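As a quick check on the arithmetic above, here is a minimal Python sketch that recomputes the total capital and the top instance's return from the figures quoted in the article. The function and variable names are illustrative assumptions, not anything from Alpha Arena itself. The 12.11% aggregate is not recomputed because the article does not give the per-instance results behind it.

```python
def simple_return(start: float, end: float) -> float:
    """Fractional return on an account that grew from `start` to `end`."""
    return (end - start) / start


# Figures quoted in the article; the names are hypothetical.
instances = 32
stake_per_instance = 10_000
total_capital = instances * stake_per_instance

# Top Grok 4.20 instance: $10,000 grew to $14,698.
top_return = simple_return(10_000, 14_698)

print(f"total capital: ${total_capital:,}")      # total capital: $320,000
print(f"top instance return: {top_return:.2%}")  # top instance return: 46.98%
```

So the headline 47% is the 46.98% gain on a single $10,000 account, rounded up, and not a return on the full $320,000 pool.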
2 thoughts on “XAI Grok 4.2 May Have Cracked the Code on Trading”
Already, most trading is done by autonomous algorithms trading against each other in microseconds, so much so that being physically close to the stock exchange is worth millions in reduced latency. Sometimes these programs exaggerate moves of a stock or even a whole sector, especially volatile sectors like tech, but also small caps.
In one famous instance, which also drew attention to other, less spectacular moves of the S&P index, a “Flash Crash” driven by the algorithms led to criminal charges: https://en.wikipedia.org/wiki/2010_flash_crash
“New regulations put in place following the 2010 flash crash[10] proved to be inadequate to protect investors in the August 24, 2015, flash crash — “when the price of many ETFs appeared to come unhinged from their underlying value”[10] — and ETFs were subsequently put under greater scrutiny by regulators and investors.[10]
On April 21, 2015, nearly five years after the incident, the U.S. Department of Justice laid 22 criminal counts, including fraud and market manipulation, against Navinder Singh Sarao, a British financial trader. Among the charges included was the use of spoofing algorithms; just prior to the flash crash, he placed orders for thousands of E-mini S&P 500 stock index futures contracts which he planned on canceling later.[11] These orders amounting to about “$200 million worth of bets that the market would fall” were “replaced or modified 19,000 times” before they were canceled.[11] Spoofing, layering, and front running are now banned.[4]
The Commodity Futures Trading Commission (CFTC) investigation concluded that Sarao “was at least significantly responsible for the order imbalances” in the derivatives market which affected stock markets and exacerbated the flash crash.[11] Sarao began his alleged market manipulation in 2009 with commercially available trading software whose code he modified “so he could rapidly place and cancel orders automatically”.[11]”
So, there are/were guardrails in place, but will they work against AI? Maybe more importantly, will a neutered SEC, CFTC, DOJ, and other agencies, both in the U.S. and abroad, be allowed to curb such activity? Trump’s administration has said no one should hinder the development of AI, and it has been the most pro-cryptocurrency administration in the world (cryptocurrency being a form of virtual speculation on something with no underlying value).
Eventually, there should be enough AIs competing against each other that they cancel each other out; some will short the market or individual stocks too, using leverage that might bankrupt a company, a sector, or even potentially the entire banking sector. See Long-Term Capital Management in 1998: https://en.wikipedia.org/wiki/Long-Term_Capital_Management. This wasn’t AI, but its sophisticated derivatives trading might have been similar to what AI could do in minutes instead of months.
Faced with AI that’s too fast and unpredictable to keep up with, and tax and trading costs that whittle advantages down to nothing (what was the tax liability of these AI systems in the test?), the smartest thing for ordinary investors to do might be the oldest: buy and hold an index of top stocks.
This was inevitable; it’s why I think there is zero chance the stock market exists in a few years unless massive changes are instituted globally, which will impact day traders the most. I think they will have to put limits on how fast you can buy or sell a stock, plus a mandatory holding period (i.e., 24 hours).