Once the last of 120,000 hands of Heads-up, No-Limit Texas Hold’em was played on Jan. 30, Libratus led the pros by a collective $1,766,250 in chips. Its developers, computer science professor Tuomas Sandholm and computer science Ph.D. student Noam Brown, said the sizable victory is statistically significant and not simply a matter of luck.
“The best AI’s ability to do strategic reasoning with imperfect information has now surpassed that of the best humans,” Sandholm said.
This new milestone in artificial intelligence has implications for any realm in which information is incomplete and opponents sow misinformation, said Frank Pfenning, head of the Computer Science Department in CMU’s School of Computer Science. Business negotiation, military strategy, cybersecurity and medical treatment planning could all benefit from automated decision-making using a Libratus-like AI.
“The computer can’t win at poker if it can’t bluff,” Pfenning said. “Developing an AI that can do that successfully is a tremendous step forward scientifically and has numerous applications. Imagine that your smartphone will someday be able to negotiate the best price on a new car for you. That’s just the beginning.”
The pros — Dong Kim, Jimmy Chou, Daniel McAulay and Jason Les — will split a $200,000 prize purse based on their respective performances during the event.
McAulay, of Scotland, said Libratus was a tougher opponent than he expected, but it was exciting to play against it.
“Whenever you play a top player at poker, you learn from it,” McAulay said.
Les, of Costa Mesa, Calif., agreed that superior opponents help poker players improve.
“Usually, you have to lose a lot and pay a lot of money for the experience,” he said. “Here, at least I’m not losing any money.”
“This experiment demanded that we assemble some of the world’s best professional poker players who specialize in Heads-up No-Limit Texas Hold’em and that they would play to the best of their abilities throughout the long contest,” Brown said. “These players more than met that description and proved to be a tenacious team of opponents for Libratus, studying and strategizing together throughout the event.”
Libratus’ victory was made possible by the Pittsburgh Supercomputing Center’s Bridges computer, on which the AI computed its strategy before and during the event, and by Rivers Casino, which hosted the event.
“Rivers Casino was proud to partner with Carnegie Mellon University and the Pittsburgh Supercomputing Center to host the Brains Vs. Artificial Intelligence: Upping the Ante competition,” said Craig Clark, general manager of Rivers Casino. “History-making events like this are very important as they increase awareness of how companies in Pittsburgh are impacting the world.”
The event was surrounded by speculation about how Libratus was able to improve day to day during the competition. It turns out it was the pros themselves who taught Libratus about its weaknesses.
“After play ended each day, a meta-algorithm analyzed what holes the pros had identified and exploited in Libratus’ strategy,” Sandholm said. “It then prioritized the holes and algorithmically patched the top three using the supercomputer each night. This is very different than how learning has been used in the past in poker. Typically researchers develop algorithms that try to exploit the opponent’s weaknesses. In contrast, here the daily improvement is about algorithmically fixing holes in our own strategy.”
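The nightly prioritization Sandholm describes can be illustrated with a toy ranking. This is only a sketch: the article does not say how Libratus scored its holes, so the chips-lost metric, the situation labels and the `top_holes` function are all hypothetical.

```python
from collections import defaultdict

def top_holes(observations, k=3):
    """Rank candidate weaknesses by total chips lost and return the top k.

    `observations` is a list of (situation, chips_lost) pairs. Both the
    labels and the scoring rule are assumptions made for illustration.
    """
    losses = defaultdict(float)
    for situation, chips_lost in observations:
        losses[situation] += chips_lost
    # Largest total loss first
    ranked = sorted(losses.items(), key=lambda item: item[1], reverse=True)
    return [situation for situation, _ in ranked[:k]]

# Made-up log of one day's play
day_log = [
    ("open to 2.5x", 400.0),
    ("flop bet 1/3 pot", 1200.0),
    ("open to 2.5x", 900.0),
    ("river overbet", 700.0),
    ("turn check-raise", 300.0),
    ("flop bet 1/3 pot", 200.0),
]

patch_tonight = top_holes(day_log)
print(patch_tonight)
```

The point of the sketch is the direction of the analysis: the ranking looks at where the AI's own strategy leaked chips, not at weaknesses in the opponents' play.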
Sandholm also said that Libratus’ endgame strategy, which was computed live with the Bridges computer for each hand, was a major advance.
“The endgame solver has a perfect analysis of the cards,” he said.
It was able to update its strategy for each hand in a way that ensured any late changes would only improve the strategy. Over the course of the competition, the pros responded by making more aggressive moves early in the hand, no doubt to avoid playing in the deep waters of the endgame where the AI had an advantage, he added.
Sandholm will be sharing all of Libratus’ secrets now that the competition is over, beginning with invited talks at the Association for the Advancement of Artificial Intelligence meeting Feb. 4-9 in San Francisco and in submissions to peer-reviewed scientific conferences and journals.
Throughout the competition, Libratus recruited the raw power of approximately 600 of Bridges’ 846 compute nodes. Bridges’ total speed is 1.35 petaflops, about 7,250 times as fast as a high-end laptop, and its memory is 274 terabytes, about 17,500 times as much as you’d get in that laptop. This computing power gave Libratus the ability to play four of the best Texas Hold’em players in the world at once and beat them.
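As a rough check on those comparisons, the implied "high-end laptop" can be worked backward from the quoted figures. The sketch below assumes decimal units (1 petaflop = 10^15 operations per second, 1 terabyte = 10^12 bytes); the variable names are illustrative only.

```python
# Figures quoted in the article
BRIDGES_FLOPS = 1.35e15         # 1.35 petaflops
SPEED_RATIO = 7_250             # Bridges vs. a high-end laptop
BRIDGES_MEMORY_BYTES = 274e12   # 274 terabytes (decimal units assumed)
MEMORY_RATIO = 17_500
NODES_USED, NODES_TOTAL = 600, 846

# Implied laptop specifications
laptop_flops = BRIDGES_FLOPS / SPEED_RATIO
laptop_memory_bytes = BRIDGES_MEMORY_BYTES / MEMORY_RATIO

# Share of the machine's nodes that Libratus used
node_fraction = NODES_USED / NODES_TOTAL

print(f"laptop speed:  {laptop_flops / 1e9:.0f} gigaflops")
print(f"laptop memory: {laptop_memory_bytes / 1e9:.1f} GB")
print(f"nodes used:    {node_fraction:.0%}")
```

Under these assumptions the reference laptop comes out at roughly 186 gigaflops with about 16 GB of memory, and Libratus used about 71 percent of Bridges’ nodes.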
Libratus has three main parts. The first has not changed much since 2015, when Sandholm’s team first entered its AI in a similar tournament against professional players (that time, the humans won). This part computed a big list of strategies the AI could use when play began. At the outset of the tournament, Libratus had spent the equivalent of 15 million hours of computation honing its strategies.
The second part, now completely redesigned by Sandholm and his Ph.D. student Noam Brown, worked to improve Libratus’ strategy with each hand. Called the “endgame solver,” it took into account “mistakes” the AI’s opponents made (instances where they left themselves open to exploitation) to predict the result of each hand. The team couldn’t tell from statistical analysis if the earlier version of the endgame solver improved the AI’s play at all, says Sandholm. “But this new one is just awesome.”
The final part of the AI looked for its own strategic weaknesses so it could change how it played before the next session. This part sought to identify things its opponents were exploiting, such as a giveaway “tell” that another player had noticed.
Heads-Up No-Limit Texas Hold’em is an exceedingly complex game, with 10^160 (the number 1 followed by 160 zeroes) information sets, each characterized by the path of play in the hand as perceived by the player whose turn it is. That’s vastly more information sets than the number of atoms in the universe.
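To put that comparison in numbers, the short sketch below divides the quoted count of information sets by an atom count. The figure of 10^80 atoms is an assumption, a commonly cited order-of-magnitude estimate for the observable universe that does not appear in the article.

```python
INFORMATION_SETS = 10**160   # figure quoted in the article
ATOMS_IN_UNIVERSE = 10**80   # assumption: common order-of-magnitude estimate

# Python integers are arbitrary precision, so this division is exact
ratio = INFORMATION_SETS // ATOMS_IN_UNIVERSE

# Count the zeroes after the leading 1 to get the power of ten
orders_of_magnitude = len(str(ratio)) - 1

print(f"information sets per atom: 10^{orders_of_magnitude}")
```

Under that assumption, the game has about as many information sets per atom as there are atoms in the universe.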
The AI must make decisions without knowing all of the cards in play, while trying to sniff out bluffing by its opponent. As “no-limit” suggests, players may bet or raise any amount up to all of their chips.
SOURCES - Carnegie Mellon, YouTube