Why AlphaStar Does Not Solve Gaming’s AI Problems | Design Dive

Hi, I’m Tommy Thompson and welcome to Design Dive here on AI and Games. In this episode let’s talk AlphaStar – DeepMind’s grandmaster-level AI StarCraft 2 player. AlphaStar made headlines throughout 2019 as the competence of the system grew: first defeating the pro StarCraft players TLO and MaNa, then playing in public matchmaking on the Battle.net European servers, where it climbed into the top 0.15% of all players. The big question I hear a lot is how the games industry can capitalise on this and build its own Deep Learning AI players. But it isn’t as straightforward as that, and despite the real innovation and excitement around AlphaStar, this isn’t going to have an immediate impact on the way game AI is developed – or at least not in the way you think. In this video I’m going to explain why…

Let me stress that this video isn’t intended to speak ill of DeepMind and their work. AlphaStar is an incredible achievement that – even in academic circles – still felt like it was years away. Rather, I want to… temper people’s enthusiasm a little bit. Media sensationalism around AI often makes the real capability of these systems difficult to grasp. But the bigger issue is that the way in which AlphaStar has been built does not make it easy to adapt and translate into a game development pipeline. So let’s talk about what this all really means for the video game industry in the short term, rather than treating it as the next big innovation that will transform into Skynet and eventually kill us all. **On that note, it is legit both funny and depressing how everyone and their aunty knows what Skynet is, yet Terminator movies are bombing at the box office.**

I won’t be talking about how AlphaStar works in this video, because I did that already over in episode 48 of the main show. So if you want to get a grasp of what’s actually happening under the hood of these StarCraft AI players, go watch that video first. I do make reference to some of the points raised in that video, but hopefully it’s all easy to follow along with.

The first issue is that the games industry needs to see the benefits of adopting this approach for non-player character AI before it embraces it. This isn’t the first time machine learning has reared its head offering to fix problems for the video games industry. There was an initial exploration of machine learning back in the late 90s and early 2000s – which led to games like the original Total War and Black & White using neural networks and genetic algorithms – but to mixed success. One of the big reasons that machine learning died out was the lack of control or authority designers and programmers have over these systems once they’ve been trained to solve the task at hand. Deep Learning creates complex artificial neural networks that carry thousands, if not millions, of connections, each given a numeric weight. Training the connection weights is what gives the system its intelligence, but when you read them as a human, it’s just numbers. Lots and lots of numbers. So if you build an AI player using Deep Learning and it does something weird, you can’t crack it open and debug it. You need to isolate what in the design of the network, the learning process or the training data may have caused the erroneous behaviour, then re-train it. And if you want AI that caters to particular situations or configurations, you need to build the training process to reflect that. This isn’t remotely accessible or friendly to game designers who want control over how an AI character will behave within the game and are working with the programming team to make that a reality.

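To make the ‘just numbers’ point concrete, here’s a minimal sketch of what you’d actually find inside a trained network. The layer sizes and weights here are made up for illustration, but the shape of the problem is the same: unlabelled matrices of floating-point values, with nothing a designer can read or tweak directly.

```python
import numpy as np

# A tiny trained "policy" network: two layers of learned weights.
# Real Deep Learning agents hold millions of values like these.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(32, 8)), rng.normal(size=8)  # layer 1 weights/biases
W2, b2 = rng.normal(size=(8, 4)), rng.normal(size=4)   # layer 2 weights/biases

def policy(observation: np.ndarray) -> int:
    """Map a 32-value game-state observation to one of 4 actions."""
    hidden = np.maximum(0.0, observation @ W1 + b1)  # ReLU hidden layer
    return int(np.argmax(hidden @ W2 + b2))          # highest-scoring action wins

# If the agent "does something weird", this is all there is to inspect:
print(W1[:2])  # rows of unlabelled floating-point numbers
```

There’s no line in there that says ‘retreat when health is low’ for a designer to find and edit; the behaviour lives entirely in the weights.
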
If you consider episode 47 of AI and Games, where I looked at Halo Wars 2, that whole system is built in a modular, data-driven fashion to give designers a huge amount of control. Right now Deep Learning technologies do not cater to that level of interaction and oversight for a designer to work with. It’s why behaviour trees are so pervasive in game AI: they’re arguably the most accessible tool for both designers and developers, allowing each team to focus on its specialism without stepping on the toes of others.

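For contrast, here’s a minimal behaviour tree sketch – the node types and NPC behaviours are hypothetical, not any particular engine’s API – showing why the technique is so designer-friendly: every decision is a named, inspectable node that can be reordered or retuned without touching any learning maths.

```python
# Minimal behaviour tree: a selector succeeds on its first successful child,
# a sequence succeeds only if every child succeeds.
class Selector:
    def __init__(self, *children): self.children = children
    def tick(self, npc): return any(c.tick(npc) for c in self.children)

class Sequence:
    def __init__(self, *children): self.children = children
    def tick(self, npc): return all(c.tick(npc) for c in self.children)

class Condition:
    def __init__(self, test): self.test = test
    def tick(self, npc): return self.test(npc)

class Action:
    def __init__(self, name): self.name = name
    def tick(self, npc):
        print(f"{npc['name']} -> {self.name}")
        return True

# A designer can read, reorder and retune this tree directly:
combat_tree = Selector(
    Sequence(Condition(lambda n: n["health"] < 30), Action("retreat_to_cover")),
    Sequence(Condition(lambda n: n["enemy_visible"]), Action("attack")),
    Action("patrol"),
)

combat_tree.tick({"name": "grunt", "health": 20, "enemy_visible": True})
# prints: grunt -> retreat_to_cover
```

Compare that to the weight matrices above: here, ‘retreat when health is low’ is a single legible line.
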
This isn’t to say machine learning isn’t going to have an impact within the industry itself, but more specifically I don’t see it being used pervasively for in-game behaviours. Sure, we’ve seen the likes of Forza Horizon and MotoGP adopt it for their opposing racers, but those are very bespoke situations where the problem space suits the technique rather nicely. The industry is still evolving and adapting to this new surge in machine learning, and while big publishers are investing in their own AI R&D teams, that isn’t reflected even across AAA studios. Over time we’re going to see Deep Learning used more and more in games, but not in the ways you might think, and I’d argue rarely for in-game character behaviour.

The second issue is that – irrespective of the technology’s capabilities – the requirements for training AlphaStar don’t allow it to be easily replicated for games in active development. As mentioned in my other video, AlphaStar’s first phase of learning is achieved by watching and then mimicking behaviours from match replays of human players.

So this is a chicken-and-egg problem: if you want to train super-intelligent AI in your game, you need existing examples of high-level play that it can replicate through supervised learning. If you want that training data, then you either need expert players playing the game before release, or you build a separate AI player to bootstrap the machine learning player by creating examples for it to learn from – and that kinda defeats the point. AlphaStar benefits heavily from the ecosystem that StarCraft exists within. The game has been out for nearly a decade and is relatively bug-free, it’s been a popular eSports title for several years, plus Blizzard’s cult of personality helps maintain an active and lively fanbase around their products. This means lots of data already exists for AlphaStar to work with.

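As a rough illustration, that supervised bootstrap phase – often called behavioural cloning – boils down to training a policy to predict the action a human took in each recorded game state. The replay format and tiny linear policy below are invented for brevity; AlphaStar’s actual pipeline is vastly more elaborate.

```python
import numpy as np

# Hypothetical replay data: game-state observations paired with the
# action (0-3) a human player took in that state.
rng = np.random.default_rng(1)
states = rng.normal(size=(10_000, 32))     # 10k recorded game states
actions = rng.integers(0, 4, size=10_000)  # the human's choice in each

W = np.zeros((32, 4))  # a linear policy, for brevity

for epoch in range(5):
    for s, a in zip(states, actions):
        logits = s @ W
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()
        W -= 0.01 * np.outer(s, probs - np.eye(4)[a])  # cross-entropy step
```

The catch is the data itself: a game still in active development has no years of high-level replays to feed into a loop like this.
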
Now, all that said, AlphaStar is still quite a fickle system. The two versions of the AI player were built against two specific versions of StarCraft 2 – version 1 running on 4.6.2 and version 2 on 4.9.2 of the game. The unspoken problem here is that any change to the game’s design that influences the multiplayer meta in any significant way will break AlphaStar. The reinforcement learning trains the bots against the current meta, which means the system can’t simply adapt to the changes brought on by a patch: you need to retrain it. Even the human expert play it was bootstrapped against might no longer prove applicable in this context. I can’t say with any certainty, but there’s a small chance that already, as of version 4.10 of StarCraft 2, AlphaStar might not be able to play as well as it once did.

The third and most critical element that prevents AlphaStar being adopted en masse is cost. Training the AlphaStar agents is an incredibly expensive process: you need dedicated processing systems for the training to run in a large, distributed, heterogeneous fashion. DeepMind utilised Google’s own cloud infrastructure to achieve this, with the training executed on Cloud Tensor Processing Units, or TPUs. These are custom-developed application-specific integrated circuits, or ASICs, designed to accelerate machine learning training and inference.

The more recent version of AlphaStar from November 2019 trained on 384 TPU v3 accelerators for a period of 44 days. If you consider Google’s public pricing model for these TPUs, which runs at around $8 an hour for a single TPU, then even a naive estimate of the cost amounts to $3,072 per hour, $73,728 a day and $3,244,032 in total. Though I’m sure DeepMind got a heavy discount.

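For anyone who wants to check that arithmetic, the naive estimate is just the quoted figures multiplied together:

```python
# Naive training-cost estimate from the figures quoted above.
tpus = 384   # TPU v3 accelerators
rate = 8     # approx. USD per TPU-hour, per Google's public pricing
days = 44    # length of the training run

per_hour = tpus * rate   # $3,072
per_day = per_hour * 24  # $73,728
total = per_day * days   # $3,244,032

print(f"${per_hour:,}/hour, ${per_day:,}/day, ${total:,} total")
```
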
Now, you might think this isn’t a big deal when some AAA productions have budgets in the tens, if not hundreds, of millions of dollars, but roughly $3.25 million just to train your AI is a ridiculous amount of money. Sure, publishers like EA, Take-Two, Ubisoft or Activision might have that kind of cash available, but this is only the cost of running the training: not the staff, the infrastructure, the development time and all the other critical parts of game development. Bear in mind this is but one tiny part of a much larger puzzle when building a game of a scale akin to StarCraft. Plus, as cool as this ridiculous expenditure is, DeepMind is actually haemorrhaging money right now, having posted losses for Alphabet (Google’s parent company) exceeding $1 billion over the last three years. This technology is not stable enough at this stage, without further investigation, for a AAA publisher to take seriously.

Perhaps even more critically, this cost excludes all but the top 2% of games studios and publishers. The training costs suggested here are bigger than most games’ entire development budgets. This technology can’t permeate the industry if it costs that much to train. And of course, if you need to train it again because your design needs force you to reconsider something – boom – that’s more money being thrown at Google to solve the problem. Alternatively, a company invests in its own Deep Learning infrastructure, or uses another provider. In any case: money, money, money.

I will stress this isn’t just an issue of unabated capitalism: the data and compute resources needed to train Deep Learning systems are not a solved problem. It’s one of the bigger challenges being addressed not just in research on AI methodologies, but by hardware companies such as Intel, which are building the next generation of compute hardware to deliver machine learning training and inference cheaper and faster than is currently possible.

Now, while I’m stressing that AlphaStar isn’t going to change gaming just yet, that’s not to say machine learning isn’t having an impact within the games industry. As I mentioned earlier, the initial enthusiasm for machine learning largely petered out by the mid-2000s, but the recent Deep Learning revolution has seen renewed interest. And this new, more concerted effort is exploring issues beyond just the creation of traditional AI players. EA’s SEED division revealed their work in 2018 training Deep Learning agents to play Battlefield, as well as exploring imitation learning from samples of human play to bootstrap AI behaviours. Meanwhile, Ubisoft’s La Forge research lab in Montreal is experimenting with machine learning for testing gameplay systems, AI assistants that help programmers commit bug-free code, motion-matching animation frameworks for character behaviours, and lip syncing for dialogue in different languages. Plus, the most obvious applications in data science are long established at this point, as analytics teams use machine learning to learn more about how people play their games and to provide insight into changes that can be made going forward. I mean, let’s look on the bright side: I’m going to have plenty more to talk about on this channel in the coming years!

Thanks for watching this episode of Design Dive. I figured it was worth giving my two cents on why we shouldn’t expect Deep Learning to invade all of game AI just yet. I hope you found it interesting! If you’ve got questions, comments, or just flat-out disagree with me, slap that down in the comments and once I’ve had enough to drink I’ll go take a look! Don’t forget you can help support my work by joining the AI and Games Patreon or by becoming a YouTube member – just like Scott Reynolds, Ricardo Monteiro and Viktor Viktorov have done right here, plus all the other lovely folk you see in the credits. Take care folks, I’ll be back.

39 thoughts on “Why AlphaStar Does Not Solve Gaming’s AI Problems | Design Dive”

  1. It's time to put the AlphaStar chat to rest. With this Design Dive episode I'm giving my 2 cents on the practical applications (and otherwise) of Deep Learning in games right now. Currently got some fun topics lined up for later in the spring. But first, I've got a big deadline to hit by the end of the month.

    Don't worry, you'll know what I'm talking about when it hits.

  2. Your first point is moot to me. You either want something to learn, which requires you to teach it, or you go back to the dark age and program it yourself.

  3. I think it is also really important to differentiate between fun AI and hard AI. AlphaStar is great for pro players to have an opponent they can still learn from, but for average gameplay you don't want the AI to be hard, you want the AI to create a fun gameplay experience.
    For competitive RTS games it is good to have an AI that plays like an experienced player, so new players can learn from it by watching/playing it, but it is not necessary to use machine learning; for example, Age of Empires 2: Definitive Edition upgraded the old AoE2 AI to use actual meta strategies and micro.

  4. Your first point is moot to me. You either want something to learn, which requires you to teach it, or you go back to the dark age and program it yourself. If the industry isn't up to the task, it's not a failure of the system when it's capable of learning, but we simply can't be bothered. Blaming the young honor student for not meeting the needs and wishes of the parent. It's not like AI is getting any better under other technologies. This is the future. Either we grasp it, or we pretend we never cared.

  5. Y'know, I kept hearing AlphaStar learned from watching tens of thousands of games, but it never really struck me what that meant.

    AlphaStar didn't learn to play the game from the ground up. It probably doesn't have any implicit understanding of what it's doing, because it's just a tens-of-millions-of-dollars exercise in "monkey see, monkey do" (or AI sees what the monkeys are doing, AI does).

    Further evidenced by the fact AlphaStar cannot adjust itself to new gameplay when Blizzard makes a change to the units and the game's meta changes.

    Now, don't get me wrong. AlphaStar has done things in game that baffle the players; it's part of the reason it's able to win. AlphaStar has done seemingly novel things in game, assuming it has watched tens of thousands of pro player games, or at least GM-level play… especially with the economic mechanics of the game. But those actions could have just been iterative, not necessarily novel. Meaning, AlphaStar hadn't detected or calculated a better way to play to make mineral collection and mineral gathering more efficient (as was first thought); it may have just been iterating on a basic mechanic of "never stop building workers, no matter what", whereas a pro player knows exactly when to cut building workers, and when to resume.

    Sorry if that was a bit detailed on the game's mechanics. If you're curious, a StarCraft pro by the name of Beasty QT has done some very high-level commentary on AlphaStar's play style.

  6. Hi Tommy, I watch your vids every day 🙂 I'd love to know your thoughts on a "nav mesh in the sky". I'm trying to code flying AI right now and wondering how to get something at least slightly as good as a nav mesh, but in the sky… or any other way to make a simple flying AI that avoids walls etc. I guess I could just raycast and change direction etc. I even thought you could put cubes up there, bake navigation, then turn them off, but then it wouldn't really work in the true Y axis etc. Thanks for all the great vids; my AI of about 8 months now was built originally from your tuts.

  7. So many videos! Hey Tommy, while I've been very interested in AI from a more abstract and theoretical designer perspective for a long time, I'm only now getting into the programming, and feel quite lost… I've got a partial prototype of an enemy that uses pretty much all the different factors I think I want in a full package, and was wondering if you could give me a lead on what kind of AI programming system I should invest my time focusing on in my learning process.
    (It's gonna be simple and pure top-down 2D with zero verticality simulation)

    ''Tries to keep line of sight on as much of the area spanning x distance in any direction the player could move, even takes priority over direct line of sight on player if area is big enough.
    Has less heavily weighed preference to stay close to a certain sweet spot distance away from the player.''

    Basically, I'm describing three conflicting goals it has that it tries to balance to get the most overall value; each goal has its own value to the unit.
    Direct line of sight is an absolute goal, but ''staying close to the sweet spot'' isn't, and that thing about keeping line of sight on x distance over any direction the player could move… Where do I even start with that one?

    Anyway, if you, or anyone else, has any good sources or advice on how to achieve these things, I'd be VERY appreciative.

  8. Great video! I love the realistic view of the situation. ML is amazing but not widely applicable yet

    I also didn’t know Black & White used neural networks! Time to learn more

  9. The difference between good gameplay AI and AI that wins games is worlds apart.
    Games like Half-Life and F.E.A.R. nailed FPS AI 20 years ago, but the industry frequently fails to learn from past glories or failures.

  10. On the topic of adapting to patches: I found that interesting, because your notes describe exactly what humans deal with. They use old-patch builds and ideas initially and adjust as those builds are proven to be good or bad. I think the difference is humans theorycraft the moment patch notes show up, trying to work out how changes will impact their build and then testing those theories, something AlphaStar wouldn't do. So humans should (big should, because oftentimes the theories we come up with don't match the actual game in testing) have a leg up in adjusting to changes. I thought that topic, and how it compares to the way humans adjust to change, was interesting.

  11. The shocking revelation of AlphaStar's capabilities is not that it will solve AI problems in game development. It's that we now have AI that beats players at fast-paced, multitasking strategy games.
    It's a milestone, way more impressive and frightening than chess AIs beating chess masters.
    It's another proof that AI can replace any human output whether physical or mental.
    Reeducation of workers is a big topic in order to deal with automation and prevent people from becoming unemployed.
    There is no guarantee that by the time their reeducation is complete there won't be an AI that can take over the job they just learned.

  12. If I was going to use Deep Learning in a video game as the default AI I wouldn't train it in a lab…
    I'd use the game's own single-player mode as the training ground; every player connected to the internet would be training the AI whether they realised it or not. (insert evil villain laugh)

  13. The last Terminator movie didn't do well at the box office because people got pissy and totally forgot that it largely rehashes the original movie with a fresh modern take and instead thought it was a heretical aberration, the same way they did with Ghostbusters. It's less that Terminator was bad and more that "female led movies" are being assailed by whiny fanboys.

  14. Why kill them all, when you can beat them all at every video game they come up with and make them feel inferior for all of eternity

  15. AlphaStar also ran 80,000 simultaneous instances of SC2 during training. So they had at LEAST 80,000 CPUs running on top of those TPUs.

  16. If it weren't for the prohibitive cost of building and training the AI, I would say that an AI like AlphaStar could be a very effective tool for balancing gameplay that is generally difficult to quantify. Essentially, by taking out factors like player skill or community bias (one of the characters/factions being favored over the others for non-balance reasons) and repeatedly testing, you can get a substantial amount of high-quality game balance information, particularly if you're already set up to record useful data from the matches (how much of a unit gets made, how many times an ability gets used, things like that). At the moment though, Blizzard is already very effective at gathering data from online matches, so unless the process becomes much cheaper the industry will likely stay the course.

  17. It may be possible though to use a subset of GANs (Generative Adversarial Networks). But for now, there are no out-of-the-box solutions, and you weren't wrong in stressing how important cost effectiveness is. Even if EA or Activision have millions to put into this kind of R&D, they won't do it because they have no guarantee on the return on investment. They are more focused on designing new manipulative microtransaction systems. But you are also right in saying that ML is mostly used, for now, in graphics and sound. Nvidia showed promising results back in 2016-2017 with lip syncing and on-the-fly animation, and with audio synthesis. I will be the happiest person when I'm able to hear an NPC calling me by my character's name. I also witnessed in research labs astonishing tech used to apply and deform textures according to the deformation of the 3D mesh.

    Damn, that's a long post. All of that to say that video games and computer graphics have good days ahead of them.

    And again nice vid 🙂

  18. Would training A.I. be cheaper if they used GPUs vs TPUs? I know TPUs are better as far as results go, but could the price of GPUs be a factor for using them instead? Also, when are tier lists in games going to be determined by A.I.? I'd rather know which character an A.I. chooses most often to beat a game than the opinion of pros and hobbyists. Objective tier lists (which some websites replicate by showing which characters winning teams have most of the time, like in TFT) would have a strong use for all game designers. The only way to balance a game is to know which characters in the game have an advantage.

  19. I think you forgot an aspect of AI:
    a perfect AI will immediately abuse game design flaws and expose weak and overpowered concepts – lowering the game's enjoyability

  20. I see the potential for this in sports games like NBA 2K. Sports games have always suffered from the AI opponents being stuck in specific play styles and patterns. If you have a team with dynamic wings who can cut to the basket off of screens, or have a team with a dominant post player surrounded by 3 point shooters who can hit open 3's when the post player gets double teamed, it does not matter. The AI will always play the same way. You can tune the sliders in game to try to make the AI run more plays for your post player, and nothing changes. Kareem Abdul Jabbar will never be the highest scorer on the AI's team but some small forward or shooting guard will, no matter how much better your center is. Every team and every game plays out the same way for the AI with very little diversity.

    End-game scenarios in a basketball game are dynamic. How much time is left on the clock, how many timeouts the AI coach has left, how far ahead or behind the AI team is on the scoreboard; all of those things come into focus when you need to make a decision about taking a 3 point shot or a 2 point shot, or about late-game fouling. Right now in-game AI just has a pattern and doesn't even realize that with 8 seconds left in the game it needs to take a 3 point shot to at least tie the game.

    I see basketball games, and other sports games, benefiting tremendously from an AI that can actually change tactics depending on the situation it finds itself in. I would love to play a basketball game where the best player gets the most chances to score, and not just the 2 wing players taking the majority of the shots. I would love to have to change my defensive strategies every game to compensate, and even in-game with the AI adapting to my adjustments. The potential for AI in sports games cannot be overstated. 2K looks nice and has very fluid movements. You can do nearly anything in the game that real players can, and the physics engine replicates player movements really well. But strategically 2K and other sports games are still a mess.

  21. So, AlphaStar's skill is reset to 0 when you change the game? Is it impossible to transfer the raw strategy data from one game and see if it can apply it to others? Because it seems to me that the core elements of RTS are relatively similar: resource management, micro- and macro-controlling units, map domination. Can AlphaStar play Age of Empires, for example?

  22. A tiny point, but all the problems with needing examples of high level play evaporate once they develop AlphaStar Zero. Self-play has been shown to rapidly and efficiently reach super-human levels of play without any external input.

  23. Very nice video, I agree with all the arguments on why deep learning is not ready yet to be used in mainstream game development. But then, I also think the video kind of missed the point at the beginning by considering that the end goal of algorithms like AlphaStar is to be used as AI in games. The reason why DeepMind and many other companies and laboratories worldwide use games in their research is that they are easy to understand, well-known, and well-defined environments, but complex enough to test new algorithms and have a way to benchmark the results against both other machine approaches and humans.

    Beating games is actually just a milestone, much like Deep Blue beating Kasparov in chess or AlphaGo beating Lee Sedol in go. This technology can later be used for other real-life problems, such as protein structure prediction (https://deepmind.com/blog/article/AlphaFold-Using-AI-for-scientific-discovery).

    Of course, that's not to say that ML has no impact in games development, or that the points in the video are not important for clarifying a misconception that many gamers might have just from reading the news. Still, I believe it is important to make it clear that DeepMind and other institutions are not just trying to provide new tools for game development.

  24. So with some luck, by 2030 this technology would be a lot cheaper to implement* and maybe you could use it for some Game AIs… though by then technology will probably have moved on and perhaps far more useful approaches will exist.

    * Due to making the process more efficient and substantially improved hardware capability.

  25. Lol, I am sort of satisfied that if a human discovers a robot they will identify it as such and then proceed to beat it up. A common human-robot interaction, surprisingly.

  26. OpenAI trained the original Dota 2 AI in a couple of hours on a single box. It all depends on the size of the grid and the frequency of actions. If you restricted AlphaStar to a single small map the training cost would drop 100-fold. These days you can train ResNet from scratch on a GPU in less than 10 minutes if you utilize the modern hacks to reduce the training time (reduced precision, learning rate optimization). If DeepMind were focused on reducing the cost of training they could easily get a 100-times reduction, but it is not a focus. As you point out, their real cost is salaries, amounting to a billion over 3 years, so reducing training time isn't on the radar: they would spend a million to save a million.

  27. AlphaStar has other limitations that make its achievements not so impressive:
    – An AlphaStar agent only knows how to execute one strategy. Learn to beat it, you beat that agent every time. Once Mana figured out how to beat it, if they went best of 20 he would have won 10-0. Instead they concluded the event saying "oh the human player managed to beat it once". Pure PR.

    – The only way "AlphaStar" can beat a good player several times in a row is to throw different agents at him randomly so he can't predict what he will be playing against. This randomness is not part of the AI nor is it a sign of intelligence.

    Coupled with the aforementioned limitations of being unable to adapt in any way, and it's easy to see how AlphaStar is still far from demonstrating human-like intelligence in Starcraft 2. Or even being an interesting training partner for pro players.

    The point of AI companies like DeepMind is to generate hype so investors throw money at them. Keep that in mind.

  28. I think a really good use for this AI in game development is as an artificial playtester, because it can quickly find exploits and cheese strats that you want to patch up.
