Monday, November 03, 2008

Mü on the Zune!

Dear reader,

I'm afraid that I'm going to be a bit of a tease. You see, what I am about to describe is something that you may never be able to see in person. Read on if you can stand it.

As a computer programmer who also happens to own a Zune, I was pretty excited to learn that Microsoft has just released XNA Game Studio v3.0 which contains support for writing games on the Zune. Naturally I couldn't leave that alone. I just couldn't resist the challenge of implementing one of my favorite card games for the device. What I'm going to describe here is my experience implementing Mü for the Zune. Unfortunately for all of you, Mü is a copyrighted game and my implementation makes liberal use of the original copyrighted artwork and rules, so unless by some generous miracle I am granted license to distribute this little gem (Mr. Nestel? Are you reading?), you'll probably never be able to try it. Sorry. But what I can do is tease you with screen shots, discuss what I've done and talk about how it works.

Let's start with what you'd see if you could start the game. This is the loading screen:


After all of the art assets are loaded you'll find yourself at the main menu:

The How to Play section contains abbreviated rules to the game. The Controls section describes how to interact with this particular implementation. The About Mu section describes what a naughty boy I've been by borrowing Doris Matthäus' lovely artwork and Frank Nestel's lovely game rules without permission. And the Options section lets you configure difficulty settings, AI bidding styles, and so on.

When starting a game, you're given a choice between 4, 5 and 6 players. If you have a game already in progress, you're also given the choice of continuing your game. (Progress is automatically saved after every hand so there's no need to explicitly save your game.)

When the game begins, the cards are shuffled and dealt to each of the players. (Which looks very cool.) Your hand is shown at the bottom of the screen. Then the bidding round begins. After a few rounds of bidding, the screen might look like this:

Note that you're told what the current high bid is, how many points need to be taken to meet the goal, who is currently aligned to be chief and who would be vice. As your opponents bid, their cards appear around the edges of the screen. As you bid, your cards are pushed forward so you can tell which cards you've revealed to the other players. The spotlight tells you whose turn it is (in this case, it's the player to your right). The weird looking line in the middle of the spotlight is a sweeping hand that spins around once per second whenever the computer is thinking. It looks pretty cool in motion.

Eventually the vice and chief select trump and the trick begins. As cards are played the screen looks like this:

After the trick ends, you're given a few seconds to make a mental note of the cards that have been played (you can button mash through it if you like) then the next trick starts. When the hand ends, you'll be shown a screen that looks like this:

The unboxed numbers are the points taken during play. The numbers in the boxes are the current scores, which include points taken this hand and any bonus points or penalties awarded.

All in all, I'm very pleased with the user interface. It's very easy to use and functional. The only downside is that there is no "undo" feature so if you accidentally play the wrong card you just have to live with it so you learn to be careful and think before you play (not a bad thing in my opinion).

When I mentioned to my friend Adam (also a programmer) that I was thinking of embarking on this project he was justifiably skeptical. Bidding card games are notoriously difficult for a computer to play well even under the best of circumstances and the Zune is hardly a processing powerhouse. We were both concerned that the challenge would prove to be too much for such a wimpy computer processor. I also had concerns that it might be impossible to get all of that information crammed onto such a tiny screen (320 x 240 pixels) but, as you can see from the screenshots, I managed to pull that off. But that still left the bigger question: would it play well? I'm happy to say that it does! It plays surprisingly well for such a wimpy computer! To the point that I am now often losing games to the computer. Will it take the place of my human opponents? Naw. I'd still rather play against real people. But it's pretty cool to be able to play a game of Mü whenever I like.

Implementation Details and Lessons Learned

If you're still reading, I'm impressed! If you are interested in learning how the game works and what I've learned then read on. Those of you who aren't interested in the computer programming guts behind my implementation are free to go now. It gets a little technical from this point on.

The heart of most game playing artificial intelligences is something called a decision tree. Basically, each point in the game is represented by a node, with every legal play branching out from that node to other nodes. The result is something that looks like a pyramid as the tree quickly branches out from the current node to the set of nodes n levels (or turns) forward in time. The computer analyzes each node in the decision tree and tries to determine which of the branches will lead to the most advantageous outcome. This approach is often used with two player perfect information games such as chess, checkers, tic-tac-toe, and so on. It's particularly effective in those cases because the computer knows exactly what each player's options are and so it knows that if it makes move X, its opponent will be required to make either moves A, B or C. Unfortunately for card games this isn't possible (without cheating) because you don't know exactly which cards your opponent holds. One possible solution (and the one which I've chosen) is to use what's called a Monte Carlo algorithm. Basically, what that means is that you start with the cards you know (the cards in your hand, the cards the other players have bid, and the cards that have already been played and are therefore out of play) and then you randomly assign the cards you don't know. You create a number of sample hands and then play each of those sample hands as if all the cards were laid face up on the table. Then you can choose whatever move comes up "best" most of the time.

One other problem with using a decision tree is that the farther ahead you look, the more nodes you have to analyze. Looking just one more play ahead could result in having to analyze five or six times as many moves. Now consider that with a trick taking game, it makes no sense to look ahead just a single card play, you really have to analyze all the way to the end of the trick before you can make a determination of who has won the trick. Looking ahead just one trick can require that the computer analyze thousands of potential plays. Looking ahead two tricks can sometimes require that you analyze several million. I suspect you see the problem.

Some of these nodes are automatically ignored by using an alpha-beta pruning algorithm (look it up on Wikipedia if you don't know what that is) but on average the AI still examines around 500,000 nodes each play, and early in the hand there are times when it analyzes millions. Since XNA programs are written in C#, this posed its own set of challenges. C# is not the most efficient language and it tends to allocate memory when you least expect it. I had to carefully write my analysis routine to avoid memory allocation as much as possible. I also considered using a dictionary to try to prune the moves even farther but after giving it some thought I discounted that as impractical on the Zune, especially since it's not nearly as likely to be useful as it would be with a game like chess or checkers. I figured that speed lost to complexity would be likely to outweigh any speed gained by scoring the occasional dictionary hit.

My game on the normal setting looks ahead two full tricks during card play. This isn't ideal and it occasionally will result in sub-optimal play but I find it's a pretty good compromise between speed and difficulty. Furthermore, since there is some inherent uncertainty in where the other cards are, I find that most of the time you won't notice the difference between looking two tricks ahead and three.

During play, I also compute only a half-dozen Monte Carlo samples. At first blush, you might think this wouldn't be enough, and for the most part you'd be right. I originally thought so as well. But if you think about it a little more you might realize that much of the time, the winner of a trick is decided by which player holds a single unplayed card. A half-dozen random samples is enough to cover the cases where that card is in each of the other players' hands. Furthermore, I have to admit that I have a little bit of an ace up my sleeve here. The computer sort of cheats. It's not a full blown cheat; it's just an itty bitty cheat. When I compute a Monte Carlo sample, what I do is start with the correct set of hands. Then I randomly choose two unknown cards from two different hands and swap them and I repeat that a set number of times. The result is that I have a hand where most of the unknown cards are scrambled but possibly not all of them. This means that the information isn't quite as random as it might be and that gives the computer just a tiny edge when guessing where the other cards are. But let's not make more of this than it is. For all intents and purposes, any benefit that could be gained this way is so small as to be nearly insignificant.

One other thing that helps the card playing algorithm is how a trick is scored. I had originally thought I would just score the pip values of the cards taken and leave it at that. The problem is that it means that 9s are valued just as poorly as 1s and so if the computer can't see any long-term benefit to keeping the 9 (read: if it isn't likely to be needed on the very next trick) it's likely to toss it off at the next opportunity. The solution is to use a heuristic scoring that puts a premium value on taking tricks, and only a secondary value on taking pips. Furthermore, the heuristic weighs 9s just as strongly as the other one pip cards. This encourages the AI to hold on to 9s even though they aren't worth as many pips because they're so likely to take tricks.

That got me most of the way toward solid card play but there was one more thing that had to be taken into account: Mü's complex overlapping multiple trump suits. Early on I encountered a bug where the algorithm was sometimes throwing off 1s in the upper trump in favor of 1s in the lower trump: clearly a poor choice! The problem was this: the algorithm works by evaluating each player's moves in turn. Each move is given a score and if the score is better than any previous score, that move is considered best. Moves are evaluated through the deck suit by suit and in ascending rank. Because the suit ordering just happened to be red, yellow, green, blue, and black, red cards were being evaluated before the other suits. This meant that if a play of 1 red scored the same as 1 yellow, the 1 red would be chosen, even if red were the upper trump and yellow the lower trump. In a game with only one trump suit this wouldn't matter but in this game it can be very bad. The solution was to change the algorithm so that it first evaluates off-suits, then lower trump, then upper trump, and then übertrump.

Bidding, as I suspected, proved to be much more challenging than card play. Bidding is super important since it's not only how you control trump; it's how you reveal information to the other players in an effort to make yourself a more attractive partner (kind of like dating). I explored several potential methods for bidding, most of which build upon the Monte Carlo algorithm developed for card pay, before settling on the final one. Most of the discarded methods were discarded because they needed to analyze far too many nodes, which slowed the game to an unplayable crawl during bidding.

Bidding in my game uses a combination of expert systems and emergent behavior. This is unlike the card play which is 100% emergent behavior. Emergent behavior simplified is this: you set up the rules of the game and a few heuristics, repeat the algorithm enough times, and the correct behavior emerges. Expert systems relies on an "expert" (in this case me) describing on a case by case basis the possible circumstances and what actions to take in each of them. The benefit of an emergent system is that the computer programmer doesn't have to have expert knowledge of the problem and he doesn't have to account for every single potential situation, the computer solves the problems on its own. The downside is that computers are really bad at distinguishing between a good decision and a poor one unless they follow it all the way to its conclusion, which can be very time consuming.

The bidding system starts by generating heuristic scores for each bidder to determine what the "best" or "most likely" trump call would be. There are actually two different heuristics employed: one for the player who is currently bidding (who therefore knows every card in his hand) and one for players who are not bidding (where the only known cards are the ones that have been bid). From this information, a list of the most likely trump, chief, vice, and partner combinations is generated and a number of Monte Carlo sample hands are played to conclusion. In the interest of time, the AI only thinks ahead to the end of the current trick when playing these hands, which can lead to some bogus results but it's good enough to get a ball park estimate. Each Monte Carlo hand is played and scored and the scores are used to determine the likelihood of a given bid succeeding.

Based on current circumstances, the likelihood of a bid's success is used to determine what action the player should take: should he take vice? should he step up as chief? should he push the chief? should he reveal a card in support of the current chief's most likely bid? and so on. This is where the expert system comes in. Because there isn't time to analyze each and every possibility, the system tries to analyze each specific position on a case by case basis. This leads to some slightly complicated code but it's not too bad and the game bids surprisingly well when everything is tuned just right.

That brings up the subject of tuning. In order to get the game to play competitively, I implemented a debug mode where the computer plays against itself and sends to the debug output complete traces of hands, card plays, and outcomes. By playing hundreds of these games over and over again I can get a feel for how well the game is playing. How high are the bids going? Are they reasonable given the hands? How often does the chief make his bid? How well is the chief doing selecting a partner? The answers to these questions were used to tune the system so that it plays as strong a game as possible.

Summing Up

All in all, this has been an extremely satisfying project. I'm extremely pleased with the outcome. The game looks and plays great, particularly given the device it's playing on. Because some corners needed to be cut, the AI isn't perfect, but it's still strong enough to beat me much of the time, and given how often I've played Mü, I'd like to think that's saying something. My only regret is that, because it's based on a copyrighted card game, I can't share it with all of you. Perhaps at some point that will be possible. I realize that most of you don't have a Zune to play it on anyway but I could just as easily release a Windows version of the game if I had the license rights. Oh well. It'll probably never happen. Still, never say never.

(By the way, I should mention that I am open to exploring employment opportunities. If you think you might be interested in hiring a programmer of my ability, feel free to send me email at steve@housefullofgames.com to request a copy of my resume.)