The To-hit mechanic II
So last time I talked a bit about the four different models you can adopt for a “to-hit” mechanic. This time I’ll focus a bit on the ones involving randomness.
Let’s have a look at my d20 rolls over three nights of D&D, roll by roll.
Looks pretty random, except if you look at the averages. All of them are below the expected average of 10.5, Week 1 especially. Weeks 2 and 3 were close enough to the average to be okay, but there was definitely a bad moon on Week 1.
Here are the frequency counts for the three weeks:
The frequency counts back up our feelings about the different weeks. You do get a feel for the weight of the distribution for the three weeks:
- Week 1 is heavily weighted on 1, 3 and 6.
- Week 2 is mostly uniform except for the huge peak at 12.
- Week 3 is mostly uniform for 1-10, and heavily biased towards 19, with little else in 11-20.
How did it feel around the table? Well, the other guys were cursing my bad luck in Week 1, explaining how best to “make an example of the cursed d20 to the rest of them”. The exact same die produced all the data, so whether it had a bad week depends on whether you really subscribe to panpsychism or not.
On that note, which sequence of rolls is more likely: the first ten rolls of Week 1, Week 2 or Week 3? They are all equally likely. Since there are 10 rolls, you have a 1-in-20 chance of matching the first roll, 1-in-20 for the second, and so on. This gives you a probability of (1/20)^10 of getting that exact sequence of rolls¹, regardless of which rolls you had to match.
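For anyone who wants to check that figure, here is a minimal sketch in Python (my choice of language for illustration; the names `SIDES` and `ROLLS` are just labels I made up):

```python
# Chance of matching one specific sequence of ten d20 rolls.
SIDES = 20
ROLLS = 10

sequences = SIDES ** ROLLS        # number of equally likely sequences
probability = 1 / sequences       # chance of hitting any one of them

print(f"1 in {sequences:,}")      # 1 in 10,240,000,000,000
print(f"{probability:.3e}")       # 9.766e-14
```

Every ten-roll sequence shares this probability, which is why Week 1’s opening run is no rarer than any other.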
If you got tricked by this, you’re probably implicitly thinking of the likelihood of the frequencies, rather than the actual results. You can test this using Pearson’s chi-squared test, which confirms your intuition: you’d expect something like Week 1’s frequency distribution 21.75% of the time, Week 2’s about 70.36% of the time, and Week 3’s about 88.24% of the time. The problem is that humans aren’t so good at doing tests in their heads and are distracted by a whole bunch of psychological biases. For example, I was astounded at how many 1s I got in Week 1. Sure, it was five times more frequent than you’d expect, but it had the added emotional impact that 1s indicate spectacular failure in D&D. Therefore I had five times more spectacular failures than I would have expected. I wasn’t so thrown by the fact that I had even more middling failures (6 sixes when you’d expect about 1) because a 6 is much the same as a 7, and I got zero 7s, so perhaps in my mind I averaged it out.
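To make the test concrete, here is a rough sketch of Pearson’s statistic for d20 rolls. Two caveats: the roll list is invented for illustration (it is not my real Week 1 data), and instead of looking the statistic up in a chi-squared table I estimate the p-value by re-rolling a fair die many times, since with only about one expected roll per face the table approximation is shaky. The function names are my own.

```python
import random
from collections import Counter

def chi_squared_statistic(rolls, sides=20):
    """Pearson's statistic: sum over faces of (observed - expected)^2 / expected."""
    expected = len(rolls) / sides
    counts = Counter(rolls)
    return sum((counts[face] - expected) ** 2 / expected
               for face in range(1, sides + 1))

def monte_carlo_p_value(rolls, sides=20, trials=10_000, seed=0):
    """Estimate how often a fair die looks at least this uneven."""
    rng = random.Random(seed)
    observed = chi_squared_statistic(rolls, sides)
    at_least_as_uneven = 0
    for _ in range(trials):
        fair = [rng.randint(1, sides) for _ in range(len(rolls))]
        if chi_squared_statistic(fair, sides) >= observed:
            at_least_as_uneven += 1
    return at_least_as_uneven / trials

# Hypothetical night of 20 rolls, heavy on 1s, 3s and 6s (NOT the real data).
example_rolls = [1, 1, 1, 1, 1, 6, 6, 6, 6, 6, 6, 3, 3, 3, 12, 15, 9, 18, 2, 20]
print(chi_squared_statistic(example_rolls))
print(monte_carlo_p_value(example_rolls))
```

A small p-value says a fair die rarely produces counts that lopsided; a large one says the night was unremarkable, whatever it felt like at the table.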
Humans are typically terrible at thinking statistically, especially when emotions are involved. MMORPG forums are full of this kind of discussion. Players may think the random number generator (RNG) is against them personally. They may even accuse the game makers of using bad RNGs. Folks like this tend to forget the awesome luck the very same RNG may have given them the day before. This is really an education thing, so we don’t want to dwell on it too long. But one thing we can take away as game designers is that players can easily have runs of bad, average or good random number outputs. We need to design with this in mind.
If we’ve locked into the random hit chance/random damage model, we want to have a think about what this means for our weapons. Let’s take the bow I use in my D&D campaign. It’s a +1 Darkwood Composite Longbow and I’m at the stage where it gives me +11 to-hit and does 1d8+2 damage, if I don’t do anything fancy with it. How much damage would I expect to inflict on average with my bow if I hit someone? Well, the average of 1d8 is 4.5, so 1d8+2 has an average damage of 6.5. Okay, so I’ll do 6.5 damage per round, if I hit something.
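The averaging rule generalises to any dice expression. As a small sketch (the helper `average_roll` is a name I’ve made up):

```python
from fractions import Fraction

def average_roll(dice, sides, bonus=0):
    """Mean of NdS + bonus: each die averages (sides + 1) / 2."""
    return dice * Fraction(sides + 1, 2) + bonus

print(float(average_roll(1, 8)))      # 4.5
print(float(average_roll(1, 8, 2)))   # 6.5
```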
What about accounting for the random chance to hit? Now D&D’s rules are a little tricky. I have to roll 1d20, add my bonuses, subtract penalties and compare that number to the bad guy’s Armour Class. If my result is greater than their Armour Class, I hit and roll for damage. There are two complications, namely rolling a natural 1 or 20 (the number on the die, before bonuses or penalties): a critical fumble or a critical hit respectively. There are funky rules for dealing with both, but let’s assume we use the simplest ones. A natural 1 guarantees zero damage. If I get a natural 20, I roll again. If that second roll still hits, I do triple damage. Otherwise, normal damage.
Given this tangle of rules, how much damage would I expect to deal a round? We still need a benchmark Armour Class, so let’s say I’m trading arrows with my evil doppelganger. He has an AC of 21, so my attack total has to be greater than 21. I’m (initially) lazy at doing maths, so let’s simulate it!
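A simulation along these lines can be sketched in a few lines of Python. This is not the code behind my numbers, just an illustration with some assumptions baked in: a hit needs a total strictly greater than the AC, a natural 20 always does at least normal damage, the confirmation roll is a plain roll-plus-bonus check, and a critical triples the whole 1d8+2. The names `attack_damage` and `simulate` are my own.

```python
import random

def attack_damage(rng, to_hit=11, armour_class=21, damage_bonus=2):
    """Resolve one bow shot under the simplified rules described above."""
    def damage():
        return rng.randint(1, 8) + damage_bonus      # 1d8+2

    roll = rng.randint(1, 20)
    if roll == 1:                                    # natural 1: zero damage
        return 0
    if roll == 20:                                   # natural 20: roll to confirm
        confirmed = rng.randint(1, 20) + to_hit > armour_class
        return 3 * damage() if confirmed else damage()
    return damage() if roll + to_hit > armour_class else 0

def simulate(rounds=200_000, seed=1):
    """Average damage per round over many independent shots."""
    rng = random.Random(seed)
    return sum(attack_damage(rng) for _ in range(rounds)) / rounds

print(round(simulate(), 2))
```

Changing `to_hit`, `armour_class` or `damage_bonus` lets you try other weapons and opponents with the same rules.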
The average damage per round here was about 3.6. This means the whole “roll to hit” thing is reducing my damage to a little more than half. Notice, however, that the average “hit” value (my clumsy term for the damage done when you definitely hit) is much higher at 7.8, which is over double the average damage per round. What’s the effect of taking out critical fumbles? Insignificant at this level.² What’s the effect of taking out critical fumbles and critical hits? Average damage drops to about 3.2 per round, and the average “hit” value drops to 6.5.
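The per-round averages can also be worked out exactly by walking through the twenty faces of the die. The sketch below does that under the same simplified rules, with the assumptions that a hit needs a total strictly above the AC and that the confirmation roll is a plain roll-plus-bonus check. The `fumbles` and `crits` switches are hypothetical knobs of mine for turning the special rules off.

```python
from fractions import Fraction

def expected_damage(to_hit=11, armour_class=21, mean_damage=Fraction(13, 2),
                    fumbles=True, crits=True):
    """Exact average damage per round under the simplified rules above."""
    def beats_ac(roll):
        return roll + to_hit > armour_class

    # Chance that the confirmation roll after a natural 20 also hits.
    confirm = Fraction(sum(beats_ac(r) for r in range(1, 21)), 20)

    total = Fraction(0)
    for roll in range(1, 21):
        if roll == 1 and fumbles:
            continue                                  # natural 1: no damage
        if roll == 20 and crits:
            # Always at least a normal hit; triple damage if confirmed.
            total += mean_damage * (3 * confirm + (1 - confirm))
        elif beats_ac(roll):
            total += mean_damage
    return total / 20

print(float(expected_damage()))                            # 3.575
print(float(expected_damage(fumbles=False)))               # 3.575
print(float(expected_damage(fumbles=False, crits=False)))  # 3.25
```

Removing fumbles changes nothing here because a natural 1 plus 11 still falls short of AC 21, while removing criticals costs about a third of a point per round.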
What does this mean? Well, if I fight my doppelganger, I expect to kill him in about 16 rounds, since he has the same 57 HP I do and I deal about 3.6 damage per round. If there were no critical hits or misses, it’d take about 18 rounds. But that’s on average, right? What are all the possibilities? The fight could go on forever if I kept missing (and he took no shots at me). The fight could be over in two rounds if I manage to get critical hits for full damage on both. This is a pretty broad range, and for game design it’d be neat to know a bit more.
Here I’ve simulated a great many combats.
The orange curve is for killing in exactly N rounds, and the green curve for killing in at most N rounds. We can use the former to look at the spreads for individuals. Close to no-one will get the coveted 2-hit kill, but the probability ramps up quickly, peaking at 16 rounds. At the other end of that curve, you can see that maybe 1% of people will have to go at least 28 rounds before killing him.
The green curve helps us look at spreads for the population. While the most frequent kill time is 16 rounds, about 40% of people will actually kill the doppelganger in 16 or fewer rounds. 60% of people will be done in 20 rounds, and the vast majority of people will have finished in about 35 rounds. Although the top seems to level off at one, it’s not strictly 100% (because of rounding). If you have a large number of players, or a lot of fights with this doppelganger, there will be a significant number of people who might take 50 rounds of combat, just because of bad dice rolling.
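The two curves described above can be approximated with a simulation like the following. It is a sketch rather than the code behind the graph, and it assumes the doppelganger dies once his 57 HP reach zero or below, never shoots back, and is hit under the same simplified rules (strictly-greater-than AC, triple damage on a confirmed natural 20). `shot`, `rounds_to_kill` and `kill_curves` are names I’ve invented.

```python
import random
from collections import Counter

def shot(rng, to_hit=11, armour_class=21, bonus=2):
    """Damage from one shot: natural 1 misses, natural 20 may triple."""
    roll = rng.randint(1, 20)
    if roll == 1:
        return 0
    if roll == 20:
        confirmed = rng.randint(1, 20) + to_hit > armour_class
        return (3 if confirmed else 1) * (rng.randint(1, 8) + bonus)
    return rng.randint(1, 8) + bonus if roll + to_hit > armour_class else 0

def rounds_to_kill(rng, hit_points=57):
    """Count rounds until the target's hit points are used up."""
    rounds = 0
    while hit_points > 0:
        hit_points -= shot(rng)
        rounds += 1
    return rounds

def kill_curves(fights=20_000, seed=1):
    """Return {N: (P(exactly N rounds), P(at most N rounds))}."""
    rng = random.Random(seed)
    counts = Counter(rounds_to_kill(rng) for _ in range(fights))
    curves, running = {}, 0
    for n in range(1, max(counts) + 1):
        running += counts[n]
        curves[n] = (counts[n] / fights, running / fights)
    return curves

curves = kill_curves()
for n in (8, 12, 16, 20, 28):
    exact, at_most = curves.get(n, (0.0, 1.0))
    print(f"{n:2d} rounds: exactly {exact:.3f}, at most {at_most:.3f}")
```

The first number in each pair corresponds to the “exactly N rounds” curve and the second to the “at most N rounds” curve; more fights give smoother curves at the cost of run time.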
Suppose I want to investigate the effect of a certain buff. In this case, we have “Deadly Aim”, a special ability that lets you trade some accuracy for increased damage. It gives a -2 penalty to-hit, but +4 to damage. Let’s have a look:
The peak has shifted down to 12 rounds, and it’s a tiny bit more frequent, so we’d expect to kill him faster. We also note that 50% of people will kill him in 16 or fewer rounds, which is a nice improvement. Messing with the penalties and damage bonuses can give you nice graphs on which you can base your game design. With tiny tweaks you can speed up combat, slow it down, reduce the chance of people blowing through it with lucky rolls, or mitigate unlucky rolls.
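A quick way to sanity-check a tweak like this before drawing any graphs is to compare exact average damage per round. The sketch below assumes the same simplified rules as before (hit on a total strictly above AC 21, triple damage on a confirmed natural 20), and the “rounds” figure is only the crude estimate of 57 HP divided by average damage, which ignores overkill on the final shot. `expected_damage` is my own helper.

```python
from fractions import Fraction

def expected_damage(to_hit, damage_bonus, armour_class=21):
    """Exact mean damage per round for a 1d8 bow under the rules above."""
    mean_hit = Fraction(9, 2) + damage_bonus               # average of 1d8 + bonus
    hit_faces = [r for r in range(2, 21)
                 if r == 20 or r + to_hit > armour_class]  # natural 1 never hits
    confirm = Fraction(sum(r + to_hit > armour_class for r in range(1, 21)), 20)
    total = Fraction(0)
    for r in hit_faces:
        multiplier = 1 + 2 * confirm if r == 20 else 1     # 3x if confirmed, else 1x
        total += mean_hit * multiplier
    return total / 20

plain = expected_damage(to_hit=11, damage_bonus=2)
deadly = expected_damage(to_hit=11 - 2, damage_bonus=2 + 4)
for name, dmg in (("plain", plain), ("Deadly Aim", deadly)):
    print(f"{name}: {float(dmg):.3f} per round, ~{57 / float(dmg):.1f} rounds for 57 HP")
```

Against this particular AC the extra damage outweighs the lost accuracy; against a better-armoured target the trade could go the other way, which is exactly the kind of thing worth checking.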
If you wanted to do this sort of analysis yourself (with maths and not just Monte Carlo simulations), then you should learn a little about Poisson distributions. I’d talk a bit more about the statistical analysis you can do, but I think I’ve blown my words and pretty graph quotas for the month.
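As one example of getting exact numbers without simulation, the sketch below tracks the probability of each remaining hit-point total round by round. To be clear, this is plain dynamic programming that I’m swapping in for illustration, not a Poisson-based analysis, and it assumes the same simplified rules as the rest of the post (hit on a total strictly above the AC, triple damage on a confirmed natural 20, dead at 0 HP or below). The function names are mine.

```python
from collections import defaultdict

def round_damage_distribution(to_hit=11, armour_class=21, bonus=2):
    """Exact {damage: probability} for one shot of a 1d8+bonus bow."""
    dist = defaultdict(float)
    confirm = sum(r + to_hit > armour_class for r in range(1, 21)) / 20
    for roll in range(1, 21):
        if roll == 20:
            weights = {3: confirm, 1: 1 - confirm}     # triple or normal damage
        elif roll != 1 and roll + to_hit > armour_class:
            weights = {1: 1.0}                         # ordinary hit
        else:
            dist[0] += 1 / 20                          # miss
            continue
        for multiplier, weight in weights.items():
            for die in range(1, 9):
                dist[multiplier * (die + bonus)] += weight / 20 / 8
    return dict(dist)

def kill_probabilities(hit_points=57, max_rounds=60):
    """P(the kill happens in exactly N rounds) for N = 1..max_rounds."""
    shot = round_damage_distribution()
    alive = {hit_points: 1.0}                          # remaining HP -> probability
    exact = {}
    for n in range(1, max_rounds + 1):
        survivors, killed = defaultdict(float), 0.0
        for hp, p in alive.items():
            for dmg, q in shot.items():
                if hp - dmg <= 0:
                    killed += p * q
                else:
                    survivors[hp - dmg] += p * q
        exact[n], alive = killed, survivors
    return exact

exact = kill_probabilities()
mean_rounds = sum(n * p for n, p in exact.items())
print(f"P(kill in exactly 2 rounds) = {exact[2]:.2e}")
print(f"mean rounds (fights cut off at 60) = {mean_rounds:.1f}")
```

Summing the `exact` values up to N gives the “at most N rounds” curve, so both graphs fall out of the same table.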
- If you like numbers spelt in words, it’s about one in 10 trillion, which is roughly double the probability of picking a particular red blood cell out of a normal human. Alternatively, it’s roughly thirty times less likely than picking a particular star in our galaxy at random.
- At this level of skill versus this opponent, rolling a 1 means a miss however you spin it. At higher skill levels or against weaker foes, it matters more. I’m also ignoring house rules that critical fumbles incapacitate you for some time, mostly because it’s too hard to model.