I’ve spent a lot of time thinking about Challenge Cup.

Central to generating random Pokemon for CC is giving each Pokemon a random EV spread. But generating random EVs is a lot harder than generating random IVs because of the requirement that total EVs cannot exceed 510 (let’s also assume that you don’t want to allow any under-trained sets, so make 510 the minimum EV count as well).

So the issue is how you go about generating a random EV spread, by which I mean, select one in the set of all valid EV spreads at random. Your first instinct will be to take each of the 510 EV points and assign them one-by-one at random to one of the six stats. But if you do that, you will almost always get a distribution that’s pretty close to 85-85-85-85-85-85 (even spread). That’s not fun–you want to be able to get spreads that max out (or nearly max out) certain stats and leave others with next-to-no investment.

My next thought–which I ended up using in CC4Wifi and Challenge Cup on Pokemon Showdown was to implement the following algorithm:

1. Select a stat at random
2. Choose a random number between 0 and (255 – the number of EVs already in that stat) or the number of unassigned EVs (whichever is less).
3. Add that number to the number of EVs in the stat. Subtract that number from the number of unassigned EVs.
4. Repeat until you run out of EVs.

That algorithm is certainly an improvement, and you get “random-looking” stats, but it’s still not going to choose from each possible valid EV spread with the same probability.

I let the problem mull over in my mind for months until I came up with a definitive answer. Before I share it with you, though, I want to share an IRC discussion I had with Zarel, the creator and lead developer of Pokemon Showdown:

Jul 16 15:05:36 <Antar> so I’ve come up with a modification to the random-EV algorithm for Challenge Cup.
Jul 16 15:06:09 <Antar> the distinction is purely mathematical, but this new algorithm really and truly selects a valid EV spread at random from all possible spreads
Jul 16 15:06:23 <Antar> with each spread being equally likely
Jul 16 15:06:31 <Antar> ah, the power of statistical mechanics…
Jul 16 15:06:49 <Antar> anyway, I’ll commit it once I code it, which should be in a few days
Jul 16 15:07:05 <Zarel> hmm
Jul 16 15:07:39 <Zarel> for (i in 1 to 510) randomEV++;
Jul 16 15:07:59 <Antar> no, you can’t do that, because then your stats are gaussian
Jul 16 15:08:13 <Zarel> oh, yeah

Jul 16 15:12:35 <Zarel> Combinatorics saves the day
Jul 16 15:12:45 <Zarel> Shuffle 510 zeros and 5 ones
Jul 16 15:12:59 <Zarel> The number of zeros before the first one: HP ev
Jul 16 15:13:08 <Zarel> The number of zeros between the first and second one: Atk ev
Jul 16 15:13:10 <Zarel> etc etc
Jul 16 15:13:14 <Antar> oh holy shit
Jul 16 15:13:17 <Antar> go combinatorics!
Jul 16 15:13:30 <Antar> except…
Jul 16 15:13:40 <Antar> the max length of 255 screws you over
Jul 16 15:13:47 <Zarel> oh
Jul 16 15:13:49 <Zarel> damn it
Jul 16 15:13:50 <Antar> I guess you can throw out those solutions
Jul 16 15:13:57 <Antar> just keep going until you get a valid one
Jul 16 15:14:03 <Antar> shouldn’t be that hard
Jul 16 15:14:22 <Zarel> There’s probably a combinatorics solution that lets you route around those
Jul 16 15:15:14 <Zarel> Anyway, that was balls-and-bins, 510 balls and 6 bins

Needless to say, my solution was a bit more complex and was very much colored by my educational background. Zarel, being something of a computer science guy, had some familiarity with combinatorics. I’m pretty ignorant of the field, and my answer is based on physics, specifically statistical mechanics.

What I’m going to do is enumerate all possible EV spreads.

To do this, I’m going to generalize the problem to consider all the possible ways x EVs can be divided between N stats and do it one stat at a time.

To start, if N=1, for a given x there are

$n_1(x) = \left\{\begin{array}{ll}1 & : 0\leq x \leq 255\\ 0\ & : \text{otherwise}\end{array}\right.$

or, if you prefer,

$n_1(x)=\theta(x)\theta(256-x)$

where $\theta(x)$ is the Heaviside function (surprising fact: it’s named for a mathematician named Heaviside, not for its shape).

Moving on to N=2, say y is the number of EVs in stat 2. Then that leaves x-y EVs in stat 1. For a given y, there will be $n_1(x-y)$ allowed “microstates” (nomenclature from physics–Zarel just calls them “states”). All that remains is to count them up.

$n_2(x)=\sum_{y=0}^{255}n_1(x-y)$
$n_2(x)=\sum_{y=0}^{255}\theta(x-y)\theta(256-x+y)$
$n_2(x)=(256-|255-x|)(\theta(x)\theta(510-x)$

For N>2, all that remains is to follow the same procedure. Probably you want to do this numerically.

$n_N(x)=\sum_{y=0}^{255}n_{N-1}(x-y)$

$n_6(510)$ will thus give us the total number of allowed microstates (the number of valid EV spreads). Something that will probably be very useful for me when I go about modifying X-Act’s Base Stat Ratings system for the purposes of Challenge Cup level balance will be $n_5(510-s)$, which, when normalized by dividing by $n_6(510)$ will yield the probability of getting a speed EV of s.

Back now to the question of generating random EV spreads, the “physics” method would be the following:

1. Select the first stat (s) at random. Do this by choosing a value for s in the range $0\leq s \leq 255$ with a weighting given by $f(s)=n_5(510-s)/n_6(510).$
2. Next, select the second stat (e) at random by selecting a value for e in the range $0\leq s \leq 255$ with a weighting given by $f(e)=n_5(510-s-e)/n_5(510-s).$
3. Proceed in this fashion until all six stats are generated.

Needless to say, Zarel’s combinatorics solution is not only more elegant but is algorithmically simpler, although I *would* like to take the time to show that they’re equivalent. Maybe some day soon. I actually still haven’t coded up either algorithm for CC4Wifi or for Pokemon Showdown. I should get on that…