More Testing

In the Smogon forums thread where I discuss my stalliness metric, I asked users to submit their own teams to be analyzed by my metric. A user by the name of alkinesthetase linked me to Smogon’s RMT Archive index, which contains importable versions of dozens of teams, in various tiers and playstyles. I ran my algorithm against this dataset in an attempt to come up with “cutoffs” for stall vs. semi-stall vs. balance/bulky offense vs. offense vs. heavy offense. Below are the results, both for bias (Innocent Criminal’s metric) and stalliness (my own).

Testing the metric

As nice as it was to define a metric for stall that made physical sense (at least to me), what would be even NICER would be to see that this metric actually *predicts* something.

So what should my stall score predict? How about the length of a battle?

Measuring Stall

A while back, fellow Smogonite Innocent Criminal coded up a script for me that pulled data from some specialized Pokemon Online logs he’d had us keep and generated moveset analyses and metagame analyses. Now we get most of our logs from PS, and I have to re-create his work.

So by moveset analyses, I mean things like what moves were used most frequently, most common EV spreads, that sort of thing, and that’s about 50% done and fairly straightforward.

His metagame analyses were a bit more subjective. A large component was things like identifying weather teams, pseudo-weather (Trick Room) teams and Baton Pass teams, but another component was figuring out the breakdown of offense vs. stall.

If all you have is a hammer, everything looks like a nail

I’ve spent a lot of time thinking about Challenge Cup.

Central to generating random Pokemon for CC is giving each Pokemon a random EV spread. But generating random EVs is a lot harder than generating random IVs because of the requirement that total EVs cannot exceed 510 (let’s also assume that you don’t want to allow any under-trained sets, so make 510 the minimum EV count as well).
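To make the constraint concrete, here is a minimal sketch of one way to draw a spread that sums to exactly 510. It is not necessarily the approach the full post takes. It uses a "stars and bars" draw to pick a uniformly random split of 510 points across six stats, then rejects any draw where a single stat goes over a per-stat cap. The cap of 255 and the names `random_ev_spread`, `TOTAL_EVS`, `NUM_STATS` and `STAT_CAP` are assumptions made for illustration; the text above only specifies the 510 total.

```python
import random

TOTAL_EVS = 510   # from the post: every spread must sum to exactly 510
NUM_STATS = 6     # HP, Atk, Def, SpA, SpD, Spe
STAT_CAP = 255    # assumed per-stat maximum; not stated in the post


def random_ev_spread(rng=random):
    """Return six non-negative ints summing to TOTAL_EVS, none above STAT_CAP.

    Stars and bars: lay out TOTAL_EVS "stars" and NUM_STATS - 1 "bars" in a
    row; the number of stars between consecutive bars is one stat's EVs.
    Every split of the total is equally likely, and rejecting the ones that
    break the cap keeps the remaining valid spreads equally likely too.
    """
    slots = TOTAL_EVS + NUM_STATS - 1
    while True:
        # Choose which slots hold the bars; the rest hold stars.
        bars = sorted(rng.sample(range(slots), NUM_STATS - 1))
        edges = [-1] + bars + [slots]
        spread = [edges[i + 1] - edges[i] - 1 for i in range(NUM_STATS)]
        if max(spread) <= STAT_CAP:
            return spread


if __name__ == "__main__":
    print(random_ev_spread())
```

Only one stat can exceed half of the total in any given draw, so most draws are accepted and the loop typically finishes after one or two tries. Rounding each stat down to a multiple of 4, which is all that matters in-game, is left out here because it would break the exact-510 requirement.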



Those of you who know me from Smogon will know me as Antar, the stats guy and the dude who keeps the simulators running. Those of you who know me from YouTube know me as Antar1011, the decent-enough battler who plays his music too loud.

I was talking with the Smogon staff about resurrecting the old “Pokemetrics” forum so that I could start providing stats beyond just the simple usage statistics and use the subforum as a place to bounce ideas off of people, but I realized during that discussion that a blog would be a better format.

WordPress supports LaTeX (expect to see a decent number of equations), so here I am! Feel free to leave comments and shoot me ideas about new metrics or revising old ones.

Oh, and finally, my code will always be available on my GitHub repository. Feel free to peruse it, use it, fork it, or modify it.