
Missing Mods, Source Code, and Leftovers, Revisited

eric the espeon, a Smogon user and stats junkie from way back when, made a detailed post in the stalliness discussion thread, and I just posted back a reply to his questions and criticisms.

In the end, I revised the metric a bit further, but before I get into that, I want to point you to my GitHub repository, where I now host my team analyzer (which contains the stalliness algorithm) as a separate file. If you navigate your way over to this folder, you can find an example of how to use the team analyzer script. Feel free to fork my repository, modify my team analyzer, and tell me if you come up with better results. If you ask me nicely, I’ll even provide you with importables of the RMT archive.

Read more…

Revisions, Revisions

After some careful thought and a LOT of testing and re-testing, I made some revisions to my stalliness metric (namely adjusting some key moveset modifications), and the end result is something that I’m pretty happy with.

Before I get into the nitty-gritty of exactly what I changed, I’d like to show off the results:

Read more…

When In Doubt, Throw One Out?

From the feedback I got after posting my previous results, I started to wonder if stalliness wasn’t working better simply because of an outlier problem. Even full stall teams usually have one offensive member, and offensive teams will often have some utility Pokemon. Do these “outliers” throw off the combined stalliness? Easy enough to check.
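One way to run that check, as a rough sketch only: it assumes each team member already has a numeric stalliness score and that the combined score is a plain mean, neither of which is spelled out in this excerpt. The function name `team_score_dropping_outlier` is made up for illustration.

```python
from statistics import mean


def team_score_dropping_outlier(scores):
    """Mean of per-Pokemon scores after discarding the single most deviant one.

    Assumption: the combined team score is a simple mean of member scores.
    """
    if len(scores) < 3:
        # Too few members for "outlier" to mean anything; just average them.
        return mean(scores)
    center = mean(scores)
    # Index of the member whose score is farthest from the team mean.
    outlier_index = max(range(len(scores)), key=lambda i: abs(scores[i] - center))
    kept = [s for i, s in enumerate(scores) if i != outlier_index]
    return mean(kept)
```

Comparing this value against the plain mean for each team shows how much a single off-type member moves the combined number.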

Read more…

More Testing

In the Smogon forums thread where I discuss my stalliness metric, I asked users to submit their own teams to be analyzed by my metric. A user by the name of alkinesthetase linked me to Smogon’s RMT Archive index, which contains importable versions of dozens of teams, in various tiers and playstyles. I ran my algorithm against this dataset in an attempt to come up with “cutoffs” for stall vs. semi-stall vs. balance/bulky offense vs. offense vs. heavy offense. Below are the results, both for bias (Innocent Criminal’s metric) and stalliness (my own).

Read more…

Testing the metric

As nice as it was to define a metric for stall that made physical sense (at least to me), what would be even NICER would be to see that this metric actually *predicts* something.

So what should my stall score predict? How about the length of a battle?

Read more…

Measuring Stall

A while back, fellow Smogonite Innocent Criminal coded up a script for me that pulled data from some specialized Pokemon Online logs he’d had us keep and generated moveset analyses and metagame analyses. Now that we get most of our logs from Pokemon Showdown (PS), I have to re-create his work.

By moveset analyses, I mean things like which moves were used most frequently and which EV spreads were most common. That part is about 50% done and fairly straightforward.

His metagame analyses were a bit more subjective. A large component was things like identifying weather teams, pseudo-weather (Trick Room) teams and Baton Pass teams, but another component was figuring out the breakdown of offense vs. stall.

Read more…

If all you have is a hammer, everything looks like a nail

I’ve spent a lot of time thinking about Challenge Cup.

Central to generating random Pokemon for CC is giving each Pokemon a random EV spread. But generating random EVs is a lot harder than generating random IVs because of the requirement that total EVs cannot exceed 510 (let’s also assume that you don’t want to allow any under-trained sets, so make 510 the minimum EV count as well).
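To make the constraint concrete, here is a minimal sketch of one way to draw such a spread. It is not necessarily the approach the full post takes. It uses "stars and bars" to pick a uniformly random split of 510 points across six stats, then rejects any split that breaks a per-stat cap. The cap of 255 and the function name `random_ev_spread` are assumptions added for illustration; the excerpt only states the 510 total.

```python
import random


def random_ev_spread(total=510, n_stats=6, cap=255, rng=random):
    """Return n_stats non-negative integers summing to exactly `total`.

    Assumption: no single stat may exceed `cap` (255 here).
    """
    slots = total + n_stats - 1  # `total` stars plus (n_stats - 1) bars
    while True:
        # Choose distinct bar positions; the gaps between bars are the EVs.
        bars = sorted(rng.sample(range(slots), n_stats - 1))
        spread = []
        prev = -1
        for b in bars:
            spread.append(b - prev - 1)
            prev = b
        spread.append(slots - prev - 1)
        # Rejection step: retry if any stat is over the cap.
        if max(spread) <= cap:
            return spread
```

Because every uncapped split is equally likely and invalid ones are simply redrawn, the surviving spreads are uniform over all valid ones. The cost is an occasional retry.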

Read more…
