Revisions for Gen VI

I’m keeping it simple, updating based on the new items, moves and abilities and not doing anything groundbreaking.

Several Pokemon have had their base stats changed; my stalliness implementation pulls in those changes for free.

Changes in move power have no bearing on the stalliness metric.

  • The abilities Dark Aura, Fairy Aura, Infiltrator,* Parental Bond, Protean, Strong Jaw, Sweet Veil and Tough Claws subtract 0.5 from the metric.
  • The abilities Aroma Veil, Bulletproof, Cheek Pouch and Gooey add 0.5 to the metric.
  • The ability Fur Coat adds 1.0 to the metric.
  • The move Crafty Shield does not affect the metric (as it does not prevent damaging moves).
  • The moves King’s Shield, Mat Block and Spiky Shield get added to Protect and Detect in the list of moves that, if present on a moveset, add 1.0 to the metric.
  • The move Nuzzle gets added to the other paralysis moves for adding 0.5 to the metric.
  • The moves Power-Up Punch and Rototiller get added to the list of setup moves that subtract 0.5 from the metric (recall that multiple setup moves do not stack).
  • The move Geomancy gets added to the list of setup moves that subtract 1.0 from the metric.
  • The move Sticky Web subtracts 0.5 from the metric (since stall teams really won’t benefit from having the opponents’ speed lowered).
  • The item Assault Vest does not change the metric.
  • The items Kee Berry, Maranga Berry, Roseli Berry and Snowball get added to the list of “consumables” which subtract 0.5 from the metric.
  • The item Pixie Plate subtracts 0.25 from the metric.
  • The item Weakness Policy subtracts 1.0 from the metric.
  • The item Safety Goggles does not change the metric (“powder” moves are few and far between, and neutralizing weather is better accomplished with Leftovers).
  • Mega Stones, if held by the corresponding Pokemon, will result in stalliness being calculated as the AVERAGE of the metric under each form. That is, for Aerodactyl holding Aerodactylite, calculate stalliness once assuming it stays an Aerodactyl (old stats, old ability), then calculate again assuming Aerodactylite is used and it has the Mega forme’s stats and ability. Take those two values and average them (this is because Mega Evolution is not guaranteed and is in fact limited to one-per-team, even though a team may contain multiple Pokemon that can Mega evolve).

*Infiltrator now bypasses Substitute.
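The Gen VI adjustments listed above can be sketched as a lookup, roughly as follows. This is only an illustration, not the code from my team analyzer: the names `gen6_modifier` and `mega_stalliness` are made up for this example, the tables hold only the entries named in the list (older abilities, items and moves are left out), and it assumes that a moveset with setup moves from both tiers takes only the single largest penalty, and that each move bonus applies once per moveset.

```python
# Illustrative sketch only: names and structure are hypothetical,
# and only the Gen VI entries from the list above are included.

ABILITY_MODS = {}
for name in ("Dark Aura", "Fairy Aura", "Infiltrator", "Parental Bond",
             "Protean", "Strong Jaw", "Sweet Veil", "Tough Claws"):
    ABILITY_MODS[name] = -0.5
for name in ("Aroma Veil", "Bulletproof", "Cheek Pouch", "Gooey"):
    ABILITY_MODS[name] = 0.5
ABILITY_MODS["Fur Coat"] = 1.0

ITEM_MODS = {
    "Kee Berry": -0.5, "Maranga Berry": -0.5,   # "consumables"
    "Roseli Berry": -0.5, "Snowball": -0.5,
    "Pixie Plate": -0.25,
    "Weakness Policy": -1.0,
    "Assault Vest": 0.0, "Safety Goggles": 0.0,  # no change
}

PROTECT_MOVES = {"Protect", "Detect", "King's Shield", "Mat Block", "Spiky Shield"}
PARALYSIS_MOVES = {"Nuzzle"}  # the older paralysis moves are omitted here
SETUP_MODS = {"Power-Up Punch": -0.5, "Rototiller": -0.5, "Geomancy": -1.0}


def gen6_modifier(ability, item, moves):
    """Sum the adjustments from the list for one moveset."""
    moves = set(moves)
    total = ABILITY_MODS.get(ability, 0.0) + ITEM_MODS.get(item, 0.0)
    if moves & PROTECT_MOVES:       # counted once (assumption)
        total += 1.0
    if moves & PARALYSIS_MOVES:     # counted once (assumption)
        total += 0.5
    if "Sticky Web" in moves:
        total -= 0.5
    # Setup moves do not stack; assume the largest single penalty applies.
    penalties = [SETUP_MODS[m] for m in moves if m in SETUP_MODS]
    if penalties:
        total += min(penalties)
    return total


def mega_stalliness(base_forme_value, mega_forme_value):
    """Average the metric computed under the base forme and the Mega forme."""
    return (base_forme_value + mega_forme_value) / 2.0
```

For a Pokemon holding its Mega Stone, the full metric would be computed twice (once with each forme's stats and ability) and the two results passed to `mega_stalliness`.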

Missing Mods, Source Code, and Leftovers, Revisited

eric the espeon, a Smogon user and stats junkie from way back when, made a detailed post in the stalliness discussion thread, and I just posted back a reply to his questions and criticisms.

In the end, I revised the metric a bit further, but before I get into that, I want to point your attention towards my github repository, where I now host my team analyzer (which contains the stalliness algorithm) as a separate file. If you navigate your way over to this folder, you can find an example of how to use the team analyzer script. Feel free to fork my repository, modify my team analyzer, and tell me if you come up with better results. If you ask me nicely, I’ll even provide you with importables of the RMT archive.

Revisions, Revisions

After some careful thought and a LOT of testing and re-testing, I made some revisions to my stalliness metric (namely adjusting some key moveset modifications), and the end result is something that I’m pretty happy with.

Before I get into the nitty-gritty of exactly what I changed, I’d like to show off the results:

When In Doubt, Throw One Out?

From the feedback I got after posting my previous results, I started to wonder whether stalliness was underperforming simply because of an outlier problem. Even full stall teams usually have one offensive member, and offensive teams will often have some utility Pokemon. Do these “outliers” throw off the combined stalliness? Easy enough to check.
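A check along these lines could look like the sketch below. It rests on assumptions this post doesn't spell out: that a team's combined stalliness is the plain mean of its members' values, and that the "outlier" is the member farthest from that mean. The function names and the numbers in the usage note are purely illustrative.

```python
from statistics import mean


def team_stalliness(values):
    """Combined stalliness, assumed here to be the mean over team members."""
    return mean(values)


def team_stalliness_drop_one(values):
    """Recompute the mean after discarding the member farthest from it."""
    if len(values) < 2:
        return mean(values)
    center = mean(values)
    # Index of the member whose value deviates most from the team mean.
    outlier = max(range(len(values)), key=lambda i: abs(values[i] - center))
    kept = [v for i, v in enumerate(values) if i != outlier]
    return mean(kept)
```

For a made-up team of five members at 2.0 and one sweeper at -4.0, `team_stalliness` gives 1.0 while `team_stalliness_drop_one` gives 2.0, which is the kind of shift the question is asking about.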

More Testing

In the Smogon forums thread where I discuss my stalliness metric, I asked users to submit their own teams to be analyzed by my metric. A user by the name of alkinesthetase linked me to Smogon’s RMT Archive index, which contains importable versions of dozens of teams, in various tiers and playstyles. I ran my algorithm against this dataset in an attempt to come up with “cutoffs” for stall vs. semi-stall vs. balance/bulky offense vs. offense vs. heavy offense. Below are the results, both for bias (Innocent Criminal’s metric) and stalliness (my own).

Testing the metric

As nice as it was to define a metric for stall that made physical sense (at least to me), what would be even NICER would be to see that this metric actually *predicts* something.

So what should my stall score predict? How about the length of a battle?

Measuring Stall

A while back, fellow Smogonite Innocent Criminal coded up a script for me that pulled data from some specialized Pokemon Online logs he’d had us keep and generated moveset analyses and metagame analyses. Now we get most of our logs from PS, and I have to re-create his work.

So by moveset analyses, I mean things like what moves were used most frequently, most common EV spreads, that sort of thing, and that’s about 50% done and fairly straightforward.

His metagame analyses were a bit more subjective. A large component was things like identifying weather teams, pseudo-weather (Trick Room) teams and Baton Pass teams, but another component was figuring out the breakdown of offense vs. stall.

