Good luck, ChanmanV!

If you haven’t heard yet, ChanmanV announced that he will be ending his shows and general involvement in eSports. I think the community will really miss his presence in the coming months. He produced a ton of valuable, free content for the community, and he engaged positively and frankly with the drama and issues in a burgeoning industry.

The community probably saw a lot of what he did publicly with his shows, but he also did a lot behind the scenes, and I figured I would share a story about that. Fun fact: Chris was actually instrumental in a lot of the progress of Spawning Tool to where it is today. Within a week or two of the launch, Chris contacted me about using Spawning Tool in conjunction with Pro Corner to host pre-generated replays for practice. That ended up not panning out, due both to technical difficulties on my side (I regret not having been more on the ball) and to a better understanding between us of what the community could offer, but it was the beginning of understanding how Spawning Tool, a technical concept more than a product concept, could be valuable.

Over the next few months, Chris invested a lot of time, thought, and social capital in Spawning Tool and never asked for anything in return. He put me in touch with personalities and players (I was starstruck every time) to get feedback on the site and its functionality. He sat with me on long Skype calls to think over features and even went as far as to review what was basically a product backlog we maintained in a spreadsheet. He stuck with me through all of my vague feelings about product and business direction. In fact, my biggest regret to date is not having dedicated more effort and listened more to Chris (and a few others) who were my biggest proponents early on. I feel like Spawning Tool could have moved much more rapidly and come to a stronger result.

I personally will be sad to see him go, but his explanation makes a ton of sense. There are a lot of professionals out there, but much of eSports is still driven by the community, volunteers, and semi-professionals. I don’t think I can overstate how much Chris has put into StarCraft, and I couldn’t ask any more of him. I wish the best to him and his family moving forward.

Matchup Win Rates by Game Length

(Originally posted on reddit and TL)

As requested by /u/SidusKnight, here are the win rates in the non-mirror matchups by game length, which roughly show at what points in the game certain races appear to have an advantage.

The data comes from roughly 2000-3000 games in each matchup, played from April 2013 to present, on Spawning Tool. The vast majority of the replays are from tournament replay packs and should represent high-level play, though other games are not filtered out. Note that this does integrate data over multiple balance patches (I’m happy to rerun the numbers within specific time frames upon request).

There aren’t many games shorter than 5 minutes or longer than 31 minutes, so the graph is truncated there. The shorter games are ignored since many of them appear to be re-games, and the longer games are grouped together. The raw data is available here.
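(For the curious, the binning works roughly like the sketch below. The DataFrame columns here, matchup, length_minutes, and winner_race, are hypothetical stand-ins rather than Spawning Tool’s actual schema.)

```python
# A sketch of the game-length binning described above, assuming a
# pandas DataFrame with hypothetical columns: matchup, length_minutes,
# and winner_race.
import pandas as pd

def win_rates_by_length(games: pd.DataFrame, matchup: str, race: str) -> pd.Series:
    """Win rate for `race` in `matchup`, binned by game length in minutes."""
    df = games[games["matchup"] == matchup].copy()
    df = df[df["length_minutes"] >= 5]  # drop likely re-games
    # Group everything longer than 31 minutes into one bin
    df["bin"] = df["length_minutes"].clip(upper=31).astype(int)
    return (df["winner_race"] == race).groupby(df["bin"]).mean()

# e.g. win_rates_by_length(games, "ZvT", "Zerg") gives Zerg's win rate
# at 5, 6, ..., 30 minutes, with 31 standing in for the ">31" bucket.
```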

I’m not really sure how to read the data myself. There are definitely deviations from 50-50 along the way, though it’s heartening to know that it’s pretty close in all matchups at the traditional late-game timings (19-25 minutes). It’s a little crazy between 25 and 31 minutes, but that might be due to small sample sizes. The sample for >31 minutes is pretty big, though, and it is very close to even in 2 of the 3 matchups.

Were I a better person, I would have rendered this online to make the graph slightly more interactive, but unfortunately, I am not. If you would like more interactive visualizations, however, chime in, and I’ll put more effort into that in the future. If you have any other thoughts on graphs or data you would like to see, I’m happy to take suggestions for those as well.

I’m also happy to get help on some of this. /u/somedave recommended error bars on the last one, and I actually don’t know what the right methodology and presentation for those are.
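For what it’s worth, one standard option for error bars on a win rate is a binomial confidence interval such as the Wilson score interval. The sketch below is just that textbook formula, not necessarily the methodology /u/somedave had in mind:

```python
# Wilson score interval for a binomial proportion, e.g. a win rate.
from math import sqrt

def wilson_interval(wins: int, games: int, z: float = 1.96):
    """Confidence interval for wins/games (z=1.96 is ~95% confidence)."""
    if games == 0:
        return (0.0, 1.0)
    p = wins / games
    denom = 1 + z * z / games
    center = (p + z * z / (2 * games)) / denom
    margin = (z / denom) * sqrt(p * (1 - p) / games + z * z / (4 * games * games))
    return (center - margin, center + margin)

# A 55% win rate over 100 games is only pinned down to roughly (0.45, 0.64)
print(wilson_interval(55, 100))
```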

ZvT win rate v. Mutalisk count

(Originally posted on reddit, also posted to the TL thread)

Following the extensive description of the state of Terran in HotS by TheDwf, TL collected opinions from professionals. Specifically, Snute was interested in “the correlation between Mutalisk count and win rate”. Here’s what I came up with.

The data comes from approximately 2100 ZvT games from roughly April 2013 to present on Spawning Tool. The vast majority of replays are from tournament replay packs and should represent high-level play, though other games are not filtered out.

The frequency of counts tails off, and I truncated the data around 50 since past that, there were fewer than 10 games at each count. The raw data is available here.
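If you want to reproduce the counting yourself, the numbers can be pulled from tracker events with sc2reader (the library Spawning Tool is built on). The sketch below counts every Mutalisk a player hatched over a game, which is an approximation: it ignores deaths, so it is closer to total Mutalisks made than to a peak count.

```python
# A rough sketch of counting Mutalisks from a replay with sc2reader.
import sc2reader

def mutalisks_born(path: str, player_pid: int) -> int:
    """Count every Mutalisk hatched by one player over a game."""
    replay = sc2reader.load_replay(path, load_level=4)  # include tracker events
    return sum(
        1
        for event in replay.tracker_events
        if event.name == "UnitBornEvent"
        and event.control_pid == player_pid
        and event.unit_type_name == "Mutalisk"
    )
```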

Looking at the data myself, I’m not seeing a lot. There may be a bump around 16 Mutas, but otherwise, it’s hovering around 50% with some variance due to small sample sizes.

Let me know if there are any other analyses, graphs, or stats that you are curious about, and I will do my best to follow up with those.

Close/Cross position win rates (updated for WCS)

A few weeks ago, I wrote a post comparing win rates at cross and close positions on 4-player maps. Here’s what it looks like with the WCS replays released by Blizzard yesterday.

(Edit: Fragbite also released their replays, so those are included as well)

                       PvT              PvZ              TvZ
Frost
  Close Positions      47-39 (54.65%)   49-64 (43.36%)   39-49 (44.32%)
  Cross Positions      23-20 (53.49%)   36-25 (59.02%)   15-12 (55.56%)
  Total                71-59 (54.62%)   88-92 (48.89%)   55-64 (46.22%)
Alterzim Stronghold
  Close Positions      22-17 (56.41%)   29-24 (54.72%)   12-8 (60.00%)
  Cross Positions      4-9 (30.77%)     16-10 (61.54%)   5-5 (50.00%)
  Total                26-26 (50.00%)   45-34 (56.96%)   17-13 (56.67%)

Methods and Discussion

The replays are mostly from tournament replay packs, with the vast majority being from WCS. Of course, all conclusions must be drawn from what is still a relatively small sample size.

Note that close positions are twice as likely as cross positions: on a 4-player map, each spawn has two adjacent spawns (sharing a row or column of the grid) and only one spawn diagonally across. I’m not sure whether that fact is noted as often as it should be.

A confounding factor here is bans. I’m not exactly sure what the effect is, but I’m sure it’s relevant since it’s not a uniformly random sample of potential games.

Frost appears to be far more favorable for Zerg at close positions. Notably, the win rates cross over the 50-50 mark depending on the positions. It’s not immediately clear to me whether that advantage comes early or late, but the game lengths are visible on that page as well if you want to look into that.

Alterzim is less clear, though the PvZ numbers are quite dramatic. It’s a super-small sample size, but there may be something there that does or doesn’t confirm common knowledge.

Let me know if there are any other analyses, whether ones I have done previously or new ones you would like to see in the future. The data just got a lot richer and more relevant, so hopefully there’s good stuff in there to discover.

Update after a year of extracting build orders from replays

A year ago, I launched Spawning Tool, and it’s grown tremendously in that time. It started out as an experiment in using Blizzard’s replay format to grab build orders. Since then, it has become a site for organizing and labeling replays not only to steal build orders but also to analyze replays in bulk.

So where are we now? By the numbers, Spawning Tool has:

  1. 9,051 replays uploaded
  2. 132,364 replay tags
  3. 23,092 lines of code

I like to hope that Spawning Tool has contributed meaningfully to our understanding and analysis of StarCraft. Highlights are:

  1. Comparing win rates by supply difference (part 1, part 2, reddit)
  2. Putting PvT Blink Stalkers in perspective (blog)
  3. Finding close/cross position win rates (blog, reddit)

I ended up going on hiatus for quite a while around the beginning of this year, but I have cranked out a few changes recently to highlight as well:

  1. Tags are now directly searchable so you can understand the hierarchy and dig down into specific builds and players
  2. Added spawn positions for players to mark cross and close positions
  3. Started using machine learning to suggest build order tags for replays
  4. Added an easy accept/reject option for rapidly labeling build orders
  5. Drag-and-drop file upload
  6. and lots of other bug fixes, optimizations, and changes

Of course, I have an ask for all of you as well:

  1. Label a few build orders just by accepting or rejecting suggested builds. The archive of replays is only as good as its searchability, and build orders still require human expert knowledge
  2. Fill out a survey about your experience with Spawning Tool. I would love to know where to take the site from here

Thanks to everyone in the community for their support. Specifically, I would like to mention GraylinKim (creator of the sc2reader library), dsjoerg (creator of ggtracker), and ChanmanV (host of so many shows) for all of their help in getting Spawning Tool this far. I look forward to seeing what else we can do in the next year!

A better way to add build orders

In my mind, the heart of Spawning Tool is the extracted, readable build orders. To get to that point, however, there are a lot of replays to sort through, and I think that’s where the tagging system becomes valuable. Many of the tags are auto-generated from replay data or extrapolated from past data. The biggest area still requiring human analysis, however, is labeling build orders, which, judging from the front page, hasn’t been well-distributed in the community.

And I admit that the experience so far sucked. You had to find a replay on your own, and it was at least 4 or 5 clicks to punch in a build order. Hopefully, however, it’s a lot easier now with the new system for labeling build orders. There are a few parts to this.

(Screenshot: the build order labeler)

First, you can now approve and reject suggested tags with one click. Previously, it was hard to know what the taxonomy of build orders was, and tagging took too many steps. Now, there are a few thousand procedurally-generated suggested tags for you to review. You can take a look at the build order and hit “yes” or “no” to determine whether the tag is appropriate.

Second, the interface to tag replays is up top on this page. Previously, it was hidden at the bottom of the page on the sidebar and took a click to open up. Now, you’re automatically focused into the box so you can add tags immediately on page load without having to click or scroll anywhere. Hopefully, you can pair up the action of approving or rejecting a suggested tag with more detailed tags on top of that.

Finally, you now see isolated build orders and can browse from replay to replay. Previously, you had to bounce back and forth from the browse page (or open 10 tabs at once like me) to tag several replays in a row. In the build order labeling pages, you can jump from random build to build and stay on a roll.

The link to the build order labeler is on the front page, so you can hop straight into that and check out the world of actual in-game builds. Remember to log in as well so that your tags are associated with you and counted on the leaderboard.

One more thing: my hope is that labeling build orders can become more and more automatic (though still with some human intervention). Machine learning is in the works and can generate suggestions, but it will only improve with more hand-labeled training data. I’m not sure where the tipping point is, but I’m excited to get to a point where that can take off on its own!
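For the technically curious, a rough sketch of the kind of model I mean is below: one binary classifier per build order tag, trained on hand-labeled replays. The bag-of-units features and scikit-learn setup are illustrative assumptions, not the exact Spawning Tool pipeline.

```python
# An illustrative per-tag suggestion model: one binary classifier per
# build order tag, trained on hand-labeled replays. The features here
# are made up for the example, not the actual Spawning Tool pipeline.
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Each replay becomes a bag of early-game counts, e.g. everything a
# player built before the 6-minute mark.
train_features = [
    {"Barracks": 1, "Reaper": 1, "CommandCenter": 2},
    {"Barracks": 3, "Marine": 7, "CommandCenter": 1},
    {"Barracks": 1, "Reaper": 1, "Factory": 1, "CommandCenter": 2},
    {"Barracks": 2, "Marine": 4, "Bunker": 1, "CommandCenter": 1},
]
train_labels = [1, 0, 1, 0]  # 1 = hand-labeled as "Reaper Expand"

model = make_pipeline(DictVectorizer(), LogisticRegression())
model.fit(train_features, train_labels)

# Only surface a suggestion when the model is confident enough for a
# human to quickly accept or reject it.
probability = model.predict_proba([{"Barracks": 1, "Reaper": 1,
                                    "CommandCenter": 2}])[0][1]
if probability > 0.8:
    print("Suggest tag: Reaper Expand")
```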

Close/Cross position win rates on different maps

I just posted this on reddit, though it’s reproduced below. If you have any comments, please do so on reddit where you’re likely to get better discussion than you would here!


Often, commentators will mention how certain maps favor certain races. I figured I would take a look at actual win rates to see how true those differences are.

                       PvT               PvZ                TvZ
Whirlwind
  Close Positions      52-57 (47.71%)    76-75 (50.33%)     52-77 (40.31%)
  Cross Positions      25-26 (49.02%)    31-41 (43.06%)     36-39 (48.00%)
  Total                81-89 (47.65%)    118-127 (48.16%)   98-123 (44.34%)
Star Station
  Close Positions      4-4 (50.00%)      5-5 (50.00%)       6-7 (46.15%)
  Cross Positions      36-37 (49.32%)    43-63 (40.57%)     91-70 (56.52%)
  Total                43-44 (49.43%)    50-73 (40.65%)     106-82 (56.38%)
Frost
  Close Positions      24-18 (57.14%)    29-27 (51.79%)     19-21 (47.50%)
  Cross Positions      12-10 (54.55%)    17-11 (60.71%)     7-7 (50.00%)
  Total                37-28 (56.92%)    49-41 (54.44%)     27-31 (46.55%)
Alterzim Stronghold
  Close Positions      8-9 (47.06%)      17-8 (68.00%)      3-3 (50.00%)
  Cross Positions      1-6 (14.29%)      7-2 (77.78%)       3-2 (60.00%)
  Total                9-15 (37.50%)     24-10 (70.59%)     6-5 (54.55%)

Methods and Discussion

For each replay, the map is divided into a 3×3 grid, and each cell is assigned a clock position (11, 12, 1, 3, 5, 6, 7, 9). The starting building (CC, Hatch, Nexus) position for each player is recorded. With those, cross positions are all locations that don’t share either a column or row, leaving 3 cross positions for each starting location. For example, 11 is cross from 3, 5, and 6.
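A minimal sketch of that grid logic is below. The coordinate handling (map bounds, y-axis direction) is simplified and hypothetical; the real code pulls start locations out of the replay during parsing.

```python
# A minimal sketch of the 3x3 grid classification; map bounds and
# start coordinates are simplified stand-ins for what the replay
# parser actually provides.

def grid_cell(x, y, map_width, map_height):
    """Bucket a start location into a (column, row) cell of a 3x3 grid."""
    col = min(3 * x // map_width, 2)
    row = min(3 * y // map_height, 2)
    return col, row

def is_cross(start1, start2, map_width, map_height):
    """Cross positions share neither a column nor a row of the grid."""
    col1, row1 = grid_cell(*start1, map_width, map_height)
    col2, row2 = grid_cell(*start2, map_width, map_height)
    return col1 != col2 and row1 != row2

# On a 160x160 map: 11 o'clock (top-left) is cross from 5 o'clock
# (bottom-right) but close to 1 o'clock (top-right).
print(is_cross((20, 140), (140, 20), 160, 160))   # True: 11 vs 5
print(is_cross((20, 140), (140, 140), 160, 160))  # False: 11 vs 1
```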

The replays are mostly from released tournament replay packs uploaded to Spawning Tool. Unfortunately, the biggest source of professional games is WCS, and they haven’t released replays for 2013 season 3 or 2014 season 1 (though I’m excited to redo these numbers after they do!). Because of that, we don’t have as many examples from newer maps.

Maps are collapsed across the different versions (e.g. Frost and Frost LE are counted together). Star Station was changed to a 2 player map at some point, and Alterzim Stronghold is relatively new. For the other maps, close positions are twice as likely as cross positions, so that’s the difference in counts.

A confounding factor here is bans. Since players in tournaments can ban maps where they don’t have favorable matchups, we have a biased sample on these maps. I don’t really have any thoughts on how to correct for that.

The cross/close position data is available on Spawning Tool (though it does require sticking &tag=1173 or &tag=1172 onto the URL for it to work in the research tool), so I welcome you to poke around with the data there to see if you can find anything else. Also let me know if there is anything else you’re interested in that you think could be informed by replay analysis!

Spawning Tool update: UI tweaks

A few weeks ago, I sent out a survey (still open here if you want to fill it out) about how users use Spawning Tool and what they were interested in seeing in future development. Thanks to the feedback there, I have made quite a few changes recently. There are a few big ones I want to talk about in more detail in future posts, but here’s a list of some of the smaller ones.

First, the browse replays page now shows the names of tagged players. This happens to be one of the most important pieces of information to see at a glance, and it doesn’t clutter the interface. I would have liked to show the map name as well, but the poor standardization in map names would be messy, and you’re better off using the hierarchy from the tag filters.

Second, I slapped race icons around on the site. One totally valid criticism of Spawning Tool is that it lacks any visuals. I’m not great with either visuals or data visualization, so I largely depend on text and numbers to convey things. I’m open to other suggestions on visuals as well.

Third, I opened up tag pages for all users. I was previously using this just as an administrator tool, but it’s a handy dashboard around a player or build order. Currently, it contains the list of replays tagged and the parents and children of the tag so you can see the hierarchy that exists behind the scenes. I’m a little scared of fleshing out the page too much since generating content is time-consuming and would probably look a lot like liquipedia content, but if you have any ideas on useful things for this page, I’m open to suggestions.

Fourth, there have been various tweaks to the research pages, which were largely inspired by my own annoyances in using them. You can now filter by build orders for each player, and the View Win Rates page has more data so you can read things off more easily. I think I beefed up the advanced research page as well, but you should still consider it “under construction”.

Fifth, you can now drag-and-drop .SC2Replay files onto any page (other than the upload page) to instantly upload your replays. A common use case I see for replay sharing is getting feedback from others, and I wanted to make it as painless as possible for someone to share a replay and the build orders.

Those are the minor but not trivial updates. Look for updates soon on other features, and send along any feedback on these or other proposed changes for Spawning Tool.

Stats for WCS AM/EU semifinals

It’s past 3 AM here, and over the past 6 hours or so, I have been cranking on a few minor features for Spawning Tool, but primarily on machine learning to label build orders. The model is not very well-trained at the moment, but it got to 61% accuracy on Reaper Expands, so it’s better than a coin flip. More importantly, the code ran to completion! I’ll write more about that soon.

In the meantime, however, I think I might be sleeping in tomorrow, so I thought I would publish stats before heading to bed. Enjoy the semifinals tomorrow!

MC (P) v jjakji (T)
1. MC beat jjakji 3-1 at IEM Sao Paulo with surprises everywhere. He opened Blink, Phoenix, DTs, and Robo
http://spawningtool.com/research/?p=1&after_time=&before_time=&after_played_on=&before_played_on=&p1-race=&p1-tag=106&p2-race=&p2-tag=475
2. jjakji has gone for a Bio Mine composition in 10 recent TvPs. Expect to see more of it
http://spawningtool.com/research/?p=1&after_time=&before_time=&after_played_on=1%2F1%2F14&before_played_on=&p1-race=&p1-tag=475&p1-tag=132&p2-race=1&p2-tag=

MMA (T) v San (P)
1. San loves Templar. Before the 25-minute mark, he casts ~2.6x as many Storms as the PvT average, and Ghost usage is also up to compensate
http://spawningtool.com/research/abilities/?after_time=&before_played_on=&p2-race=2&p1-race=&p1-tag=286&before_time=&after_played_on=&p2-tag=&el-after_time=&el-before_time=25
http://spawningtool.com/research/abilities/?after_time=&before_played_on=&p2-race=2&p1-race=1&p1-tag=&before_time=&after_played_on=&p2-tag=&el-after_time=&el-before_time=25

Alicia (P) v HyuN (Z)
1. In 18 games, Alicia has never played a PvZ shorter than 12 minutes.
http://spawningtool.com/research/winrates/?after_time=&before_played_on=&p2-tag=&p1-race=&p1-tag=285&p2-race=3&before_time=&after_played_on=
2. Unlike his ZvT, HyuN doesn’t care how long a ZvP lasts: his win rates are always about the same
http://spawningtool.com/research/winrates/?after_time=&before_played_on=&p2-tag=&p1-race=&p1-tag=82&p2-race=1&before_time=&after_played_on=

Revival (Z) v Oz (P)
1. Oz does a lot of Forge Fast Expands (which are less popular than Nexus First and 1 Gate Expand builds) and at a lot of different timings
http://spawningtool.com/research/tags/?after_time=&before_played_on=&p2-tag=&p1-race=&p1-tag=380&p2-race=3&before_time=&after_played_on=

Stats for WCS AM/EU quarterfinals day 2

I’m a little late here, but here are some numbers:

Snute (Z) v MMA (T)
1. MMA beat Snute in the ATC Season 2 Finals http://spawningtool.com/7773/
2. Snute opened 15 Hatch, 16 Pool in all 10 ZvTs in 2014. However, he has also gone for Roach aggression, Swarm Hosts, and Ultralisks out of it http://spawningtool.com/research/tags/?after_time=&before_played_on=&p2-tag=&p1-race=&p1-tag=58&p2-race=2&before_time=&after_played_on=1%2F1%2F14

San (P) v Welmu (P)
1. I didn’t find much of interest for this matchup. In 2014, though, Welmu has at least 26 San replays to study
http://spawningtool.com/research/?p=1&after_time=&before_time=&after_played_on=1%2F1%2F14&before_played_on=&p1-race=&p1-tag=286&p2-race=1&p2-tag=

Polt (T) v Revival (Z)
1. Revival plays very long ZvTs. 8/11 (73%) went longer than 20 minutes compared to 45% of ZvTs globally
http://spawningtool.com/research/winrates/?after_time=&before_played_on=&p2-tag=&p1-race=&p1-tag=374&p2-race=2&before_time=&after_played_on=
2. Polt also tends to go long, playing 19 / 35 (54%) of his games over 20 minutes. This series could take a while
http://spawningtool.com/research/winrates/?after_time=&before_played_on=&p2-tag=&p1-race=&p1-tag=57&p2-race=3&before_time=&after_played_on=

Oz (P) v Arthur (P)
There aren’t many replays for these players other than Oz v sOs at IEM Katowice
http://spawningtool.com/research/?p=1&after_time=&before_time=&after_played_on=&before_played_on=&p1-race=&p1-tag=380&p2-race=1&p2-tag=