I just deployed one of the most exciting Spawning Tool updates ever. Well, most exciting to me, that is: hierarchical tags. Previously, a few heuristics tagged replays automatically on upload, but every other category and tag on Spawning Tool had to be entered by hand. Now, there is structure behind the tags, so applying a single tag also applies the whole tree of tags above it. This is probably best explained by example.
The primary use case for this feature is build orders. Let’s say you open “15 Hatch, 17 Pool”. That itself is a tag. This build, however, is also a “Hatch First” build as well as a “Fast Expand” build. Previously, you would need to enter all 3 of these tags to get the replay to appear under all 3 types, and if you uploaded another “15 Hatch, 17 Pool”, you would have to do all of that work again. Now, the backend knows that “15 Hatch, 17 Pool” is a type of “Hatch First” build, so it will implicitly label the build for browsing. Check out http://spawningtool.com/replays/?tag=309 to see it in action.
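For the curious, here is a minimal sketch of the idea in Python. It is not the actual Spawning Tool code: the `PARENTS` mapping and the `expand_tags` function are made up for illustration, and I'm assuming for the example that “Hatch First” is itself a child of “Fast Expand”.

```python
# Hypothetical parent links: each tag maps to its direct parents.
# The exact shape of the real hierarchy is an assumption here.
PARENTS = {
    "15 Hatch, 17 Pool": ["Hatch First"],
    "Hatch First": ["Fast Expand"],
}


def expand_tags(tags, parents=PARENTS):
    """Return the given tags plus every ancestor reachable through parent links."""
    expanded = set()
    stack = list(tags)
    while stack:
        tag = stack.pop()
        if tag in expanded:
            continue  # already handled, also guards against cycles
        expanded.add(tag)
        stack.extend(parents.get(tag, []))
    return expanded


if __name__ == "__main__":
    # Tagging a replay with one build order pulls in the whole chain above it.
    print(sorted(expand_tags(["15 Hatch, 17 Pool"])))
```

Entering the one specific tag is enough; the broader ones come along for free.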
I think this is the most obvious use case, but I have 2 more examples to share off the top of my head. The first is labeling all replays out of a particular league. Currently, we have replays from DreamHack Valencia and DreamHack Summer in the system, but it’s difficult to aggregate results across both events. Now, both of those tags have “League: DreamHack” as a parent, so you can see all of them at http://spawningtool.com/replays/?tag=469. Or if you’re a TL groupie, players for TL are now “children” of the “Team Liquid” tag, visible at http://spawningtool.com/replays/?tag=470.
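Browsing by a parent tag is the same trick in the other direction: collect the tag and everything below it, then show any replay carrying one of those tags. Again, this is only a sketch under my own assumptions. The `CHILDREN` mapping, the replay names, and both functions are hypothetical and not the real backend.

```python
# Hypothetical child links and replay data, for illustration only.
CHILDREN = {
    "League: DreamHack": ["DreamHack Valencia", "DreamHack Summer"],
}

REPLAYS = {
    "replay-1": {"DreamHack Valencia"},
    "replay-2": {"DreamHack Summer"},
    "replay-3": {"Team Liquid"},
}


def descendants(tag, children=CHILDREN):
    """Return the tag itself plus every tag below it in the hierarchy."""
    found = {tag}
    stack = [tag]
    while stack:
        for child in children.get(stack.pop(), []):
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


def replays_for(tag, replays=REPLAYS, children=CHILDREN):
    """Return names of replays tagged with the tag or any of its descendants."""
    wanted = descendants(tag, children)
    return sorted(name for name, tags in replays.items() if tags & wanted)


if __name__ == "__main__":
    print(replays_for("League: DreamHack"))
```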
So I’m really excited about this because:
- It reduces a lot of manual labor. Getting good, clean, labeled data is hard, and this will bootstrap that process.
- It adds tremendous richness to the data. Along the same lines, we can take a small amount of work and generate a lot of tags to browse through.
- It’s technically cool. I’m proud of the implementation*, and I think that hierarchical tags (with categories) are one of the best, most flexible ways to characterize data in general.
Of course, it isn’t completely free. It only works once the hierarchy is defined, and that still requires legwork (though far less than before). If you’re interested in helping out with that, please let me know (@spawningtool or email spawningtool@gmail.com), and I can hook you up with the additional permissions you’ll need.
* There are DEFINITELY bugs. Email me if you find any.