Tuesday, December 31, 2013

The Sublime Art of BPO Fu

I don't know how many readers out there have been ambitious enough to dig into the EVE Data Dump for more than just typeID/typeName conversions.  There is a ton of data in there, and traversing it can be a real trip for the uninitiated.

One segment that has always proven personally daunting is trying to scrape out BPO data.  Though the queries look easy at first, they quickly become cumbersome once you try to handle all the data you actually care about.  Some of the linking you probably want:
  • T2 Products: what does the invention step cost?
  • T2 Products: what is the source BPO?
  • T2 Products: what decryptor(s) should I use?
  • T2 Products: what does the default run count yield?
  • Can I build the sub-components (capital, T2, T3)?
  • What is the product's group/category?
Truly, the "basic" functionality of a materials list is pretty easy to put together.  Simply combine the "base materials" and "extra materials" queries and apply the appropriate math.  Most of the pain comes around T2 blueprints.  So much of the accounting for T2 is interdependent.  Also, since T2 BPOs exist, the attributes are a little screwy when trying to account for the two production paths.

My buddy Valkrr had a pretty decent xml/JSON tool that took care of all the little nuances, but it lacked a standalone updater.  Since he has quit EVE for the foreseeable future, I was SOL for keeping his utility current as the SDE updates.

BPO_Builder.py command-line script
You will need to download the script + scraper.ini to parse FuzzySteve's MySQL releases.  For now, it requires whatever SDE you want to scrape to be hosted locally.  There are no special options; it just dumps the result files for use elsewhere.  In a future release I'd like to internalize it into the consuming app so it can refresh the values at launch time.

The Goal: A Formatted File Of BPO Data

The idea here is you could crawl along BPOs (or their products) and get all the data you'd care to know in one return.  Invention info, dependent builds, linked blueprints, math data... the only thing missing is required skills for the builds (filtered out).

As of this post, the script returns an XML file formatted similarly to the sample above.  I am working to push out a JSON version for a lighter-weight release.  The idea here is this script can be run once-per-release to give you a standardized view of all the BPO data you could need for a program.  I'm personally pushing this to replace my gawd-awful Perl kit builder, which stored much of this data manually.
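Consuming the dump should be a one-liner parse plus a walk.  To be clear, the element and attribute names in this sketch are assumptions of mine, not the script's actual schema; check a real output file before relying on them.

```python
import xml.etree.ElementTree as ET

# Hypothetical output shape -- the tag/attribute names here are assumed,
# not taken from the script's real schema.
sample = """
<blueprints>
  <blueprint typeID="787" typeName="Rifter Blueprint">
    <product typeID="587" groupID="25"/>
    <material typeID="34" quantity="22222"/>
  </blueprint>
</blueprints>
"""

root = ET.fromstring(sample)
for bp in root.iter("blueprint"):
    mats = {m.get("typeID"): int(m.get("quantity"))
            for m in bp.iter("material")}
    print(bp.get("typeName"), mats)  # → Rifter Blueprint {'34': 22222}
```

The point is that once the cross-linking pain is paid for up front by the scraper, the consuming program only ever does flat lookups like this.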

The Pain of the SDE

If you are completely lacking in SQL-fu, the SDE can be really obnoxious to traverse.  Thankfully, FuzzySteve is incredibly easy to get a hold of and is an immense help in those circumstances when you're just stuck.  He was instrumental in helping with the T1/T2 BPO mapping.  

The absolute worst part of dealing with the Data Dump is the handful of little pitfalls scattered throughout it.  The Data Dump is littered with bugs, and they've been there for a very long time.  Some bugs I ran into:
  • Meta Level (dgmTypeAttributes.attributeID=633) is split between valueInt/valueFloat
    • fixed with COALESCE(dgmTypeAttributes.valueInt,dgmTypeAttributes.valueFloat,0)
    • Though metaLevel is never a float in-game, someone at CCP flubbed about 20% of the items into the wrong column
  • Capital T2 rigs (Anti-EM/Anti-Explosive armor rigs) report the wrong metaGroupID in invmetatypes
    • Reported bug
    • Added manual repair to my tool 
  • Mercoxit Mining Crystal I lacks a metaLevel (unlike the other mining crystals)
    • Reported bug
    • Added manual repair
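The COALESCE fix above is worth seeing in action.  This is a self-contained sketch: the table, column, and attributeID are the real SDE ones, but the rows are made-up samples showing the three cases you hit (valueInt populated, valueFloat populated, neither populated).

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE dgmTypeAttributes (typeID INT, attributeID INT,
    valueInt INT, valueFloat REAL);
-- attributeID 633 = metaLevel.  Note the value lands in different columns:
INSERT INTO dgmTypeAttributes VALUES (100, 633, 5, NULL);    -- the "right" column
INSERT INTO dgmTypeAttributes VALUES (200, 633, NULL, 5.0);  -- the flubbed column
INSERT INTO dgmTypeAttributes VALUES (300, 633, NULL, NULL); -- neither (default 0)
""")

# COALESCE returns the first non-NULL argument, so one query covers all cases
rows = db.execute("""
    SELECT typeID, COALESCE(valueInt, valueFloat, 0) AS metaLevel
    FROM dgmTypeAttributes
    WHERE attributeID = 633
""").fetchall()
print(rows)  # → [(100, 5), (200, 5.0), (300, 0)]
```

Without the COALESCE, a query against valueInt alone silently drops roughly a fifth of the items.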

What's Next

This will enable two goals of mine.  First, to further crunch scraper data to answer questions like "How much Tritanium was destroyed?"  Second, to move away from spreadsheets toward more sustainable apps.  I'd still like to do something to hook into Google Spreadsheets, purely for ease of sharing, but trying to build a similar tool for price searching is unsustainable.

Go ahead and give my script a whirl.  Tell me if there's anything you need added to it.  The dump should be quick and easy to use.  It's a bit larger than I expected, but slurping in XML/JSON should be pretty easy to handle.