Tuesday, December 31, 2013

The Sublime Art of BPO Fu

I don't know how many readers out there have been ambitious enough to dig into the EVE Data Dump for more than just typeID/typeName conversions.  There is a ton of data in there, and traversing it can be a real trip for the uninitiated.

One segment that has always proven personally daunting has been trying to scrape out BPO data.  Though the queries look easy to start, they quickly become cumbersome trying to handle all the data you actually care about.  Some of the linking you probably want:
  • T2 Products: what does the invention step cost?
  • T2 Products: what is the source BPO?
  • T2 Products: what decryptor(s) should I use?
  • T2 Products: default runs yield?
  • Can I build the sub-components (capital, T2, T3)?
  • What is the product's group/category?
Truly, the "basic" functionality of a materials list is pretty easy to put together.  Simply combine the "base materials" and "extra materials" queries and apply the appropriate math.  Most of the pain comes around T2 blueprints.  So much of the accounting for T2 is interdependent.  Also, since T2 BPOs exist, the attributes are a little screwy when trying to account for the two production paths.

My buddy Valkrr had a pretty decent XML/JSON tool that took care of all the little nuances, but it lacked a standalone updater.  Since he has quit EVE for the foreseeable future, I was SOL unless I recreated his utility to keep up as the SDE updates.

BPO_Builder.py command-line script
You will need to download the script + scraper.ini to parse FuzzySteve's MySQL releases.  As of now it requires whatever SDE you want to scrape to be mounted locally.  There are no special options; it just dumps the result files for use elsewhere.  In a future release I'd like to internalize it to the app in question so it can refresh the values at launch time.

The Goal: A Formatted File Of BPO Data


The idea here is you could crawl along BPOs (or their products) and get all the data you'd care to know in one return.  Invention info, dependent builds, linked blueprints, math data... the only thing missing is required skills for the builds (filtered out).

As of this post, the script returns an XML file similarly formatted as above.  I am working to push out a JSON version for lighter-weight release.  The idea here is this script can be run once-per-release to give you a standardized view of all the BPO data you could need for a program.  I'm personally pushing this to replace my gawd awful Perl kit builder, that stored much of this data manually.

The Pain of the SDE

If you are completely lacking in SQL-fu, the SDE can be really obnoxious to traverse.  Thankfully, FuzzySteve is incredibly easy to get a hold of and is an immense help in those circumstances when you're just stuck.  He was instrumental in helping with the T1/T2 BPO mapping.  

The absolute worst part of dealing with the Data Dump is the few little pitfalls out there.  The Data Dump is littered with bugs, and they've been there for a very long time.  Some bugs I ran into:
  • Meta Level (dgmTypeAttributes.attributeID=633) is split between valueInt/valueFloat
    • fixed with COALESCE(dgmTypeAttributes.valueInt,dgmTypeAttributes.valueFloat,0)
    • Though metaLevel is never a float in-game, someone at CCP flubbed about 20% of the items into the wrong type
  • Capital T2 rigs (Anti-EM/Anti-Explosive armor rigs) report the wrong metaGroupID in invmetatypes
    • Reported bug
    • Added manual repair to my tool 
  • Mercoxit Mining Crystal I lacking a metaLevel (unlike other mining crystals)
    • Reported bug
    • Added manual repair
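
The COALESCE fix for the split metaLevel columns can be sketched in a few lines.  This is only an illustration, not the code from BPO_Builder.py: it uses an in-memory SQLite table as a stand-in for the MySQL release, the typeIDs and values are made up, and the `meta_levels` helper is a hypothetical name.  Only the table, the column names, and attributeID 633 come from the notes above.

```python
import sqlite3


def meta_levels(conn):
    """Return {typeID: metaLevel}, reading whichever value column is populated."""
    rows = conn.execute(
        """
        SELECT typeID,
               COALESCE(valueInt, valueFloat, 0) AS metaLevel
        FROM dgmTypeAttributes
        WHERE attributeID = 633
        """
    ).fetchall()
    # metaLevel is never fractional in-game, so casting to int is safe here
    return {type_id: int(level) for type_id, level in rows}


# Stand-in table with made-up rows (assumption: real data comes from the SDE)
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dgmTypeAttributes ("
    "typeID INTEGER, attributeID INTEGER, valueInt INTEGER, valueFloat REAL)"
)
conn.executemany(
    "INSERT INTO dgmTypeAttributes VALUES (?, ?, ?, ?)",
    [
        (1001, 633, 5, None),     # metaLevel stored in valueInt
        (1002, 633, None, 5.0),   # metaLevel stored in valueFloat
        (1003, 633, None, None),  # neither column set, falls back to 0
        (1004, 182, 3, None),     # a different attribute, filtered out
    ],
)

levels = meta_levels(conn)
print(levels)
```

COALESCE returns its first non-NULL argument, so the same query shape should work against the MySQL dump as well.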

What's Next

This will enable two goals of mine.  First, to be able to further crunch scraper data to answer questions like "How much Tritanium was destroyed".  Second, to enable me to move away from spreadsheets to more sustainable apps.  I'd still like to do something to hook into google spreadsheets, only because of ease of sharing, but trying to build a similar tool for price searching is unsustainable.
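
As a rough sketch of the "How much Tritanium was destroyed" question: once the BPO dump gives a materials list per product, the crunch is just loss counts multiplied by materials.  Everything below is hypothetical (the hull names, the quantities, and the `minerals_destroyed` helper); real numbers would come from the dump and the kill scraper.

```python
from collections import Counter


def minerals_destroyed(losses, materials):
    """Sum the build materials of every destroyed item.

    losses:    {type name: number destroyed}
    materials: {type name: {mineral name: quantity per unit}}
    Items with no known materials list are skipped.
    """
    totals = Counter()
    for type_name, count in losses.items():
        for mineral, qty in materials.get(type_name, {}).items():
            totals[mineral] += qty * count
    return totals


# Placeholder data, not real EVE build costs
materials = {
    "Example Frigate": {"Tritanium": 1000, "Pyerite": 250},
    "Example Cruiser": {"Tritanium": 20000, "Mexallon": 500},
}
losses = {"Example Frigate": 3, "Example Cruiser": 2, "Unknown Hull": 1}

destroyed = minerals_destroyed(losses, materials)
print(destroyed["Tritanium"])
```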

Go ahead and give my script a whirl.  Tell me if there's anything you need added to it.  The dump should be quick and easy to use.  It's a bit larger than I expected, but slurping in XML/JSON should be pretty easy to handle.

Friday, December 20, 2013

Promoting Learning

November and December have been a real shift in my playstyle and priorities.  Though I am still chipping away at my old goals of Conquering the T2 Manufacturing Sphere, Aideron Robotics has given me fresh new goals with some much younger players.  Though I am not participating in very much PVP, the bits I get to do are fun, and my work enables the corp to reach above its weight class.

As I said before, my training contributions haven't been very helpful lately.  The summer pushed me into the 1% echelons, and what I personally find helpful these days is not applicable to the general players just looking to get their careers off the ground.

As a half-joke, I started streaming my EVE playing.  Though I don't use the in-client tool (doesn't do anything for multiboxing), I can now show off exactly HOW I do what I do.  Personally, I think this is kind of dull to share, but I've been getting a ton of positive feedback about the streams.

Guides requested in streams:

  • How to POS?  Speccing a POS, what mods in what quantities?
  • How to spreadsheet?  Your sheets are awesome, I want!
  • How to pick a product?
  • Where to sell a product?
  • How can industry help a corp?
These are all great questions, but each is a pretty big topic.  Though I'm liking the response my infographics have been receiving, they aren't always the right tool for the job.  I tried to work on a T2 production flowchart to explain how you can optimize research/manufacturing in parallel, but I couldn't make it pretty.  

Instead, I have a few people whispering in my ears to do some youtube things.  And though I would love to do a series of classes, rather than the 1-hr rant fests that most game youtube channels tend to be, the time investment is pretty steep.  I will be talking to some of my buddies to see if I can get a video editor at the least so I can offload the part I just have no time for.

And because I've been neglecting my #graphpr0n duties



Since I released my Booster Use Infographic (updated version here) I wanted to see if I had any sort of impact on the game.  Though the post-release spike is pretty considerable, it's hard to isolate the reasons.  One of the Yule Lads handouts was a Synth booster, and I would need better data on fights/deaths to really dig into the impact.  Though it's pretty convincing to say here that it's not exactly seasonal.  It's refreshing to see demand growing over the last year, but only time will tell.

Saturday, December 14, 2013

Experimenting with Streaming

Since EVE pushed twitch.tv integration into the game, I figured I would give it a try more as a joke than anything.  Thanks to +Poetic Stanziel finding my post and pimping my feed, I had nearly 20 people in the stream watching as I do my regular shopping trip at the start of my production cycle.

I found it pretty enjoyable.  I went ahead and explained my process to those who were watching, stepping through each of the tools, why they are set up the way they are, and how I manage my projects.  I do feel a little nervous/silly talking to the void with no direct feedback, but it was fun.  I know this is EVE and we shouldn't share our advantages, but I enjoy helping the others out there because it's in my best interest to have more educated peers.

I've joked in the past that I should make a youtube series about spreadsheets and industry.  I just had a hard time imagining how many people want to watch someone click through spreadsheets, but there seems to be a bit of demand.  Producing videos for youtube is a bit more work than I have time for right now, but streaming my work isn't.  I'd like to put together something semi-regular, like a class series, or at least a live walkthrough of what I do and why I do it.


I invite everyone to join me at 1 PM MST (2000 in game I believe) on Sunday Dec 15 as I walk through my prebuild steps.  I will see about getting the stream cached somewhere (hopefully on twitch) for those that can't make it.  I'm hoping to start the stream about 1hr before that so people can gather and ask questions in chat (I will also accept questions here on the blog and on twitter) and hopefully it will run 1-2hrs and we'll all be enriched.

Stream: twitch.tv/hlibindustry
Twitter: HLIBindustry
Time: 2000 UTC Sunday Dec 15

Hope to see you there.  Invite your friends.  Please don't wardec/gank me into oblivion :)

Wednesday, December 11, 2013

You Don't Understand Risk

If there's a topic that drives me crazy fastest, it's talking about risk.  EVE is loaded with all sorts of risks and some pretty serious penalties for those foolish enough to ignore them.  As such, everyone has their own little rules about avoiding risk... though I believe a lot of them are completely useless or overly paranoid.

It boils down to a general ignorance when it comes to probability and statistics.  Everyone is obsessed with the lottery odds.  Wins are worth any cost, even if the odds are low.  Losses must be absolutely avoided, even if the odds are equally low.  MMO players will farm the loot-treadmill for a 1% drop, but won't use something that could have a 15% penalty.

This has been the major block to trying to push Combat Boosters on my friends.  They are still hung up on the penalties and are blinded to the rewards.  I went ahead and expanded the infographic to include a breakdown of the actual odds.  This is a bit of a draft, I don't like the new graph colors yet, and I need to push a fully updated version with the edits from Reddit.


The point I wanted to illustrate here was how low the chances of penalty really are, especially for Standard boosters.  With no skills, the odds are 80% that you will incur 1 or fewer penalties.  Neurotoxin Recovery 5 pushes that figure to 89%.  The odds of actually rolling all 4 are astronomically low in comparison.  In my opinion, if you're flying a ship that should use boosters and aren't, you're losing out on a big opportunity!  This is the same point I make about invention accounting: if you understand the probabilities, you'd be foolish not to participate.  These mechanics are driven by a random number generator (RNG), and most players shy away since they aren't in complete control.
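
For anyone who wants to check the odds themselves, here is a minimal sketch of the math.  It rests on assumptions that are not spelled out above: a Standard booster has 4 possible penalties, each rolled independently at a 20% base chance, and Neurotoxin Recovery cuts that chance by 5% per level.  With those inputs, a plain binomial calculation lands close to the figures quoted above.

```python
from math import comb


def chance_at_most(k, n_penalties=4, base_chance=0.20,
                   skill_level=0, reduction_per_level=0.05):
    """Probability of rolling k or fewer penalties on a single booster.

    Assumes every penalty is an independent roll (binomial model).
    The default values are assumptions, not confirmed game constants.
    """
    p = base_chance * (1 - reduction_per_level * skill_level)
    return sum(
        comb(n_penalties, i) * p**i * (1 - p) ** (n_penalties - i)
        for i in range(k + 1)
    )


no_skills = chance_at_most(1)                # 0 or 1 penalties, untrained
max_skill = chance_at_most(1, skill_level=5)  # 0 or 1 penalties, skill at 5
all_four = 1 - chance_at_most(3)              # every penalty at once, untrained

print(f"{no_skills:.1%} {max_skill:.1%} {all_four:.2%}")
```

Under these assumptions the untrained case comes out near 82%, the trained case near 89%, and rolling all 4 penalties at well under 1%.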

Risk aversion drives me the most crazy when it comes to flying a freighter.  There are dozens of guides about how to safely use a freighter, and the advice boils down as such:
  1. Don't Use Autopilot.  Fly gate-to-gate.  AFK is death
  2. Don't haul too much.  1B is the standard rule of thumb
  3. Avoid chokepoints.  Inter-hub shipping goes through some high risk chokepoints
  4. Other good ideas:
    • Have insta undocks
    • Have insta docks
    • Have a friendly webber
    • Avoid hauling in high traffic situations
      • Jita approach (3 gates land on 4-4 with the same vector, ganker's paradise)
      • Friday-Saturday prime times
      • Inter-hub routes
  5. If you can't accept the risk, use a courier service
All of these are good guidelines, but don't let fellow freighter pilots know you have broken any of these commandments.  Personally, I contest the 1B cargo rule.  If you are already engaging in risky behavior (high traffic, gank routes) then the 1B rule is totally reasonable.  If you go through Niarja/Madirmilire with more than 1B in unwrapped cargo during prime time, you're probably going to have a bad time.

Personally, I have set up my routes so I can be a little more lax on these rules.  My Jita<-->Base route is pretty low traffic and doesn't go through a high-risk chokepoint.  I use instas to get in and out of stations, and if I am being particularly ballsy (stupid) I have a webber + skirmish mindlink.  

Yes, I am risking a gank.  Yes, I am probably inviting even more risk bringing this up.  But I think it's important to properly assess a situation and use the right tool for the job.  If I have a heavy haul to do on a weekend, I'll usually ping a courier service to just avoid the risk entirely.  If it's a quiet weekday, and I just need something moved, I'll bring the right equipment to protect that investment.  

Being crippled by fear of risk is no way to play the game.  If you follow every single rule and guideline to avoid risk, you'll either never undock, or never actually have any fun.  The guides are there to help people, but no one takes spaceships more seriously than EVE people.

Wednesday, December 4, 2013

Better Piloting Through Chemistry

With all the recent PVP I've been doing in Aideron Robotics, chiefly against Russians in Old Man Gang, we've been faced with a higher class of solo/small-gang PVPer than any of us is used to.  Chiefly, the kind that ALWAYS has fleet boosts, and employs pirate implant sets.  These have made for some very tough nuts to crack, and have made defending Heydieles a real challenge.

Our answer has been to respond, as much as we can, in kind.  I've pushed two of my booster characters into the system with a full suite of fleet boosters.  I've been keeping a steady stream of fitted ships on contract where pilots can quickly grab them down and get back in the fight.  With the recent tide of allies and a generous US holiday, we've been able to turn the system in our favor.  Aideron Robotics has recently taken away OMG's POCOs (#1, #2), and really stomped down Caldari challenge to the system.

This isn't enough, and we've been scrambling for more to answer OMG.  There's no way we're going to push pilots into pirate sets and expect them to beat OMG at their own game, but we can leverage Combat Boosters!

The big problem with boosters though is convincing people to try them.  With the really steep penalties for use, and pain of transport/sale, most pilots completely discount them.  Though, if they are properly utilized, Boosters can be a real force multiplier when used in the right roles/ships.

Unfortunately, there are no really good definitive guides on drug use.  Ripard Teg's Fit of the Week segment usually highlights individual boosters when they make sense in a fit, and there are some written guides explaining how they work in wiki wall-o-text fashion, but there isn't a great go-to guide for them.  To push boosters on our greener members, and keep them active in our FC's minds for utilization, we need something better.

Making Infographics

Recently, Aideron Robotics has been pushing a "Making a Better Pilot" series.  Similar to a lot of TEST/CFC propaganda, cute infographics to try and curb bad behaviors or illustrate less-intuitive piloting ideas.  



Since my moon mining flow chart was so well received, especially with the siphon additions, I figured this was a good chance to fill a need.  Also, I'm eyeing Booster production, but the market throughput is kind of anemic, so I figured we could kill a lot of birds with one stone here:  Increase demand in general, improve Aideron performance in PVP, line the industry wallet, and contribute something to the overall meta (which I have been neglecting for the last month or so).

How It's Made

For the graphically retarded, Google Drawing in Google Docs is an absolute life saver.  Pair the image dumps(link) with some text and a little graph magic and GIMP.  The biggest problem I ran into is I wanted a radial bar graph... and had no easy way to make one.  Seriously... why is this so difficult?

What I Wanted

What I Made

The hope was to have a bit of a gauge to illustrate the various grades + skills combinations.  The hope was to illustrate that, with the appropriate skills, the chances of incurring a truly unacceptable penalty were very low.  Unfortunately, explaining probability to the masses is always a frustrating endeavor.  Though I think I illustrated the reality pretty decently by pairing my chart with a character sheet view.


Released infographics after the break!

Wednesday, November 20, 2013

10,000 Hours

I personally am a firm believer in the '10,000 Hours to Mastery' house of thought.  Not so much in the exact number, but in the idea that "Nothing is achieved without work.  And mastery takes A LOT of work".  This is how I approach both my in-game work and out-of-game projects; I can read and theorize all day, but it won't count for anything without actually doing the work.

Aideron Robotics has made me focus more on this topic lately.  With such a vast sampling of players, many of them in their first 6mo, I've caught myself being an elitist jerk.  Having made almost all the mistakes one can make in science and industry, I have a burning desire to save and shelter my friends from making those same mistakes.  Unfortunately, I'm realizing very quickly, the value of that advice is not appreciated until you've suffered through some of your own mistakes.  I'm also quickly realizing my initial newbro industry advice is a poor cop out.  Watching Manufacturing Confusion's new blog shows me I'm way too entrenched in my position to be very helpful to our newbros.


Snuffing Out the Spark

What's really bringing this reality into sharp focus is the constant ribbing I'm getting on my code.  I got a serious talking to the other day about efficient data structures, and my complete disregard for efficient database design.

This behavior is the bane of my existence when it comes to learning code.  When pulling yourself up by your own bootstraps, there is a ton to learn.  Without direct mentorship, there are a lot of sins that Stack Overflow isn't going to teach you to avoid.  Furthermore, any one facet of code is infinitely deep, and without keeping a keen focus on goals, it's easy to get lost in optimizations and perfecting a piece of code.  Just like writing books, articles, or blogs, there's as much art to saying "good enough" as there is to actually producing good work.

Personally, I've been jumping on the "anyone can code" bandwagon.  After making progress and looking back at the pain, I really believe a lot of the elitism is unwarranted.  Also, the resources out there for learning skills are unprecedented.

When all else fails, Google has been instrumental in helping to bridge the gaps.  Personally, I work very best with ample examples, and it's not hard to find snippets to walk through to add to the tool belt.  Though the very best option would be to work with a team or master who can help you avoid the sins of CS, we're all extremely busy and sometimes DIY is the only way it will ever get done!

Furthermore, an ugly tool is better than a beautiful tool that doesn't work.  For the last two years, I've leveraged gdoc sheets of increasing complexity to get where I am today.  I even still use my super-terrible-perl-kitbuilder to enable my manufacturing lines.  And you can post about how terribad my tools are, but I'm still making progress.  Don't worry if your tool or program or plan isn't perfect, make it crawl, then make it run!

But You're Doing it Wrong!!!

Being on the other side of the coin in-game, I'm starting to see why my input isn't helping anyone.  I could write a thousand guides, record YouTubes, make infographics, and I still won't save most people from pitfalls.  I can only hope to guide them away from the most egregious issues (Mined Minerals Aren't Free, T1 is a sucking hole, etc) and save them the effort to reinvent the wheel.

I will be sitting down with my new corpmates over the following weeks in the hopes of building some less jerky tutorials and help get our youngest newbros fully integrated.  I really feel AIDER has done an excellent job avoiding the traditional PVP grind (tackle until you have enough SP to be useful), and there have to be lower fruit worth picking on the industry/trade side of the coin.

Some sins must be committed to understand the value of another route.  Whether that takes a code refactor (or several) or some ISK is lost, as long as we're mentoring friends to avoid the largest and most painful pitfalls, they can still contribute to the team.

Friday, November 8, 2013

Emergencies Are Expensive

Coming back to Aideron Robotics has been an incredible boon on my desire to play EVE.  Mostly because they are enabling me to try new and exciting parts of the game that I hadn't done before.  The first is the POS Reaction Farm, which is something I've wanted to try for a while.  The second is stocking a market hub.

Though I am shying away from open market orders, I'm embracing fitted ship contracts with both arms open wide.  With our ambitious goal of taking and keeping Heydieles for the Gallente, this seemed like a great opportunity to use my industrial skills as a big force multiplier.

After tinkering with a sheet from another corp member, Vic Vorlon, I was able to really enable the work on the scales I'm used to.  Thanks to the members filling out a table of fits, I am able to task out a JF load with extreme efficiency.  In the last week, I've pushed over 3B in fittings to Heydieles... to say nothing of the shipments for the POS project along with that.





Experimenting with published sheet data.

Challenges

But I'm running into some issues in providing these contracts.  Chief among them is "20% markup?!  You're crazy! #goonfucking".  My chief response is that "Push Button, Receive Bacon" comes at a cost.  If I can't get paid for the time required at rates I think are appropriate, then I will simply stop the practice.  Flying in fleets is way more fun than the JF dance.


Yes, my rate is high.  But I would counter that "emergencies are expensive".  If you're stocked ahead of time, and put your efforts into being prepared, you don't have to pay my margin.  If you're lazy, or too busy focusing on the major corporation goal to stabilize Heydieles, then the mark up should be gladly paid.  Someone else spent the time for you!  I know for a fact that these contracts have enabled local FC's to reship the fleet quickly and flip fights that would have otherwise been lost.  That edge has a cost.

I by no means hold a monopoly, and I am doing my best to tune prices effectively.  The margins on frigates are wider since they are our primary tool, but narrower on the cruisers to prevent price being a reason to avoid an escalation opportunity.  There are at least two or three other guys who can help shoulder the burden, but haven't been able to meet the immediate demand.  If we can get more effective coverage, I am more than happy to bow out to others.  I'd even prefer to not be in the business... it's a lot of cash to leverage, and the payout time is slow.

I hope to have a more complete report of the effort once Heydieles is properly secured.  The hope is that I can wean down stocks after this weekend or next weekend.  But until then: For the Federation!!!

Monday, November 4, 2013

The Universe is a Small Place

I have a particularly terrible loss on my record that my friends in Aideron Robotics will never let me live down.

https://zkillboard.com/detail/23183263/

Due to a big mistake on my part, I did not change my AP route planner from the previous night's PVP roam.  Though I was flying gate-to-gate, I was not paying attention to the route.  Once I realized my mistake, it was too late.  Didn't help that the loot fairy was so damned generous.

I continue to defend that despite such a monumental loss AIDER still managed to meet its obligations to its staff that month and the industry program continued as planned.  So in the scale of fails, though that might have bankrupted anyone else, and caused members to ragequit, we came out largely unscathed as an organization.  Also, it helps that we didn't lose the freighter we have a particular sentiment for.

Fast forward to yesterday.  One of the guys who ganked me, EURIPODES, was in local and struck up a conversation.  After shooting the shit back and forth for a bit, he told me a rather interesting story.  It seems his cut from the gank (1B) was used to get him from being constantly space-poor to instead invest in sustainable income.  Now he's among the space-rich... all thanks to, as I joked, "Unintentional Philanthropy".  He continued to thank me for the leg up, and returned the 1B as karma.

Sometimes this game is amazing.  Though I am not hurting for ISK, I really appreciate the gesture.  It's fun to see something go full circle like that.  Just to get the story is worth the loss at this point.

Thursday, October 31, 2013

Let's Make a Deal

Since industry is mostly about market PVP and making ISK, there is a lot of wheeling and dealing.  Sometimes those deals are bad (and they should feel bad), sometimes those deals are great, and sometimes things shake out generally neutral.  In general, I avoid direct/bulk sales, but I do try to be helpful to friends when I get a chance.

I recently left Paxton Industries and came back to Aideron Robotics.  Half of the reasons are for business/play, the other half is a rant about nullsec that I'll have to indulge later.  I am returning to Aideron Robotics because I know the people, the density of code/tool oriented people is the highest I've ever found, and Marcel Deveroux extended a deal I couldn't refuse: help the corp generate income through a POS reaction farm.  Also, FW is more my pace for ops, even if the ships are smaller than I usually like flying.

Any negotiator will tell you the secret to successful bargaining is to make sure both parties feel like they walk away from the table with a win.  Though there's a small segment of scams that prey on this rule by fooling the victim, you can't fool all the people all the time.  For the vast majority of business, you have to strike a bargain for both parties; very rarely do you get the fortunate position of being a dictator.

Making Corp/Alliance/Coalition Programs

The big pet peeve I had against Paxton was every one of their "deals" was looking to screw the other party.  Their capital program for allies was far more expensive than the open market or any other internal service, their industry and POS programs were all designed to take 100% of proceeds with only a "service guarantees citizenship" nod in return.  Frankly, these practices are not sustainable, and will cripple growth.  Your volumes will be difficult to maintain if you're expecting nearly scam-level returns.  Your personnel churn will be high without rewarding the staff directly.  Worst of all, it leads to complacency among the management and membership because "why do better?"

The truly great programs benefit everyone.  You provide a service that customers need, and you treat your employees well.  The GSOL presentation at EVE Vegas (the best presentation of the convention IMO) showcased the backbone of GSF/CFC's military might: their ability to deliver the hardware where it's needed when it's needed.  GSOL membership is paid, often in PLEX, for their efforts.  They are constantly iterating their tools and practices.  They have a focus on getting the right staff, and doing their damnedest to mitigate burnout.  The CFC can absolutely turn the tide of battle before the first shot is fired, all because GSOL is ready to deliver.



The proposal I brought to Aideron Robotics was "I will manage x towers for 1/3rd of the profits".  In the proposal, I outlined a means to expand and include more members, and the expected limits of our reach.  I'm even fronting half of the set up cost (even though I probably shouldn't) as a means to support my friends.  In this deal, all parties come out with a win.  I probably would have been more cutthroat if I didn't already have a prototype tool made, but things fell into place pretty easily considering.  These are my friends, both in and out of game, and I want to be a force multiplier for them to achieve their goals.

People in Glass Houses

I am not innocent of being a dick when it comes to deals.  I actively run away from direct deals, I dissuade people from selling directly to corp if we don't have a significant need.  So much of this is because Jita represents a gold standard for prices.  I center my projects around Jita, so it would require less-than-Jita prices to make it worth making a substitution.  Either way, one party ends up screwed in the relationship.

I would rather see my friends get full price for their work than prey on their generosity.  The only time it makes sense to me is when someone is already chasing Jita prices locally, so we both avoid a shipping step.  If we are not BOTH making money, I am effectively robbing Peter to pay Paul.  By lowering line-member income, I am causing them to demand more from the corp... and programs may not be strong enough to allow that relationship.

The one position I'm missing here is the communist/socialist WH corp.  This relationship is different, since many of your members will live and die by the corp's stream of goods and equipment in and out of the WH, and the logistics of keeping proceeds individual are just too painful.  Also, it helps that in WH, your wallet balance does you a fat lot of good without access to a market.  But still, this relationship is a win-win on both parties: members get the equipment they need to live and thrive in WH, while the corp gets what it needs to enable that.  Without a feed down to line-members, the WH operation shrivels and dies.

Monday, October 28, 2013

Treading a Fine Line

I had an excellent convo with @EVE_WOLFPACKED and @ChiralityT the other day about the broader topics of delivering an app to the EVE community.  Pair that with the recent Somer Blink scandals, and there has been a lot of noise in the 3rd party service sphere.

I've been sitting on this post for a week now, and may regret publishing it.  But I feel that the backlash from this SomerBlink scandal is quickly reaching a fever pitch, and a lot of players are losing sight of an important pillar of EVE: Making a profit isn't evil.  Though I do not condone ISK-->IRL conversion in any path, the service-->ISK angle has been a long-standing feature of the 3rd party sphere.  And as nice as it would be to be a household name like EVE-Central or zKillboard, I'm going to have a very hard time justifying making a public portal like those if it will put me in the poor house, or I lose all my free time to SEO or ad management.

Open Sources Thanks to Open Sources

Let me be 100% clear, my work is enabled only because others have provided APIs.  I would not have kill data if it weren't for zKillBoard.  I would not have in-game history data if it weren't for EVE-Marketdata.  I would not have order histories if it weren't for EVE-Central providing raw daily dumps.

After talking with the contributors from eve-kill/zKillboard on how to best handle their API for my data, they made it clear that I could not in turn make that data secret.  They state in their TOS:
Using the zKillboard database and API for the purpose of datamining, in an attempt to gain an unfair advantage over corporations and/or alliances, is not allowed
And Squizz was extremely explicit in noting they enforce that rule.  EVEwho exists only because someone had tried to make a private spy network with that data, and Squizz instead published a public version and put him out of business.  

So, for the short term, I am in the uncomfortable position that I have collected this data, but am still lacking a distribution means.  I have tried to be open to those that ask questions about the data, but short of maintaining a SQL dump by hand, I don't have a better method than telling people to scrape their own version.

Gotta Make A Living

If things go according to plan, I will be shooting my own margins in the foot when I release an open version of my tool.  Also, I need to figure out a way to pay for service overhead, and I'd prefer not to plaster my site in ads.  Lastly, if I expect to get people to hand me some sort of compensation, I have to be able to prove my service is worth paying for.

Public Features

I'd like to be able to publish access to all the charts I've been making.  I'd like it if users could browse through items like Eve-Markets, and see:
  • Market candlestick
  • Total volume
  • Buy orders/day
  • Sell orders/day
  • destruction (and access to by-location binning: HS/LS/NS/WH)
  • Build costs
--also--
  • Personal S&I tracking
    • Job Tracking
    • Kit Building
    • Accounting
This should give enough of a view to return the favor of data given.  I'd like to be able to let people do their own market research on my site, and get a look under the hood at how the planning/accounting would work for their own projects.  I have no problem providing data outlays for the big hubs, but the finer the data gets, the more trouble it is to serve for a marginal increase in value.

Paywall Features

Let me be 100% clear.  I am in this business to make a buck.  I'm fine with getting paid in ISK, but there are a ton of manhours to be spent in development, and server space ain't free.  In an effort to subsidize the work and server space, I'd like to offer the full suite of tools for ISK at two tiers: a "corp level" and an "alliance level".  Since the intended audience is organizations and not individuals, I'd like to leave the beefier features behind a paywall:
  • Price predictions
  • Automated production planner (at least suggestions)
  • Org-level accounting: paying contributors, tracking stockpiles
  • Localized market prediction actions:
    • "A lot of people died in this area, market activity should increase"
    • "Ship x item to y for increased margin"
  • Localized reports outside major hubs
Where the "corp" option allows a flat fee for a limited number of active builders tracked, the "alliance level" would be a % of profits (as they are tracked through the system, no fair taxing un-fulfilled plans) without limits.  I would love to provide tastes of these features to the personal account option, but we'll have to see when we get there.  I seriously doubt my prediction tools will be strong enough to provide a "1wk taste, 4wk 'feature'".

The entire strategy here is to share what was fetched from public sources, but protect the unique tools.  I will keep most of the code environment open, but intend to only protect a very small set of features:
  • Machine learning modules
    • Open inputs, but protect trained black-box
  • Automated decision making tools
    • Core feature is to free up S&I managers/staff to play EVE
    • Data streams = open, automated decision making = closed
My hope is that only these two modules would remain "sekret", and though the corp implementation would be behind the paywall, the base code behind accounting, kit building, job tracking, would remain open.  

Monetization is my last priority, and is the very last feature to be implemented.  The goal is to allow the 5% that need the horsepower to subsidize the 95% of casual users.  If that works, I can provide an ad-free clearinghouse for all the data any industry or market player could want.  

What If Paywall Is Verboten?

CCP has not done a good job in making clear what kinds of services are okay and which are not.  With the recent SomerBlink scandal and API EULA drama over the last year, my plans are very close to (or even over) the line of what is allowed.  Furthermore, I am at the mercy of other API providers, and if I run afoul of them, my entire tool goes dark.  So, it is clear I need to have a Plan-B.

First, providing a front-end that distributes the data I'm collecting needs to be a priority.  I may need to rely on ad revenue to pay for upkeep, but I would love to be able to provide that service to the general EVE playerbase.  

If I can't sell the automated/corp S&I tools, then they will remain private.  If the headache of trying to get my team paid for their work becomes too much, or I get embroiled in a scandal for trying to provide this service to the community, the simplest answer is to lock it down.  I'd prefer not to do that: more users would provide extremely valuable feedback on designing better optimization options, and an incentive to really develop a top-grade tool... but it's just a game, and I'd rather be largely unknown than infamous and derided.  
The goal is only to "make the best industry corp in EVE" or "win at the market", I'm not looking to pay for a new car/house/whatever and get IRL rich.  My primary goal is to help as many people in my own organization play the game for free, while crushing competitors under my heel in the truest and most extreme version of market PVP.  It also wouldn't hurt to unseat Mynnna as the richest bastard in EVE.

Tuesday, October 22, 2013

A Little Less Talk: EMD scraper v2

In my fervor to get at one subset of data, I wrote myself into a corner.  So, I spent this last weekend ripping out the inner workings of my pricefetch script and bringing it in line with the style/stability of my zkb scraper.

Code at Github

This exercise was painful because I had to essentially start over and rework the entire tool from top to bottom.  This did give me the chance to clean up a lot of errors (data backfill was bugged all along), and now things are pretty and fast.  I still have the issue of "fast as you please, there's still xGB's to parse", but I think I've worked the tool down into a sweet spot for effort/speed.

I owe a lot of thanks for the recent progress to Valkrr and Lukas Rox.  Seeing as I am so painfully green with databases, they've been exceptionally helpful in cleaning up some of the pitfalls I've run into.

What Changed?

Where pricefetch was designed to grab everything from one region, EMD_scraper is designed to grab everything from everywhere.  To accomplish this I put in two modes for scraping:
  • --regionfast
  • --itemfast
These handles help define the method of scraping.  --regionfast will attempt to pull as many regions as possible, resulting in a one-item-per-call return.  --itemfast does the opposite, trying to pull as many items as possible, one region at a time.  Also, unlike zKB_scraper which goes in dictionary-order, regions have been placed in a "most relevant" configuration on this release.  Namely big hubs first, then HS, LS, Nullsec.  It still accepts smaller lists, and you can modify the lookup.json values to your heart's content as well.
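To make the difference between the two modes concrete, here is a minimal Python sketch of how the call batching might look.  The helper names (chunks, build_calls), the batch size, and the sample IDs are assumptions for illustration only; the real EMD_scraper may build its queries differently.

```python
from itertools import islice

def chunks(seq, size):
    """Yield successive lists of at most `size` elements from seq."""
    it = iter(seq)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch

def build_calls(regions, items, mode, batch_size=2):
    """Return a list of (region_batch, item_batch) pairs, one per API call.

    mode="regionfast": one item per call, as many regions as fit in a batch.
    mode="itemfast":   one region per call, as many items as fit in a batch.
    """
    calls = []
    if mode == "regionfast":
        for item in items:
            for region_batch in chunks(regions, batch_size):
                calls.append((region_batch, [item]))
    elif mode == "itemfast":
        # A batch of items is sent to every region before the next batch.
        # Swapping the two loops would finish one region before moving on.
        for item_batch in chunks(items, batch_size):
            for region in regions:
                calls.append(([region], item_batch))
    else:
        raise ValueError("unknown mode: %s" % mode)
    return calls

# Regions listed hubs-first, mirroring the "most relevant" ordering.
regions = [10000002, 10000043, 10000032]   # The Forge, Domain, Sinq Laison
items = [34, 35, 36]                       # Tritanium, Pyerite, Mexallon
region_calls = build_calls(regions, items, "regionfast")
item_calls = build_calls(regions, items, "itemfast")
```

With a batch size of 2, both modes end up making six calls for this toy input; the difference is only in which dimension is packed into each call.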

This also necessitated some updates to the crash handler.  Crashes now dump the entire progress so far (region,item) and the script modifies the outgoing calls to skip region/item combinations already run.  I'd really like a more efficient crash/fetch routine, trying to get the full 10k returns each query... but I can't know the limits ahead of time with the current layouts.  I'll take 10k max with 5-7k avg returns rather than try to dynamically update the query.  EMD isn't designed to crawl like zKB.
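The skip-what-already-ran idea can be sketched roughly like this.  The file name, the JSON layout, and the function names are hypothetical; this is not the actual crash handler, just one way the (region,item) bookkeeping could work.

```python
import json
import os
import tempfile

def load_progress(path):
    """Return the set of (region, item) pairs already scraped, or an empty set."""
    if not os.path.exists(path):
        return set()
    with open(path) as handle:
        return {tuple(pair) for pair in json.load(handle)}

def save_progress(path, done):
    """Dump the finished (region, item) pairs so a restart can skip them."""
    with open(path, "w") as handle:
        json.dump(sorted(done), handle)

def remaining(regions, items, done):
    """List the (region, item) combinations that still need a call."""
    return [(r, i) for r in regions for i in items if (r, i) not in done]

# Simulate a crash after two combinations were finished, then a restart.
progress_file = os.path.join(tempfile.gettempdir(), "emd_progress_example.json")
save_progress(progress_file, {(10000002, 34), (10000002, 35)})
todo = remaining([10000002, 10000043], [34, 35], load_progress(progress_file))
os.remove(progress_file)
```

After the simulated restart, only the two combinations for the second region are left in todo.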

I'm not wholly pleased with how --itemfast runs.  I may have to rewrite it to crawl through all items in one region before moving on to the next.  It currently blasts through a large batch of items and then increments the region.  

Beautification

Coding on my own, I have this habit of scrawling down code/files willy-nilly until I can get a stable working midpoint.  Since my professional code habits stem from more time spent repairing code or tacking features onto an existing project, I lack a lot of intuition on building foundations.

Repository Maintenance

When I first created the Prosper repository (about a year ago now) I spent a good deal of time trying to create a monolithic DB scraper/builder.  With this second try, I wanted to split the tasks into finer pieces and make the code more independent.  If I could adopt a "First: make it run" mentality, I could at least get to a manageable midpoint with data, rather than burning a bunch of effort in crafting expert code.  This resulted in a lot of duplicated work, and I figured since the paradigm shifted so far, I might as well gut that original code and promote the new scripts to "DB_builder" status.

I am banking all of my examples to a scraps directory, but I need to make sure I am adding them all to the repository.  Thankfully, I find myself ransacking those samples to help move the project forward.  Much of the zKB urllib2 code was previously written.  Also, many of the item lookup JSONs were pre-existing.

One add-on for the TODO list, though, is to put more sample data dumps into the SQL portion of the repository.  I was avoiding tracking these to avoid making the repo too large, but as Valkrr pointed out, at least keeping the SQL scripts of common queries would be useful as examples.

Death to Global Variables

I had a good Samaritan swing by my code and point out that I should de-commit some globals, like db_username/db_password, and replace them with configuration scripts.  After a little back-and-forth, he was so gracious as to add the .ini handlers for me into the zkb script.

I figured it was a good time to add some extra functionality and roped those changes into a more complete set.  Now zKB and EMD scrapers both pull from the same .ini; as will any other outgoing scraper (EVE-Central, eveoffline?).  I'd like to compartmentalize internal and external scrapers to use different .ini files, but we'll see how long that continues.
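For anyone who hasn't used .ini handling before, the shared-config idea looks something like the sketch below.  The section names, keys, and values are made up for the example (they are not the contents of the real scraper.ini), and it uses Python 3's configparser module rather than the Python 2 ConfigParser.

```python
import configparser

# Hypothetical layout: one shared section plus one section per scraper.
SAMPLE_INI = """
[GLOBALS]
db_host = localhost
db_username = scraper
db_password = change_me

[ZKB]
table_name = kill_data

[EMD]
table_name = price_data
"""

def load_settings(ini_text, scraper):
    """Merge the shared settings with the ones for a single scraper."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    settings = dict(parser["GLOBALS"])
    settings.update(parser[scraper])
    return settings

zkb = load_settings(SAMPLE_INI, "ZKB")
emd = load_settings(SAMPLE_INI, "EMD")
```

Both scrapers see the same database credentials, and nothing sensitive has to live in the committed script; in practice the text would come from a file via parser.read() instead of a string.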

Cleaner, Clearer Code

If you look at the previous version of the EMD_scraper, you'll see a lot of commented code around working code.  I left a lot of the trial-and-error in the first version.  I have since cleaned a lot of that out, leaving only some quick handles in there for debug printing.

I would like to take another pass at these scripts down the line to make very-pretty output, instead of the progress dumping to the command line.  This is purely cosmetic though, so expect the priority to be extremely low.

SQL-Fu

I seriously underestimated how much trouble data warehousing would be.  I have spent a lot of time over the last week trying to understand where I am going wrong and what steps I am missing.

Steps so far:
  • Reduce DB size by reducing strings
    • Removed itemname from priceDB
  • Design the DB to have the data, use queries to make the form
    • Abandoned "binning" directly from zKB data
    • Instead save by system, binning can be handled in a second-pass method
  • OPTIMIZE TABLE is your friend
  • CUSTOM INDEX's for common queries: added some, need to read more
  • CCP and NVIDIA are sloppy with their previous patch cleanup:
    • Check C:\Program Files\CCP\EVE\
    • Check C:\nvidia\ 
  • mySQL is a hog

JOIN and SUM(IF(..)): two flavors that don't go well together

One bug I mentioned is that some of my queries are returning hilariously high values.  On my Neutron Blaster Cannon II experiment, the raw numbers were 10x what was in the DB.  When Powers, from the #tweetfleet, asked for freighter data, I was returning something like 138x.  It seems I have been confused about the order of operations in SQL.  
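One common way to get inflated sums like this is join fan-out: every kill row gets repeated once per matching row in the joined table before SUM() runs.  Whether that is exactly what bit me is an assumption, but the effect is easy to reproduce.  The sketch below uses SQLite so it runs standalone (CASE WHEN stands in for mySQL's IF()), and the table names, typeID, and systemIDs are placeholders.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE kills  (typeID INTEGER, systemID INTEGER, destroyed INTEGER);
CREATE TABLE prices (typeID INTEGER, day TEXT, price REAL);
INSERT INTO kills  VALUES (1000, 1, 5), (1000, 2, 7);
INSERT INTO prices VALUES (1000, '2013-10-01', 1.0e6),
                          (1000, '2013-10-02', 1.1e6),
                          (1000, '2013-10-03', 1.2e6);
""")

# Naive: each kill row is repeated once per price row before the SUM runs.
naive = db.execute("""
    SELECT SUM(CASE WHEN k.systemID IN (1, 2) THEN k.destroyed ELSE 0 END)
    FROM kills k
    JOIN prices p ON p.typeID = k.typeID
""").fetchone()[0]

# Safer: aggregate each table down to one row per typeID, then join.
fixed = db.execute("""
    SELECT k.total
    FROM (SELECT typeID, SUM(destroyed) AS total
          FROM kills GROUP BY typeID) k
    JOIN (SELECT typeID, AVG(price) AS avg_price
          FROM prices GROUP BY typeID) p ON p.typeID = k.typeID
""").fetchone()[0]
```

Here only 12 items were destroyed, but the naive query reports three times that because there are three price rows; the multiplier grows with the number of matching rows on the other side of the join.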

This is why I really want to get the "bridge" scripts done so I can just splice together the tables I want to have all the data I need.  Since the data is local, rescraping should be mostly trivial, and it would give me data stores in the shapes I need to move onto the next step of the machine.

Thursday, October 17, 2013

Rubicon - Siphon Unit Statistics and Operation Announced

Dev Blog announcing new Siphon Unit

This is the biggest change I've been waiting for from the original Rubicon announcement.  Personally, I've been waiting to move forward on my own ambitions to deploy a reaction farm until this feature was better explained.

TL;DR

With Rubicon, there is only one new "space yurt" in the proposed Siphon group: Small Siphon Unit.  It's very small (20m3 in cargo), very cheap (~10M), and can be dropped anywhere within 50km of a control tower to interrupt the moon mining supply chain.  Also, it's important to mention that this module will only interfere with the "Raw Material" and "Processed Material", and can only steal from the final step in the chain.

As of this announcement, there are some important things to note:
  • Rumor: Tower will not notify owner a siphon has been placed (hope this changes)
  • POS will not automatically aggress the siphon
  • Siphon has a waste factor
    • some of its yield is destroyed in transit from POS --> Siphon unit
  • Anyone can access the cargo of the siphon unit
  • Any number of siphons can be placed outside a tower
    • Each siphon steals (in dropped order) from the final yield
  • Only steals from "last step" in the chains it can steal from
  • Siphon steals from the "link" not the silo.  Once it's in a silo, it's safe
What isn't 100% clear is whether siphon units will steal from Biochemical Reactors (drugs) or Polymer Reactors (T3 reactions).  I think the answer is no, but I would rather it were stated explicitly.

What Is Good

I preface that "what is good" may not be good for a particular subset, but healthy for the game in general.  The short story is the siphon waste factor should constrict moon material supplies.  As of today, most of these materials are in oversupply.  Up to now, the only thing drawing down that slack was T2 consumption (which has been in steady decline thanks to tiericide).  Also, it breaks down the ivory tower big moon holders have enjoyed and allows pilots to interfere at a small scale.  Lastly, this forces reaction holders to invest more active time into managing their lines, lest someone steal from them.

I think this module is a great add and though I personally fear for my own projects, I think it's a great middle ground to push industry and PVP closer together.  Also, I really like the idea of more moon goo slipping out of the supply chain to waste.

What Is Bad

I think the balance on this module is still a little off.  Personally, I'd like to see a few things changed
  • Notify owners when a Siphon goes up
  • Swap stolen quantities.  I think the raw material steal rate is a bit too high
  • Add stacking penalty.  Either increase waste, or lower yields with more siphons dropped
  • Up cost OR up m3 requirement.  I'd prefer if frigates couldn't carry them, or if they cost ~75M
The big losers in this change aren't the blocs or corps, but the logistics guys who already have to deal with a crappy POS system.  I think the major rallying point should be owners MUST be notified about siphons being deployed on their equipment.  Otherwise, prioritizing defense will become far too cumbersome.  I personally was hoping that POS would auto-engage the siphons, but I can understand why CCP didn't do this.  

I also think the linear nature is not as good.  I would rather see diminishing returns, since a single player could very easily drop a dozen of these siphon units around a tower and milk the system dry.  I understand why it might be difficult to implement/explain, but since small siphons are so small/cheap, it's not well balanced against the "What about Goons?" test.  

Tuesday, October 15, 2013

Objective Complete: zKB Data Get

3.75M Kills parsed (2013 so far)
17.5M Entries
40hr estimated parsing time

Frigate 905,329
Cruiser 315,493
Battleship 77,617
Industrial 81,642
Capsule 929,041
Titan 25
Shuttle 41,814
Rookie ship 246,308
Assault Frigate 110,147
Heavy Assault Cruiser 30,583
Deep Space Transport 2,421
Combat Battlecruiser 170,480
Destroyer 332,165
Mining Barge 54,804
Dreadnought 2,218
Freighter 1,960
Command Ship 6,340
Interdictor 32,956
Exhumer 24,032
Carrier 4,873
Supercarrier 113
Covert Ops 39,401
Interceptor 52,546
Logistics 15,082
Force Recon Ship 24,132
Stealth Bomber 97,226
Capital Industrial Ship 247
Electronic Attack Ship 5,940
Heavy Interdiction Cruiser 3,961
Black Ops 1,129
Marauder 1,375
Jump Freighter 861
Combat Recon Ship 8,368
Industrial Command Ship 2,986
Strategic Cruiser 32,309
Prototype Exploration Ship 265
Attack Battlecruiser 82,652
Blockade Runner 11,583


Remaining To-Do

  1. Investigate count bug
    • Initial dump is 10x expected values on items?
  2. Finish "prettying" for release
  3. Update pricefetch to scrape all regions for full market picture
  4. Find a way to maintain/release .sql dump of data generated
  5. mySQL optimization and "bridge" scripts for smaller passes

Progress So Far

I have to thank a bunch of people for helping me get to this point where I have at least a passable crawler and data set to munch on.  I would like to get EVE-Central's dumps processed before moving onto the data science step, but we will see what happens.

Extra special thanks to:
I still have a lot of work to go between "working" and "good", but being able to stand upright and get my hands on this data is exceptionally awesome.

Finally, I can put together data like this:

Monday, October 14, 2013

Fool's Errand

I found today's Nobel prize in Economics interesting, especially since it's partially related to my project.

The prize winners, all vastly more qualified than me, state through their research that you can't know short term price fluctuations, but should be able to map longer term trends.  I might be in trouble, since my project is looking to do the opposite: chart with decreasing certainty a small number of weeks into the future.

The end product here is that I may not be able to do what I want with all this data.  But if I don't try, because it's "impossible", then I will never know.  I'd like to take this moment to talk through some of my dissenters' opinions.

Imperfect Data

This is the most common dissent I hear when people hear what I am trying to do: "But the out-of-game feeds are imperfect.  How could you possibly know EXACTLY [pick your metric]?"  I always end up countering with a classical engineering retort: "But I can get close enough"

If I may extend the metaphor, imagine you couldn't possibly see something with the naked eye (ISS flying overhead, for example).  If I could get a telescope to take a half-decent black-and-white picture of it, would that not be "close enough" for practical purposes to show you that it was there and what it kinda looked like?  I may not be able to provide the stunning HD pictures NASA can, but something beats nothing.

Exploring the frontier is all about using what you can to get what you need.  I may not be able to tell you EXACTLY how many noob frigates died this year, but I can tell you it's on the order of ~250K and probably under 300K.  Just because I can't know the EXACT number doesn't mean a good estimate has no value.  

Understanding Limitations

It's important to know the relative accuracy of the data you're collecting, and what your blind spots may be.  As far as kill data goes, these are the assumptions I am using:
  • PVP-kill quality
    • 95% quality.  API-only kills should provide extremely good coverage
    • HS kills will be less thorough.  But gaps should be very small
  • Other kill quality (NPC kills, CONCORD kills, self-destructs)
    • No way to view these kills.  zKB filters NPC-only kills before adding them to DB
    • These kills should account for a very low percentage of destruction data
The thinking goes: if something dies in PVP, it should get to zKB somehow.  It only takes one key to get the data.  Either from the victim or the killer, or their corps.  Now, it is possible to have kills unaccounted for, where the killer (killing blow) or victim or their corps don't have a key in zKB/eve-kill's records.  But losing sleep over the last ~10% that I can know is not worth derailing the 90% I'm already getting.

The things that worry me that I'm not seeing:
  • PVE deaths: BS/BC/T2 losses to rats
  • Suicide bombers: Attacker km's are ignored
  • Self destruct data: small segment of pod data not being tracked
  • NPC corp data that might be missed because killer doesn't have correct keys
The hope is these groups account for a very small fraction of the data out there.  

Prediction Quality

I expect to get a decent idea of the future price of something (trend up or down, by how much?) and network all those predictions together to feed to a machine that will automatically task out my manufacturing lines.  If the tuning is strong enough, getting a leg up on the shipping margin economy is a second avenue for activity.  

I catch flak when I describe this because people get mired in the fine details.  I might predict a 20% rise in price on a weapon, but only see a 10% rise.  That's still enough to pocket profit, and I'm better off having some numbers-based prediction than spending a ton of my time scouring the numbers and playing by "gut feeling".  Large repetitive math is EXACTLY what computers are for, and if I can tune the machine to have even a sliver of intuition, then I am ahead of my competition.

Today, I am using today's-cost and today's-profit to say that when I do get to market, I will be somewhere close to that prediction.  I also watch market order volume to make sure what I bring to market is a suitably small percentage of actual sales, so as not to be the downward force.  This is fine in products that swing slowly (most modules) but can be extremely troublesome in ships where bubbles are constantly forming/popping with fickle player tastes.  

My preliminary data doesn't show kills as a predictive metric.  But with kill data being extremely spiky (weekend warriors) I may not be looking at the groupings correctly yet.  So far, only pure-market numbers look like the trend setters.  This is probably because the kill data I am scraping isn't as public and easy to access as eve-central data.  But there has to be some amount of weight to put into "replacement" behavior rather than just purely buying and selling commodities without any other basis in reality.

I am wondering if I should get in touch with Chribba or Ripard Teg for their PCU numbers too, since player participation is pretty directly related to profitability.

Progress Update

In the end, you won't know unless you try.  Even if the data is purely scientific it's been extremely interesting to get a look at what is destroyed.  Raw dumps for those interested.

As interesting as that is, you get a very similar picture with sales data

If you overlay the two charts, the levels and spikes line up pretty similarly.
As for data parsed so far: total ~3.15M mails parsed
Frigate 902227
Cruiser 314795
Battleship 77331
Industrial 81221
Capsule 929040
Shuttle 3153
Rookie ship 245810
Assault Frigate 109897
Heavy Assault Cruiser 164
Deep Space Transport 30
Combat Battlecruiser 387
Destroyer 314605
Mining Barge 658
Command Ship 6280
Interdictor 32810
Exhumer 23516
Carrier 4873
Covert Ops 537
Interceptor 276
Logistics 14973
Force Recon Ship 177
Stealth Bomber 96750
Capital Industrial Ship 247
Electronic Attack Ship 32
Heavy Interdiction Cruiser 137
Black Ops 19
Marauder 27
Combat Recon Ship 53
Strategic Cruiser 377
Prototype Exploration Ship 236
Attack Battlecruiser 192
Blockade Runner 164

Thursday, October 10, 2013

A Little Less Talk: Part 2 - zKB cooking with fire

Like Frankenstein's monster, the parts are coming together.  zKB crawling is almost ready for the first full time passes.  Figured it would be worth blogging some of the work.

Throughput is a Bitch

After 2 days of running the "pre-alpha" full-flow version of my binner, the progress was as follows:
Frigate 319,689 (to Aug 1)
Rookie ship 241,695
Logistics 14,736
Capital Industrial Ship 245
Prototype Exploration Ship 186
That's a lot of mails parsed, but my rate was something like 400 kills/minute.  This is abysmally slow, and means I would have needed several days to have any hope of getting the whole destruction picture.  

Thankfully, the dudes behind zKB just added some keys to better communicate server status and my dry run was pulling 2,600 kills/minute and stands to run stable at up to 3,800 kills/minute.  Still pretty slow compared to the market data (10,000 entries/minute) but I'll take the sizable improvement.

For those playing along, 3 throughput keys were added to the HTTP header:
X-Bin-Attempts-Allowed
X-Bin-Requests
X-Bin-Seconds-Between-Request
Leveraging these keys lets me set the between-call waits on the fly.  As the budget is changed, I am able to adapt to that and pull "as fast as possible" according to the rules.  I would like to implement a more dynamic back-off routine, that keeps a more steady stream, but that is not yielding a better throughput at the moment. 
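A rough sketch of how those three keys could drive the wait time is below.  The meaning I assign to each header is a guess (spelled out in the docstring), and the function name and default wait are invented for the example, so treat this as an illustration rather than the scraper's real logic.

```python
def seconds_to_wait(headers, default_wait=10.0):
    """Pick a between-call wait from the throughput headers.

    Assumed meanings (not confirmed against zKB documentation):
      X-Bin-Attempts-Allowed        -> requests allowed in the current window
      X-Bin-Requests                -> requests already used in that window
      X-Bin-Seconds-Between-Request -> server-suggested spacing between calls
    """
    try:
        allowed = int(headers["X-Bin-Attempts-Allowed"])
        used = int(headers["X-Bin-Requests"])
        spacing = float(headers["X-Bin-Seconds-Between-Request"])
    except (KeyError, ValueError):
        return default_wait                  # headers missing or malformed
    if used >= allowed:
        return max(spacing, default_wait)    # budget spent: back off harder
    return spacing                           # budget left: follow the server

normal = seconds_to_wait({
    "X-Bin-Attempts-Allowed": "10",
    "X-Bin-Requests": "3",
    "X-Bin-Seconds-Between-Request": "2",
})
exhausted = seconds_to_wait({
    "X-Bin-Attempts-Allowed": "10",
    "X-Bin-Requests": "10",
    "X-Bin-Seconds-Between-Request": "2",
})
no_headers = seconds_to_wait({})
```

The caller would sleep for the returned number of seconds before the next request, re-reading the headers on every response so the wait tracks whatever budget the server currently advertises.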

Still a Database Scrub

Originally, I was making the scraper set up dynamic "bins" from a file, and push those into a table.  The output can be found on my gdoc dump.  While this is practical for serving from SQL->user, it is not efficient or elegant.  By relying on the data dump for translation, I'm now only storing the required information:
  • Date destroyed
  • Week destroyed (because I don't want to do the date->week conversion)
  • typeID
  • typeGroup (also for easy grouping)
  • systemID
  • destroyed count
I could stand to lose Week/typeGroup from the DB, but I like to have the quicker grouping handy... and being numbers instead of strings means they are much smaller to deal with.
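As a sketch of that layout, here is what the table and an insert routine might look like.  It uses SQLite so it runs anywhere, and the table name, column names, ISO week numbering, and the IDs in the usage lines are all assumptions for illustration, not the real schema.

```python
import sqlite3
from datetime import date

SCHEMA = """
CREATE TABLE IF NOT EXISTS kill_data (
    kill_date  TEXT    NOT NULL,   -- date destroyed (ISO format)
    kill_week  INTEGER NOT NULL,   -- precomputed week number
    typeID     INTEGER NOT NULL,
    typeGroup  INTEGER NOT NULL,   -- stored for quick grouping
    systemID   INTEGER NOT NULL,
    destroyed  INTEGER NOT NULL,
    PRIMARY KEY (kill_date, typeID, systemID)
)
"""

def record_loss(db, kill_date, type_id, type_group, system_id, count):
    """Add `count` to the day/type/system row, creating it if needed."""
    week = kill_date.isocalendar()[1]   # date->week done once, at write time
    key = (kill_date.isoformat(), type_id, system_id)
    cur = db.execute(
        "UPDATE kill_data SET destroyed = destroyed + ? "
        "WHERE kill_date = ? AND typeID = ? AND systemID = ?",
        (count,) + key,
    )
    if cur.rowcount == 0:
        db.execute(
            "INSERT INTO kill_data VALUES (?, ?, ?, ?, ?, ?)",
            (key[0], week, type_id, type_group, system_id, count),
        )

db = sqlite3.connect(":memory:")
db.execute(SCHEMA)
# Placeholder IDs: two losses of the same type in the same system and day.
record_loss(db, date(2013, 10, 10), 1000, 25, 1, 1)
record_loss(db, date(2013, 10, 10), 1000, 25, 1, 2)
rows = db.execute("SELECT kill_week, destroyed FROM kill_data").fetchall()
```

Everything stored is a number or a short date string, which is the point: names come from the data dump at query time instead of being repeated on every row.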

Results

Frigate 27
Cruiser 1
Industrial 51
Capsule 1
Shuttle 5
Rookie ship 50349
Assault Frigate 3
Heavy Assault Cruiser 1
Deep Space Transport 9
Combat Battlecruiser 4
Destroyer 4
Mining Barge 19
Interdictor 2
Exhumer 63
Covert Ops 2
Force Recon Ship 2
Stealth Bomber 5
Capital Industrial Ship 247
Prototype Exploration Ship 189
Blockade Runner 39
In ~30mins, I was able to crunch nearly 50,000 kills.  The numbers aren't as tidy as before (should add bool value for ships killed vs cargo destroyed), but this is leaps and bounds better than before.  Odds are good that by Monday I'll have a nearly complete picture of destruction statistics in EVE.

To Do

  • Add something to watch for repeated killID's
  • Clean up home PC so I can parse this data at home
  • Better test "polite snooze" routine
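For the repeated killID item, the simplest thing I can think of is a set of IDs already seen.  This is only a sketch of the idea: the function name is made up, and it assumes each kill is a dict carrying a "killID" key.

```python
def new_kills(batch, seen_ids):
    """Return only the kills whose killID has not been processed yet.

    seen_ids is updated in place so it can be reused across batches.
    """
    fresh = []
    for kill in batch:
        kill_id = kill["killID"]
        if kill_id in seen_ids:
            continue            # duplicate: already counted once
        seen_ids.add(kill_id)
        fresh.append(kill)
    return fresh

seen = set()
first = new_kills([{"killID": 1}, {"killID": 2}, {"killID": 1}], seen)
second = new_kills([{"killID": 2}, {"killID": 3}], seen)
```

A set works for a single run; surviving a restart would need the IDs persisted somewhere, such as a unique key in the database.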
The zKB devs have asked me to contribute this feature to them so they can serve the data themselves.  I would be more than happy to open up that data to the world through them, but seeing as kill data is such a small segment of my project, I would rather focus on my goals for the time being.  If I can get to the point where I am able to hire contributors, then I might be able to loop back and contribute to them.

Also, zKB has a service like EMDR that throws live data to listeners.  If I can get most of the parts I'd like stable on the "cron" data, then I'd be more than happy to switch feeds over.  Unfortunately, since I have no reliable web space to catch these live feeds, I am not able to get the qualities I need from them at this time.