Prosper: An EVE Online Tool Development Blog: Fool's Errand

I found today's Nobel prize in Economics interesting, especially since it's partially related to my project.

The prize winners, all vastly more qualified than I am, argue through their research that you can't predict short-term price fluctuations, but should be able to map longer-term trends. I might be in trouble, since my project is looking to do the opposite: chart, with decreasing certainty, a small number of weeks into the future.

The upshot is that I may not be able to do what I want with all this data. But if I don't try because it's "impossible", I will never know. I'd like to take this moment to talk through some of my dissenters' opinions.

Imperfect Data

This is the most common objection I get when people hear what I am trying to do: "But the out-of-game feeds are imperfect. How could you possibly know EXACTLY [pick your metric]?" I always end up countering with a classic engineering retort: "But I can get close enough."

If I may offer a metaphor, imagine something you couldn't possibly see with the naked eye (the ISS flying overhead, for example). If I could get a telescope to take a half-decent black-and-white picture of it, would that not be "close enough" for practical purposes to show you that it was there and roughly what it looked like? I may not be able to provide the stunning HD pictures NASA can, but something beats nothing.

Exploring the frontier is all about using what you can to get what you need. I may not be able to tell you EXACTLY how many noob frigates died this year, but I can tell you it's on the order of ~250K and probably under 300K. Just because I can't know the EXACT number doesn't mean a good estimate has no value.

Understanding Limitations

It's important to know the relative accuracy of the data you're collecting, and what your blind spots may be. As far as kill data goes, these are the assumptions I am using:

PVP-kill quality

95% quality. API-only kills should provide extremely good coverage
HS kills will be less thorough, but gaps should be very small

Other kill quality (NPC kills, CONCORD kills, self-destructs)

No way to view these kills. zKB filters NPC-only kills before adding them to DB
These kills should account for a very low percentage of destruction data

The thinking goes: if something dies in PVP, it should get to zKB somehow. It only takes one key to get the data, whether from the victim, the killer, or their corps. Now, it is possible to have kills unaccounted for, where neither the killer (killing blow), the victim, nor their corps have a key in zKB/eve-kill's records. But losing sleep over the last ~10% that I can't know is not worth derailing the 90% I'm already getting.
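As a rough sketch of what that coverage assumption implies, the snippet below scales an observed kill count by an assumed coverage fraction to bound the true figure. The function name is made up for illustration, and the 90% and 95% coverage values are my working assumptions from above, not measured numbers. The observed count is the Rookie ship total from the parsed mails listed at the end of this post.

```python
def estimate_true_count(observed, coverage):
    """Scale an observed count up by the assumed fraction of events captured."""
    if not 0 < coverage <= 1:
        raise ValueError("coverage must be in (0, 1]")
    return observed / coverage

observed_rookie_ships = 245810  # Rookie ship mails parsed so far

# Assumed coverage bounds: 95% (optimistic) and 90% (pessimistic).
low = estimate_true_count(observed_rookie_ships, 0.95)
high = estimate_true_count(observed_rookie_ships, 0.90)
print(f"true count is roughly between {low:,.0f} and {high:,.0f}")
```

Under those assumptions the estimate lands between roughly 259K and 273K, which is the kind of "close enough" range I mean. Kills that never reach zKB at all (NPC-only, self-destructs) are outside what this correction can account for.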

The things that worry me that I'm not seeing:

PVE deaths: BS/BC/T2 losses to rats
Suicide bombers: Attacker km's are ignored
Self destruct data: small segment of pod data not being tracked
NPC corp data that might be missed because killer doesn't have correct keys

The hope is these groups account for a very small fraction of the data out there.

Prediction Quality

I expect to get a decent idea of the future price of something (trend up or down, by how much?) and network all those predictions together to feed to a machine that will automatically task out my manufacturing lines. If the tuning is strong enough, getting a leg up on the shipping margin economy is a second avenue for activity.

I catch flak when I describe this because people get mired in the fine details. I might predict a 20% rise in price on a weapon, but only see a 10% rise. That's still enough to pocket profit, and I'm better off having some numbers-based prediction than spending a ton of my time scouring the numbers and playing by "gut feeling". Large repetitive math is EXACTLY what computers are for, and if I can tune the machine to have even a sliver of intuition, then I am ahead of my competition.
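To make that point concrete, here is a minimal sketch of the arithmetic. Every number in it (build cost, current price) is hypothetical and chosen only for illustration; only the 20% predicted and 10% actual rise come from the example above.

```python
def profit_per_unit(build_cost, price_today, price_rise):
    """Profit per unit if the price moves by price_rise (0.10 means +10%)."""
    sale_price = price_today * (1 + price_rise)
    return sale_price - build_cost

# Hypothetical numbers, for illustration only.
build_cost = 1_000_000    # ISK to build one unit
price_today = 1_050_000   # ISK market price when the job starts
predicted_rise = 0.20     # what the model said
actual_rise = 0.10        # what the market actually did

expected = profit_per_unit(build_cost, price_today, predicted_rise)
realized = profit_per_unit(build_cost, price_today, actual_rise)
print(f"expected profit: {expected:,.0f} ISK, realized: {realized:,.0f} ISK")
```

The realized profit is smaller than the forecast, but still positive. What would actually hurt is getting the direction wrong, so the sign of the prediction matters more than its exact magnitude.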

Today, I am using today's-cost and today's-profit to say that when I do get to market, I will be somewhere close to that prediction. I also watch market order volume to make sure what I bring to market is a suitably small percentage of actual sales, so as not to be the downward force. This is fine in products that swing slowly (most modules) but can be extremely troublesome in ships where bubbles are constantly forming/popping with fickle player tastes.
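The volume check can be sketched in a few lines. The 5% ceiling and the function names below are assumptions for illustration, not the threshold I actually use; the idea is only that a planned batch gets compared against a fraction of observed daily sales.

```python
def max_units_to_list(daily_volume, max_share=0.05):
    """Largest batch that stays at or under max_share of daily sales volume."""
    return int(daily_volume * max_share)

def is_safe_batch(planned_units, daily_volume, max_share=0.05):
    """True if the planned batch is small enough not to swamp the market."""
    return planned_units <= max_units_to_list(daily_volume, max_share)

# Hypothetical module that sells about 2,000 units per day.
daily_volume = 2000
print(max_units_to_list(daily_volume))
print(is_safe_batch(80, daily_volume))
print(is_safe_batch(150, daily_volume))
```

For fast-swinging products like ships, a tighter share or a shorter volume window would probably be needed, since yesterday's volume says less about tomorrow's.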

My preliminary data doesn't show kills as a predictive metric. But with kill data being extremely spiky (weekend warriors), I may not be looking at the groupings correctly yet. So far, only pure-market numbers look like the trend setters. This is probably because the kill data I am scraping isn't as public and easy to access as eve-central data. But there has to be some amount of weight to put into "replacement" behavior, rather than treating the market as purely buying and selling commodities without any other basis in reality.
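One grouping I could try is rolling daily kills up into weekly totals, so the weekend spikes stop dominating the signal. The sketch below is only one possible approach, not what my pipeline currently does: the daily counts are made up, and bucketing by ISO week is an assumption.

```python
from collections import defaultdict
from datetime import date, timedelta

def weekly_totals(daily_counts):
    """Sum daily counts into ISO-week buckets keyed by (iso_year, iso_week)."""
    totals = defaultdict(int)
    for day, count in daily_counts.items():
        iso_year, iso_week, _ = day.isocalendar()
        totals[(iso_year, iso_week)] += count
    return dict(sorted(totals.items()))

# Made-up daily kills for two weeks starting Monday 2013-09-30; weekends spike.
start = date(2013, 9, 30)
pattern = [100, 110, 105, 120, 150, 400, 380]
daily = {start + timedelta(days=i): pattern[i % 7] for i in range(14)}

for week, total in weekly_totals(daily).items():
    print(week, total)
```

With the spikes folded into one number per week, the kill series can be compared against weekly price or volume changes on an equal footing.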

I am wondering if I should get in touch with Chribba or Ripard Teg for their PCU numbers too, since player participation is pretty directly related to profitability.

Progress Update

In the end, you won't know unless you try. Even if the data ends up being of purely scientific interest, it's been extremely interesting to get a look at what is destroyed. Raw dumps for those interested.

As interesting as that is, you get a very similar picture with sales data.

If you overlay the two charts, the levels and spikes line up pretty similarly.

As for data parsed so far, ~3.15M mails in total, broken down by ship group:

Frigate	902227
Cruiser	314795
Battleship	77331
Industrial	81221
Capsule	929040
Shuttle	3153
Rookie ship	245810
Assault Frigate	109897
Heavy Assault Cruiser	164
Deep Space Transport	30
Combat Battlecruiser	387
Destroyer	314605
Mining Barge	658
Command Ship	6280
Interdictor	32810
Exhumer	23516
Carrier	4873
Covert Ops	537
Interceptor	276
Logistics	14973
Force Recon Ship	177
Stealth Bomber	96750
Capital Industrial Ship	247
Electronic Attack Ship	32
Heavy Interdiction Cruiser	137
Black Ops	19
Marauder	27
Combat Recon Ship	53
Strategic Cruiser	377
Prototype Exploration Ship	236
Attack Battlecruiser	192
Blockade Runner	164

2 comments:

Unknown said...: The difference with the prize winners' statement is you're not quite trying to predict small scale fluctuations. You're trying to predict small scale fluctuations in required volumes of products. Whilst these increases in demand may result in the prices rising, there are many more factors involved to be able to actually predict that and estimate magnitude, such as availability of supply and lead time.; October 14, 2013 at 11:27 PM
Anonymous said...: BS lost in PVE can be approximated by filtering relevant equipment purchases and comparing to relevant BS losses in PVP... probably...; October 24, 2013 at 6:09 AM

Monday, October 14, 2013
