Thursday, October 10, 2013

A Little Less Talk: Part 2 - zKB cooking with fire

Like Frankenstein's monster, the parts are coming together.  zKB crawling is almost ready for its first full-time passes.  Figured it would be worth blogging some of the work.

Throughput is a Bitch

After 2 days of running the "pre-alpha" full-flow version of my binner, the progress was as follows:
Frigate: 319,689 (to Aug 1)
Rookie ship: 241,695
Logistics: 14,736
Capital Industrial Ship: 245
Prototype Exploration Ship: 186
That's a lot of mails parsed, but my rate was something like 400 kills/minute.  That's abysmally slow, and means I would have needed several more days to have any hope of getting the whole destruction picture.

Thankfully, the dudes behind zKB just added some keys to better communicate server status.  My dry run was pulling 2,600 kills/minute and should run stable at up to 3,800 kills/minute.  Still pretty slow compared to the market data (10,000 entries/minute), but I'll take the sizable improvement.

For those playing along, three throughput keys were added to the HTTP response headers:
  • X-Bin-Attempts-Allowed
  • X-Bin-Requests
  • X-Bin-Seconds-Between-Request
Leveraging these keys lets me set the between-call waits on the fly.  As the budget changes, I can adapt and pull "as fast as possible" within the rules.  I would like to implement a more dynamic back-off routine that keeps a steadier stream, but so far that hasn't yielded better throughput.
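As a rough sketch of what that on-the-fly pacing looks like (the helper name is mine, and the header semantics are my reading of the keys above: X-Bin-Requests as calls made in the current window, X-Bin-Attempts-Allowed as the window's budget):

```python
def next_wait(headers, default_wait=1.0):
    """Pick the sleep before the next zKB call from the throughput headers.

    Header semantics are assumed, not documented here: X-Bin-Requests is
    calls made so far in the window, X-Bin-Attempts-Allowed is the budget.
    """
    try:
        used = int(headers.get("X-Bin-Requests", 0))
        allowed = int(headers.get("X-Bin-Attempts-Allowed", 0))
        wait = float(headers.get("X-Bin-Seconds-Between-Request", default_wait))
    except (TypeError, ValueError):
        return default_wait  # malformed headers: fall back to a safe pause
    if allowed and used >= 0.9 * allowed:
        return wait * 2  # close to the budget, so back off harder
    return wait

# Plenty of budget left -> just honor the advertised spacing.
print(next_wait({"X-Bin-Attempts-Allowed": "3600",
                 "X-Bin-Requests": "100",
                 "X-Bin-Seconds-Between-Request": "0.5"}))  # -> 0.5
```

Sleeping `next_wait(resp.headers)` between calls keeps the crawler inside whatever budget the server is advertising at that moment.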

Still a Database Scrub

Originally, I had the scraper set up dynamic "bins" from a file and push those into a table.  The output can be found on my gdoc dump.  While this is practical for serving from SQL to the user, it is neither efficient nor elegant.  By relying on the data dump for translation, I'm now only storing the required information:
  • Date destroyed
  • Week destroyed (because I don't want to do the date->week conversion)
  • typeID
  • typeGroup (also for easy grouping)
  • systemID
  • destroyed count
I could stand to lose Week/typeGroup from the DB, but I like having the quicker grouping handy... and being numbers instead of strings, they are much cheaper to store and compare.
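The slimmed-down table boils down to something like this (a sketch using SQLite; the table and column names are mine, and the sample row uses EVE's public IDs: 587 is the Rifter typeID, 25 the Frigate groupID, 30000142 is Jita):

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # illustrative; the real DB lives elsewhere
conn.execute("""
    CREATE TABLE kill_bins (
        dateDestroyed TEXT,     -- day the kill happened
        weekDestroyed INTEGER,  -- precomputed so date->week never runs at query time
        typeID        INTEGER,  -- what died
        typeGroup     INTEGER,  -- group of what died, for cheap GROUP BY
        systemID      INTEGER,  -- where it died
        destroyed     INTEGER   -- kill count for this bin
    )
""")
conn.execute("INSERT INTO kill_bins VALUES (?, ?, ?, ?, ?, ?)",
             ("2013-10-10", 41, 587, 25, 30000142, 3))  # 3 Rifters down in Jita
row = conn.execute("SELECT destroyed FROM kill_bins WHERE typeGroup = 25").fetchone()
print(row[0])  # -> 3
```

Everything but the date is an integer, which is what keeps the table small and the grouping fast.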

Results

Frigate: 27
Cruiser: 1
Industrial: 51
Capsule: 1
Shuttle: 5
Rookie ship: 50,349
Assault Frigate: 3
Heavy Assault Cruiser: 1
Deep Space Transport: 9
Combat Battlecruiser: 4
Destroyer: 4
Mining Barge: 19
Interdictor: 2
Exhumer: 63
Covert Ops: 2
Force Recon Ship: 2
Stealth Bomber: 5
Capital Industrial Ship: 247
Prototype Exploration Ship: 189
Blockade Runner: 39
In ~30 minutes, I was able to crunch nearly 50,000 kills.  The numbers aren't as tidy as before (I should add a boolean to separate ships killed from cargo destroyed), but this is leaps and bounds better than before.  Odds are good that by Monday I'll have a nearly complete picture of destruction statistics in EVE.

To Do

  • Add something to watch for repeated killIDs
  • Clean up my home PC so I can parse this data at home
  • Better test the "polite snooze" routine
The zKB devs have asked me to contribute this feature so they can serve the data themselves.  I would be more than happy to open that data to the world through them, but since kill data is such a small segment of my project, I would rather focus on my own goals for the time being.  If I get to the point where I can hire contributors, I might loop back and contribute to them.

Also, zKB has a service like EMDR that streams live data to listeners.  If I can get most of the parts I'd like stable on the "cron" data, I'd be more than happy to switch feeds over.  Unfortunately, since I have no reliable web space to catch these live feeds, I can't get the reliability I need from them at this time.