Prosper: An EVE Online Tool Development Blog: How I Data Science

Thursday, February 4, 2016

How I Data Science - Hunting For Trends

Longtime friend Blake at k162space.com has been pumping out some really exciting work around NPC-kill rates. He recently poked me about taking that raw data forward to something with more teeth, and after futzing with it for about an hour, I came out with some interesting results.

His inquiry reminded me of a common question I receive: How do I get into data science? So, this blog is going to be a bit longer and more technical than the recent fare. We're going to walk step-by-step through the investigation process and I'll try to illustrate what I see as we go along. The readouts were generated using JMP, only because it's faster to use than R. This entire process can be done in R, and I can revisit with more specific R samples if it's requested.

Let's take a look:

Getting Started

Raw Data

I like the Forecaster's Toolbox as a jumping-off point. We're looking for a few things when we start:

Basic visualizations: look for obvious trends

Simple time-series graph shows no obvious correlation
Simple scatter plots in case there's something obvious

Check scales: linear vs log vs sqrt

SUM_factionKills does not have much variance
log/sqrt price doesn't seem like a good idea

Skew data

+/- a few days to look for lead/lag

Other useful indicators

Deviation from moving average (more on this later)
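
As a rough illustration of the "deviation from moving average" indicator, here is a minimal Python sketch. The readouts in this post came from JMP, so this is not the code behind them. The function names, the trailing (rather than centered) window, and expressing the deviation as a fraction of the average are all assumptions made for the example.

```python
def moving_average(values, window=5):
    """Trailing moving average; None until a full window is available."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(values[i + 1 - window : i + 1]) / window)
    return out


def deviation_from_ma(values, window=5):
    """Fractional deviation of each value from its trailing moving average.

    Assumes strictly positive values (prices), so the average is never zero.
    """
    averages = moving_average(values, window)
    return [
        None if avg is None else (value - avg) / avg
        for value, avg in zip(values, averages)
    ]


# Hypothetical daily prices, for illustration only.
prices = [10.0, 10.0, 10.0, 10.0, 10.0, 12.0]
deviation = deviation_from_ma(prices, window=5)
```

The first four entries are None because a 5-day window isn't full yet; after that, each entry says how far the day's price sits above or below its own recent average.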

Processed Data

This gives us some basic trends to start comparing. Nothing is jumping out of the data yet.

Scatter Plot

What we really want is an NPC-kills vs price correlation. With that kind of relationship, we can basically automate investments off a single number.

At first crack, we're really not seeing anything. The vertical spread, especially around the median factionKills value (~780k/day), is showing no viable trend to really predict with. At best, we're seeing that variation/volatility is highest on median days, but that doesn't give us anything meaningful to work with in finding a trend to leverage. It only hints at which days orders might complete, not what price we can expect to get.

Looking at the skewed data (+1 to +5 days), there is some better clustering, but not better trending. There's still a very obvious failure of the vertical line test, which makes it very hard to find an X->Y trend. Also, some of the price outliers that were obvious in the first time-series graph are really showing to be problematic in this clustering view.
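
For anyone who would rather put a number on the lead/lag check than eyeball scatter plots, one option is to compute a correlation at each offset. The sketch below is an assumption-laden example rather than what JMP did here: it uses the Pearson coefficient, pairs kills on day t with the price on day t + lag, and `pearson` and `lag_scan` are hypothetical helper names.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences.

    Raises ZeroDivisionError if either sequence has zero variance.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def lag_scan(kills, prices, max_lag=5):
    """Correlate kills on day t with price on day t + lag, for lag 0..max_lag.

    Both inputs must be daily series of the same length.
    """
    results = {}
    for lag in range(max_lag + 1):
        xs = kills[: len(kills) - lag]  # drop the last `lag` days of kills
        ys = prices[lag:]               # drop the first `lag` days of prices
        results[lag] = pearson(xs, ys)
    return results


# Made-up numbers, for illustration only.
kills = [1, 2, 3, 4, 5, 6, 7, 8]
prices = [0, 0, -1, -2, -3, -4, -5, -6]
by_lag = lag_scan(kills, prices, max_lag=2)
```

A single coefficient per lag won't show clustering or outliers the way a scatter plot does, so treat it as a companion to the plots, not a replacement.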

Back to the Drawing Board

As I have said in previous dev blogs, I really like it when we can find a normal-shaped trend. Even if we can find a correlation, it will be really hard to use effectively if it's not either linear OR normal. Working outside those bounds gets difficult fast, so let's try to get back to the sorts of things we know well.

Thankfully, SUM_factionKills is reasonably normal. As we have discussed previously, prices really aren't, but deviation and volatility are normal-shaped trends. This is starting to look like something we can at least statistically flag on, even if a linear relationship might be out of reach.
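
A rough numeric counterpart to judging "normal-shaped" by eye is to look at skewness and excess kurtosis, both of which sit near zero for a normal distribution. This is a sketch under assumptions: the post doesn't say how normality was judged in JMP, `shape_stats` is a hypothetical helper, and it uses the simple population formulas.

```python
from statistics import mean, pstdev


def shape_stats(values):
    """Return (skewness, excess kurtosis) using population moments.

    Both are close to 0 for normally distributed data. Needs non-constant
    input, otherwise the standard deviation is zero.
    """
    m = mean(values)
    s = pstdev(values)
    n = len(values)
    skew = sum(((v - m) / s) ** 3 for v in values) / n
    excess_kurtosis = sum(((v - m) / s) ** 4 for v in values) / n - 3.0
    return skew, excess_kurtosis


# Made-up samples, for illustration only.
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 1, 1, 10]  # one big spike, like a price outlier

sym_skew, sym_kurt = shape_stats(symmetric)
spike_skew, spike_kurt = shape_stats(right_skewed)
```

With only a few dozen daily samples these numbers are noisy, so they are best used alongside a histogram.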

Now that we've clipped out the high-flier, and zoomed in on just the Machariel, things are looking a lot more useful. Though the price/5d avg trends are essentially random, the deviation trend is looking far more linear. This is extremely promising.
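
The clip-then-fit step can be sketched in a few lines of Python. How the high-flier was actually clipped isn't spelled out above, so the absolute threshold here is an assumption, as are the helper names `clip_outliers` and `fit_line`; the fit itself is ordinary least squares.

```python
def clip_outliers(pairs, max_abs_y):
    """Keep only (x, y) pairs whose |y| is within the threshold."""
    return [(x, y) for x, y in pairs if abs(y) <= max_abs_y]


def fit_line(pairs):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up (kills, deviation) pairs; the last one plays the high-flier.
pairs = [(0, 1.0), (1, 0.0), (2, -1.0), (3, 50.0)]
clipped = clip_outliers(pairs, max_abs_y=5.0)
slope, intercept = fit_line(clipped)
```

A negative slope would match the "more ratting, lower price" reading; whatever threshold you choose for clipping should be reported along with the fit.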

What I See

This preliminary result confirms a baseline assumption:

Higher ratting counts will lead to more NPC drops hitting the market and increased supply will drive down the price.

With the little bit of data we have, we can see a pretty strong correlation between deviation from the 5-day trend and total NPC kills in Angel space. Now, that isn't to say we've "solved" the system yet; there are still a lot of troubling points:

The sample size is just barely big enough to work with.

I don't like declaring trends without at least 60 days of data to back them up

There are still some troubling fliers

Though the low and high ends of the graph are telling, there are some points around 800k-850k that make me slightly worried.

Deviation/Volatility should be 0-centered.

We're in a local period of decline. Without positive swings, it's hard to confirm the "less ratting = higher prices" part of the equation

Flavor of the Month (FOTM)

Though the Machariel has been traditionally popular, it's easy to miss the forest for the trees with other indicators such as total sell volumes and other activity metrics
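
Two of the concerns above, sample size and zero-centering, are easy to turn into a quick sanity report. This is only a sketch: `centering_report` is a hypothetical helper, the 60-day minimum is the rule of thumb mentioned above, and the input is assumed to be a list of daily deviations with no missing values.

```python
def centering_report(deviations, min_days=60):
    """Summarize sample size and how well deviations straddle zero."""
    n = len(deviations)
    positive = sum(1 for d in deviations if d > 0)
    return {
        "days": n,
        "enough_days": n >= min_days,
        "mean": sum(deviations) / n,
        "share_positive": positive / n,
    }


# Made-up deviations from a declining stretch, for illustration only.
report = centering_report([-0.02, -0.01, -0.03, 0.01])
```

A mean well below zero and a small share of positive days would both point to the "local period of decline" problem: the data hasn't yet shown what happens on the upswing.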

This is an extremely interesting first result out of the data at hand. Though there are still plenty of points to be cautious about, this is enough confirmation to keep digging and collecting data. Also, this being a derivative trend, I worry about leveraging it directly without a second signal to back it up.

Also, just to show the entire picture, we might need to include a fit-quality metric as a go/no-go boundary. Where the Machariel/Dramiel are traditionally popular, the Cynabal isn't as strong.
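
One candidate for that fit-quality metric is R² from a simple linear fit, with a cutoff below which the signal is ignored. The sketch below is an assumption, not a settled choice: the 0.5 cutoff is an arbitrary placeholder, and `r_squared` and `go_no_go` are hypothetical names.

```python
def r_squared(xs, ys):
    """R^2 of a simple linear fit (the squared Pearson correlation).

    Needs non-constant xs and ys, otherwise a variance term is zero.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return (sxy * sxy) / (sxx * syy)


def go_no_go(xs, ys, min_r2=0.5):
    """True if the linear fit is good enough to act on (placeholder cutoff)."""
    return r_squared(xs, ys) >= min_r2


# Made-up data, for illustration only.
clean_x = list(range(10))
clean_y = [-2 * x + 3 for x in clean_x]  # perfectly linear
noisy_x = [0, 1, 2, 3]
noisy_y = [1, -1, 1, -1]                 # mostly noise

clean_ok = go_no_go(clean_x, clean_y)
noisy_ok = go_no_go(noisy_x, noisy_y)
```

R² says nothing about the direction of the relationship, so a check on the sign of the slope would need to sit next to it.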

Specifically troubling is the Dramiel graph, which shows the reverse of the correlation we'd expect. This could be a signal showing more about demand driving the price than strictly supply. Again, the best approach will probably be multi-factor, but this is a very interesting step toward something. Paired with a market-side predictor, this could be a very useful second source to validate against, or a means to seed forecasts for items that aren't directly manufactured.

Also, I try very hard to test both positive and negative cases. It's easy to accept when a model shows promise, and hard to accept where it might fail. The second thing I always do in these kinds of searches is try to find a case that breaks the tool, and understand why. This is why I'm not particularly a fan of things like MACD, where it feels like a 50/50 shot on whether the signal is true or not. Even more so with candlestick reading.

Regardless, the NPC Kill rates are a very interesting trendline that I look forward to messing with more. At the absolute least, there are still interesting things to be said about where players are spending their time, and there are still a lot of trends left to pick out of this data set.

61 comments:

Unknown said...: Way over my head and insanely interesting as usual.

Is there an advantage to using JMP over R other than speed? R seems more powerful to me.; February 14, 2016 at 12:52 PM