The data science pyramid


Let’s not start with data science this time. Let’s start with psychology. I am far from having any competence in this domain, but I remember being introduced to Maslow’s hierarchy of needs in high school. The best way I can describe it is as the different stages humans must go through to find happiness. To get a better understanding of it, you can look here.

Here is the famous pyramid.

The key thing to get out of this is that you need to go through all the steps. You can never skip any.

Needs lower down in the hierarchy must be satisfied before individuals can attend to needs higher up.

OK, wait a minute, isn’t this supposed to be a blog about technology? Yes, you are right, so here it goes.

New Job

As you may know, I recently changed careers. I moved to Shopify as a Data Scientist. As with any other career change, by working in a new environment, with new people, a new boss, new everything, you discover new ways of doing things and new ways of seeing the world.

As a mentor once told me, “You will see: when you change jobs, what looked so hard in your old job will not be tough at all in the new place, and what you never knew could be an issue will be a rough challenge.” As so many other times, he was right.

In a previous role I used to find myself a bit disconnected as a machine learning “expert”, or whatever you like to call it. I would be doing very cool things, but it was sometimes hard to get the full picture.

When I joined Shopify I was introduced to the Data Science Hierarchy of Needs. You can read the full article here; I think it’s really worth it. But TL;DR: it is useless to hire top data scientists if you don’t have the proper base.

So to make this more concrete, here is the famous pyramid.

It took me a while to really understand it, but as cool as it is to do crazy multi-level-convolutional-neural-net-deep-random-forest-other-cool-buzzword-maybe-even-microservices, you cannot jump to this stage before being accomplished in the underlying stages.

And I don’t mean only in terms of individual skills. I mean as a company, and even as a project. When you work on a project, the project has to cross these stages. Most likely (as with Maslow’s pyramid) you will never reach the last step for most of your projects.

As with Maslow’s pyramid, there is absolutely no way you can jump far ahead in the steps. If you jump to a late stage too fast, the pyramid is not solid and has a high chance of failure. Moreover, to properly accomplish each of these steps, you must use the output from the previous step. At the base of the pyramid, the output is data, processes and databases; later in the process it is more about insights, answers, etc.

Each and every component from the previous step is crucial to the success of the following stage. 
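To make the "no skipping" idea a bit more concrete, here is a minimal sketch of how a project could be checked against the pyramid. The stage names and both helper functions are my own assumptions for illustration (the post does not list the stages), so treat this as a toy, not as anyone's actual tooling.

```python
# Hypothetical stage names, ordered from the base of the pyramid to the top.
# These are assumptions for illustration only.
STAGES = [
    "collect",
    "move_store",
    "explore_transform",
    "aggregate_label",
    "learn_optimize",
]


def highest_solid_stage(completed):
    """Return the last stage reached without skipping an earlier one, or None."""
    reached = None
    for stage in STAGES:
        if stage not in completed:
            break  # the first gap ends the solid part of the pyramid
        reached = stage
    return reached


def skipped_stages(completed):
    """Return the stages missing below the highest stage the project claims."""
    claimed = [i for i, stage in enumerate(STAGES) if stage in completed]
    if not claimed:
        return []
    top = max(claimed)
    return [stage for stage in STAGES[:top] if stage not in completed]


# A project that jumped straight to modelling on top of a thin base.
project = {"collect", "move_store", "learn_optimize"}
print(highest_solid_stage(project))
print(skipped_stages(project))
```

In this toy example the project claims to be at the modelling stage, but its solid base stops at storage, and the two skipped stages are exactly where the missing outputs would have come from.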


My new role at Shopify is slightly different than my past ones: I am involved in almost all phases of the projects I participate in. I can tell you something crazy: most of the value ($) comes from a good understanding of the business problem, the product and the data. Not from using funky algorithms, not from using super advanced tools.

Don’t get me wrong, these fancy algorithms are useful, and the super-expensive tools are useful (it depends, but this is another topic), but they should not be the first thing you reach for. They are never low-hanging fruit, and they are rarely the big $ generator.

Unless you have done the rest, you should not spend time on this.

All across the pyramid

The other thing I have discovered with this is the power of individually owning the whole pyramid for a project (or part of a project).

Because I wrote the ETL for this data, because I was sitting with the devs when we decided what data to produce and how to wrap it, and because I was with the project manager when we decided on the key metrics for the project and what we would A/B test, I understand the data like I never did before. I truly understand where it comes from, how it is transformed and how to use it.

This makes my life so much easier when the time comes to derive any insight. I am not dependent on anyone else taking the time to explain things to me. If I find weird behaviors, I can either explain them or have them corrected. I can work with the data.


When you jump to a higher stage too quickly, you are lying. Lying to yourself, lying to your team and lying to the great discipline of data science. I know it does not sound cool, but be humble and invest the effort where it should be. The data deserves it.

1 thought on “The data science pyramid”

  1. I love it when I can disagree with you 🙂 . I’m no psychologist either, but I disagree with the assertion that you need to go through all the steps and can never skip any. If I’m in a dangerous situation, I might forgo the fact that I’m hungry or tired to get out of harm’s way. If I’m psychologically distressed, I might need friends, love and encouragement to get out of harm’s way. I might be totally happy with my life, ticking all the checkboxes in the pyramid, and still decide to undergo a long and difficult fast for whatever reason, and that won’t reduce my happiness. It might even raise it!

    As with most things in life, nothing is black or white. Everything is some shade of grey. As with the Maslow pyramid, the Data Science pyramid stages may be bypassed in certain circumstances. They may well need to be bypassed in certain cases. And you may want to bypass some to get fulfillment, just as you could decide to fast. For sure, fasting is not a “permanent” condition, but it could be required for your well-being and fulfillment.

    In a previous job, we had a way to collect, move and store data, but not in any reliable way. Still, we were able to, and had to, perform most of the above-mentioned stages in order to demonstrate the potential of the technology, so that it would get accepted and we could later revisit the parts that were sub-optimal. Sometimes you have to get some comforting from a friend in order to get out of harm’s way before you can eat!
