Let’s not start with data science this time. Let’s start with psychology. I am far from having any competence in this domain, but I remember being presented with Maslow’s hierarchy of needs in high school. The best way I can describe it is as the different stages humans must go through to find happiness. To get a better understanding of it, you can look here.
Here is the famous pyramid.
The key thing to get out of this is that you need to go through all the steps; you can never skip any.
‘Needs lower down in the hierarchy must be satisfied before individuals can attend to needs higher up.’ ― simplypsychology ―
OK wait a minute, isn’t this supposed to be a blog about technology? Yes, you are right, here it goes.
As you may know, I recently changed careers. I moved to Shopify as a Data Scientist. As with any other career change, by working in a new environment, with new people, a new boss, new everything, you discover new ways of doing things and new ways of seeing the world.
As a mentor told me one day, “You will see, by changing jobs, what looked so hard in your old job will not be tough at all in the new place. What you never knew could be an issue will be a rough challenge in the new place.” As on many other occasions, he was right.
In a previous role I used to find myself a bit disconnected as a machine learning “expert” or whatever you like to call it. I would be doing very cool things, but it was sometimes a bit hard to get the full picture.
When I joined Shopify I was introduced to the Data Science Hierarchy of Needs. You can read the full article here; I think it’s really worth it. But TL;DR: it is useless to hire top data scientists if you don’t have the proper base.
So to make this more concrete, here is the famous pyramid.
It took me a while to really understand it, but as cool as it is to do crazy multi-level-convolutional-neural-net-deep-random-forest-other-cool-buzzword-maybe-even-microservices, you cannot jump to this stage before being accomplished in the underlying stages.
I do not mean this only in terms of individual skills. It applies to a company, and even to a project. When you work on a project, the project has to cross these stages. Most likely (as with Maslow’s pyramid) you will never reach the last step for most of your projects.
As with Maslow’s pyramid, there is absolutely no way you can skip ahead in the steps. If you jump to a late stage too fast, it is not a solid pyramid and has a high chance of failure. Moreover, to properly accomplish each of these steps, you must use the output from the previous step. At the base of the pyramid, the output is data, processes and databases; later in the process it is more about insights, answers, etc.
Each and every component from the previous step is crucial to the success of the following stage.
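To make the "each step uses the output of the previous one" idea a bit more tangible, here is a minimal sketch in Python. Everything in it is made up for illustration: the stage names (`collect`, `clean`, `aggregate`, `answer`), the sample orders and the `run_pyramid` helper are my own assumptions, not something taken from the Data Science Hierarchy of Needs article or from any real pipeline.

```python
from __future__ import annotations


# Hypothetical stages, ordered from the base of the pyramid to the top.
def collect(_: None) -> list[dict]:
    # Base of the pyramid: raw data (made-up sample records).
    return [
        {"order_id": 1, "amount": "20.0"},
        {"order_id": 2, "amount": None},
        {"order_id": 3, "amount": "35.5"},
    ]


def clean(raw: list[dict]) -> list[dict]:
    # Drop records with a missing amount and convert the rest to floats.
    return [
        {**record, "amount": float(record["amount"])}
        for record in raw
        if record["amount"] is not None
    ]


def aggregate(rows: list[dict]) -> dict:
    # Turn cleaned rows into simple metrics.
    total = sum(row["amount"] for row in rows)
    return {"orders": len(rows), "average_amount": total / len(rows)}


def answer(metrics: dict) -> str:
    # Top of this toy pyramid: a human-readable insight.
    return (
        f"{metrics['orders']} valid orders, "
        f"average amount {metrics['average_amount']:.2f}"
    )


STAGES = [collect, clean, aggregate, answer]


def run_pyramid(stages):
    # Feed the output of each stage into the next one, bottom to top.
    output = None
    for stage in stages:
        output = stage(output)
    return output


if __name__ == "__main__":
    print(run_pyramid(STAGES))
```

If you try to skip a stage, for example `run_pyramid([collect, aggregate, answer])`, the aggregation step receives raw strings and missing values and fails with a `TypeError`. That is the toy version of a pyramid that is not solid.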
My new role at Shopify is slightly different from my past ones: I am involved in almost all phases of the projects I participate in. I can tell you something crazy: most of the value ($) comes from a good understanding of the business problem, the product and the data. Not from using funky algorithms, not from using super advanced tools.
Don’t get me wrong, these fancy algorithms are useful, and the super-expensive tools are useful (it depends, but this is another topic), but they should not be the first thing you do. They are never low-hanging fruit, and they are rarely the high $ generator.
Unless you have done the rest, you should not lose time with this.
All across the pyramid
The other thing I have discovered with this is the power of individually owning the whole pyramid for a project (or part of a project).
Because I wrote the ETL for this data, because I was sitting with the devs when we decided what data to produce and how to wrap it, because I was with the project manager when we decided the key metrics for the project and what we would A/B test, I understand the data like I never did before. I truly understand where it comes from, how it is transformed and how to use it.
This makes my life so much easier when the time comes to derive any insight. I am not dependent on anyone else taking the time to explain it to me. If I find weird behaviors, I can either explain them or have them corrected. I can work with the data.
When you jump to a higher stage too quickly, you are lying. Lying to yourself, lying to your team and lying to the great discipline of data science. I know it does not sound cool, but be humble and invest the effort where it should be. The data deserves it.