Analytics Drift spoke with Arihant Jain, Lead Data Scientist at ZestMoney, to get his perspective on various data science topics and help freshers make effective career decisions. Arihant is a mechanical engineer by degree and a data scientist by choice. Over the years, he has actively mentored students who want to get into the data science field. Arihant has more than six years of data science experience, having worked with prominent organizations such as Vodafone, RBL Bank, and Genpact.
AD: According to you, what are some of the mistakes made by freshers?
Arihant Jain: Beginners often consider themselves data scientists after learning Python, statistics, SQL, and mathematics. But the real world is different; one needs business understanding to deliver a return on investment for companies. Data scientists are expected to understand business challenges and analyze whether they can be solved using machine learning techniques. If a problem can be solved, the next step is to collect the right data from different sources and then apply what they have learned technically. Along with technical skills, they should focus on structured and critical thinking, understanding the domain, converting business problems into analytical problems, and more.
AD: Why do most data science projects fail to go into production?
Arihant Jain: There are multiple reasons why data science projects fail to go into production. One reason projects remain in proof of concept forever is that professionals fail to define the problem statement accurately. This is where business understanding plays a significant role. In addition, many data science projects fail because the results of the proof of concept are not communicated well to the decision-makers. Effective storytelling skills are therefore vital for data scientists to turn ideas into reality within organizations. Critical thinking, business understanding, and storytelling, although overlooked, are essential for any data scientist to thrive in their career, as these help in delivering value to organizations.
AD: Do you think data scientists need to know about product development?
Arihant Jain: In the last couple of years, there was demand for talent who could build models, but the curve is shifting, and it is going to move rapidly. Companies now realize that there is no value in hiring people who can only create models in a Jupyter Notebook. Relying on software engineers to develop the products might not be the best way forward, since they may not implement data science projects in the desired way. This is why MLOps is a prominent trend, where data scientists need to write production-level code and understand the deployment of models.
It might be too much to ask beginners to learn MLOps, but even a basic understanding will differentiate them from the rest. They can become proficient as they move ahead in their careers. At the least, basic knowledge, such as the fundamentals of Docker and using it to package and deploy models locally and on the cloud, will take them a long way.
Today, data scientists do not need to know any secrets to build models. Anyone can refer to articles and GitHub to create ML models. The core skills that remain are how to design a model, how to frame a problem statement, how to sell it through better storytelling, and how to deploy it and measure its effects. Organizations are already considering these skills while hiring, and in the coming years, the demand for such skills will grow dramatically.
AD: How will AutoML impact data science jobs?
Arihant Jain: AutoML is another hype created in the last couple of years, but it surely will not eat up data science jobs. However, I am optimistic about the role of AutoML solutions in automating repetitive work in data science workflows. Identifying and defining a problem, though, is something that can only be done by professionals. Automating trivial tasks is a win-win for data scientists and AutoML providers, as professionals can focus on creative work, thereby bringing value to their organizations. Data scientists still spend a lot of time on data cleaning and other repetitive tasks. Eliminating such work with AutoML is what everyone wants.
AD: Is there a talent supply and demand gap in the industry?
Arihant Jain: In a way, there is no supply and demand gap, because we see many data scientists entering the field every day. But there is a difference between someone who earns a certification in nine months and someone who has real skills, which is where the gap exists. Data science did not become as popular as it is merely because someone can import a library and run code. Rather, it is about the impact data science can bring if implemented with due diligence. Organizations need professionals who can help generate revenue, not just write algorithms that do not deliver value.
Although institutes are turning out data science practitioners trained in Python and other tools, the industry is struggling to find the right talent when it comes to the value they can bring. Unfortunately, I think the talent gap will persist until aspirants start thinking independently, because that is what data scientists do: they learn the necessary skills while building on a strong foundation. However, aspirants get lost in the ocean of information and learn several techniques instead of honing the basics.
AD: Many aspirants pursue data science because of the hype. What would you advise them?
Arihant Jain: If learners want to find out whether data science is for them or whether they are just here because of the hype, they should work on various projects for three months; if they feel like learning more, then data science is for them. One of the best ways to assess oneself is by participating in Kaggle competitions that run for three to six months. After the contest ends, they should evaluate their leaderboard position and consider whether they enjoyed the process. If the answer to the latter is no, they will eventually get frustrated within a few months. Do not enter the field because of the high paycheck. This field requires dedication, commitment, and hard work to solve problems and create impact.