Data science aspirants, whether freshers or other IT professionals, often believe that an online certification is more than enough to land a job in this competitive domain. This is mainly because they consider themselves data scientists after acquiring knowledge of tools and techniques from different sources. Aspirants think they are ready to join any organization and analyze whatever data they are provided with. However, this is a myth. Organizations do not always provide you with the necessary data. You will not have everything readily available at your desk to apply what you learned from MOOCs or other courses. Consequently, building a data science career and succeeding in it is not as straightforward as aspirants envision.
To help our readers make the right data science career decision and understand why data scientists fail to deliver outside their theoretical knowledge, we interacted with Chiranjiv Roy, Chief of Data Science & AI Products at Access2Justice Technologies. Chiranjiv provided several valuable insights into the industrial practices of data science while also suggesting the right approach for aspirants to succeed.
We also asked Chiranjiv about his journey and the practices he embraced to build a fruitful data science career. In his 20-year career, Chiranjiv has become a learned data scientist while working for prominent organizations such as ResoluteAI.in, Nissan Motors, Mercedes-Benz Research and Development, Hewlett Packard Enterprise, WNS Global Services, and HSBC.
Today, Chiranjiv is also a visiting faculty member at engineering and business schools. He has filed 14+ patents on the use of data science to solve real-world automotive and manufacturing problems by developing and enhancing products and gaining efficiency.
Chiranjiv’s Data Science Journey Has No Shortcuts
Chiranjiv has a bachelor’s degree in Statistics, dual master’s degrees in Statistics and Computational Mathematics, and a PhD in Applied Data Science for Industrial Engineering. His foundation was laid by a keen interest in computational mathematics, physics, and applied statistics during his education. His love for data helped him carry out his master’s thesis and research work on failure models in a manufacturing factory, giving him the confidence to keep learning.
Starting his career at HSBC in 2001 as a Data Analyst, Chiranjiv quickly became a manager at the company thanks to his expertise with data and statistics. After a spell of five years and eight months at HSBC, he moved to WNS Global Services as a Senior Data Manager. Later, he went on to become a Lead Data Scientist at Mercedes-Benz and then at Nissan Motors. “Data science was not a field of great importance or as popular as it is now when I started, but just got lucky that I never had to make a shift in my career and had my journey from data engineering to data science in the last two decades,” says Chiranjiv.
While working in countries like the US and India for companies such as Hewlett Packard, Nissan Motor, and Mercedes-Benz Research and Development India, Chiranjiv has worked extensively in the areas of risk management, automobiles, manufacturing, and optimization systems. He believes that working with top-line researchers has helped him learn and fall more in love with data science while developing real-time applications of data engineering, analytics, and data science for data monetization.
Approach To Solving Data Science Problems
Chiranjiv says it takes years to become a data scientist since the roadmap is quite linear. Over the years, he has learned about the field both in academia and while working for various organizations, gradually becoming the data scientist he is today. However, seeing the impatience among beginners and practitioners to become data scientists, he stresses that today’s aspirants want to dive quickly into a data science career without understanding the intricacies of the landscape.
“Data engineering and analysis is the first step to be a good data scientist. But, aspirants believe learning some programming languages and algorithms will make them a successful data scientist. What they do not understand is that a data scientist is a problem solver not only a Python programmer,” says Chiranjiv.
To become a problem solver, one should know or understand the business challenges and convert them into data problems. Practitioners, in contrast, try to fit algorithms into problems. Given a business challenge, beginners immediately think of applying some machine learning models without even understanding the business domain or the real challenge.
This is mostly due to misinformation spread by various sources. To counter it, Chiranjiv explains the approach one should follow in order to succeed in a data science career.
“From data science courses one can only obtain the fundamentals. But, that is not enough to solve problems. One needs to spend time in understanding the problem and knowing which domain the problem is related to. If you do not understand the business domain, talk to people who are well versed in the domain. The most critical aspect of becoming a data scientist is to understand the domain before forging towards data analysis. Unless you acquire domain knowledge, you should not jump into solving the problems by fitting data science algorithms,” says Chiranjiv.
“Once you have the business understanding, the next step is to assess the data and form business problems. Following this, you can start developing the models. That means, you need to evaluate which model is the right fit and then create a proof of concept (POC). However, this does not end here.”
“As a data scientist you also need to be a good storyteller. After the POC, you have to leverage your visualization skills and showcase how your models’ result can be effective in mitigating business challenges to the decision makers.”
“Storytelling is an essential skill for data scientists as top management do not know which machine learning or deep learning models are implemented to get the results. This is where visualization simplifies the job of decision makers in inferring outcomes and forecasting the value models can deliver. But value is only created if you are able to communicate effectively with the product developers. For this, you need to have knowledge of agile-based software development practices. In organizations, data scientists’ efforts can only bring business growth if their models are in the production,” he adds.
Is a PhD Essential To Succeed?
A PhD teaches you process, time management, refinement of your approach, focus amid enormous challenges, and a self-starting attitude, and it helps you become a problem-solver more than a researcher.
When asked whether PhD candidates have an edge over people who learn from online courses, Chiranjiv said that it depends on what aspirants want to become. There are two aspects of data science: approach and implementation. Academics focus on teaching the best approaches to solving problems. But three- or six-month online courses can only teach the fundamentals of data science.
You should take at least a year-long course to ensure you go beyond the fundamentals and learn the best data science approaches. This is where long-term programs, primarily PhDs, help, as is apparent from the number of research papers their candidates publish.
Although long-term programs make you exceptional at approach, the second aspect, implementation, requires you to develop a product mindset. More often than not, the ideal approach to a solution is not feasible to turn into a product. For instance, while solving a business problem, you might develop one of the superior classes of neural networks or deep learning techniques, only to find that practical implementation is not possible because the required libraries or computational resources are unavailable. Then you have to give up the best approach, which gives 99% accuracy, in favor of an approach that can be implemented with the available libraries and computational resources but delivers only 90% accuracy. Data scientists have to make such trade-offs because product development strategy includes return on investment, timelines, and agile principles.
Academics can give you knowledge of fundamentals and techniques. But your intelligence comes into the picture when you talk to the software development team and come up with the top five models, of which they might be able to implement only numbers three, four, and five to achieve the same result as the best but non-feasible models, one and two. Optimizing models three, four, or five to achieve the outcome the best models would deliver is your intelligence. This is what organizations expect from data scientists.
You cannot make products just by doing MOOCs or other academic courses. It is not always about the best approach; you can only have a successful data science career in organizations if your skills help deliver products. There is a high probability that your top three models will never get into production. This is a practical reality.
Most of the business challenges in the world are solved by regression and support vector machines. Then who wins the game? Is it the data scientist who deploys basic approaches and makes products, or the data scientist who has learned every approach in the world but keeps struggling to get his or her model into production? The former, since he or she can bring business value to organizations with data-driven products. This means a PhD is not necessary. You can learn from any resource, but make sure that the fundamentals are clear.
However, if you want to be a professor rather than work in organizations, a PhD is a must.
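The shortlisting described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the model names, the accuracy figures, and the `deployable` flag are hypothetical placeholders, not numbers or a method from Chiranjiv. The idea is simply to rank candidates by accuracy and then pick the best one the engineering team can actually ship.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str         # hypothetical model label
    accuracy: float   # illustrative validation accuracy, not a real result
    deployable: bool  # assumption: can it ship with current libraries and compute?


def shortlist(candidates, top_n=5):
    """Rank candidates by accuracy, best first, and keep the top_n."""
    return sorted(candidates, key=lambda c: c.accuracy, reverse=True)[:top_n]


def best_deployable(ranked):
    """Return the highest-ranked candidate that can be shipped, or None."""
    return next((c for c in ranked if c.deployable), None)


# Made-up candidates: models one and two score best but cannot be deployed.
candidates = [
    Candidate("model_3", 0.93, True),
    Candidate("model_1", 0.99, False),
    Candidate("model_5", 0.90, True),
    Candidate("model_2", 0.97, False),
    Candidate("model_4", 0.91, True),
]

ranked = shortlist(candidates)
choice = best_deployable(ranked)
print([c.name for c in ranked])
print(choice.name, choice.accuracy)
```

In practice the `deployable` flag would come out of the conversation with the software development team, and the chosen model would then be tuned to close the gap with the models that could not be shipped.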
Work Experience
Working with a wide range of technology organizations and entrepreneurs has helped Chiranjiv gain knowledge in highly regulated industries such as financial services, manufacturing, IT systems, and automotive.
Currently, as a data science leader at Access2Justice Technologies, Chiranjiv manages a team of data scientists, communicates with stakeholders, and creates a culture in which the team can thrive. According to Chiranjiv, culture can be the difference between a successful data science initiative and a failed one, which in turn can define organizational growth.
To build the right culture, Chiranjiv points to the importance of a leader’s role in companies. A leader has to set aside time to guide new joiners by educating them about the problems, which ensures the delivery of desired results in the future. But this seldom happens. Often, professionals are left to fit into the team on their own, and as a result, organizations fail to harness the full potential of practitioners.
As a part of his job, Chiranjiv also hires and promotes data scientists and ML engineers for potential roles. He assesses their intent to learn, their knowledge of data structures and algorithms, and their self-challenging attitude. “I also help leaders and HRs who ask me to recommend proficient data scientists. As a result, I always helped aspirants/professionals in obtaining the first role or a good transition,” he says.
Final Thoughts For Aspirants
For aspirants, Chiranjiv explains that data science is value-driven; it is not about cost reduction or resource calibration for a business. He points out that problems are everywhere, which means one can obtain data from a wide range of things around them. Kaggle is not the only place where you can get datasets.
To clarify further, Chiranjiv cited an instance where a student from IIM asked him how he could get an internship, a job, or newer datasets on Kaggle to solve real-world problems, given that hiring and activity on data science platforms had slowed down due to COVID-19. Chiranjiv explained how the student could place an Arduino between the power source and televisions of different brands in his hostel to collect data on power fluctuations during power outages. He could then use the gathered data to create a time series plot and develop a model to analyze the power surges across several television brands. Eventually, he could reach a conclusion and show which manufacturer is more reliable, thereby helping the college buy better televisions in the future.
From going back to coding and working with and mentoring startups to understanding actual business problems and developing system designs, data architecture, data models, DevOps, DataOps, and MLOps, what the world demands from a data scientist is a leader who can take a concept to a real-life product.
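As a rough illustration of the analysis step in that example, the sketch below summarizes logged voltage readings per television brand using only the Python standard library. Everything in it is an assumption made for illustration: the brand names, the sample readings, and the 230 V nominal value are made up, and a simple descriptive summary stands in for the model the student would eventually build.

```python
from statistics import mean, pstdev

NOMINAL_VOLTS = 230.0  # assumption: nominal mains voltage; adjust for your supply

# Made-up sample readings (volts), one list per hypothetical TV brand,
# standing in for values an Arduino might log at a fixed sampling interval.
readings = {
    "brand_a": [230.0, 231.0, 229.0, 252.0, 230.0],
    "brand_b": [230.0, 230.5, 229.5, 236.0, 230.0],
}


def surge_summary(series, nominal=NOMINAL_VOLTS):
    """Summarize how far a voltage series strays from the nominal value."""
    deviations = [abs(v - nominal) for v in series]
    return {
        "peak_deviation": max(deviations),   # largest single excursion
        "mean_deviation": mean(deviations),  # average excursion
        "spread": pstdev(series),            # overall variability
    }


summaries = {brand: surge_summary(series) for brand, series in readings.items()}

# Rank brands from smallest to largest peak deviation.
ranking = sorted(summaries, key=lambda b: summaries[b]["peak_deviation"])
print(ranking)
```

A real comparison would need timestamps, far more readings per brand, and repeated outages before any claim about reliability could be made; the point here is only that self-collected data can be turned into a comparable summary with very little code.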
“Aspirants should come up with different use cases on their own, this will showcase their interest in solving business problems. However, aspirants are dependent on Kaggle for datasets because it prepares and provides the data. The reality, in contrast, is that you will not get data while working for organizations. You only get problem statements. One has to think through to find different sources to collect data, and trust me DATA IS ALL AROUND US,” concludes Chiranjiv.