Tuesday, June 18, 2024
Kaggle ML and DS Survey 2022: Key Insights

Kaggle, the world’s largest data science community, released the Kaggle ML and DS Survey findings for 2022. Here are some insights from the survey:

Demographic characteristics

The survey shows that the number of data scientists living and working in India has increased every year over the last five years. Japan is among the other countries showing a rising trend, while countries like the US have shown near-stagnant growth, apart from a rise in the number of data scientists in 2022.

Programming skills and coding infrastructures

According to the survey, Python and SQL remain the most prominent programming languages for data scientists. Python leads SQL by a significant margin and has surpassed R as well as other languages such as C++, Java, and JavaScript, which do not necessarily help people excel in the field.

JupyterLab remains the most widely used notebook environment, followed by Google Colab and Kaggle Notebooks, replacing the more traditional RStudio and MATLAB. The survey also reveals that many data scientists have shifted to VS Code for software development.

Machine learning frameworks

Scikit-learn stands out as the most popular framework, followed by TensorFlow and XGBoost. Although these remain at the top of data scientists’ lists, their usage has stayed nearly constant, while PyTorch has been growing steadily.

The findings include concrete figures on how many people work with data, trends in machine learning across industries, and the best approaches for aspiring data scientists to enter the profession. It is an intriguing example of a survey dataset because Kaggle released the full response data, not just the aggregated survey results, allowing analysts to study the data independently.
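Because the raw responses are available, a reader can tally answers without relying on the published summaries. The following is a minimal sketch of that idea, not Kaggle's own method. It uses only the Python standard library, and the column names (Q_country, Q_language), the sample rows, and the assumption that the first row after the header holds the question text are all hypothetical stand-ins for the real export.

```python
import csv
import io
from collections import Counter


def count_answers(csv_file, column, skip_rows=1):
    """Tally the non-empty answers in one column of a survey export.

    skip_rows is the number of rows after the header to ignore. It is
    assumed here that those rows hold question text, not responses.
    """
    reader = csv.DictReader(csv_file)
    counts = Counter()
    for index, row in enumerate(reader):
        if index < skip_rows:
            continue  # skip the assumed question-text row(s)
        answer = (row.get(column) or "").strip()
        if answer:  # ignore respondents who left the question blank
            counts[answer] += 1
    return counts


# Made-up sample standing in for the real file; the values are illustrative only.
SAMPLE = """Q_country,Q_language
In which country do you reside?,Which language do you use most?
India,Python
Japan,Python
India,SQL
United States,
"""

if __name__ == "__main__":
    counts = count_answers(io.StringIO(SAMPLE), "Q_country")
    for answer, n in counts.most_common():
        print(answer, n)
```

To run this against a downloaded file, pass an open file handle instead of the in-memory sample and substitute the column identifier used in the actual export.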

Kaggle ML and DS Survey Competition 2022

Following its sixth annual industry-wide survey, Kaggle announced a competition to surface a comprehensive view of the state of machine learning and data science.

Kaggle is launching the annual Data Science Survey Challenge and will award US$30,000 in prizes to notebook authors who best describe a particular segment of the data science and machine learning community. The challenge is an opportunity for people to use their imagination and tell the story of a group of people with whom they identify.

The submissions will be evaluated on the following:

  • Composition: the narrative and the subject should be well put together, researched, and supported by data and visualizations. 
  • Documentation: the code and notebooks should be understandable to an ordinary reader, with adequately cited sources and a concise analysis of each step. The documentation should represent the rationale behind your story.
  • Originality: the entry should be informative, thought-provoking, and non-plagiarized.

To be considered valid, a submission must be contained in a single notebook and made public on Kaggle before the submission deadline. In addition to the Kaggle Data Science survey data, participants are welcome to use any other datasets.

Disha Chopra
Disha Chopra is a content enthusiast! She is an Economics graduate pursuing her PG in the same field along with Data Sciences. Disha enjoys the ever-demanding world of content and the flexibility that comes with it. She can be found listening to music or simply asleep when not working!

