2021 Kaggle Survey Analysis

Kashish Rastogi
3 min read · Nov 17, 2021

Data visualization is an essential part of understanding data. This notebook contains Plotly infographic charts.

Plotly Bar Chart (Infographics)

Introduction

Many of you have heard or read about the buzz around Data Science and Machine Learning. As Clive Humby put it, “Data is the new oil.” It is valuable, but if unrefined it cannot really be used. Oil has to be changed into gas, plastic, chemicals, etc. to create a valuable entity that drives profitable activity; in the same way, data must be broken down and analyzed for it to have value. In 2011, Peter Sondergaard, senior vice-president of Gartner, took this concept even further by stating, “Information is the oil of the 21st century, and analytics is the combustion engine.”

This is the 5th year that Kaggle has conducted an in-depth user survey and publicly shared the results. More than 25,000 data scientists and ML engineers submitted responses on their backgrounds and day-to-day experience, covering everything from educational details to salaries to preferred technologies and techniques.

This blog is divided into several parts. Here I compare how students, Data Scientists, Machine Learning Engineers, and Research Scientists work, and try to find the things they have in common, so that students can start focusing on those topics and get a step closer to success.

You can find all the code here on Kaggle and the data here.

Plotly Scatter Chart

Looking at past survey data together with the data provided for this competition, let’s find out how the female-to-male ratio has changed overall. Out of the 25,973 survey participants, only 18.8% were women. We cannot conclude from this that only 18.8% of the data science community is female. Still, the low participation by women is cause for great concern.
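To make the percentage above concrete, here is a minimal sketch of how such a share can be computed. It is not the notebook's actual code: the toy rows, the `gender` key, and the answer labels are assumptions made for illustration, and the real survey file uses its own column names.

```python
from collections import Counter

# Toy rows standing in for survey responses (one dict per respondent).
# The "gender" key and the answer labels are illustrative assumptions.
responses = [
    {"gender": "Man"},
    {"gender": "Woman"},
    {"gender": "Man"},
    {"gender": "Man"},
    {"gender": "Prefer not to say"},
]


def gender_share(rows, key="gender"):
    """Return each answer's share of all respondents, in percent."""
    counts = Counter(row[key] for row in rows)
    total = sum(counts.values())
    return {answer: round(100 * n / total, 1) for answer, n in counts.items()}


shares = gender_share(responses)
print(shares)
```

Running the same function on each year's responses would give one share per year, which is the kind of series a scatter chart of the ratio over time needs.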

The data I have chosen to analyze consists of students, Data Scientists, Research Scientists, and Machine Learning Engineers. Students appear to participate in the survey more than any other data-related group.
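The selection described above can be sketched roughly as follows. This is an illustration under assumptions, not the notebook's code: the `role` key, the toy rows, and the helper names `count_roles` and `bar_figure_spec` are hypothetical. The last step builds a plain dictionary in Plotly's figure layout (`data` and `layout` keys), which can usually be handed to `plotly.graph_objects.Figure` if Plotly is installed.

```python
from collections import Counter

# The four groups compared in this post.
ROLES = [
    "Student",
    "Data Scientist",
    "Machine Learning Engineer",
    "Research Scientist",
]

# Toy rows; the "role" key is an assumed, simplified column name.
responses = [
    {"role": "Student"},
    {"role": "Data Scientist"},
    {"role": "Student"},
    {"role": "Software Engineer"},
    {"role": "Research Scientist"},
    {"role": "Machine Learning Engineer"},
    {"role": "Student"},
]


def count_roles(rows, roles=ROLES, key="role"):
    """Keep only the selected roles and count them, most frequent first."""
    counts = Counter(row[key] for row in rows if row[key] in roles)
    return counts.most_common()


def bar_figure_spec(pairs, title):
    """Build a horizontal bar chart as a plain dict in Plotly's figure layout."""
    labels = [label for label, _ in pairs]
    values = [value for _, value in pairs]
    return {
        "data": [{"type": "bar", "orientation": "h", "x": values, "y": labels}],
        "layout": {"title": {"text": title}},
    }


role_counts = count_roles(responses)
figure = bar_figure_spec(role_counts, "Respondents by role")
print(role_counts)
```

Keeping the counting separate from the chart specification makes it easy to check the numbers before styling the infographic.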

Let’s find out which level of education students and data scientists have attained so far or plan to attain in the next 2 years. Because the data contains a significant number of students, we might see growth in the “no formal education past high school” and “bachelor’s degree” categories.
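One way to look at education level per group is a small cross-tabulation. The sketch below assumes simplified `role` and `education` keys and made-up rows; the helpers `education_by_role` and `percentages` are hypothetical names, and the numbers are not survey results.

```python
from collections import Counter, defaultdict

# Toy rows; keys and answer labels are illustrative assumptions.
responses = [
    {"role": "Student", "education": "Bachelor's degree"},
    {"role": "Student", "education": "Bachelor's degree"},
    {"role": "Student", "education": "No formal education past high school"},
    {"role": "Data Scientist", "education": "Master's degree"},
    {"role": "Data Scientist", "education": "Doctoral degree"},
    {"role": "Data Scientist", "education": "Master's degree"},
]


def education_by_role(rows):
    """Count education answers separately for each role."""
    table = defaultdict(Counter)
    for row in rows:
        table[row["role"]][row["education"]] += 1
    return table


def percentages(counter):
    """Convert raw counts into percentages of the group's total."""
    total = sum(counter.values())
    return {answer: round(100 * n / total, 1) for answer, n in counter.items()}


table = education_by_role(responses)
student_pct = percentages(table["Student"])
scientist_pct = percentages(table["Data Scientist"])
print(student_pct)
print(scientist_pct)
```

Comparing percentages within each group, instead of raw counts, keeps the large number of students from dominating the comparison.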

You can find all the code here on Kaggle and the data here. Do leave an upvote if you find it interesting.

You can contact me here

Linkedin | Kaggle | Blog

Kashish Rastogi

Data Analyst | Data Visualization | Storyteller | Tableau | Plotly