Exploratory Data Analysis - Retail

- October 29, 2022

Task 03 - Exploratory Data Analysis - Retail

Perform Exploratory Data Analysis (EDA) on the "SampleSuperstore" dataset.

Problem Statement :- As a business manager, find the weak areas where you can work to make more profit.

What business problems can you derive by exploring the data?

So, what is Exploratory Data Analysis (EDA)?

Exploratory Data Analysis, or EDA, is an important step in any Data Analysis or Data Science project. EDA is the process of investigating the dataset to discover patterns and anomalies (outliers), and to form hypotheses based on our understanding of the dataset.

EDA involves generating summary statistics for the numerical data in the dataset and creating various graphical representations to understand the data better. In this article, we will understand EDA with the help of an example dataset. We will use the Python language (pandas library) for this purpose.
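As a rough illustration of what "summary statistics" means in pandas, here is a minimal sketch. The small DataFrame is a made-up stand-in for the real dataset (its values are invented for the example); only the column names "Category", "Sales" and "Profit" come from the dataset.

```python
import pandas as pd

# Toy stand-in for the real dataset; the values are invented for illustration.
df = pd.DataFrame({
    "Category": ["Furniture", "Technology", "Office Supplies", "Technology"],
    "Sales": [100.0, 250.0, 40.0, 610.0],
    "Profit": [10.0, 50.0, -5.0, 120.0],
})

# Count, mean, standard deviation, min, quartiles and max for numeric columns.
summary = df[["Sales", "Profit"]].describe()

# Frequency of each value in a categorical column.
category_counts = df["Category"].value_counts()

print(summary)
print(category_counts)
```

On the real data, the same two calls would be run on the DataFrame loaded from the CSV file.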

Data :- https://bit.ly/3i4rbWl

Data Insight :-

The data consists of 13 columns and 9994 rows, and it has no null values.

It does, however, contain duplicates: some rows are repeated, so we have to remove them.
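The checks described above can be sketched roughly as follows. This is not the code from the repository; it uses a tiny made-up DataFrame so that it runs on its own. With the real file you would load the data with pd.read_csv first (the file name "SampleSuperstore.csv" is an assumption).

```python
import pandas as pd

# Toy stand-in for the real dataset; rows 0 and 2 are deliberate duplicates.
# With the real file: df = pd.read_csv("SampleSuperstore.csv")  # assumed file name
df = pd.DataFrame({
    "Ship Mode": ["Second Class", "Standard Class", "Second Class", "First Class"],
    "Segment": ["Consumer", "Corporate", "Consumer", "Home Office"],
    "Sales": [261.96, 14.62, 261.96, 22.37],
    "Profit": [41.91, 6.87, 41.91, 2.52],
})

print(df.shape)                       # (rows, columns)

null_counts = df.isnull().sum()       # number of missing values per column
n_duplicates = int(df.duplicated().sum())  # rows that repeat an earlier row

# Keep the first occurrence of each row and renumber the index.
deduped = df.drop_duplicates().reset_index(drop=True)

print(null_counts)
print(n_duplicates, len(deduped))
```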

Outliers in the "Profit" and "Sales" columns are identified with boxplots and then removed.

(The code can be seen in the GitHub link provided below.)
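One common way to turn a boxplot into a filter is the 1.5 x IQR rule, which is the default rule for boxplot whiskers. The sketch below assumes that rule; the exact method used in the repository may differ. The helper name iqr_mask and the toy values are hypothetical.

```python
import pandas as pd

# Toy data for illustration; 500.0 is an obvious outlier in "Sales".
df = pd.DataFrame({
    "Sales": [10.0, 12.0, 14.0, 16.0, 18.0, 20.0, 22.0, 500.0],
    "Profit": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
})

def iqr_mask(series, k=1.5):
    """Return True for values inside [Q1 - k*IQR, Q3 + k*IQR] (hypothetical helper)."""
    q1 = series.quantile(0.25)
    q3 = series.quantile(0.75)
    iqr = q3 - q1
    return series.between(q1 - k * iqr, q3 + k * iqr)

# Keep only the rows that are inside the whiskers for both columns.
keep = iqr_mask(df["Sales"]) & iqr_mask(df["Profit"])
cleaned = df[keep]

print(len(df), len(cleaned))
```

Both masks are computed on the original data before filtering, so the bounds for one column do not depend on rows already dropped for the other.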

Data Visualisation :-

The visualizations cover "Ship Mode", "Segment", "Category" and "Sub-Category".

(Some of the visuals are shown here; the rest can be seen in the code file linked below.)
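For each of these columns, a bar chart is usually drawn from a grouped total. The sketch below only builds the tables such charts would plot, on made-up rows; it is an illustration, not the plotting code from the repository.

```python
import pandas as pd

# Toy rows for illustration; the values are invented.
df = pd.DataFrame({
    "Ship Mode": ["Standard Class", "Second Class", "Standard Class", "First Class"],
    "Segment": ["Consumer", "Corporate", "Consumer", "Home Office"],
    "Category": ["Furniture", "Technology", "Furniture", "Office Supplies"],
    "Sub-Category": ["Tables", "Phones", "Chairs", "Paper"],
    "Sales": [200.0, 300.0, 150.0, 50.0],
    "Profit": [-40.0, 60.0, 20.0, 10.0],
})

dimensions = ["Ship Mode", "Segment", "Category", "Sub-Category"]

# Total sales and profit per value of each column, weakest profit first.
summaries = {
    col: df.groupby(col)[["Sales", "Profit"]].sum().sort_values("Profit")
    for col in dimensions
}

for col, table in summaries.items():
    print(f"--- {col} ---")
    print(table)
```

With matplotlib installed, each table can be drawn with summaries["Category"].plot(kind="bar"). Sorting by profit puts the weakest groups at the top, which matches the problem statement's focus on weak areas.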

Conclusion:-

Every discount above 20% results in a loss for the company.
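A conclusion like this can be checked by splitting the rows at the 20% mark and comparing profit on each side. The sketch below assumes a "Discount" column stored as a fraction (0.2 for 20%) and uses made-up numbers, so it shows only the shape of the check, not the actual result.

```python
import pandas as pd

# Toy data for illustration; "Discount" as a fraction is an assumption.
df = pd.DataFrame({
    "Discount": [0.0, 0.1, 0.2, 0.3, 0.4, 0.5],
    "Profit": [30.0, 20.0, 5.0, -10.0, -25.0, -40.0],
})

# Label each row by whether its discount exceeds 20%.
df["Discount band"] = (df["Discount"] > 0.20).map({False: "<= 20%", True: "> 20%"})

# Row count, total profit and average profit per band.
by_band = df.groupby("Discount band")["Profit"].agg(["count", "sum", "mean"])

# True if every row with a discount above 20% lost money.
all_high_discounts_lose = bool((df.loc[df["Discount"] > 0.20, "Profit"] < 0).all())

print(by_band)
print(all_high_discounts_lose)
```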

Data :-

https://drive.google.com/file/d/1ujZWSi9wguZvrgJGGwBFJ-2jOcmAgWrz/view?usp=sharing

LinkedIn ID :-

https://www.linkedin.com/in/raghavcho/

GitHub :-

https://github.com/dsraghav/Spark-Foundation-Internship-

Feel free to connect with me if you find any issues with the analysis or have any suggestions.

Thank You.
