Week 1: Basic Data Descriptors, Statistical Distributions, and Application to Business Decisions
Download the file “OrderList.xlsx.” This file contains basic customer order information, including an order number, the region where the order was placed, the age of the customer, and the total dollar amount of the order.
Note: Please use a “.” instead of a “,” to indicate a decimal point.
The total number of orders in the “OrderList.xlsx” file is:
The median of “Total Sale $” is larger than the mean. By how much? Round to 2 decimal places.
What is the standard deviation of Total Sale? Round to 2 decimal places.
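For readers who want to cross-check their spreadsheet results, the count, mean, median, and standard deviation can be sketched in Python with the standard `statistics` module. The `total_sales` list below is made-up illustration data, not values from “OrderList.xlsx”; in practice it would hold the “Total Sale $” column. The question does not say whether the sample or the population standard deviation is wanted, so the sketch computes both.

```python
import statistics

# Hypothetical order totals for illustration only; the real values
# come from the "Total Sale $" column of the workbook.
total_sales = [120.50, 89.99, 240.00, 310.25, 199.90, 305.10]

order_count = len(total_sales)
mean_sale = statistics.mean(total_sales)
median_sale = statistics.median(total_sales)

# Sample standard deviation (n - 1 denominator), like Excel's STDEV.S.
sample_sd = statistics.stdev(total_sales)
# Population standard deviation (n denominator), like Excel's STDEV.P.
population_sd = statistics.pstdev(total_sales)

print("orders:", order_count)
print("median minus mean:", round(median_sale - mean_sale, 2))
print("sample sd:", round(sample_sd, 2))
print("population sd:", round(population_sd, 2))
```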
What percentage of orders fell within the interquartile range of Total Sale?
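By definition, the interquartile range spans the first to the third quartile, so it covers roughly the middle half of the data. A minimal sketch of the check follows; `percent_within_iqr` is a hypothetical helper name and the sample list is invented. The `method="inclusive"` setting is an assumption chosen because it should correspond to Excel's inclusive quartile functions; other quartile conventions can shift the result slightly on small data sets.

```python
import statistics

def percent_within_iqr(values):
    """Return the percentage of values lying between Q1 and Q3 (inclusive)."""
    # quantiles(..., n=4) returns the three cut points Q1, Q2 (median), Q3.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    inside = [v for v in values if q1 <= v <= q3]
    return 100 * len(inside) / len(values)

# Invented sample data, not taken from the workbook.
sample = [10, 20, 30, 40, 50, 60, 70, 80, 90]
print(round(percent_within_iqr(sample), 2))
```

On a small list the share can differ noticeably from 50% because the quartile values themselves are counted as inside the range.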
What is the approximate shape of the distribution of total sales? (Hint: Create a histogram, or apply the rule of thumb that links the mean/median relationship to the direction of skew.)
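The mean/median rule of thumb can be written down as a short sketch: a mean above the median hints at a right (positive) skew, a mean below the median hints at a left (negative) skew. `skew_hint` and its `tolerance` parameter are hypothetical names introduced here, and the rule is only a heuristic; a histogram remains the more reliable check.

```python
import statistics

def skew_hint(values, tolerance=0.0):
    """Guess the direction of skew from the mean/median relationship."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    if mean - median > tolerance:
        return "right-skewed"   # a long upper tail pulls the mean up
    if median - mean > tolerance:
        return "left-skewed"    # a long lower tail pulls the mean down
    return "roughly symmetric"

# Invented examples to show each outcome.
print(skew_hint([1, 2, 3, 4, 100]))
print(skew_hint([1, 97, 98, 99, 100]))
print(skew_hint([1, 2, 3, 4, 5]))
```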
Given the limited information you have, your boss wants you to group customers in a meaningful way. You decide to look at how the order region affects total sales. Calculate the average total sales for the North region only. What is the difference between the North region average total sales and the average total sales across all regions (including the North)? Round to 2 decimal places.
[Note: To calculate the average total sales for the North region only, you can either “sort” the data by region and average the North rows, or “filter” the data, copy and paste the filtered rows as values, and then calculate the average. Refer to Course 1 of this specialization for details on sorting and filtering data.]
What is the absolute value of the difference between the North region median total sales and the median total sales of all orders (across all regions, including the North)?
Round your answer to two decimal places.
[Note: To calculate the median total sales for the North region only, you can either “sort” the data by region and take the median of the North rows, or “filter” the data, copy and paste the filtered rows as values, and then calculate the median. Refer to Course 1 of this specialization for details on sorting and filtering data.]
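Both region comparisons follow the same pattern: filter the rows to one region, compute a statistic, and subtract the same statistic over all rows. A minimal sketch, assuming each order is a dictionary with hypothetical keys `region` and `total_sale`; the rows below are invented and `region_vs_overall` is an illustrative helper, not part of the course material.

```python
import statistics

# Invented rows for illustration; real rows come from the workbook.
orders = [
    {"region": "North", "total_sale": 100.0},
    {"region": "South", "total_sale": 300.0},
    {"region": "North", "total_sale": 200.0},
    {"region": "East", "total_sale": 400.0},
]

def region_vs_overall(rows, region, stat=statistics.mean):
    """Return stat(region rows) minus stat(all rows) for total_sale."""
    all_values = [r["total_sale"] for r in rows]
    region_values = [r["total_sale"] for r in rows if r["region"] == region]
    return stat(region_values) - stat(all_values)

mean_diff = region_vs_overall(orders, "North")
median_diff = abs(region_vs_overall(orders, "North", stat=statistics.median))
print(round(mean_diff, 2), round(median_diff, 2))
```

Passing the statistic as a parameter keeps the filtering logic in one place, which mirrors filtering once in the spreadsheet and then applying either AVERAGE or MEDIAN.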
Next, take a look at customer age. Create 3 age groups: 21-30, 31-40, 41-50. What is the average total sales for the age group with the highest average? Round to 2 decimal places.
What is the median total sales of the age group with the highest average Total Sales? Round to 2 decimal places.
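The age-group questions can be sketched the same way: assign each order to a bin, average each bin, pick the bin with the highest average, and then take that bin's median. The `(age, total_sale)` pairs below are invented, and `group_sales_by_age` is a hypothetical helper; the bin edges are the ones named in the question and are treated as inclusive on both ends.

```python
import statistics

# Invented (age, total_sale) pairs for illustration only.
orders = [
    (22, 100.0), (29, 140.0),
    (35, 300.0), (38, 200.0), (40, 400.0),
    (45, 150.0), (50, 250.0),
]

AGE_GROUPS = [(21, 30), (31, 40), (41, 50)]

def group_sales_by_age(rows, groups=AGE_GROUPS):
    """Map each (low, high) age group to the list of its order totals."""
    grouped = {g: [] for g in groups}
    for age, sale in rows:
        for low, high in groups:
            if low <= age <= high:
                grouped[(low, high)].append(sale)
                break  # each age belongs to at most one group
    return grouped

grouped = group_sales_by_age(orders)
# Skip empty groups so mean() never receives an empty list.
means = {g: statistics.mean(v) for g, v in grouped.items() if v}
top_group = max(means, key=means.get)
top_mean = means[top_group]
top_median = statistics.median(grouped[top_group])
print(top_group, round(top_mean, 2), round(top_median, 2))
```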
Given the mean and median of the group with the highest average sales, what can you say about the distribution of total sales within that group?
Based on this data, what would you recommend to your boss?