Chapter 1
As you may have seen in the textbook, your client, Ms. Burke, has provided the dataset for you as an Excel file. To prepare for this task, you have decided to review each worksheet and determine whether the data were gathered from internal sources, external sources, or generated from special studies. You also need to know whether the measures are categorical, ordinal, interval, or ratio. Prepare a report summarizing the characteristics of the metrics used in each worksheet.
Part 1
The dataset "Dealer Satisfaction" is taken from an external source and has the following metrics and classifications:
Metric in the datasheet | Classification of Data
Region | Categorical
Years | Interval
Survey Scale | Ordinal
Dealer Satisfaction Results: Number of Dealers | Ratio
Sample Size | Ratio
Part 2
The dataset "End-User Satisfaction" is taken from an external source and has the following metrics and classifications:
Metric in the datasheet | Classification of Data
Region | Categorical
Years | Interval
Survey Scale | Ordinal
End-User Satisfaction Results: Number of End-Users | Ratio
Sample Size | Ratio
Part 3
The dataset "Customer Survey 2014" is taken from an external source and has the following metrics and classifications:
Metric in the datasheet | Classification of Data
Region | Categorical
Quality | Ordinal
Ease of Use | Ordinal
Price | Ordinal
Service | Ordinal
Part 4
The dataset "Complaints" is taken from an external source and has the following metrics and classifications:
Metric in the datasheet | Classification of Data
Month | Interval
World | Ratio
NA | Ratio
SA | Ratio
Eur | Ratio
Pac | Ratio
China | Ratio
Part 5
The dataset "Mower Unit Sales" is taken from an internal source and has the following metrics and classifications:
Metric in the datasheet | Classification of Data
Month | Interval
NA | Ratio
SA | Ratio
Europe | Ratio
Pacific | Ratio
China | Ratio
World | Ratio
Part 6
The dataset "Tractor Unit Sales" is taken from an internal source and has the following metrics and classifications:
Metric in the datasheet | Classification of Data
Month | Interval
NA | Ratio
SA | Ratio
Eur | Ratio
Pac | Ratio
China | Ratio
World | Ratio
Part 7
The dataset "Industry Mower Total Sales" is taken from an external source and has the following metrics and classifications:
Metric in the datasheet | Classification of Data
Month | Interval
NA | Ratio
SA | Ratio
Eur | Ratio
Pac | Ratio
World | Ratio
Part 8
The dataset "Industry Tractor Total Sales" is taken from an external source and has the following metrics and classifications:
Metric in the datasheet | Classification of Data
Month | Interval
NA | Ratio
SA | Ratio
Eur | Ratio
Pac | Ratio
China | Ratio
World | Ratio
Part 9
The dataset "Unit Production Costs" is taken from an internal source and has the following metrics and classifications:
Metric in the datasheet | Classification of Data
Month | Interval
Tractor | Ratio
Mower | Ratio
Part 10
The dataset "Operating and Interest Expenses" is taken from an internal source and has the following metrics and classifications:
Metric in the datasheet | Classification of Data
Month | Interval
Administrative | Ratio
Depreciation | Ratio
Interest | Ratio
Part 11
The dataset "On-Time Delivery" is taken from an internal source and has the following metrics and classifications:
Metric in the datasheet | Classification of Data
Month | Interval
Number of Deliveries | Ratio
Number On Time | Ratio
Percent | Ratio
Part 12
The dataset "Defects after Delivery" is taken from an internal source and has the following metrics and classifications:
Metric in the datasheet | Classification of Data
Month | Interval
2008 | Ratio
2009 | Ratio
2010 | Ratio
2011 | Ratio
2012 | Ratio
Part 13
The dataset "Time to Pay Suppliers" is taken from an internal source and has the following metrics and classifications:
Metric in the datasheet | Classification of Data
Month | Interval
Working Days | Ratio
Part 14
The dataset "Response Time" comes from a special study and has the following metrics and classifications:
Metric in the datasheet | Classification of Data
Q1 2011 | Ratio
Q2 2011 | Ratio
Q3 2011 | Ratio
Q4 2011 | Ratio
Q1 2012 | Ratio
Q2 2012 | Ratio
Q3 2012 | Ratio
Q4 2012 | Ratio
Part 15
The dataset "Employee Satisfaction" is taken from an internal source and has the following metrics and classifications:
Metric in the datasheet | Classification of Data
Quarter | Interval
Design & Production | Interval
Sample size | Ratio
Manager | Interval
Sample size | Ratio
Sales & Administration | Interval
Sample size | Ratio
Total | Interval
Sample size | Ratio
Part 16
The dataset "Engines" comes from a special study and has the following metric and classification:
Metric in the datasheet | Classification of Data
Production Time (min) | Ratio
Part 17
The dataset "Transmission Costs" comes from a special study and has the following metrics and classifications:
Metric in the datasheet | Classification of Data
Current | Ratio
Process A | Ratio
Process B | Ratio
Part 18
The dataset "Blade Weight" comes from a special study and has the following metric and classification:
Metric in the datasheet | Classification of Data
Weight | Ratio
Part 19
The dataset "Mower Test" comes from a special study and has the following metric and classification:
Metric in the datasheet | Classification of Data
Observations | Categorical
Part 20
The dataset "Employee Retention" comes from a special study and has the following metrics and classifications:
Metric in the datasheet | Classification of Data
YearsPLE | Ratio
YrsEducation | Ratio
College GPA | Interval
Age | Interval
Gender | Categorical
College Grad | Categorical
Local | Categorical
Part 21
The dataset "Shipping Cost Existing" comes from a special study and has the following metrics and classifications:
Metric in the datasheet | Classification of Data
Existing Plant | Categorical
Customer | Categorical
Mowers | Ratio
Tractors | Ratio
Part 22
The dataset "Shipping Cost Proposed" comes from a special study and has the following metrics and classifications:
Metric in the datasheet | Classification of Data
Proposed Plant | Categorical
Customer | Categorical
Mowers | Ratio
Tractors | Ratio
Part 23
The dataset "Fixed Cost" comes from a special study and has the following metrics and classifications:
Metric in the datasheet | Classification of Data
Current Plants | Categorical
Additional Capacity | Ratio
Cost | Ratio
Proposed Locations | Categorical
Maximum Capacity | Ratio
Cost | Ratio
Part 24
The dataset "Purchasing Survey" comes from a special study and has the following metrics and classifications:
Metric in the datasheet | Classification of Data
Delivery Speed | Interval
Price Level | Interval
Price Flexibility | Interval
Manufacturing Image | Interval
Overall Service | Interval
Salesforce Image | Interval
Product Quality | Interval
Usage Level | Ratio
Satisfaction Level | Interval
Size of Firm | Categorical
Purchasing Structure | Categorical
Industry | Categorical
Buying Type | Categorical
Part 25
The dataset "Prices" comes from a special study and has the following metrics and classifications:
Metric in the datasheet | Classification of Data
Year | Ratio
Tractor Price | Interval
Mower Price | Interval
Chapter 3
You have been tasked with putting together an overview of PLE's business performance and market position. You have specifically been asked to construct appropriate charts and summarize your conclusions for each part of the case.
Step 1
To begin this we'll again need to put together subsets of the data, e.g. Dealer Satisfaction for North America, and so on. Once we are through subsetting the data we can create the plots that describe what is going on performance-wise. But first, let's look at the data:
> str(DealerSatisfaction)
'data.frame': 23 obs. of 9 variables:
 $ Region: Factor w/ 4 levels "CH","EU","PA",..: NA NA NA NA NA 4 4 4 4 4 ...
 $ Year  : int 2010 2011 2012 2013 2014 2010 2011 2012 2013 2014 ...
 $ L0    : int 1 0 1 1 2 0 0 0 0 1 ...
 $ L1    : int 0 0 1 2 3 0 0 0 1 1 ...
 $ L2    : int 2 2 1 6 5 0 0 1 1 2 ...
 $ L3    : int 14 14 8 12 15 2 2 4 3 4 ...
 $ L4    : int 22 20 34 34 44 6 6 11 12 22 ...
 $ L5    : int 11 14 15 45 56 2 2 14 33 60 ...
 $ Count : int 50 50 60 100 125 10 10 30 50 90 ...
> str(EndUserSatisfaction)
'data.frame': 23 obs. of 9 variables:
 $ Region: Factor w/ 4 levels "CH","EU","PA",..: NA NA NA NA NA 4 4 4 4 4 ...
 $ Year  : int 2010 2011 2012 2013 2014 2010 2011 2012 2013 2014 ...
 $ L0    : int 1 1 1 0 0 1 1 0 0 0 ...
 $ L1    : int 3 2 2 2 2 2 3 2 2 2 ...
 $ L2    : int 6 4 5 4 3 5 6 6 5 5 ...
 $ L3    : int 15 18 17 15 15 18 17 19 20 19 ...
 $ L4    : int 37 35 34 33 31 36 36 37 37 37 ...
 $ L5    : int 38 40 41 46 49 38 37 36 36 37 ...
 $ Count : int 100 100 100 100 100 100 100 100 100 100 ...
There are 23 observations of 9 variables in each data file. Six levels of satisfaction have been recorded, which is a bit odd; usually an odd number of levels is used for a Likert scale. We could also compute some descriptive statistics using the summary() function, but since the years and regions would be aggregated I'm not sure that would reveal much.
First, let's subset the data by region to create data tables. For example, we can start with the first region, North America or NA, and print out the result:
> dealerSat_NA <- DealerSatisfaction[1:5, ]
> dealerSat_NA
Region Year L0 L1 L2 L3 L4 L5 Count
In case you had been wondering, "NA" is a standard expression in R and most programming environments or languages; NA typically denotes a missing value. So, R has automatically treated the NA region codes in our data files as missing values (displayed as <NA>), even though NA for us means North America. You can set up objects this same way for South America (SA), Europe (EU), Pacific Rim (PA), and China (CH), as sketched below. Now, for the plotting.
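A minimal sketch of one way to create the remaining region subsets, using the Region factor levels rather than row positions (the North America rows have to be selected by position, as above, because their label was read in as missing):
> dealerSat_SA <- subset(DealerSatisfaction, Region == "SA")
> dealerSat_EU <- subset(DealerSatisfaction, Region == "EU")
> dealerSat_PA <- subset(DealerSatisfaction, Region == "PA")
> dealerSat_CH <- subset(DealerSatisfaction, Region == "CH")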
Again, there will be many ways to create the required plots, e.g. the lattice package or ggplot2. We'll use ggplot2 for this example. You'll need to install the ggplot2 and labeling packages and attach them using the library(ggplot2) and library(labeling) commands. If you have any trouble installing packages or attaching them with the library() function, get in touch with me as soon as possible; this should not prevent you from completing your assignments.
There is plenty of information online about the ggplot2 package and the ggplot() function; in fact, there is too much to go into any real detail in this document. The entire series of commands, with explanations, is:
First, take the transpose of the desired columns of the original data table to get the data in the proper sequence for the melt command. The melt command is required to create the plot correctly.
> tdealerSat_NA <- t(dealerSat_NA[,3:8])
> tdealerSat_NA
1 2 3 4 5
L0 1 0 1 1 2
L1 0 0 1 2 3
L2 2 2 1 6 5
L3 14 14 8 12 15
L4 22 20 34 34 44
L5 11 14 15 45 56
Next, add the years as column names to get these in the proper sequence in the melded data.
> colnames(tdealerSat_NA) <- c("2010", "2011", "2012", "2013", "2014")
> tdealerSat_NA
2010 2011 2012 2013 2014
L0 1 0 1 1 2
L1 0 0 1 2 3
L2 2 2 1 6 5
L3 14 14 8 12 15
L4 22 20 34 34 44
L5 11 14 15 45 56
Use the melt() function to “melt” the data, i.e. put it in the proper sequence for plotting using ggplot(). The melt() function is part of the reshape2 package which you may need to install. Notice that all the levels and counts are sorted by each year. You can check this against the original data to make sure you’ve got the proper sequencing.
> library(reshape2)
> data.m2 <- melt(tdealerSat_NA)
> data.m2
Var1 Var2 value
Add column names to the melted data in order to get the proper axis and legend labels in the plot.
> colnames(data.m2) <- c("Level", "Year", "Counts")
> data.m2
Level Year Counts
Create the plot with the ggplot() function. You can look this up in help() in RStudio or online.
> ggplot(data.m2, aes(x=Year, y=Counts)) + geom_bar(aes(fill=Level), position="dodge", stat="identity")
This produces a clustered (side-by-side) bar chart in which the number of counts for each level of satisfaction is shown for each year.
Once you have your data in the proper sequence for plotting everything else is easy. To create the stacked bar charts just use:
> ggplot(data.m2, aes(x=Year, y=Counts, fill=Level)) + geom_bar(stat="identity")
The resulting plot is a stacked bar chart of the same counts.
You can follow this same procedure for each Region for Dealer Satisfaction and End-User Satisfaction.
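If you prefer, the subset/transpose/melt/plot steps can be wrapped in a small helper so you don't have to repeat them by hand. This is only a sketch, assuming ggplot2 and reshape2 are attached and that each region occupies a contiguous block of rows, as North America does; the function name plot_satisfaction is mine, not part of any package.
plot_satisfaction <- function(df, rows, region) {
  t_block <- t(df[rows, 3:8])                 # levels L0-L5 become rows
  colnames(t_block) <- df$Year[rows]          # years become column names
  m <- melt(t_block)
  colnames(m) <- c("Level", "Year", "Counts")
  ggplot(m, aes(x = factor(Year), y = Counts, fill = Level)) +
    geom_bar(stat = "identity") +
    ggtitle(paste("Satisfaction -", region))
}
plot_satisfaction(DealerSatisfaction, 1:5, "North America")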
Step 2
There are also many ways in R to do the line plots, such as for Complaints. I’ll show you an example of a “brute force” method using the simple plot() function as follows:
> plot(Complaints$World, ylim=range(c(0,400)), type="l", xlab="Month", ylab="Number of Complaints")
> par(new=TRUE)
> plot(Complaints$NA., ylim=range(c(0,400)), type="l", col="red", axes=FALSE, xlab="", ylab="")
> par(new=TRUE)
> plot(Complaints$SA, ylim=range(c(0,400)), type="l", col="green", axes=FALSE, xlab="", ylab="")
> par(new=TRUE)
> plot(Complaints$Eur, ylim=range(c(0,400)), type="l", col="blue", axes=FALSE, xlab="", ylab="")
> par(new=TRUE)
> plot(Complaints$Pac, ylim=range(c(0,400)), type="l", col="magenta", axes=FALSE, xlab="", ylab="")
> par(new=TRUE)
> plot(Complaints$China, ylim=range(c(0,400)), type="l", col="deeppink4", axes=FALSE, xlab="", ylab="")
In this case the axis labels are entered in the first command. The command par(new=TRUE) follows each line added to the plot so that you can continue to add lines to the same plot. After the first line, the parameters axes=FALSE, xlab="", ylab="" are added so that the axis labels are not continuously overwritten. You can use this or any other plotting method, e.g. ggplot(), to create the line plots and finish Part 1 of Chapter 3.
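As an alternative to the repeated plot()/par(new=TRUE) calls, base R's matplot() draws one line per column in a single command. This is only a sketch, assuming the regional complaint counts sit in columns 2 through 7 of the Complaints data frame:
> matplot(Complaints[, 2:7], type = "l", lty = 1,
+         col = c("black", "red", "green", "blue", "magenta", "deeppink4"),
+         xlab = "Month", ylab = "Number of Complaints")
> legend("topright", legend = colnames(Complaints)[2:7], lty = 1,
+        col = c("black", "red", "green", "blue", "magenta", "deeppink4"))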
Part 2
For this part of the exercise you are tasked with comparing the costs of shipping between existing locations and proposed locations using quartiles. If you have questions about quartiles, the textbook can help, or there is a lot of information online.
Step 1
We can lump all the costs for existing plants into one group and all the costs for proposed plants into a second group. Then compute the quartiles for shipping costs based on those groups.
This is actually quite simple in R. Just use the summary() function that we’ve been using to look at our data as follows:
> summary(ShippingCost_Existing$Mowers)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  1.000   1.312   1.480   1.420   1.528   1.720
> summary(ShippingCost_Existing$Tractors)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  1.260   1.768   1.840   1.879   2.105   2.340
> summary(ShippingCost_Proposed$Mowers)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  0.910   1.400   1.520   1.514   1.660   1.980
> summary(ShippingCost_Proposed$Tractors)
Min. 1st Qu. Median Mean 3rd Qu. Max.
1.170 1.775 2.010 1.958 2.170 2.680
Part 3
In the third part of this exercise you've been tasked with developing a summary of customer attributes. This summary is to be built on the average responses from customers in the 2014 Customer Survey. It should be done by region and include frequency distributions, histograms, and quartiles as appropriate. The attributes in the survey are Quality, Ease of Use, Price, and Service.
Step 1
The only new function we’ll use to complete this part of the exercise is the hist() function to create the required histograms. But first we’ll need to subset the data in order to get it in the proper sequence to calculate the averages and frequency distributions.
First, look at the data as usual:
> str(CustomerSurvey2014)
'data.frame': 200 obs. of 5 variables:
 $ Region     : Factor w/ 4 levels "China","Eur",..: NA NA NA NA NA NA NA NA NA NA ...
 $ Quality    : int 4 4 4 5 5 5 5 5 4 4 ...
 $ Ease.of.Use: int 1 4 5 4 4 5 4 5 4 5 ...
 $ Price      : int 3 4 4 4 5 3 4 4 4 4 ...
 $ Service    : int 4 5 3 4 4 5 2 5 5 5 ...
> summary(CustomerSurvey2014)
Region Quality Ease.of.Use
China: 10 Min. :1.000 Min. :1.000
Eur : 30 1st Qu.:4.000 1st Qu.:4.000
Pac : 10 Median :5.000 Median :4.000
SA : 50 Mean :4.395 Mean :4.165
NA’s :100 3rd Qu.:5.000 3rd Qu.:5.000
Max. :5.000 Max. :5.000
Price Service
Min. :1.00 Min. :1.00
1st Qu.:3.00 1st Qu.:4.00
Median :4.00 Median :4.00
Mean :3.67 Mean :4.14
3rd Qu.:4.00 3rd Qu.:5.00
Max. :5.00 Max. :5.00
This has already given us the frequency distribution by region: 100 responses, or 50%, come from North America, and so on. But we have a problem, because the data file codes "North America" as "NA", which is a standard phrase in R for a missing value. This makes the straightforward application of functions a mess. There is a compounding problem in that the variable concerned, Region, is a factor variable. So:
If you have already imported the data file CustomerSurvey2014.csv into RStudio you will want to remove it using:
> rm(CustomerSurvey2014)
In order to get the "NA" in this file to remain plain text you need to set the parameter stringsAsFactors to FALSE. If you are using the pull-down menu to import data, uncheck the box "Strings as Factors" before you import the data. If you are using the command line, use:
> CustomerSurvey2014 <- read.csv("~/MyRWork/data/Evans/CustomerSurvey2014.csv", stringsAsFactors=FALSE)
Unfortunately, this is not all you need to do. R/RStudio will still recognize the text NA as representing missing values, but now the Region variable is a character string rather than a factor, so we can simply replace the missing values with a usable label:
> CustomerSurvey2014[is.na(CustomerSurvey2014)] <- "NorthA"
> str(CustomerSurvey2014)
'data.frame': 200 obs. of 5 variables:
 $ Region     : chr "NorthA" "NorthA" "NorthA" "NorthA" ...
 $ Quality    : int 4 4 4 5 5 5 5 5 4 4 ...
 $ Ease.of.Use: int 1 4 5 4 4 5 4 5 4 5 ...
 $ Price      : int 3 4 4 4 5 3 4 4 4 4 ...
 $ Service    : int 4 5 3 4 4 5 2 5 5 5 ...
Now we can easily create the histogram, or rather, because the variable is categorical, a bar chart, as follows:
> barplot(table(CustomerSurvey2014$Region))
Because the remaining variables in the CustomerSurvey2014 data file are numeric variables we can use the hist() function as follows:
> hist(CustomerSurvey2014$Quality, main="Quality - Number of Responses", xlab="Level of Quality")
There are many additional things you can do to make plots in R/RStudio look very professional. I encourage you to explore all the options using the R/RStudio documentation and available information online.
Step 2
The remaining portion of this part of the exercise asks you to compute the Quartiles. You’ve done that before so it isn’t necessary to repeat that here.
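If you want the quartiles broken out by region in one shot, here is a quick sketch using tapply() on the cleaned CustomerSurvey2014 data frame; the same call with Ease.of.Use, Price, or Service gives the quartiles for the other attributes.
> tapply(CustomerSurvey2014$Quality, CustomerSurvey2014$Region, quantile)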
Part 4
You are tasked with proposing a dashboard of the most important business information needed on a routine basis. You are free to complete this part of the exercise as you think best.
Chapter 4
Part 1
For the Performance Lawn Equipment case study at the end of Chapter 4 you are tasked with developing the analyses described in the steps below.
Step 1 (Part 1)
In order to create the clustered and stacked bar charts in Chapter 3 we created a data object “tdealerSat_NA”. This was the transpose of the Dealer Satisfaction data for the North America region. We can use that data object to compute the mean satisfaction ratings and standard deviations per year. First, let’s recall what the tdealerSat_NA data object returns, with column headings:
> tdealerSat_NA
2010 2011 2012 2013 2014
L0 1 0 1 1 2
L1 0 0 1 2 3
L2 2 2 1 6 5
L3 14 14 8 12 15
L4 22 20 34 34 44
L5 11 14 15 45 56
These are frequencies at the different levels. Here is a brute force approach to computing the mean for
North America for the year 2010:
> m_NA2010 <- ((tdealerSat_NA[2,1]*1) + (tdealerSat_NA[3,1]*2) + (tdealerSat_NA[4,1]*3) +
+     (tdealerSat_NA[5,1]*4) + (tdealerSat_NA[6,1]*5))/sum(tdealerSat_NA[,1])
> m_NA2010
[1] 3.78
Because we were only given frequencies for the various levels, we really can't use other, somewhat more sophisticated approaches. If you want to try other approaches you will have the same problem as before with the designation of North America as NA. So, go through a similar process for the data files Dealer Satisfaction and End-User Satisfaction to get the designation for North America set to NorthA. The difference is that this time the variable Region is a character variable. So, after we read in the data file with stringsAsFactors set to FALSE, we use:
> DealerSatisfaction[is.na(DealerSatisfaction)] <- "NorthA"
To use the brute force approach more easily than the up/down arrows and changing the column number you could write a short R script to loop over the data.
m <- 0
for (j in 1:25){
  # weighted mean satisfaction for row j (one region-year combination)
  m[j] <- ((DealerSatisfaction[j,4]*1) + (DealerSatisfaction[j,5]*2) + (DealerSatisfaction[j,6]*3) +
           (DealerSatisfaction[j,7]*4) + (DealerSatisfaction[j,8]*5))/sum(DealerSatisfaction[j,3:8])
}
print(m)
which gives you the mean values for all the regions for all the years in a vector m. To get this in a matrix use:
> n <- matrix(m, 5, byrow=FALSE)
> n
         [,1]     [,2]     [,3]     [,4]     [,5]
[1,] 3.780000 4.000000 3.933333 3.200000 3.000000
[2,] 3.920000 4.000000 4.000000 3.400000 3.142857
[3,] 3.966667 4.266667 4.120000 3.666667 3.687500
[4,] 4.110000 4.500000 4.066667 4.100000       NA
[5,] 4.112000 4.500000 4.066667 3.833333       NA
where the first column is North America, the second is South America and so on, the first row is 2010, the second 2011, and so on. Note that an artifact of looping from 1 to 25 over only 23 rows of data is that the last column (China) is incomplete: China has only three years of data, so its three means fill the first three rows of that column and the last two entries are NA rather than lining up with the years of the other columns.
We can follow the same process to get the standard deviations and in fact combine the two processes. We’ll use the calculation for standard deviation using the mean. If you do not know what the calculation for this is you should look it up and understand how this works. The R script is:
m <- 0
n <- 0
for (j in 1:25){
  m[j] <- ((DealerSatisfaction[j,4]*1) + (DealerSatisfaction[j,5]*2) + (DealerSatisfaction[j,6]*3) +
           (DealerSatisfaction[j,7]*4) + (DealerSatisfaction[j,8]*5))/sum(DealerSatisfaction[j,3:8])
  n[j] <- sqrt(((DealerSatisfaction[j,3] * (0 - m[j])^2) + (DealerSatisfaction[j,4] * (1 - m[j])^2) +
                (DealerSatisfaction[j,5] * (2 - m[j])^2) + (DealerSatisfaction[j,6] * (3 - m[j])^2) +
                (DealerSatisfaction[j,7] * (4 - m[j])^2) + (DealerSatisfaction[j,8] * (5 - m[j])^2)) /
               (sum(DealerSatisfaction[j,3:8]) - 1))
}
print(m)
print(n)
So, we have the standard deviations in a vector n, which we can convert to a matrix as before (note that since the vector n is already taken, we'll increment our naming and call the matrix p):
> p <- matrix(n, 5, byrow = FALSE)
> p
          [,1]      [,2]      [,3]      [,4]      [,5]
[1,] 0.9749935 0.6666667 0.8837151 0.8366600       NaN
[2,] 0.8533248 0.6666667 0.8451543 0.8944272 0.6900656
[3,] 0.9382036 0.8276820 0.7257180 1.0327956 0.7932003
[4,] 1.0720979 0.8630747 0.6396838 0.7378648        NA
[5,] 1.0940897 0.9149200 0.7396800 0.8348471        NA
Now that you have a process you can calculate the values for End-User Satisfaction and report your findings.
Step 2 (Part 2)
Getting descriptive statistics in R is very easy. There are a number of packages that have been built to handle this. Let's use the psych package; you'll need to install and attach it. Its describe() function will provide the item name, item number, number of cases, mean, standard deviation, median, trimmed mean, median absolute deviation, minimum, maximum, range, skew, kurtosis, and standard error.
For example, the "item name" will be Quality or Ease of Use, etc. The "item number" is really irrelevant right now. The "number of cases" is the number of observations, e.g. for North America it will be 100. You may need to do a few more calculations, e.g. subtracting the minimum from the maximum to get the range. However, it is really easy to get descriptive statistics in R.
The hard part, as usual, is getting the data in the correct format or sequence to use. In this case we’ll need to subset the customer survey data by region in order to complete this part of the exercise. We can use the following to do that:
> custSurveyNA <- CustomerSurvey2014[1:100,-1]
> custSurveySA <- CustomerSurvey2014[102:151, -1]
The mode is not something that is typically a part of an R package but there is a lot of information about finding the mode in R online. I’ll leave that for you to find the way you prefer to do it. We’ve already used different ways of finding the sum and count. I’ll also leave that for you to review and determine which way you prefer.
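If you want one concrete possibility for the mode, here is a minimal helper (a sketch; the name find_mode is mine, and it returns the first value in case of ties):
find_mode <- function(x) {
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]   # most frequent value
}
find_mode(custSurveySA$Quality)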
Once you have the data subset into data objects then just use the describe() function in the psych package as follows:
> describe(custSurveySA)
            vars  n mean   sd median trimmed  mad min max range  skew kurtosis   se
Quality        1 50 4.26 0.78      4    4.38 0.74   1   5     4 -1.49     4.07 0.11
Ease.of.Use    2 50 3.94 0.74      4    4.00 0.00   1   5     4 -1.39     3.92 0.10
Price          3 50 3.54 1.07      4    3.60 1.48   1   5     4 -0.59    -0.49 0.15
Service        4 50 4.20 0.83      4    4.30 1.48   1   5     4 -1.20     2.30 0.12
Another package and function to use are the pastecs package and stat.desc() function. For the same data the stat.desc() function returns:
> stat.desc(custSurveySA)
Quality Ease.of.Use Price Service
So, in these descriptive statistics you get the confidence interval of the mean (CI.mean), the sum, and the variance, as well as the standard deviation, the coefficient of variation, and so on.
Step 3 (Part 3)
The data file Response Time is already set-up by quarters as follows:
> str(ResponseTime)
'data.frame': 50 obs. of 8 variables:
 $ Q1.2013: num 4.36 5.42 5.5 2.79 5.55 3.65 8.02 4 3.34 4.92 ...
 $ Q2.2013: num 4.33 4.73 1.63 4.21 6.89 0.92 5.27 0.9 3.85 5 ...
 $ Q3.2013: num 3.71 2.52 2.69 3.47 5.12 1 3.44 6.04 2.53 2.39 ...
 $ Q4.2013: num 4.44 4.07 5.11 3.49 4.69 6.36 8.26 1.91 8.93 6.85 ...
 $ Q1.2014: num 2.75 3.24 4.35 5.58 2.89 5.09 2.33 1.69 3.88 3.39 ...
 $ Q2.2014: num 3.45 1.95 2.77 1.83 3.72 4.59 1.17 1.46 1.9 2.95 ...
 $ Q3.2014: num 1.67 2.58 3.47 3.12 1 5.4 3.9 4.49 2.06 4.49 ...
 $ Q4.2014: num 2.55 2.3 1.04 1.59 3.11 4.05 3.38 1.26 0.9 2.31 ...
So, to find how the response differs by quarter we can look at our descriptive statistics and create a plot of the mean. To do this we’ll calculate and store the means of the quarters in a data object in one command. I’ll just name this data object “mResponseTime”.
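One way to do this, as a sketch (colMeans() is only one of several options; unname() simply drops the column names so the printed result matches the vector shown below):
> mResponseTime <- unname(colMeans(ResponseTime))
> mResponseTime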
[1] 3.9152 3.7260 3.7472 4.4530 3.0880 3.1136 3.2034 2.5278
The x-axis labels are stored in a character vector called alabels; try to find the definition of alabels yourself. Then, to produce the plot, use:
> plot(mResponseTime, type = "l", lwd=2, col="green", ylab = "Mean", xlab = "Quarter", main="Mean Response Time", ylim = c(0,5), xaxt="n")
> axis(side = 1, at = c(1:8), labels = alabels, pch=0.5)
The resulting line plot shows the mean response time for each of the eight quarters.
Step 4 (Part 4)
This is very similar to Part 3 above. We just need to find the means of the defects after delivery over time and plot them. Looking at the data, we find:
> str(DefectsAfterDelivery)
'data.frame': 12 obs. of 6 variables:
 $ Month: Factor w/ 12 levels "April","August",..: 5 4 8 1 9 7 6 2 12 11 ...
Again, already set-up as needed. I’ll leave it to you to determine how you want to complete this part of
the exercise.
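If it helps, here is a minimal sketch of one possible approach, assuming columns 2 through 6 of DefectsAfterDelivery hold the monthly defect counts for 2008 through 2012: average the defects per month for each year and plot the trend.
mDefects <- colMeans(DefectsAfterDelivery[, 2:6])   # mean defects per month, by year
plot(mDefects, type = "l", xaxt = "n", xlab = "Year", ylab = "Mean Defects per Month")
axis(side = 1, at = 1:5, labels = c("2008", "2009", "2010", "2011", "2012"))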
Step 5 (Part 5)
The last part of the exercise for Chapter 4 is a bit different. Now we’re tasked with determining the correlation between PLE’s sales and overall Industry sales by mowers and tractors. As usual, the hardest part will be getting the data in the proper format/sequence to apply a function for computing the correlation. First, looking at the data we find:
> str(MowerUnitSales)
'data.frame': 60 obs. of 7 variables:
 $ Month  : Factor w/ 60 levels "Apr-10","Apr-11",..: 21 16 36 1 41 31 26 6 56 51 ...
$ NA. : int 6000 7950 8100 9050 9900 10200 8730 8140 6480 5990 …
$ SA : int 200 220 250 280 310 300 280 250 230 220 …
$ Europe : int 720 990 1320 1650 1590 1620 1590 1560 1590 1320 …
$ Pacific: int 100 120 110 120 130 120 140 130 130 120 …
$ China : int 0 0 0 0 0 0 0 0 0 0 …
$ World : int 7020 9280 9780 11100 11930 12240 10740 10080 8430 7650 …
> str(IndustryMowerTotalSales)
'data.frame': 60 obs. of 6 variables:
 $ Month: Factor w/ 60 levels "Apr-10","Apr-11",..: 21 16 36 1 41 31 26 6 56 51 ...
$ NA. : int 60000 77184 77885 86190 96117 97143 84757 79804 64800 59307 …
$ SA : int 571 611 658 778 886 882 848 735 657 595 …
$ Eur : int 13091 17679 22759 27966 27895 30566 29444 28364 28393 24444 …
$ Pac : int 1045 1111 1068 1237 1313 1176 1359 1238 1215 1154 …
$ World: int 74662 96585 102369 116171 126210 129768 116409 110141 95065 85500 …
which is interesting, but involves some work. We really want to combine these data files keeping the month/year variables from the Mower Unit Sales data file. Let’s proceed as follows:
> totalMowerSales <- MowerUnitSales[,]
> str(totalMowerSales)
'data.frame': 60 obs. of 7 variables:
 $ Month  : Factor w/ 60 levels "Apr-10","Apr-11",..: 21 16 36 1 41 31 26 6 56 51 ...
$ NA. : int 6000 7950 8100 9050 9900 10200 8730 8140 6480 5990 …
$ SA : int 200 220 250 280 310 300 280 250 230 220 …
$ Europe : int 720 990 1320 1650 1590 1620 1590 1560 1590 1320 …
$ Pacific: int 100 120 110 120 130 120 140 130 130 120 …
$ China : int 0 0 0 0 0 0 0 0 0 0 …
$ World : int 7020 9280 9780 11100 11930 12240 10740 10080 8430 7650 …
> totalMowerSales[,8:12]<- IndustryMowerTotalSales[,-1]
> str(totalMowerSales)
'data.frame': 60 obs. of 12 variables:
 $ Month  : Factor w/ 60 levels "Apr-10","Apr-11",..: 21 16 36 1 41 31 26 6 56 51 ...
$ NA. : int 6000 7950 8100 9050 9900 10200 8730 8140 6480 5990 …
$ SA : int 200 220 250 280 310 300 280 250 230 220 …
$ Europe : int 720 990 1320 1650 1590 1620 1590 1560 1590 1320 …
$ Pacific: int 100 120 110 120 130 120 140 130 130 120 …
$ China : int 0 0 0 0 0 0 0 0 0 0 …
$ World : int 7020 9280 9780 11100 11930 12240 10740 10080 8430 7650 …
$ NA..1 : int 60000 77184 77885 86190 96117 97143 84757 79804 64800 59307 …
$ SA.1 : int 571 611 658 778 886 882 848 735 657 595 …
$ Eur : int 13091 17679 22759 27966 27895 30566 29444 28364 28393 24444 …
$ Pac : int 1045 1111 1068 1237 1313 1176 1359 1238 1215 1154 …
$ World.1: int 74662 96585 102369 116171 126210 129768 116409 110141 95065 85500 …
We just need to get our variable names or column headings straightened out as follows:
> colnames(totalMowerSales) <- c("Date", "NorthA", "SA", "Eur", "Pac", "China", "World", "IndustryNorthA", "IndustrySA", "IndustryEur", "IndustryPac", "IndustryWorld")
> head(totalMowerSales)
    Date NorthA  SA  Eur Pac China World IndustryNorthA IndustrySA IndustryEur IndustryPac IndustryWorld
1 Jan-10   6000 200  720 100     0  7020          60000        571       13091        1045         74662
2 Feb-10   7950 220  990 120     0  9280          77184        611       17679        1111         96585
3 Mar-10   8100 250 1320 110     0  9780          77885        658       22759        1068        102369
4 Apr-10   9050 280 1650 120     0 11100          86190        778       27966        1237        116171
5 May-10   9900 310 1590 130     0 11930          96117        886       27895        1313        126210
6 Jun-10  10200 300 1620 120     0 12240          97143        882       30566        1176        129768
and we are set. We can get the coefficient of variation for mower sales using the stat.desc() function as before:
> stat.desc(totalMowerSales)
         Date       NorthA           SA          Eur          Pac      China        World IndustryNorthA   IndustrySA  IndustryEur
nbr.val    NA 6.000000e+01 6.000000e+01 6.000000e+01 6.000000e+01  60.000000 6.000000e+01   6.000000e+01 6.000000e+01 6.000000e+01
nbr.null   NA 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00  51.000000 0.000000e+00   0.000000e+00 0.000000e+00 0.000000e+00
nbr.na     NA 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00   0.000000 0.000000e+00   0.000000e+00 0.000000e+00 0.000000e+00
min        NA 4.350000e+03 1.800000e+02 3.000000e+02 1.000000e+02   0.000000 5.350000e+03   4.259600e+04 4.620000e+02 6.977000e+03
max        NA 1.037000e+04 3.900000e+02 1.650000e+03 2.400000e+02  26.000000 1.228000e+04   1.006800e+05 8.860000e+02 3.056600e+04
range      NA 6.020000e+03 2.100000e+02 1.350000e+03 1.400000e+02  26.000000 6.930000e+03   5.808400e+04 4.240000e+02 2.358900e+04
sum        NA 4.525400e+05 1.694000e+04 6.894000e+04 1.035000e+04 113.000000 5.488830e+05   4.354853e+06 4.055200e+04 1.267206e+06
median     NA 7.870000e+03 2.800000e+02 1.260000e+03 1.700000e+02   0.000000 9.390000e+03   7.588300e+04 6.540000e+02 2.383150e+04
mean       NA 7.542333e+03 2.823333e+02 1.149000e+03 1.725000e+02   1.883333 9.148050e+03   7.258088e+04 6.758667e+02 2.112010e+04
SE.mean    NA 2.273237e+02 6.108097e+00 4.870278e+01 4.810681e+00   0.709138 2.672965e+02   2.159852e+03 1.343922e+01 8.605123e+02
CI.mean    NA 4.548737e+02 1.222227e+01 9.745403e+01 9.626151e+00   1.418982 5.348591e+02   4.321854e+03 2.689182e+01 1.721881e+03
var        NA 3.100564e+06 2.238531e+03 1.423176e+05 1.388559e+03  30.172599 4.286845e+06   2.798977e+08 1.083676e+04 4.442888e+07
std.dev    NA 1.760842e+03 4.731312e+01 3.772501e+02 3.726338e+01   5.492959 2.070470e+03   1.673014e+04 1.040998e+02 6.665499e+03
coef.var   NA 2.334612e-01 1.675789e-01 3.283291e-01 2.160196e-01   2.916615 2.263291e-01   2.305034e-01 1.540241e-01 3.155998e-01
          IndustryPac IndustryWorld
nbr.val  6.000000e+01  6.000000e+01
nbr.null 0.000000e+00  0.000000e+00
nbr.na   0.000000e+00  0.000000e+00
min      1.045000e+03  5.398200e+04
max      2.182000e+03  1.297680e+05
range    1.137000e+03  7.578600e+04
sum      9.769200e+04  5.760249e+06
median   1.552500e+03  9.795500e+04
mean     1.628200e+03  9.600415e+04
SE.mean  4.272222e+01  2.816721e+03
CI.mean  8.548696e+01  5.636245e+03
var      1.095113e+05  4.760350e+08
std.dev  3.309249e+02  2.181823e+04
coef.var 2.032458e-01  2.272634e-01
To find the correlation table and simultaneously find the significance of the correlations we'll use the Hmisc package. To install this package you may have to install other dependent packages, e.g. acepack and data.table; if you get error messages, just look for missing packages and install what you need. Once you have Hmisc installed and attached using the library() function, you can use the rcorr() function to get the correlation table and significance as follows (pay particular attention to the correlations between PLE's mower sales and industry mower sales for SA, Eur, and Pac, and the corresponding P values):
> rcorr(as.matrix(totalMowerSales[2:12]))
NorthA SA Eur Pac China World IndustryNorthA IndustrySA IndustryEur IndustryPac IndustryWorld
NorthA 1.00 0.70 0.70 -0.10 0.21 0.99 1.00 0.85 0.67 -0.08 0.97
SA 0.70 1.00 0.44 0.52 0.50 0.71 0.67 0.76 0.49 0.50 0.68
Eur 0.70 0.44 1.00 -0.33 0.05 0.78 0.69 0.76 0.98 -0.33 0.83
Pac -0.10 0.52 -0.33 1.00 0.44 -0.11 -0.12 -0.09 -0.24 0.99 -0.15
China 0.21 0.50 0.05 0.44 1.00 0.21 0.22 0.23 0.21 0.43 0.24
World 0.99 0.71 0.78 -0.11 0.21 1.00 0.99 0.87 0.76 -0.10 0.99
IndustryNorthA 1.00 0.67 0.69 -0.12 0.22 0.99 1.00 0.83 0.67 -0.10 0.97
IndustrySA 0.85 0.76 0.76 -0.09 0.23 0.87 0.83 1.00 0.76 -0.09 0.87
IndustryEur 0.67 0.49 0.98 -0.24 0.21 0.76 0.67 0.76 1.00 -0.25 0.82
IndustryPac -0.08 0.50 -0.33 0.99 0.43 -0.10 -0.10 -0.09 -0.25 1.00 -0.14
IndustryWorld 0.97 0.68 0.83 -0.15 0.24 0.99 0.97 0.87 0.82 -0.14 1.00
n= 60
P
NorthA SA Eur Pac China World IndustryNorthA IndustrySA IndustryEur IndustryPac IndustryWorld
NorthA 0.0000 0.0000 0.4676 0.1161 0.0000 0.0000 0.0000 0.0000 0.5345 0.0000
SA 0.0000 0.0004 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
Eur 0.0000 0.0004 0.0104 0.6856 0.0000 0.0000 0.0000 0.0000 0.0106 0.0000
Pac 0.4676 0.0000 0.0104 0.0005 0.4020 0.3708 0.5106 0.0594 0.0000 0.2517
China 0.1161 0.0000 0.6856 0.0005 0.1144 0.0946 0.0795 0.1069 0.0007 0.0661
World 0.0000 0.0000 0.0000 0.4020 0.1144 0.0000 0.0000 0.0000 0.4527 0.0000
IndustryNorthA 0.0000 0.0000 0.0000 0.3708 0.0946 0.0000 0.0000 0.0000 0.4287 0.0000
IndustrySA 0.0000 0.0000 0.0000 0.5106 0.0795 0.0000 0.0000 0.0000 0.4921 0.0000
IndustryEur 0.0000 0.0000 0.0000 0.0594 0.1069 0.0000 0.0000 0.0000 0.0517 0.0000
IndustryPac 0.5345 0.0000 0.0106 0.0000 0.0007 0.4527 0.4287 0.4921 0.0517 0.2787
IndustryWorld 0.0000 0.0000 0.0000 0.2517 0.0661 0.0000 0.0000 0.0000 0.0000 0.2787
Because most of the P values are very low, essentially 0, those correlations are statistically significant; the exceptions mostly involve the Pacific region and China, where several P values are large. You can follow this same procedure for tractor sales (a sketch follows), write the results up in your laboratory report, and that concludes Chapter 4.
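Here is a hedged sketch of the tractor version, assuming the TractorUnitSales and IndustryTractorTotalSales worksheets are imported like the mower files and have the column layout listed in Chapter 1 (the industry tractor file also includes a China column):
> totalTractorSales <- TractorUnitSales
> totalTractorSales[, 8:13] <- IndustryTractorTotalSales[, -1]
> colnames(totalTractorSales) <- c("Date", "NorthA", "SA", "Eur", "Pac", "China", "World",
+     "IndustryNorthA", "IndustrySA", "IndustryEur", "IndustryPac", "IndustryChina", "IndustryWorld")
> rcorr(as.matrix(totalTractorSales[, 2:13]))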
Chapter 5
Step 1
To determine which distribution is appropriate to model the failure of an individual mower, consider the section on the Bernoulli distribution that starts on page 146. Remember that the Bernoulli distribution has two outcomes: success or failure. So, the answer to Question 1 is the Bernoulli distribution.
Step 2
The mower test data has 100 observations and 30 samples per observation, so you'll need to read in the csv file for MowerTest. 100 times 30 is 3,000. To get the overall failure rate we need to determine how many tests were "Fail". This is easy in R/RStudio; use the length() function as follows:
> countFail <- length(which(MowerTest == "Fail"))
> countFail
[1] 54
So there are 54 “Fail” in the MowerTest data set, or the fraction of “Fail”, i.e. the probability of “Fail” is
54/3000 = 0.018.
Step 3
The next question asks us to find the probability of having from 0 to 20 failures in the next 100 mowers tested. Again, we’ll use the binomial distribution. The R/RStudio function is dbinom(). The entire command is:
> y <- dbinom(0:20, 100, .018)
> y
 [1] 1.626106e-01 2.980642e-01 2.704432e-01 1.619354e-01 7.198046e-02
 [6] 2.533243e-02 7.352080e-03 1.809677e-03 3.856160e-04 7.225391e-05
[11] 1.205213e-05 1.807485e-06 2.457222e-07 3.048911e-08 3.472938e-09
[16] 3.649767e-10 3.554063e-11 3.218967e-12 2.720716e-13 2.152308e-14
[21] 1.597793e-15
> plot(y)
The resulting plot shows how the probability falls off rapidly as the number of failures increases.
Step 4:
Question 4 asks us to find the average blade weight and how much variability there is in blade weights. To answer this question we need the BladeWeight csv file. Read in the data set then use sum() in the command to calculate the average blade weight as follows:
> BladeWeight <- read.csv("~/MyRWork/data/Evans/BladeWeight.csv")
> View(BladeWeight)
> str(BladeWeight)
'data.frame': 350 obs. of 2 variables:
 $ Sample: int 1 2 3 4 5 6 7 8 9 10 ...
 $ Weight: num 4.88 4.92 5.02 4.97 5 4.99 4.86 5.07 5.04 4.87 ...
> (sum(BladeWeight$Weight)/350)
[1] 4.9908
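As an aside, the built-in mean() function gives the same result in a single step:
> mean(BladeWeight$Weight)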
The variation is calculated as the standard deviation. R/RStudio uses the sd() function for the standard
deviation as follows:
> sd(BladeWeight$Weight)
[1] 0.1092876
So, we could expect the blade weight to be 4.99 +/- 2*0.11. Note that we've rounded the standard deviation and assumed a two-sided interval via the empirical rule.
Step 5
Question 5 asks us to determine the probability that the blade weight can exceed 5.20. To do this we use the pnorm() function for the Normal Distribution in R/RStudio. The commands are as follows:
> y = pnorm(5.20, mean=4.99, sd=0.11)
> y
[1] 0.9718748
> 1 – y
[1] 0.02812518
Step 6
Question 6 asks us to determine the probability that the blade weight will be less than 4.80. Again, we’ll
use the pnorm() function. The command is:
> pnorm(4.80, mean=4.99, sd=0.11)
[1] 0.04205935
Step 7
Question 7 asks us to find the number of blades that exceeded 5.20 or were less than 4.80 from the data.
We’ll use the length() function again, as follows:
> countblade <- length(which(BladeWeight$Weight > 5.20))
> countblade
[1] 7
> countblade2 <- length(which(BladeWeight$Weight <= 4.80))
> countblade2
[1] 8
Notice that I could have gotten a different answer if I had set the first test to be greater than or equal to, using the >= logical operator. Likewise, I could have gotten a different answer if I had used < rather than <= as the logical operator in the second computation.
Step 8
Question 8 asks us to examine, over time, the process that makes the blades by considering changes in blade weights over time. We can just plot the blades manufactured to see if there is any variation over time. We can use the plot() function to generate a scatterplot to look at this as follows:
> plot(BladeWeight$Sample, BladeWeight$Weight)
The resulting scatterplot shows blade weight against sample number. From the scatterplot it doesn't look like there is much variation about the average blade weight of 4.99.
Step 9
Looking at the scatter plot, the only trouble appears close to the 200th blade, and it isn't too much trouble to find that this is blade #171.
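One way to locate that observation, as a sketch (this assumes the unusual point is simply the one farthest from the overall mean):
> BladeWeight$Sample[which.max(abs(BladeWeight$Weight - mean(BladeWeight$Weight)))]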
Step 10
Last, we are asked if the normal distribution is a good assumption for the blade weight data. To do this we’ll want to plot a histogram of the data. Histograms are easy to generate in R/RStudio. Just use the hist() function as follows:
> hist(BladeWeight$Weight)
As is usually true, the hist() function has many additional parameters. You might want to try a few, e.g. setting up the bins the way you want them rather than allowing the function to create them automatically. The resulting histogram looks pretty normal. If desired, you can add a line for the probability density function; I'll leave it up to you to look up the details in R/RStudio, but a sketch of one possibility follows.
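For example, one possible sketch rescales the histogram to densities and overlays a fitted normal curve:
> hist(BladeWeight$Weight, freq = FALSE, main = "Blade Weight", xlab = "Weight")
> curve(dnorm(x, mean = mean(BladeWeight$Weight), sd = sd(BladeWeight$Weight)),
+       add = TRUE, lwd = 2)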
Chapter 6
Part 1
What proportion of customers rate the company with "top box" survey responses (defined as scale levels 4 and 5) on quality, ease of use, price, and service in the 2014 Customer Survey worksheet? How do these proportions differ by geographic region?
Step 1
The proportions of customers who rate the company with "top box" survey responses (scale levels 4 and 5) on quality, ease of use, price, and service in the 2014 Customer Survey worksheet are:
Proportion of customers with "top box" ratings:
Region | Quality | Ease of Use | Price | Service
Total | 0.9 | 0.88 | 0.65 | 0.82
The highest proportion of top-box ratings is for quality, at 90% of customers, followed closely by ease of use at 88%.
Proportion of customers in each geographic region with "top box" ratings:
Region | Quality | Ease of Use | Price | Service
China | 0.7 | 0.9 | 0.2 | 0.1
Eur | 0.77 | 0.90 | 0.77 | 0.73
NA | 0.96 | 0.9 | 0.66 | 0.89
Pac | 0.9 | 0.8 | 0.9 | 0.9
SA | 0.9 | 0.84 | 0.6 | 0.86
The China region receives noticeably fewer top-box ratings, particularly on price and service, than Europe, North America, South America, and the Pacific.
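If you want to reproduce these proportions in R, here is a sketch using the cleaned CustomerSurvey2014 data frame from Chapter 3 (Region recoded so that North America appears as "NorthA"); the helper name topbox is mine:
topbox <- function(x) mean(x >= 4)   # proportion of responses at level 4 or 5
aggregate(cbind(Quality, Ease.of.Use, Price, Service) ~ Region,
          data = CustomerSurvey2014, FUN = topbox)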
Part 2
What estimates, with reasonable assurance, can PLE give customers for response times to customer service calls?
Step 1:
The estimates are calculated below:
Statistic | Q1 2013 | Q2 2013 | Q3 2013 | Q4 2013 | Q1 2014 | Q2 2014 | Q3 2014 | Q4 2014
Sample Size | 50 | 50 | 50 | 50 | 50 | 50 | 50 | 50
Sample Mean | 3.92 | 3.73 | 3.75 | 4.45 | 3.09 | 3.11 | 3.20 | 2.53
Sample Standard Deviation | 1.482 | 1.916 | 1.399 | 2.119 | 1.585 | 1.228 | 1.279 | 1.131
Confidence Level | 95% | 95% | 95% | 95% | 95% | 95% | 95% | 95%
Alpha | 0.05 | 0.05 | 0.05 | 0.05 | 0.05 | 0.05 | 0.05 | 0.05
Margin of Error | 0.421178454 | 0.544515602 | 0.397561722 | 0.602084062 | 0.45052 | 0.3490466 | 0.3636004 | 0.3213805
Lower Confidence Limit | 3.49 | 3.18 | 3.35 | 3.85 | 2.64 | 2.76 | 2.84 | 2.21
Upper Confidence Limit | 4.34 | 4.27 | 4.14 | 5.06 | 3.54 | 3.46 | 3.57 | 2.85
PLE can offer the above interval estimates for the response time in each quarter.
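The same interval estimates can be reproduced in R; a sketch using t.test() on the ResponseTime data frame from Chapter 4, where each column of the result holds the lower and upper 95% confidence limits for one quarter:
> sapply(ResponseTime, function(x) t.test(x)$conf.int)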
Part 3
Engineering has collected data on alternative process costs for building transmissions in the worksheet Transmission Costs. Can you determine whether one of the proposed processes is better than the current process?
Step 1
Statistic | Current | Process A | Process B
Sample Size | 30 | 30 | 30
Sample Mean | $289.60 | $285.50 | $298.43
Sample Standard Deviation | $45.40 | $64.94 | $20.87
Confidence Level | 95% | 95% | 95%
Alpha | 0.05 | 0.05 | 0.05
Margin of Error | 16.95257621 | 24.25024319 | 7.79120178
Lower Confidence Limit | $272.65 | $261.25 | $290.64
Upper Confidence Limit | $306.55 | $309.75 | $306.22
The above analysis indicates no significant difference between the current process and the two proposed processes, since the confidence intervals overlap substantially. Thus, we cannot conclude that either proposed process is better than the current process.
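A more direct check is a two-sample t test of each proposed process against the current one; a sketch, assuming the Transmission Costs worksheet was imported as TransmissionCosts with columns Current, Process.A, and Process.B:
> t.test(TransmissionCosts$Process.A, TransmissionCosts$Current)
> t.test(TransmissionCosts$Process.B, TransmissionCosts$Current)
Large p-values from these tests would support the conclusion above.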
Part 4
What would be a confidence interval for an additional sample of mower test performance as in the worksheet Mower Test?
Step 4
The analysis is shown below:
Mean fraction of the failures | 0.018
Standard deviation | 0.0024
Prediction Interval | 95%
Alpha | 0.05
n | 30
t | 2.045229642
Margin of Error | 0.005046539
Lower Confidence Limit | 0.013
Upper Confidence Limit | 0.023
Thus, the prediction interval is (0.013, 0.023).
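A sketch of the arithmetic behind this table (the values p_bar, s, and n are taken from the table; small differences in the margin of error come from rounding the standard deviation):
p_bar <- 0.018
s <- 0.0024
n <- 30
t_crit <- qt(0.975, df = n - 1)               # about 2.0452
c(lower = p_bar - t_crit * s, upper = p_bar + t_crit * s)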
Part 5
For the data in the worksheet Blade Weight, what is the sampling distribution of the mean, the overall mean, and the standard error of the mean? Is a normal distribution an appropriate assumption for the sampling distribution of the mean?
Step 5
The sampling distribution of the mean is centered at the population mean and will be approximately normally distributed. The overall mean is 4.99, and the standard error of the mean is 0.005842. So yes, a normal distribution is an appropriate assumption for the sampling distribution of the mean.
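A sketch of how these numbers come from the Blade Weight data:
> mean(BladeWeight$Weight)                          # overall mean, about 4.99
> sd(BladeWeight$Weight) / sqrt(nrow(BladeWeight))  # standard error of the mean, about 0.0058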
Part 6
How many blade weights must be measured to find a 95% confidence interval for the mean blade weight with a sampling error of at most 0.2? What if the sampling error is specified as 0.1?
Step 6
Two blade weights must be measured to find a 95% confidence interval for the mean blade weight with a sampling error of at most 0.2, and five blade weights must be measured if the sampling error is specified as 0.1.
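A sketch of the sample-size calculation, using z = 1.96 and the sample standard deviation of the blade weights as sigma:
> sigma <- sd(BladeWeight$Weight)
> ceiling((1.96 * sigma / 0.2)^2)   # sampling error of at most 0.2 -> 2
> ceiling((1.96 * sigma / 0.1)^2)   # sampling error of at most 0.1 -> 5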