The aim of the study was to apply the statistical knowledge gained in class to real-life data. The idea was to obtain a dataset and conduct some statistical analysis on it. The dataset used for this study was obtained from the collection of data files provided by Julie Pallant. The data file is named sleep.
This dataset is a real data file condensed from a study that explored the prevalence and impact of sleep problems on various aspects of people’s lives. The subjects of the study were staff from a university in Melbourne, Australia, who were invited to complete a questionnaire containing a set of questions about their sleep behaviour (for instance, the number of hours slept per night). For the purposes of analysis I chose a sample of 35 respondents from the dataset.
The data contains 35 observations with a total of seven variables (three categorical and four numerical/quantitative variables), namely:

| Variable | Type |
| --- | --- |
| Sex of the respondent | Categorical |
| Marital status of the respondent | Categorical |
| Highest education level achieved | Categorical |
| Age of the respondent | Numerical/Quantitative |
| Hours of sleep in weekends | Numerical/Quantitative |
| How many hours sleep needed | Numerical/Quantitative |
| Cigs per day | Numerical/Quantitative |
I began with a simple analysis of one quantitative variable, looking at measures such as the mean, median, range, minimum and maximum, as well as measures of the shape of the distribution (skewness and kurtosis).
Table 1 below gives the summary statistics for the number of hours of sleep at weekends. The average is 8.3857 hours, with a median of 8 hours. The most frequent value (mode) is also 8 hours, and the minimum and maximum are 4 hours and 14 hours respectively. The margin of error for the 95% confidence interval of the mean is 0.6614 hours, that is, the interval is 8.39 ± 0.66 hours.
Table 1: hours sleep/ week ends

| Statistic | Value |
| --- | --- |
| Mean | 8.385714286 |
| Standard Error | 0.325451021 |
| Median | 8 |
| Mode | 8 |
| Standard Deviation | 1.925394208 |
| Sample Variance | 3.707142857 |
| Kurtosis | 2.200243786 |
| Skewness | 0.119166062 |
| Range | 10 |
| Minimum | 4 |
| Maximum | 14 |
| Sum | 293.5 |
| Count | 35 |
| Confidence Level (95.0%) | 0.661396051 |
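As a rough check on Table 1, the standard error and the 95% margin of error can be related to the standard deviation and the sample size. The sketch below uses only the values reported in the table; the variable names are my own, and the critical t value is backed out from the reported margin rather than looked up.

```python
import math

# Values copied from Table 1
n = 35
mean = 8.385714286
sd = 1.925394208
margin_95 = 0.661396051  # "Confidence Level (95.0%)" row

# Standard error of the mean: SD / sqrt(n)
se = sd / math.sqrt(n)

# The margin of error is t_crit * SE, so the implied critical value is:
implied_t = margin_95 / se

# 95% confidence interval for the mean
ci_lower = mean - margin_95
ci_upper = mean + margin_95

print(f"SE = {se:.6f}, implied t = {implied_t:.4f}")
print(f"95% CI = ({ci_lower:.2f}, {ci_upper:.2f})")
```

The implied critical value is close to 2.03, which is what one would expect for a two-sided 95% interval with 34 degrees of freedom.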
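The skewness value is 0.119. Since this value lies between −0.5 and +0.5, the distribution can be described as approximately symmetric.
The kurtosis value is 2.2. Assuming this is excess kurtosis (for which a normal distribution scores 0), a value greater than 0 suggests a leptokurtic distribution, with a sharper peak and heavier tails than a normal distribution, rather than a mesokurtic one.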
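To make the two shape measures concrete, here is a minimal sketch of the bias-corrected sample skewness and excess kurtosis formulas commonly used by spreadsheet descriptive-statistics tools. I am assuming these are the formulas behind Table 1; the raw data are not reproduced here, so the input list is purely hypothetical.

```python
import math

def sample_skewness(xs):
    """Bias-corrected sample skewness (0 for a symmetric sample)."""
    n = len(xs)
    mean = sum(xs) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))
    return n / ((n - 1) * (n - 2)) * sum(((x - mean) / s) ** 3 for x in xs)

def excess_kurtosis(xs):
    """Bias-corrected sample excess kurtosis (about 0 for normal data)."""
    n = len(xs)
    mean = sum(xs) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))
    fourth = sum(((x - mean) / s) ** 4 for x in xs)
    lead = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return lead * fourth - correction

# Hypothetical, evenly spaced sample (not the sleep data)
example = [1, 2, 3, 4, 5]
skew = sample_skewness(example)
kurt = excess_kurtosis(example)
print(skew, kurt)
```

A flat, evenly spaced sample like this one is symmetric and has negative excess kurtosis, the opposite of the positive value reported for the sleep data.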
To test for normality, either the Kolmogorov-Smirnov test or the Shapiro-Wilk test in SPSS can be used to explore whether the data are normally distributed. The results of both tests are presented in Table 2 below.
Table 2: Tests of Normality

| | Kolmogorov-Smirnov (a): Statistic | df | Sig. | Shapiro-Wilk: Statistic | df | Sig. |
| --- | --- | --- | --- | --- | --- | --- |
| hours sleep/ week ends | .221 | 35 | .000 | .908 | 35 | .007 |

a. Lilliefors Significance Correction
Both tests reject the hypothesis of normality at the 5% level (Kolmogorov-Smirnov p < .001; Shapiro-Wilk p = .007), so the data are not normally distributed.
The histogram presented below visualizes the distribution of the data. Visually the distribution looks roughly normal, although the tests lead to a different conclusion.
After analyzing one quantitative variable, I analyzed one categorical variable, the highest level of education achieved. The largest group of respondents (28.6%, n = 10) had secondary education. Undergraduate and postgraduate respondents each made up 25.7% (n = 9) of the sample, and those with trade or post-secondary training made up 20% (n = 7).
Table 3: highest education level achieved

| | Frequency | Percent | Valid Percent | Cumulative Percent |
| --- | --- | --- | --- | --- |
| secondary school | 10 | 28.6 | 28.6 | 28.6 |
| trade training/ post-secondary training | 7 | 20.0 | 20.0 | 48.6 |
| undergraduate degree | 9 | 25.7 | 25.7 | 74.3 |
| postgraduate degree | 9 | 25.7 | 25.7 | 100.0 |
| Total | 35 | 100.0 | 100.0 | |
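The percentage columns of Table 3 follow directly from the frequencies. A minimal sketch, using only the counts in the table (the variable names are my own):

```python
# Frequencies copied from Table 3
counts = {
    "secondary school": 10,
    "trade training/ post-secondary training": 7,
    "undergraduate degree": 9,
    "postgraduate degree": 9,
}

total = sum(counts.values())

rows = []
running = 0
for level, count in counts.items():
    running += count
    percent = round(100 * count / total, 1)
    cumulative = round(100 * running / total, 1)
    rows.append((level, count, percent, cumulative))

for level, count, percent, cumulative in rows:
    print(f"{level:<42} {count:>3} {percent:>6} {cumulative:>6}")
```

Computing the cumulative column from the running count, rather than by adding rounded percentages, avoids accumulating rounding error.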
Next I analyzed the relationship between two categorical variables, gender and marital status. The idea was to test whether the two variables are associated.
To explore the relationship, a chi-square test of independence was conducted with the following hypotheses:
H0: There is no association between gender and marital status.
H1: There is an association between gender and marital status.
This was tested at the 5% level of significance (α = 0.05).
Table 4: marital status * sex Cross tabulation

| Marital status | | Female | Male | Total |
| --- | --- | --- | --- | --- |
| single | Count | 4 | 6 | 10 |
| | Expected Count | 5.7 | 4.3 | 10.0 |
| | % within sex | 20.0% | 40.0% | 28.6% |
| married/defacto | Count | 14 | 7 | 21 |
| | Expected Count | 12.0 | 9.0 | 21.0 |
| | % within sex | 70.0% | 46.7% | 60.0% |
| divorced | Count | 2 | 2 | 4 |
| | Expected Count | 2.3 | 1.7 | 4.0 |
| | % within sex | 10.0% | 13.3% | 11.4% |
| Total | Count | 20 | 15 | 35 |
| | Expected Count | 20.0 | 15.0 | 35.0 |
| | % within sex | 100.0% | 100.0% | 100.0% |
Looking at the table, we can see that 20% (n = 4) of the female respondents were single, compared with 40% (n = 6) of their male counterparts. 70% (n = 14) of the females were married or in a de facto relationship, compared with only 46.7% (n = 7) of the males.
I conducted a chi-square test of association to investigate whether the sex of the respondents is associated with their marital status. There was no significant association between the two variables, χ²(2) = 2.061, p = .357 (> .05). We can conclude that the sex of the respondent is not significantly related to marital status. It should be noted, however, that three cells (50%) have expected counts below 5, so the chi-square approximation should be interpreted with caution.
Table 5: Chi-Square Tests

| | Value | df | Asymp. Sig. (2-sided) |
| --- | --- | --- | --- |
| Pearson Chi-Square | 2.061 (a) | 2 | .357 |
| Likelihood Ratio | 2.065 | 2 | .356 |
| Linear-by-Linear Association | .624 | 1 | .430 |
| N of Valid Cases | 35 | | |

a. 3 cells (50.0%) have expected count less than 5. The minimum expected count is 1.71.
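The Pearson chi-square statistic can be reproduced by hand from the observed counts in the cross tabulation. The sketch below is not the SPSS procedure itself, only the textbook calculation; it relies on the fact that for exactly 2 degrees of freedom the chi-square upper-tail probability is exp(−χ²/2), so that shortcut would not apply to tables of other sizes.

```python
import math

# Observed counts from the cross tabulation: rows are marital status
# (single, married/defacto, divorced), columns are sex (female, male)
observed = [
    [4, 6],
    [14, 7],
    [2, 2],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count under independence: row total * column total / n
expected = [[r * c / n for c in col_totals] for r in row_totals]

chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(observed[0]) - 1)

# Upper-tail p-value; this closed form is valid only when df == 2
p_value = math.exp(-chi2 / 2)

# Number of cells with expected count below 5
small_cells = sum(e < 5 for row in expected for e in row)

print(f"chi2 = {chi2:.3f}, df = {df}, p = {p_value:.3f}, small cells = {small_cells}")
```

The count of small expected cells is included because the chi-square approximation is usually considered unreliable when many expected counts fall below 5.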
Analysis of One Relationship between a Categorical Variable and a Quantitative Variable:
In this section I looked at the relationship between one quantitative variable and one categorical variable, namely hours of sleep/weekend and the sex of the respondent. The hypotheses were set as follows:
H0: The mean hours of sleep/weekend for males and females are the same.
H1: The mean hours of sleep/weekend for males and females are different.
This was tested at the 5% level of significance.
To test the hypothesis, an independent samples t-test was used. This test is commonly used to compare the means of two independent groups, as in this case.
Table 6: Group Statistics

| | sex | N | Mean | Std. Deviation | Std. Error Mean |
| --- | --- | --- | --- | --- | --- |
| hours sleep/ week ends | female | 20 | 8.050 | 1.9595 | .4381 |
| | male | 15 | 8.833 | 1.8484 | .4773 |
Table 6 above gives the group statistics. As can be seen, the average hours of sleep/weekend is 8.833 for the male respondents and 8.05 for the female respondents. On average, male respondents in the sample slept slightly longer than female respondents.
Table 7: Independent Samples Test (hours sleep/ week ends)

| | Levene’s Test: F | Levene’s Test: Sig. | t | df | Sig. (2-tailed) | Mean Difference | Std. Error Difference | 95% CI of the Difference: Lower | Upper |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Equal variances assumed | .081 | .778 | -1.199 | 33 | .239 | -.7833 | .6535 | -2.1128 | .5461 |
| Equal variances not assumed | | | -1.209 | 31.209 | .236 | -.7833 | .6479 | -2.1043 | .5377 |
An independent samples t-test was carried out to establish whether there exists any difference in the average number of hours spent sleeping by male and female respondents. Levene’s test (F = .081, p = .778) gave no evidence of unequal variances, so the equal-variances result is reported. There was no significant difference in hours of sleep/weekend between the male respondents (M = 8.83, SD = 1.85) and the female respondents (M = 8.05, SD = 1.96); t(33) = -1.199, p = .239 (> .05).
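The t statistics can be approximately reproduced from the group statistics alone. The sketch below computes both the pooled (equal variances assumed) and the Welch (equal variances not assumed) versions from the rounded means and standard deviations in Table 6, so small rounding differences from the SPSS output are to be expected. The variable names are my own.

```python
import math

# Group statistics from Table 6 (rounded values)
n_f, mean_f, sd_f = 20, 8.050, 1.9595   # female
n_m, mean_m, sd_m = 15, 8.833, 1.8484   # male

diff = mean_f - mean_m

# Equal variances assumed: pooled variance and its standard error
pooled_var = ((n_f - 1) * sd_f**2 + (n_m - 1) * sd_m**2) / (n_f + n_m - 2)
se_pooled = math.sqrt(pooled_var * (1 / n_f + 1 / n_m))
t_pooled = diff / se_pooled
df_pooled = n_f + n_m - 2

# Equal variances not assumed: Welch statistic and Welch-Satterthwaite df
vf, vm = sd_f**2 / n_f, sd_m**2 / n_m
se_welch = math.sqrt(vf + vm)
t_welch = diff / se_welch
df_welch = (vf + vm) ** 2 / (vf**2 / (n_f - 1) + vm**2 / (n_m - 1))

print(f"pooled: t({df_pooled}) = {t_pooled:.3f}, SE = {se_pooled:.4f}")
print(f"welch:  t({df_welch:.3f}) = {t_welch:.3f}, SE = {se_welch:.4f}")
```

Both versions give essentially the same answer here, which is consistent with the two groups having similar standard deviations.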
Analysis of One Relationship between Two Quantitative Variables:
In this section, I analyzed the relationship between two quantitative variables: the age of the respondent and the number of hours of sleep/weekend. I used the Pearson correlation coefficient to assess the direction and strength of the linear relationship between the two variables.
Table 8: Correlations

| | | age | hours sleep/ week ends |
| --- | --- | --- | --- |
| age | Pearson Correlation | 1 | -.287 |
| | Sig. (2-tailed) | | .118 |
| | N | 31 | 31 |
| hours sleep/ week ends | Pearson Correlation | -.287 | 1 |
| | Sig. (2-tailed) | .118 | |
| | N | 31 | 35 |
The Pearson correlation coefficient is -0.287, based on the 31 respondents for whom age was recorded, and it is not significant at the 5% level (r = -0.287, p = .118). The negative coefficient indicates a weak negative linear relationship between the two variables: in this sample, older respondents tended to report slightly fewer hours of sleep. Because the coefficient is not significant, this pattern could easily be due to chance.
The scatter of the two variables shows the same weak negative linear trend.
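The significance of a Pearson correlation is usually assessed by converting r into a t statistic with n − 2 degrees of freedom. A minimal sketch using the values reported in Table 8 (r rounded to three decimals, n = 31):

```python
import math

# Values from Table 8
r = -0.287
n = 31

# t statistic for testing H0: rho = 0, with n - 2 degrees of freedom
df = n - 2
t_stat = r * math.sqrt(df) / math.sqrt(1 - r**2)

print(f"t({df}) = {t_stat:.3f}")
```

The result, roughly -1.61 on 29 degrees of freedom, is the same t value that appears for the age coefficient in the simple regression, since both tests address the same linear relationship.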
Apart from the Pearson correlation, I fitted a simple linear regression model to examine the relationship between age and hours spent sleeping/weekend. The model predicts the number of sleep hours/weekend from the age of the respondent. Using the coefficients reported in Table 11, the fitted regression equation is:
Predicted hours of sleep/weekend = 10.298 − 0.051 × age
The value of R is 0.287 and the value of R-squared is 0.082; this implies that only 8.2% of the variation in hours of sleep is explained by the age of the respondent.
Table 9: Model Summary

| Model | R | R Square | Adjusted R Square | Std. Error of the Estimate |
| --- | --- | --- | --- | --- |
| 1 | .287 (a) | .082 | .051 | 1.9710 |

a. Predictors: (Constant), age
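The relationship between R, R Square and Adjusted R Square in the model summary can be checked with a few lines of arithmetic. This sketch assumes n = 31 complete cases (the N reported for the correlation) and one predictor.

```python
# Values from the model summary and the correlation table
r = 0.287        # multiple R (absolute value of the Pearson r)
n = 31           # cases with both age and sleep hours recorded
k = 1            # number of predictors (age)

# R Square is simply R squared
r_squared = r**2

# Adjusted R Square penalises for the number of predictors
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

print(f"R^2 = {r_squared:.3f}, adjusted R^2 = {adj_r_squared:.3f}")
```

In other words, age accounts for roughly 8% of the variation in weekend sleep hours, not 28.7%; the figure 0.287 is R, not R Square.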
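The analysis of variance (ANOVA) for the regression is presented in Table 10 below. The ANOVA tests the overall significance of the model. The model is not statistically significant, F(1, 29) = 2.602, p = .118, so age is not a useful predictor of the number of hours of sleep in this sample.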
Table 10: ANOVA (a)

| Model | | Sum of Squares | df | Mean Square | F | Sig. |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | Regression | 10.109 | 1 | 10.109 | 2.602 | .118 (b) |
| | Residual | 112.665 | 29 | 3.885 | | |
| | Total | 122.774 | 30 | | | |

a. Dependent Variable: hours sleep/ week ends
b. Predictors: (Constant), age
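The entries of the ANOVA table are linked by simple arithmetic, which also ties the table back to the model summary. A minimal sketch using only the sums of squares and degrees of freedom from Table 10:

```python
import math

# Sums of squares and degrees of freedom from Table 10
ss_regression, df_regression = 10.109, 1
ss_residual, df_residual = 112.665, 29

# Mean squares
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

# F statistic for the overall model
f_stat = ms_regression / ms_residual

# R Square is the share of the total sum of squares explained by the model
ss_total = ss_regression + ss_residual
r_squared = ss_regression / ss_total

# Standard error of the estimate is the square root of the residual mean square
std_error_estimate = math.sqrt(ms_residual)

print(f"F = {f_stat:.3f}, R^2 = {r_squared:.3f}, SEE = {std_error_estimate:.4f}")
```

The recovered R Square and standard error of the estimate agree with the model summary, which is a useful consistency check when transcribing output by hand.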
Table 11 presents the regression coefficients. As can be seen, the coefficient of the respondent’s age is -0.051. This value implies that each additional year of age is associated with a decrease of 0.051 hours in predicted weekend sleep; equivalently, a respondent one year younger is predicted to sleep 0.051 hours longer. The coefficient is not statistically significant (t = -1.613, p = .118), so this estimate should not be over-interpreted.
Table 11: Coefficients (a)

| Model | | Unstandardized B | Std. Error | Standardized Beta | t | Sig. |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | (Constant) | 10.298 | 1.275 | | 8.077 | .000 |
| | age | -.051 | .031 | -.287 | -1.613 | .118 |

a. Dependent Variable: hours sleep/ week ends
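To illustrate how the coefficients are used, the sketch below plugs them into the fitted line and predicts weekend sleep for two ages. The function name and the example ages are hypothetical, chosen only for illustration, and because the age coefficient is not significant the predictions should be read as a demonstration of the arithmetic rather than as reliable forecasts.

```python
# Unstandardized coefficients from the coefficients table
INTERCEPT = 10.298
AGE_SLOPE = -0.051

def predict_sleep_hours(age):
    """Predicted hours of sleep/weekend for a respondent of the given age."""
    return INTERCEPT + AGE_SLOPE * age

# Hypothetical example ages (not taken from the dataset)
hours_at_30 = predict_sleep_hours(30)
hours_at_40 = predict_sleep_hours(40)

print(f"age 30: {hours_at_30:.3f} hours")
print(f"age 40: {hours_at_40:.3f} hours")
print(f"difference over 10 years: {hours_at_30 - hours_at_40:.2f} hours")
```

A ten-year age difference changes the prediction by only about half an hour, which shows how shallow the fitted slope is.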
Conclusion:
The main aim of this study was to apply the statistical concepts learnt in class to real data. I obtained a dataset and carried out analyses ranging from simple univariate summaries to relationships between two variables. Results showed that, on average, participants spent 8.3857 hours sleeping during the weekends, with a median of 8 hours and a mode of 8 hours. There was no significant difference in hours of sleep/weekend between the male and the female respondents. Likewise, there was no significant association between sex and marital status, and age was not a significant predictor of hours of sleep/weekend.