Statistical Analysis Of Sleep Data: A Real Life Situation

Dataset Description

The aim of this study was to apply the statistical knowledge gained in class to real-life data by obtaining a dataset and conducting statistical analyses on it. The dataset used was obtained from Julie Pallant, who maintains a collection of datasets. The data file is named sleep.

This dataset is a real data file condensed from a study that explored the prevalence and impact of sleep problems on various aspects of people’s lives. The subjects were staff from a university in Melbourne, Australia, who were invited to complete a questionnaire containing a set of questions about their sleep behaviour (for instance, the number of hours slept per night). For the purposes of analysis I chose a sample of 35 from the dataset.

The data contains 35 observations on seven variables (three categorical and four numerical/quantitative):

Variable	Type
Sex of the respondent	Categorical
Marital status of the respondent	Categorical
Highest education level achieved	Categorical
Age of the respondent	Numerical/Quantitative
Hours of sleep in weekends	Numerical/Quantitative
How many hours sleep needed	Numerical/Quantitative
Cigs per day	Numerical/Quantitative

One Quantitative Variable Analysis

I began with a simple analysis of one quantitative variable, examining measures such as the mean, median, quartiles, range, minimum and maximum, as well as measures of distribution shape (skewness and kurtosis).

Table 1 below gives the summary statistics for the number of hours of sleep on weekends. The mean is 8.3857 hours and the median is 8 hours. The most frequent value (mode) is also 8 hours, with a minimum of 4 hours and a maximum of 14 hours. The margin of error for the 95% confidence interval of the mean is 0.6614 hours, which gives an interval of roughly 7.72 to 9.05 hours.

Table 1: hours sleep/ week ends
Mean	8.385714286
Standard Error	0.325451021
Median	8
Mode	8
Standard Deviation	1.925394208
Sample Variance	3.707142857
Kurtosis	2.200243786
Skewness	0.119166062
Range	10
Minimum	4
Maximum	14
Sum	293.5
Count	35
Confidence Level (95.0%)	0.661396051
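The standard error and the 95% confidence level figure in Table 1 can be reproduced from the mean, standard deviation and count alone. The sketch below is only an illustrative check, not the procedure used to produce the table; the critical value 2.0322 is an assumption taken from a standard t-table for 34 degrees of freedom.

```python
import math

# Values copied from Table 1.
n = 35
mean = 8.385714286
sd = 1.925394208

# Assumption: two-sided 95% critical value of Student's t with
# n - 1 = 34 degrees of freedom, read from a t-table.
T_CRIT_95_DF34 = 2.0322

standard_error = sd / math.sqrt(n)
margin = T_CRIT_95_DF34 * standard_error
ci_lower, ci_upper = mean - margin, mean + margin

print(f"standard error  = {standard_error:.4f}")
print(f"margin of error = {margin:.4f}")
print(f"95% CI          = ({ci_lower:.2f}, {ci_upper:.2f})")
```

The margin of error is half the width of the interval, so the interval itself is the mean plus or minus that value.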

The skewness value is 0.119. Because this value lies between −0.5 and +0.5, the distribution is approximately symmetric.

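The kurtosis value is 2.2. Assuming this is excess kurtosis, for which a normal distribution scores 0, a value well above 0 suggests a leptokurtic distribution (a sharper peak and heavier tails than the normal) rather than a mesokurtic one.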
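For readers who want to see how such shape measures are computed, here is a minimal sketch of the bias-adjusted sample skewness and excess kurtosis formulas that spreadsheet functions such as Excel's SKEW and KURT use. Whether Table 1 was produced with exactly these formulas is an assumption, and the `hours` list is hypothetical illustration data, not the study's 35 observations.

```python
import math

def _standardized(xs):
    """Return the values as z-scores using the sample standard deviation."""
    n = len(xs)
    mean = sum(xs) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))
    return [(x - mean) / s for x in xs]

def sample_skewness(xs):
    """Bias-adjusted sample skewness (requires at least 3 values)."""
    n = len(xs)
    return n / ((n - 1) * (n - 2)) * sum(z ** 3 for z in _standardized(xs))

def sample_excess_kurtosis(xs):
    """Bias-adjusted excess kurtosis; 0 corresponds to a normal distribution."""
    n = len(xs)
    fourth = sum(z ** 4 for z in _standardized(xs))
    return (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * fourth
            - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))

# Hypothetical weekend sleep hours, for illustration only.
hours = [4, 6, 7, 7.5, 8, 8, 8, 8.5, 9, 10, 14]
print(f"skewness        = {sample_skewness(hours):.3f}")
print(f"excess kurtosis = {sample_excess_kurtosis(hours):.3f}")
```

Because the kurtosis function subtracts the normal-distribution baseline, positive results indicate heavier tails than the normal and negative results indicate lighter tails.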

To test whether the data are normally distributed, the Kolmogorov-Smirnov and Shapiro-Wilk tests in SPSS can be used. The results of both tests are presented in Table 2 below.

Table 2: Tests of Normality
Variable	Test	Statistic	df	Sig.
hours sleep/ week ends	Kolmogorov-Smirnov^a	.221	35	.000
hours sleep/ week ends	Shapiro-Wilk	.908	35	.007
a. Lilliefors Significance Correction

Both tests reject the null hypothesis of normality at the 5% level (Kolmogorov-Smirnov p < .001; Shapiro-Wilk p = .007), so the data are not normally distributed.

A histogram was used to visualize the distribution of the data. By eye the distribution looks approximately normal, although the formal tests lead to a different conclusion.

One Categorical Variable Analysis

After analyzing one quantitative variable, I analyzed one categorical variable: the highest level of education achieved. The largest group of respondents (28.6%, n = 10) had secondary education. Undergraduate and postgraduate respondents each accounted for 25.7% (n = 9), and those with trade or post-secondary training accounted for 20% (n = 7).

Table 3: highest education level achieved
Education level	Frequency	Percent	Valid Percent	Cumulative Percent
secondary school	10	28.6	28.6	28.6
trade training/ post-secondary training	7	20.0	20.0	48.6
undergraduate degree	9	25.7	25.7	74.3
postgraduate degree	9	25.7	25.7	100.0
Total	35	100.0	100.0
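The percent and cumulative percent columns of Table 3 follow directly from the frequencies. A short sketch of that arithmetic, using the counts from the table (the variable names are illustrative):

```python
# Frequencies copied from Table 3, in the order the table lists them.
counts = {
    "secondary school": 10,
    "trade training/ post-secondary training": 7,
    "undergraduate degree": 9,
    "postgraduate degree": 9,
}

total = sum(counts.values())
rows = []
running = 0
for level, count in counts.items():
    running += count
    percent = round(100 * count / total, 1)
    cumulative = round(100 * running / total, 1)
    rows.append((level, count, percent, cumulative))

for level, count, percent, cumulative in rows:
    print(f"{level:<42}{count:>4}{percent:>8}{cumulative:>8}")
```

Because there are no missing values for this variable, the percent and valid percent columns are identical.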

Analysis of One Relationship between Two Categorical Variables:

Next I analyzed the relationship between two categorical variables, gender and marital status. The idea was to test whether there is an association between the two.

To explore the relationship, a chi-square test of independence was conducted with the following hypotheses:

H₀: There is no association between gender and marital status.

H₁: There is an association between gender and marital status.

This was tested at the 5% level of significance (α = 0.05).

Table 4: marital status * sex Cross tabulation
marital status	Statistic	female	male	Total
single	Count	4	6	10
single	Expected Count	5.7	4.3	10.0
single	% within sex	20.0%	40.0%	28.6%
married/defacto	Count	14	7	21
married/defacto	Expected Count	12.0	9.0	21.0
married/defacto	% within sex	70.0%	46.7%	60.0%
divorced	Count	2	2	4
divorced	Expected Count	2.3	1.7	4.0
divorced	% within sex	10.0%	13.3%	11.4%
Total	Count	20	15	35
Total	Expected Count	20.0	15.0	35.0
Total	% within sex	100.0%	100.0%	100.0%

Table 4 shows that 20% (n = 4) of the female respondents were single, compared with 40% (n = 6) of the male respondents. 70% (n = 14) of the females were married or in a de facto relationship, compared with only 46.7% (n = 7) of the males.

I conducted a chi-square test of association to investigate whether the sex of the respondents is associated with their marital status. The association was not statistically significant, χ²(2, N = 35) = 2.061, p = .357 (> .05), so we fail to reject the null hypothesis: marital status is not significantly associated with the sex of the respondent. Because three of the six cells have an expected count below 5, this result should be interpreted with caution.

Table 5: Chi-Square Tests
Test	Value	df	Asymp. Sig. (2-sided)
Pearson Chi-Square	2.061^a	2	.357
Likelihood Ratio	2.065	2	.356
Linear-by-Linear Association	.624	1	.430
N of Valid Cases	35
a. 3 cells (50.0%) have expected count less than 5. The minimum expected count is 1.71.
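The Pearson chi-square value can be recomputed by hand from the observed counts in the Table 4 cross-tabulation. The following is a minimal sketch of that calculation rather than the SPSS procedure itself; it relies on the fact that, for exactly 2 degrees of freedom, the chi-square upper-tail probability has the closed form exp(−x/2).

```python
import math

# Observed counts from Table 4; columns are (female, male).
observed = [
    [4, 6],    # single
    [14, 7],   # married/defacto
    [2, 2],    # divorced
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count under independence: row total * column total / n.
expected = [[r * c / n for c in col_totals] for r in row_totals]

chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(observed[0]) - 1)

# Closed form valid only for df == 2; other df need a statistics library.
p_value = math.exp(-chi_square / 2)

# Cells with expected count below 5 weaken the chi-square approximation.
small_cells = sum(e < 5 for row in expected for e in row)

print(f"chi-square = {chi_square:.3f}, df = {df}, p = {p_value:.3f}")
print(f"cells with expected count < 5: {small_cells}")
```

The count of small expected cells is worth reporting alongside the test, since the chi-square approximation is less reliable when many expected counts fall below 5.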

Analysis of One Relationship between a Categorical Variable and a Quantitative Variable:

In this section I looked at the relationship between one quantitative variable and one categorical variable: the hours of sleep/weekend and the sex of the respondent. I set my hypotheses as follows:

H₀: The mean hours of sleep/weekend for males and females are the same.

H₁: The mean hours of sleep/weekend for males and females are different.

This was tested at the 5% level of significance.

To test the hypothesis, an independent samples t-test was used. This test is commonly used to compare the means of two independent groups, as in our case.

Table 6: Group Statistics
Variable	sex	N	Mean	Std. Deviation	Std. Error Mean
hours sleep/ week ends	female	20	8.050	1.9595	.4381
hours sleep/ week ends	male	15	8.833	1.8484	.4773

Table 6 above gives the group statistics. The average hours of sleep/weekend is 8.833 for the male respondents and 8.05 for the female respondents. In this sample, male respondents spent more hours sleeping on average than female respondents.

Table 7: Independent Samples Test (hours sleep/ week ends)
Assumption	Levene’s F	Levene’s Sig.	t	df	Sig. (2-tailed)	Mean Difference	Std. Error Difference	95% CI Lower	95% CI Upper
Equal variances assumed	.081	.778	-1.199	33	.239	-.7833	.6535	-2.1128	.5461
Equal variances not assumed			-1.209	31.209	.236	-.7833	.6479	-2.1043	.5377

An independent samples t-test was done to establish whether there exists any difference in the average number of hours spent sleeping by the male and female respondents. Levene’s test did not indicate unequal variances (F = .081, p = .778), so the equal-variances result is reported. There was no significant difference in the hours of sleep/weekend between the male respondents (M = 8.83, SD = 1.85) and the female respondents (M = 8.05, SD = 1.96); t(33) = -1.199, p = 0.239 (> 0.05).
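The t statistics can be recovered from the group statistics in Table 6 alone. The sketch below computes both the pooled (equal variances assumed) and the Welch (equal variances not assumed) versions; it stops short of the p-value, which would need a t-distribution routine from a statistics library. Small differences in the third decimal place come from using the rounded means in Table 6.

```python
import math

# Group statistics copied from Table 6.
n_f, mean_f, sd_f = 20, 8.050, 1.9595   # female
n_m, mean_m, sd_m = 15, 8.833, 1.8484   # male

diff = mean_f - mean_m

# Equal variances assumed: pool the two sample variances.
df_pooled = n_f + n_m - 2
pooled_var = ((n_f - 1) * sd_f ** 2 + (n_m - 1) * sd_m ** 2) / df_pooled
se_pooled = math.sqrt(pooled_var * (1 / n_f + 1 / n_m))
t_pooled = diff / se_pooled

# Equal variances not assumed: Welch's t with Welch-Satterthwaite df.
v_f, v_m = sd_f ** 2 / n_f, sd_m ** 2 / n_m
se_welch = math.sqrt(v_f + v_m)
t_welch = diff / se_welch
df_welch = (v_f + v_m) ** 2 / (v_f ** 2 / (n_f - 1) + v_m ** 2 / (n_m - 1))

print(f"pooled: t({df_pooled}) = {t_pooled:.3f}, SE = {se_pooled:.4f}")
print(f"Welch:  t({df_welch:.3f}) = {t_welch:.3f}, SE = {se_welch:.4f}")
```

Both versions give nearly the same answer here because the two groups have similar standard deviations.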

Analysis of One Relationship between Two Quantitative Variables:

In this section, I analyzed the relationship between two quantitative variables: the age of the respondent and the number of hours of sleep/weekend. I used the Pearson correlation coefficient to assess the direction and strength of the linear relationship between the two variables.

Table 8: Correlations
Variable	Statistic	age	hours sleep/ week ends
age	Pearson Correlation	1	-.287
age	Sig. (2-tailed)		.118
age	N	31	31
hours sleep/ week ends	Pearson Correlation	-.287	1
hours sleep/ week ends	Sig. (2-tailed)	.118	
hours sleep/ week ends	N	31	35

The Pearson correlation coefficient is -0.287, and the relationship is not statistically significant at the 5% level of significance (r = -0.287, n = 31, p = .118). The negative coefficient indicates a weak negative linear relationship in the sample: older respondents tended to report slightly fewer hours of sleep on weekends. Because the correlation is not significant, however, we cannot conclude that this relationship holds in the population.
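The significance of a Pearson correlation is usually judged with a t statistic on n − 2 degrees of freedom. As a rough check under that standard formula, the sketch below converts the reported r and N from Table 8 into the t value; the p-value lookup itself is left to a t-table or statistics library.

```python
import math

# Values copied from Table 8.
r = -0.287
n = 31   # pairs with both age and weekend sleep hours recorded

df = n - 2
t_stat = r * math.sqrt(df) / math.sqrt(1 - r ** 2)

print(f"t({df}) = {t_stat:.3f}")
```

With a single predictor, this t value is the same one reported for the slope of the corresponding simple linear regression.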

Apart from the Pearson correlation, I fitted a simple linear regression model to verify the relationship between age and hours spent sleeping/weekend. The model predicts the number of sleep hours/weekend from the age of the respondent. Using the coefficients reported for this model, the fitted regression equation is:

hours of sleep/weekend = 10.298 − 0.051 × age

The value of R is 0.287 and the value of R-squared is 0.082; this implies that only 8.2% of the variation in weekend sleep hours is explained by the age of the respondent in the model.

Table 9: Model Summary
Model	R	R Square	Adjusted R Square	Std. Error of the Estimate
1	.287^a	.082	.051	1.9710
a. Predictors: (Constant), age

The analysis of variance (ANOVA) is presented in Table 10 below. The ANOVA tests the overall fit of the model. The regression is not statistically significant, F(1, 29) = 2.602, p = .118, so age is not a useful predictor of the number of hours of sleep in this sample.

Table 10: ANOVA^a
Model	Source	Sum of Squares	df	Mean Square	F	Sig.
1	Regression	10.109	1	10.109	2.602	.118^b
1	Residual	112.665	29	3.885
1	Total	122.774	30
a. Dependent Variable: hours sleep/ week ends
b. Predictors: (Constant), age
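The F statistic, R-squared, adjusted R-squared and standard error of the estimate can all be derived from the sums of squares in Table 10. A minimal sketch of those relationships, using only the values printed in the table:

```python
import math

# Sums of squares and degrees of freedom copied from Table 10.
ss_regression, df_regression = 10.109, 1
ss_residual, df_residual = 112.665, 29

ss_total = ss_regression + ss_residual
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

f_stat = ms_regression / ms_residual
r_squared = ss_regression / ss_total

# Adjusted R-squared penalises for the number of predictors.
n = df_regression + df_residual + 1
adjusted_r_squared = 1 - (1 - r_squared) * (n - 1) / df_residual

std_error_estimate = math.sqrt(ms_residual)

print(f"F({df_regression}, {df_residual}) = {f_stat:.3f}")
print(f"R squared          = {r_squared:.3f}")
print(f"adjusted R squared = {adjusted_r_squared:.3f}")
print(f"std. error of est. = {std_error_estimate:.4f}")
```

R-squared is the share of the total sum of squares attributed to the regression, which is why it equals the square of the correlation coefficient when there is a single predictor.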

Table 11 presents the regression coefficients. The coefficient of the respondent’s age is -0.051, which implies that each additional unit of age is associated with a predicted decrease of 0.051 hours in weekend sleep; equivalently, a one-unit decrease in age corresponds to a predicted increase of 0.051 hours. However, this coefficient is not statistically significant (t = -1.613, p = .118).

Table 11: Coefficients^a
Model	Term	B (Unstandardized)	Std. Error	Beta (Standardized)	t	Sig.
1	(Constant)	10.298	1.275		8.077	.000
1	age	-.051	.031	-.287	-1.613	.118
a. Dependent Variable: hours sleep/ week ends
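To make the coefficients concrete, the sketch below plugs the constant and the age coefficient from the table into a prediction function. The function name and the example ages are hypothetical, and age is assumed to be measured in years. Since the slope is not statistically significant, the predictions are illustrative only.

```python
# Unstandardized coefficients copied from the coefficients table.
INTERCEPT = 10.298
SLOPE_AGE = -0.051

def predicted_weekend_sleep(age_years):
    """Predicted hours of weekend sleep for a respondent of the given age."""
    return INTERCEPT + SLOPE_AGE * age_years

# Hypothetical ages, assumed to be in years.
for age in (30, 50):
    print(f"age {age}: {predicted_weekend_sleep(age):.2f} hours")
```

The difference between any two predictions is the slope times the age gap, so a 20-year gap corresponds to about one hour of predicted weekend sleep.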

Conclusion:

The main aim of this study was to apply statistical concepts learnt in class to the analysis of real data. I obtained a dataset and carried out analyses ranging from simple univariate summaries to relationships between two variables. I began by computing summary statistics and then examined the relationships. Results showed that on average participants spent 8.3857 hours sleeping on weekends, with a median of 8 hours and a mode of 8 hours. There was no significant difference in the hours of sleep/weekend between the male and female respondents. Likewise, neither the association between sex and marital status nor the relationship between age and weekend sleep hours was statistically significant.
