One hundred students at the Holmes Institute were sampled and their data recorded. The data are used to examine the relationship between the time each student spent preparing for the exam and the mark the student reported.
Solutions
The instructors can conduct a one-on-one survey of the students in the form of an interview, in which the students answer questions about their preferences regarding the time they spend in class and the time they spend learning on their own. Alternatively, the students can be given questionnaires on which to record their preferences. These survey methods are preferable because they are simple and take little time.
Simple random sampling would be used to select a sample of the required size. The method is preferable because it is simple and fast, and because every student has an equal chance of being selected, which makes the sample likely to be representative of the whole population.
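The selection step can be sketched in Python as follows. This is a minimal illustration, not the procedure used for the actual survey: the roster of 1,000 student IDs and the helper name simple_random_sample are assumptions, and only the sample size of 100 comes from the text.

```python
import random

# Hypothetical sampling frame: 1,000 student IDs (assumed, not from the source).
population = [f"S{i:04d}" for i in range(1, 1001)]


def simple_random_sample(frame, n, seed=None):
    """Draw n distinct units; every unit has the same chance of selection."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return rng.sample(frame, n)


sample = simple_random_sample(population, 100, seed=42)
print(len(sample), "students selected; first five:", sample[:5])
```

Drawing from the full roster, rather than from one class, is what gives every student the same chance of being selected.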
Since we are trying to evaluate how the marks scored by the students are related to preparation time, the dependent variable is the mark scored in the test and the independent variable is the time spent in preparation. Both variables are continuous and numeric: preparation time is the amount of time spent preparing for the exam, and the mark is the score obtained in the examination.
The instructor is likely to collect incorrect or misleading information, because some respondents may be unwilling to give correct answers and others may lack the time to give sufficient information.
The instructor may also obtain a sample that is not representative of the whole population if the simple random sampling is carried out poorly. For example, if he or she draws the random sample from a single class or from a group of individuals who most probably share common preferences, the sample will not be representative of the whole population.
The task is to develop a distribution table with class interval, frequency, relative frequency and cumulative frequency for each variable, to draw the frequency histogram, relative frequency histogram and cumulative relative frequency histogram, and to comment with reasons on the shape of the frequency histogram for each variable.
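A distribution table of this kind can be built as in the sketch below. The marks, the class width of 10 and the function name frequency_table are all assumptions made for illustration; they are not the survey data. Each class includes its lower limit and excludes its upper limit, except the last class, which includes both.

```python
def frequency_table(values, start, width, n_classes):
    """Return (interval, frequency, relative frequency, cumulative frequency) rows."""
    counts = [0] * n_classes
    upper = start + width * n_classes
    for v in values:
        idx = int((v - start) // width)
        if v == upper:
            idx = n_classes - 1  # the last class includes its upper limit
        if 0 <= idx < n_classes:
            counts[idx] += 1
    total = sum(counts)
    if total == 0:
        raise ValueError("no values fall inside the class intervals")
    rows, cumulative = [], 0
    for i, freq in enumerate(counts):
        cumulative += freq
        lower = start + i * width
        rows.append((f"{lower}-{lower + width}", freq, freq / total, cumulative))
    return rows


# Hypothetical marks, for illustration only.
marks = [35, 42, 48, 51, 55, 58, 60, 63, 67, 70, 72, 75, 78, 84, 91]
table = frequency_table(marks, start=30, width=10, n_classes=7)
for interval, freq, rel, cum in table:
    print(f"{interval:>7}  {freq:>2}  {rel:.3f}  {cum:>2}")
```

The frequency column drives the frequency histogram, the relative frequency column drives the relative frequency histogram, and dividing the cumulative frequency by the total gives the cumulative relative frequency.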
The frequency histogram for preparation time is slightly skewed to the left: the longer tail is on the low side, so the mean tends to be pulled below the median and the majority of the preparation times lie slightly above the average. The same is shown by the numeric measure of skewness in the table below.
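The direction of skew can be checked numerically. The sketch below uses the adjusted sample skewness formula, which is the form used by Excel's SKEW function; the data values and the name sample_skewness are assumptions for illustration. A negative result indicates a longer left tail and a positive result a longer right tail.

```python
import statistics


def sample_skewness(values):
    """Adjusted sample skewness: n / ((n-1)(n-2)) * sum(((x - mean) / s) ** 3)."""
    n = len(values)
    if n < 3:
        raise ValueError("at least three values are needed")
    mean = statistics.fmean(values)
    s = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    return n / ((n - 1) * (n - 2)) * sum(((x - mean) / s) ** 3 for x in values)


# Hypothetical preparation times with a long left tail.
left_tailed = [1, 6, 7, 8, 8, 9, 9, 10]
skew = sample_skewness(left_tailed)
print("skewness:", round(skew, 3))
print("mean:", statistics.fmean(left_tailed), "median:", statistics.median(left_tailed))
```

For these illustrative values the mean falls below the median, which is the usual pattern in a left-skewed distribution.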
The relative frequency histogram for preparation time is similar to the frequency histogram; the only difference is that the frequencies are expressed relative to the sum of all frequencies.
The frequency histogram for the scores obtained in the test is also slightly skewed to the left: the longer tail is on the low side, so the mean tends to be pulled below the median and the majority of the scores lie slightly above the average. The same is shown by the numeric measure of skewness in the table below.
The relative frequency histogram for the scores is similar to the frequency histogram; the only difference is that the frequencies are expressed relative to the sum of all frequencies.
The cumulative relative frequency histogram for the scores obtained in the test is shown below:
The scatter plot for the relationship between the marks scored and the time spent in preparing for the exam is shown below:
The variable marks is the dependent variable and is therefore placed on the y-axis, while the variable preparation time is the independent variable and is therefore placed on the x-axis. The scatter plot indicates that there is a positive linear relationship between marks scored and time spent in preparation.
The fitted line is shown in the scatter plot above. The resulting regression equation is:
This means that for every one-unit increase in the independent variable, preparation time, the predicted mark increases by 0.5831 on average.
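The least-squares line behind such a slope can be sketched as follows. The five data points and the name fit_line are assumptions for illustration; only the slope of 0.5831 is taken from the text, and the intercept of the original equation is not reproduced here because it is not stated.

```python
def fit_line(x, y):
    """Ordinary least squares for one predictor: returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical preparation times and marks, for illustration only.
time_spent = [1, 2, 3, 4, 5]
marks = [52, 55, 61, 60, 67]
slope, intercept = fit_line(time_spent, marks)
print("slope:", slope, "intercept:", intercept)

# Reading the slope reported in the text: the effect is additive, so
# 10 extra units of preparation time shift the predicted mark by 10 * 0.5831.
REPORTED_SLOPE = 0.5831
extra_marks = REPORTED_SLOPE * 10
print("predicted gain for 10 extra units:", round(extra_marks, 3))
```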
The numerical measures determined by functions in Excel are shown in the table below:
The numerical measure that captures the strength and direction of the linear relationship between the dependent and the independent variable is the correlation coefficient (Linoff, 2008). It is shown in the table below:
The correlation coefficient is 0.5466. Its positive sign indicates a positive linear relationship between the dependent and the independent variable. Since the value is slightly closer to one than to zero, the relationship is moderately strong.
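The correlation coefficient can be computed directly from its definition, as in the sketch below. The data points and the name pearson_r are assumptions for illustration; only the value 0.5466 comes from the text. In a regression with a single predictor, the square of the correlation coefficient equals the share of variation explained, which gives a second way to read the reported value.

```python
import math


def pearson_r(x, y):
    """Pearson correlation: covariation of x and y scaled by both spreads."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical preparation times and marks, for illustration only.
time_spent = [1, 2, 3, 4, 5]
marks = [52, 55, 61, 60, 67]
r = pearson_r(time_spent, marks)
print("r for the illustrative data:", round(r, 4))

# Reported coefficient from the text; its square is the explained share
# of variation in marks for a one-predictor regression.
REPORTED_R = 0.5466
explained_share = REPORTED_R ** 2
print("explained share implied by the reported r:", round(explained_share, 4))
```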
Part of the Excel output from a multiple regression is used to determine whether the heights of sons are related to the heights of their fathers and mothers. It is used to answer the series of questions below:
The standard error of the statistic and what it actually means
The standard error measures how precisely a sample statistic estimates the corresponding population quantity. Other things being equal, a larger sample gives a smaller standard error and a smaller sample gives a larger one. In this case the standard error of the model (the standard error of the estimate) is 8.068, meaning that the observed heights of sons typically deviate from the heights predicted by the regression by about 8.068 units. Whether that is acceptable depends on the scale of the height data; on its own it does not show that the sample is representative of the population. The standard error is also used as a measure of the precision with which each regression coefficient is estimated. Here the standard error of the coefficient of the first variable is 0.0412 and that of the second variable is 0.0395. The coefficient of variable 1 (0.4849) is almost twelve times its standard error, so it can be distinguished from zero. The coefficient of variable 2 (-0.0229) is smaller in magnitude than its standard error, so it cannot be distinguished from zero.
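The comparison of each coefficient with its standard error is the t-ratio, which can be recomputed from the figures quoted in this solution. The sketch below uses only those rounded figures; the dictionary layout and variable names are assumptions.

```python
# (coefficient, standard error) as quoted in the text, rounded to four decimals.
coefficients = {
    "X1": (0.4849, 0.0412),
    "X2": (-0.0229, 0.0395),
}

# t-ratio: how many standard errors the estimate lies away from zero.
t_ratios = {name: b / se for name, (b, se) in coefficients.items()}

for name, t in t_ratios.items():
    print(f"{name}: t = {t:.4f}")
```

The results differ slightly from the t-statistics of 11.7772 and -0.5811 in the Excel output because the inputs here are rounded.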
The coefficient of determination and what it means.
The coefficient of determination is the R-squared value. It gives the proportion of the variation in the dependent variable that is explained by the regression model. In this case the coefficient of determination is 0.2672 (26.72%), meaning that only 26.72% of the variation of the dependent variable around its mean is explained by the independent variables. In other words, the remaining 73.28% of the variation is left unexplained by the model.
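The coefficient of determination can be computed as one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below shows this on made-up observed and fitted values; the data and the name r_squared are assumptions, not the heights data.

```python
def r_squared(observed, fitted):
    """Share of variation around the mean that the fitted values account for."""
    mean_y = sum(observed) / len(observed)
    ss_residual = sum((y - f) ** 2 for y, f in zip(observed, fitted))
    ss_total = sum((y - mean_y) ** 2 for y in observed)
    return 1 - ss_residual / ss_total


# Hypothetical observed values and the values predicted by a fitted line.
observed = [52, 55, 61, 60, 67]
fitted = [52.0, 55.5, 59.0, 62.5, 66.0]
r2 = r_squared(observed, fitted)
print("R-squared:", round(r2, 4))
```

A value of 0.2672, as reported for the heights model, would mean that the residual sum of squares is still 73.28% of the total.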
The coefficient of determination adjusted for degrees of freedom, the meaning of the coefficient of determination and the adjusted coefficient of determination, and what they say about how well the model fits the data.
The adjusted coefficient of determination is the adjusted R-squared value of the model; it corrects R-squared for the number of independent variables in the model. In this case its value is 0.2635 (26.35%). Both measures describe the proportion of the variation in the dependent variable that the model explains. However, the adjusted coefficient of determination is the one to use when there is more than one independent variable, because R-squared never decreases when a variable is added. Since this model has two independent variables, the adjusted coefficient of determination is the more appropriate measure of fit. The two values are close, and both indicate that the model explains only about a quarter of the variation in the heights of sons.
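The adjustment follows the standard formula 1 - (1 - R²)(n - 1)/(n - k - 1), where n is the number of observations and k the number of independent variables. In the sketch below, R² = 0.2672 and k = 2 come from the text, but the sample size is not stated in this solution; n = 400 is an assumption, chosen only because it is consistent with the reported adjusted value.

```python
def adjusted_r_squared(r2, n, k):
    """Penalise R-squared for the number of predictors k, given n observations."""
    if n - k - 1 <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


R_SQUARED = 0.2672      # from the text
N_PREDICTORS = 2        # father's and mother's height
ASSUMED_N = 400         # hypothetical: the sample size is not given in the source

adj = adjusted_r_squared(R_SQUARED, ASSUMED_N, N_PREDICTORS)
print("adjusted R-squared:", round(adj, 4))
```

With a large sample and only two predictors the penalty is small, which is why the adjusted and unadjusted values are so close.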
The overall utility of the model is tested with the F-test, while the p-values of the coefficients and their respective t-statistics test each predictor individually. A p-value less than the chosen level of significance, together with a t-statistic of large magnitude, means that the predictor is statistically significant (Rumsey, 2007). In this case, using the conventional significance level of 0.05, the first independent variable (X1) has a p-value of 0.0000 and a t-statistic of 11.7772, meaning it is statistically significant. In contrast, the second independent variable (X2) has a p-value of 0.5615 and a t-statistic of -0.5811. Since the p-value is greater than the significance level and the t-statistic is close to zero, this variable is not statistically significant and can therefore be dropped. The overall model is statistically significant, since the value of Significance F is less than the chosen significance level of 0.05.
In this model there is more than one independent variable, so each coefficient tells us how much the dependent variable on the y-axis changes for a one-unit change in that independent variable when all other independent variables are held constant. The first independent variable (X1) has a coefficient of 0.4849, meaning that the dependent variable increases by 0.4849 units for a one-unit increase in this variable. Likewise, the second independent variable (X2) has a coefficient of -0.0229, so the dependent variable decreases by 0.0229 units for a one-unit increase in this variable.
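The "held constant" reading of the coefficients can be illustrated with a small calculation. The sketch below uses the two coefficients quoted in the text; the function name predicted_change is an assumption, and the intercept is left out because it is not given and cancels when only changes are compared.

```python
B_X1 = 0.4849    # coefficient of the first independent variable (from the text)
B_X2 = -0.0229   # coefficient of the second independent variable (from the text)


def predicted_change(delta_x1=0.0, delta_x2=0.0):
    """Change in the predicted dependent variable for given changes in X1 and X2."""
    return B_X1 * delta_x1 + B_X2 * delta_x2


# One variable moves while the other is held constant (its change is zero).
change_x1_only = predicted_change(delta_x1=2.0)
change_x2_only = predicted_change(delta_x2=1.0)
print("X1 up by 2, X2 fixed:", round(change_x1_only, 4))
print("X2 up by 1, X1 fixed:", round(change_x2_only, 4))
```

The effect is additive: doubling the change in a predictor doubles the change in the prediction, rather than multiplying the predicted value itself.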
Determination of whether the data allow the statistics practitioner to infer that the heights of sons and fathers are linearly related.
Taking the first variable in the table, labelled “Year”, as the first independent variable (X1) representing the fathers' heights, the data allow the statistics practitioner to infer that the heights of sons and fathers are linearly related in the positive direction. This is because this predictor is statistically significant, as determined above.
Determination of whether the data allow the statistics practitioner to infer that the heights of sons and mothers are linearly related.
Taking the second variable in the table, labelled “Rate”, as the second independent variable (X2) representing the mothers' heights, the statistics practitioner cannot conclude that the heights of sons and mothers are linearly related, even though the variable has a negative coefficient that would suggest a negative linear relationship. This is because the variable is not statistically significant, as determined above.
References
Linoff, G. (2008). Data analysis using SQL and Excel. Indianapolis, Ind.: Wiley Pub.
Rumsey, D. (2007). Intermediate statistics for dummies. 1st ed. Hoboken, N.J.: Wiley.