Baycoast is a (fictitious) local government area (called a ‘city’) within greater Melbourne, Australia. It consists of a number of different suburbs, all with their own history of development. The city grew in
different stages, with new suburbs gradually emerging. It covers some wealthy suburbs and some not so wealthy. As the name would indicate, the city is located on the Bay.
The city stretches for several kilometres along the Bay’s lovely beaches, and for several kilometres inland. About 60,000 people live in the suburbs of Baycoast.
The main objective is to conduct exploratory, descriptive and causal analysis is to gain a comprehensive understanding of house prices in the Baycoast region and an understanding of the
most important factors that impact prices. Your analysis will be based on a random sample of 120 houses from the city. Note that for the purpose of the assignment the unit of analysis is a ‘House’. It
is defined as a stand-alone dwelling. That is, flats, apartments, etc are not included in the database.
The assignment requires five separate tasks:
– Summary of House Prices
– Factors influencing house prices
– Development of a multiple regression model
– Time Series analysis
– Critique the Business Research Approach
This study data includes 24 variables. Out of these House no. is an identifying variable same as serial number. Price ($’000) is the dependent variable of this study. And Rest 22 variables; Rooms, Lot Size(sq m), Age, Area (sq m), Material, To Train(km), To Bus(km), To Shops(km), Street, Storeys, Style, Bedrooms, Bathrooms, Kitchen, Heating, AirCon, Bay Views, Suburb, Weekly Rent $, Rental Return %, Condition and Rental Status, are independent variables. Out of these indpendent variables; Lot Size(sq m), Age, Area (sq m), To Train(km), To Bus(km), To Shops(km), Bay Views, Weekly Rent $ and Rental Return % are continuous variables while rest are categorical variables.
The dependent variable price is a continuous variable. So, it described descriptively by calculating mean and standard deviation. To understand the distribution of the data skewness and kurtosis were also calculated.
Mean for Price is 886.57 and median is 852, median is little smaller than mean implying the positive skewed or skewed to the right which is also supported by the positive value of skewness (0.426). Negative value of kurtosis (-0.1478) implies the slight flat peaks of the normal curve as can be observed in the histogram plotted below. Standard Deviation is 324.95.
Variable price seem to follow almost normal distribution with slight right skewness and flat peak as observed in the histogram presented below.
Only one outlier with a value of >1500 was found as an outlier in the boxplot.
To find the factors affecting House Price, first correlation was run for all the independent variables with variable price. Amongst these correlation coefficients, any four values of correlation coefficient greater than 0.4 were selected and these factors were studied. The cut-off value of 0.4 was chosen so as to have variables with at least moderately good correlation with the variable price. So, based on the correlation coefficient the variables selected are – Rooms, Street, Bay Views and Weekly Rent ($). By common sense if we think then we will find that these variables really affect the price of a house.
Correlation coefficients are reported below –
|
Correlation coefficient |
Rooms |
0.505469 |
Lot Size(sq m) |
0.41083 |
Age |
-0.36346 |
Area (sq m) |
0.567769 |
Material |
0.164491 |
To Train(km) |
0.003012 |
To Bus(km) |
-0.02383 |
To Shops(km) |
-0.08373 |
Street |
0.72257 |
Storeys |
0.565098 |
Style |
0.364688 |
Bedrooms |
0.539745 |
Bathrooms |
0.331223 |
Kitchen |
-0.05608 |
Heating |
0.243694 |
AirCon |
0.122417 |
Bay Views |
0.676194 |
Suburb |
0.373759 |
Weekly Rent $ |
0.665591 |
Rental Return % |
-0.40384 |
Condition |
0.354435 |
Rental Status |
0.068507 |
The correlation coefficients for the factors selected are –
|
Correlation coefficient |
Rooms |
0.505469 |
Street |
0.72257 |
Bay Views |
0.676194 |
Weekly Rent $ |
0.665591 |
Scatter plot between Weekly Rent and Price was plotted to visually inspect the relation between these two variables. Scatter plot was plotted between these two variables only because these are continuous variables. Slight linear relationship is observed between Weekly Rent and Selling Price.
To compare the two continuous variables descriptively, descriptive statistics described below –
Price($’000) |
|
Weekly Rent $ |
|
Mean |
886.575 |
Mean |
604.3333333 |
Standard Error |
29.66343575 |
Standard Error |
20.67073241 |
Median |
852 |
Median |
595 |
Mode |
811 |
Mode |
835 |
Standard Deviation |
324.9466579 |
Standard Deviation |
226.4365284 |
Sample Variance |
105590.3305 |
Sample Variance |
51273.5014 |
Kurtosis |
-0.14778497 |
Kurtosis |
0.225284528 |
Skewness |
0.426005063 |
Skewness |
0.523856086 |
Range |
1569 |
Range |
1170 |
Minimum |
192 |
Minimum |
200 |
Maximum |
1761 |
Maximum |
1370 |
Sum |
106389 |
Sum |
72520 |
Count |
120 |
Count |
120 |
Confidence Level(95.0%) |
58.73656476 |
Confidence Level(95.0%) |
40.9301142 |
The standard error and skewness of the two variables is quite comparable. Weekly rent has positive kurtosis as compared to negative kurtosis for Price implying steeper peaks than Price.
The other three variables are categorical variables so mean Price against each category was calculated to view the effect of price of each category.
Average of Price($’000) |
|
Rooms |
Total |
5 |
744.86 |
6 |
721.00 |
7 |
808.00 |
8 |
1137.32 |
9 |
1182.69 |
Grand Total |
886.58 |
Average of Price($’000) |
|
Street |
Total |
3 |
520.92 |
4 |
684.29 |
5 |
866.93 |
6 |
884.40 |
7 |
882.60 |
8 |
1103.10 |
9 |
1359.88 |
10 |
1499.00 |
Grand Total |
886.58 |
Average of Price($’000) |
|
Bay Views |
Total |
0 |
753.12 |
0.1 |
817.50 |
0.2 |
867.25 |
0.3 |
991.17 |
0.4 |
1041.29 |
0.5 |
1079.40 |
0.6 |
1110.00 |
0.7 |
1176.80 |
0.8 |
1362.50 |
0.9 |
1455.00 |
1 |
1510.00 |
Grand Total |
886.58 |
From the above tables we can observe that as the category increases, i.e number of rooms increases or number of street increases or the proportion of Bay views increases then Price of the House also increases with only one exception of price for 7 Street is less than price for Street 6.
As the data contains 22 independent variables and Excel has limitation on the number of dependent variables to be used, two multiple regression with 11 variables each were run sing Data analysis toolpak of excel. Amongst these two multiple regression whichever variables showed significant p-value (p < 0.05) for t-test for slope were included in the model. The model included 9 variables out of which variable Bedroom showed not significant p-value and was excluded. This leads to final multiple regression model with 8 independent variables influencing Price of a house. These variables are – Lot size, Age, Area, Street, Storeys, Bay Views, Weekly Rent and Rental rent.
Price ($ ‘000) = 116.0136 + 0.0642 * Lot size – 2.0876 *Age + 1.0457 * Area + 39.3334 * Street + 92.9236 * Storeys + 314.9985483 * Bay Views + 0.4481 * Weekly Rent – 66.6201 Rental rent
The variables Age and Rental rent are negatively linearly related to Price meaning if age of the house increases or the rental rate increases than Price decrease or vice versa. All other variables; Lot size, Area, Street, Storeys, Bay Views and Weekly Rent are positively linearly related meaning increase in any of these variables are associated with increase in house price.
The constant value of 116.0136 implies that if all the variables are zero still the price of house is 116.0136 ($ ‘000) which indirectly means its land price coz without anything there could be no house practically. The coefficients of variables imply that if the variable is increased by unit then Price of the house will increase by units equal to coefficient value. Eg: If the lot size increases by one unit then Price will increase by 0.0642 units.
The R squared value obtained for this model is 0.9804. R squared value represents goodness of fit of the model. So 98.04% value of goodness of fit implies very good fit of the model. This also explains that 98.04% of variance in the model is explained by the independent variables included in the model and only approx. 2% remains unexplained. Adjusted R square value is 97.90 which also imply very good fit of the model. The p value for Analysis of Variance obtained is very less which implies the model to be highly significant at 5% level of significance.
If we compare the positive coefficients then Bay views has the highest coefficient followed by Storeys followed by Street followed by Area followed by Weekly Rent followed by Lot size. Lot size influences the price of a house by least strength and Bay views by highest strength.
Time series data from 4th quarter of 2009 to 3rd quarter of 2013 was used to seasonalize the trend using the software Megastat an add-in of Excel. Deseasonalization was carried out using multiplicative model.
Forecast equation derived through regression is –
y = 25.42t+504.5
where 504.5 is constant and 25.42 is a coefficient or slope of the line and t is the number of serial order.
Using this equation Median House Price was forecasted from 4th quarter of 2013 to 3rd Quarter of 2014 using t from 17 to 20 as below –
t |
Median House price |
17 |
936.64 |
18 |
962.06 |
19 |
987.48 |
20 |
1012.9 |
Using the observed median house price provided we calculated Mean Absolute Percentage Error (MAPE) by following formula
MAPE = 100/n*∑(Actual – Forecast)/Actual
Where n = number of actual observations provided
t |
Forecasted |
Actual |
ratio |
abs ratio |
17 |
936.64 |
980.00 |
0.04 |
0.04 |
18 |
962.06 |
1062.00 |
0.09 |
0.09 |
19 |
987.48 |
1206.00 |
0.18 |
0.18 |
20 |
1012.90 |
954.00 |
-0.06 |
0.06 |
sum |
0.38 |
|||
MAPE |
9.53 |
The value of MAPE obtained for this data is 9.53 which imply good fit of the model and there is very less difference between forecasted values and observed values.
The data in this study includes 120 observations and 24 variables out of which one, Price, is a dependent variable. Rest 23 variables, except House no., are independent variable. House No. is identifying variable. Amongst all these variables few variables were more correlated with Price than other few. Few variables showed positive correlation and few negative. Most of the negative correlations observed were very weak. Out of positive correlation variables Street, Bay views and Weekly Rent show strong correlation.
Out of the 22 variables, 8 variables; Lot size, Age, Area, Street, Storeys, Bay Views, Weekly Rent and Rental rent were observed to determine the price of the house significantly. Thus, these are the factors which contribute to the estimation of House price significantly. Out of these Age and Rental rent are inversely related to House Price. The model showed a good fit with R squared value of approx. 98% implying that approx. 98% of variance in House Price is explained by these independent variables and model. Only approx. 2% of variance in House price remained unexplained. The model obtained was significant with highly significant p value at 5% level of significance implying that these results should be considered and are not just obtained by chance.
Bay views most strongly affect the House price and lot size with the least strength.
Using time series data from 4th Quarter of 2009 to 3rd quarter of 2013 was deseasonalised using Multiplicative model. Forecast equation was obtained using regression. Then values of median house price were forecasted from 4th quarter of 2013 to 3rd quarter of 2014. These values were then compared with the observed values by calculating Mean Absolute Percentage Error determining the fit of the forecasted values. Mean Absolute Percentage Error of 9.53 indicates good fit.
Thus, the results of multiple regression and time series analysis implies that this data provides a good fit. Thus, the model fit is able to forecast the values with high accuracy. This data has high potential.
Essay Writing Service Features
Our Experience
No matter how complex your assignment is, we can find the right professional for your specific task. Contact Essay is an essay writing company that hires only the smartest minds to help you with your projects. Our expertise allows us to provide students with high-quality academic writing, editing & proofreading services.Free Features
Free revision policy
$10Free bibliography & reference
$8Free title page
$8Free formatting
$8How Our Essay Writing Service Works
First, you will need to complete an order form. It's not difficult but, in case there is anything you find not to be clear, you may always call us so that we can guide you through it. On the order form, you will need to include some basic information concerning your order: subject, topic, number of pages, etc. We also encourage our clients to upload any relevant information or sources that will help.
Complete the order formOnce we have all the information and instructions that we need, we select the most suitable writer for your assignment. While everything seems to be clear, the writer, who has complete knowledge of the subject, may need clarification from you. It is at that point that you would receive a call or email from us.
Writer’s assignmentAs soon as the writer has finished, it will be delivered both to the website and to your email address so that you will not miss it. If your deadline is close at hand, we will place a call to you to make sure that you receive the paper on time.
Completing the order and download