Life expectancy at birth is a function of a population's mortality profile: it is the average number of years a newborn would live if the current mortality pattern remained constant in the future. It is a widely used indicator of a country's health status. In developed countries life expectancy has been rising steadily, owing to factors such as health awareness, health facilities, environment, habits, living standards and education. Australia and New Zealand are both developed countries with many similarities, including standard of living, tastes in food and music, and a livable climate.
In this study we examine life expectancy in Australia and New Zealand and compare it with that of other East Asia and Pacific countries. We study the relationship between life expectancy at birth and health expenditure per capita (current US$), and between life expectancy at birth and the crude death rate (per 1,000 people). We also group the countries by life expectancy at birth using cluster analysis.
This study will be useful for demographers, researchers and academicians. It reveals the difference in life expectancy between Australia and New Zealand. We collected the data from the World Bank (https://databank.worldbank.org).
We saved the data as a CSV (comma-separated values) file and loaded it into R using read.csv(). First, we set the working directory using setwd().
# Set the working directory, where "dir" is the path to the directory
> setwd("dir")
We then read the CSV file into R:
# Load the data
> Data = read.csv("data.csv", header = TRUE)
The data has 962 rows and 24 columns. We can obtain the dimensions of the data as follows:
> dim(Data)
[1] 962  24
We can inspect the structure of the data (column names and types) using str():
> str(Data)
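World Bank DataBank exports typically mark missing observations with "..", which causes read.csv() to treat the year columns as text. A minimal sketch, assuming that convention holds for this file, is to declare the marker through the na.strings argument so that the year columns are read as numeric. The small file written below is a hypothetical stand-in for data.csv with made-up values.

```r
# Write a tiny stand-in file with the same kind of layout (made-up values).
tmp <- tempfile(fileext = ".csv")
writeLines(c(
  "Country Name,Country Code,2013 [YR2013],2014 [YR2014]",
  "Australia,AUS,82.1,82.3",
  "American Samoa,ASM,..,.."
), tmp)

# na.strings = ".." turns the missing-value marker into NA on import,
# so the year columns are read as numeric rather than as character.
Demo <- read.csv(tmp, header = TRUE, na.strings = "..",
                 stringsAsFactors = FALSE)

dim(Demo)   # number of rows and columns
str(Demo)   # column names and types
```

Note that read.csv() rewrites a header such as "2014 [YR2014]" into the syntactic name X2014..YR2014., which is the form used later in this report.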
In this section, we study life expectancy at birth, health expenditure per capita (current US$) and the crude death rate (per 1,000 people) in Australia and New Zealand. The data give life expectancy at birth for the years 2001 to 2014.
We extract the required data using the filter() function from the dplyr library.
# Library for data extraction
> library(dplyr)
We extract the life expectancy at birth for Australia into the variable LE_AUS.
# The values are non-numeric, so we use as.numeric() to convert them
> LE_AUS = as.numeric(t(filter(Data, Country.Code == "AUS", Series.Code == "SP.DYN.LE00.IN")[, 5:18]))
> LE_AUS
 [1] 79.63415 79.93659 80.23902 80.49024 80.84146 81.04146
 [7] 81.29268 81.39512 81.54390 81.69512 81.89512 82.04634
[13] 82.14878 82.25122
We extract the life expectancy at birth for New Zealand into the variable LE_NZL.
> LE_NZL = as.numeric(t(filter(Data, Country.Code == "NZL", Series.Code == "SP.DYN.LE00.IN")[, 5:18]))
> LE_NZL
 [1] 78.69268 78.84634 79.14634 79.54878 79.85122 80.04878
 [7] 80.15122 80.35122 80.70244 80.70244 80.90488 81.15610
[13] 81.40732 81.40488
Similarly, we extract the health expenditure per capita (current US$) and the crude death rate (per 1,000 people) for Australia and New Zealand.
# Health expenditure per capita (current US$) for Australia and New Zealand
> HE_AUS = as.numeric(t(filter(Data, Country.Code == "AUS", Series.Code == "SH.XPD.PCAP")[, 5:18]))
> HE_AUS
 [1] 1665.200 1883.316 2370.881 2933.229 3214.031 3421.908
 [7] 4077.852 4410.438 4256.641 5324.517 6368.424 6543.524
[13] 6258.467 6031.107
> HE_NZL = as.numeric(t(filter(Data, Country.Code == "NZL", Series.Code == "SH.XPD.PCAP")[, 5:18]))
> HE_NZL
 [1] 1058.842 1261.338 1623.884 1992.664 2307.097 2315.654
 [7] 2713.535 3318.768 3145.237 3742.560 4251.403 4470.859
[13] 4661.795 4896.348
The crude death rate (per 1,000 people) for Australia and New Zealand is extracted in the same way into the variables CDR_AUS and CDR_NZL.
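Because the extraction pattern is the same for every indicator, it can be wrapped in a small helper. The following is a minimal sketch in base R rather than dplyr. The function name extract_series, the toy data frame and its values are hypothetical and only mimic the layout of the World Bank export (identifier columns followed by one column per year). The series code SP.DYN.CDRT.IN is assumed here to be the World Bank code for the crude death rate and should be checked against the Series.Code column of the actual file.

```r
# Toy data frame in the same wide layout as the World Bank export.
# The values are made up for illustration only.
toy <- data.frame(
  Country.Name = c("Australia", "Australia", "New Zealand"),
  Country.Code = c("AUS", "AUS", "NZL"),
  Series.Name  = c("Life expectancy", "Death rate", "Death rate"),
  Series.Code  = c("SP.DYN.LE00.IN", "SP.DYN.CDRT.IN", "SP.DYN.CDRT.IN"),
  X2001        = c("79.6", "6.6", "7.2"),
  X2002        = c("79.9", "6.8", ".."),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return one country's series as a numeric vector.
# year_cols gives the positions of the year columns (5:18 in the real data).
extract_series <- function(data, country, series, year_cols) {
  row <- data[data$Country.Code == country & data$Series.Code == series,
              year_cols, drop = FALSE]
  stopifnot(nrow(row) == 1)   # exactly one matching row is expected
  # unlist() flattens the one-row data frame; non-numeric entries become NA
  suppressWarnings(as.numeric(unlist(row, use.names = FALSE)))
}

CDR_AUS_toy <- extract_series(toy, "AUS", "SP.DYN.CDRT.IN", 5:6)
CDR_NZL_toy <- extract_series(toy, "NZL", "SP.DYN.CDRT.IN", 5:6)
```

With the real data, the same call with year_cols = 5:18 would play the role of the filter() expressions used in this section.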
We referred to Berenson et al. (2012), Bickel and Doksum (2015), Casella and Berger (2002), DeGroot and Schervish (2012), Devore and Berk (2007), Groebner et al. (2008) and Ross (2014).
Here we obtain summary statistics of life expectancy at birth, health expenditure per capita (current US$) and the crude death rate (per 1,000 people) for Australia and New Zealand.
We first create a data frame so that the statistics can be calculated for all variables simultaneously.
# tibble() replaces the deprecated data_frame()
> d = tibble(LE_AUS, LE_NZL, HE_AUS, HE_NZL, CDR_AUS, CDR_NZL)
> d
# A tibble: 14 x 6
     LE_AUS   LE_NZL   HE_AUS   HE_NZL CDR_AUS CDR_NZL
      <dbl>    <dbl>    <dbl>    <dbl>   <dbl>   <dbl>
 1 79.63415 78.69268 1665.200 1058.842     6.6    7.16
 2 79.93659 78.84634 1883.316 1261.338     6.8    7.10
 3 80.23902 79.14634 2370.881 1623.884     6.6    6.95
 4 80.49024 79.54878 2933.229 1992.664     6.5    6.95
 5 80.84146 79.85122 3214.031 2307.097     6.4    6.54
 6 81.04146 80.04878 3421.908 2315.654     6.4    6.75
 7 81.29268 80.15122 4077.852 2713.535     6.7    6.75
 8 81.39512 80.35122 4410.438 3318.768     6.7    6.85
 9 81.54390 80.70244 4256.641 3145.237     6.5    6.73
10 81.69512 80.70244 5324.517 3742.560     6.5    6.53
11 81.89512 80.90488 6368.424 4251.403     6.6    6.86
12 82.04634 81.15610 6543.524 4470.859     6.6    6.82
13 82.14878 81.40732 6258.467 4661.795     6.4    6.65
14 82.25122 81.40488 6031.107 4896.348     6.5    6.88
We obtain the minimum, first quartile, median, mean, third quartile and maximum of life expectancy at birth, health expenditure per capita (current US$) and the crude death rate (per 1,000 people) for Australia and New Zealand.
# Minimum, first quartile, median, mean, third quartile and maximum
> summary(d)
     LE_AUS          LE_NZL          HE_AUS         HE_NZL
 Min.   :79.63   Min.   :78.69   Min.   :1665   Min.   :1059
 1st Qu.:80.58   1st Qu.:79.62   1st Qu.:3003   1st Qu.:2071
 Median :81.34   Median :80.25   Median :4167   Median :2929
 Mean   :81.18   Mean   :80.21   Mean   :4197   Mean   :2983
 3rd Qu.:81.85   3rd Qu.:80.85   3rd Qu.:5854   3rd Qu.:4124
 Max.   :82.25   Max.   :81.41   Max.   :6544   Max.   :4896
    CDR_AUS         CDR_NZL
 Min.   :6.400   Min.   :6.530
 1st Qu.:6.500   1st Qu.:6.735
 Median :6.550   Median :6.835
 Mean   :6.557   Mean   :6.823
 3rd Qu.:6.600   3rd Qu.:6.933
 Max.   :6.800   Max.   :7.160
We obtained the standard deviation to study the variation in life expectancy at birth, health expenditure per capita (current US$) and the crude death rate (per 1,000 people) for Australia and New Zealand.
# Standard deviation of each variable
> apply(d, 2, sd)
      LE_AUS       LE_NZL       HE_AUS       HE_NZL
   0.8428499    0.9043788 1696.8620220 1285.6333224
     CDR_AUS      CDR_NZL
   0.1222500    0.1846172
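We can see that life expectancy at birth and the crude death rate vary slightly more in New Zealand than in Australia (standard deviations of 0.904 versus 0.843 years, and 0.185 versus 0.122 per 1,000 people), whereas health expenditure per capita varies more in Australia (1696.9 versus 1285.6 US$).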
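Since the two countries differ in their mean levels, particularly for health expenditure, it may also help to look at variation relative to the mean. The sketch below computes the coefficient of variation (standard deviation divided by mean), which is not part of the original analysis. The helper name cv is hypothetical, and the values are copied from the data frame printed above.

```r
# Values copied from the data frame printed above (years 2001 to 2014).
LE_AUS <- c(79.63415, 79.93659, 80.23902, 80.49024, 80.84146, 81.04146, 81.29268,
            81.39512, 81.54390, 81.69512, 81.89512, 82.04634, 82.14878, 82.25122)
LE_NZL <- c(78.69268, 78.84634, 79.14634, 79.54878, 79.85122, 80.04878, 80.15122,
            80.35122, 80.70244, 80.70244, 80.90488, 81.15610, 81.40732, 81.40488)
HE_AUS <- c(1665.200, 1883.316, 2370.881, 2933.229, 3214.031, 3421.908, 4077.852,
            4410.438, 4256.641, 5324.517, 6368.424, 6543.524, 6258.467, 6031.107)
HE_NZL <- c(1058.842, 1261.338, 1623.884, 1992.664, 2307.097, 2315.654, 2713.535,
            3318.768, 3145.237, 3742.560, 4251.403, 4470.859, 4661.795, 4896.348)

# Hypothetical helper: coefficient of variation (unit-free measure of spread)
cv <- function(x) sd(x) / mean(x)

vars <- list(LE_AUS = LE_AUS, LE_NZL = LE_NZL, HE_AUS = HE_AUS, HE_NZL = HE_NZL)
cv_pct <- 100 * sapply(vars, cv)   # expressed in percent
round(cv_pct, 2)
```

Because the coefficient of variation has no units, it allows the spread of expenditure and of life expectancy to be compared on a common scale.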
We draw box plots to study the variation in more detail.
# Box plots
> boxplot(LE_AUS, LE_NZL, names = c("Australia", "New Zealand"), ylab = "Life expectancy at birth")
> boxplot(HE_AUS, HE_NZL, names = c("Australia", "New Zealand"), ylab = "Health expenditure per capita (current US$)")
> boxplot(CDR_AUS, CDR_NZL, names = c("Australia", "New Zealand"), ylab = "Death rate, crude (per 1,000 people)")
From Figures 1, 2 and 3 we can study the variation in the variables.
We studied the correlation of life expectancy at birth with health expenditure per capita (current US$) and with the crude death rate (per 1,000 people) for Australia and New Zealand.
# Correlation between life expectancy at birth and health expenditure per capita (current US$) for Australia
> cor(LE_AUS, HE_AUS)
[1] 0.9683106
# Correlation between life expectancy at birth and death rate, crude (per 1,000 people) for Australia
> cor(LE_AUS, CDR_AUS)
[1] -0.3309002
# Correlation between life expectancy at birth and health expenditure per capita (current US$) for New Zealand
> cor(LE_NZL, HE_NZL)
[1] 0.9805011
# Correlation between life expectancy at birth and death rate, crude (per 1,000 people) for New Zealand
> cor(LE_NZL, CDR_NZL)
[1] -0.5956892
We observed a high positive correlation between life expectancy at birth and health expenditure per capita (current US$) for both Australia (0.968) and New Zealand (0.981). The correlation between life expectancy at birth and the crude death rate (per 1,000 people) is negative for both countries, but it is weak for Australia (-0.331) and moderate for New Zealand (-0.596).
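With only 14 yearly observations per country, it may be worth checking whether the negative correlations with the death rate can be distinguished from zero. The sketch below uses cor.test() from base R, which is not part of the original analysis, on the values copied from the data frame printed earlier.

```r
# Values copied from the data frame printed earlier (years 2001 to 2014).
LE_AUS  <- c(79.63415, 79.93659, 80.23902, 80.49024, 80.84146, 81.04146, 81.29268,
             81.39512, 81.54390, 81.69512, 81.89512, 82.04634, 82.14878, 82.25122)
CDR_AUS <- c(6.6, 6.8, 6.6, 6.5, 6.4, 6.4, 6.7, 6.7, 6.5, 6.5, 6.6, 6.6, 6.4, 6.5)
LE_NZL  <- c(78.69268, 78.84634, 79.14634, 79.54878, 79.85122, 80.04878, 80.15122,
             80.35122, 80.70244, 80.70244, 80.90488, 81.15610, 81.40732, 81.40488)
CDR_NZL <- c(7.16, 7.10, 6.95, 6.95, 6.54, 6.75, 6.75, 6.85, 6.73, 6.53, 6.86,
             6.82, 6.65, 6.88)

# Pearson correlation with a t-test of the null hypothesis of zero correlation
test_aus <- cor.test(LE_AUS, CDR_AUS, method = "pearson")
test_nzl <- cor.test(LE_NZL, CDR_NZL, method = "pearson")

# Estimate, 95% confidence interval and p-value for each country
rbind(
  Australia   = c(r = unname(test_aus$estimate), lower = test_aus$conf.int[1],
                  upper = test_aus$conf.int[2], p = test_aus$p.value),
  New.Zealand = c(r = unname(test_nzl$estimate), lower = test_nzl$conf.int[1],
                  upper = test_nzl$conf.int[2], p = test_nzl$p.value)
)
```

The confidence interval is more informative than the point estimate alone when the sample is this small.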
The following scatter plots show the relationships between the variables.
# For Australia
> pairs(data.frame(LE_AUS, HE_AUS, CDR_AUS))
# For New Zealand
> pairs(data.frame(LE_NZL, HE_NZL, CDR_NZL))
Figure 4: Scatter plot of life expectancy at birth, health expenditure per capita (current US$) and death rate, crude (per 1,000 people) for Australia
Figure 5: Scatter plot of life expectancy at birth, health expenditure per capita (current US$) and death rate, crude (per 1,000 people) for New Zealand
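From Figures 4 and 5, we observed that life expectancy at birth increases almost linearly with health expenditure per capita in both countries, while its relationship with the crude death rate is negative and weaker.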
We carried out k-means clustering and linear regression in this section. For the cluster analysis we referred to Romesburg (2004) and Kaufman and Rousseeuw (2009).
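Clustering is the task of grouping a set of objects so that objects in the same group are more similar to each other in some characteristic than to objects in other groups. Each group is known as a cluster.
k-means clustering is a clustering technique that partitions the observations into k clusters, assigning each observation to the cluster with the nearest mean.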
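For reference, the standard k-means objective can be written as follows, where x_1, ..., x_n are the observations (here, life expectancy values), C_1, ..., C_k are the clusters and \mu_j is the mean of cluster C_j. The notation is introduced here for illustration and does not appear elsewhere in this report.

```latex
\min_{C_1,\dots,C_k} \; \sum_{j=1}^{k} \sum_{x_i \in C_j} (x_i - \mu_j)^2,
\qquad
\mu_j = \frac{1}{|C_j|} \sum_{x_i \in C_j} x_i
```

The minimised quantity is the total within-cluster sum of squares, which kmeans() in R reports as tot.withinss.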
Cluster analysis by life expectancy at birth in 2014 for East Asia and Pacific countries:
First, we create the required data.
# Life expectancy at birth for all countries and all years
> d1 = filter(Data, Series.Code == "SP.DYN.LE00.IN")
> Country_Name = d1$Country.Name
> LE_2014 = d1$X2014..YR2014.
> d2 = data.frame(Country_Name, LE_2014)
# Keep only the rows whose 2014 value differs from the first entry
> d3 = subset(d2, d2$LE_2014 != LE_2014[1])
> LE_2014 = d3$LE_2014
> Country = d3$Country_Name
> LE_2014
 [1] 82.25121951 78.80958537 68.21229268 75.78226829
 [5] 70.08912195 76.54168293 79.12602439 83.9804878
 [9] 68.8884878 83.58780488 65.95168293 70.07468293
[13] 82.15585366 66.11736585 80.55309756 74.71829268
[17] 69.10107317 69.46390244 65.85785366 77.57317073
[21] 81.40487805 62.60692683 68.26563415 73.51182927
[25] 82.64634146 67.93080488 74.42202439 68.25914634
[29] 72.79219512 71.91831707 75.62912195
> Country
 [1] Australia
 [2] Brunei Darussalam
 [3] Cambodia
 [4] China
 [5] Fiji
 [6] French Polynesia
 [7] Guam
 [8] Hong Kong SAR, China
 [9] Indonesia
[10] Japan
[11] Kiribati
[12] Korea, Dem. People's Rep.
[13] Korea, Rep.
[14] Lao PDR
[15] Macao SAR, China
[16] Malaysia
[17] Micronesia, Fed. Sts.
[18] Mongolia
[19] Myanmar
[20] New Caledonia
[21] New Zealand
[22] Papua New Guinea
[23] Philippines
[24] Samoa
[25] Singapore
[26] Solomon Islands
[27] Thailand
[28] Timor-Leste
[29] Tonga
[30] Vanuatu
[31] Vietnam
We divide the countries into three groups using k-means clustering.
> kmeans(LE_2014, 3)
K-means clustering with 3 clusters of sizes 9, 9, 13

Cluster means:
      [,1]
1 74.76543
2 81.61281
3 67.75531

Clustering vector:
 [1] 2 2 3 1 3 1 2 2 3 2 3 3 2 3 2 1 3 3 3 1 2 3 3 1 2
[26] 3 1 3 1 1 1

Within cluster sum of squares by cluster:
[1] 26.50978 26.48555 53.63666
 (between_SS / total_SS =  90.6 %)

Available components:
[1] "cluster"      "centers"      "totss"
[4] "withinss"     "tot.withinss" "betweenss"
[7] "size"         "iter"         "ifault"
We can assign the countries to groups using the clustering vector. Australia and New Zealand both fall in the second cluster, which has the highest mean life expectancy (81.61 years). The between-cluster sum of squares accounts for about 90.6% of the total sum of squares, so the three clusters capture most of the variation.
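Because kmeans() starts from random centres, the cluster numbers (and occasionally the partition itself) can change from run to run. The sketch below shows one way to make the grouping reproducible and to list the countries in each group. The seed value, the choice nstart = 25 and the labels "low", "middle" and "high" are illustrative assumptions. The values and country names are copied from the output above.

```r
# Life expectancy at birth in 2014 and country names, copied from the output above.
LE_2014 <- c(82.25121951, 78.80958537, 68.21229268, 75.78226829, 70.08912195,
             76.54168293, 79.12602439, 83.9804878, 68.8884878, 83.58780488,
             65.95168293, 70.07468293, 82.15585366, 66.11736585, 80.55309756,
             74.71829268, 69.10107317, 69.46390244, 65.85785366, 77.57317073,
             81.40487805, 62.60692683, 68.26563415, 73.51182927, 82.64634146,
             67.93080488, 74.42202439, 68.25914634, 72.79219512, 71.91831707,
             75.62912195)
Country <- c("Australia", "Brunei Darussalam", "Cambodia", "China", "Fiji",
             "French Polynesia", "Guam", "Hong Kong SAR, China", "Indonesia",
             "Japan", "Kiribati", "Korea, Dem. People's Rep.", "Korea, Rep.",
             "Lao PDR", "Macao SAR, China", "Malaysia", "Micronesia, Fed. Sts.",
             "Mongolia", "Myanmar", "New Caledonia", "New Zealand",
             "Papua New Guinea", "Philippines", "Samoa", "Singapore",
             "Solomon Islands", "Thailand", "Timor-Leste", "Tonga", "Vanuatu",
             "Vietnam")

set.seed(1)                                   # arbitrary seed, for reproducibility
fit <- kmeans(LE_2014, centers = 3, nstart = 25)

# Relabel the clusters by their mean so that the labels have a fixed meaning.
ord   <- order(fit$centers[, 1])              # cluster ids, lowest mean first
level <- factor(match(fit$cluster, ord), levels = 1:3,
                labels = c("low", "middle", "high"))

groups <- split(Country, level)               # country names in each group
groups$high
fit$betweenss / fit$totss                     # share of variation between clusters
```

Using several random starts (nstart) keeps the best of the candidate partitions, which makes the result less sensitive to the initial centres.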
We fit a linear regression of life expectancy at birth on time for Australia and New Zealand. We referred to Baayen (2008) and Hair et al. (1998).
Australia:
We have life expectancy at birth for the years 2001 to 2014. We fitted a linear regression on year so that life expectancy at birth can be predicted for future years.
> LE_AUS = as.numeric(t(filter(Data, Country.Code == "AUS", Series.Code == "SP.DYN.LE00.IN")[, 5:18]))
> Year = 2001:2014
> dataAUS = data.frame(Year, LE_AUS)
> result = lm(LE_AUS ~ Year, data = dataAUS)
> summary(result)

Call:
lm(formula = LE_AUS ~ Year, data = dataAUS)

Residuals:
     Min       1Q   Median       3Q      Max
-0.25045 -0.09935  0.01686  0.10833  0.21686

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept) -3.174e+02  1.988e+01  -15.96 1.91e-09 ***
Year         1.985e-01  9.905e-03   20.04 1.36e-10 ***

Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.1494 on 12 degrees of freedom
Multiple R-squared:  0.971, Adjusted R-squared:  0.9686
F-statistic: 401.8 on 1 and 12 DF,  p-value: 1.359e-10
We found that R-squared is 0.971, which suggests that the linear fit is good. The estimated slope implies that life expectancy at birth in Australia increased by about 0.1985 years per calendar year over 2001 to 2014.
New Zealand:
As for Australia, we have life expectancy at birth for the years 2001 to 2014, and we fitted a linear regression on year so that life expectancy at birth can be predicted for future years.
> LE_NZL = as.numeric(t(filter(Data, Country.Code == "NZL", Series.Code == "SP.DYN.LE00.IN")[, 5:18]))
> Year = 2001:2014
> dataNZL = data.frame(Year, LE_NZL)
> result1 = lm(LE_NZL ~ Year, data = dataNZL)
> summary(result1)

Call:
lm(formula = LE_NZL ~ Year, data = dataNZL)

Residuals:
      Min        1Q    Median        3Q       Max
-0.195122 -0.086900  0.002895  0.080046  0.178344

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept) -3.496e+02  1.727e+01  -20.25 1.21e-10 ***
Year         2.141e-01  8.601e-03   24.89 1.07e-11 ***

Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.1297 on 12 degrees of freedom
Multiple R-squared:  0.981, Adjusted R-squared:  0.9794
F-statistic: 619.8 on 1 and 12 DF,  p-value: 1.068e-11
We found that R-squared is 0.981, which suggests that the linear fit is good. The estimated slope implies that life expectancy at birth in New Zealand increased by about 0.214 years per calendar year over 2001 to 2014.
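The fitted models can be used for prediction with predict(). The sketch below refits both regressions from the values printed earlier and computes 95% prediction intervals. The target years 2015 and 2020 and the object names are illustrative assumptions. Extrapolating a straight line beyond the observed period assumes that the 2001 to 2014 trend continues unchanged.

```r
# Values copied from the output printed earlier (years 2001 to 2014).
Year   <- 2001:2014
LE_AUS <- c(79.63415, 79.93659, 80.23902, 80.49024, 80.84146, 81.04146, 81.29268,
            81.39512, 81.54390, 81.69512, 81.89512, 82.04634, 82.14878, 82.25122)
LE_NZL <- c(78.69268, 78.84634, 79.14634, 79.54878, 79.85122, 80.04878, 80.15122,
            80.35122, 80.70244, 80.70244, 80.90488, 81.15610, 81.40732, 81.40488)

fit_aus <- lm(LE_AUS ~ Year, data = data.frame(Year, LE_AUS))
fit_nzl <- lm(LE_NZL ~ Year, data = data.frame(Year, LE_NZL))

# Hypothetical target years for the forecast
new_years <- data.frame(Year = c(2015, 2020))

# Point predictions with 95% prediction intervals (columns fit, lwr, upr)
pred_aus <- predict(fit_aus, newdata = new_years,
                    interval = "prediction", level = 0.95)
pred_nzl <- predict(fit_nzl, newdata = new_years,
                    interval = "prediction", level = 0.95)

cbind(new_years, Australia = pred_aus)
cbind(new_years, New.Zealand = pred_nzl)
```

The prediction interval widens as the target year moves further from the observed data, which reflects the growing uncertainty of the extrapolation.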
We observe that mean life expectancy at birth is higher in Australia than in New Zealand (81.18 versus 80.21 years), mean health expenditure per capita (current US$) is higher in Australia (4197 versus 2983), and the mean crude death rate (per 1,000 people) is lower in Australia (6.557 versus 6.823).
Life expectancy at birth is strongly and positively correlated with health expenditure per capita in both countries (0.968 for Australia and 0.981 for New Zealand), and negatively correlated with the crude death rate (-0.331 for Australia and -0.596 for New Zealand).
In the cluster analysis, Australia and New Zealand fall in the same cluster, the one with the highest life expectancy, and the three clusters account for about 90.6% of the total variation.
According to the fitted regressions, life expectancy at birth rose by about 0.1985 years per calendar year in Australia and by about 0.214 years per calendar year in New Zealand between 2001 and 2014.
Filtering the data and selecting the variables of interest was the main difficulty in this analysis; we solved it using the filter() function defined in the dplyr library. Once the desired data had been extracted, it was interesting to work on the problem under study. This study gave us confidence in analysing large data sets.
References
Berenson, M., Levine, D., Szabat, K.A. and Krehbiel, T.C., 2012. Basic business statistics: Concepts and applications. Pearson Higher Education AU.
Bickel, P.J. and Doksum, K.A., 2015. Mathematical statistics: basic ideas and selected topics, volume I (Vol. 117). CRC Press.
Casella, G. and Berger, R.L., 2002. Statistical inference (Vol. 2). Pacific Grove, CA: Duxbury.
DeGroot, M.H. and Schervish, M.J., 2012. Probability and statistics. Pearson Education.
Devore, J.L. and Berk, K.N., 2007. Modern mathematical statistics with applications. Cengage Learning.
Groebner, D.F., Shannon, P.W., Fry, P.C. and Smith, K.D., 2008. Business statistics. Pearson Education.
Ross, S.M., 2014. Introduction to probability models. Academic Press.
Baayen, R.H., 2008. Analyzing linguistic data: A practical introduction to statistics using R. Cambridge University Press.
Kaufman, L. and Rousseeuw, P.J., 2009. Finding groups in data: an introduction to cluster analysis (Vol. 344). John Wiley & Sons.
Romesburg, C., 2004. Cluster analysis for researchers. Lulu.com.
Hair, J.F., Black, W.C., Babin, B.J., Anderson, R.E. and Tatham, R.L., 1998. Multivariate data analysis (Vol. 5, No. 3, pp. 207-219). Upper Saddle River, NJ: Prentice Hall.