Health research, together with the protection of privacy within it, provides a significant benefit to society and is essential for improving health care. Information is one of the most significant organizational assets; it is precious and should therefore always be appropriately protected. Information security can be defined as the protection of information along with the hardware on which it is stored and processed. It combines operational and other internal controls that ensure the integrity of data and help keep it confidential (Fairfield and Engel 2015). Information security performs several important functions: it keeps the data that the organization has collected safe, protects the organization’s ability to operate, and protects the technology assets that the organization uses. It is crucial for an organization because information that is misused or left unprotected can cause serious harm to people’s lives.
All personal information is handled very carefully by the Australian Digital Health Agency. In recent years the volume of health care information collected has risen sharply. Because many individuals and groups of providers are involved in delivering health care, every detail of that care has to be documented. As the technologies used for therapy and diagnosis have advanced, such information has become far more readily available. On top of that, genetic data have also become easily available through prenatal testing. Documenting all risk factors is significant for promoting continuity of care and also serves as a defense against any charge of malpractice. The primary health care record serves many functions other than direct health care. It is a significant tool for health care providers, who use it to record their impressions, instructions and observations. Third-party payers, in turn, use patients’ records to process various kinds of payments.
Information privacy has become an important concern in the health care sector. It is very important to keep electronically stored health information confidential, because accessing health records and large volumes of data has become very easy. First, health records are stored electronically because photocopying paper records is not practical for a large number of medical reports. Secondly, computer-based health data have recently become a valuable commodity. Many companies gather information from physicians’ computers and from pharmacy records and sell it to other companies for profit. In recent years data from various sources have commonly been combined and linked with other profiles. When data are stored electronically, it becomes very easy to explore the database over the network, even from remote locations (Legislation.gov.au. 2018). However, if the security systems were not designed to record every access, anybody could enter the database and gather information without leaving any evidence of having done so. Providers can readily identify trends when they have access to electronic data concerning the health of the entire population (Digitalhealth.gov.au. 2018). Providers can also view information in the electronic health records about the patients they are treating in order to take preventive measures. The emergence of electronic health records (EHRs), paired with mobile technology, can be termed one of the biggest technological advancements in the health care industry in recent years. EHRs have also played an important role in the industry’s ability to keep evolving while improving overall quality (Legislation.gov.au. 2018).
With the advancement of electronic health records, provider workflows take much less time than with paper-based methods. When information related to patient health and financial management is entered electronically, most input errors can be avoided, which also lowers the risk of problems such as objections to reimbursement claims. A drawback of the legal protections for confidentiality is that the obligations are the same whether health records are kept on paper or on computer (Wei et al. 2014). However, the degree to which confidentiality must be maintained varies with the holder of the information and the type of information held.
Personal information is collected and protected by the Australian Digital Health Agency. The Agency is bound by the Australian Privacy Principles contained in the Privacy Act 1988. It deals with personal information in its capacity as the My Health Record System Operator. Personal information is information or an opinion about an identified individual, or an individual who is reasonably identifiable; the Agency uses it to communicate with people and to meet its business aims and obligations (Pearson 2013). The Agency mainly communicates with job applicants, former and current employees, business associates, providers of goods and services, and stakeholder organizations.
The ‘My Health Record System’ is maintained by the System Operator, which records the information of healthcare recipients. Other records are stored by registered repository operators. Data whose sharing is restricted by the laws of a State or Territory may not be shared. Criminal and civil penalties apply if a person collects, uses or discloses information from the database without authorization. In addition, enforceable undertakings and injunctions are available to enforce the provisions of the Act.
Members of the public have the right to request access to, and correction of, the personal information that the Agency holds about them. The Agency may ask for proof of identity before granting a request for access or correction. The Agency responds to enquiries and complaints as quickly as possible and keeps the individual informed of the progress of a complaint. It may gather additional personal information in order to investigate a complaint. For the purposes of the Act, a person with parental responsibility can be the authorized representative of a healthcare recipient only where the recipient is aged under 18. A healthcare recipient may have one or more authorized representatives, who act on the recipient’s behalf for the purposes of the Act. The Act applies to recipients subject to any modifications prescribed by the regulations (Malathi et al. 2014). Notably, the Act has effect as if things done by an authorized representative had been done by the healthcare recipient.
The Agency receives information by telephone, facsimile and mail, and from members of the public and healthcare operators. It collects job titles, images of individuals, employee records, bank account details, contact information and work histories. It also collects healthcare identifiers for individuals who are assisted by a healthcare organization to register for digital health services, and records of first aid administered to individuals on Agency premises. The Agency uses and discloses personal information to manage its employment relationships and duties. It may also use contact information when dealing with business associates and contractors, engaging and managing its workforce, delivering its functions, meeting legal obligations, and providing marketing information about services, goods, events or initiatives. Individuals who receive marketing materials from the Agency may opt out of further communications of that nature. The Agency may disclose information to the Department of Human Services in order to support healthcare organizations as part of digital health services. When an individual applies to register for digital health services, the Agency may disclose personal information to the ‘Document Verification Service (DVS)’ so that it can be checked against the records held by the document issuer (Thompson and Richard 2015). However, the security of data sent to the Agency over the internet cannot be guaranteed, because the individual’s own environment may be insecure. The Agency may also collect the IP address of the individual’s computer, along with geo-location data associated with that address. The Agency’s website, ‘www.digitalhealth.gov.au’, uses cookies and cache files. The Agency may also gather information from sign-up boxes and downloads.
Electronic health records are changing drastically to keep pace with major technological transformation, which underlines the need for targeted support (Peppet 2014). Large variations across the nation indicate a continuing role for the states in supporting the use of electronic health records. The structure of the electronic health market is therefore going through various changes as proposed by the authorities. As electronic health records have become a significant part of most hospitals, providers have seen changes in patient outcomes. According to a survey conducted by the National Center for Health Statistics, the electronic health record system helped to enhance patient care for seventy-five percent of providers. Electronic systems not only store data but also provide information for the patient’s benefit. The system also checks when patients start taking new medications and alerts the physician to potential issues. Digital health technologies have changed a great deal in recent years and play an important role in the lives of patients.
The exploratory data analysis indicates that-
The graphical plots are-
For each of the variables ‘Number of Time 30-59 days past due not worse’, ‘Number of Time 60-89 days past due not worse’ and ‘Number of Times 90 days late’, the most frequent value is 0 (with a frequency just under 150000 in each case).
The exploratory data analysis of monthly income indicates that the distribution is highly positively (right) skewed. The distribution of the number of open credit lines and loans is also positively skewed (most values lie below 20), as is the distribution of the number of real estate loans or lines (most values lie below 4). The distribution of age, by contrast, is approximately normal (most values lie between 40 and 65 years).
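The direction of skew described above can also be checked numerically rather than read off a histogram. The following is a minimal sketch, assuming the moment-based (Fisher-Pearson) skewness coefficient; the function name `skewness` and both sample lists are hypothetical illustrations and are not values from the data set.

```python
from statistics import mean, pstdev

def skewness(values):
    """Moment-based skewness: the mean of the cubed z-scores.

    Positive values indicate a right (positive) skew, negative values a
    left skew, and values near zero a roughly symmetric distribution.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return 0.0  # a constant sample has no skew
    return sum(((v - mu) / sigma) ** 3 for v in values) / len(values)

# Hypothetical monthly incomes: mostly modest values plus one very large one.
incomes = [2500, 3000, 3200, 3500, 4000, 4200, 5000, 6000, 9000, 40000]
# Hypothetical ages, spread fairly evenly around the early fifties.
ages = [40, 45, 48, 50, 52, 55, 58, 60, 62, 65]

print("income skewness:", round(skewness(incomes), 2))
print("age skewness:", round(skewness(ages), 2))
```

A clearly positive coefficient for the income-like sample and a coefficient close to zero for the age-like sample correspond to the right-skewed and roughly symmetric shapes described in the text.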
The scatter plot of ‘Number of dependents’ against ‘Number of Real Estate Loans or Lines’ shows that-
The number of real estate loans or lines is greater for a higher number of dependents.
The scatter plot of ‘Person experienced 90 days past due delinquency or worse’ against ‘Revolving Utilization of Unsecured Lines’ shows that where delinquency was not experienced (value = 0), the values of revolving utilization of unsecured lines are higher, whereas where delinquency was experienced (value = 1), they are comparatively lower.
The number of real estate loans or lines is plotted on the X-axis, revolving utilization of unsecured lines on the Y-axis and the number of open credit lines and loans on the Z-axis. The points are distinguished by delinquency of 90 days or more. The two levels (0 and 1) show that delinquency becomes more frequent at greater values of real estate loans or lines. At lower values of real estate loans or lines and of open credit lines and loans, delinquency is observed comparatively less often.
‘Revolving utilization of unsecured lines’ is plotted on the X-axis, ‘SeriousDlqin2yrs’ on the Y-axis and ‘Number of open Credit Lines and loans’ on the Z-axis. The points are marked according to the number of open credit lines and loans. Where ‘SeriousDlqin2yrs’ is present, the other factors have comparatively higher values; where it is absent, they have comparatively lower values.
The correlation matrix obtained from the raw data indicates that-
The selected variables found to be most correlated with the response variable ‘Serious delinquency in 2 years’ in the cleaned data are ‘age’ (0.115), ‘Debt Ratio’ (0.007), ‘Number of Open Credit Lines and Loans’ (0.031), ‘Number of Time 30-59 days past due not worse’ (-0.0119) and ‘Number Real-Estate loans or lines’ (0.007) (Waldhör and Baldauf 2015).
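Coefficients such as those quoted above are commonly Pearson correlations between each predictor and the response. The sketch below shows how one such coefficient could be computed, assuming Pearson’s r is the measure in use (the text does not say which coefficient was used); the function name `pearson` and the sample values are hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares around the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)

# Hypothetical ages and a hypothetical 0/1 response, for illustration only.
ages = [25, 32, 41, 47, 53, 60, 68, 74]
response = [0, 0, 1, 0, 1, 1, 1, 1]
print(round(pearson(ages, response), 3))
```

Repeating the call for every predictor against the response gives one column of a correlation matrix; values near zero, like most of those listed above, indicate a weak linear relationship.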
The decision tree process built in ‘RapidMiner’ is shown in the above screenshot.
Of the observations predicted as range2 (SeriousDlqin2yrs = 0), 33 truly belong to range2 and 29 truly belong to range1. The class precision is therefore 53.23% (Chirumamilla et al. 2014).
Of the observations predicted as range1 (SeriousDlqin2yrs = 1), 1972 truly belong to range2 and 27966 truly belong to range1. The class precision is therefore 93.41% (Bhargava et al. 2013).
The overall results are: accuracy 93.33%, classification error 6.67%, precision 93.41%, recall (for range1) 99.90%, F-measure 96.55%, sensitivity 99.90% and specificity 1.55% (Sharma, Sharma and Mansotra 2016).
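The figures above follow from the four cells of the confusion matrix. As a worked check, the sketch below recomputes them from the counts quoted in the text, taking range1 as the positive class; the function name `classification_metrics` is a hypothetical helper and not part of RapidMiner.

```python
def classification_metrics(tp, fp, fn, tn):
    """Standard metrics from the four cells of a binary confusion matrix."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)  # recall is also called sensitivity
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "specificity": tn / (tn + fp),
        "f_measure": 2 * precision * recall / (precision + recall),
    }

# Counts quoted in the text, with range1 taken as the positive class.
metrics = classification_metrics(
    tp=27966,  # predicted range1, truly range1
    fp=1972,   # predicted range1, truly range2
    fn=29,     # predicted range2, truly range1
    tn=33,     # predicted range2, truly range2
)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

Accuracy, precision, recall and F-measure computed this way agree with the reported values. Specificity computed from these pooled counts is 33 / 2005, or about 1.65%, which is slightly different from the reported figure; one possible explanation, offered here only as an assumption, is that the tool averages the metric across validation folds.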
The performance of the parameters indicates that the decision tree has a maximum depth of 40, followed by 25, 15 and 10 respectively.
A decision tree is a decision support tool that uses a tree-like graph or model of decisions to map out their probable consequences, including chance outcomes, utilities and resource costs. It allows an individual or organization to weigh possible actions against one another based on their costs, benefits and probabilities. Identifying the most desirable outcomes is important because it shows where lower-risk options should be preferred for the broader advantage.
The logistic regression process built in ‘RapidMiner’ is shown in the above screenshot.
The p-values of the independent variables in the previous table show that ‘Number of time 30-59 days past due not worse’ (p-value = 0), ‘Number of Real-Estate loans or lines’ (p-value = 0) and ‘age’ (p-value = 0) are significant factors. The other two predictors, ‘Debt ratio’ (p-value = 0.181 > 0.05) and ‘Number of open credit lines and loans’ (p-value = 0.208 > 0.05), are not significant.
Log-odds of ‘SeriousDlqin2yrs’ = 1.065 – 0.041* ‘NumberRealEstateLoansOrLines’ – 0.038* ‘NumberOfTime30-59DaysPastDueNotWorse’ + 0.003* ‘NumberOfOpenCreditLinesAndLoans’ + 0.033* ‘Age’ (Peng, Lee and Ingersoll 2002).
Odds of ‘SeriousDlqin2yrs’ = exp (1.065 – 0.041* ‘NumberRealEstateLoansOrLines’ – 0.038* ‘NumberOfTime30-59DaysPastDueNotWorse’ + 0.003* ‘NumberOfOpenCreditLinesAndLoans’ + 0.033* ‘Age’).
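The two equations above can be turned into a predicted probability through the logistic function, p = odds / (1 + odds). The sketch below uses the intercept and coefficients exactly as printed in the text; the applicant’s feature values and the function names are hypothetical and serve only to illustrate the arithmetic.

```python
from math import exp

INTERCEPT = 1.065
# Coefficients as printed in the log-odds equation.
COEFFICIENTS = {
    "NumberRealEstateLoansOrLines": -0.041,
    "NumberOfTime30-59DaysPastDueNotWorse": -0.038,
    "NumberOfOpenCreditLinesAndLoans": 0.003,
    "Age": 0.033,
}

def log_odds(features):
    """Linear predictor: intercept plus the weighted sum of the features."""
    return INTERCEPT + sum(COEFFICIENTS[name] * features[name] for name in COEFFICIENTS)

def odds(features):
    """Odds are the exponential of the log-odds."""
    return exp(log_odds(features))

def probability(features):
    """Logistic function: converts log-odds into a probability in (0, 1)."""
    return 1.0 / (1.0 + exp(-log_odds(features)))

# Hypothetical applicant, for illustration only.
applicant = {
    "NumberRealEstateLoansOrLines": 1,
    "NumberOfTime30-59DaysPastDueNotWorse": 0,
    "NumberOfOpenCreditLinesAndLoans": 8,
    "Age": 45,
}
print(f"log-odds: {log_odds(applicant):.3f}")
print(f"odds: {odds(applicant):.2f}")
print(f"probability: {probability(applicant):.3f}")
```

For this hypothetical applicant the log-odds are 1.065 - 0.041 + 0.024 + 1.485 = 2.533, which corresponds to a probability of roughly 0.93.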
‘SeriousDlqin2yrs’ is most likely at level ‘1’ (94%) and less likely at level ‘0’ (6%). Red bars mark predictors that are not significant and green bars mark predictors that are significant.
A logistic model tree (LMT) is a supervised classification model that combines logistic regression and decision tree learning. It is based on the earlier concept of the model tree. In the logistic variant, the algorithm fits a logistic regression (LR) model at every node of the tree. The basic LMT algorithm uses cross-validation to find a fit that does not overfit the training data. Logistic regression and trees differ in the way they form decision boundaries, which results in two different types of model. A decision tree repeatedly splits the data space into two regions, whereas logistic regression fits a single line that divides the space into exactly two segments. A logistic regression model is simply a list of coefficients, while a tree shows the prediction path.
The binomial classification process built in ‘RapidMiner’ is shown in the above screenshot.
Many binomial classification algorithms actually output a prediction score. The score indicates the system’s certainty that a given observation belongs to the positive class (Aggarwal 2014). To decide whether an observation should be classified as positive or negative, the score is compared with a classification threshold; here the class being predicted is serious delinquency within 2 years. Observations with scores greater than the threshold are predicted as the positive class and those with scores lower than the threshold as the negative class. The binomial classification tree shows 73% positive and 27% negative responses for the dependent variable ‘Seriousdlqin2yrs’.
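The thresholding step described above can be sketched in a few lines. This is a minimal illustration and not RapidMiner’s implementation: the default threshold of 0.5, the treatment of a score exactly equal to the threshold as negative, the function names and the sample scores are all assumptions.

```python
def classify(scores, threshold=0.5):
    """Label each score 1 (positive) if it exceeds the threshold, else 0.

    A score exactly equal to the threshold is labeled negative here;
    that tie-breaking rule is an assumption of this sketch.
    """
    return [1 if score > threshold else 0 for score in scores]

def positive_share(labels):
    """Fraction of observations assigned to the positive class."""
    return sum(labels) / len(labels)

# Hypothetical classifier scores, for illustration only.
scores = [0.91, 0.15, 0.62, 0.50, 0.78, 0.33]
labels = classify(scores)
print(labels)
print(f"positive share: {positive_share(labels):.0%}")
```

Raising the threshold moves borderline observations into the negative class, so the positive share falls; lowering it has the opposite effect.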
A confusion matrix is a table that describes the performance of a classification model (classifier) on a test data set set aside from the training data. It summarizes the prediction results on a classification problem, giving the counts of correct and incorrect predictions broken down by class.
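The following sketch shows how such a table can be tallied from paired true and predicted labels. The class names range1 and range2 follow the output below, while the function name `confusion_matrix` and the six sample observations are hypothetical.

```python
from collections import Counter

def confusion_matrix(actual, predicted):
    """Count label pairs; each key is (predicted_class, true_class)."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return Counter(zip(predicted, actual))

# Hypothetical true and predicted labels for six observations.
actual = ["range1", "range1", "range2", "range1", "range2", "range1"]
predicted = ["range1", "range1", "range1", "range1", "range2", "range2"]

matrix = confusion_matrix(actual, predicted)
classes = ["range2", "range1"]

# Print one row per predicted class and one column per true class.
print("True:", *classes)
for pred in classes:
    print(f"{pred}:", *(matrix[(pred, true)] for true in classes))

# Correct predictions sit on the diagonal of the table.
correct = sum(matrix[(c, c)] for c in classes)
print("accuracy:", correct / len(actual))
```

The diagonal cells hold the correct predictions and the off-diagonal cells hold the two kinds of error, which is the breakdown by class described above.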
Performance:
Performance Vector:
accuracy: 93.33% +/- 0.02% (micro average: 93.33%)
classification_error: 6.67% +/- 0.02% (micro average: 6.67%)
AUC: 0.559 +/- 0.003 (micro average: 0.559) (positive class: range1)
precision: 93.37% +/- 0.06% (micro average: 93.37%) (positive class: range1)
recall: 99.95% +/- 0.05% (micro average: 99.95%) (positive class: range1)
f_measure: 96.55% +/- 0.01% (micro average: 96.55%) (positive class: range1)
sensitivity: 99.95% +/- 0.05% (micro average: 99.95%) (positive class: range1)
specificity: 0.94% +/- 1.07% (micro average: 0.94%) (positive class: range1)
Confusion Matrix (the same for every measure above):
True: range2 range1
range2: 75 54
range1: 7946 111925
Decision Tree.maximal_depth = 10
For predicting ‘SeriousDlqin2yrs’, the linear regression model assigns the predictors the following weights, in descending order-
The ROC curves in the above graph compare the predictive strength of ‘Logistic Regression’ and ‘Decision Tree’. The decision tree provides better predictability at higher precision levels, whereas logistic regression provides better predictability at lower precision levels (Sharma, Sharma and Mansotra 2016).
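An ROC curve is usually summarized by the area under it (AUC). A minimal sketch of one way to compute that area is given below, using the rank-based definition: the AUC equals the probability that a randomly chosen positive observation receives a higher score than a randomly chosen negative one. The function name `auc` and the sample labels and scores are hypothetical, and this is not necessarily how RapidMiner computes the value.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    Each (positive, negative) pair counts 1 if the positive observation
    has the higher score, 0.5 if the scores tie, and 0 otherwise.
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))

# Hypothetical labels and classifier scores.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(auc(labels, scores))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so two classifiers can be compared by computing this quantity for each on the same test data.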
The line plot shows the day-wise and location-wise distribution of total rainfall. Total rainfall is highest on day 4, followed by day 22. Summed over all months of the years considered, total rainfall exceeds 12K millimeters on the 4th, 22nd and 23rd days of the month. Total rainfall is lower on the last days of the month, especially day 31 (less than 8K millimeters), which is partly because only seven months of the year have a 31st day.
The day-wise and location-wise distribution of total rainfall in the selected state, ‘Queensland’, shows that the most rain falls on the 4th day of the month. Total rainfall in Queensland on the fourth day is more than 2500 millimeters, which is significantly greater than on any other day. Total rainfall in ‘Queensland’ is lowest on the 6th day of the month, at just over 1000 millimeters (Bandler et al. 2001).
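The day-wise totals described above come from grouping the rainfall records by day of the month and summing them, optionally after filtering to one state. The sketch below illustrates that aggregation in plain Python; the record layout, the field names `state`, `day` and `rainfall`, the function name and the sample values are all hypothetical and do not come from the data set or from Tableau.

```python
from collections import defaultdict

def total_by_day(records, state=None):
    """Sum rainfall per day of the month, optionally for a single state."""
    totals = defaultdict(float)
    for record in records:
        if state is None or record["state"] == state:
            totals[record["day"]] += record["rainfall"]
    return dict(totals)

# Hypothetical records, for illustration only.
records = [
    {"state": "Queensland", "day": 4, "rainfall": 30.0},
    {"state": "Queensland", "day": 6, "rainfall": 2.5},
    {"state": "Victoria", "day": 4, "rainfall": 12.0},
    {"state": "Queensland", "day": 4, "rainfall": 18.0},
]

all_states = total_by_day(records)
queensland = total_by_day(records, state="Queensland")
wettest_day = max(queensland, key=queensland.get)
print(all_states)
print(queensland)
print("wettest day in Queensland:", wettest_day)
```

The same function with a different grouping key (year or month instead of day) would produce the year-wise and month-wise views.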
The year-wise and location-wise distribution of total rainfall shows that the lowest total was recorded in 2007 (196 millimeters), followed by 2008 (5149 millimeters). The highest total was recorded in 2010 (44519 millimeters), followed by 2011 (42792 millimeters). A significant increase in total rainfall is observed from 2008 to 2009 (5149 millimeters to 35949 millimeters), whereas a significant decrease is observed from 2016 to 2017 (41742 millimeters to 20998 millimeters).
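The year-to-year changes mentioned above can be verified with simple arithmetic. The sketch below uses only the yearly totals quoted in the text; years that are not quoted are left out, so only pairs of consecutive quoted years are compared. The function name `consecutive_changes` is a hypothetical helper.

```python
def consecutive_changes(totals_by_year):
    """Return {(year, year + 1): change} for each consecutive pair present."""
    return {
        (year, year + 1): totals_by_year[year + 1] - total
        for year, total in totals_by_year.items()
        if year + 1 in totals_by_year
    }

# Yearly totals exactly as quoted in the text.
totals = {
    2007: 196,
    2008: 5149,
    2009: 35949,
    2010: 44519,
    2011: 42792,
    2016: 41742,
    2017: 20998,
}

changes = consecutive_changes(totals)
largest_rise = max(changes, key=changes.get)
largest_fall = min(changes, key=changes.get)
print("largest rise:", largest_rise, changes[largest_rise])
print("largest fall:", largest_fall, changes[largest_fall])
```

Among the quoted years, the largest rise is from 2008 to 2009 (an increase of 30800) and the largest fall is from 2016 to 2017 (a decrease of 20744), which matches the description.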
The year-wise and location-wise distribution of total rainfall in Queensland shows that the lowest total was recorded in 2008 (1105 millimeters), followed by 2017 (3243 millimeters). The highest total was recorded in 2010 (8347 millimeters), followed by 2009 (6527 millimeters). A significant increase in total rainfall is observed from 2008 to 2009 (1015 millimeters to 6527 millimeters), whereas a significant decrease is observed from 2010 to 2011 (8347 millimeters to 6113 millimeters).
The month-wise total rainfall over all years is shown for the locations in Queensland. The four Queensland locations are Brisbane, Cairns, Gold Coast and Townsville, but the records for Gold Coast are absent from the data set. The grouped bar chart indicates that-
The station-wise (station id) distribution of total rainfall for the year 2014 is shown in this ‘GeoMap View’ plot. Significant rainfall is observed at station ids 8014, 4024 and 2005. Rainfall is higher at the stations in New South Wales and Victoria (Grahne and Mendelzon 1999).
The dashboard shows side by side the distribution of rainfall across all locations in Australia and across all locations in the selected state, Queensland. Both are broken down by year, day of the month and month of the year. The total rainfall in 2014 is shown for every station in Australia, which reveals that most of the stations are located in New South Wales and Victoria (Benkart, Sottile and Stroomer 1996).
References:
Aggarwal, C.C. ed., 2014. Data classification: algorithms and applications. CRC Press.
Bandler, J.W., Georgieva, N., Ismail, M.A., Rayas-Sánchez, J.E. and Zhang, Q.J., 2001. A generalized space-mapping tableau approach to device modeling. IEEE Transactions on Microwave Theory and Techniques, 49(1), pp.67-79.
Benkart, G., Sottile, F. and Stroomer, J., 1996. Tableau switching: algorithms and applications. Journal of Combinatorial Theory, Series A, 76(1), pp.11-43.
Bhargava, N., Sharma, G., Bhargava, R. and Mathuria, M., 2013. Decision tree analysis on j48 algorithm for data mining. Proceedings of International Journal of Advanced Research in Computer Science and Software Engineering, 3(6).
Chirumamilla, V., Bhagya, S.T., Velpula, S. and Sunkara, I., 2014. Novel approach to predict student placement chance with decision tree induction. International journal of systems and technologies.
Digitalhealth.gov.au. (2018). Privacy – Australian Digital Health Agency. [online] Available at: https://www.digitalhealth.gov.au/policies/privacy.
Fairfield, J.A. and Engel, C., 2015. Privacy as a public good. Duke LJ, 65, p.385.
Grahne, G. and Mendelzon, A.O., 1999, January. Tableau techniques for querying information sources through global schemas. In International Conference on Database Theory (pp. 332-347). Springer, Berlin, Heidelberg.
Legislation.gov.au. (2018). Healthcare Identifiers Act 2010. [online] Available at: https://www.legislation.gov.au/Details/C2017C00239.
Legislation.gov.au. (2018). My Health Records Act 2012. [online] Available at: https://www.legislation.gov.au/Details/C2017C00313.
Malathi, L., Sowmiya, E., Thamaraiselvi, A. and Saranya, R., 2014. Prevention of Health Care Data from the Security Attacks Using Secure Routing and Secure Retrieval.
Naik, A. and Samant, L., 2016. Correlation review of classification algorithm using data mining tool: WEKA, Rapidminer, Tanagra, Orange and Knime. Procedia Computer Science, 85, pp.662-668.
Pearson, S., 2013. Privacy, security and trust in cloud computing. In Privacy and Security for Cloud Computing (pp. 3-42). Springer, London.
Peng, C.Y.J., Lee, K.L. and Ingersoll, G.M., 2002. An introduction to logistic regression analysis and reporting. The journal of educational research, 96(1), pp.3-14.
Peppet, S.R., 2014. Regulating the internet of things: first steps toward managing discrimination, privacy, security and consent. Tex. L. Rev., 93, p.85.
Sharma, T., Sharma, A. and Mansotra, V., 2016. Performance analysis of data mining classification techniques on public health care data. International Journal of Innovative Research in Computer and Communication Engineering, 4(6), pp.11381-11386.
Thompson, R.M. and Richard, M., 2015. Domestic Drones and Privacy: a primer (Vol. 43965, p. 30). Congressional Research Service.
Waldhör, K. and Baldauf, R., 2015. Recognizing drinking ADLs in real time using smartwatches and data mining. Proceedings of the RapidMiner Wisdom Europe, pp.1-18.
Wei, L., Zhu, H., Cao, Z., Dong, X., Jia, W., Chen, Y. and Vasilakos, A.V., 2014. Security and privacy for storage and computation in cloud computing. Information Sciences, 258, pp.371-386.