The main objective of this project is to apply data mining techniques to the Boston housing dataset in order to resolve a business problem. The provided dataset is analysed with the Weka data mining tool to arrive at suitable business solutions. The analysis also reviews current methodologies and algorithms for business analytics. These are discussed and analysed in detail.
To analyse the provided dataset, the user first needs to understand it. The Boston housing dataset is described below (Ahmadi & E Shiri Ahmad Abadi, 2013).
The provided dataset has the following attributes:
Statistics for the provided dataset are shown below.
For the ID attribute:
For the Sale Condition attribute (Arabnia, Stahlbock, Abou-Nasr & Weiss, n.d.):
A visualization of the provided dataset is shown below.
In this task, the user needs to discover the relationships that exist among the attributes. Here, we apply normalization to the Boston Housing data before examining those relationships. Normalization rescales numeric attributes to a common range so that attributes measured on different scales can be compared directly (Azzalini & Scarpa, 2012).
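As a minimal sketch of what normalization does to a numeric attribute, the Python snippet below rescales values linearly to the range [0, 1] (min-max scaling). It is similar in spirit to Weka's unsupervised Normalize filter, but the function name and the sample prices are illustrative assumptions and are not taken from the dataset.

```python
def min_max_normalize(values):
    """Rescale a list of numbers linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant attribute has no spread; map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical sale prices, for illustration only.
prices = [100000, 150000, 200000, 300000]
print(min_max_normalize(prices))  # [0.0, 0.25, 0.5, 1.0]
```

After rescaling, attributes recorded in different units (for example price and lot area) share the same range, which makes them easier to compare side by side.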
In this task, the user is required to list the potential business analyses for the provided dataset. Here, we use classification and prediction algorithms to resolve the business problem and to provide effective solutions for it. The results provide the following benefits for a real estate consulting firm:
ZeroR is the simplest classification method: it relies only on the target and ignores all predictors. The ZeroR classifier simply predicts the majority class (Witten, Frank & Hall, 2011). Although ZeroR has no predictive power, it is useful for establishing a baseline performance as a benchmark for other classification methods.
Algorithm: construct a frequency table for the target and select its most frequent value.
Predictors' contribution: there is nothing to be said about the predictors' contribution to the model, because ZeroR does not use any of them.
Model evaluation: ZeroR only predicts the majority class correctly. As mentioned above, it is useful for establishing a baseline performance for other classification methods. The ZeroR results are shown below (Han, Kamber & Pei, 2012).
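The rule described above can be sketched in a few lines of Python. This is an illustrative re-implementation of the idea, not Weka's own code: for a nominal target it returns the most frequent value, and for a numeric target it returns the mean. The function name and the toy targets are assumptions made for the example.

```python
from collections import Counter
from statistics import mean


def zero_r(targets):
    """Return the single value a ZeroR-style baseline predicts for every instance.

    Numeric targets -> their mean; nominal targets -> the most frequent value.
    The predictors are never consulted.
    """
    if all(isinstance(t, (int, float)) for t in targets):
        return mean(targets)
    return Counter(targets).most_common(1)[0][0]


# Hypothetical toy targets, for illustration only.
print(zero_r(["Normal", "Normal", "Partial", "Abnorml"]))  # Normal
print(zero_r([100.0, 200.0, 300.0]))                       # 200.0
```

Because the prediction is a single constant, any classifier that actually uses the predictors should be expected to do at least as well as this baseline.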
=== Classifier model (full training set) ===
ZeroR predicts class value: 180921.19589041095
Time taken to build model: 0 seconds
=== Cross-validation ===
=== Summary ===
Correlation coefficient -0.0508
Mean absolute error 57444.7035
Root mean squared error 79439.3263
Relative absolute error 100 %
Root relative squared error 100 %
Total Number of Instances 1460
For the numeric class, ZeroR predicts the mean Boston house value, 180921.19589041095, for every instance and achieves an RMSE of 79439.3263. For any other algorithm to demonstrate skill on this problem, it must achieve an RMSE better than this value. For the nominal sale condition class, ZeroR predicts Normal for all instances, as it is the majority class, and achieves an accuracy of about 82 % (Kaluža, 2013).
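The relative errors of 100 % in the output above follow from how relative errors are defined: a model's error is divided by the error of a predictor that always outputs the mean, so the mean predictor scores 100 % against itself. The sketch below illustrates this with hypothetical numbers; the function names, the sample data, and the simplified formula are assumptions for the example and are not Weka's exact evaluation code.

```python
from math import sqrt


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Hypothetical values, for illustration only.
actual = [100.0, 200.0, 300.0]
baseline = [sum(actual) / len(actual)] * len(actual)  # always predict the mean
model = [110.0, 190.0, 310.0]                         # some other model's output

baseline_mae = mae(actual, baseline)
baseline_rmse = rmse(actual, baseline)

# Relative errors in percent, measured against the mean predictor.
baseline_rel_abs = 100 * mae(actual, baseline) / baseline_mae
model_rel_abs = 100 * mae(actual, model) / baseline_mae
model_rel_sq = 100 * rmse(actual, model) / baseline_rmse

print(round(baseline_rel_abs, 2), round(model_rel_abs, 2), round(model_rel_sq, 2))
```

A model is only worth considering when its relative errors fall well below 100 %, that is, when it beats the constant mean prediction.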
=== Stratified cross-validation ===
=== Summary ===
Correctly Classified Instances 1198 82.0548 %
Incorrectly Classified Instances 262 17.9452 %
Kappa statistic 0
Mean absolute error 0.1056
Root mean squared error 0.2289
Relative absolute error 100 %
Root relative squared error 100 %
Total Number of Instances 1460
=== Detailed Accuracy By Class ===
TP Rate FP Rate Precision Recall F-Measure MCC ROC Area PRC Area Class
1.000 1.000 0.821 1.000 0.901 ? 0.496 0.819 Normal
0.000 0.000 ? 0.000 ? ? 0.495 0.069 Abnorml
0.000 0.000 ? 0.000 ? ? 0.489 0.084 Partial
0.000 0.000 ? 0.000 ? ? 0.199 0.003 AdjLand
0.000 0.000 ? 0.000 ? ? 0.433 0.007 Alloca
0.000 0.000 ? 0.000 ? ? 0.500 0.014 Family
Weighted Avg. 0.821 0.821 ? 0.821 ? ? 0.494 0.685
=== Confusion Matrix ===
a b c d e f <-- classified as
1198 0 0 0 0 0 | a = Normal
101 0 0 0 0 0 | b = Abnorml
125 0 0 0 0 0 | c = Partial
4 0 0 0 0 0 | d = AdjLand
12 0 0 0 0 0 | e = Alloca
20 0 0 0 0 0 | f = Family
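The accuracy and the Kappa statistic reported above can be recomputed directly from the confusion matrix. The Python sketch below uses the standard definitions of accuracy and Cohen's kappa. The matrix values are copied from the output above; the function names are illustrative assumptions.

```python
# Rows are actual classes, columns are predicted classes, in the order
# Normal, Abnorml, Partial, AdjLand, Alloca, Family.
confusion = [
    [1198, 0, 0, 0, 0, 0],
    [101, 0, 0, 0, 0, 0],
    [125, 0, 0, 0, 0, 0],
    [4, 0, 0, 0, 0, 0],
    [12, 0, 0, 0, 0, 0],
    [20, 0, 0, 0, 0, 0],
]


def accuracy(matrix):
    """Fraction of instances on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def cohen_kappa(matrix):
    """Agreement beyond what the class frequencies alone would produce."""
    total = sum(sum(row) for row in matrix)
    observed = accuracy(matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(row[j] for row in matrix) for j in range(len(matrix))]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    if expected == 1.0:
        return 0.0  # degenerate case: only one class present
    return (observed - expected) / (1 - expected)


print(f"accuracy = {accuracy(confusion):.4%}")    # accuracy = 82.0548%
print(f"kappa    = {cohen_kappa(confusion):.4f}")  # kappa    = 0.0000
```

A kappa of 0 shows that the 82 % accuracy comes entirely from the share of the Normal class in the data, not from any ability to tell the classes apart.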
From the results above, we can see that for the Boston Housing data ZeroR classifies 1198 of the 1460 instances correctly (82.05 %) and the remaining 262 incorrectly (17.95 %). The other algorithms yield an average accuracy of around 85 %, and the highest accuracy belongs to the MultiScheme classifier. ZeroR sits at the bottom of the comparison, with relative errors of 100 % (Maimon & Rokach, 2010). The total time required to build the model is also an important parameter when comparing classification algorithms. It is also common to assess the reliability and validity of the collected data. This analysis uses the commonly applied indicators of mean absolute error and root mean squared error; the relative errors are used as well. The largest error is found for the ZeroR classifier, whose weighted average false positive rate is 0.821. An algorithm with a lower error rate is preferred because it has more powerful classification capability. We therefore conclude that ZeroR is not appropriate for this data: it misclassifies every instance outside the majority class and cannot classify the data effectively (Olson, 2017).
References
Ahmadi, F., & E Shiri Ahmad Abadi, M. (2013). Data Mining in Teacher Evaluation System using WEKA. International Journal of Computer Applications, 63(10), 12-18. doi: 10.5120/10501-5268
Arabnia, H., Stahlbock, R., Abou-Nasr, M., & Weiss, G. (n.d.). DMIN 2017.
Azzalini, A., & Scarpa, B. (2012). Data Analysis and Data Mining. Oxford: Oxford University Press, USA.
Han, J., Kamber, M., & Pei, J. (2012). Data mining. Waltham, MA: Morgan Kaufmann/Elsevier.
Kaluža, B. (2013). Instant Weka how-to. Birmingham: Packt Pub.
Maimon, O., & Rokach, L. (2010). Data mining and knowledge discovery handbook. New York: Springer.
Olson, D. (2017). Descriptive Data Mining. Singapore: Springer Singapore.
Witten, I., Frank, E., & Hall, M. (2011). Data mining. Burlington, Mass.: Morgan Kaufmann Publishers.