This chapter describes the key concepts and research methodologies used to extract service delivery sentiment from social media. It explains the overall process: the methods used to extract comments from social media, data pre-processing, classification of the data, training and testing of the model, and the method used to visualize the model. The chapter also sets out how the research questions are answered.
Methodology implies more than simply the methods you intend to use to collect data.
It also considers the concepts that underlie those methods. Data collection was guided by a literature review, in which different materials were studied on how to extract data from social media, especially Instagram and Facebook. The Ikulu Mawasiliano Instagram account and the Zitto Kabwe Instagram and Facebook pages were used to collect the data.
A literature review is a research tool that enables the evaluator to make the best use of previous work in the field under investigation.
It helps the researcher to learn from the experiences, findings and mistakes of previous related work (Goode 1952).
In this project, data were collected from the Instagram account of Ikulu Mawasiliano (Kurugenzi ya Mawasiliano ya Rais Ikulu), the official government account used to publish announcements and publications, and from the official Facebook and Instagram pages of Zitto Kabwe, a member of the Tanzanian parliament representing Kigoma. Several scraping tools were used to mine data from these pages, including scraping extensions added to the Chrome browser.
These scraping tools help to mine all the comments from a specific post.
Other tools used to mine the data are Octoparse and ParseHub. The specified Instagram and Facebook accounts (Ikulu Mawasiliano and Zitto Kabwe) were selected for two reasons. First, a preliminary review of social media showed that these accounts post content directly related to social service delivery, and people comment on those posts to share their opinions and views. The pages are also active and post frequently. The second reason is scope: there are many accounts, and the research cannot extract data from all of them.
Data pre-processing is the stage where unneeded data are removed, for instance additional information such as emoji, links, and other unneeded characters that people include when they share their opinions. To prepare the collected data for machine learning tasks, text pre-processing is applied, including stop word removal, tokenization, lemmatization, stemming, and feature engineering. Instance selection also copes with the infeasibility of learning from very large datasets (Kotsiantis, 2007), and it attempts to maintain the quality of mining with a minimum sample size.
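The cleaning step described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure used in the study: the function name `clean_comment` and the regular expressions are assumptions, and the rule "keep only letters, digits and whitespace" is a simplification that also strips combining marks such as Arabic diacritics.

```python
import re

# Assumed patterns for this sketch: links, then anything that is not a
# letter, digit or whitespace (emoji and punctuation fall in this group).
URL_RE = re.compile(r"https?://\S+|www\.\S+")
NON_WORD_RE = re.compile(r"[^\w\s]|_")


def clean_comment(text: str) -> str:
    """Remove links, emoji and punctuation, then normalise case and spacing."""
    text = URL_RE.sub(" ", text)       # drop links first, before punctuation
    text = NON_WORD_RE.sub(" ", text)  # drop emoji, punctuation, underscores
    return " ".join(text.lower().split())


cleaned = clean_comment("Huduma nzuri sana!!! 😀 https://example.com/x")
print(cleaned)
```

Links are removed before punctuation so that a URL is dropped as a whole rather than being broken into word-like fragments.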
Non-English languages such as Arabic are highly derivational: tens or even hundreds of words can be formed from a single stem. According to Ahmed A. Elbery, working with Arabic documents without stemming may result in an enormous number of words being input into the classification phase.
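The effect of stemming on vocabulary size can be illustrated with a toy example. The sketch below is a deliberately crude suffix-stripping stemmer for English words, chosen only to show the idea; the suffix list and the function name `light_stem` are assumptions, and a real Arabic stemmer is considerably more involved.

```python
# Hypothetical suffix list for illustration; longest suffixes are tried first.
SUFFIXES = ("ing", "ed", "es", "s")


def light_stem(word: str) -> str:
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


words = ["connect", "connects", "connected", "connecting"]
stems = {light_stem(w) for w in words}
print(len(set(words)), "distinct words reduce to", len(stems), "stem(s)")
```

Four surface forms collapse into one stem, which is the reduction in classifier input that the paragraph above describes.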
Tokenization refers to the process of splitting text into units called tokens. The text is read and split into tokens, or words, generally at blank spaces or at other delimiter characters.
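As a minimal sketch of this step, the two splitting strategies mentioned above (blank space only, or any non-word character) can be written as follows. The function names are assumptions made for this example.

```python
import re

TOKEN_RE = re.compile(r"\w+")  # a token is a run of letters or digits


def tokenize_whitespace(text: str) -> list:
    """Split on blank space only; punctuation stays attached to words."""
    return text.lower().split()


def tokenize(text: str) -> list:
    """Split on any non-word character, so punctuation is discarded."""
    return TOKEN_RE.findall(text.lower())


sentence = "The service was slow, but helpful."
print(tokenize_whitespace(sentence))
print(tokenize(sentence))
```

The whitespace variant keeps "slow," and "helpful." with their punctuation, which is why the second variant is usually preferred when punctuation has not already been removed.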
Another step performed in this research work is the removal of all Arabic words that carry little meaning and occur frequently in the documents, such as “or”, “whose”, “on”, “where”, “in”, “from”, “beyond” and “all”. Removing stop words makes processing more effective and improves the efficiency of the term indexing procedure.
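Stop word removal amounts to filtering tokens against a fixed list. The sketch below uses the example words quoted above as a stand-in stop list; the set name `STOP_WORDS` and the function name are assumptions, and a real list would be much longer and in the language of the comments.

```python
# Stand-in stop list built from the examples in the text (illustrative only).
STOP_WORDS = {"or", "whose", "on", "where", "in", "from", "beyond", "all"}


def remove_stop_words(tokens):
    """Keep only the tokens that are not in the stop list."""
    return [t for t in tokens if t.lower() not in STOP_WORDS]


tokens = ["water", "from", "all", "wards", "in", "kigoma"]
print(remove_stop_words(tokens))
```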
Feature engineering is the process of transforming the existing data features into new features that will be used to train a machine learning model. This process is important because the machine learning algorithm learns from the data it is given.
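One common way to turn tokenized comments into numeric features is a bag-of-words count vector. The text does not say which representation was used, so the sketch below is an assumption chosen for illustration, with hypothetical function names.

```python
from collections import Counter


def build_vocabulary(docs):
    """Collect every distinct token across the documents, in sorted order."""
    return sorted({tok for doc in docs for tok in doc})


def vectorize(doc, vocab):
    """Count how often each vocabulary term occurs in one document."""
    counts = Counter(doc)
    return [counts[term] for term in vocab]


docs = [["good", "service"], ["bad", "service", "bad"]]
vocab = build_vocabulary(docs)
vectors = [vectorize(doc, vocab) for doc in docs]
print(vocab)
print(vectors)
```

Each comment becomes a fixed-length list of counts, one position per vocabulary term, which is a form a classifier can learn from.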
The model is evaluated using a confusion matrix, after data cleaning and pre-processing. A confusion matrix is a performance measurement for machine learning classification problems in which the output can be two or more classes. It is a table that contains the combinations of actual and predicted values.
The confusion matrix is used to measure the accuracy of the model on a given dataset. Accuracy means the correctness of a classifier, obtained by comparing the predicted values with the actual values in the dataset.
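The relationship between the confusion matrix and accuracy can be made concrete with a small sketch. The labels and predictions below are made-up illustrative values, not results from this study, and the function names are assumptions.

```python
from collections import Counter


def confusion_matrix(actual, predicted, labels):
    """Rows are actual classes, columns are predicted classes."""
    pairs = Counter(zip(actual, predicted))
    return [[pairs[(a, p)] for p in labels] for a in labels]


def accuracy(matrix):
    """Correct predictions (the diagonal) divided by all predictions."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total if total else 0.0


labels = ["negative", "positive"]
actual = ["positive", "positive", "negative", "negative", "positive"]
predicted = ["positive", "negative", "negative", "positive", "positive"]

matrix = confusion_matrix(actual, predicted, labels)
print(matrix)
print(accuracy(matrix))
```

In this made-up example three of the five predictions lie on the diagonal, so the accuracy is 3/5.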
This research will use the following classification techniques.
Naïve Bayes is a supervised machine learning algorithm that applies Bayes' theorem with the assumption of independence between every pair of features.
$$P(Y \mid X) = \frac{P(X \mid Y)\,P(Y)}{P(X)}$$
In real-life problems, however, there are multiple X variables:
$$X = (x_1, x_2, x_3, \dots, x_n)$$
$$P(Y \mid x_1, x_2, x_3, \dots, x_n) = \frac{P(x_1, x_2, x_3, \dots, x_n \mid Y)\,P(Y)}{P(x_1, x_2, x_3, \dots, x_n)}$$
Naïve Bayes requires the predictors to be independent. In many real-life cases the predictors are dependent, which can limit the performance of the classifier.
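Under the independence assumption, the likelihood factorizes into a product over the individual features, so the class score is proportional to P(Y) multiplied by each P(x_i | Y). A minimal sketch of a word-count Naïve Bayes classifier built on that factorization is given below. It is illustrative only: the training comments are made up, the function names are assumptions, and add-one (Laplace) smoothing is an added assumption that the text does not mention.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(docs, labels):
    """Count documents per class and word occurrences per class."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)
    vocab = {tok for tokens in docs for tok in tokens}
    return class_counts, word_counts, vocab


def predict(tokens, model):
    """Return the class with the highest log P(Y) + sum of log P(x_i | Y)."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for tok in tokens:
            if tok not in vocab:
                continue  # ignore words never seen in training
            # Add-one smoothing (an assumption) avoids log(0).
            score += math.log(
                (word_counts[label][tok] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


docs = [
    ["good", "service"],
    ["fast", "good", "response"],
    ["bad", "service"],
    ["slow", "bad", "response"],
]
labels = ["positive", "positive", "negative", "negative"]
model = train_naive_bayes(docs, labels)
print(predict(["good", "fast"], model))
print(predict(["bad", "slow"], model))
```

Log probabilities are summed instead of multiplying raw probabilities, which avoids numerical underflow when a comment contains many words.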
The support vector machine (SVM) is a supervised machine learning algorithm that uses a hyperplane in a multidimensional space to classify the data points.
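The role of the hyperplane can be sketched as a decision rule: a point is assigned to one class or the other depending on which side of the hyperplane w · x + b = 0 it falls. The weights below are hypothetical values picked by hand for illustration; in a real SVM they are learned from training data so that the margin between the classes is maximized.

```python
def decision_value(w, b, x):
    """Signed score of point x relative to the hyperplane w . x + b = 0."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def classify(w, b, x):
    """Return +1 on one side of the hyperplane and -1 on the other."""
    return 1 if decision_value(w, b, x) >= 0 else -1


# Hypothetical, hand-picked hyperplane in a two-feature space.
w = [1.0, -1.0]
b = 0.0

print(classify(w, b, [3.0, 1.0]))
print(classify(w, b, [1.0, 4.0]))
```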