COMP529: COURSEWORK RESIT ASSIGNMENT #2 (STREAM ANALYTICS)
Introduction
This assessed coursework assignment is worth 20% of your overall grade for the COMP529 module. Failure on this assignment can be compensated for by higher marks in other assessments on the module. The assignment aims to test your understanding of streaming analytics, with a focus on your ability to use Storm to solve Big Data analytics problems. More specifically, it aims to partially assess the following learning outcome for COMP529: “understanding of the middleware that can be used to enable algorithms to scale up to analysis of large data streams in real-time”.
ASSESSMENT
The report will be assessed according to the following criteria:
Criterion                                                                      Percentage
Clarity of presentation (including succinctness) of main report               20%
Quality of Java code (including assessment of how easy it is to understand)   40%
Quality of analysis performed                                                  40%
SUBMISSION
Please submit your coursework online using the COMP529 page on VITAL by 3pm on the 7th of August 2020. Standard lateness penalties will apply to any work handed in after this time. The report and Java program must be written by yourself, using your own words (see the University guidance on academic integrity for additional information).
TASK
Due to the spread of the coronavirus disease (COVID-19), people around the world have faced restrictions on going to work and on meeting friends and family. Citizens of every country were largely required to stay at home, to work or study from home, and to use technology to connect with each other.
To understand people’s concerns and feelings during this pandemic, you have been asked to monitor Twitter hashtags (e.g., covid-19) and aggregate the data in real time using a Big Data stream analytics middleware such as Apache Storm. This will help you to continuously monitor the various topics, keywords, and sources that may be related to COVID-19.
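As a rough illustration of what aggregating hashtags could mean in practice, the sketch below keeps a running count of every hashtag seen in a sequence of tweet texts. It uses plain Java with no Storm dependency. The class name `HashtagCounter`, its methods, and the hashtag pattern are assumptions made for this example and are not part of the assignment or of the linked repository.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: keeps a running count of hashtags seen in tweet text. */
final class HashtagCounter {
    // Assumed pattern: '#' followed by letters, digits, underscores or hyphens,
    // so that a tag such as "#covid-19" is kept whole.
    private static final Pattern HASHTAG = Pattern.compile("#([\\p{L}\\p{N}_-]+)");

    private final Map<String, Integer> counts = new HashMap<>();

    /** Extracts every hashtag from one tweet and increments its count. */
    void add(String tweetText) {
        Matcher m = HASHTAG.matcher(tweetText);
        while (m.find()) {
            // Lower-case the tag so "#COVID-19" and "#covid-19" share one counter.
            String tag = m.group(1).toLowerCase(Locale.ROOT);
            counts.merge(tag, 1, Integer::sum);
        }
    }

    /** Returns how often the given tag (without '#') has been seen so far. */
    int count(String tag) {
        return counts.getOrDefault(tag.toLowerCase(Locale.ROOT), 0);
    }

    public static void main(String[] args) {
        HashtagCounter counter = new HashtagCounter();
        counter.add("Stay home #COVID-19 #StaySafe");
        counter.add("Working from home again #covid-19");
        System.out.println("covid-19 seen " + counter.count("covid-19") + " times");
    }
}
```

In a Storm topology, logic of this kind would normally live inside a bolt, with one call to `add` per incoming tuple. Whether the counts are kept for the whole run or only for a recent time window is a design decision for you to justify.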
The code for a topology whose spout and bolt extract a “streaming” feed from Twitter is available here:
https://github.com/davidkiss/storm-twitter-word-count
Your task is therefore as follows:
1) Set up a Storm cluster;
2) Write a Java program for a Storm topology job that includes:
a. A spout that produces a stream of tweets;
b. A bolt that collects information about the COVID-19 pandemic and detects tweets that contain keywords related to COVID-19 (e.g., stress, lonely, lost jobs, etc.).
3) Use Storm’s topology to predict people’s most serious concerns during the COVID-19 pandemic.
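The keyword detection described in step 2b can be sketched in plain Java as follows. This is only a minimal illustration under stated assumptions: the class name `ConcernKeywordMatcher` is hypothetical, the keyword list is taken from the examples above, and matching is a simple case-insensitive substring test (so “stress” also matches “stressed”). The Storm wiring is deliberately left out; in a topology, `matches` would typically be called from a bolt’s `execute` method.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** Hypothetical helper: reports which concern keywords occur in a tweet. */
final class ConcernKeywordMatcher {
    private final List<String> keywords = new ArrayList<>();

    ConcernKeywordMatcher(List<String> keywords) {
        // Store keywords in lower case so matching is case-insensitive.
        for (String keyword : keywords) {
            this.keywords.add(keyword.toLowerCase(Locale.ROOT));
        }
    }

    /** Returns the keywords contained in the tweet, in keyword-list order. */
    List<String> matches(String tweetText) {
        String text = tweetText.toLowerCase(Locale.ROOT);
        List<String> found = new ArrayList<>();
        for (String keyword : keywords) {
            // Plain substring test; word-boundary matching is a possible refinement.
            if (text.contains(keyword)) {
                found.add(keyword);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        ConcernKeywordMatcher matcher =
                new ConcernKeywordMatcher(Arrays.asList("stress", "lonely", "lost jobs"));
        System.out.println(matcher.matches("So much STRESS and feeling lonely #covid-19"));
    }
}
```

A tweet for which `matches` returns an empty list would simply not be emitted downstream, while the returned keywords could feed a counting step that ranks concerns by frequency.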
Your output report
The output from this coursework is a brief report, which is suggested to have sections that describe:
1) Middleware configuration: How you configured the Storm middleware (including a description of your Storm cluster and your rationale for this choice).
2) Data Analytic Design: How you designed the Storm topology (including your rationale for your design).
3) Results: The results obtained (excluding any discussion).
4) Discussion of Results.
5) Conclusions and Recommendations (including discussion of how you would perform the task if it were to be undertaken at much larger scale).
Format of your report
1) The report must be no longer than two A4 pages, excluding any appendices, with 12-point justified text, and must be submitted in PDF or DOCX format only. (While the requirement is to produce no more than 2 pages, it is anticipated that the challenge will be to fit everything into those 2 pages: it is unlikely that a report of much less than 2 pages will result in a high mark.)
2) Make sure to save your file under your surname + module code (e.g., Abcd_COMP529).
3) You should include a listing of the Java program for your Storm topology in an appendix (no longer than 2 pages).
