Social Network Analysis At Scale: Graph-based Analysis of Twitter Trends and Communities

dc.contributor.advisor: Tešić, Jelena
dc.contributor.author: Nogueira de Moura, Lia
dc.contributor.committeeMember: Metsis, Vangelis
dc.contributor.committeeMember: Hall-Phillips, Adrienne
dc.date.accessioned: 2020-11-18T14:33:19Z
dc.date.available: 2020-11-18T14:33:19Z
dc.date.issued: 2020-11
dc.description.abstract: Twitter's influence on society and communication has motivated research work over the past decade. Much of the existing research focuses on specific Twitter datasets bound by time, location, topic, or hashtag, on the analysis of the content of tweet messages in those datasets, and on their influence on the fields of business, education, geography, health, linguistics, social sciences, and public governance. Researchers have attempted to answer a variety of questions, e.g., "What topics are being discussed in the Twitter dataset?", "What communities are formed within the set of users?", "Which users are at the center of a particular discussion?", "How are users reacting to real-time events?", and, more importantly, "How can we combine and refine existing data science techniques so that they can be used in other Twitter-related research?" Very few attempts have addressed the design of an end-to-end data processing and analysis pipeline at scale. This body of work offers one scalable solution to gather, discover, analyze, and summarize the joint sentiment of Twitter trends (topics, hashtags) and communities (groups of users bound by connection, topic, time period, or possibly location, language, or interest) in the larger subspace of the Twitterverse. Topic discovery is improved by contextual network construction and tweet aggregation. The work offers an overarching pathway for constructing an end-to-end data science pipeline for meaningful analysis of Twitter datasets at scale, namely data management, graph network construction, clustering, topic modeling, and graph data compression for meaningful visualization. We evaluate the data science package and different methods for graph construction and tweet data processing on over 12 million tweets across six different Twitter datasets.
dc.description.department: Computer Science
dc.format: Text
dc.format.extent: 143 pages
dc.format.medium: 1 file (.pdf)
dc.identifier.citation: Nogueira de Moura, L. (2020). Social network analysis at scale: Graph-based analysis of Twitter trends and communities (Unpublished thesis). Texas State University, San Marcos, Texas.
dc.identifier.uri: https://hdl.handle.net/10877/12933
dc.language.iso: en
dc.subject: Social networks
dc.subject: Twitter
dc.subject: Graph analysis
dc.subject: Network analysis
dc.subject: Topic modeling
dc.subject: Topic discovery
dc.subject: Communities
dc.subject: Data management
dc.subject: MongoDB
dc.subject: Semi-structured database
dc.subject: Data science pipeline
dc.subject.lcsh: Twitter
dc.subject.lcsh: Online social networks
dc.subject.lcsh: Social media
dc.title: Social Network Analysis At Scale: Graph-based Analysis of Twitter Trends and Communities
dc.type: Thesis
thesis.degree.department: Computer Science
thesis.degree.discipline: Computer Science
thesis.degree.grantor: Texas State University
thesis.degree.level: Masters
thesis.degree.name: Master of Science

Files

Original bundle

Name: NOGUEIRADEMOURA-THESIS-2020.pdf
Size: 50.76 MB
Format: Adobe Portable Document Format

License bundle

Name: LICENSE.txt
Size: 2.97 KB
Format: Plain Text