Location-Based Social Media for Activity Space Modeling

Date

2019-08

Authors

Wang, Xujiao

Abstract

Human activity research is rooted in modeling the patterns of human activities in space and time. Previous studies have made substantial progress in the theories, methods, and applications of human activity analysis. Among these studies, human activity space modeling has been a crucial topic in studying the spatial distribution of individual behaviors. It aims to understand and solve problems driven by human activities, such as urban expansion and traffic congestion in the process of urbanization. Many commonly used activity models in computational physics and computer science are constructed at an abstract and generic level. However, individual activities vary over space and time; it is therefore imperative to account for spatial-temporal dynamics and variations when modeling activity space at the individual level. Compared to traditional data sources that are costly and time-consuming to collect, location-based social media (LBSM) gives researchers more flexibility regarding where, when, and how to collect information about human activity behaviors. Studies utilizing LBSM to analyze human activity patterns have grown rapidly. However, there is a lack of understanding about the morphology and the internal structure of activity space extracted from LBSM datasets. In addition, many studies lack effectiveness tests of how reliably LBSM data can be used to explain human activity space. To this end, this study explores the effectiveness of representing activity space from an individual perspective using LBSM data from three Chinese cities (Beijing, Shanghai, and Guangzhou). The two objectives of this dissertation are summarized as follows. First, because effectiveness testing is lacking in deriving human movement from LBSM data, this study tests the effectiveness of intra-individual indicators in modeling activity spaces from LBSM data.
We evaluate how data collection durations and the choice of indicators affect the reliability of intra-individual activity space modeling. We use a linear regression model with a logarithmic transformation to approximate how the magnitudes of four external morphology features and three internal structure characteristics change with data collection durations ranging from 1 month to 12 months. The results demonstrate that as the data collection duration increases, the magnitude of every defined indicator approaches a steady point; however, there are also outlier users who exhibit distinct patterns. This finding provides a useful reference for balancing data effectiveness against an appropriate sample size in empirical analyses of LBSM data. Second, little research has been conducted to test the effectiveness of inter-individual models in comparing the internal structure of individual activity spaces based on unevenly distributed data. To fill this gap, this dissertation investigates how different models perform in identifying inter-individual similarities between LBSM users. We first cluster LBSM check-ins using density-based spatial clustering of applications with noise (DBSCAN), choosing appropriate clustering parameters with the help of the elbow method. We then import the clustered activities into a vector space model (VSM) and a spatial-temporal vector space model (ST-VSM). The former considers only the spatial locations of the check-ins, whereas the latter also accounts for the time period (i.e., morning, afternoon, or night) of the check-ins. We then measure LBSM user activity similarities by applying an extended cosine similarity analysis. The results capture spatial-temporal activity similarities between LBSM users. In conclusion, this study evaluates the effectiveness of LBSM for activity space modeling.
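The regression step described above can be illustrated with a minimal sketch. It assumes that the logarithmic transformation is applied to the collection duration (the abstract does not say which variable is transformed), and the function name `fit_log_trend` and the synthetic indicator values are hypothetical, not taken from the dissertation.

```python
import numpy as np

def fit_log_trend(months, values):
    """Least-squares fit of values ~ a + b * ln(months).

    Returns the intercept a, the slope b, and the coefficient of
    determination r2. Assumption: the duration is the log-transformed
    variable; the dissertation may specify the model differently.
    """
    x = np.log(np.asarray(months, dtype=float))
    y = np.asarray(values, dtype=float)
    b, a = np.polyfit(x, y, 1)  # degree-1 fit returns [slope, intercept]
    residuals = y - (a + b * x)
    ss_res = float(np.sum(residuals ** 2))
    ss_tot = float(np.sum((y - y.mean()) ** 2))
    r2 = 1.0 - ss_res / ss_tot
    return a, b, r2

# Hypothetical indicator magnitudes for durations of 1 to 12 months.
months = np.arange(1, 13)
indicator = 2.0 + 1.5 * np.log(months)

a, b, r2 = fit_log_trend(months, indicator)

# Under this model the change per additional month is b / t, which
# shrinks as the duration t grows: the indicator flattens out.
marginal_change = b / months
```

Under a log-linear trend the per-month change shrinks as the duration grows, which is one way to read the "steady point" the abstract refers to.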
Here we define “effectiveness” as the stability of activity space indicators as the amount of data used varies. This study makes two contributions to activity space modeling. First, it explores the effectiveness of LBSM in modeling intra-individual activity space. The results of the effectiveness test demonstrate how data collection duration affects the magnitude of different activity space indicators. As the data size increases, the magnitudes of the four external and three internal indicators all approach a steady point in all three cities. This provides a useful reference for finding a balance point between effective indicators and appropriate sample sizes from LBSM data. The indicators and methods used in this study can also be applied to other social media platforms to test their stability and extensibility. Second, this study provides a robust method to measure individuals’ spatial-temporal similarities based on LBSM data. We evaluated the effectiveness of different models in measuring the inter-individual similarity between LBSM users based on their unevenly distributed check-ins. The results indicate that the extended similarity measurement is effective in discovering the spatial-temporal similarity between LBSM users, even with low-resolution LBSM data. To sum up, the effectiveness tests on both intra-individual indicators and inter-individual similarity measures offer a new perspective on examining the performance of LBSM data in human activity space modeling. We also explored the activity patterns of the three largest cities in a rapidly developing country.
The extracted activity patterns and outliers provide valuable input for urban planners and policy makers seeking to understand the dynamics of urban residents in three densely populated Chinese cities. We foresee that this research will improve the understanding of how to apply LBSM data to human activity studies and to related areas of geography, such as transportation, urban planning, and location-based services.
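The difference between the VSM and the ST-VSM described in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the dissertation's implementation: cluster labels are taken as already computed (for example by DBSCAN), the hour boundaries for morning, afternoon, and night are hypothetical, and plain cosine similarity stands in for the extended measure, whose details the abstract does not give.

```python
import math
from collections import Counter

def period_of(hour):
    """Map an hour (0-23) to a time period. Boundaries are assumptions."""
    if 5 <= hour < 12:
        return "morning"
    if 12 <= hour < 18:
        return "afternoon"
    return "night"

def vsm_vector(checkins):
    """VSM: count check-ins per spatial cluster only."""
    return Counter(cluster for cluster, _ in checkins)

def st_vsm_vector(checkins):
    """ST-VSM: count check-ins per (cluster, time period) pair."""
    return Counter((cluster, period_of(hour)) for cluster, hour in checkins)

def cosine(u, v):
    """Plain cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Hypothetical users: (cluster label, hour of check-in).
# Both visit the same clusters equally often, but at different times.
user_a = [("c1", 8), ("c1", 9), ("c2", 20)]
user_b = [("c1", 21), ("c1", 22), ("c2", 9)]

spatial_sim = cosine(vsm_vector(user_a), vsm_vector(user_b))
st_sim = cosine(st_vsm_vector(user_a), st_vsm_vector(user_b))
```

In this toy case the two users are indistinguishable in the purely spatial model, while the spatial-temporal model finds no overlap because they never share a cluster during the same time period.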

Keywords

activity space modeling, data quality, location-based social media (LBSM), big geodata

Citation

Wang, X. (2019). Location-based social media for activity space modeling (Unpublished dissertation). Texas State University, San Marcos, Texas.
