TRAFFIC EVENT DETECTION USING TWITTER DATA BASED ON ASSOCIATION RULES
Social media platforms allow millions of people worldwide to share their thoughts online instantly. Many people use social media to post about traffic-related experiences and events. A large amount of traffic-related data can be obtained from these posts, especially from geosocial media data, where posts are tagged with geolocation information such as coordinates or place names. Traffic events extracted from geosocial media data can help drivers adapt to changing traffic conditions and help traffic management departments propose timely and effective plans to improve them. Most existing studies query traffic-related information using a list of single keywords, which results in large amounts of noisy data: negative data that contain one or more traffic-related keywords but do not represent real-world traffic events. This paper aims to filter noisy data by mining association rules among words in positive data, that is, messages that represent traffic events. Messages are more likely to describe true traffic events if they follow the word co-occurrence patterns mined from positive samples. A case study was conducted in Toronto, Canada, using Twitter data. The tweets queried by the association rules were classified into five categories (non-traffic events, traffic accidents, roadwork, severe weather conditions, and special events) with 85% accuracy using supervised machine learning methods. When compared against hourly average travel speed data, 81% of the detected events were identified as real-world traffic events. This research sheds light on traffic condition monitoring in smart transportation platforms, which plays an important role in smart cities.
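To make the filtering idea concrete, the following is a minimal sketch of mining word co-occurrence rules from positive messages and using them to screen new messages. It is an illustration under stated assumptions, not the method used in this study: it considers only word pairs, the function names (tokenize, mine_pair_rules, matches_any_rule) and the support and confidence thresholds are hypothetical, and the example messages are invented for demonstration.

```python
from collections import Counter
from itertools import combinations


def tokenize(text):
    # Hypothetical tokenizer: lowercase, split on whitespace, strip punctuation.
    return {w.strip(".,!?:;#@\"'()").lower() for w in text.split()} - {""}


def mine_pair_rules(positive_texts, min_support=0.5, min_confidence=0.8):
    """Mine rules 'antecedent -> consequent' between single words.

    support    = share of positive messages containing both words
    confidence = share of messages with the antecedent that also
                 contain the consequent
    The thresholds are illustrative assumptions.
    """
    n = len(positive_texts)
    word_counts = Counter()
    pair_counts = Counter()
    for text in positive_texts:
        words = tokenize(text)
        word_counts.update(words)
        pair_counts.update(combinations(sorted(words), 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        # Each frequent pair can yield a rule in either direction.
        for ante, cons in ((a, b), (b, a)):
            confidence = count / word_counts[ante]
            if confidence >= min_confidence:
                rules.append((ante, cons, support, confidence))
    return rules


def matches_any_rule(text, rules):
    # A message passes the filter if it contains both words of some rule.
    words = tokenize(text)
    return any(ante in words and cons in words for ante, cons, _, _ in rules)


# Invented positive samples, for illustration only.
positive = [
    "Collision on Gardiner Expressway, left lane blocked",
    "Collision reported on Highway 401, two lanes blocked",
    "Lane blocked after collision near Yonge St",
    "Roadwork on Bloor St, expect delays",
]
rules = mine_pair_rules(positive)

# A message that follows a mined co-occurrence pattern is kept ...
keep = matches_any_rule("Collision on DVP, right lane blocked", rules)
# ... while one that only shares a single keyword is filtered out.
drop = matches_any_rule("This meeting was a total collision of egos", rules)
```

In this sketch, a message containing only the keyword "collision" is rejected because no mined rule is satisfied by that word alone, which is the kind of noise that single-keyword queries would let through. A fuller implementation would likely mine larger itemsets (for example with an Apriori-style algorithm) and remove stop words before counting.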