Semantic News Analysis and Prediction
Active stock trading firms need quick analysis of financial news items, because news affects markets. Predicting how a news article may move a stock’s price can give a trader an edge over competitors, and doing so requires automatic understanding of the article’s semantics. Years of research on semantic Web Services have yielded a variety of techniques to discern or provide meaning beyond the basic WSDL syntax. I believe that this research into Web Service semantics is relevant to other fields, specifically the content analysis of news as it applies to markets.
The purpose of the present study is to determine whether specific academic models of Web-based semantic analysis can be used to predict market prices. The study’s design allows for an objective measure of accuracy by comparing predictions against actual market changes. In the study, I examine current “Top-Down” Web service semantic analyzers and distill their various approaches into abstract concepts. I then take a common approach, textual content matching, and apply it with and without synonym analysis (a form of spreading activation), with promising results.
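As a rough illustration of textual content matching with and without synonym analysis, the sketch below scores two headlines by token overlap (Jaccard similarity) and optionally expands each token set with synonyms first. It is not the study's implementation: the similarity measure, the function names, and the tiny `SYNONYMS` table are all assumptions made for this example, and a real system would likely draw synonyms from a thesaurus.

```python
import re

# Hypothetical synonym table, invented for this example only.
SYNONYMS = {
    "profit": {"earnings", "income"},
    "rise": {"climb", "gain"},
}

def tokenize(text):
    """Lowercase the text and return its set of alphabetic tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))

def expand(tokens, synonyms):
    """Add the listed synonyms of every token (one step of spreading)."""
    expanded = set(tokens)
    for tok in tokens:
        expanded |= synonyms.get(tok, set())
    return expanded

def similarity(a, b, synonyms=None):
    """Jaccard overlap of two texts, optionally after synonym expansion."""
    ta, tb = tokenize(a), tokenize(b)
    if synonyms:
        ta, tb = expand(ta, synonyms), expand(tb, synonyms)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0

plain = similarity("Profit rise expected", "Earnings climb expected")
with_syn = similarity("Profit rise expected", "Earnings climb expected", SYNONYMS)
```

In this toy case the two headlines share only one literal token, so the plain score is low, while synonym expansion lets "profit" match "earnings" and "rise" match "climb", raising the score.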
Using the securities in the Russell 1000 Index (chosen for market liquidity and activity), I collected corresponding news articles from Reuters for 8 months. For each article, I pulled one-minute snapshots of market data for the article’s publishing date and corresponding security. I then divided the news items into two groups: an in-sample learning set and an out-of-sample input set. The in-sample set supplied “predictions” of price movement for each input item, which I then compared against how that item’s security actually moved in the market.
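The evaluation loop described above can be sketched as follows, under several assumptions that are not taken from the study: each in-sample article is reduced to a token set paired with an observed direction (+1 up, -1 down), an input article borrows the direction of its best-matching in-sample article, and the trade is entered at a given minute and closed after a fixed hold period. The function names are hypothetical, and transaction costs are ignored.

```python
def hold_return(prices, entry_minute, hold_minutes, direction):
    """Signed fractional return of a trade held for hold_minutes.

    prices: one-minute price snapshots for the article's publishing date.
    direction: +1 for a long position, -1 for a short position.
    """
    exit_minute = entry_minute + hold_minutes
    if exit_minute >= len(prices):
        raise ValueError("hold period extends beyond available data")
    entry_price = prices[entry_minute]
    exit_price = prices[exit_minute]
    return direction * (exit_price - entry_price) / entry_price

def predict_direction(article_tokens, in_sample):
    """Return the direction of the most similar in-sample article.

    in_sample: list of (token_set, observed_direction) pairs.
    Similarity here is plain Jaccard overlap, an assumption of this sketch.
    """
    def overlap(entry):
        tokens, _ = entry
        union = tokens | article_tokens
        return len(tokens & article_tokens) / len(union) if union else 0.0

    _, direction = max(in_sample, key=overlap)
    return direction

# Toy data, invented for illustration.
in_sample = [({"profit", "beats"}, 1), ({"loss", "widens"}, -1)]
prices = [100.0, 101.0, 102.0, 99.0]

direction = predict_direction({"quarterly", "loss", "widens"}, in_sample)
result = hold_return(prices, entry_minute=0, hold_minutes=3, direction=direction)
```

Repeating this over every out-of-sample article and several hold durations would give one average return per duration, which is the kind of figure that can be compared against a random-matching baseline.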
Simple semantic analysis produced encouraging results with a rate of return (profit) better than random for shorter hold durations (one to five minutes). A synonym-based strategy showed a stronger return for longer hold periods (thirty to forty-five minutes). Both strategies performed better than a random matching approach, which lost money for every hold duration. These results show potential for similar and broader market analysis using established academic models of semantic Web analysis.