US federal securities laws require publicly traded companies to disclose information on an ongoing basis. US companies must submit annual reports on Form 10-K and quarterly reports on Form 10-Q. These quarterly reports often lead to significant volatility in stock prices and create opportunities for alpha generation for long-short equity investors. Most analysts pay careful attention to the corporate performance numbers in quarterly reports and compare them against street expectations (see Investopedia).
Tucker Balch, co-founder of Lucena Research and Professor at Georgia Tech, instead believes that monitoring words can predict stock prices even better than numbers. Dr. Balch worked with a research student at Georgia Tech, Junwei Li, to track and understand which words impacted stock prices the most, regardless of the actual reported financial results.
Mr. Li found that, of the information in a 10-Q or 10-K, the reported earnings numbers are probably the most impactful, but the commentary is important as well. Lucena Research and other quant shops use such unstructured data, combined with machine learning, to develop proprietary investment strategies.
Stefanova: When reviewing quarterly reports, how did you quantify the impact of words?
Dr. Balch: To investigate how words in a 10-K or 10-Q impact company stock prices, we borrowed some well-used techniques from the Natural Language Processing literature.
We looked at the filings of 56 technology stocks from 2000 to 2013, for a total of 3,126 filings. Reports were classified as “positive” or “negative” according to whether the stock price went up or down relative to the market on the day after the filing.
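The labeling rule described above can be sketched in a few lines. This is only an illustration: the function name `label_filing`, the use of simple daily returns, and the treatment of an exact tie as “negative” are assumptions, not details given in the interview.

```python
def label_filing(stock_return, market_return):
    """Label a filing by the stock's next-day return relative to the market.

    Both arguments are simple daily returns (0.02 means +2%) for the
    trading day after the filing. Returns (label, excess_return).
    """
    # Excess return: how much the stock moved beyond the market move.
    excess = stock_return - market_return
    # Assumption: a stock that merely matches the market counts as "negative".
    label = "positive" if excess > 0 else "negative"
    return label, excess


# A stock that rose 1% on a day the market rose 3% still underperformed.
label, excess = label_filing(0.01, 0.03)
print(label)
```

Measuring the move relative to the market, rather than in absolute terms, keeps a broad market rally or sell-off from being attributed to the wording of a single filing.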
We scored each word using the following criteria:
- The number of times it appeared in negative reports.
- The number of times it appeared in positive reports.
- The magnitude of price movement after each report.
Without going into the math too much, we devised a ranking scheme that provides a high score, near 1.0, for words that frequently appear in positive reports and less frequently in negative reports. Similarly, words that appear often in negative reports get a negative score near -1.0. Words that don’t seem to have much effect either way end up with scores near 0.0.
Stefanova: Were you surprised by what you found in terms of the correlation between words and stock price performance?
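The interview does not give the actual formula, so the following is only a sketch of one scheme with the properties described: it uses the three criteria listed above and yields scores between -1.0 and 1.0. The function name `score_words`, the input layout, and the weighting of each occurrence by the absolute size of the price move are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def score_words(filings):
    """Score words in [-1.0, 1.0] from (words, excess_return) pairs.

    `words` is the list of tokens in one filing; `excess_return` is the
    stock's next-day return minus the market's. Hypothetical scheme:
    each occurrence is weighted by the magnitude of the price move.
    """
    pos = defaultdict(float)  # weighted occurrences in positive reports
    neg = defaultdict(float)  # weighted occurrences in negative reports
    for words, excess in filings:
        bucket = pos if excess > 0 else neg
        for word, count in Counter(words).items():
            bucket[word] += count * abs(excess)

    scores = {}
    for word in set(pos) | set(neg):
        total = pos[word] + neg[word]
        if total > 0:  # skip words seen only in zero-move filings
            scores[word] = (pos[word] - neg[word]) / total
    return scores


# Toy data, invented for illustration only.
toy_filings = [
    (["outperform", "confident", "revenue"], 0.04),
    (["questionable", "revenue"], -0.04),
    (["outperform"], 0.02),
]
toy_scores = score_words(toy_filings)
print(sorted(toy_scores.items(), key=lambda item: item[1]))
```

In this toy data, "outperform" appears only in positive reports and scores 1.0, "questionable" appears only in a negative report and scores -1.0, and "revenue" appears equally on both sides and scores 0.0.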
Dr. Balch: For the most part, the “bad” words associated with price drops are the kinds of things you’d expect: words like “questionable,” “disturbing,” and “slander.” Similarly, on the “good” side we find words like “outperform” and “confident.” It is especially nice to see “outperform” on the good side and “underperform” on the bad side.