Why Use Stock Screens
Using stock screens to sort huge databases of company metrics is attracting more retail investors. It requires no knowledge of accounting and no time spent reading reports or researching the business. It comes with credible-looking back-testing to prove it works. And, hey! Didn’t someone prove that markets are ‘efficient’, and that all publicly available information is already incorporated into the stock price? But a simple understanding of screening amounts to a wrong understanding. This page presents the system in more detail so you can see the problems and avoid the traps. The specific data comes from What Works on Wall Street by James O’Shaughnessy.
Data Snooping
The biggest problem with this strategy lies in its basic premise: that back-testing proves something. Academics have given this issue the name ‘data snooping’. Given enough time, enough attempts, and enough imagination, almost any pattern can be teased out of any dataset. And there have been decades and decades of dredging done on the historical stock exchange databases. Even proponents of the efficient markets hypothesis argue that many of the predictable patterns identified in financial markets may be due to simple chance (Reality Check by Park and Irwin, 2009). The relationships simply do not exist outside of the specific dataset analyzed (e.g. in the future).
Understanding the Metrics
The metrics used here may be essentially the same as those used to understand and evaluate a company heuristically. But there are differences in interpretation. Here, the investor has no need to agree with the calculation of any metric. Nor should he care if there appears to be no explanation for why a metric works.
This is because they are not meant to show that one company is ‘better’ than another. No CEO should attempt to run his company using these metrics as operating objectives. E.g. ‘1-yr sales %growth’ is negatively correlated with stock appreciation, but we would never want management to aim for lower sales. The metrics are used as predictors of near-term stock price movements ONLY. And not of an individual stock’s price, but of the basket portfolio created by the screen.
The metrics should be thought of like a poker player’s ‘tell’. They are predictors of future stock price movements, just like a player’s tell indicates the value of his hand. The tell says nothing about the player’s long term ability to win, only his probable performance this hand.
Classification of Metrics (examples only)
- Market Attention metrics:
- Stock price hitting new highs
- 3 mo, 6 mo, 1 yr stock price % appreciation
- 5 year stock price % appreciation (a negative indicator because of reversion to the mean)
- Growth metrics:
- 1 year EPS % growth
- 5 year EPS % growth (negative indicator: reversion to the mean)
- ROE (= theoretically possible growth)
- 1 year sales % growth (a negative indicator per O’Shaughnessy)
- Risk metrics:
These use Balance Sheet accounts to measure liquidity and leverage.
- Valuation metrics:
There are five main valuation metrics. The modifying factor can be thought of as the link between the metric and Net Income.
- Price/Sales is modified by the factor ‘net margin’.
- Price/Book Value is modified by the factor ‘ROE’
- Price/Earnings is modified by the factor ‘growth’
- Price/Cash Earnings is modified by ‘capex per sales’. This is not the same measure as cash flow used for absolute valuations.
- Dividend Yield is modified by the ‘payout ratio’.
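The ‘modifying factor’ idea can be checked with basic per-share arithmetic: each valuation ratio, adjusted by its factor, collapses to the same earnings-based measure. A minimal sketch, using hypothetical company figures:

```python
# Hypothetical per-share figures (for illustration only).
price = 40.0          # share price
sales_ps = 20.0       # sales per share
book_ps = 16.0        # book value per share
eps = 2.0             # earnings per share
dividend_ps = 0.8     # dividend per share

net_margin = eps / sales_ps          # 0.10
roe = eps / book_ps                  # 0.125
payout = dividend_ps / eps           # 0.40

# Each valuation metric, divided by its modifying factor,
# links back to the same Net-Income-based measure (P/E):
pe = price / eps                                          # 20.0
assert abs((price / sales_ps) / net_margin - pe) < 1e-9   # P/S ÷ margin = P/E
assert abs((price / book_ps) / roe - pe) < 1e-9           # P/B ÷ ROE = P/E
div_yield = dividend_ps / price
assert abs(payout / pe - div_yield) < 1e-9                # payout ÷ P/E = yield
print(pe, div_yield)  # 20.0 0.02
```

This is why screening on a valuation metric without its modifying factor can mislead: a low Price/Sales with a razor-thin margin implies a high, not low, effective P/E.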
Different Ways to Define the Metric Parameters
- The metric must be greater/less than a predefined value.
- The metric must be better than the average for the universe (or in the top decile, etc).
- Only the most extreme values are accepted until all the spaces in the portfolio are filled.
- A more sophisticated method incorporates a larger number of metrics. For each metric all the stocks are given a ranking (or decile, etc) for that metric. A stock’s overall score is the sum across the values for all the metrics. This way no stock is discarded just because it does not have an extreme value on one metric. A company with moderate values on a lot of metrics outranks one with extreme scores on a limited number of metrics.
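The summed-ranking method in the last point can be sketched in a few lines. The tickers, metrics, and values below are hypothetical; the point is that a stock with moderate ranks across every metric can beat one with a single extreme value.

```python
# Composite ranking: rank every stock on each metric (1 = best),
# then sum the ranks. Lowest total wins. All data are hypothetical.
stocks = {
    "AAA": {"p_sales": 0.8, "div_yield": 0.04, "momentum_1y": 0.15},
    "BBB": {"p_sales": 0.5, "div_yield": 0.00, "momentum_1y": 0.40},
    "CCC": {"p_sales": 1.2, "div_yield": 0.05, "momentum_1y": 0.10},
}

# Direction of each metric: lower is better for price/sales,
# higher is better for yield and momentum.
lower_is_better = {"p_sales": True, "div_yield": False, "momentum_1y": False}

scores = {name: 0 for name in stocks}
for metric, asc in lower_is_better.items():
    ordered = sorted(stocks, key=lambda s: stocks[s][metric], reverse=not asc)
    for rank, name in enumerate(ordered, start=1):
        scores[name] += rank

# No stock is discarded for one weak metric; the totals decide.
ranking = sorted(scores, key=scores.get)
print(ranking)  # ['BBB', 'AAA', 'CCC']
```

Here BBB wins on two extreme values, but AAA's three middling ranks keep it ahead of CCC, which is exactly the behaviour the method is designed to produce.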
Where To Find The Data
Sites don’t always say what universe of stocks is being searched, so make your own presumptions from the context. While there are several sites that can screen for US stocks, the sites for Canadian companies are very rudimentary.
When there are no screening tools available, you can work from the sites’ on-screen data displays. Copy and paste them into a blank Excel file. Sort by ticker and arrange the tables beside each other, making sure all the metrics on a line pertain to the same company. Then sort by each column’s metric in turn. Color-shade the values that comply with your screen parameters. Once all the metrics have been analyzed, add across each line, giving different values to the different color codings.
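This manual count-and-weight workflow can also be automated. A minimal sketch in Python, where each threshold rule plays the role of one color code and the weights mimic ‘giving different values to the different color codings’; the tickers, column names, and thresholds are hypothetical:

```python
# Score each company by the weighted number of screen rules it passes.
# All tickers, metrics, and thresholds below are hypothetical.
rows = [
    {"ticker": "AAA", "p_e": 12.0, "div_yield": 0.045, "debt_equity": 0.3},
    {"ticker": "BBB", "p_e": 25.0, "div_yield": 0.010, "debt_equity": 1.5},
    {"ticker": "CCC", "p_e": 9.0,  "div_yield": 0.030, "debt_equity": 0.8},
]

# Each (rule, weight) pair stands in for one color shading.
rules = [
    (lambda r: r["p_e"] < 15, 2),          # cheap earnings, weighted double
    (lambda r: r["div_yield"] > 0.02, 1),  # pays a decent dividend
    (lambda r: r["debt_equity"] < 1.0, 1), # not over-levered
]

for row in rows:
    row["score"] = sum(w for rule, w in rules if rule(row))

rows.sort(key=lambda r: r["score"], reverse=True)
print([(r["ticker"], r["score"]) for r in rows])
# [('AAA', 4), ('CCC', 4), ('BBB', 0)]
```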
FinViz: Cdn, US and int’l companies
CNBC: Cdn and US companies
Guru Analysis
Financial Times
Morningstar
Nasdaq screen
GlobeInvestor
Kiplinger
MSN Money
Zacks
Canoe
Harry Domash
Yahoo Finance
AdviceforInvestor
StockCharts
General Issues You Should Understand
* Risk is just as important as increasing returns. For all portfolios, the arithmetic average of your annual returns will be higher than the compound return you realize over time. (E.g. the S&P 500’s annual returns from 1927-2006 average 12.2%, but the compound return over time is only 10.3%.) It is the infrequent really bad years that drag you down. Screening on single metrics widens the difference between these two measures of return.
Almost all screened portfolios in the book had greater risk than the underlying universe of stocks. This is predictable. When screening for growth, you end up exposed to high priced stocks when their growth ends. When screening for value, you end up with broken businesses.
A large number of portfolios back-tested in the book UNDERperformed their benchmark in one third of all 5-year periods. Even if there is near certainty of regaining the shortfall in the subsequent 5 years, do YOU want to wait 10 years just to pull even?
* Why does it work? The extra return generated by screening for a specific metric may not be attributable to the metric itself. E.g. screening LargeCaps for ‘dividend yield’ will spit out a large number of utilities. The high returns are a function of the sector not the metric. Screening AllCaps for ‘dividend yield’ added little value for O’Shaughnessy.
* Multiple metrics. Screening for more metrics is not necessarily better than screening for only one. When the metrics belong to the same category, you effectively make your selections more and more risky, with fewer and fewer redeeming qualities.
Some metrics may have little value alone, but add value when combined with others. E.g. using ‘1-year price appreciation’ to sort SmallCaps lowers your return from 12.6% to 10.2%. But when combined with the metric ‘price/sales<1’, the returns increase to 18.6%.
The risk created by single metrics can be reduced by teaming up a value metric with a market attention metric. This way you look for companies who are still cheap but have turned the corner. Now the markets are paying attention.
* Capitalization size. The value of a metric can differ when applied to universes of different capitalization (small cap, large cap, equal-weighted, cap-weighted). E.g. ‘1-yr price appreciation’ adds about 3% of value in the LargeCap universe, but decreases returns by more than 2% in the SmallCap universe.
* Size of Portfolio. Decreasing the number of stocks in the resulting portfolio can increase your returns because the ones chosen are more extreme. But almost always the increase in risk is so severe that no investor could continue the strategy through the dark days. The advantages of screening are greater in the SmallCap universe, but when adjusted for risk the benefits are greater in the LargeCaps.
* Rebalance Frequency. The optimum length of time between rebalancing with the results of a new screen varies. Generally, 1 year is optimal for LargeCaps and value metrics. SmallCaps and growth metrics are better rebalanced more frequently. Transaction costs and taxes will affect your decision on this.
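The gap between average and compound returns described in the first point above can be verified with a few lines of arithmetic. A minimal sketch using a hypothetical five-year return series:

```python
# Why the arithmetic average of yearly returns overstates what you
# actually compound to, and why one bad year drives the gap.
# The return series below is hypothetical.
returns = [0.25, 0.10, -0.30, 0.20, 0.15]  # five yearly returns

arithmetic = sum(returns) / len(returns)   # simple average: 8.0%

growth = 1.0
for r in returns:
    growth *= 1.0 + r                      # compound the whole series
geometric = growth ** (1.0 / len(returns)) - 1.0  # roughly 5.8%/yr

# The compound (geometric) return is always <= the arithmetic average;
# the gap widens with volatility.
print(round(arithmetic, 4), round(geometric, 4))
assert geometric < arithmetic
```

Replace the -30% year with a flat 0% year and the two measures move much closer together, which is the sense in which the bad years ‘drag you down’.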
Why This May Not Work
1.) The greatest amount of work has been done by James O’Shaughnessy (reference above). Similar work was done by Jeremy Siegel who claims even greater benefits. But the hypothesis was ‘proven’ only by back-testing the actual history of stocks. What remains unknown is the influence of pure chance.
2.) O’Shaughnessy has created public mutual funds managed by these strategies, but they have not beaten the market since performing well through the tech crash (as did everyone who avoided profitless stocks). Of course he never claimed to even meet the market over any 5-year period, so he can respond by saying you have to stick with him. Regardless, it raises the possibility that history may not predict the future.
3.) Although the books by both authors are aimed at retail investors, neither overtly addresses the size of the portfolio and the risks of making it smaller. O’Shaughnessy uses 50 stocks and Siegel uses 100 stocks. The retail investor is not going to buy 50 – 100 stocks. Their risk would be HUGELY greater.
4.) A problem with O’Shaughnessy’s work is shown by what he decides NOT to show. For each metric analyzed he shows a bar chart of the resulting % returns for each decile of the metric. A bar chart shows only one value – the average of the decile. With no extra space, his chart could have been drawn to show the average return, the width of the standard deviation, and the spread of all results. The illustration (right) shows the alternate displays he chose NOT to use.
The implications of this missing information are shown by the results (right) of sorting on the metric ‘dividend yield’ in both the AllStock universe and the LargeCap universe. The decile results are compared to his reported results for a “Best 50” portfolio of the 50 stocks with the highest ‘yield’.
We don’t know how large the LargeCap universe is, but he does say that the AllStock universe is 3795 stocks. So the first decile includes about 380 stocks and would include all the stocks chosen for the “Best 50”. It would be expected that the return realized by the “Best 50” would be similar to the average of the top decile. It isn’t.
The results for the AllStock universe are similar to the results for ALL the other metrics analyzed. The 50 extreme stocks produce returns (13.4%) far lower than the average for the top decile (14.3%). So much lower that they more closely replicate the returns from the whole top half (5 deciles together) of the universe. This implies that the variability of results within that top decile is very wide indeed. We could have seen this directly if he had NOT used bar charts.
The results from sorting the LargeCap universe do NOT conform to the pattern of other metrics. The 50 extreme stocks produce returns (13.6%) far better than the average for the top decile (12.7%). But was this just chance? Why only this metric? Why not for the AllStock universe as well? Why so much larger than the average for the top decile? Why were the second and third deciles’ returns higher than the top decile? It could all be due to chance.
O’Shaughnessy’s recommendations, as well as many advisors’, are heavily influenced by his results that show that dividends are the best predictor for risk-adjusted portfolio returns. But the results raise more questions than they answer.