Image Credit: Zapp2Photo / Shutterstock.com

Machine learning is arguably the hottest topic in technology today. It’s what has enabled us to shoot videos of ourselves spewing rainbows, and it’s what will enable cars to drive themselves in the future.

As I’ve read more about the topic, I’ve begun to wonder if machine learning could be applied to investing. To deepen my knowledge, I’ve spent the last few months reading books on the subject and practicing on data sets.

Fortunately, my educational background has made the topic easier to pick up. Machine learning, at its core, is automated statistics. To understand it, one needs a solid foundation in linear algebra, calculus and statistics, all of which I was exposed to during my ten years at university.

After some study, I believe I now understand machine learning well enough to make it useful. Therefore, I’ve decided to start a new blog series that examines the application of machine learning to investing. This article is the first in the series, and I hope to publish one every month or two.

As a first stab, I decided to investigate whether various financial ratios (e.g. price to earnings) could predict whether a stock would outperform the market over a one-year period.

There are many machine learning models I could have used to investigate the problem. But since this is my first attempt, I decided to use the classic deep neural network (DNN). Let me explain what a DNN is, and why it’s useful.

#### What Deep Neural Networks Can Achieve That Linear Models Can’t

When researchers investigate the relationship between a potential cause and an outcome, they generally rely on a “linear” statistical model. For example, suppose a medical researcher wants to find a relationship between smoking and lung cancer. To investigate, she would collect data on a large number of smokers and non-smokers, and see whether the smokers tend to develop lung cancer more frequently. There’s a direct relationship between cause (smoking) and effect (cancer), and that’s what a linear model is good at capturing.

As with most other disciplines, finance practitioners have used linear models extensively. For example, the famous Fama-French three factor model is a linear model that decomposes the expected return of a stock into the risk free rate of return, beta, size and value components.
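For reference, the three factor model is usually written as a regression along the following lines. The symbols are the conventional textbook ones, not something taken from my own work:

$$
R_i - R_f = \alpha_i + \beta_i \,(R_m - R_f) + s_i \,\mathrm{SMB} + h_i \,\mathrm{HML} + \varepsilon_i
$$

Here $R_i$ is the stock’s return, $R_f$ is the risk free rate, $R_m$ is the market return, $\mathrm{SMB}$ is the size factor and $\mathrm{HML}$ is the value factor. Each factor enters as its own additive term, which is what makes the model linear.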

Unfortunately, a linear model is not good at capturing more complex relationships between cause and effect. Let me give you an example.

Suppose we want to know whether two magnets placed side by side will attract or repel. If we put a magnet’s positive end next to another magnet’s negative end, we know that they will attract. Therefore, we have two configurations where the magnets will attract: +-, and -+. In other cases (-- and ++), we know the magnets will repel.

Now, let’s suppose we want to create a linear model that tells us whether the magnets will attract or repel. To construct the model, we ask ourselves: given that the first magnet is showing +, will the magnets attract or repel?

Of course, as human beings, we would say that it depends on how the other magnet is oriented. However, a linear model can’t make the effect of one input depend on another. All it can do is add up the separate, direct effect of each magnet’s orientation. Since the magnets would attract half the time (when the other magnet shows -) and repel half the time (when the other magnet shows +), the linear model would say there’s no relationship between the first magnet showing + and the outcome.

DNNs, by contrast, are able to solve this problem using what’s known as “hidden layers”. You can think of hidden layers as intermediate steps that connect the cause and effect. For example, I know I'm oversimplifying things, but a hidden layer may contain information about whether the two magnets are showing the same signs. Because of hidden layers, DNNs can discern relationships between cause and effect that linear models can’t.
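To make the magnet example concrete, here is a minimal sketch in plain Python. The encoding (+1 for "+", -1 for "-"), the function names and the hand-picked network weights are all my own assumptions for illustration. A real DNN would learn its weights from data rather than have them written in by hand.

```python
# The four magnet configurations, encoding "+" as +1 and "-" as -1.
# Target: +1 if the magnets attract (signs differ), -1 if they repel.
configs = [(+1, +1), (+1, -1), (-1, +1), (-1, -1)]
targets = [+1 if a != b else -1 for a, b in configs]


def fit_linear(configs, targets):
    """Least-squares fit of y = w0 + w1*a + w2*b.

    Over these four points the columns (1, a, b) are mutually orthogonal,
    so each coefficient is simply the average of column * target.
    """
    n = len(configs)
    w0 = sum(targets) / n
    w1 = sum(a * y for (a, _), y in zip(configs, targets)) / n
    w2 = sum(b * y for (_, b), y in zip(configs, targets)) / n
    return w0, w1, w2


def relu(x):
    """Rectified linear unit: passes positive values, zeroes out the rest."""
    return max(0.0, x)


def tiny_network(a, b):
    """One hidden layer with two ReLU units; weights chosen by hand."""
    h1 = relu(a - b)  # active only for the (+, -) configuration
    h2 = relu(b - a)  # active only for the (-, +) configuration
    return h1 + h2 - 1.0  # +1 means attract, -1 means repel


linear_weights = fit_linear(configs, targets)
network_outputs = [tiny_network(a, b) for a, b in configs]

print("linear weights: ", linear_weights)
print("network outputs:", network_outputs)
print("targets:        ", targets)
```

The best linear fit ends up with every weight at zero, so it predicts the same thing for all four configurations. The two hidden units each detect one of the "opposite signs" cases, and combining them reproduces the targets exactly.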

I suspect that many investment phenomena require DNNs to explain. For example, history shows that value (i.e. cheap) stocks and momentum stocks tend to outperform the rest of the stock market. However, stocks that exhibit both value and momentum characteristics have not tended to outperform stocks that exhibit only one of the two. This suggests a complex interplay between value and momentum that linear models can’t tease out.

This relationship between value and momentum is something that I may investigate in the future. But this time around, I decided to focus on a different problem: whether financial ratios can predict whether a stock will outperform the rest of the stock market.

#### The Input and Output of DNNs

Financial ratios are metrics that one can calculate using a company’s financial statements. For example, the popular price to earnings (P/E) ratio is the price of a stock today divided by its earnings per share. Stocks with low P/E ratios are considered to be cheap.

There are hundreds of different ratios. Some, such as return on assets (ROA), measure how efficient a company’s operations are. Some others, such as the current ratio, measure the financial stability of a company. Different ratios all measure different aspects of the company, and some investors rely heavily on these ratios to make investment decisions.

I decided to investigate the influence of 82 such financial ratios on stock returns. I didn’t choose these ratios for any well thought out reason, but because they were convenient. I use a data service provider called Intrinio to fetch each stock’s fundamental data, and Intrinio already calculates these 82 ratios. The list of ratios is as follows:

| | | | |
| --- | --- | --- | --- |
| Dividend yield | Earnings yield | EV to EBIT | EV to EBITDA |
| EV to free cash flow | EV to invested capital | EV to NOPAT | EV to operating cash flow |
| EV to revenue | Altman Z score | Total debt to EBITDA | Long term debt to EBITDA |
| Long term debt to NOPAT | Net debt to EBITDA | Net debt to NOPAT | EBIT margin |
| EBITDA margin | Effective tax rate | Gross margin | Interest burden |
| NOPAT margin | Normalized NOPAT margin | Operating expenses to revenue | Operating margin |
| Pretax income margin | Profit margin | R&D to revenue | Selling, general & administrative to revenue |
| Tax burden | Current ratio | Debt-free net working capital to revenue | Debt-free, cash-free net working capital |
| Net working capital to revenue | Quick ratio | Compound leverage factor | Debt to equity |
| Leverage ratio | Long term debt to equity | EBIT growth | EBITDA growth |
| Earnings per share growth | Free cash flow growth | Invested capital growth | Net income growth |
| NOPAT growth | Operating cash flow growth | Revenue growth | Accounts payable turnover |
| Accounts receivable turnover | Asset turnover | Cash conversion cycle | Days inventory outstanding |
| Days payable outstanding | Days sales outstanding | Fixed asset turnover | Inventory turnover |
| Invested capital turnover | Augmented payout ratio | Cash return on invested capital | Dividend payout ratio |
| Net nonoperating expense percent | Noncontrolling interest sharing ratio | Operating cash flow to capital expenditures | Operating return on assets |
| Return on assets | Return on common equity | Return on equity | Return on invested capital |
| Return on net nonoperating assets | ROIC less NNEP spread | EBIT less capital expenditures to interest expense | EBIT to interest expense |
| Free cash flow to interest expense | NOPAT less capital expenditures to interest expense | NOPAT to interest expense | Operating cash flow less capital expenditures to interest expense |
| Operating cash flow to interest expense | Common equity to total capital | Debt to total capital | Long term debt to capital |
| Noncontrolling interests to total capital | Short term debt to capital | | |

The list of ratios provided by Intrinio is pretty comprehensive. It not only includes ratios favoured by novices (e.g. dividend yield), but also contains ratios favoured by seasoned professionals (e.g. return on invested capital). Some ratios, such as the EV to EBIT ratio, have been found by academics to predict stock performance. The input to the DNN model consists of these ratios for each stock, each year, going back to 2007.

The output of the model, or the measure that we’re trying to predict, is each stock’s performance in excess of the S&P 500 index’s return over the next 12 months. The S&P 500 tracks 500 large U.S. companies and is widely used as a proxy for the U.S. stock market as a whole. If a stock returned 20% and the S&P 500 returned 15%, then the output is 20% - 15% = 5%.
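As a rough sketch, the output for a single stock could be computed like this. The function names and the prices are made up for illustration, and I’m assuming the prices have already been adjusted for dividends and splits.

```python
def total_return(start_price, end_price):
    """Return over the period as a decimal fraction (0.20 means 20%)."""
    return end_price / start_price - 1.0


def excess_return(stock_return, index_return):
    """Stock return minus index return, both as decimal fractions."""
    return stock_return - index_return


# Hypothetical adjusted prices at the start and end of the 12 months.
stock = total_return(50.0, 60.0)      # stock gained 20%
index = total_return(2000.0, 2300.0)  # index gained 15%

output = excess_return(stock, index)
print(f"stock {stock:.1%}, index {index:.1%}, excess {output:.1%}")
```

A positive output means the stock beat the index over the period, and a negative one means it lagged.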

I used Yahoo Finance data for the performance data, which allowed me to easily account for dividends and stock splits. I also had to be careful about the timeframe over which I measured each stock’s performance.

Companies release their end of year financial statements up to 3 months after the end of the fiscal year. If I measured the stock performance from the end of the fiscal year, my model would have trained as if it had foreknowledge of the numbers the company would report.

For example, suppose a company’s fiscal year ended in Dec 2014. This company probably wouldn’t have released its financial statements until some time in Feb 2015. If I trained my model on the stock performance from Jan 2015 to Jan 2016 using the financial statements up to Dec 2014, I would be training it on knowledge that didn’t exist in Jan 2015. To get around this problem, I measured the stock performance starting 3 months after the end of the fiscal year (e.g. from Mar 2015 to Mar 2016).
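Here is a minimal sketch of how such a measurement window could be derived. It assumes the window starts a fixed number of months after the fiscal year end, which matches the Dec 2014 to Mar 2015 example. The helper names `add_months` and `measurement_window` are hypothetical.

```python
import calendar
from datetime import date


def add_months(d, months):
    """Shift a date by whole months, clamping to the last day of the month."""
    month_index = d.year * 12 + (d.month - 1) + months
    year, month = divmod(month_index, 12)
    month += 1  # back to a 1-based month number
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def measurement_window(fiscal_year_end, lag_months=3, horizon_months=12):
    """Start after the reporting lag, then measure over the horizon."""
    start = add_months(fiscal_year_end, lag_months)
    end = add_months(start, horizon_months)
    return start, end


start, end = measurement_window(date(2014, 12, 31))
print(start, "to", end)
```

Keeping the lag as a parameter makes it easy to see how sensitive the results are to the assumed reporting delay.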

Once I gathered the input and output data, I preprocessed it. In this step, I added a new metric for each financial ratio that showed whether that ratio was absent, which doubled the number of metrics to 164. Some ratios can’t be calculated for mathematical reasons. For example, the EV to EBIT ratio can’t exist if EBIT is 0. Other ratios were unfortunately absent because of data issues - while I think Intrinio’s data is good, it’s not perfect.

After this, I filled the missing financial ratios with the median value from the rest of the stocks. For example, if the profit margin was missing for company XYZ, and the median value of the rest of the stocks was 10%, I set XYZ’s profit margin to 10%.

I had to preprocess the data this way because the DNN can’t process inputs with missing data. At the same time, I didn’t want to lose the information that a ratio was missing, which is why I created the new metrics. This way, if stocks that are missing certain financial ratios behave in some common way, the DNN should be able to learn how to deal with them.
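The two preprocessing steps can be sketched as follows. This is an illustration rather than my actual pipeline: the data layout (one dict per stock, with `None` for a missing ratio), the `_missing` suffix and the sample values are all assumptions.

```python
from statistics import median


def add_flags_and_impute(rows, ratio_names):
    """rows: list of dicts mapping ratio name -> float, or None if missing."""
    # Median of each ratio over the stocks that actually report it.
    medians = {}
    for name in ratio_names:
        present = [row[name] for row in rows if row[name] is not None]
        # Falling back to 0.0 when nobody reports a ratio is an assumption.
        medians[name] = median(present) if present else 0.0

    processed = []
    for row in rows:
        new_row = {}
        for name in ratio_names:
            missing = row[name] is None
            new_row[name] = medians[name] if missing else row[name]
            new_row[name + "_missing"] = 1.0 if missing else 0.0
        processed.append(new_row)
    return processed


# Made-up sample: the last stock plays the role of company XYZ.
rows = [
    {"profit_margin": 0.05, "ev_to_ebit": 12.0},
    {"profit_margin": 0.10, "ev_to_ebit": None},
    {"profit_margin": 0.15, "ev_to_ebit": 8.0},
    {"profit_margin": None, "ev_to_ebit": 10.0},
]
processed = add_flags_and_impute(rows, ["profit_margin", "ev_to_ebit"])
print(processed[3])
```

Each stock ends up with twice as many inputs as before: the ratio itself (filled in where necessary) plus a flag saying whether it was originally missing.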

After I had preprocessed the data, I was left with roughly 25,000 input/output pairs. I fed this data through various DNN models for training. There are countless ways to configure a DNN model, from choosing the number of hidden layers to choosing the training algorithm. I tried many different combinations of these configurations to get the best fit.

Training a model involves the following: First, we split the data into ‘train’ and ‘test’ data sets to detect whether we are “overfitting” a model (I’ll explain overfitting later). Then, we feed the ‘train’ data into the model so that it finds relationships between the input and output within the ‘train’ data set.
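The split itself can be sketched in a few lines. The 20% test fraction, the random shuffle and the fixed seed are assumptions chosen for illustration, and the `pairs` below are placeholder data rather than real ratios and returns.

```python
import random


def train_test_split(pairs, test_fraction=0.2, seed=0):
    """Shuffle the input/output pairs and hold out a fraction for testing."""
    shuffled = list(pairs)  # copy so the caller's list is left untouched
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Placeholder pairs: (list of inputs, output).
pairs = [([float(i)], float(i) * 0.01) for i in range(25)]
train, test = train_test_split(pairs)
print(len(train), "train pairs,", len(test), "test pairs")
```

The model only ever sees the ‘train’ pairs during training, so its performance on the ‘test’ pairs shows how well it handles data it hasn’t seen before.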

Once training is done, we plot the actual stock performance vs. the performance predicted by the trained model. We do this for both the ‘train’ and ‘test’ data sets.

Before I began training any models, I expected to find a weak relationship between financial ratios and stock performance. But is there such a relationship? I will discuss the results from my DNN models in Part 2 next week.