I really wanted to write up a post discussing my stock screener app and clear my head, basically.
The application will do four things: 1) import third-party fundamental and daily data for U.S. equities and aggregate it into a historical time-series data set, 2) screen point-in-time market data against user-specified criteria to deliver a basket of stocks that pass the selection criteria, 3) simulate portfolios that buy and hold the selected stocks for user-specified holding periods and position sizes, and 4) calculate the dollar returns of a simulated portfolio.
These are the as-yet-identified phases for development:
There are two types of market data: daily data and fundamental data. Daily data typically comprises the opening and closing prices of a stock ticker, as well as the volume, daily high, and daily low. Fundamental data is available either quarterly or annually, and comprises descriptive information such as earnings per share, total liabilities, etc.
The first challenge will be to locate freely available data sets… I have already found a couple of free ones. Two paid data sets are the AAII Stock Investor Pro database and the Standard & Poor’s Compustat database, which the application will eventually support once it is further along.
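To make the two kinds of data concrete, here is a minimal sketch of how one record of each might look in Python. The class and field names (`DailyBar`, `FundamentalRecord`, `period`, `values`) are my own placeholders for illustration, not a settled design.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class DailyBar:
    """One trading day for one ticker (hypothetical layout)."""
    ticker: str
    day: date
    open: float
    high: float
    low: float
    close: float
    volume: int


@dataclass
class FundamentalRecord:
    """One quarterly or annual report for one ticker (hypothetical layout)."""
    ticker: str
    period_end: date
    period: str  # "Q" for quarterly, "A" for annual
    # Free-form mapping so new fundamental fields need no schema change,
    # e.g. {"eps": 1.52, "total_liabilities": 3.1e9}
    values: dict = field(default_factory=dict)
```

Keeping the fundamentals in a free-form mapping is just one option; it trades type safety for the ability to hold hundreds of fields without declaring each one.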
A later post will identify the free data sets I have found and plan on implementing.
Obviously, this phase is very important for the project.
The challenge here is how to bring together various sources of market data into one contiguous time series of complete market data… So far, the solution is to create my own .pztmdb (pztrick-market-database) file that can aggregate the various sources. The database will be exhaustive, with probably hundreds or thousands of fields for every conceivable kind of market data, though obviously no single data source will be sufficient to fill in all of them.
So to begin, phase 2.1 simply comprises the creation/definition of a new .pztmdb file type that my application will use, as well as GUI options for creating, loading, and saving the session’s .pztmdb file.
To make the application as extensible as possible, each data source will require a unique plug-in that tells the application how to save the data to the pztmdb database… Upon importing, all data from the source is saved to the pztmdb.
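A plug-in of the kind described above could be as small as a class that turns a source file into a stream of (ticker, date, field, value) observations. The sketch below is only an assumption about the shape of that interface: `Observation`, `DataSourcePlugin`, `ExampleDailyCsvPlugin`, and the CSV column names are all made up for illustration.

```python
import csv
from abc import ABC, abstractmethod
from typing import Iterable, Iterator, NamedTuple


class Observation(NamedTuple):
    """A single value destined for one field of the database."""
    ticker: str
    day: str      # ISO date string, e.g. "2010-01-04"
    field: str
    value: float


class DataSourcePlugin(ABC):
    """Hypothetical base class: one subclass per data source."""
    name: str = "unnamed"

    @abstractmethod
    def read(self, lines: Iterable[str]) -> Iterator[Observation]:
        """Yield observations from an open file or any iterable of text lines."""


class ExampleDailyCsvPlugin(DataSourcePlugin):
    """Reads a made-up CSV layout: ticker,date,open,high,low,close,volume."""
    name = "example-daily-csv"
    FIELDS = ("open", "high", "low", "close", "volume")

    def read(self, lines):
        for row in csv.DictReader(lines):
            for field in self.FIELDS:
                yield Observation(row["ticker"], row["date"], field, float(row[field]))
```

With this shape the application never needs to know a source's file layout; it only consumes `Observation` tuples and decides where to store them.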
If data already exists for a given field, the user will be able to select which data to keep. Each field in pztmdb will retain meta-data that specifies the date it was modified, and the name of the data set that provided the value, so that users will be able to prioritize which data sets are most accurate… Meta data in the header of the pztmdb file will allow users to order, for example, the S&P Compustat data set higher than the SketchyDataSets.com data set.
Of course, if data does not already exist for a given field, it is simply imported. It is therefore possible to brute-force import as many data sets as you want, from as many sources as have plug-ins written for them, and only the data from the most trusted source will be kept for each field.
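The import rule described above might look something like the following. This is a sketch under my own assumptions: the database is a plain dict keyed by (ticker, date, field), the preference list is ordered most trusted first, and conflicts are resolved automatically from that list rather than by prompting the user. `Cell` and `merge` are hypothetical names.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Cell:
    """A stored value plus the meta-data kept for every field."""
    value: float
    source: str         # name of the data set that provided the value
    modified: datetime  # when the value was last written


def merge(db, key, value, source, priority):
    """Store value under key unless a more trusted source already filled it.

    priority is a list of source names, most trusted first; unknown
    sources rank below every listed one. Returns True if db changed.
    """
    rank = {name: i for i, name in enumerate(priority)}
    worst = len(rank)
    existing = db.get(key)
    if existing is None or rank.get(source, worst) < rank.get(existing.source, worst):
        db[key] = Cell(value, source, datetime.now(timezone.utc))
        return True
    return False  # existing value came from an equally or more trusted source
```

Because equal ranks keep the existing value, re-importing the same data set twice leaves the database untouched.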
Ideally, I will be able to develop the application using less-accurate freely-available data sets found on Google, and eventually upgrade to AAII SIP or Compustat data sets that will be more useful for testing portfolio strategies.
To be clear, phase 2.2 comprises writing plug-ins that communicate how to import data sets for fundamental data and daily data into the .pztmdb.
Phase 2.3 comprises coding the GUI for browsing and importing data set files, and specifying which plug-in they use…
Phase 2.4 comprises code to view the raw spreadsheet for the pztmdb file, with possible features to view the meta-data for a given field, or to edit a given field manually (if you know a data set provides a wrong value, for example). Also, a user should be able to adjust the ordered list of preferred data sets as previously discussed, which is saved in the pztmdb header meta-data.
For any given date in time, users will be able to specify criteria, such as market capitalization greater than $100 million, and the stock screener will return a list of stocks that pass all the specified criteria. If the data set is incomplete for the specific criteria, the screen will fail and tell the user as much.
Installed plug-ins will define what criteria can be specified. A standard set will come included with the application for such typical criteria as P/E ratios or moving averages. Advanced users will be able to write their own plug-ins for themselves or to share with others.
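As a rough illustration of the screener, each criterion can be a small function that looks at one stock's point-in-time data and answers yes or no. Everything named here (`IncompleteDataError`, `greater_than`, `less_than`, `screen`, the field names) is hypothetical, and the snapshot is assumed to be a dict of ticker to field values for a single date.

```python
class IncompleteDataError(Exception):
    """Raised when a criterion needs a field the data set does not have."""


def greater_than(field, threshold):
    """Build a criterion that passes when row[field] > threshold."""
    def criterion(row):
        if row.get(field) is None:
            raise IncompleteDataError(f"missing field {field!r}")
        return row[field] > threshold
    return criterion


def less_than(field, threshold):
    """Build a criterion that passes when row[field] < threshold."""
    def criterion(row):
        if row.get(field) is None:
            raise IncompleteDataError(f"missing field {field!r}")
        return row[field] < threshold
    return criterion


def screen(snapshot, criteria):
    """Return the sorted tickers whose data pass every criterion.

    Every criterion is evaluated for every ticker (no short-circuit),
    so a missing field always fails the screen instead of being skipped.
    """
    return sorted(
        ticker for ticker, row in snapshot.items()
        if all([check(row) for check in criteria])
    )


# Example: market cap above $100 million and P/E below 15.
snapshot = {
    "AAA": {"market_cap": 250e6, "pe": 12.0},
    "BBB": {"market_cap": 50e6, "pe": 9.0},
    "CCC": {"market_cap": 900e6, "pe": 30.0},
}
basket = screen(snapshot, [greater_than("market_cap", 100e6), less_than("pe", 15)])
```

A user-written plug-in would then only need to supply another function with the same one-argument shape.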
Finally, users will be able to save stock screen criteria as files for later use.
This feature is crucial for the portfolio simulator (backtesting).
In addition to screening stocks, the portfolio tool will enable users to configure, for each stock purchase, the position size (in fake money) and the holding length (in weeks, months, years, etc.), as well as how frequently to adjust the portfolio holdings (weekly, monthly, quarterly).
Portfolios can be backtested against the loaded pztmdb file in the current session, and the application will output the $ performance of the portfolio over a specified interval of time.
Whereas the stock screener tool merely returns the basket of stock tickers that fulfill certain criteria, the portfolio tool will return the performance of that same basket of stocks, adjusted at regular intervals, over a long period of time. This is called backtesting.
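A stripped-down backtest loop might look like this. It is a sketch with several simplifying assumptions of mine: the whole portfolio is sold and re-bought in equal dollar amounts at every adjustment date, there are no commissions or dividends, fractional shares are allowed, and the last date is used only to value the final holdings. The names `backtest`, `prices`, and `pick` are placeholders.

```python
def backtest(prices, dates, pick, starting_cash):
    """Return the portfolio's dollar value at each date in dates.

    prices: {date: {ticker: closing price}}
    pick:   function(date) -> list of tickers passing the screen that day
    """
    cash = float(starting_cash)
    holdings = {}  # ticker -> number of shares
    values = []
    for day in dates:
        # Sell everything at today's close and record the portfolio value.
        cash += sum(shares * prices[day][t] for t, shares in holdings.items())
        holdings = {}
        values.append(cash)
        # Re-buy the screened basket in equal dollar amounts,
        # except on the final date, which is valuation only.
        basket = pick(day)
        if basket and day != dates[-1]:
            stake = cash / len(basket)
            holdings = {t: stake / prices[day][t] for t in basket}
            cash = 0.0
    return values


prices = {
    "2010-01": {"A": 10.0, "B": 20.0},
    "2010-02": {"A": 12.0, "B": 20.0},
    "2010-03": {"A": 12.0, "B": 30.0},
}
pick = lambda day: ["A"] if day == "2010-01" else ["B"]
values = backtest(prices, ["2010-01", "2010-02", "2010-03"], pick, 1000)
dollar_return = values[-1] - values[0]
```

Per-stock holding periods and custom position sizes would replace the "sell everything, split equally" step in a fuller version.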
Portfolios can be saved for later use.
One additional feature I would like to implement for portfolios is a binary “in-or-out” condition. For example, one condition may be written such that the DJIA has not fallen 15% in the trailing three months. For each week that this condition is false, the portfolio liquidates all holdings to cash until the condition is true again. This is distinct from the stock screener in that the criteria for the screener are individual to a single stock, whereas this condition concerns the broader market.
Finally, with the application fully developed, I’ll be able to take my beliefs about the stock market (e.g. value investing) and begin crafting a portfolio strategy that fulfills these beliefs, and see what kind of performance I can expect.
These kinds of tools already exist for people with deep enough pocketbooks, but I do want the tool for my personal investing and the experience of programming it; a win-win project. And who knows, perhaps a freeware or retail release could be worthwhile when all is said and done.
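The DJIA example could be checked with a few lines like these. The sketch assumes weekly index closes in chronological order, reads "fallen 15% in the trailing three months" as a point-to-point change over the last 13 weeks, and treats too little history as "in"; all three are my assumptions, and `market_is_in` is a hypothetical name.

```python
def market_is_in(index_closes, lookback=13, max_drop=0.15):
    """True while the index has NOT fallen by max_drop over the lookback.

    index_closes: chronological list of weekly closes, latest last.
    lookback:     number of weeks to look back (13 is roughly three months).
    """
    if len(index_closes) <= lookback:
        return True  # assumption: stay invested until there is enough history
    past = index_closes[-1 - lookback]
    now = index_closes[-1]
    return (now - past) / past > -max_drop
```

A backtest loop would call this once per week and hold cash in any week where it returns False, stepping back into the screened basket once it returns True again.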