The end (of the first phase) is in sight. I started this project very myopically, writing code with only small, short-term goals in mind, but this morning I find great joy in having the rest of the code worked out in my mind (and on paper) (and in MS Word). I know exactly what is left to implement; nothing is uncertain.
To date, Midas Data Miner coding efforts have focused on mining and storing raw data, which lives in a BTree I have aptly named “primary”.
I remain pleased with my ZODB implementation.
Since my last post, I have successfully downloaded all the market data I could find on Yahoo! Finance. For each specified ticker, the program downloads all available market data from Yahoo’s servers and saves it into a BTree in my ZODB, keyed by the tuple (date, ticker).
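As a rough illustration of the layout described above, here is a minimal sketch. The helper names (`store_rows`, `open_primary`), the shape of the per-day record, the sample values, and the file name `midas.fs` are all assumptions made for illustration, not the actual Midas code. The ZODB pieces it relies on (`FileStorage`, `DB`, `OOBTree`) are the standard ones.

```python
import datetime


def store_rows(tree, ticker, rows):
    """Insert one ticker's daily rows into a mapping keyed by (date, ticker).

    `rows` is an iterable of (date, fields) pairs; `fields` is whatever
    per-day record the downloader produced (hypothetical shape here).
    Returns the number of rows written.
    """
    count = 0
    for day, fields in rows:
        tree[(day, ticker)] = fields
        count += 1
    return count


def open_primary(path="midas.fs"):
    """Open a ZODB FileStorage and return (db, connection, primary tree).

    ZODB is imported inside the function so that store_rows can be tried
    with a plain dict on a machine where ZODB is not installed.
    """
    from ZODB import DB
    from ZODB.FileStorage import FileStorage
    from BTrees.OOBTree import OOBTree

    db = DB(FileStorage(path))
    conn = db.open()
    root = conn.root()
    if "primary" not in root:
        root["primary"] = OOBTree()  # created once, on first run
    return db, conn, root["primary"]


# Hypothetical sample data standing in for one downloaded ticker.
sample = [
    (datetime.date(2008, 1, 2), {"close": 10.0, "volume": 1000}),
    (datetime.date(2008, 1, 3), {"close": 10.5, "volume": 1200}),
]

# Usage with ZODB installed (the ticker symbol is just an example):
#   import transaction
#   db, conn, primary = open_primary()
#   store_rows(primary, "IBM", sample)
#   transaction.commit()
#   db.close()
```

Because `store_rows` only assigns into a mapping, the same function works against a plain `dict` for quick experiments and against the persistent tree for the real run; nothing is written to disk until the transaction is committed.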
The download took 7 hours, and comprised 8 years of market data for all my tickers.
I’m not sure whether everything is packed optimally, but the resulting ZODB file is sixteen gigabytes.
So while sequentially un-pickling and re-pickling one market day of information at a time between my hard drive and my RAM did conserve memory, there was a better solution to be found.
I am now using the Zope Object Database (ZODB) and its OOBTree class. ZODB is much slicker than my home-grown pickling scheme… and BTrees are a much quicker data structure for what I am trying to do (I was using mere dictionaries before).
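One likely reason an ordered tree suits this data better than a dictionary is that keys kept in sorted order let a whole day, or a range of days, be pulled out without scanning every entry. The sketch below illustrates that idea only: it uses a sorted list and the standard `bisect` module as a stand-in for the tree, and the function name `keys_in_range` is hypothetical. With an actual OOBTree, the analogous query would be a bounded call such as `tree.keys(min, max)`.

```python
import bisect
import datetime


def keys_in_range(sorted_keys, first_day, last_day):
    """Return the (date, ticker) keys with first_day <= date <= last_day.

    sorted_keys must already be sorted; tuples sort by date, then ticker.
    """
    # The 1-tuple (first_day,) sorts before every (first_day, ticker) key.
    lo = bisect.bisect_left(sorted_keys, (first_day,))
    # The day after last_day bounds the range from above.
    next_day = last_day + datetime.timedelta(days=1)
    hi = bisect.bisect_left(sorted_keys, (next_day,))
    return sorted_keys[lo:hi]


# Hypothetical keys for two made-up tickers.
keys = sorted([
    (datetime.date(2008, 1, 4), "AAA"),
    (datetime.date(2008, 1, 2), "BBB"),
    (datetime.date(2008, 1, 2), "AAA"),
    (datetime.date(2008, 1, 3), "AAA"),
])

two_days = keys_in_range(keys, datetime.date(2008, 1, 2), datetime.date(2008, 1, 3))
```

A dictionary can answer "give me this exact (date, ticker)" quickly, but it has no notion of neighboring keys, so a question like "everything from these two days" means visiting every entry.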