
[amibroker] CSI data with Amibroker (good news) - a lengthy update



Group:

I have received a few private emails with questions from
some Amibroker users interested in CSI data. Since others
may be interested, I am wrapping my answers up into this
long post.

-------------------
SUMMARY:
-------------------

Working with CSI data in AB has significant benefits ---
inactive data (doubles the number of stocks for testing and
removes the danger of "survivorship" bias in testing),
longer data history than most other providers, high speed
testing (since the data is stored in the fastest possible
format, namely Amibroker's native database). 

Another benefit, which I personally do not use yet, is
having access to stock data for international markets
such as Britain and Canada.

But using CSI also has costs -- maintaining multiple
databases (CSI's own native database, a large Amibroker
database used for backtesting research, and a smaller
Amibroker database used for trading). Getting the data into
AB has its challenges, which will be discussed in the DETAIL
and WRINKLES sections.

Do the benefits outweigh the costs? That is for each person
to decide.

For me the benefits easily outweigh the costs.

-------------------
DETAILED DISCUSSION
-------------------

For several reasons I have found it best to maintain two
databases for CSI data. One I call CSI_Research which is
used for backtesting. The other is called CSI_trading.

The CSI_Research database is a monster in size. Including
over 10,000 active stocks and another 12,000 inactive
stocks, this database occupies over 1 GB on the hard drive.
It has data from January 1989 to the most recent update
which currently is Nov 18, 2005 since I only update this
database once a month.

My CSI_trading database gets updated daily. CSI is set to
automatically export data to this file as part of its daily
download process. This database does not include inactive
stocks (cuts the export time down considerably) and it only
has a couple years of data (another time saver) since that
is more than enough for my trading indicators to work. This
database has about 12,500 items in it and takes about 185
MB on my hard drive. The CSI_trading database will easily
fit into the memory of my computer so it is a lot faster
for scans etc.

Each day I run the CSI downloader. That takes about 10 to
15 minutes to do its work. I have a DSL modem and a
moderately fast CPU with a fast hard drive. The download
over the DSL takes less than a minute. The CPU runs at
full speed for part of the processing, but then it drops to
1/2 speed because my "fast" 7,200 rpm hard drive is not
fast enough.

Why does this download take so long? Because the CSI
database that sits on the hard drive contains data on about
60,000 items from the US, Canada, and Britain. Apparently
CSI figures it is simpler to maintain 1 standard database
than to maintain hundreds if not thousands of distinct
databases due to the fact CSI subscriptions have options
for countries covered (US, Canada and/or Britain), years
covered (5 years for monthly subscribers, 10 years for
annual subscribers plus optional extensions for earlier
years), for actives and / or inactives, etc. One large
database makes a lot more sense than a hundred or more
distinct ones. Thus, the daily download is handling tens of
thousands of updates. Through a special code users get
access to those portions of the database for which they
have paid subscriptions.

This is one of the reasons why Amibroker cannot use a data
plugin to access the CSI database directly. Another reason
is that the data plugin approach would require CSI to be
running all the time and to keep its 900 MB database in RAM
(or have a terrible speed penalty if it had to access the
hard drive to continually feed AB data during
optimizations). That 900 MB would just be to hold CSI's
compressed database in RAM. Additional RAM would be needed
for Amibroker. So the data plug-in method is out.

In my case, I have access to the standard 10 years of data
for US stocks and indexes (about $30/month or about
$225/year) plus inactive stocks ($25/month or about
$200/year). I have also paid a one time fee to get several
additional years of data (both active and inactive) so my
info covers 1989 to the present.

As part of the daily download, I have set the User
Preferences in CSI to automatically update the
"CSI_trading" database used by Amibroker. As I mentioned,
to keep the size down and the speed up in Amibroker,
preferences are set so only 500 days of data are exported
(about 2 years worth) for active stocks. This export to
Amibroker adds about 15 minutes to the process. So a full
download and export to AB takes a bit less than half an
hour on my computer.

A note about the 7,500 maximum listed on the CSI website.
Apparently this limitation does not apply to CSI's export
to Amibroker. My export to AB as part of the CSI daily
download process exports over 12,000 items. And once a
month I have the CSI export over 25,000 in one shot to the
Amibroker "CSI_Research" database. My guess is the 7,500
maximum is on the books so CSI can punish a user who might
be tempted to use a single subscription to do 5 or 10
exports a day for colleagues on a local network. I have
been using CSI data for nearly three years and so far have
never had a problem due to the 7,500 maximum rule.

When I do an export to my Amibroker "CSI_Research"
database (which I only do once a month), I need to reset
several items on the CSI preferences page: change the path
so it points to the research database, change the number
of days to export from 500 to 4500 (about 17 years of
data), click on the "inactives" option, and make sure the
"proportional" adjustment option is selected. When one
selects "inactives", CSI automatically selects the long
ticker naming formula (Ticker + stockIDnumber). Because the
tickers of inactive stocks can be reused for a new stock, a
number needs to be added to keep each separate in the
Amibroker database. Oh, just to keep things standardized, I
also use this long naming convention for exporting to my
CSI_trading database.
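
To illustrate why the long naming formula matters, here is a
minimal Python sketch. The stock ID numbers are invented, and
plain concatenation of ticker and number is an assumption
about how CSI builds the long name; only the "Ticker +
stockIDnumber" idea comes from the description above.

```python
# Hypothetical records: (ticker, stock_id, description).
# The stock_id values are made up for this illustration.
records = [
    ("C", 1001, "inactive stock that once used ticker C"),
    ("C", 2002, "active stock that reuses ticker C"),
]


def long_name(ticker, stock_id):
    """Build a long symbol as ticker + stock ID (assumed format)."""
    return f"{ticker}{stock_id}"


by_ticker = {}
by_long_name = {}
for ticker, stock_id, description in records:
    by_ticker[ticker] = description                      # second "C" overwrites the first
    by_long_name[long_name(ticker, stock_id)] = description

print(len(by_ticker), "symbol(s) when keyed by ticker alone")
print(len(by_long_name), "symbol(s) when keyed by long name")
```

Keyed by ticker alone, the two stocks collapse into one entry;
keyed by the long name, both histories stay separate.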

So how long does this full export take? Over an hour on my
computer. Now you know why I have two databases for CSI
data.

CSI opens and closes Amibroker in the background when doing
an export to an Amibroker database. That takes a bit longer
to do, but it results in major speed savings when using the
data in Amibroker. CSI creates a temporary csv file to hold
the export data and Amibroker reads that data, translates
it into Amibroker's internal data structure and stores it
in a standard Amibroker database folder. The result is
Amibroker is very fast at accessing CSI data. This results
in major time savings when doing back tests and
optimizations because all the data is in Amibroker's native
format.

Although having everything in Amibroker's native format is
a speed plus, it has a couple of "costs".

The biggest cost is that in certain, predictable
circumstances the CSI export will, by design, completely
delete all the stock files in the targeted Amibroker
database. Why? CSI has to make sure a stock does not get
duplicated in Amibroker's database when that stock changes
its ticker name. From time to time a stock will change its
ticker. For example, when Chrysler got bought out, the "C"
symbol became available and Citibank started using it. Thus
Citibank stock data could end up being in Amibroker's
database twice, once under its older ticker and once under
"C". If this continued over time, backtesting would be
compromised. So to avoid this, the CSI export has to delete
the old stock file in the Amibroker database. From what I
can see, CSI keeps track of such changes on a daily basis
and only deletes individual files as needed. But if I
manually change any item on the CSI preferences page for
exporting to Amibroker, CSI plays it "safe" and deletes
every file - yes every ticker file - in the targeted
Amibroker database listed on the CSI preferences page. That
wipes out all watch lists, studies, etc. The CSI_trading
database gets updated daily so this "nuking" of the entire
database does not happen. But when I update my CSI_Research
database, everything gets automatically "nuked" so the
ground is bare and the exported data will not be
contaminated by data left over due to ticker changes,
differences in history length, etc.

After exporting to my research database, watchlists,
favorites, and other assignments need to be redone. I do
this via a Scan using a special AFL formula that
automatically builds key watchlists, etc. so this is not
painful, but it is not something I would wish to do each
day. This is another reason I keep two databases in
Amibroker for CSI data.

Oh, if I get behind in daily downloads, my daily
CSI_trading database in Amibroker needs to be "nuked". The
reason is the regular daily CSI export to AB only exports 2
days of data: today's and yesterday's. So if I am 5 days
behind, the regular export will only send the most recent
two days of data to Amibroker and 3 days of data will be
missing. The solution is to click on the "Refresh
history..." option on the CSI preferences page. Then the
next daily download (or a manual export via Database/Manual
Database Distribute command) will "nuke" the Amibroker
database and rebuild it from scratch and thus ensure the
history has no holes. Of course, I then have to run the
AFL code that recreates my watchlists and favorite
assignments.
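
The gap arithmetic can be sketched in a few lines of Python.
The function name is hypothetical; the 2-day export window is
the figure described above.

```python
def days_missing(days_behind, export_window=2):
    """Days of history left out when the regular daily export
    only sends the most recent `export_window` days."""
    return max(0, days_behind - export_window)


for behind in (1, 2, 5):
    gap = days_missing(behind)
    action = "refresh history" if gap > 0 else "regular export is enough"
    print(f"{behind} day(s) behind: {gap} missing -> {action}")
```

With 5 days behind and a 2-day window, 3 days are missing,
which is the case where "Refresh history..." is needed.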

The AFL code creates watchlists, identifies index tickers,
and distinguishes between inactive and active stocks. Each
group gets assigned to a distinct watchlist to speed up
back testing and optimizations. The AFL code also detects
"preferred" shares and these get put in a special
watchlist so they can be excluded from back tests.

------------------------------------
WRINKLE ONE - SELECTING PROPORTIONAL
ADJUSTMENT FOR DIVIDENDS
------------------------------------

CSI's exporter for Amibroker is a work in progress which
still has a couple of wrinkles. Wrinkle one is the method
used to back adjust stock prices for dividends. CSI has
three options for this but so far only options 2 and 3 are
implemented for exporting to Amibroker:

1. to ignore dividends (some users apparently want this). I
personally would not use this, so it does not bother me
that it is not implemented as an option for exporting to
Amibroker.

2. to use "additive" adjustment (which involves subtracting
the exact total of dividends paid), but this can result in
early prices for high dividend stocks being recorded with
negative prices. Negative prices can really mess up profit
and loss calculations, so I do not like this option at
all. For some reason, this is the default setting in the
most current CSI program (version 2.90).

3. to use "proportional" adjustment (which operates in a
similar manner to how stock splits are handled). The prices
prior to a dividend are adjusted downward on a pro-rated
basis so that prices never go negative. This is the one I
use.
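
A small worked example of options 2 and 3 in Python. This is
only a sketch: the prices and dividends are invented and
exaggerated to make the effect visible, and the proportional
factor (1 - dividend / prior close) is one common formulation,
not necessarily the exact formula CSI uses.

```python
def additive_adjust(closes, dividends):
    """Option 2: subtract the total of all later dividends from each close."""
    adjusted = [0.0] * len(closes)
    offset = 0.0
    for i in range(len(closes) - 1, -1, -1):
        adjusted[i] = closes[i] - offset
        offset += dividends[i]          # affects only bars before this ex-date
    return adjusted


def proportional_adjust(closes, dividends):
    """Option 3: scale earlier closes by (1 - dividend / prior close)."""
    adjusted = [0.0] * len(closes)
    factor = 1.0
    for i in range(len(closes) - 1, -1, -1):
        adjusted[i] = closes[i] * factor
        if dividends[i] > 0 and i > 0:
            factor *= 1.0 - dividends[i] / closes[i - 1]
    return adjusted


# Invented data: oldest bar first; dividends[i] goes ex on bar i.
closes = [1.00, 5.00, 20.00]
dividends = [0.00, 0.60, 0.80]

additive = additive_adjust(closes, dividends)
proportional = proportional_adjust(closes, dividends)
print("additive:    ", additive)
print("proportional:", proportional)
```

In this made-up series the additive method pushes the oldest
price below zero, while the proportional method keeps every
price positive as long as each dividend is smaller than the
prior close.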

So what is the wrinkle if option 3 is the way to go? Well,
there is a bug in version 2.90 which prevents a user from
using the CSI preferences page to change from option 2 to
option 3. Until version 2.91 appears, one has to make this
change manually by using a text editor to add a line to the
"unfair.ini" file in the UA folder. Thankfully, Steve at
CSI walked me through the process of changing the
"unfair.ini" file. I really appreciated this level of
support.

---------------------------------
WRINKLE TWO - GETTING MORE THAN 2 
DECIMALS OF PRECISION
---------------------------------

CSI data has 6 significant digits of precision. Amibroker
can accept 7 digits of precision. However, just a couple
days ago I became aware there is a bug somewhere that
results in only 2 decimal places of precision in
Amibroker's database for CSI data.

I am preparing a bug report for CSI which I hope to submit
later today. 

I am sure this bug will get fixed by CSI, but there is an
effective workaround until then.

Why does it matter? Well, much of the time 2 decimals of
precision is more than enough. However, the problem shows
up when a stock like CSCO has split a dozen times so that
its early price history (when it was trading for $25 in the
early 1990s) gets 0.08 as a split-adjusted price. With a
0.08 price and only 2 decimals of precision, the smallest
possible move is 0.01, or 12.5% of the price. The price can
only step from 0.08 to 0.07 or 0.09, and a 12.5% decline in
a single day could trigger a stop-loss exit. Plus it would
mess up profit and loss calculations.

But this affects only a minority of stocks and then mainly
only for the early portion of their history. Plus I have a
workaround:

// Flags bars whose price is so low that a 0.01 step is a
// jump of roughly 5% or more. Use 0.50 instead of 0.20 to
// flag stocks with price jumps of 2% or more.
FilterForOutProblemData = close < 0.20;
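
The 12.5%, 5% and 2% figures follow from dividing the 0.01
step by the price. A short Python check, using the decimal
module to avoid floating-point noise (the function names here
are hypothetical, not part of CSI or Amibroker):

```python
from decimal import Decimal

TICK = Decimal("0.01")   # smallest step when only 2 decimals are stored


def step_percent(price):
    """Smallest possible price move as a percentage of the price."""
    return TICK / Decimal(price) * 100


def cutoff_price(max_step_percent):
    """Price at which a one-tick move equals max_step_percent."""
    return TICK * 100 / Decimal(max_step_percent)


for price in ("0.08", "0.20", "0.50", "25.00"):
    print(price, "->", step_percent(price), "% per tick")

for pct in ("5", "2"):
    print("max", pct, "% step -> exclude prices below", cutoff_price(pct))
```

Prices below the cutoff move in steps larger than the chosen
percentage, which is why the filter compares the close with
0.20 or 0.50.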

---------------------------------
SUMMARY:
---------------------------------

Working with CSI data in AB has significant benefits ---
Inactive data (doubles the number of stocks for testing),
long history length (1989 for me, but I could buy more
history if I need it), high speed testing (since the data
is stored in the fastest possible format, namely
Amibroker's native database). Another benefit, which I
personally do not use yet, is having access to stock data
for Canada and Britain.

But using CSI also has costs -- maintaining multiple
databases (CSI's own native database, a large Amibroker
database used for backtesting research, and a smaller
Amibroker database used for live trading decisions).
Getting the data into AB has its challenges.

Do the benefits outweigh the costs? That is for each person
to decide. For me the benefits easily outweigh the costs.

b

