Re: Correlation



Hello Paul,

I've never run PPMCCs on stock/index data, but if I did I'd plug in
the daily closing price of each and run the program.  If I had my
druthers, I'd plug in, say, 25 stocks and run all the inter-correlations
and/or do a multiple regression in an attempt to predict the index, for
example.  You might also try lagging one or all of the stocks to see if
there is any forecasting value (e.g., how well does the correlation
coefficient between GE today and the SP tomorrow predict tomorrow's SP
behavior?).
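
As a rough sketch of the above (not something I've run on real market
data): a plain-Python Pearson r plus a lagged version that pairs the
stock on day t with the index on day t + lag.  The function names
pearson_r and lagged_r and the prices are made up for illustration.
The same functions work unchanged if you feed them daily % returns
instead of closes.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences of at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    cov = sum(a * b for a, b in zip(dx, dy))
    var_x = sum(a * a for a in dx)
    var_y = sum(b * b for b in dy)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)

def lagged_r(predictor, target, lag=1):
    """Correlate predictor[t] with target[t + lag] (lag in trading days)."""
    if lag < 0:
        raise ValueError("lag must be >= 0")
    if lag == 0:
        return pearson_r(predictor, target)
    # Drop the last `lag` predictor values and the first `lag` target values
    # so that each remaining pair is (today's predictor, later target).
    return pearson_r(predictor[:-lag], target[lag:])

# Made-up closing prices, for illustration only (not real quotes).
stock = [10.0, 10.5, 10.2, 10.8, 11.0, 10.9]
index = [100.0, 101.0, 103.5, 102.0, 105.0, 106.0]

print("same-day r:", round(pearson_r(stock, index), 3))
print("stock today vs index tomorrow:", round(lagged_r(stock, index, lag=1), 3))
```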

One thing to be aware of is that people often put undue importance on
the statistical significance of such stats.  Significance matters, of
course, but it does not guarantee much in the way of predictive
ability.  If you don't see r's greater than .50 (a somewhat arbitrary
number), you have little practical predictive power.
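
To put numbers on that, here is a small sketch using the textbook
t-statistic for testing r = 0, t = r * sqrt(n - 2) / sqrt(1 - r^2),
next to r squared (the share of variance explained).  The test assumes
independent, roughly normal observations, which daily market data may
not satisfy, so treat it as illustrative.  With 252 days, an r of .15
gives a t of about 2.4, enough to pass the usual 5% cutoff, yet it
explains only a little over 2% of the variance.  An r of .50 explains
25%.  The function names are my own.

```python
from math import sqrt

def r_squared(r):
    """Share of variance in one series explained by the other."""
    return r * r

def t_statistic(r, n):
    """t-statistic for testing r == 0 with n paired observations."""
    if n < 3:
        raise ValueError("need at least 3 observations")
    if not -1 < r < 1:
        raise ValueError("r must be strictly between -1 and 1")
    return r * sqrt(n - 2) / sqrt(1 - r * r)

n = 252  # roughly one year of trading days
for r in (0.15, 0.50):
    print(f"r={r:.2f}  t={t_statistic(r, n):.2f}  "
          f"variance explained={r_squared(r):.1%}")
```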

Consider also that 252 days is not enough data.  If you can, try for
more, especially if you start doing cross-correlations between a number
of stocks.  And/or separate the data into 2 or 3 sections and run the
stat in each of them to see if the relationship is stable.
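
A minimal sketch of that split-sample check, assuming two equal-length
lists of prices (or returns).  split_sample_r is a hypothetical helper
name and the data is a toy example, deliberately built so that the
full-sample r hides two opposite relationships.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

def split_sample_r(xs, ys, parts=3):
    """Return r computed separately on `parts` consecutive sections."""
    if len(xs) != len(ys):
        raise ValueError("series must be the same length")
    if parts < 1:
        raise ValueError("parts must be >= 1")
    size = len(xs) // parts
    if size < 2:
        raise ValueError("each section needs at least 2 observations")
    results = []
    for i in range(parts):
        start = i * size
        # The last section absorbs any leftover observations.
        end = len(xs) if i == parts - 1 else start + size
        results.append(pearson_r(xs[start:end], ys[start:end]))
    return results

# Toy data: the two series move together in the first half and in
# opposite directions in the second half.
a = [1, 2, 3, 4, 5, 6, 7, 8]
b = [1, 2, 3, 4, 4, 3, 2, 1]

print("full sample:", pearson_r(a, b))           # 0.0 for this toy data
print("by half:", split_sample_r(a, b, parts=2))  # +1 then -1
```

If the section values differ a lot from each other, the full-sample r
is not telling you much about what to expect going forward.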

Finally, look for stocks with high negative correlation to see if
there is a "pairs" strategy in there for hedging.
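
One way to scan a basket for such pairs, sketched under the assumption
that you have a dict mapping each symbol to an equal-length list of
prices or returns.  most_negative_pairs, the -0.5 cutoff, and the
symbols "a", "b", "c" are all placeholders, not a recommendation.

```python
from itertools import combinations
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

def most_negative_pairs(series, threshold=-0.5):
    """Return (symbol_1, symbol_2, r) for every pair with r <= threshold,
    most negative first."""
    hits = []
    for s1, s2 in combinations(sorted(series), 2):
        r = pearson_r(series[s1], series[s2])
        if r <= threshold:
            hits.append((s1, s2, r))
    return sorted(hits, key=lambda item: item[2])

# Toy series, for illustration only (not real quotes).
basket = {
    "a": [1, 2, 3, 4],
    "b": [4, 3, 2, 1],
    "c": [1, 2, 3, 5],
}

for s1, s2, r in most_negative_pairs(basket):
    print(f"{s1} vs {s2}: r = {r:.3f}")
```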

Wednesday, February 06, 2002, 2:23:16 AM, you wrote:

PA> I'd like to assemble a basket of stocks that correlate strongly with 
PA> indexes, say for the last year.  So I figure I'll search for stocks that 
PA> have a high "Pearson product moment correlation coefficient" (phew, copied 
PA> that from Excel, glad I didn't have to say it out loud) with the 
PA> index.  But since I don't truly grasp the details of the formula (after 
PA> staring at it for quite a while), here's my question:

PA> What data do I enter into the formula, which compares two identically-sized 
PA> arrays of values?  I figure I'll have two arrays of data, each 252 days 
PA> long.  But if I take the % return: over what period of time should the % 
PA> return be calculated (daily?, weekly?, monthly?).   And doesn't the % 
PA> return have large distortions, when the values become very small?  After 
PA> all, 0.00001% is mathematically about a zillion times bigger than 
PA> 0.0000000000000001%, but in the real world they're both ~ 0.0  .

PA> Is there a "right" way to do a correlation study, that addresses these issues?

PA> Thanks for any comments.

PA>       Paul



-- 
Best regards,
 Jim                            mailto:jejohn@xxxxxxxxxxxxxxxx