
RE: Tick data - data scrubbing




A couple responses have come back defending sterile-clean data.

Based on my experience with at least 4 different data vendors over the years
and running systems on each one, I've been forced to draw these conclusions.

Most systems (even simple ones) are typically affected by even small
differences in the data.  The same system run on, say, DTN's data will yield
different signals and results than on TRAD's or Quote.com's.  In some cases
the differences will be marginal, and in others they will be
significant.

Most people don't have a tick scrubbing algorithm built into TradeStation
unless they've custom hacked into TradeStation and put one either before
data gets to TradeStation (bad idea) or between the charting module and the
server (good idea).  So to say that filtering is part of Stage 2 of the
process implies that TradeStation users are not using sound system testing
and trading methods, because they don't have a tick filter, which everyone
who responded agreed is needed.

Thus, for 99.99999% of the people out there using TradeStation, we either
have to work around the problem (maintain your own data, use market-only
signals, or use other creative techniques) or recognize that in some cases
(stocks more than futures), bad ticks are a way of life we have to deal
with.

Recognizing that different data streams and data quality produce
different results (which is probably why so many traders are adamant about
capturing every last tick published by the exchange, and why they avoid
feeds like BMI and insist on the fastest serial ports and so on), it stands
to reason that merged, cajoled and otherwise sterile-clean data, unlike
anything your TradeStation will ever see, will yield results that are
hard if not impossible to duplicate using TradeStation and your real-time
feed.

There is definitely value in historical data, but I think the value is
reduced when someone else cleans it.  I would rather get the raw data, apply
my own filters, and then hook that same filter up to my trading platform so
my real-time data matches my historical data as much as possible.
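
To make the "same filter on both sides" idea concrete, here is a rough
sketch in Python rather than anything TradeStation-specific.  It assumes
the simple rule from my earlier message (reject a tick more than 2% away
from the previous good tick) and positive prices.  The names TickFilter,
scrub and max_consecutive_rejects are made up for illustration; the reset
after a few consecutive rejects is my own assumption, added so a genuine
gap does not lock the filter out forever.

```python
from typing import Iterable, Iterator, Optional


class TickFilter:
    """Stateful bad-tick filter (illustrative, not a vendor algorithm).

    Rejects a price that is more than `max_jump` (as a fraction) away from
    the last accepted price. Assumes prices are positive.
    """

    def __init__(self, max_jump: float = 0.02,
                 max_consecutive_rejects: int = 3) -> None:
        self.max_jump = max_jump
        # Assumption: after this many rejects in a row, treat the new
        # price level as real and accept it.
        self.max_consecutive_rejects = max_consecutive_rejects
        self.last_good: Optional[float] = None
        self.rejects = 0

    def accept(self, price: float) -> bool:
        """Return True if the tick should be kept, False if it is dropped."""
        if self.last_good is None:
            # First tick has nothing to compare against, so keep it.
            self.last_good = price
            return True
        jump = abs(price - self.last_good) / self.last_good
        if jump > self.max_jump and self.rejects < self.max_consecutive_rejects:
            self.rejects += 1
            return False
        self.last_good = price
        self.rejects = 0
        return True


def scrub(prices: Iterable[float], tick_filter: TickFilter) -> Iterator[float]:
    """Run a historical price series through the filter, one tick at a time."""
    for price in prices:
        if tick_filter.accept(price):
            yield price


# Historical use: scrub prices read from an ASCII file.
clean = list(scrub([100.0, 100.1, 150.0, 100.2, 100.3], TickFilter()))
```

Because the filter keeps its own state and looks at one tick at a time,
the same accept() call could sit in front of a real-time feed, so live
and historical data pass through identical logic.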

Brian.

-----Original Message-----
From: Bob Fulks [mailto:bfulks@xxxxxxxxxxxx]
Sent: Friday, March 15, 2002 6:37 AM
To: Brian
Cc: List, Omega
Subject: RE: Tick data - data scrubbing


At 11:24 PM -0800 3/14/02, Brian wrote:

>You know I was out at a site that sold tick data and was reading
>about how they scrub their data from head to toe. This makes no sense
>to me...

<snip>

>A far better solution, IMO, is to maintain your own data from your
>vendor or the raw data, apply some general tick cleaning scrubbers
>(like ticks that fall 2% outside previous tick and so on) to an ASCII
>data file, and use that to do your backtests.


Consider trading system development as a two-stage process:

   Stage 1: Find some market characteristic that is tradable

   Stage 2: Convert this to a robust, tradable system.

During Stage 1 you need to test lots of things with minimum effort so
you need clean data and "prototype" trading system code - lots of
simple code segments, pre-canned functions, etc.

During Stage 2 you need to streamline the code, add bad-tick filters,
error checking, adaptive parameters, code to handle special
exceptions, etc., to make the system work with what it will actually
encounter in real trading with real data.

Any experienced software developer will tell you that the two cases
are very, very different...

Bob Fulks