Data Leakage on Time Series

What are the recommended strategies to combat data leakage on time series OHLC data?

I wonder whether there is an industry-standard solution to prevent training from being exposed to information it shouldn't see. I have heard of using cross-validators in pipelines, but other people say this is a risky strategy.
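For reference, one commonly suggested option is walk-forward (expanding-window) validation, where every validation block lies strictly after the data it was trained on. Below is a minimal sketch in plain Python, assuming the bars are ordered oldest to newest. The function name `walk_forward_splits` and its `gap` parameter are hypothetical names chosen for illustration; scikit-learn's `TimeSeriesSplit` provides a comparable splitter.

```python
def walk_forward_splits(n_samples, n_splits, test_size, gap=0):
    """Yield (train_idx, test_idx) lists; each test block follows its train block.

    gap: number of bars dropped between train and test (for example the
    label horizon), so labels built from future bars do not reach into
    the test block.
    """
    first_test_start = n_samples - n_splits * test_size
    if first_test_start - gap <= 0:
        raise ValueError("not enough samples for the requested splits")
    for k in range(n_splits):
        test_start = first_test_start + k * test_size
        train_end = test_start - gap  # exclusive end of the training range
        train_idx = list(range(train_end))
        test_idx = list(range(test_start, test_start + test_size))
        yield train_idx, test_idx


# Illustrative sizes only: 20 bars, 3 folds of 4 bars, 2-bar gap.
splits = list(walk_forward_splits(n_samples=20, n_splits=3, test_size=4, gap=2))
for train_idx, test_idx in splits:
    print(f"train 0..{train_idx[-1]}  test {test_idx[0]}..{test_idx[-1]}")
```

The training window only ever grows forward in time, so no fold is scored on bars that precede its training data. Any preprocessing would need to be refit inside each fold for the same reason.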

Status: The model (a Keras CNN) trains and reaches 95% accuracy on both the train and test sets. It is deployed on currency pairs, with web sockets for performance and logging. On the deployment rig the algorithm reaches only 52% accuracy, which suggests that training was exposed to information that is not available in new, unseen data.
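A gap like this is often attributed to shuffling bars before splitting, or to computing normalization statistics over the full history. Whether either applies here is an assumption, since the post does not describe the preprocessing. The sketch below shows a chronological holdout with scaling statistics fitted on the training part only. The helper names and the close prices are made up for illustration.

```python
from statistics import mean, pstdev


def chronological_split(rows, train_fraction=0.8):
    """Split without shuffling: the oldest rows train, the newest rows test."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


def fit_scaler(train_values):
    """Compute mean and standard deviation from the training rows only."""
    mu = mean(train_values)
    sigma = pstdev(train_values) or 1.0  # avoid dividing by zero on flat data
    return mu, sigma


def apply_scaler(values, mu, sigma):
    """Standardize values with statistics that were fitted elsewhere."""
    return [(v - mu) / sigma for v in values]


# Made-up close prices, oldest first (illustrative data, not from the post).
closes = [1.10, 1.11, 1.12, 1.11, 1.13, 1.14, 1.15, 1.16, 1.30, 1.32]

train, test = chronological_split(closes, train_fraction=0.8)
mu, sigma = fit_scaler(train)                  # the test rows never touch the statistics
train_scaled = apply_scaler(train, mu, sigma)
test_scaled = apply_scaler(test, mu, sigma)    # reuse the train statistics, as in live trading
```

In deployment only past bars are available when scaling a new one, so fitting the statistics on the training rows alone keeps the offline test closer to live conditions.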

Submitted October 21, 2020 at 07:05AM by Dream3r111
