Forex Entry Testing

July 13, 2021

When building a trading system, several components need to be tested. The entry signal, the component that gets you into a trade, is just one of them. Van K. Tharp [1] cites a method from LeBeau and Lucas [2] for testing the reliability of entry signals:

What they do is determine the reliability (that is, percentage of time it is profitable) of the signal after various time periods. You might try an hour, the end of day, and after 1, 2, 5, 10, and 20 days. A random system should give you average reliability of about 50 percent (that is, generally between 45 and 55 percent). If your concept is any better than random, then it should give you a reliability of 55 percent or better, especially in the 1- to 5-day time periods. If it doesn't do that, then it is no better than random, no matter how sound the concept seems to be. - Van K. Tharp, Trade Your Way to Financial Freedom

Let's pick one of our entry mechanisms and run this test to see how it does. We will generate histograms of each distribution and capture the output reliability statistics for comparison across each of the suggested time periods.

# basic python imports
import pandas as pd
import matplotlib.pyplot as plt
import time
import math
from talib import EMA,SMA,ATR
from numpy import nan

# retrieve price data
instrument='EUR_USD'
granularity='H1'
df = pd.read_csv('/Volumes/market-data/oanda-forex/'+instrument+'/'+granularity+'/master.csv')

First, let's code the entry logic that we want to test in a function that returns a column of entry data. The target output of this function is a column called "entry" containing 0 if there is no entry on the bar, 1 if a LONG entry could be taken, and -1 if a SHORT entry could be taken.

The simple entry rule we will create will contain the following logic for a LONG entry:

Price is highest of last 8 bars and price is greater than EMA8
EMA8 > EMA12 > EMA24 > EMA72

And we will do the exact opposite of the above for a SHORT entry.

# build simple function
def generateEntryLabels(df,verbose=True):
    """Return entry column with 0 no entry, 1 LONG entry and -1 SHORT entry"""
    df['EMA72'] = EMA(df.Close,72)
    df['EMA24'] = EMA(df.Close,24)
    df['EMA12'] = EMA(df.Close,12)
    df['EMA8'] = EMA(df.Close,8)
    
    df['entry'] = 0
    # .loc slicing is inclusive, so idx-8:idx spans the current bar plus the 8 bars before it
    for idx in range(8,len(df)):
        if df.loc[idx].Close==max(df.loc[idx-8:idx].Close) and df.loc[idx].Close>df.loc[idx].EMA8 and \
            df.loc[idx].EMA8>df.loc[idx].EMA12 and df.loc[idx].EMA12>df.loc[idx].EMA24 and \
            df.loc[idx].EMA24>df.loc[idx].EMA72:
            df.loc[idx,'entry'] = 1
            
        elif df.loc[idx].Close==min(df.loc[idx-8:idx].Close) and df.loc[idx].Close<df.loc[idx].EMA8 and \
            df.loc[idx].EMA8<df.loc[idx].EMA12 and df.loc[idx].EMA12<df.loc[idx].EMA24 and \
            df.loc[idx].EMA24<df.loc[idx].EMA72:
            df.loc[idx,'entry'] = -1
            
    if verbose:
        print('# longs:  ',len(df[df['entry']==1]))
        print('# shorts: ',len(df[df['entry']==-1]))
        print('# total:  ',len(df))
        
    return df

df = generateEntryLabels(df,verbose=False)
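
The row-by-row loop above can be slow on a long hourly history. A minimal vectorized sketch of the same rule follows. It rests on a few assumptions: `entry_labels_vectorized` and `lookback` are hypothetical names, and pandas' `ewm` stands in for `talib.EMA`, so the earliest EMA values (and possibly a few early labels) will not match the TA-Lib version exactly.

```python
import numpy as np
import pandas as pd

def entry_labels_vectorized(close: pd.Series, lookback: int = 8) -> pd.Series:
    """Return 1 for a LONG entry, -1 for a SHORT entry, 0 otherwise."""
    # Assumption: pandas EMAs replace talib.EMA; the seeding differs on early bars
    ema = {n: close.ewm(span=n, adjust=False).mean() for n in (8, 12, 24, 72)}

    # Current bar plus the previous `lookback` bars, like df.loc[idx-8:idx]
    window = lookback + 1
    is_high = close == close.rolling(window).max()
    is_low = close == close.rolling(window).min()

    # EMA stack ordered fastest to slowest for longs, reversed for shorts
    up = (close > ema[8]) & (ema[8] > ema[12]) & (ema[12] > ema[24]) & (ema[24] > ema[72])
    down = (close < ema[8]) & (ema[8] < ema[12]) & (ema[12] < ema[24]) & (ema[24] < ema[72])

    labels = pd.Series(0, index=close.index, dtype=int)
    labels[is_high & up] = 1
    labels[is_low & down] = -1
    return labels

# Synthetic ramp, not market data: a steady rise should end on a LONG label
rising = pd.Series(np.arange(1, 201, dtype=float))
print(entry_labels_vectorized(rising).iloc[-1])
```

If exact agreement with the loop matters, the same masks could be built from the TA-Lib EMA columns already stored on the dataframe instead of the `ewm` values.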

Now that the long and short labels have been generated, we can iterate through the historical data and see where the price ends up at the target time in the future. We will build the function below and then use it to step through the time periods suggested in the quoted passage from the book.

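def getFuturePeriodDistribution(instrument,df,target,futurePeriod,verbose):
    """Return (ldf, sdf): long and short entry rows with the price change futurePeriod bars later"""
    
    # scaling factor applied to the raw price difference
    if 'JPY' in instrument:
        multiplier=100
    else:
        multiplier=1000
        
    df['lDiff']=nan
    df['sDiff']=nan
    # stop futurePeriod bars before the end so that idx+futurePeriod always exists
    for idx in range(0,len(df)-futurePeriod):
        if target=='Close':
            price = df.loc[idx].Close
            fPrice = df.loc[idx+futurePeriod].Close
            
            if df.loc[idx].entry==1:
                df.loc[idx,'lDiff']=(fPrice-price)*multiplier
            elif df.loc[idx].entry==-1:
                df.loc[idx,'sDiff']=(fPrice-price)*multiplier

    # NaN never compares equal to itself, so "!= nan" keeps every row; use notnull() instead.
    # The subsets are built outside the verbose branch so the return works either way.
    ldf = df[df['lDiff'].notnull()]
    sdf = df[df['sDiff'].notnull()]

    if verbose:
        plt.figure(figsize=(15,4))
        # granularity is the module-level variable set when the price data was loaded
        plt.title(instrument+' Distribution of price change '+str(futurePeriod)+'-'+ \
                  granularity+' periods after entry')
        plt.hist(ldf.lDiff,bins=max(1,int(math.sqrt(len(ldf)))),label='long' )
        plt.hist(sdf.sDiff,bins=max(1,int(math.sqrt(len(sdf)))),label='short' )
        plt.legend()
        plt.show()
    
    return ldf,sdf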
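
The forward price change can also be computed without a loop by shifting the close column. The sketch below is an illustration under stated assumptions: `forward_change` is a hypothetical helper, the prices are toy numbers rather than market data, and the result is left unscaled (no multiplier).

```python
import pandas as pd

def forward_change(close: pd.Series, future_period: int) -> pd.Series:
    """Close future_period bars ahead minus the current close."""
    # shift(-n) lines up the close n bars ahead with the current row;
    # the last n rows have no future bar and become NaN
    return close.shift(-future_period) - close

# Toy prices and entry labels, not market data
close = pd.Series([1.10, 1.12, 1.11, 1.15, 1.14])
entry = pd.Series([1, 0, -1, 0, 0])

change = forward_change(close, 2)
long_changes = change[entry == 1]    # change two bars after each long signal
short_changes = change[entry == -1]  # change two bars after each short signal
print(long_changes)
print(short_changes)
```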

Now let's iterate through each of the time periods suggested earlier, expressed in hourly bars. For example, 10 days corresponds to 240 hours in the future. We capture the output of both long and short trades in dataframes for each target period. Then, given the distance that price traveled over that period, the reliability percentage is simply the number of trades that would have ended in profit divided by the total number of trades.
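
As a toy illustration of that ratio, the sketch below counts profitable signals for each direction. `reliability_pct` is a hypothetical helper and the numbers are made up for the example; a long counts as a win when price rose, a short when price fell.

```python
import pandas as pd

def reliability_pct(changes: pd.Series, direction: int) -> float:
    """Percent of signals that ended in profit; direction is 1 for long, -1 for short."""
    changes = changes.dropna()          # ignore signals with no future bar
    if len(changes) == 0:
        return float("nan")             # no signals, so no reliability to report
    wins = (changes * direction > 0).sum()
    return 100.0 * wins / len(changes)

# Made-up price changes, not market data
toy = pd.Series([0.5, -0.2, 0.1, None, 0.4])
print(reliability_pct(toy, 1))    # 3 of the 4 usable changes are positive
print(reliability_pct(toy, -1))   # 1 of the 4 usable changes is negative
```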

rows = []
target='Close'
# run each example as initially suggested - 1 hour, then 1, 2, 5, 10, and 20 days
for futurePeriod in [1,24,48,120,240,480]:
    ldf,sdf = getFuturePeriodDistribution(instrument,df,target,futurePeriod,verbose=True)
    # a long is profitable when price rose, a short when price fell
    longRelPct = round( 100*len(ldf[ldf['lDiff']>0]) / len(ldf[ldf['lDiff'].notnull()]),0)
    shortRelPct = round( 100*len(sdf[sdf['sDiff']<0]) / len(sdf[sdf['sDiff'].notnull()]),0)
    print(str(futurePeriod)+' hours\n\t'+str(longRelPct)+ \
          '% long reliability\n\t'+str(shortRelPct)+'% short reliability')
    rows.append({'futurePeriod':futurePeriod,
                 'longRelPct':longRelPct,
                 'shortRelPct':shortRelPct})

# DataFrame.append returned a new frame instead of modifying pdf (and was removed in pandas 2.0),
# so collect the rows in a list and build the summary dataframe once
pdf = pd.DataFrame(rows,columns=['futurePeriod','longRelPct','shortRelPct'])

1 hours
	43.0% long reliability
	44.0% short reliability

24 hours
	48.0% long reliability
	50.0% short reliability

48 hours
	49.0% long reliability
	50.0% short reliability

120 hours
	49.0% long reliability
	49.0% short reliability

240 hours
	49.0% long reliability
	48.0% short reliability

480 hours
	49.0% long reliability
	51.0% short reliability

None of the results reaches the 55% reliability that might indicate a better-than-random signal. From 24 hours onward, every result falls within the 45-55% range, which indicates that this signal is no better than random for the selected instrument (EUR_USD) over the given timeframe of data. Interestingly, the one-hour results (43% long, 44% short) fall below that range: the signal does worse than random one hour after entry.
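
How wide the "random" band is depends on how many signals were counted. The sketch below is an aside, not part of the method from the book: it uses the normal approximation to a binomial proportion to estimate the range of hit rates a coin-flip signal would produce about 95% of the time. `random_band` is a hypothetical helper, and 400 is an arbitrary example count, not the number of signals in this test.

```python
import math

def random_band(n_signals, z=1.96):
    """Approximate 95% range of hit rates (in percent) for a coin-flip signal."""
    # Standard error of a proportion with p = 0.5 is sqrt(0.25 / n)
    half_width = z * math.sqrt(0.25 / n_signals)
    return 100 * (0.5 - half_width), 100 * (0.5 + half_width)

low, high = random_band(400)  # arbitrary example count
print(round(low, 1), round(high, 1))
```

With far more signals than that, the band narrows, so a result such as 43% may be a real (if unfavorable) effect rather than noise.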

There are some other things to note about this analysis. It only tests the reliability of the entry signal. It does not take into account how the entry interacts with the other components of a live trading system, for instance stop losses and take-profit targets. Also, the label generation does not account for signals that occur right after each other; how those are handled depends on the exit rules of the system.

References

[1] Van K. Tharp, Trade Your Way to Financial Freedom, Chapter 4 Steps to Developing a System, PART 2 Conceptualizing Your System. https://www.amazon.com/Trade-Your-Way-Financial-Freedom/dp/007147871X

[2] LeBeau, Charles, and David W. Lucas. The Technical Traders' Guide to Computer Analysis of the Futures Market.