Normalising training and test data for neural networks

A common mistake when using neural networks, it seems, is to normalise the training and test data separately. The code below demonstrates how, as far as I know, this should be done properly. The first example uses scikit-learn's MinMaxScaler; the second uses just plain Python code.
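To make the mistake concrete, here is a minimal sketch (variable names and values are my own) contrasting fitting a second scaler on the test set with reusing the scaler fitted on the training set:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

train = np.array([[1.0], [5.0], [10.0]])
test = np.array([[5.0]])

# wrong: fitting a separate scaler on the test set alone
# a lone value becomes its own min, so it maps to 0.0 regardless of the training range
wrong = MinMaxScaler().fit_transform(test)

# right: fit on the training data only, then reuse the fitted scaler
scaler = MinMaxScaler().fit(train)
right = scaler.transform(test)  # (5 - 1) / (10 - 1)
```

Only the second version puts the test value on the same scale the network was trained on.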

Using MinMaxScaler

# scaling test code
import numpy as np
from sklearn.preprocessing import MinMaxScaler

trainArray = np.array([[1], [2], [3], [4], [5], [6], [7], [8], [9], [10], [11], [12]])
testArray = np.array([[1.1], [4], [1], [12], [13], [14], [10], [11], [12], [10], [11], [13]])

# normalise the data using the training set's min and max only
scaler = MinMaxScaler(feature_range=(0.0, 1.0))
training_scaled = scaler.fit_transform(trainArray)
testing_scaled = scaler.transform(testArray)

print(training_scaled, "\n")
print(testing_scaled, "\n")

print("reversing the transformation\n")

training_reverse_transformation = scaler.inverse_transform(training_scaled)
testing_reverse_transformation = scaler.inverse_transform(testing_scaled)

print(training_reverse_transformation, "\n")
print(testing_reverse_transformation, "\n")
[[0.        ]
[0.09090909]
[0.18181818]
[0.27272727]
[0.36363636]
[0.45454545]
[0.54545455]
[0.63636364]
[0.72727273]
[0.81818182]
[0.90909091]
[1. ]]

[[0.00909091]
[0.27272727]
[0. ]
[1. ]
[1.09090909]
[1.18181818]
[0.81818182]
[0.90909091]
[1. ]
[0.81818182]
[0.90909091]
[1.09090909]]

reversing the transformation

[[ 1.]
[ 2.]
[ 3.]
[ 4.]
[ 5.]
[ 6.]
[ 7.]
[ 8.]
[ 9.]
[10.]
[11.]
[12.]]

[[ 1.1]
[ 4. ]
[ 1. ]
[12. ]
[13. ]
[14. ]
[10. ]
[11. ]
[12. ]
[10. ]
[11. ]
[13. ]]
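Notice in the test output above that values beyond the training maximum (13 and 14) scale above 1, and that is expected when the scaler is fitted only on the training data. If the model needs inputs strictly inside [0, 1], recent versions of scikit-learn offer a clip option; a minimal sketch (the array values are my own, and `clip` requires scikit-learn >= 0.24):

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

trainArray = np.array([[1], [2], [12]])
testArray = np.array([[14], [0.5]])

# clip=True forces out-of-range test values back into the feature range
scaler = MinMaxScaler(feature_range=(0.0, 1.0), clip=True)
scaler.fit(trainArray)
print(scaler.transform(testArray))
```

Here 14 would scale to about 1.18 and 0.5 to about -0.05, but with clipping they come out as 1.0 and 0.0. Note that clipping loses information, so whether to use it depends on the model.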

Using just Python code

import numpy as np

trainArray = np.array([[1], [2], [3], [4], [5], [6], [7], [8], [9], [10], [11], [12]])
testArray = np.array([[1.1], [4], [1], [12], [13], [14], [10], [11], [12], [10], [11], [13]])

maxValOfTrain = trainArray.max()
minValOfTrain = trainArray.min()

# scale both arrays using the training set's min and max only
training_scaled = 0.000 + (trainArray - minValOfTrain) / (maxValOfTrain - minValOfTrain) * 1.000
testing_scaled = 0.000 + (testArray - minValOfTrain) / (maxValOfTrain - minValOfTrain) * 1.000

print(training_scaled, "\n")
print(testing_scaled, "\n")

print("reversing the transformation\n")

training_reverse_transformation = minValOfTrain + ((training_scaled - 0.000) * (maxValOfTrain - minValOfTrain)) / 1.000
testing_reverse_transformation = minValOfTrain + ((testing_scaled - 0.000) * (maxValOfTrain - minValOfTrain)) / 1.000

print(training_reverse_transformation, "\n")
print(testing_reverse_transformation, "\n")
[[0.        ]
[0.09090909]
[0.18181818]
[0.27272727]
[0.36363636]
[0.45454545]
[0.54545455]
[0.63636364]
[0.72727273]
[0.81818182]
[0.90909091]
[1. ]]

[[0.00909091]
[0.27272727]
[0. ]
[1. ]
[1.09090909]
[1.18181818]
[0.81818182]
[0.90909091]
[1. ]
[0.81818182]
[0.90909091]
[1.09090909]]

reversing the transformation

[[ 1.]
[ 2.]
[ 3.]
[ 4.]
[ 5.]
[ 6.]
[ 7.]
[ 8.]
[ 9.]
[10.]
[11.]
[12.]]

[[ 1.1]
[ 4. ]
[ 1. ]
[12. ]
[13. ]
[14. ]
[10. ]
[11. ]
[12. ]
[10. ]
[11. ]
[13. ]]

Note: in the code above I’ve deliberately left the values 0.000 and 1.000 in place. They are the lower bound and the width of the target range: to scale into, say, [0.01, 0.99] you would replace 0.000 with 0.01 and 1.000 with the range width 0.99 − 0.01 = 0.98 (and make the same substitutions in the inverse transformation).
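For completeness, the general min–max formula for an arbitrary target range [lo, hi] is lo + (x − min) × (hi − lo) / (max − min); note the multiplier is the width of the range, so mapping into [0.01, 0.99] uses a factor of 0.98. A small sketch (the function name is my own):

```python
import numpy as np

def minmax_scale(x, x_min, x_max, lo=0.01, hi=0.99):
    # map x from [x_min, x_max] into [lo, hi] using the training statistics
    return lo + (x - x_min) * (hi - lo) / (x_max - x_min)

trainArray = np.array([[1], [2], [3], [4], [5], [6], [7], [8], [9], [10], [11], [12]], dtype=float)
scaled = minmax_scale(trainArray, trainArray.min(), trainArray.max())
# the training minimum maps to 0.01 and the training maximum to 0.99
```

This is the same mapping MinMaxScaler performs with feature_range=(0.01, 0.99).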
