Keras LSTM: feeding input with the right shape


I am getting some data from a pandas dataframe that looks like this:

df.head()
>>>
Value USD   Drop 7  Up 7    Mean Change 7   Change      Predict
0.06480     2.0     4.0     -0.000429       -0.00420    4
0.06900     1.0     5.0     0.000274        0.00403     2
0.06497     1.0     5.0     0.000229        0.00007     2
0.06490     1.0     5.0     0.000514        0.00200     2
0.06290     2.0     4.0     0.000229        -0.00050    3

The first 5 columns are intended to be the X and the last column (Predict) the y. This is how I preprocess the data for the model:

from keras.models import Sequential
from keras.callbacks import EarlyStopping
from keras.callbacks import ModelCheckpoint
import pandas as pd
from sklearn.model_selection import train_test_split
import numpy as np
from sklearn.metrics import accuracy_score
from keras.layers import LSTM
from sklearn import preprocessing

# Convert a Pandas dataframe to the x,y inputs that TensorFlow needs
def to_xy(df, target):
    result = []
    for x in df.columns:
        if x != target:
            result.append(x)
    # find out the type of the target column.  Is it really this hard? :(
    target_type = df[target].dtypes
    target_type = target_type[0] if hasattr(target_type, '__iter__') else target_type
    # Encode to int for classification, float otherwise. TensorFlow likes 32 bits.
    if target_type in (np.int64, np.int32):
        # Classification
        dummies = pd.get_dummies(df[target])
        return df.as_matrix(result).astype(np.float32), dummies.as_matrix().astype(np.float32)
    else:
        # Regression
        return df.as_matrix(result).astype(np.float32), df.as_matrix([target]).astype(np.float32)

# Encode text values to indexes (i.e. [1],[2],[3] for red,green,blue).
def encode_text_index(df, name):
    le = preprocessing.LabelEncoder()
    df[name] = le.fit_transform(df[name])
    return le.classes_

df['Predict'].value_counts()
>>>
4    1194
3     664
2     623
0     405
1      14
Name: Predict, dtype: int64

predictions = encode_text_index(df, "Predict")
predictions
>>>
array([0, 1, 2, 3, 4], dtype=int64)

X, y = to_xy(df, "Predict")
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, shuffle=False)

X_train
>>>
array([[ 6.4800002e-02,  2.0000000e+00,  4.0000000e+00, -4.2857142e-04, -4.1999999e-03],
       [ 6.8999998e-02,  1.0000000e+00,  5.0000000e+00,  2.7414286e-04,  4.0300000e-03],
       [ 6.4970002e-02,  1.0000000e+00,  5.0000000e+00,  2.2857143e-04,  7.0000002e-05],
       ...,
       [ 9.5987000e+02,  5.0000000e+00,  2.0000000e+00, -1.5831429e+01, -3.7849998e+01],
       [ 9.9771997e+02,  5.0000000e+00,  2.0000000e+00, -1.6948572e+01, -1.8250000e+01],
       [ 1.0159700e+03,  5.0000000e+00,  2.0000000e+00, -1.3252857e+01, -7.1700001e+00]], dtype=float32)

y_train
>>>
array([[0., 0., 0., 0., 1.],
       [0., 0., 1., 0., 0.],
       [0., 0., 1., 0., 0.],
       ...,
       [0., 0., 0., 0., 1.],
       [0., 0., 0., 0., 1.],
       [0., 0., 0., 0., 1.]], dtype=float32)

X_train[1]
>>>
array([6.8999998e-02, 1.0000000e+00, 5.0000000e+00, 2.7414286e-04, 4.0300000e-03], dtype=float32)

X_train.shape
>>>
(2320, 5)

X_train[1].shape
>>>
(5,)
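As an aside, df.as_matrix() was deprecated in pandas 0.23 and removed in 1.0, so the to_xy helper above no longer runs on current pandas. A minimal sketch of an equivalent, assuming the same dataframe layout (to_numpy() replaces as_matrix()):

import numpy as np
import pandas as pd

# Hypothetical modern rewrite of to_xy; same behaviour, current pandas API
def to_xy(df, target):
    feature_cols = [c for c in df.columns if c != target]
    if df[target].dtype in (np.int64, np.int32):
        # Classification: one-hot encode the target
        dummies = pd.get_dummies(df[target])
        return (df[feature_cols].to_numpy().astype(np.float32),
                dummies.to_numpy().astype(np.float32))
    # Regression: return the target as a float column
    return (df[feature_cols].to_numpy().astype(np.float32),
            df[[target]].to_numpy().astype(np.float32))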

and finally the LSTM model (it may also not be the best way to write one, so I'd appreciate a rewrite of the inner layers as well if that's the case):

model = Sequential()
#model.add(LSTM(128, dropout=0.2, recurrent_dropout=0.2, input_shape=(None, 1)))
model.add(LSTM(50, dropout=0.2, return_sequences=True, input_shape=X_train.shape))
model.add(LSTM(50, dropout=0.2, return_sequences=True))
model.add(LSTM(50, dropout=0.2, return_sequences=True))
model.add(LSTM(50, dropout=0.2, return_sequences=True))
#model.add(Dense(50, activation='relu'))
model.add(Dense(y_train.shape[1], activation='softmax'))

#model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
#model.fit(X_train, y_train, epochs=1000)

model.compile(loss='categorical_crossentropy', optimizer='adam')
monitor = EarlyStopping(monitor='val_loss', min_delta=1e-2, patience=15, verbose=1, mode='auto')
checkpointer = ModelCheckpoint(filepath="best_weights.hdf5", verbose=0, save_best_only=True) # save best model

model.fit(X_train, y_train, validation_data=(X_test, y_test), callbacks=[monitor, checkpointer], verbose=2, epochs=1000)
model.load_weights('best_weights.hdf5') # load weights from best model

Running this throws this error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-67-a17835a382f6> in <module>()
     15 checkpointer = ModelCheckpoint(filepath="best_weights.hdf5", verbose=0, save_best_only=True) # save best model
     16 
---> 17 model.fit(X_train, y_train, validation_data=(X_test, y_test), callbacks=[monitor,checkpointer], verbose=2, epochs=1000)
     18 model.load_weights('best_weights.hdf5') # load weights from best model

c:\users\samuel\appdata\local\programs\python\python35\lib\site-packages\keras\engine\training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, **kwargs)
    948             sample_weight=sample_weight,
    949             class_weight=class_weight,
--> 950             batch_size=batch_size)
    951         # Prepare validation data.
    952         do_validation = False

c:\users\samuel\appdata\local\programs\python\python35\lib\site-packages\keras\engine\training.py in _standardize_user_data(self, x, y, sample_weight, class_weight, check_array_lengths, batch_size)
    747             feed_input_shapes,
    748             check_batch_axis=False,  # Don't enforce the batch size.
--> 749             exception_prefix='input')
    750 
    751         if y is not None:

c:\users\samuel\appdata\local\programs\python\python35\lib\site-packages\keras\engine\training_utils.py in standardize_input_data(data, names, shapes, check_batch_axis, exception_prefix)
    125                         ': expected ' + names[i] + ' to have ' +
    126                         str(len(shape)) + ' dimensions, but got array '
--> 127                         'with shape ' + str(data_shape))
    128                 if not check_batch_axis:
    129                     data_shape = data_shape[1:]

ValueError: Error when checking input: expected lstm_48_input to have 3 dimensions, but got array with shape (2320, 5)

I've tried a lot of variations of the X_train input shape, but every single one throws some error. I also checked the Keras docs, but they weren't clear on how the data should be fed to the model.

Attempt No. 1 from Suggestions

The first was reshaping X_train:

data = np.resize(X_train,(X_train.shape[0],1,X_train.shape[1]))
model.add(LSTM(50, dropout=0.2, return_sequences=True, input_shape=data.shape))

This fails with an error:

ValueError: Input 0 is incompatible with layer lstm_52: expected ndim=3, found ndim=4 

It was then suggested I feed it in as:

model.add(LSTM(50, dropout=0.2, return_sequences=True, input_shape=X_train.shape[1:]))

That throws a similar error:

ValueError: Input 0 is incompatible with layer lstm_63: expected ndim=3, found ndim=2

Suggestion 2

Use the default X, y from pandas:

y = df['Predict']
X = df[['Value USD', 'Drop 7', 'Up 7', 'Mean Change 7', 'Change']]

X = np.array(X)
y = np.array(y)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, shuffle=False)

It was also pointed out that the LSTM expects input in the following shape: (batch_size, timesteps, input_dim).

So I tried this:

model.add(LSTM(50, dropout=0.2, return_sequences=True, input_shape=(100, 100, X_train.shape)))

which throws this error

TypeError: Error converting shape to a TensorShape: int() argument must be a string, a bytes-like object or a number, not 'tuple'.

And a different way:

model.add(LSTM(50, dropout=0.2, return_sequences=True, input_shape=(100, 100, X_train[1].shape)))

returns the same error

TypeError: Error converting shape to a TensorShape: int() argument must be a string, a bytes-like object or a number, not 'tuple'.
Answer

You want to set up an LSTM (stateful or stateless?) with multiple features; the features are the columns Value USD, Drop 7, Up 7, Mean Change 7 and Change in your dataframe. A similar problem is discussed in https://github.com/keras-team/keras/issues/6471.

Keras LSTMs accept input as (batch_size (the number of samples processed at a time), timesteps, features) = (batch_size, timesteps, input_dim). As you have 5 features, input_dim = features = 5. I do not know your entire data, so I cannot say more. The relation between number_of_samples (the number of rows in your dataframe) and batch_size is explained in http://philipperemy.github.io/keras-stateful-lstm/; batch_size is the number of samples (rows) processed at a time (see "doubts regarding batch size and time steps in RNN"):

Said differently, whenever you train or test your LSTM, you first have to build your input matrix X of shape (nb_samples, timesteps, input_dim) where your batch size divides nb_samples. For instance, if nb_samples=1024 and batch_size=64, it means that your model will receive blocks of 64 samples, compute each output (whatever the number of timesteps is for every sample), average the gradients and propagate it to update the parameters vector.

source : http://philipperemy.github.io/keras-stateful-lstm/
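To make that concrete, here is a minimal sketch (not a definitive fix) that reshapes your (2320, 5) X_train into the (batch_size, timesteps, features) layout with timesteps=1, and declares the matching input_shape, which excludes the batch axis:

import numpy as np
from keras.models import Sequential
from keras.layers import LSTM, Dense

# X_train is (2320, 5); insert a timesteps axis of length 1 -> (2320, 1, 5)
X_train_3d = X_train.reshape((X_train.shape[0], 1, X_train.shape[1]))
X_test_3d = X_test.reshape((X_test.shape[0], 1, X_test.shape[1]))

model = Sequential()
# input_shape excludes the batch axis: (timesteps, features) = (1, 5)
model.add(LSTM(50, dropout=0.2, return_sequences=True, input_shape=(1, 5)))
# return_sequences=False on the last LSTM yields a 2D (batch, units) output,
# which matches the 2D one-hot y_train of shape (2320, 5) from the question
model.add(LSTM(50, dropout=0.2, return_sequences=False))
model.add(Dense(y_train.shape[1], activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam')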

Batch size is important for training:

A batch size of 1 means that the model will be fit using online training (as opposed to batch training or mini-batch training). As a result, it is expected that the model fit will have some variance.

source : https://machinelearningmastery.com/stateful-stateless-lstm-time-series-forecasting-python/
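In plain Keras terms, the batch size is just the batch_size argument of fit(); a sketch, assuming the reshaped X_train_3d from above:

# Online training: parameters updated after every single sample
model.fit(X_train_3d, y_train, batch_size=1, epochs=10)

# Mini-batch training: 64 samples per gradient update
model.fit(X_train_3d, y_train, batch_size=64, epochs=10)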

timesteps is the number of timesteps / past network states you want to look back on. There is a maximal value for LSTMs of about 200-500 because of the vanishing gradient problem; for performance reasons the practical maximum is about 200 (https://github.com/keras-team/keras/issues/2057). An example of building windows of several timesteps is sketched below.
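If you want more than one timestep per sample, the usual approach is a sliding window over the rows. A hypothetical helper (not part of the original answer) that turns the 2D feature matrix into overlapping windows:

import numpy as np

def make_windows(X_2d, y_2d, timesteps):
    # (n, features) -> (n - timesteps + 1, timesteps, features)
    X_3d = np.stack([X_2d[i:i + timesteps]
                     for i in range(len(X_2d) - timesteps + 1)])
    # each window is labelled with the target of its last row
    return X_3d, y_2d[timesteps - 1:]

X_windows, y_windows = make_windows(X_train, y_train, timesteps=10)
# X_windows.shape == (2311, 10, 5), y_windows.shape == (2311, 5)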

Splitting is easier (see "Selecting multiple columns in a pandas dataframe"):

y = df['Predict']
X = df[['Value USD','Drop 7','Up 7','Mean Change 7', 'Change']]

Code for modifying data types can be found in https://www.kaggle.com/mknorps/titanic-with-decision-trees.

Update:

To get rid of these errors you have to reshape the training data, as in "Error when checking model input: expected lstm_1_input to have 3 dimensions, but got array with shape (339732, 29)" (which also contains reshaping code for more than one timestep). I post the entire code that worked for me, because this question is less trivial than it appeared at first sight (note the number of [ and ] that indicate the dimension of an array when reshaping):

import pandas as pd
import numpy as np
from keras.models import Sequential
from keras.layers import Dense, Activation
from keras.callbacks import EarlyStopping
from keras.callbacks import ModelCheckpoint
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from keras.layers import LSTM
from sklearn import preprocessing

df = pd.read_csv('/path/data_lstm.dat')

y = df['Predict']
X = df[['Value USD', 'Drop 7', 'Up 7', 'Mean Change 7', 'Change']]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, shuffle=False)

# DataFrame -> numpy array
# (https://stackoverflow.com/questions/13187778/convert-pandas-dataframe-to-numpy-array-preserving-index)
X_train_array = X_train.values
y_train_array = y_train.values.reshape(4,1)   # this toy dataframe has only 4 training rows
X_test_array = X_test.values
y_test_array = y_test.values

# reshaping to fit batch_input_shape=(4,1,5) = (batch_size, timesteps, number_of_features);
# batch_size can be varied: batch_input_shape=(2,1,5), (1,1,5), ... also work
X_train_array = np.reshape(X_train_array, (X_train_array.shape[0], 1, X_train_array.shape[1]))
#>>> X_train_array    NOTE THE NUMBER OF [ and ] !!
#array([[[ 6.480e-02,  2.000e+00,  4.000e+00, -4.290e-04, -4.200e-03]],
#       [[ 6.900e-02,  1.000e+00,  5.000e+00,  2.740e-04,  4.030e-03]],
#       [[ 6.497e-02,  1.000e+00,  5.000e+00,  2.290e-04,  7.000e-05]],
#       [[ 6.490e-02,  1.000e+00,  5.000e+00,  5.140e-04,  2.000e-03]]])

y_train_array = np.reshape(y_train_array, (y_train_array.shape[0], 1, y_train_array.shape[1]))
#>>> y_train_array     NOTE THE NUMBER OF [ and ]   !!
#array([[[4]],
#       [[2]],
#       [[2]],
#       [[2]]])

model = Sequential()
model.add(LSTM(32, return_sequences=True, batch_input_shape=(4,1,5)))
model.add(LSTM(32, return_sequences=True))
model.add(Dense(1, activation='softmax'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
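The snippet only builds and compiles the model; training would still need a fit() call whose batch size matches the 4 in batch_input_shape=(4,1,5). A sketch, not part of the original answer:

# the number of samples must be divisible by the fixed batch size of 4
model.fit(X_train_array, y_train_array, batch_size=4, epochs=100, verbose=2)

One caveat about the layer stack: Dense(1, activation='softmax') always outputs exactly 1.0 (softmax over a single unit), so for the 5-class Predict target a Dense(5, activation='softmax') with categorical_crossentropy and a one-hot y would be the usual choice; the snippet above mainly demonstrates the shapes.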