I am trying to use pipelines to feed an ensemble voting classifier, because I want the ensemble learner to combine models that are trained on different feature sets. For this purpose, I followed the tutorial available at [1].
This is the code I have so far:
y = df1.index
x = preprocessing.scale(df1)

phy_features = ['A', 'B', 'C']
phy_transformer = Pipeline(steps=[('imputer', SimpleImputer(strategy='median')), ('scaler', StandardScaler())])
phy_processer = ColumnTransformer(transformers=[('phy', phy_transformer, phy_features)])

fa_features = ['D', 'E', 'F']
fa_transformer = Pipeline(steps=[('imputer', SimpleImputer(strategy='median')), ('scaler', StandardScaler())])
fa_processer = ColumnTransformer(transformers=[('fa', fa_transformer, fa_features)])

pipe_phy = Pipeline(steps=[('preprocessor', phy_processer), ('classifier', SVM)])
pipe_fa = Pipeline(steps=[('preprocessor', fa_processer), ('classifier', SVM)])

ens = VotingClassifier(estimators=[pipe_phy, pipe_fa])

cv = KFold(n_splits=10, random_state=None, shuffle=True)
for train_index, test_index in cv.split(x):
    x_train, x_test = x[train_index], x[test_index]
    y_train, y_test = y[train_index], y[test_index]
    ens.fit(x_train, y_train)
    print(ens.score(x_test, y_test))
However, when I run the code, I get the error TypeError: argument of type 'ColumnTransformer' is not iterable at the line ens.fit(x_train, y_train).
This is the complete stack trace:
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "C:\Program Files\JetBrains\PyCharm 2020.1.1\plugins\python\helpers\pydev\_pydev_bundle\pydev_umd.py", line 197, in runfile
    pydev_imports.execfile(filename, global_vars, local_vars)  # execute the script
  File "C:\Program Files\JetBrains\PyCharm 2020.1.1\plugins\python\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "C:/Users/ASUS/PycharmProjects/swelltest/enemble.py", line 112, in <module>
    ens.fit(x_train,y_train)
  File "C:\Users\ASUS\PycharmProjects\swelltest\venv\lib\site-packages\sklearn\ensemble\_voting.py", line 265, in fit
    return super().fit(X, transformed_y, sample_weight)
  File "C:\Users\ASUS\PycharmProjects\swelltest\venv\lib\site-packages\sklearn\ensemble\_voting.py", line 65, in fit
    names, clfs = self._validate_estimators()
  File "C:\Users\ASUS\PycharmProjects\swelltest\venv\lib\site-packages\sklearn\ensemble\_base.py", line 228, in _validate_estimators
    self._validate_names(names)
  File "C:\Users\ASUS\PycharmProjects\swelltest\venv\lib\site-packages\sklearn\utils\metaestimators.py", line 77, in _validate_names
    invalid_names = [name for name in names if '__' in name]
  File "C:\Users\ASUS\PycharmProjects\swelltest\venv\lib\site-packages\sklearn\utils\metaestimators.py", line 77, in <listcomp>
    invalid_names = [name for name in names if '__' in name]
TypeError: argument of type 'ColumnTransformer' is not iterable
These are the values in the names list when the error occurs:
1- ColumnTransformer(transformers=[('phy',Pipeline(steps=[('imputer',SimpleImputer(strategy='median')),('scaler', StandardScaler())]),['HR', 'RMSSD', 'SCL'])])
2- ColumnTransformer(transformers=[('fa',Pipeline(steps=[('imputer',SimpleImputer(strategy='median')),('scaler', StandardScaler())]),['Squality', 'Sneutral', 'Shappy'])])
What is the reason for this and how can I fix it?
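For reference, here is a minimal sketch of a variant that runs end to end. It rests on several assumptions that are not part of the original post: df1 is replaced by a synthetic DataFrame with columns A to F and an invented binary label, SVM is taken to be sklearn.svm.SVC(), and make_pipe is a hypothetical helper. The documented form of VotingClassifier's estimators argument is a list of (name, estimator) tuples. Because a Pipeline can be indexed by step, passing bare pipelines appears to make the unpacking step treat each pipeline's first step (the ColumnTransformer) as a name, which would match the values shown in the names list above. The sketch also keeps the features as a DataFrame, since selecting columns by string name in ColumnTransformer requires one.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import VotingClassifier
from sklearn.impute import SimpleImputer
from sklearn.model_selection import KFold
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for df1 (assumption: six numeric columns, binary label).
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.normal(size=(100, 6)), columns=list("ABCDEF"))
y = pd.Series((X["A"] + X["D"] > 0).astype(int))


def make_pipe(name, features):
    """Hypothetical helper: impute, scale, then fit an SVC on the given columns."""
    transformer = Pipeline(steps=[("imputer", SimpleImputer(strategy="median")),
                                  ("scaler", StandardScaler())])
    processor = ColumnTransformer(transformers=[(name, transformer, features)])
    # Assumption: the undefined SVM in the post stands for SVC().
    return Pipeline(steps=[("preprocessor", processor), ("classifier", SVC())])


pipe_phy = make_pipe("phy", ["A", "B", "C"])
pipe_fa = make_pipe("fa", ["D", "E", "F"])

# Unpacking bare pipelines iterates over their steps, so the "names"
# end up being the ColumnTransformer objects rather than strings.
first_steps, _ = zip(*[pipe_phy, pipe_fa])

# Each estimator is passed as a (name, estimator) tuple.
ens = VotingClassifier(estimators=[("phy", pipe_phy), ("fa", pipe_fa)])

cv = KFold(n_splits=10, shuffle=True, random_state=0)
scores = []
for train_index, test_index in cv.split(X):
    # .iloc keeps the folds as DataFrames, so column names stay usable.
    x_train, x_test = X.iloc[train_index], X.iloc[test_index]
    y_train, y_test = y.iloc[train_index], y.iloc[test_index]
    ens.fit(x_train, y_train)
    scores.append(ens.score(x_test, y_test))

print(np.mean(scores))
```

Each pipeline receives the full DataFrame and selects its own columns, so the two classifiers train on different feature sets while the ensemble is fitted with a single call.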