Facing AttributeError: 'list' object has no attribute 'lower'

2024/11/18 20:18:00

I have posted my sample training and test data along with my code. I'm trying to train a model with the Naive Bayes algorithm.

However, my reviews end up as a list of lists, and I think that is why my code fails with the following error:

return lambda x: strip_accents(x.lower())
AttributeError: 'list' object has no attribute 'lower'

Can anyone help me with this? I'm new to Python.

train.txt:

review,label
Colors & clarity is superb,positive
Sadly the picture is not nearly as clear or bright as my 40 inch Samsung,negative

test.txt:

review,label
The picture is clear and beautiful,positive
Picture is not clear,negative

My code:

from sklearn.preprocessing import MultiLabelBinarizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import BernoulliNB
from sklearn.metrics import confusion_matrix
from sklearn.feature_extraction.text import CountVectorizer

def load_data(filename):
    reviews = list()
    labels = list()
    with open(filename) as file:
        file.readline()
        for line in file:
            line = line.strip().split(',')
            labels.append(line[1])
            reviews.append(line[0].split())
    return reviews, labels

X_train, y_train = load_data('/Users/7000015504/Desktop/Sep_10/sample_train.csv')
X_test, y_test = load_data('/Users/7000015504/Desktop/Sep_10/sample_test.csv')

clf = CountVectorizer()
X_train_one_hot =  clf.fit(X_train)
X_test_one_hot = clf.transform(X_test)

bnbc = BernoulliNB(binarize=None)
bnbc.fit(X_train_one_hot, y_train)

score = bnbc.score(X_test_one_hot, y_test)
print("score of Naive Bayes algo is :" , score)

Answer

I made a few modifications to your code. The version below works, and its comments explain what was wrong with the version you posted.
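
To see where the error comes from: the traceback shows the vectorizer calling x.lower() on each document, which only works when each document is a string. Here is a minimal sketch in plain Python (no scikit-learn needed; the variable names are mine, for illustration only) of the difference between what load_data stored and what CountVectorizer expects:

```python
# What the original load_data stored for each review: a list of words.
tokens = "Picture is not clear".split()
# What CountVectorizer expects for each review: one plain string.
text = "Picture is not clear"

print(hasattr(text, "lower"))    # strings have .lower()
print(hasattr(tokens, "lower"))  # lists do not

try:
    tokens.lower()
except AttributeError as exc:
    message = str(exc)
    print(message)

# If the data is already tokenized, joining the words restores a string.
restored = " ".join(tokens)
print(restored)
```

So the fix is either to stop splitting the review in load_data, or to join the words back together before vectorizing.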

# These three are not used, so do not import them:
# from sklearn.preprocessing import MultiLabelBinarizer
# from sklearn.model_selection import train_test_split
# from sklearn.metrics import confusion_matrix

# This performs the classification task that you want with your input data in the format provided
from sklearn.naive_bayes import MultinomialNB
from sklearn.feature_extraction.text import CountVectorizer

def load_data(filename):
    """
    This function works, but you have to modify the second-to-last line from
    reviews.append(line[0].split()) to reviews.append(line[0]).
    CountVectorizer will perform the splits by itself as it sees fit, trust it :)
    """
    reviews = list()
    labels = list()
    with open(filename) as file:
        file.readline()
        for line in file:
            line = line.strip().split(',')
            labels.append(line[1])
            reviews.append(line[0])
    return reviews, labels

X_train, y_train = load_data('train.txt')
X_test, y_test = load_data('test.txt')

vec = CountVectorizer()
# Notice: clf means classifier, not vectorizer.
# While it is syntactically correct, it's bad practice to give misleading names to your objects.
# Replace "clf" with "vec" or something similar.

# Important! You called only the fit method, but did not transform the data
# afterwards. The fit method does not return the transformed data by itself. You
# either have to call .fit() and then .transform() on your training data, or just fit_transform() once.
X_train_transformed = vec.fit_transform(X_train)
X_test_transformed = vec.transform(X_test)

clf = MultinomialNB()
clf.fit(X_train_transformed, y_train)

score = clf.score(X_test_transformed, y_test)
print("score of Naive Bayes algo is :" , score)
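
To make the fit versus fit_transform point concrete, here is a small sketch, assuming scikit-learn is installed. The two example sentences are made up for illustration and are not your data:

```python
from sklearn.feature_extraction.text import CountVectorizer

# Two made-up documents, each a plain string.
docs = ["Picture is clear", "Picture is not clear"]

vec = CountVectorizer()

# fit() only learns the vocabulary and returns the vectorizer itself,
# so assigning its result to X_train_one_hot does not give you features.
fitted = vec.fit(docs)
print(fitted is vec)

# fit_transform() learns the vocabulary and returns the document-term matrix.
matrix = vec.fit_transform(docs)
print(matrix.shape)             # (number of documents, vocabulary size)
print(sorted(vec.vocabulary_))  # the learned words
```

In your original code, X_train_one_hot was therefore the CountVectorizer object, not a matrix, which would have caused a second error once the first one was fixed.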

The output of this code is:

score of Naive Bayes algo is : 0.5
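
One caveat about load_data: splitting each line on ',' breaks as soon as a review itself contains a comma. If your real data has such reviews, the standard csv module handles quoted fields for you. Below is a sketch of a hypothetical variant (load_data_csv is my name, and the sample rows are invented for the demonstration) that reads from an already opened file:

```python
import csv
import io

def load_data_csv(file):
    """Hypothetical variant of load_data that takes an open file object."""
    reader = csv.reader(file)
    next(reader)  # skip the "review,label" header row
    reviews, labels = [], []
    for row in reader:
        if not row:  # ignore blank lines
            continue
        reviews.append(row[0])  # keep the review as one string
        labels.append(row[1])
    return reviews, labels

# Invented sample; the first review contains a comma inside quotes.
sample = io.StringIO(
    'review,label\n'
    '"Clear, bright picture",positive\n'
    'Picture is not clear,negative\n'
)

reviews, labels = load_data_csv(sample)
print(reviews)
print(labels)
```

With a real file you would call it as load_data_csv(open(filename, newline='')) inside a with block.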