Python SKlearn fit method not working

2024/7/7 6:49:21

I'm working on a project using Python (3.6) and sklearn. I have set up my classifiers, but when I prepare the data and pass it to sklearn's `fit` method, it raises an error.

Here's what I have tried:

from sklearn.ensemble import IsolationForest
from sklearn.metrics import accuracy_score, classification_report
from sklearn.neighbors import LocalOutlierFactor

# Get all the columns from the dataframe
columns = data.columns.tolist()
# Filter the columns to remove data we don't want
columns = [c for c in columns if c not in ["Class"]]
# Store the variable we'll be predicting on
target = "Class"
X = data.drop(target, axis=1)
Y = data[target]
# Print the shapes of X & Y
print(X.shape)
print(Y.shape)
# Define a random state
state = 1
# Define the outlier detection methods
classifiers = {
    "Isolation Forest": IsolationForest(max_samples=len(X),
                                        contamination=outlier_fraction,
                                        random_state=state),
    "Local Outlier Factor": LocalOutlierFactor(n_neighbors=20,
                                               contamination=outlier_fraction),
}
# Fit the model
n_outliers = len(Fraud)
for i, (clf_name, clf) in enumerate(classifiers.items()):
    # Fit the data and tag outliers
    if clf_name == "Local Outlier Factor":
        y_pred = clf.fit_predict(X)
        scores_pred = clf.negative_outlier_factor_
    else:
        clf.fit(X)
        scores_pred = clf.decision_function(X)
        y_pred = clf.predict(X)
    # Reshape the prediction values to 0 for valid and 1 for fraudulent
    y_pred[y_pred == 1] = 0
    y_pred[y_pred == -1] = 1
    n_errors = (y_pred != Y).sum()
    # Run classification metrics
    print('{}:{}'.format(clf_name, n_errors))
    print(accuracy_score(Y, y_pred))
    print(classification_report(Y, y_pred))

It then raises the following error:

ValueError: could not convert string to float: '301.48 Change: $0.00'

The traceback points to the `clf.fit(X)` line.

What have I configured wrong?

Answer

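The error occurs because X still contains string values, and sklearn estimators need numeric input. We can convert our dataset to numeric values by mapping each unique value in a non-numeric column to an integer, and you can also drop unnecessary columns from the dataset.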
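Before converting anything, it can help to confirm which columns actually hold non-numeric values. The following is only a minimal sketch: the toy DataFrame and its column names (`amount`, `n_items`) are made up for illustration, with one column mimicking the string from the error message.

```python
import pandas as pd

# Hypothetical toy data; "amount" mimics the string shown in the error message.
data = pd.DataFrame({
    "amount": ["301.48 Change: $0.00", "12.50 Change: $0.00", "7.00 Change: $1.00"],
    "n_items": [3, 1, 2],
    "Class": [0, 0, 1],
})

X = data.drop("Class", axis=1)

# Columns without a numeric dtype are the ones fit() cannot convert to float.
non_numeric_cols = X.select_dtypes(exclude="number").columns.tolist()
print(non_numeric_cols)
```

Any column listed here has to be encoded, parsed, or dropped before calling `fit`.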

Here's how you can try the conversion:

import numpy as np
import pandas as pd

df_full = pd.read_excel('input/samp.xlsx', sheet_name=0)
# Keep only the named columns (drops the "Unnamed: N" columns)
df_full = df_full[df_full.filter(regex='^(?!Unnamed)').columns]
df_full.drop(['paymentdetails'], axis=1, inplace=True)
df_full.drop(['timestamp'], axis=1, inplace=True)
# Handle non-numeric data
def handle_non_numaric_data(df_full):
    columns = df_full.columns.values
    for column in columns:
        # Maps each unique value in this column to an integer code
        text_digit_vals = {}
        def convert_to_int(val):
            return text_digit_vals[val]
        if df_full[column].dtype != np.int64 and df_full[column].dtype != np.float64:
            column_contents = df_full[column].values.tolist()
            unique_elements = set(column_contents)
            x = 0
            for unique in unique_elements:
                if unique not in text_digit_vals:
                    text_digit_vals[unique] = x
                    x += 1
            df_full[column] = list(map(convert_to_int, df_full[column]))
    return df_full
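The same idea of replacing each unique value with an integer code can also be sketched with pandas' built-in `pd.factorize`. This is an alternative, not the method used above; the helper name `encode_non_numeric` and the toy columns are hypothetical.

```python
import pandas as pd

def encode_non_numeric(df):
    """Return a copy of df with every non-numeric column replaced by integer codes."""
    out = df.copy()
    for column in out.select_dtypes(exclude="number").columns:
        # factorize assigns 0, 1, 2, ... in order of first appearance
        codes, _ = pd.factorize(out[column])
        out[column] = codes
    return out

# Hypothetical toy data for illustration only
df = pd.DataFrame({
    "merchant": ["a", "b", "a", "c"],
    "amount": [1.5, 2.0, 3.25, 4.0],
})
encoded = encode_non_numeric(df)
print(encoded["merchant"].tolist())  # [0, 1, 0, 2]
```

With either approach, the integer codes are arbitrary labels, so the model may treat them as if they had a meaningful order. A value such as '301.48 Change: $0.00' appears to contain an amount, so parsing the number out of the string may preserve more information than encoding the whole string as a category.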