Change numerical Data to Categorical Data - Pandas [duplicate]

2024/10/3 17:13:09

I have a pandas DataFrame with a numerical column "amount" whose values range from 0 to 20000. I want to convert it into a categorical variable that defines a range. The categories would be:

  1. Between 0-1000$
  2. Between 1000-2000$, and so on up to 19000-20000$

I am unable to figure out how to change the column. I can convert it to binary values like this:

months["value"] = np.where(months['amount']>=450, 'yes', 'no') 

But how can I do this for a categorical variable with more than two values?

Answer

You can use pd.cut, which bins numeric values into intervals:

import pandas as pd

df = pd.DataFrame({'B': [4000, 5000, 4000, 9000, 5, 11040]})
# Bin edges 0, 1000, ..., 20000; each bin is open on the left and closed on the right
df['D'] = pd.cut(df['B'], range(0, 21000, 1000))
print(df)

       B               D
0   4000    (3000, 4000]
1   5000    (4000, 5000]
2   4000    (3000, 4000]
3   9000    (8000, 9000]
4      5       (0, 1000]
5  11040  (11000, 12000]
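
If readable category names are wanted instead of interval objects, pd.cut also accepts a labels argument. The following is only a sketch: the sample values, the new column name "amount_range", and the "0-1000$" label format are assumptions made for illustration, not part of the original question. Note that with the default right-closed bins an amount of exactly 0 falls outside (0, 1000] and becomes NaN, so include_lowest=True is used here to close the first bin on the left.

```python
import pandas as pd

# Hypothetical sample data; only the column name "amount" comes from the question.
months = pd.DataFrame({"amount": [0, 450, 1000, 1500, 19999, 20000]})

# Bin edges 0, 1000, ..., 20000 give 20 bins of width 1000.
edges = list(range(0, 21000, 1000))

# Illustrative labels such as "0-1000$"; the naming scheme is an assumption.
labels = [f"{lo}-{hi}$" for lo, hi in zip(edges[:-1], edges[1:])]

# include_lowest=True closes the first bin on the left, so 0 is binned instead of NaN.
# Bins stay right-closed, so 1000 lands in "0-1000$" and 20000 in "19000-20000$".
months["amount_range"] = pd.cut(
    months["amount"], bins=edges, labels=labels, include_lowest=True
)
print(months)
```

The result is a column of Categorical dtype with ordered categories. Passing right=False instead would make each bin closed on the left and open on the right, in which case the maximum value 20000 would need an extra edge to be included.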
