Pandas replace part of string with values from dictionary

2024/10/5 17:21:56

I would like to replace the words in my dataframe

df = pd.DataFrame({"Text": ["The quick brown fox jumps over the lazy dog"]})

which match the keys in the following dictionary

dic = {"quick brown fox": "fox", "lazy dog": "dog"}

with their values.

The expected outcome is

    Text
0   The fox jumps over the dog

I tried the following code but there is no change to my df.

df["Text"] = df["Text"].apply(lambda x: ' '.join([dic.get(i, i) for i in x.split()]))

Is there a way to do this? My dataframe has around 15k rows.

Thanks in advance!

Answer

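Use .replace with regex=True. The attempt in the question leaves the text unchanged because x.split() yields single words, so multi-word keys such as "quick brown fox" are never looked up.

Example: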

import pandas as pd

dic = {"quick brown fox": "fox", "lazy dog": "dog", "u": "you"}
# Wrap each key in word boundaries so that a short key such as "u" matches only as a whole word
dic = {r"\b{}\b".format(k): v for k, v in dic.items()}
df = pd.DataFrame({"Text": ["The quick brown fox jumps over the lazy dog"]})
df["Text"] = df["Text"].replace(dic, regex=True)
print(df)

Output:

                         Text
0  The fox jumps over the dog
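
If the keys might contain regex metacharacters (for example "." or "+"), or if one key is a prefix of another, a variation on the same idea may be safer: escape every key with re.escape, join the keys into one alternation ordered longest first, and let a callback look up each match. This is only a sketch under those assumptions, not part of the original answer; the names keys and pattern are illustrative.

```python
import re
import pandas as pd

dic = {"quick brown fox": "fox", "lazy dog": "dog", "u": "you"}

# Longest keys first, so a longer phrase wins over a shorter key that overlaps it.
keys = sorted(dic, key=len, reverse=True)

# One compiled pattern: escaped keys joined by "|" and wrapped in word boundaries.
pattern = re.compile(r"\b(?:" + "|".join(re.escape(k) for k in keys) + r")\b")

df = pd.DataFrame({"Text": ["The quick brown fox jumps over the lazy dog"]})

# The callback receives each match object and returns the dictionary value for it.
df["Text"] = df["Text"].str.replace(pattern, lambda m: dic[m.group(0)], regex=True)
print(df)
```

Because every key is matched in a single pass, text produced by one replacement is not scanned again by a later key, which can matter when a value happens to contain another key.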