Remove substring from string if substring in list in data frame column

2024/9/20 15:22:50

I have the following data frame df1

       string             lists
0      i have a dog       ['fox', 'dog', 'cat']
1      there is a cat     ['dog', 'house', 'car']
2      hello everyone     ['hi', 'hello', 'everyone']
3      hi my name is Joe  ['name', 'was', 'Joe']

I'm trying to return a data frame df2 that looks like this

       string             lists                         new_string
0      i have a dog       ['fox', 'dog', 'cat']         i have a
1      there is a cat     ['dog', 'house', 'car']       there is a cat
2      hello everyone     ['hi', 'hello', 'everyone']   
3      hi my name is Joe  ['name', 'was', 'Joe']        hi my is

I've referenced other questions such as https://stackoverflow.com/a/40493603/5879909, but I'm having trouble searching through a list in a column as opposed to a preset list.

Answer

Assuming the dataframe is df and the goal is to create a new column, new_string, that holds each row's string with any word found in that same row's lists entry removed, the following will do the work:

df['new_string'] = df['string'].apply(lambda x: ' '.join([word for word in x.split() if word not in df['lists'][df['string'] == x].values[0]]))

[Out]:
              string                  lists      new_string
0       i have a dog        [fox, dog, cat]        i have a
1     there is a cat      [dog, house, car]  there is a cat
2     hello everyone  [hi, hello, everyone]                
3  hi my name is Joe       [name, was, Joe]        hi my is
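
The one-liner above finds each row's list by matching on the string value. If two rows hold the same string, it uses the first match for both, and it rescans the column for every row. A row-wise variant that pairs each string with its own list is sketched below. It assumes the lists column holds real Python lists (not their string representations) and that "words" are whitespace-separated tokens. The helper name drop_listed_words is made up for illustration.

```python
import pandas as pd

# Sample data rebuilt from the question; `lists` is assumed to hold real lists.
df = pd.DataFrame({
    "string": ["i have a dog", "there is a cat", "hello everyone", "hi my name is Joe"],
    "lists": [
        ["fox", "dog", "cat"],
        ["dog", "house", "car"],
        ["hi", "hello", "everyone"],
        ["name", "was", "Joe"],
    ],
})


def drop_listed_words(text, words):
    """Return `text` without the whitespace-separated tokens that appear in `words`."""
    blocked = set(words)  # set membership is cheaper than scanning the list per token
    return " ".join(tok for tok in text.split() if tok not in blocked)


# Pair each string with the list from the same row, so duplicate strings are safe.
df["new_string"] = [
    drop_listed_words(s, words) for s, words in zip(df["string"], df["lists"])
]

print(df)
```

Matching is exact and case-sensitive per token, so 'Joe' is removed but 'joe' or 'Joe,' would not be.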
