Finding matching and nonmatching items in lists

2024/10/3 6:34:09

I'm pretty new to Python and am getting a little confused as to what you can and can't do with lists. I have two lists that I want to compare and return matching and nonmatching elements in a binary format. List1 is of constant length, while the length of List2 differs (but is always shorter than List1).

For example:

List1 = ['dog', 'cat', 'pig', 'donkey']
List2 = ['dog', 'cat', 'donkey']

Output wanted:

List3 = [1, 1, 0, 1]

The code I have so far is:

def match_nonmatch(List1, List2):
    List3 = []
    for i in range(len(List1)):
        for j in range(len(List2)):
            if List1[i] == List2[j]:
                List3.append(1)
            else:
                List3.append(0)
    return List3

I am able to return the matches when I compare the lists, but when I include the else statement shown above to return the nonmatches I end up with a list that is way longer than it should be. For instance, when I use a list comparing 60 items, I get a list that contains 3600 items rather than 60.

I'd appreciate it if someone could explain to me the problem with my code as it currently stands and suggest how I could modify the code so it does what I want.

Answer

Use set instead of list. This way you can do lots of nice things:

set1 = set(['dog', 'cat', 'pig', 'donkey'])
set2 = set(['dog', 'cat', 'donkey'])
matched = set1.intersection(set2) # set(['dog', 'cat', 'donkey'])
unmatched = set1.symmetric_difference(set2) # set(['pig'])

I know it's not exactly what you asked for, but it's usually better practice to use sets instead of lists when doing this sort of thing.

More on sets here: http://docs.python.org/library/stdtypes.html#set
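
As for why the original function returns too many items: the inner loop appends a value for every (i, j) pair, so the result has len(List1) * len(List2) entries instead of one per element of List1. A minimal sketch of one way to get the binary list from the question, assuming the output should follow the order of List1 and that the items are hashable (the lowercase names and the `lookup` variable are just illustrative):

```python
def match_nonmatch(list1, list2):
    """Return 1 for each item of list1 that also occurs in list2, else 0."""
    # Build the set once so each membership test does not rescan list2.
    lookup = set(list2)
    # Exactly one output value per element of list1, in list1's order.
    return [1 if item in lookup else 0 for item in list1]


list1 = ['dog', 'cat', 'pig', 'donkey']
list2 = ['dog', 'cat', 'donkey']
print(match_nonmatch(list1, list2))  # [1, 1, 0, 1]
```

This keeps a single pass over the first list, so the result always has the same length as list1 regardless of how long list2 is.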
