How to show diff of two string sequences in colors?

2024/10/1 12:28:24

I'm trying to find a Python way to diff strings. I know about difflib but I haven't been able to find an inline mode that does something similar to what this JS library does (insertions in green, deletions in red):

one_string =   "beep boop"
other_string = "beep boob blah"

[Image: colored diff]

Is there a way to achieve this?

Answer

One possible way (see also @interjay's comment to the OP) is

import difflib

red = lambda text: f"\033[38;2;255;0;0m{text}\033[38;2;255;255;255m"
green = lambda text: f"\033[38;2;0;255;0m{text}\033[38;2;255;255;255m"
blue = lambda text: f"\033[38;2;0;0;255m{text}\033[38;2;255;255;255m"
white = lambda text: f"\033[38;2;255;255;255m{text}\033[38;2;255;255;255m"

def get_edits_string(old, new):
    result = ""
    codes = difflib.SequenceMatcher(a=old, b=new).get_opcodes()
    for code in codes:
        if code[0] == "equal":
            result += white(old[code[1]:code[2]])
        elif code[0] == "delete":
            result += red(old[code[1]:code[2]])
        elif code[0] == "insert":
            result += green(new[code[3]:code[4]])
        elif code[0] == "replace":
            result += (red(old[code[1]:code[2]]) + green(new[code[3]:code[4]]))
    return result

This depends only on difflib and can be tested with

one_string =   "beep boop"
other_string = "beep boob blah"
print(get_edits_string(one_string, other_string))

[Image: colored diff]
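As a variation on the same idea, here is a minimal sketch that splits the work into two steps: turning the opcodes from difflib.SequenceMatcher into plain (kind, text) segments, and then coloring those segments. The names diff_segments and render are made up for this illustration. It also assumes a terminal that understands the basic SGR codes (31 for red, 32 for green, 0 for reset), so the text after a colored span goes back to the terminal's default color instead of being forced to white.

```python
import difflib

# Basic SGR escape codes: 31 = red, 32 = green, 0 = reset all attributes.
RED = "\033[31m"
GREEN = "\033[32m"
RESET = "\033[0m"


def diff_segments(old, new):
    """Return (kind, text) pairs; kind is 'equal', 'delete' or 'insert'."""
    segments = []
    matcher = difflib.SequenceMatcher(a=old, b=new)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            segments.append(("equal", old[i1:i2]))
        # A 'replace' opcode is treated as a deletion followed by an insertion.
        if tag in ("delete", "replace"):
            segments.append(("delete", old[i1:i2]))
        if tag in ("insert", "replace"):
            segments.append(("insert", new[j1:j2]))
    return segments


def render(segments):
    """Wrap deletions in red and insertions in green, resetting after each."""
    colors = {"delete": RED, "insert": GREEN}
    parts = []
    for kind, text in segments:
        if kind in colors:
            parts.append(f"{colors[kind]}{text}{RESET}")
        else:
            parts.append(text)
    return "".join(parts)


segments = diff_segments("beep boop", "beep boob blah")
print(segments)
print(render(segments))
```

For the two example strings, the segments come out as an unchanged "beep boo", a deleted "p", and an inserted "b blah". Keeping the segments separate from the rendering makes it possible to check the diff itself without looking at escape codes, or to swap in a different output format later.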

