How to extract only characters from image?

2024/11/14 18:32:50

I have this type of image from that I only want to extract the characters.

enter image description here

After binarization, I am getting this image

img = cv2.imread('the_image.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
thresh = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 11, 9)

enter image description here

Then find contours on this image.

(im2, cnts, _) = cv2.findContours(thresh.copy(), cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
cnts = sorted(cnts, key=cv2.contourArea, reverse=True)
for contour in cnts[:2000]:x, y, w, h = cv2.boundingRect(contour)aspect_ratio = h/warea = cv2.contourArea(contour)cv2.drawContours(img, [contour], -1, (0, 255, 0), 2) 

I am getting

enter image description here

I need a way to filter the contours so that it selects only the characters. So I can find the bounding boxes and extract roi.

I can find contours and filter them based on the size of areas, but the resolution of the source images are not consistent. These images are taken from mobile cameras.

Also as the borders of the boxes are disconnected. I can't accurately detect the boxes.

Edit:

If I deselect boxes which has an aspect ratio less than 0.4. Then it works up to some extent. But I don't know if it will work or not for different resolution of images.

for contour in cnts[:2000]:x, y, w, h = cv2.boundingRect(contour)aspect_ratio = h/warea = cv2.contourArea(contour)if aspect_ratio < 0.4:continueprint(aspect_ratio)cv2.drawContours(img, [contour], -1, (0, 255, 0), 2)

enter image description here

Answer

Not so difficult...

import cv2img = cv2.imread('img.jpg')gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
cv2.imshow('gray', gray)ret, thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_OTSU)
cv2.imshow('thresh', thresh)im2, ctrs, hier = cv2.findContours(thresh.copy(), cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
sorted_ctrs = sorted(ctrs, key=lambda ctr: cv2.boundingRect(ctr)[0])for i, ctr in enumerate(sorted_ctrs):x, y, w, h = cv2.boundingRect(ctr)roi = img[y:y + h, x:x + w]area = w*hif 250 < area < 900:rect = cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)cv2.imshow('rect', rect)cv2.waitKey(0)

Result

res

You can tweak the code like you want (here it can save ROI using original image; for eventually OCR recognition you have to save them in binary format - better methods than sorting by area are available)

Source: Extract ROI from image with Python and OpenCV and some of my knowledge.

Just kidding, take a look at my questions/answers.

https://en.xdnf.cn/q/71926.html

Related Q&A

connect to Azure SQL database via pyodbc

I use pyodbc to connect to my local SQL database which works withoout problems.SQLSERVERLOCAL=Driver={SQL Server Native Client 11.0};Server=(localdb)\\v11.0;integrated security = true;DATABASE=eodba; c…

Generator function for prime numbers [duplicate]

This question already has answers here:Simple prime number generator in Python(28 answers)Closed 9 years ago.Im trying to write a generator function for printing prime numbers as followsdef getPrimes(n…

`ValueError: too many values to unpack (expected 4)` with `scipy.stats.linregress`

I know that this error message (ValueError: too many values to unpack (expected 4)) appears when more variables are set to values than a function returns. scipy.stats.linregress returns 5 values accord…

Django prefetch_related from foreignkey with manytomanyfield not working

For example, in Django 1.8:class A(models.Model):x = models.BooleanField(default=True)class B(models.Model):y = models.ManyToManyField(A)class C(models.Model):z = models.ForeignKey(A)In this scenario, …

Running SimpleXMLRPCServer in separate thread and shutting down

I have a class that I wish to test via SimpleXMLRPCServer in python. The way I have my unit test set up is that I create a new thread, and start SimpleXMLRPCServer in that. Then I run all the test, and…

Clean Python multiprocess termination dependant on an exit flag

I am attempting to create a program using multiple processes and I would like to cleanly terminate all the spawned processes if errors occur. below Ive wrote out some pseudo type code for what I think …

Any limitations on platform constraints for wheels on PyPI?

Are there any limitations declared anywhere (PEPs or elsewhere) about how broad a scope the Linux wheels uploaded to PyPI should have? Specifically: is it considered acceptable practice to upload li…

Match set of dictionaries. Most elegant solution. Python

Given two lists of dictionaries, new one and old one. Dictionaries represent the same objects in both lists. I need to find differences and produce new list of dictionaries where will be objects from n…

Cookies using Python and Google App Engine

Im developing an app on the Google App Engine and have run into a problem. I want to add a cookie to each user session so that I will be able to differentiate amongst the current users. I want them all…

Matplotlib animations - how to export them to a format to use in a presentation?

So, I learned how to make cute little animations in matplotlib. For example, this:import numpy as np import matplotlib import matplotlib.pyplot as pltplt.ion()fig = plt.figure() ax = fig.add_subplot(…