Format OCR text annotation from Cloud Vision API in Python

2024/9/16 23:07:27

I am using the Google Cloud Vision API for Python in a small program. The function works and I get the OCR results, but I need to format them before I can work with them.

This is the function:

# Call to OCR API
def detect_text_uri(uri):
    """Detects text in the file located in Google Cloud Storage or on the Web."""
    client = vision.ImageAnnotatorClient()
    image = types.Image()
    image.source.image_uri = uri
    response = client.text_detection(image=image)
    texts = response.text_annotations
    for text in texts:
        textdescription = ("    " + text.description)
        return textdescription

I specifically need to split the text line by line, adding four spaces at the beginning of each line and a line break at the end. At the moment this only works for the first line; the rest is returned as a single blob.

I've checked the official documentation but couldn't find a description of the format of the API's response.

Answer

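You are almost there. Since you want to process the text line by line, don't loop over the text annotations; instead, read the 'description' of the first annotation in Google Vision's response, as shown below. The first annotation holds the full detected text, while the remaining ones hold individual words.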

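from google.cloud import vision

def parse_image(image_path=None):
    """
    Parse the image using the Google Cloud Vision API.
    Detects "document" features in an image.

    :param image_path: path of the image
    :return: text content
    :rtype: str
    """
    client = vision.ImageAnnotatorClient()
    # Open the image in a with block so the file handle is closed afterwards
    with open(image_path, 'rb') as image_file:
        response = client.text_detection(image=image_file)
    text = response.text_annotations
    del response     # to clean-up the system memory
    # The first annotation holds the full text; return "" if nothing was detected
    return text[0].description if text else ""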

The above function returns a string with the text content of the image, with the lines separated by "\n".
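
To see why the loop in the question returned a blob, here is a minimal sketch that uses stand-in objects instead of a real API response. The sample text and the helper name first_description_prefixed are made up for illustration, and the sketch assumes the return statement in the question sits inside the for loop.

```python
from types import SimpleNamespace

# Stand-in for response.text_annotations (hypothetical sample data, not a real API response).
# The first entry holds the full detected text; later entries are individual words.
annotations = [
    SimpleNamespace(description="Hello world\nSecond line"),
    SimpleNamespace(description="Hello"),
    SimpleNamespace(description="world"),
    SimpleNamespace(description="Second"),
    SimpleNamespace(description="line"),
]

def first_description_prefixed(texts):
    # Mirrors the loop in the question: it returns on the first iteration,
    # so only the full-text entry is ever prefixed.
    for text in texts:
        return "    " + text.description
    return ""

# What the question's approach yields: one prefix, then the rest as a blob.
blob = first_description_prefixed(annotations)

# Splitting the first description gives the individual lines instead.
lines = annotations[0].description.split("\n")
```

Only the first line of blob carries the four-space prefix, because the prefix is applied once to the whole multi-line string.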

Now you can add whatever prefix and suffix you need to each line.

image_content = parse_image(image_path=r"path\to\image")  # raw string so "\t" is not read as a tab
my_formatted_text = ""
for line in image_content.split("\n"):
    my_formatted_text += "    " + line + "\n"

my_formatted_text is the text you need.
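
The same formatting step can also be wrapped in a small helper that needs no API call, which makes it easy to try out. This is only a sketch: the name indent_lines and the sample string are hypothetical. It uses str.splitlines() rather than split("\n"), so a trailing newline in the input does not produce an extra, empty indented line.

```python
def indent_lines(text, prefix="    "):
    """Prefix every line of text and end each line with a newline."""
    # splitlines() drops the line endings and does not yield a trailing
    # empty item when the text itself ends with a newline.
    return "".join(prefix + line + "\n" for line in text.splitlines())

# Hypothetical sample standing in for the OCR result.
sample = "first line\nsecond line\n"
formatted = indent_lines(sample)
```

Here formatted has each of the two lines indented by four spaces and terminated by a single line break.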
