Combining multiple conditional expressions in a list comprehension

2024/11/15 17:31:24

I utf-8 encode characters like \u2013 before inserting them into SQLite.

When I pull them out with a SELECT, they are back in their unencoded form, so I need to re-encode them if I want to do anything with them. In this case, I want to write the rows to a CSV. Before writing the rows to the CSV, I want to add a hyperlink to any cell whose value starts with 'http'. Some values will be ints, dates, etc., so I use the following conditional expression / list comprehension combo:

row = ['=HYPERLINK("%s")' % cell if 'http' in str(cell) else cell for cell in row]

The str() operation then results in the well-known error:

UnicodeEncodeError: 'ascii' codec can't encode character u'\u2013' in position 15: ordinal not in range(128)

What I need, then, is to perform the .encode('utf-8') encoding again, but only on those elements in the list that are strings to begin with. The following won't work (since not all elements are strings):

['=HYPERLINK("%s")' % cell if 'http' in str(cell).encode('utf8') else cell.encode('utf8') for cell in row]

TLDR: How do I expand/modify the list comprehension to only encode an element if it's a string?

Answer

In general, work in terms of unicode as long as possible, and encode unicode to bytes (i.e. strs) only when necessary, such as when writing output to a network socket or file.

Do not mix strs with unicode. Although this is permitted in Python2, it causes Python2 to implicitly convert str to unicode or vice versa as necessary using the ascii codec. If the implicit encoding or decoding fails, you get a UnicodeEncodeError or UnicodeDecodeError, respectively, such as the one you are seeing.
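To make the failure concrete, here is a minimal sketch that performs explicitly the ascii encoding Python2 attempts implicitly. The sample value text is made up for illustration; the snippet is written so that it should behave the same on Python 2 and Python 3.

```python
# Hypothetical sample value containing an en dash (U+2013).
text = u'2010\u20132015'

# Explicitly do what Python2 does implicitly when unicode meets str:
# encode with the ascii codec, which cannot represent U+2013.
try:
    text.encode('ascii')
    error_name = None
except UnicodeEncodeError as exc:
    error_name = type(exc).__name__

# Encoding with utf-8 succeeds, and decoding gives the original text back.
utf8_bytes = text.encode('utf-8')
round_tripped = utf8_bytes.decode('utf-8')
```

The point is only that the error comes from the ascii codec, not from the data itself; the same text encodes cleanly as utf-8.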

Since cell is unicode, use u'=HYPERLINK("{}")'.format(cell) or u'=HYPERLINK("%s")' % cell instead of '=HYPERLINK("%s")' % cell. (Note that you may want to url-encode cell in case cell contains a double quote).

row = [u'=HYPERLINK("{}")'.format(cell) if isinstance(cell, unicode) and cell.startswith(u'http') else cell for cell in row]
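As a rough, self-contained sketch of the comprehension above: the helper name add_hyperlinks, the alias text_type and the sample row are assumptions for illustration only. text_type stands in for unicode so the snippet also runs on Python 3, where unicode does not exist.

```python
# Alias for the text type: `unicode` on Python 2, `str` on Python 3.
text_type = type(u'')


def add_hyperlinks(row):
    """Wrap text cells that start with 'http' in a spreadsheet HYPERLINK formula.

    Non-text cells (ints, dates, None, ...) are passed through unchanged.
    """
    return [
        u'=HYPERLINK("{}")'.format(cell)
        if isinstance(cell, text_type) and cell.startswith(u'http')
        else cell
        for cell in row
    ]


# Hypothetical row mixing text, an int and None.
row = [u'http://example.com/a\u2013b', 42, u'plain text', None]
linked = add_hyperlinks(row)
```

Because the isinstance check comes first, startswith is never called on an int or a date, and no str() call is needed at all.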

Later, when/if you need to convert row to strs, you could use

row = [cell.encode('utf-8') if isinstance(cell, unicode) else str(cell) for cell in row]
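A possible sketch of that final encoding step, written to run on both Python 2 and Python 3. The names encode_row and text_type are hypothetical, and the else branch deliberately differs from the line above: it converts non-text cells to text and then encodes them, so that every cell ends up as bytes on either Python version.

```python
# Alias for the text type: `unicode` on Python 2, `str` on Python 3.
text_type = type(u'')


def encode_row(row, encoding='utf-8'):
    """Return a new list in which every cell is an encoded byte string."""
    return [
        cell.encode(encoding) if isinstance(cell, text_type)
        # Assumption: non-text cells have a sensible text representation.
        else text_type(cell).encode(encoding)
        for cell in row
    ]


# Hypothetical row: one unicode cell with an en dash, plus two numbers.
encoded = encode_row([u'a\u2013b', 42, 3.5])
```

Keeping this step last means the rest of the code only ever sees unicode text, in line with the rule of encoding only at the output boundary.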

Alternatively, convert everything in row to strs first:

row = [cell.encode('utf-8') if isinstance(cell, unicode) else str(cell) for cell in row]

and then you could use

row = ['=HYPERLINK("{}")'.format(cell) if cell.startswith('http') else cell for cell in row]

Similarly, since row contains cells which are unicode, perform the test

if u'http' in cell

using the unicode u'http' instead of the str 'http', or better yet,

if isinstance(cell, unicode) and cell.startswith(u'http')

Although no error arises if you keep 'http' here (since the ascii codec can decode bytes in the 0-127 range), it is good practice to use u'http' anyway, since it conforms to the rule of never mixing str and unicode and keeps the code easier to reason about.
