Python 3 reading CSV file with line breaks in rows

2024/10/8 12:31:42

I have a large CSV file with one column and line breaks in some of its rows. I want to read the content of each cell and write it to a text file but the CSV reader is splitting the cells with line breaks into multiple ones (multiple rows) and writing each one to a separate text file.

I am using Python 3.6.2 on macOS Sierra.

Here is an example:

"content of row 1"
"content of row 2
continues here"
"content of row 3"

And here is how I am reading it:

import csv

with open(csvFileName, 'r') as csvfile:
    lines = csv.reader(csvfile)
    i = 0
    for row in lines:
        i += 1
        # each row is a list of cells; the file has a single column
        content = row[0]
        outFile = open("output" + str(i) + ".txt", 'w')
        outFile.write(content)
        outFile.close()

This creates 4 files instead of 3 (one per row). Any suggestions on how to ignore the line break in the second row?

Answer

You could define a regular expression pattern to help you iterate over the rows.

Read the entire file contents into a single string, if the file is small enough for that.
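As a minimal sketch of that step, the file could be read into one string like this. The helper name `read_whole_file` and the throwaway demo file are made up for illustration, and UTF-8 is only an assumed encoding:

```python
import os
import tempfile

def read_whole_file(path, encoding="utf-8"):
    """Return the full text of the file at `path` as one string."""
    # newline="" turns off newline translation, so line breaks inside
    # quoted cells come through exactly as they are stored in the file.
    with open(path, "r", encoding=encoding, newline="") as f:
        return f.read()

# Demo with a temporary file standing in for the real CSV (hypothetical data).
demo_path = os.path.join(tempfile.mkdtemp(), "demo.csv")
with open(demo_path, "w", encoding="utf-8", newline="") as f:
    f.write('"content of row 1"\n"content of row 2\ncontinues here"\n"content of row 3"\n')

text = read_whole_file(demo_path)
print(repr(text))
```

This only makes sense when the whole file fits comfortably in memory.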

s = '''"content of row 1"
"content of row 2
continues here"
"content of row 3"'''

Pattern: a double-quote, followed by anything that isn't a double-quote, followed by a double-quote:

import re

row_pattern = '''"[^"]*"'''
# [^"] already matches line breaks, so the flags are not strictly required
row = re.compile(row_pattern, flags = re.DOTALL | re.MULTILINE)

Iterate the rows:

for r in row.finditer(s):
    print(r.group())
    print('******')

>>> 
"content of row 1"
******
"content of row 2
continues here"
******
"content of row 3"
******
>>>
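
To connect this back to the question, here is a rough sketch of writing each matched row to its own text file. It is based on the pattern above with a capture group added so the surrounding quotes are dropped. The function name `write_rows` and the temporary output directory are illustrative assumptions, not part of the original answer:

```python
import os
import re
import tempfile

# Based on the pattern above: an opening quote, any run of non-quote
# characters (line breaks included), and a closing quote. The added
# group captures the cell text without the surrounding quotes.
ROW_PATTERN = re.compile(r'"([^"]*)"')

def write_rows(text, out_dir):
    """Write each quoted cell in `text` to output<N>.txt inside `out_dir`."""
    paths = []
    for i, match in enumerate(ROW_PATTERN.finditer(text), start=1):
        path = os.path.join(out_dir, "output" + str(i) + ".txt")
        with open(path, "w", encoding="utf-8") as out_file:
            out_file.write(match.group(1))
        paths.append(path)
    return paths

# Hypothetical sample mirroring the three rows from the question.
sample = '"content of row 1"\n"content of row 2\ncontinues here"\n"content of row 3"'
out_dir = tempfile.mkdtemp()
written = write_rows(sample, out_dir)
print(len(written), "files written to", out_dir)
```

Note that this simple pattern assumes every cell is quoted and that no cell contains an escaped double-quote (`""`); a file that breaks either assumption would need a different approach.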