Python: download a large CSV file from a URL line by line, only the first 10 entries

2024/9/19 10:01:27

I have a large CSV file from a client, shared via a URL. I want to download it line by line (or by bytes) and stop after the first 10 entries.

The following code downloads the file, but I want it to download only the first 10 entries, not the full file.

#!/usr/bin/env python
import csv
from contextlib import closing

import requests

url = "https://example.com.au/catalog/food-catalog.csv"

with closing(requests.get(url, stream=True)) as r:
    # Decode each raw byte line lazily as the reader asks for it.
    f = (line.decode('utf-8') for line in r.iter_lines())
    reader = csv.reader(f, delimiter=',', quotechar='"')
    for row in reader:
        print(row)

I don't know much about contextlib or how it works with the with statement in Python.

Can anyone help me here? It would be really helpful. Thanks in advance.

Answer

The issue is not so much with contextlib as with generators. When your with block ends, the connection will be closed, fairly straightforwardly.
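To illustrate what closing does on its own, here is a minimal sketch. FakeResponse is a hypothetical stand-in invented for this example (it is not part of requests); the only thing it shares with a real response is a close() method, which is all that closing needs.

```python
from contextlib import closing


class FakeResponse:
    """Hypothetical stand-in for a response object; it only tracks close()."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


resp = FakeResponse()
with closing(resp) as r:
    # Inside the block, r is the same object and it is still open.
    inside = r.closed

# When the block exits (normally, via break, or via an exception),
# closing() calls resp.close().
after = resp.closed
print(inside, after)
```

So closing(x) simply guarantees that x.close() runs when the with block is left, however it is left.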

The part that actually does the download is for row in reader:, since reader is wrapped around f, which is a lazy generator. Each iteration of the loop will actually read a line from the stream, possibly with some internal buffering by Python.
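The laziness can be checked without touching the network. In the sketch below, fake_iter_lines is a hypothetical stand-in for r.iter_lines() that records every line it hands out, so you can see that building the generator and the reader reads nothing, and that each row pulls one line.

```python
import csv

pulled = []  # indices of the raw lines actually read from the "stream"


def fake_iter_lines():
    """Hypothetical stand-in for r.iter_lines(): yields byte lines one at a time."""
    for i in range(1000):
        pulled.append(i)
        yield f"{i},item{i}".encode("utf-8")


f = (line.decode("utf-8") for line in fake_iter_lines())
reader = csv.reader(f)

# Nothing has been read yet: creating the generator and the reader is lazy.
before = len(pulled)

# Asking for one row pulls one line from the source.
first = next(reader)
after = len(pulled)
print(before, after, first)
```

With a real streamed response, the network layer may still fetch data in larger chunks than one line, but the loop itself only asks for as many lines as it iterates over.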

The key then is to stop the loop after 10 lines. There are a couple of simple ways of doing that:

for count, row in enumerate(reader, start=1):
    print(row)
    if count == 10:
        break

Or

from itertools import islice
...
for row in islice(reader, 0, 10):
    print(row)
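Putting the pieces together, the following is a sketch rather than a definitive implementation. The helper name first_rows and the simulated sample stream are assumptions made for this example, so that it runs without network access; islice(reader, n) is equivalent to islice(reader, 0, n).

```python
import csv
from itertools import islice


def first_rows(byte_lines, n=10, encoding="utf-8"):
    """Return the first n CSV rows from an iterable of byte lines."""
    text_lines = (line.decode(encoding) for line in byte_lines)
    reader = csv.reader(text_lines, delimiter=",", quotechar='"')
    # islice stops asking for rows once it has yielded n of them.
    return list(islice(reader, n))


# Simulated stream of 100 lines standing in for r.iter_lines().
sample = (f"{i},food{i}".encode("utf-8") for i in range(100))
rows = first_rows(sample)

# Count what is left in the stream to confirm only 10 lines were consumed.
remaining = sum(1 for _ in sample)
print(len(rows), rows[0], rows[-1], remaining)
```

With the code from the question, the same helper would be called as first_rows(r.iter_lines()) inside the with closing(requests.get(url, stream=True)) as r: block, so the connection is closed as soon as the 10 rows have been collected.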
