Download A Single File Using Multiple Threads

2024/10/3 19:25:58

I'm trying to create a 'Download Manager' for Linux that lets me download a single file using multiple threads. This is what I'm trying to do:

  1. Divide the file to be downloaded into different parts by specifying an offset
  2. Download the different parts into a temporary location
  3. Merge them into a single file.

Steps 2 and 3 are solvable, and it is at Step #1 that I'm stuck. How do I specify an offset while downloading a file?

Using something along the lines of open("/path/to/file", "wb").write(urllib2.urlopen(url).read()) does not let me specify a starting point to read from. Is there any alternative to this?

Answer

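First, the HTTP server should return a Content-Length header, so that you know the total size to split. This usually means the file is static. If the response is generated dynamically, for example by PHP or JSP, the server often sends no length and does not support partial requests, so you cannot split the download this way.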

Then use the HTTP Range header in your requests. This header tells the server which part of the file to return. See the Python documentation for how to set request headers and read response headers.
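As a minimal sketch of setting and reading those headers, assuming Python 3's urllib.request rather than the urllib2 used in the question: the helper names build_range_request, parse_content_range, and probe_total_size are hypothetical, chosen only for illustration.

```python
import urllib.request


def build_range_request(url, start, end):
    """Build a request for bytes start..end (both inclusive) of url.

    Hypothetical helper; the Range syntax "bytes=START-END" is standard HTTP.
    """
    return urllib.request.Request(url, headers={"Range": f"bytes={start}-{end}"})


def parse_content_range(value):
    """Parse a header value such as 'bytes 0-99/1234' into (start, end, total).

    total is None when the server reports it as '*' (unknown).
    Unsatisfied-range replies ('bytes */1234') are not handled in this sketch.
    """
    unit, _, rest = value.partition(" ")
    if unit != "bytes":
        raise ValueError(f"unsupported range unit: {unit!r}")
    span, _, total = rest.partition("/")
    start, _, end = span.partition("-")
    return int(start), int(end), (None if total == "*" else int(total))


def probe_total_size(url):
    """Ask for the first byte only and read the total size from Content-Range.

    Returns None if the server ignores Range (it then answers 200, not 206).
    """
    with urllib.request.urlopen(build_range_request(url, 0, 0)) as resp:
        if resp.status != 206:
            return None
        return parse_content_range(resp.headers["Content-Range"])[2]
```

Calling probe_total_size(url) against a real server is one way to check whether ranges are supported before starting any threads.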

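For example, with a part size of 100,000 bytes, first request Range: bytes=0-99999 to get the first part. A server that supports ranges answers with 206 Partial Content. Note that the Content-Length of that response is the size of the part, not of the file; the total file size is the number after the slash in its Content-Range header (for example, Content-Range: bytes 0-99999/1234567). Once you know the total size, start several threads, each requesting a different Range, and it will work.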
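The whole procedure could be sketched roughly as follows, again assuming Python 3. The names split_ranges, remote_size, fetch_range, and download are hypothetical. To stay short, this sketch keeps the parts in memory instead of writing them to a temporary location, and it does no retries or error recovery.

```python
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def split_ranges(total_size, parts):
    """Return inclusive (start, end) byte offsets that cover total_size bytes."""
    if total_size <= 0 or parts <= 0:
        raise ValueError("total_size and parts must be positive")
    parts = min(parts, total_size)          # never create empty parts
    base, extra = divmod(total_size, parts)
    ranges, start = [], 0
    for i in range(parts):
        length = base + (1 if i < extra else 0)
        ranges.append((start, start + length - 1))
        start += length
    return ranges


def remote_size(url):
    """Read the total size from Content-Length of a HEAD response."""
    req = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(req) as resp:
        return int(resp.headers["Content-Length"])


def fetch_range(url, start, end):
    """Download bytes start..end (inclusive) using the Range header."""
    req = urllib.request.Request(url, headers={"Range": f"bytes={start}-{end}"})
    with urllib.request.urlopen(req) as resp:
        if resp.status != 206:              # 200 means Range was ignored
            raise RuntimeError("server did not honour the Range header")
        return resp.read()


def download(url, dest, parts=4, fetch=fetch_range, size=None):
    """Fetch all parts in worker threads, then write them out in order."""
    size = remote_size(url) if size is None else size
    ranges = split_ranges(size, parts)
    with ThreadPoolExecutor(max_workers=len(ranges)) as pool:
        # pool.map preserves input order, so chunks line up with ranges
        chunks = list(pool.map(lambda r: fetch(url, r[0], r[1]), ranges))
    with open(dest, "wb") as out:
        for chunk in chunks:
            out.write(chunk)
```

For large files, each worker would more likely write its part to its own temporary file (or seek to its offset in the destination file) rather than hold everything in memory; the fetch and size parameters exist only so the function can be exercised without a network.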

