Pandas - split large excel file

2024/10/8 10:59:12

I have an excel file with about 500,000 rows and I want to split it to several excel file, each with 50,000 rows.

I want to do it with pandas so it will be the quickest and easiest.

any ideas how to make it?

thank you for your help

Answer

Assuming that your Excel file has only one (first) sheet containing data, I'd make use of chunksize parameter:

import pandas as pd
import numpy as npi=0
for df in pd.read_excel(file_name, chunksize=50000):df.to_excel('/path/to/file_{:02d}.xlsx'.format(i), index=False)i += 1

UPDATE:

chunksize = 50000
df = pd.read_excel(file_name)
for chunk in np.split(df, len(df) // chunksize):chunk.to_excel('/path/to/file_{:02d}.xlsx'.format(i), index=False)
https://en.xdnf.cn/q/70127.html

Related Q&A

Unable to verify secret hash for client at REFRESH_TOKEN_AUTH

Problem"Unable to verify secret hash for client ..." at REFRESH_TOKEN_AUTH auth flow. {"Error": {"Code": "NotAuthorizedException","Message": "Unab…

save a dependecy graph in python

I am using in python3 the stanford dependency parser to parse a sentence, which returns a dependency graph. import pickle from nltk.parse.stanford import StanfordDependencyParserparser = StanfordDepend…

What are the specific rules for constant folding?

I just realized that CPython seems to treat constant expressions, which represent the same value, differently with respect to constant folding. For example:>>> import dis >>> dis.dis(…

installing opencv for python on mavericks

I am trying to install opencv on a Macbook Pro late 2013 with mavericks. I didnt find any binaries so I am trying to build it. I tried http://www.guidefreitas.com/installing-opencv-2-4-2-on-mac-osx-mou…

Python 3 reading CSV file with line breaks in rows

I have a large CSV file with one column and line breaks in some of its rows. I want to read the content of each cell and write it to a text file but the CSV reader is splitting the cells with line brea…

Python appending dictionary, TypeError: unhashable type?

abc = {} abc[int: anotherint]Then the error came up. TypeError: unhashable type? Why I received this? Ive tried str()

Calling C# code within Python3.6

with absolutely no knowledge of coding in C#, I wish to call a C# function within my python code. I know theres quite a lot of Q&As around the same problem, but for some strange reason, im unable t…

Pycharm 3.4.1 - AppRegistryNotReady: Models arent loaded yet. Django Rest framewrok

Im using DRF and Pycharm 3.4.1 and Django 1.7. When I try to test my serializer class via Pycharm django console, it gives me the following error:Codefrom items_app.serializers import ItemSerializer s …

Pass Flask route parameters into a decorator

I have written a decorator that attempts to check we have post data for a Flask POST route:Heres my decorator:def require_post_data(required_fields=None):def decorator(f):@wraps(f)def decorated_functio…

update env variable on notebook in VsCode

I’m working on a python project with a notebook and .env file on VsCode. I have problem when trying to refresh environment variables in a notebook (I found a way but its super tricky). My project: .en…