How to use pkgutil.get_data with csv.reader in Python?

2024/9/22 23:42:53

I have a Python module with a variety of data files (a set of csv files representing curves) that need to be loaded at runtime. The csv module works very well for this:

  # curvefile = "ntc.10k.csv"
  raw = csv.reader(open(curvefile, 'rb'), delimiter=',')

But if I import this module into another script, I need to find the full path to the data file.

/project
    /shared
        curve.py
        ntc.10k.csv
        ntc.2k5.csv
    /apps
        script.py

I want the script.py to just refer to the curves by basic filename, not with full paths. In the module code, I can use:

pkgutil.get_data("curve", "ntc.10k.csv") 

which works very well at finding the file, but it returns the contents of the csv file already read in, whereas csv.reader requires the file handle itself. Is there any way to make these two modules play well together? They're both standard library modules, so I wasn't really expecting problems. I know I can start splitting the binary data returned by pkgutil myself, but then I might as well not be using the csv library.

I know I can just use this in the module code and forget about pkgutil, but it seems like pkgutil is really exactly what this is for.

this_dir, this_filename = os.path.split(__file__)
DATA_PATH = os.path.join(this_dir, curvefile)
raw = csv.reader(open(DATA_PATH, "rb"))
Answer

I opened up the source code to get_data, and it is trivial to have it return the path to the file instead of the loaded file. This module should do the trick. Use the keyword as_string=True to return the file read into memory, or as_string=False, to return the path.

import os, sys
from pkgutil import get_loader

def get_data_smart(package, resource, as_string=True):
    """Rewrite of pkgutil.get_data() that actually lets the user determine if data should
    be returned read into memory (aka as_string=True) or just return the file path.
    """
    loader = get_loader(package)
    if loader is None or not hasattr(loader, 'get_data'):
        return None
    mod = sys.modules.get(package) or loader.load_module(package)
    if mod is None or not hasattr(mod, '__file__'):
        return None

    # Modify the resource name to be compatible with the loader.get_data
    # signature - an os.path format "filename" starting with the dirname of
    # the package's __file__
    parts = resource.split('/')
    parts.insert(0, os.path.dirname(mod.__file__))
    resource_name = os.path.join(*parts)
    if as_string:
        return loader.get_data(resource_name)
    else:
        return resource_name
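
An alternative that avoids needing a file path at all: csv.reader accepts any iterable of text lines, so the bytes returned by pkgutil.get_data can be decoded and wrapped in an in-memory text stream. The following is a minimal sketch, assuming Python 3 and UTF-8 encoded csv files; the helper names rows_from_bytes and read_packaged_csv are hypothetical and not part of pkgutil or csv.

```python
import csv
import io
import pkgutil


def rows_from_bytes(data, encoding="utf-8"):
    """Parse CSV bytes (such as those returned by pkgutil.get_data) into a list of rows."""
    # newline="" keeps line endings untranslated, which is what the csv module expects.
    text = io.StringIO(data.decode(encoding), newline="")
    return list(csv.reader(text, delimiter=","))


def read_packaged_csv(package, resource, encoding="utf-8"):
    """Load a csv resource that lives next to an importable package or module."""
    data = pkgutil.get_data(package, resource)
    if data is None:
        # get_data returns None when the package cannot be located
        # or its loader does not support get_data.
        raise FileNotFoundError(f"{resource!r} not found in package {package!r}")
    return rows_from_bytes(data, encoding)
```

With the layout from the question, and the shared directory on the import path, the call would look like read_packaged_csv("curve", "ntc.10k.csv"). Each row comes back as a list of strings, exactly as csv.reader would yield them from an open file.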