What are the use cases for a Python distribution?

2024/5/20 17:16:34

I'm developing a distribution for the Python package I'm writing so I can post it on PyPI. It's my first time working with distutils, setuptools, distribute, pip, setup.py and all that and I'm struggling a bit with a learning curve that's quite a bit steeper than I anticipated :)

I was having trouble getting some of my test data files included in the tarball by specifying them in the data_files parameter in setup.py, until I came across a different post here that pointed me toward the MANIFEST.in file. That's when it clicked that what you include in the tarball/zip (using MANIFEST.in) and what gets installed in a user's Python environment when they run easy_install or similar (based on what you specify in setup.py) are two very different things; in general, the tarball contains a lot more than what actually gets installed.

This immediately triggered a code-smell for me and the realization that there must be more than one use case for a distribution; I had been fixated on the only one I've really participated in, using easy_install or pip to install a library. And then I realized I was developing work product where I had only a partial understanding of the end-users I was developing for.

So my question is this: "What are the use cases for a Python distribution other than installing it in one's Python environment? Who else am I serving with this distribution and what do they care most about?"

Here are some of the working issues I haven't figured out yet that bear on the answer:

  • Is it sensible to include everything that's under source control (Git) in the source distribution? In the age of GitHub, does anyone download a source distribution to get access to the full project source? Or should I just post a link to my GitHub repo? Won't including everything bloat the distribution and make it slower to download for folks who just want to install it?

  • I'm going to host the documentation on readthedocs.org. Does it make any sense for me to include HTML versions of the docs in the source distribution?

  • Does anyone use python setup.py test to run tests on a source distribution? If so, what role and situation are they in? I don't know whether I should bother making that work and, if I do, who to make it work for.

Answer

Things you might want to ship in the source distribution without necessarily installing them include:

  • the package's license
  • a test suite
  • the documentation (possibly a processed form like HTML in addition to the source)
  • possibly any additional scripts used to build the source distribution

Quite often this will be most or all of what you are managing in version control, plus possibly a few generated files.
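To see this difference for a concrete archive, a minimal sketch along the following lines may help. It lists the files in an sdist tarball and separates those under the package directory (the likely candidates for installation) from everything else. The function name `split_sdist_members`, the archive name and the package name `mypkg` are hypothetical, and the sketch assumes the common layout where everything sits under a single top-level `name-version/` directory.

```python
import tarfile


def split_sdist_members(archive_path, package_name):
    """Split the files of an sdist tarball into two sorted lists.

    Returns (package_files, sdist_only): files under the package directory,
    and everything else shipped in the archive (license, tests, docs, ...).
    Assumes every member sits under one top-level "name-version/" directory.
    """
    package_files, sdist_only = [], []
    with tarfile.open(archive_path) as archive:
        for member in archive.getmembers():
            if not member.isfile():
                continue
            # Drop the leading "name-version/" path component.
            _, _, relative = member.name.partition("/")
            if not relative:
                continue  # a file outside the expected top-level directory
            if relative.startswith(package_name + "/"):
                package_files.append(relative)
            else:
                sdist_only.append(relative)
    return sorted(package_files), sorted(sdist_only)
```

Calling `split_sdist_members("dist/mypkg-1.0.tar.gz", "mypkg")` on a real archive would show how much of the tarball lives outside the package itself. What actually gets installed still depends on the options given in setup.py, so treat the first list only as an approximation.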

The main reason why you would do this when those files are available online or through version control is so that people know they have the version of the docs or tests that matches the code they're running.

If you only host the most recent version of the docs online, then they might not be useful to someone who has to use an older version for some reason. And the test suite on the tip in version control may not be compatible with the version of the code in the source distribution (e.g. if it tests features added since then). To get the right version of the docs or tests, they would need to comb through version control looking for a tag that corresponds to the source distribution (assuming the developers bothered tagging the tree). Having the files available in the source distribution avoids this problem.

As for people wanting to run the test suite, I have a number of my Python modules packaged in various Linux distributions and occasionally get bug reports related to test failures in their environments. I've also used the test suites of other people's modules when I encounter a bug and want to check whether the external code is behaving as the author expects in my environment.

