The Pythonic way of organizing modules and packages

2024/11/20 8:32:16

I come from a background where I normally create one file per class, and I group related classes under directories as well. This practice is intuitive to me and has proven effective in C++, PHP, JavaScript, etc.

I am having trouble bringing this habit into Python: files are no longer just files, they are formal modules. It doesn't seem right to have just one class in a module, since most classes are useless by themselves. If I have an automobile.py containing an Automobile class, it also seems silly to always reference it as automobile.Automobile.

But, at the same time, it doesn't seem right to throw a ton of code into one file and call it a day. Obviously, a very complex application should have more than 5 files.

What is the correct, or Pythonic, way? (Or if there is no correct way, what is your preferred way and why?) How much code should I put in a single Python module?

Answer

Think in terms of a "logical unit of packaging", which may be a single class but more often will be a set of classes that closely cooperate. Group classes (or module-level functions: don't "do Java in Python" by always using static methods when module-level functions are also available as a choice!) by this criterion. Basically, if most users of A also need B and vice versa, A and B should probably be in the same module; but if many users will need only one of them and not the other, they should probably be in distinct modules (perhaps in the same package, i.e., a directory with an __init__.py file in it).
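As a minimal sketch of that criterion, here is what a single module might look like. The file name vehicles.py and every name in it (Engine, Automobile, miles_to_km) are hypothetical, chosen only for illustration. The two classes live together because nearly every user of one needs the other, and the unit conversion is a plain module-level function rather than a static method on a helper class.

```python
# vehicles.py: hypothetical module grouping classes that are used together.

KM_PER_MILE = 1.609344


def miles_to_km(miles):
    """Module-level function; no wrapper class with a static method needed."""
    return miles * KM_PER_MILE


class Engine:
    """Rarely useful on its own, so it shares a module with Automobile."""

    def __init__(self, horsepower):
        self.horsepower = horsepower


class Automobile:
    """Almost every user of Automobile also builds an Engine."""

    def __init__(self, name, engine):
        self.name = name
        self.engine = engine

    def describe(self):
        return f"{self.name} ({self.engine.horsepower} hp)"


if __name__ == "__main__":
    car = Automobile("roadster", Engine(150))
    print(car.describe())
    print(miles_to_km(10))
```

With a layout like this, callers can write "from vehicles import Automobile, Engine" and use the names directly, which avoids the repetitive automobile.Automobile spelling from the question.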

The standard Python library, while far from perfect, mostly reflects reasonably good practices, so you can learn a lot from it by example. E.g., the threading module of course defines a Thread class, but it also holds the synchronization-primitive classes such as locks, events, conditions, and semaphores, and an exception class that can be raised by threading operations (and a few more things). It's at the upper bound of reasonable size (800 lines including whitespace and docstrings at the time of writing), and some crucial thread-related functionality such as Queue (the queue module in Python 3) has been placed in a separate module. Nevertheless, it's a good example of the maximum amount of functionality it still makes sense to pack into a single module.
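A small sketch of how that split looks from the caller's side, assuming Python 3 (where the Queue module is spelled queue). Thread, Lock, and Event all come from threading, while the thread-safe queue is imported from its own module. The function run_workers and its arguments are hypothetical names used only for this illustration.

```python
import queue
import threading


def run_workers(count):
    """Start `count` threads; return the sum of their ids and their squares."""
    results = queue.Queue()      # lives in its own module, not in threading
    lock = threading.Lock()      # synchronization primitive from threading
    done = threading.Event()     # another primitive from the same module
    total = 0

    def worker(n):
        nonlocal total
        with lock:               # protect the shared counter
            total += n
        results.put(n * n)       # Queue does its own locking

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    done.set()                   # signal that all workers have finished

    squares = sorted(results.get() for _ in range(count))
    return total, squares, done.is_set()


if __name__ == "__main__":
    print(run_workers(4))
```

One import of threading covers everything the workers need to coordinate, which is the "users of A also need B" test in practice; code that only passes messages can import queue alone.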

