Architecture solution for Python Web application

2024/9/28 19:20:33

We're setting up a Python REST web application. Right now we're using WSGI, but we might change that in the future (using Twisted, for example, to improve scalability or some other feature). I would really like some help regarding what is considered a good architecture for a web application in Python.

In general our app serves dynamic content, processes a moderate to high volume of data from clients, performs fairly demanding database, network and filesystem calls, and should be "easily" scalable (quotes here because if a solution is great but somewhat tough to configure for scalability, it would definitely still be thought of as good). We would probably like to evolve this into a highly parallel application in the mid-to-long term. Google App Engine is not an accepted suggestion, mainly because of its cost.

My questions are:

  1. Is using WSGI a good idea? Should we be looking into something like Twisted instead?
  2. Should we use Apache as a reverse proxy for our static files?
  3. Is there some different pattern or architecture that we should consider that I haven't mentioned? (Even if completely obvious).

Any help at all with this would be much appreciated. Thanks a lot!


A WSGI application will be fine. In my opinion this is mostly a backend and data-processing question, as that is where most of the architectural decisions come into play. I would look into using Celery for your work distribution and backend scaling. Twisted would also be a good choice, but it appears you already have that portion written as a WSGI application, so I would just extend it with Celery.

I do not know the scope of your project but I would design it with Celery in mind.

I would keep the frontend endpoints as the WSGI application (because you already have that written) and write the backend to be distributed via messages. You would then have a pool of backend nodes that pull messages off the Celery queue and complete the required work. It would look sort of like:

Apache -> WSGI Containers -> Celery Message Queue -> Celery Workers.

The Apache nodes would sit behind a load balancer of some kind. This is a fairly simple architecture to scale and, if done correctly, fairly reliable. Code for failure in a system like this and you will be fine.
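
To make the flow above a bit more concrete, here is a minimal sketch of the same shape using only the standard library. It is not Celery: an in-process `queue.Queue` stands in for the message queue, threads stand in for the worker nodes, and a dict stands in for a result store. The names `process`, `start_workers`, `MAX_ATTEMPTS` and the JSON payload format are all made up for illustration. The point is just the pattern: the WSGI endpoint enqueues and returns immediately, and the workers retry a failed job a few times before recording the failure.

```python
import itertools
import json
import queue
import threading

# Stand-in for the message queue. With Celery this would be a real broker.
job_queue = queue.Queue()
# Stand-in for a result store: job_id -> outcome dict.
results = {}
results_lock = threading.Lock()
MAX_ATTEMPTS = 3  # hypothetical retry limit
_job_ids = itertools.count(1)


def process(payload):
    """Hypothetical backend work: sum a list of numbers."""
    return sum(payload["numbers"])


def worker():
    """Pull jobs off the queue until a None sentinel arrives."""
    while True:
        job = job_queue.get()
        if job is None:
            job_queue.task_done()
            break
        job_id, payload, attempt = job
        try:
            outcome = {"status": "done", "value": process(payload)}
        except Exception as exc:
            if attempt < MAX_ATTEMPTS:
                # "Code for failure": requeue the job instead of losing it.
                job_queue.put((job_id, payload, attempt + 1))
                job_queue.task_done()
                continue
            outcome = {"status": "failed", "error": repr(exc)}
        with results_lock:
            results[job_id] = outcome
        job_queue.task_done()


def start_workers(count):
    """Start a pool of worker threads (stand-in for backend nodes)."""
    threads = [threading.Thread(target=worker, daemon=True)
               for _ in range(count)]
    for thread in threads:
        thread.start()
    return threads


def application(environ, start_response):
    """WSGI front end: read a JSON body, enqueue it, reply 202 at once."""
    length = int(environ.get("CONTENT_LENGTH") or 0)
    payload = json.loads(environ["wsgi.input"].read(length) or b"{}")
    job_id = next(_job_ids)
    job_queue.put((job_id, payload, 1))
    body = json.dumps({"job_id": job_id}).encode()
    start_response("202 Accepted", [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

Any WSGI container can serve `application`; call `start_workers(4)` (or however many you want) once at startup. In a real deployment the queue and the workers would live in separate processes or machines, which is the part Celery takes care of.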
