python database / sql programming - where to start

2024/10/13 21:14:24

What is the best way to use an embedded database, say sqlite in Python:

  1. Should be small footprint. I'm only needing few thousands records per table. And just a handful of tables per database.
  2. If it's one provided by Python default installation, then great. Must be open-source, available on Windows and Linus.
  3. Better if SQL is not written directly, but no ORM is fully needed. Something that will shield me from the actual database, but not that huge of a library. Something similar to ADO will be great.
  4. Mostly will be used through code, but if there is a GUI front end, then that is great
  5. Need just a few pages to get started with. I don't want to go through pages reading what a table is and how a Select statement works. I know all of that.
  6. Support for Python 3 is preferred, but 2.x is okay too.

The usage is not a web app. It's a small database to hold at most 5 tables. The data in each table is just a few string columns. Think something just larger than a pickled dictionary

Update: Many thanks for the great suggestions.
The use-case I'm talking about is fairly simple. One you'd probably do in a day or two.
It's a 100ish line Python script that gathers data about a relatively large number of files (say 10k), and creates metadata files about them, and then one large metadata file about the whole files tree. I just need to avoid re-processing the files already processed, and create the metadata for the updated files, and update the main metadata file. In a way, cache the processed data, and only update it on file updates. If the cache is corrupt / unavailable, then simply process the whole tree. It might take 20 minutes, but that's okay.

Note that all processing is done in-memory.

I would like to avoid any external dependencies, so that the script can easily be put on any system with just a Python installation on it. Being Windows, it is sometimes hard to get all the components installed. So, In my opinion, even a database might be an overkill.

You probably wouldn't fire up an Office Word/Writer to write a small post it type note, similarly I am reluctant on using something like Django for this use-case.

Where to start?

Answer

I highly recommend the use of a good ORM. When you can work with Python objects to manage your database rows, life is so much easier.

I am a fan of the ORM in Django. But that was already recommended and you said that is too heavyweight.

This leaves me with exactly one ORM to recommend: Autumn. Very lightweight, works great with SQLite. If your embedded application will be multithreaded, then you absolutely want Autumn; it has extensions to support multithreaded SQLite. (Full disclosure: I wrote those extensions and contributed them. I wrote them while working for RealNetworks, and my bosses permitted me to donate them, so a public thank-you to RealNetworks.)

Autumn is written in pure Python. For SQLite, it uses the Python official SQLite module to do the actual SQL stuff. The memory footprint of Autumn itself is tiny.

I do not recommend APSW. In my humble opinion, it doesn't really do very much to help you; it just provides a way to execute SQL statements, and leaves you to master the SQL way of doing things. Also, it supports every single feature of SQLite, even the ones you rarely use, and as a result it actually has a larger memory footprint than Autumn, while not being as easy to use.

https://en.xdnf.cn/q/69492.html

Related Q&A

How to install Python 3.5 on Raspbian Jessie

I need to install Python 3.5+ on Rasbian (Debian for the Raspberry Pi). Currently only version 3.4 is supported. For the sources I want to compile I have to install:sudo apt-get install -y python3 pyth…

Django - last insert id

I cant get the last insert id like I usually do and Im not sure why.In my view:comment = Comments( ...) comment.save() comment.id #returns NoneIn my Model:class Comments(models.Model):id = models.Integ…

How to check if default value for python function argument is set using inspect?

Im trying to identify the parameters of a function for which default values are not set. Im using inspect.signature(func).parameters.value() function which gives a list of function parameters. Since Im…

OpenCV-Python cv2.CV_CAP_PROP_POS_FRAMES error

Currently, I am using opencv 3.1.0, and I encountered the following error when executing the following code:post_frame = cap.get(cv2.CV_CAP_PROP_POS_FRAMES)I got the following error Message:File "…

How to get the total number of tests passed, failed and skipped from pytest

How can I get to the statistics information of a test session in pytest?Ive tried to define pytest_sessionfinish in the conftest.py file, but I only see testsfailed and testscollected attributes on th…

Tensorflow ArgumentError Running CIFAR-10 example

I am trying to run the CIFAR-10 example of Tensorflow. However when executing python cifar10.py I am getting the error attached below. I have installed Version 0.6.0 of the Tensorflow package using pip…

How to plot multiple density plots on the same figure in python

I know this is going to end up being a really messy plot, but I am curious to know what the most efficient way to do this is. I have some data that looks like this in a csv file:ROI Band Min…

Running Tkinter on Mac

I am an absolute newbie. Im trying to make Python GUI for my school project so I decided to use Tkinter. When I try to import Tkinter it throws this message: >>> import tkinter Traceback (most…

Object-based default value in SQLAlchemy declarative

With SQLAlchemy, it is possible to add a default value to every function. As I understand it, this may also be a callable (either without any arguments or with an optional ExecutionContext argument).No…

High Resolution Image of a Graph using NetworkX and Matplotlib

I have a python code to generate a random graph of 300 nodes and 200 edges and display itimport networkx as nx import matplotlib.pyplot as pltG = nx.gnm_random_graph(300,200) graph_pos = nx.spring_layo…