Faster way to iterate all keys and values in redis db

2024/10/11 7:32:45

I have a db with about 350,000 keys. Currently my code just loops through all the keys and gets each value from the db, one at a time.

However, this takes almost 2 minutes, which seems really slow given that redis-benchmark reported 100k reqs/3s.

I've looked at pipelining, but I need each value returned so that I end up with a dict of key, value pairs.

At the moment I'm thinking of using threading in my code, if possible, to speed this up. Is this the best way to handle this use case?

Here's the code I have so far.

import redis, timeit

start_time = timeit.default_timer()
count = redis.Redis(host='127.0.0.1', port=6379, db=9)
keys = count.keys()
data = {}
for key in keys:
    value = count.get(key)
    if value:
        data[key.decode('utf-8')] = int(value.decode('utf-8'))
elapsed = timeit.default_timer() - start_time
print('Time to read {} records: '.format(len(keys)), elapsed)

Answer

First, the fastest way is to do all of this inside a single EVAL call, that is, in a Lua script that runs on the server.

Next, the recommended way to iterate over all keys is SCAN. It does not iterate faster than KEYS, but it lets Redis process other commands in between, so it helps overall application behavior.
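As a rough client-side sketch of the SCAN approach, the keys can be read in batches and each batch fetched with one MGET, so the number of round trips drops from one per key to one per batch. The function name `read_all_counts` and the `batch_size` default are made up for illustration, and `client` is assumed to be a redis-py style client such as the `redis.Redis(...)` object from the question. A pipeline of GETs should work similarly, since `execute()` returns the replies in order.

```python
from itertools import islice


def read_all_counts(client, batch_size=1000):
    """Return {key: int(value)} for every string key with a non-empty value.

    `client` is assumed to be a redis-py style client (for example
    redis.Redis(host='127.0.0.1', port=6379, db=9)) that provides
    scan_iter() and mget(). The name and batch_size are illustrative.
    """
    data = {}
    # scan_iter() walks the keyspace with incremental SCAN calls.
    keys = client.scan_iter(count=batch_size)
    while True:
        batch = list(islice(keys, batch_size))
        if not batch:
            break
        # One round trip per batch instead of one GET per key.
        for key, value in zip(batch, client.mget(batch)):
            # MGET yields None for missing keys and for non-string types.
            if value:
                data[key.decode("utf-8")] = int(value.decode("utf-8"))
    return data
```

SCAN may return the same key more than once while the keyspace is changing; because the result is a dict, duplicates simply overwrite each other.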

The script would look something like this:

local data = {}
local i = 1
local mykeys = redis.call("KEYS", "*")
for k = 1, #mykeys do
    local tmpkey = mykeys[k]
    data[i] = {tmpkey, redis.call("GET", tmpkey)}
    i = i + 1
end
return data

However, it will fail if any key cannot be read with GET (sets or lists, for example), so you need to add error handling. If you need sorting, you can do it either directly in Lua or afterwards on the client side. The latter is slower, but does not make other users of the Redis instance wait.

Sample output:

127.0.0.1:6370> eval "local data={} local i=1 local mykeys=redis.call(\"KEYS\",\"*\") for k=1,#mykeys do local tmpkey=mykeys[k] data[i]={tmpkey,redis.call(\"GET\",tmpkey)} i=i+1 end return data" 0
1) 1) "a"
   2) "aval"
2) 1) "b"
   2) "bval"
3) 1) "c"
   2) "cval"
4) 1) "d"
   2) "dval"
5) 1) "e"
   2) "eval"
6) 1) "f"
   2) "fval"
7) 1) "g"
   2) "gval"
8) 1) "h"
   2) "hval"
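One possible way to add the error handling mentioned above and call the script from Python is sketched here. It swaps `redis.call` for `redis.pcall` on the GET, so a key of the wrong type yields an error table rather than aborting the script, and it keeps only replies that are Lua strings. The names `LUA_DUMP_STRINGS` and `dump_strings` are hypothetical, and `client` is assumed to be a redis-py style client with an `eval(script, numkeys)` method. The script still uses KEYS, so it blocks the server while it runs.

```python
# Lua script: collect key/value pairs for string keys only.
# redis.pcall returns an error table (not a string) for wrong-type keys
# and false for missing keys, so both are skipped by the type() check.
LUA_DUMP_STRINGS = """
local out = {}
for _, key in ipairs(redis.call("KEYS", "*")) do
    local value = redis.pcall("GET", key)
    if type(value) == "string" then
        out[#out + 1] = key
        out[#out + 1] = value
    end
end
return out
"""


def dump_strings(client):
    """Run the script and turn its flat [k1, v1, k2, v2, ...] reply into a dict.

    Assumes the client returns bytes (redis-py without decode_responses).
    """
    flat = client.eval(LUA_DUMP_STRINGS, 0)
    return {
        flat[i].decode("utf-8"): flat[i + 1].decode("utf-8")
        for i in range(0, len(flat), 2)
    }
```

Returning a flat list instead of nested pairs is only a design choice to keep the reply small; the nested form from the original script would work as well with a slightly different loop on the client.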