msgpack unserialising dict key strings to bytes

2024/10/15 5:23:39

I am having issues with msgpack in Python. When I serialise a dict whose keys are strings (str), the keys are not unserialised properly, which causes KeyError exceptions to be raised.

Example:

>>> import msgpack
>>> d = dict()
>>> value = 1234
>>> d['key'] = value
>>> binary = msgpack.dumps(d)
>>> new_d = msgpack.loads(binary)
>>> new_d['key']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'key'

This is because the keys are not strings after calling loads() but are unserialised to bytes objects.

>>> d.keys()
dict_keys(['key'])
>>> new_d.keys()
dict_keys([b'key'])

It seems this is related to an unimplemented feature, as mentioned on GitHub.

My question is: is there a way to fix this issue, or a workaround to ensure that the same keys can be used upon deserialisation?

I would like to use msgpack, but if I cannot build a dict object with str keys and expect to be able to use the same keys upon deserialisation, it becomes useless.

Answer

A default encoding is set when calling dumps or packb:

:param str encoding:
    Convert unicode to bytes with this encoding. (default: 'utf-8')

but it is not set by default when calling loads or unpackb as seen in:

Help on built-in function unpackb in module msgpack._unpacker:

unpackb(...)
    unpackb(... encoding=None, ... )

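Therefore, setting the encoding when deserialising fixes the issue. This applies to msgpack releases before 1.0; from 1.0 onwards the encoding parameter was removed and raw=False, which gives str keys, is the default. For example, on an older release: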

>>> d['key'] = 1234
>>> binary = msgpack.dumps(d)
>>> msgpack.loads(binary, encoding = "utf-8")
{'key': 1234}
>>> msgpack.loads(binary, encoding = "utf-8") == d
True
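
If you cannot pass options to loads (for example, because the call sits inside code you do not control), one possible fallback is to convert the keys after the fact. The following is only a minimal sketch, assuming the keys were encoded as UTF-8. decode_keys is a hypothetical helper name, not part of msgpack, and the new_d literal is a stand-in for what loads returned in the question:

```python
def decode_keys(obj, encoding="utf-8"):
    """Recursively convert bytes dict keys to str.

    Hypothetical helper, not part of msgpack. Values that are bytes are
    left untouched, since they may be genuine binary payloads.
    """
    if isinstance(obj, dict):
        return {
            (k.decode(encoding) if isinstance(k, bytes) else k): decode_keys(v, encoding)
            for k, v in obj.items()
        }
    if isinstance(obj, list):
        # Walk into lists so nested dicts are converted as well.
        return [decode_keys(item, encoding) for item in obj]
    return obj


# Stand-in for the result of loads() in the question: keys came back as bytes.
new_d = {b"key": 1234, b"nested": [{b"inner": b"payload"}]}

fixed = decode_keys(new_d)
print(fixed["key"])
```

This only touches dict keys, so binary values survive unchanged. It costs an extra pass over the data, so passing the right option to loads is preferable when that is possible.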