ctypes in python crashes with memset

2024/9/22 19:34:29

I am trying to erase a password string from memory, as suggested here.

I wrote this little snippet:

import ctypes, sys

def zerome(string):
    location = id(string) + 20
    size     = sys.getsizeof(string) - 20
    #memset =  ctypes.cdll.msvcrt.memset
    # For Linux, use the following. Change the 6 to whatever it is on your computer.
    print ctypes.string_at(location, size)
    memset =  ctypes.CDLL("libc.so.6").memset
    memset(location, 0, size)
    print "Clearing 0x%08x size %i bytes" % (location, size)
    print ctypes.string_at(location, size)

a = "asdasd"
zerome(a)

Oddly enough, this code works fine with IPython,

[7] oz123@yenitiny:~ $ ipython a.py 
Clearing 0x02275b84 size 23 bytes

But crashes with Python:

[8] oz123@yenitiny:~ $ python a.py 
Segmentation fault
[9] oz123@yenitiny:~ $

Any ideas why?

I tested on Debian Wheezy, with Python 2.7.3.

little update ...

The code works on CentOS 6.2 with Python 2.6.6. The code crashed on Debian with Python 2.6.8. I tried to work out why it works on CentOS and not on Debian. The only difference that came to mind immediately is that my Debian is multiarch, while CentOS is running on my older laptop with an i686 CPU.

Hence, I rebooted my CentOS laptop and loaded Debian Wheezy on it. The code works on a Debian Wheezy that is not multiarch. Hence, I suspect my configuration on Debian is somewhat problematic ...

Answer

ctypes has a memset function already, so you don't have to make a function pointer for the libc/msvcrt function. Also, 20 bytes is for common 32-bit platforms. On 64-bit systems it's probably 36 bytes. Here's the layout of a PyStringObject:

typedef struct {
    Py_ssize_t ob_refcnt;         // 4|8 bytes
    struct _typeobject *ob_type;  // 4|8 bytes
    Py_ssize_t ob_size;           // 4|8 bytes
    long ob_shash;                // 4|8 bytes (4 on 64-bit Windows)
    int ob_sstate;                // 4 bytes
    char ob_sval[1];
} PyStringObject;

So it could be 5*4 = 20 bytes on a 32-bit system, 8*4 + 4 = 36 bytes on 64-bit Linux, or 8*3 + 4*2 = 32 bytes on 64-bit Windows. Since a string isn't tracked with a garbage collection header, you can use sys.getsizeof. In general if you don't want the GC header size included (in memory it's actually before the object's base address you get from id), then use the object's __sizeof__ method. At least that's a general rule in my experience.
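The arithmetic above can be checked with ctypes itself. The following is a minimal sketch, assuming the Python 2 field layout shown above; `PyStringHeader` and `header_size` are illustrative names, and `c_void_p` stands in for the `ob_type` pointer. It only measures the layout and touches no object memory, so it also runs on Python 3.

```python
import ctypes

class PyStringHeader(ctypes.Structure):
    """Fixed fields of a Python 2 PyStringObject, without ob_sval (illustrative helper)."""
    _fields_ = [
        ("ob_refcnt", ctypes.c_ssize_t),
        ("ob_type", ctypes.c_void_p),   # stand-in for struct _typeobject *
        ("ob_size", ctypes.c_ssize_t),
        ("ob_shash", ctypes.c_long),
        ("ob_sstate", ctypes.c_int),
    ]

# ob_sval begins immediately after ob_sstate. ctypes.sizeof() may include
# trailing padding, so the offset is computed from the last field's descriptor.
last = PyStringHeader.ob_sstate
header_size = last.offset + last.size

print("ob_sval offset on this platform: %d bytes" % header_size)
```

On the platforms discussed above this should report 20, 36, or 32 bytes respectively.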

What you want is to simply subtract the buffer size from the object size. The string in CPython is null-terminated, so simply add 1 to its length to get the buffer size. For example:

>>> import ctypes, sys
>>> a = 'abcdef'
>>> bufsize = len(a) + 1
>>> offset = sys.getsizeof(a) - bufsize
>>> ctypes.memset(id(a) + offset, 0, bufsize)
3074822964L
>>> a
'\x00\x00\x00\x00\x00\x00'
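The same `ctypes.memset` call can be tried without touching interpreter internals. This is a minimal sketch, assuming the secret is held in a mutable buffer created with `ctypes.create_string_buffer` rather than in an immutable string; the variable names are illustrative.

```python
import ctypes

# A mutable, ctypes-owned buffer holding the secret plus a trailing NUL.
secret = ctypes.create_string_buffer(b"asdasd")
size = ctypes.sizeof(secret)  # len(b"asdasd") + 1

# ctypes.memset(dst, c, count) fills count bytes at dst with the value c.
ctypes.memset(ctypes.addressof(secret), 0, size)

print(repr(secret.raw))
```

Because the buffer belongs to the caller, zeroing it cannot corrupt an object that the interpreter shares elsewhere.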

Edit

A better alternative is to define the PyStringObject structure. This makes it convenient to check ob_sstate. If it's greater than 0, that means the string is interned and the sane thing to do is raise an exception. Single-character strings are interned, along with string constants in code objects that consist of only ASCII letters and underscore, and also strings used internally by the interpreter for names (variable names, attributes).

from ctypes import *

class PyStringObject(Structure):
    _fields_ = [
        ('ob_refcnt', c_ssize_t),
        ('ob_type', py_object),
        ('ob_size', c_ssize_t),
        ('ob_shash', c_long),
        ('ob_sstate', c_int),
        # ob_sval varies in size
        # zero with memset is simpler
    ]

def zerostr(s):
    """zero a non-interned string"""
    if not isinstance(s, str):
        raise TypeError("expected str object, not %s" % type(s).__name__)
    s_obj = PyStringObject.from_address(id(s))
    if s_obj.ob_sstate > 0:
        raise RuntimeError("cannot zero interned string")
    s_obj.ob_shash = -1  # not hashed yet
    # ob_sval starts right after ob_sstate. sizeof(PyStringObject) can
    # include trailing padding (40 rather than 36 on 64-bit Linux).
    offset = PyStringObject.ob_sstate.offset + sizeof(c_int)
    memset(id(s) + offset, 0, len(s))

For example:

>>> s = 'abcd' # interned by code object
>>> zerostr(s)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<string>", line 10, in zerostr
RuntimeError: cannot zero interned string
>>> s = raw_input() # not interned
abcd
>>> zerostr(s)
>>> s
'\x00\x00\x00\x00'
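
Why interned strings are refused can be seen from object identity: every reference to an interned string points at the same object, so zeroing it would corrupt all of them. A small sketch, written for Python 3, where the function is `sys.intern` (in Python 2 it is the builtin `intern`); the strings are built at run time so that the compiler does not merge them, and the names are illustrative.

```python
import sys

# Build two equal strings at run time, then intern each of them.
parts = ["pass", "word"]
a = sys.intern("".join(parts))
b = sys.intern("".join(parts))

# Interning maps equal strings to one shared object.
print(a is b)  # True
```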