Recently there was news about malicious libraries that had been uploaded to the Python Package Index (PyPI); see:
- Malicious libraries on PyPI
- Malicious modules found into official Python repository (this link contains the list of malicious packages)
- Developers using malicious Python Modules
I am not trying to spread this news further; I am trying to find a way for my teammates and me to check whether a package from PyPI has been altered by an external party.
Questions:
- What security check should I use once I have downloaded a package from PyPI? MD5 or any extra step?
- Is an MD5 checksum enough to verify the integrity of Python packages?
First, the articles describe the danger of typosquatting, which works because developers blindly install a package by name without checking that it is the correct upstream package. You can avoid this by going to the author's GitHub repository and copying the install instructions exactly.
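As a rough illustration of catching a mistyped name before installing, here is a minimal sketch that compares a requested name against a list of packages you actually intend to use. The `KNOWN_PACKAGES` list, the `possible_typosquat` helper, and the `0.8` cutoff are all assumptions made for this example; this is not something pip or PyPI does for you.

```python
import difflib

# Hypothetical allowlist: the packages your team really means to install.
KNOWN_PACKAGES = ["requests", "urllib3", "numpy", "django"]


def possible_typosquat(name, known=KNOWN_PACKAGES, cutoff=0.8):
    """Return the known package that `name` closely resembles, or None.

    A name that is already on the allowlist is treated as fine. A name that
    is merely *similar* to an allowlisted one is flagged for a manual check.
    """
    normalized = name.lower().replace("_", "-")
    if normalized in known:
        return None
    matches = difflib.get_close_matches(normalized, known, n=1, cutoff=cutoff)
    return matches[0] if matches else None


if __name__ == "__main__":
    for candidate in ["requests", "urllib", "flask"]:
        lookalike = possible_typosquat(candidate)
        if lookalike:
            print(f"{candidate!r} looks like {lookalike!r}; check the name")
        else:
            print(f"{candidate!r}: no near match on the allowlist")
```

A string-similarity check like this only catches near misses against names you already trust; it says nothing about whether an unfamiliar package is safe.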
Aside from that, packages can be tampered with, but it is unlikely. Because PyPI files are transferred over HTTPS, it doesn't make much sense to fetch a hash from the same server and verify against it. (If the author's account or the PyPI server is compromised, a hash served alongside the package doesn't prevent you from installing a malicious package.)
If you need an extra security measure against server compromise, pin both the versions and the hashes of your dependencies. See the documentation for details.
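To show what checking against a pinned hash amounts to, here is a minimal sketch that hashes a downloaded file and compares it with a digest you recorded earlier and keep outside PyPI (for example, in your own repository). The helper names are made up for this example, and it uses SHA-256 rather than MD5, since MD5 is no longer collision-resistant. In practice pip can do this check itself: list the hashes in your requirements file and install with `--require-hashes`.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA-256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in chunks so large archives do not have to fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pinned_hash(path, pinned_sha256):
    """True if the file's SHA-256 equals the digest pinned beforehand."""
    actual = sha256_of_file(path)
    return hmac.compare_digest(actual, pinned_sha256.strip().lower())
```

The pinned digest only helps if it was recorded at a time you trusted the package and is stored somewhere an attacker on PyPI cannot change it. A hash fetched from the server together with the file gives no such protection.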