How does searching with pip work?

2024/9/30 9:18:48

Yes, I'm dead serious with this question. How does searching with pip work?

The documentation of the keyword search refers to a "pip search reference" at https://pip.pypa.io/en/stable/user_guide/#searching-for-packages, which is anything but a reference.

I can't conclude from search attempts how searching works. E.g. if I search for "exec" I get a variety of results such as exec-pypeline (0.4.2) - an incredible python package. I even get results with package names that have nothing to do with "exec" as long as the term "exec" is in the description.

But strangely I don't see one of my own packages in the list, though one of them contains "exec" in its name. That alone would lead us to the conclusion that pip (at least) searches for complete search terms in the package description (which my package doesn't have).

But building on that assumption, if I search for other terms that appear in the package description, I don't get my package listed either. That applies to other packages as well: e.g. if I search for "projects" I don't get flask-macros in the result set, though the term "projects" clearly exists in the description of flask-macros. As this contradicts the assumption above, this is clearly not how searching works.

And interestingly, I can search for "macro" and get "flask-macros" as a result, but if I search for "macr", "flask-macros" is not found.

So how exactly is searching performed by pip? Where can a suitable reference be found for this?

Answer

pip search looks for a substring contained in the distribution name or the distribution summary. I cannot see this documented anywhere, and found it by following the command in the source code directly.
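To make the "name or summary" rule concrete, here is a minimal local sketch of that matching logic. It takes the description above at face value and runs entirely offline; the real server-side search may tokenize or rank differently. The helper name `matches` and the second package entry are made up for illustration.

```python
def matches(query, name, summary):
    """Return True if the query occurs in the name OR in the summary.

    Assumption: plain case-insensitive substring matching, as described
    in the answer. This is not the actual PyPI implementation.
    """
    q = query.lower()
    return q in name.lower() or q in summary.lower()


# Toy catalogue of (name, summary) pairs. The first entry is taken from
# the question; the second is a hypothetical package.
packages = [
    ("exec-pypeline", "an incredible python package"),
    ("hypothetical-runner", "helpers to exec shell commands"),
]

hits = [name for name, summary in packages if matches("exec", name, summary)]
print(hits)
```

Both entries match: the first through its name, the second only through its summary, which is why results can appear whose names have nothing to do with the query.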

The code for the search feature, which dates from Feb 2010, is still using an old xmlrpc_client. There is issue395 to change this, open since 2011, since the XML-RPC API is now considered legacy and should not be used. Somewhat surprisingly, the endpoint was not deprecated in the pypi-legacy to warehouse move, as the legacy routes are still there.

flask-macros did not show up in a search for "project" because this is too common a search term. Only 100 results are returned; this is a hardcoded limit in the elasticsearch view that handles requests to those PyPI search routes. Note that this was reduced from 1000 fairly recently in PR3827.
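The effect of such a cap can be sketched offline. The following is only a toy model: the catalogue is generated, the summary given to flask-macros is invented, and results are kept in insertion order, whereas the real search orders and truncates on the server side. `RESULT_LIMIT` and `capped_search` are hypothetical names.

```python
RESULT_LIMIT = 100  # mirrors the hardcoded limit mentioned above


def capped_search(query, packages, limit=RESULT_LIMIT):
    """Substring search over (name, summary) pairs, truncated to `limit` hits."""
    q = query.lower()
    hits = [
        name
        for name, summary in packages
        if q in name.lower() or q in summary.lower()
    ]
    return hits[:limit]


# 250 generated packages that all match the common term "projects" ...
packages = [(f"pkg-{i:04d}", "tools for projects") for i in range(250)]
# ... plus the package of interest, with a made-up summary, at the end.
packages.append(("flask-macros", "macros for flask projects"))

results = capped_search("projects", packages)
print(len(results))               # 100
print("flask-macros" in results)  # False
```

The package matches the query, but it never makes it into the truncated result list. A rarer term that only a few packages contain would not hit the cap, so the same package would show up.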

Code to do a search with an API client directly:

import xmlrpc.client

client = xmlrpc.client.ServerProxy('https://pypi.org/pypi')
query = 'project'
# Match the query against the name or the summary ('or' operator).
results = client.search({'name': query, 'summary': query}, 'or')
print(len(results), 'results returned')
for result in sorted(results, key=lambda data: data['name'].lower()):
    print(result)

edit: The 100 result limit is now documented here.

