How to make a local PyPI mirror without internet access and with search available?

2024/10/9 12:35:41

I'm trying to build a complete local PyPI mirror, with a working pip search, on a server that I can only attach an external hard drive to. To be clear, I don't want a simple caching system: the server sits on a completely closed network with other machines and has no internet access at all.

So far I have dumped every PyPI package with bandersnatch, and pip install works with a simple HTTP server in front of the dump. I also retrieved the PyPI legacy source code and got it running, but without any Python packages in it. My problem now is linking the two, and I'm not even sure it can be done this way.

I also tested pypiserver. It could have done what I wanted, but it is far too slow: pip search ends up timing out (it looks like it wasn't built to handle that many packages).

Finally, I had a look at devpi. It seems to do the job well, but I'm looking for an easy way to import my bandersnatch dump into it. It doesn't look like I can create an index based on a local directory.

Thank you for any response.

Answer

I might as well provide a proper answer to this on how we got DevPi working quite nicely in our environment:

  1. Install DevPi

DevPi requires Python 3! So make sure you have the Python 3 version of pip installed. Using that:

pip install -U devpi

(likely as root) should do the trick.

  2. Make sure your server firewall is open

DevPi uses port 3141 by default. If you have firewall-cmd installed, something like

firewall-cmd --zone=public --add-port=3141/tcp --permanent
firewall-cmd --reload

should open it; otherwise use the equivalent command on your system.
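
Once the server is running, a quick way to confirm that the port is reachable from another machine is a plain TCP connection test. The following is a minimal sketch using only the Python standard library; the function name port_is_open and the host value are placeholders of mine, not part of DevPi.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


if __name__ == "__main__":
    # "localhost" is a placeholder; substitute the server's IP or FQDN.
    print(port_is_open("localhost", 3141))
```

A False result only says that the connection failed; it does not distinguish a closed firewall port from a server that is not running.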

  3. Configure DevPi

DevPi will use PyPI out of the box. We also wanted the ability to "overlay" our own packages that are only provided internally within the organisation. Hosting these local (nabCERT) packages requires an internal index. The nice thing is that this index can itself use PyPI as a fallback!

Select the devpi server to work on, which is probably the server you're on. The devpi client talks to a running server, so start devpi-server first if it is not up yet.

devpi use http://localhost:3141

Now create a user that can add and manage the internal packages, and log in as that user

devpi user -c myuser password=mypassword
devpi login myuser --password mypassword

Now create our internal index to hold local packages, while ensuring it will use PyPi as a "fallback"

devpi index -c myindex bases=/root/pypi volatile=True
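
To make the "fallback" idea concrete, here is a toy model of how an index with bases resolves a project: it looks in its own packages first and only then asks its bases. This is a simplified conceptual sketch, not DevPi's actual implementation; the Index class and the package name internal-tool are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Index:
    """Toy stand-in for a devpi index (hypothetical, not devpi's real classes)."""
    name: str
    projects: Set[str] = field(default_factory=set)
    bases: List["Index"] = field(default_factory=list)

    def lookup(self, project: str) -> Optional[str]:
        """Return the name of the first index in the chain that holds the project."""
        if project in self.projects:
            return self.name
        for base in self.bases:
            found = base.lookup(project)
            if found is not None:
                return found
        return None


# Sample data: "internal-tool" is a made-up internal package.
root_pypi = Index("root/pypi", {"simplejson"})
myindex = Index("myuser/myindex", {"internal-tool"}, bases=[root_pypi])

print(myindex.lookup("internal-tool"))  # myuser/myindex
print(myindex.lookup("simplejson"))     # root/pypi
print(myindex.lookup("missing"))        # None
```

In this model, clients that point at myuser/myindex see both the internal packages and everything the base index can serve.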

  4. Start it up

    devpi-server --host=0.0.0.0 --port=3141 --serverdir=/var/www/pypi

  5. Try and install a package

    pip install -i http://localhost:3141/root/pypi/ simplejson

If something goes wrong, check the logs; in our case they were in /var/www/pypi/.xproc/devpi-server/xprocess.log
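
When debugging a failed install, it can also help to request the project's page on the index directly. The sketch below builds that URL, assuming the +simple/ layout used in the pip configuration further down and the project-name normalization described in PEP 503. The helper names normalize, simple_url and page_exists are my own.

```python
import re
import urllib.error
import urllib.request


def normalize(name: str) -> str:
    """Normalize a project name as described in PEP 503."""
    return re.sub(r"[-_.]+", "-", name).lower()


def simple_url(server: str, index: str, project: str) -> str:
    """Build the simple-index page URL for a project on a given index."""
    return f"{server.rstrip('/')}/{index.strip('/')}/+simple/{normalize(project)}/"


def page_exists(url: str, timeout: float = 5.0) -> bool:
    """Return True if the URL answers with HTTP 200, False on any error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        return False


# Only builds and prints the URL; call page_exists(url) against a live server.
url = simple_url("http://localhost:3141", "root/pypi", "simplejson")
print(url)
```

If page_exists returns False for a package you expect to be there, the server log mentioned above is the next place to look.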

At this point, if all the steps above succeeded, you should be able to open a web browser and point it at the devpi server at

http://localhost:3141/myuser/myindex

  6. Make DevPi start automatically

That varies. We use systemd so I created a file /usr/lib/systemd/system/devpi.service

[Unit]
Requires=network-online.target
After=network-online.target

[Service]
EnvironmentFile=-/etc/sysconfig/devpi
Type=forking
PIDFile=/var/www/pypi/.xproc/devpi-server/xprocess.PID
Restart=always
ExecStart=/bin/devpi-server --host=0.0.0.0 --port 3141 --serverdir /var/www/pypi --start
ExecStop=/bin/devpi-server --host=0.0.0.0 --port 3141 --serverdir /var/www/pypi --stop
User=root

[Install]
WantedBy=multi-user.target

Save the file and notify systemd.

systemctl daemon-reload
systemctl enable devpi

  7. Configure a client

To point your clients' pip at the new DevPi repository, create an /etc/pip.conf file with something like this

[global]
trusted-host = <server IP or FQDN>

[install]
index-url = http://<server IP or FQDN>:3141/myuser/myindex/+simple/

[search]
index = http://<server IP or FQDN>:3141/myuser/myindex/
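
If you have many clients to configure, the same file can be generated from a script. This is a minimal sketch using Python's configparser; the function name build_pip_conf and the host name pypi.example.internal are placeholders of mine, and the index path and port default to the values used above.

```python
import configparser
import io


def build_pip_conf(host: str, index: str = "myuser/myindex", port: int = 3141) -> str:
    """Return the text of a pip.conf that points pip at a devpi index."""
    base = f"http://{host}:{port}/{index}"
    # Interpolation is disabled so that '%' in a value is never treated specially.
    conf = configparser.ConfigParser(interpolation=None)
    conf["global"] = {"trusted-host": host}
    conf["install"] = {"index-url": f"{base}/+simple/"}
    conf["search"] = {"index": f"{base}/"}
    buffer = io.StringIO()
    conf.write(buffer)
    return buffer.getvalue()


# "pypi.example.internal" is a hypothetical host name; use your server's IP or FQDN.
print(build_pip_conf("pypi.example.internal"))
```

The returned text can then be written to /etc/pip.conf on each client by whatever deployment mechanism you already use.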
