Is there a way to reopen a socket?

2024/10/13 8:23:40

I create many "short-term" sockets in some code that looks like this:

import socket

nb = 1000
for i in range(nb):
    sck = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sck.connect((adr, prt))
    sck.send('question %i' % i)
    sck.shutdown(socket.SHUT_WR)
    answer = sck.recv(4096)
    print('answer %i : %s' % (i, answer))
    sck.close()

This works fine, as long as nb is "small" enough.

As nb might be quite large though, I'd like to do something like this

sck = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sck.connect((adr, prt))
for i in range(nb):
    reopen(sck)  # ? ? ?
    sck.send('question %i' % i)
    sck.shutdown(socket.SHUT_WR)
    answer = sck.recv(4096)
    print('answer %i : %s' % (i, answer))
sck.close()

So the question is:
Is there any way to "reuse" a socket that has been shut down?

Answer

No, this is a limitation of the underlying C sockets API (and the TCP/IP protocol, for that matter). My question to you is: why are you shutting them down at all when you could architect your application to reuse them?
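To make that "no" concrete, here is a small sketch (my illustration, not code from the question; example.com:80 is just a stand-in peer): once a connected socket object has been shut down and closed, it cannot be connected again, and the only option is to create a fresh socket.

import socket

sck = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sck.connect(('example.com', 80))   # placeholder peer, purely for illustration
sck.shutdown(socket.SHUT_RDWR)
sck.close()

try:
    # There is no "reopen": the underlying descriptor is gone after close().
    sck.connect(('example.com', 80))
except socket.error as e:
    print('cannot reuse the socket: %s' % e)   # e.g. [Errno 9] Bad file descriptor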

The problem with many short-term sockets is that closing them puts each connection into the TIME_WAIT state, where it cannot be reused for a while (basically, twice the maximum packet lifetime, to ensure any packets still in the network either arrive and are discarded, or get discarded by the network itself). What happens is that, in the 4-tuple that needs to be unique (source ip, source port, destination ip, destination port), the first one and the last two tend to always be the same, so when you run out of source ports, you're hosed.
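As a rough illustration of that 4-tuple (my sketch, not part of the original answer; example.com:80 is a placeholder peer), you can watch each short-lived connection consume a fresh ephemeral source port:

import socket

for i in range(3):
    sck = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sck.connect(('example.com', 80))
    # getsockname()/getpeername() give the two ends of the unique 4-tuple.
    src_ip, src_port = sck.getsockname()
    dst_ip, dst_port = sck.getpeername()
    print('4-tuple: (%s, %d, %s, %d)' % (src_ip, src_port, dst_ip, dst_port))
    sck.close()   # the source port now lingers in TIME_WAIT for a while

Each iteration prints a different source port while the other three fields stay the same; that pool of source ports is what runs dry.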

We've struck this problem in software before; it only became evident when we ran on faster machines (since we could get through many more sessions).

Why don't you just open the socket once and keep using it? It looks like your protocol is a simple request/response one, which should be easily doable with that approach.

Something like:

sck = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sck.connect((adr, prt))
for i in range(nb):
    sck.send('question %i' % i)
    answer = sck.recv(4096)
    print('answer %i : %s' % (i, answer))
sck.close()
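One thing to watch with a single long-lived connection (my addition, not part of the original answer): without the shutdown() to mark end-of-message, a single recv(4096) is not guaranteed to return exactly one complete answer, so the protocol needs some framing convention. A minimal sketch, assuming the server terminates each answer with a newline (an assumption about your protocol, not something stated in the question):

import socket

# adr, prt, nb as in the question; the values below are placeholders.
adr, prt, nb = 'example.com', 80, 1000

def recv_line(sck, buf):
    # Keep reading until the buffer holds one full newline-terminated answer.
    while '\n' not in buf:
        chunk = sck.recv(4096)
        if not chunk:
            raise EOFError('server closed the connection')
        buf += chunk
    line, _, rest = buf.partition('\n')
    return line, rest

sck = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sck.connect((adr, prt))
buf = ''
for i in range(nb):
    sck.sendall('question %i\n' % i)    # sendall() avoids short writes
    answer, buf = recv_line(sck, buf)
    print('answer %i : %s' % (i, answer))
sck.close()

A length prefix works just as well if answers can contain newlines; the point is simply that something in the protocol has to mark message boundaries once you stop using shutdown() for that.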

Update:

One possibility (and we've done this before), if you're running out of connections due to this continual open/close, is to detect the problem and throttle it. Consider the following code (the stuff I've added is more pseudo-code than Python, since I haven't touched Python for quite a while):

import socket
import time

for i in range(nb):
    while True:
        sck = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            sck.connect((adr, prt))
            break                      # connected, carry on at full speed
        except socket.error:           # e.g. no sockets/ports available right now
            sck.close()
            time.sleep(0.25)           # throttle: wait 250 ms, then try again
    sck.send('question %i' % i)
    sck.shutdown(socket.SHUT_WR)
    answer = sck.recv(4096)
    print('answer %i : %s' % (i, answer))
    sck.close()

Basically, it lets you run at full speed while there are plenty of resources, but slows down when you strike your problem area. This is actually what we did to our product to "fix" the problem of failing when resources got low. We would have re-architected it, except for the fact that it was a legacy product approaching end of life and we were basically in fix-at-minimal-cost mode for service.

