Python (Watchdog) - Waiting for file to be created correctly

2024/9/16 23:20:59

I'm new to Python and I'm trying to implement reliable "file creation" detection. If I don't add a time.sleep(x), my files are processed incorrectly because they are still being written to the folder (the buffer is not empty). How can I work around this without waiting x seconds every time a file is created?

This is my code:

Main:

    while 1:
        if len(parser()) > 0:  # arguments are valid
            if len(parser()) == 3:
                log_path = parser()['log_path']
            else:
                log_path = os.getcwd()
            paths = parser()
            if paths:
                handler = Event_Handler()
                observer = Observer()
                observer.schedule(handler, paths['src_fld'], True)
                observer.start()
                try:
                    while True:
                        time.sleep(1)
                except KeyboardInterrupt:
                    observer.stop()
                observer.join()
        else:
            exit(1)

Event_Handler class:

    class Event_Handler(FileSystemEventHandler):
        def on_created(self, event):
            if not event.is_directory:
                time.sleep(1)

As I said, without that time.sleep(1), processing a big file fails because it has not been completely written yet.

Answer

For the sake of any future readers who stumble upon this question, as I have, the answer appears to be that you cannot. Watchdog does not and will not support any feature to tell whether a file is "complete", because Windows doesn't allow for it and Watchdog is meant to be system-agnostic.

If you're on Linux, using inotify directly is probably a safe bet, since it can report when a writer closes a file (the IN_CLOSE_WRITE event). Otherwise, on Windows, the best solutions I've found are:

Upload a big file, bigfile, and then a second marker file, bigfile-complete. When you find a file called name-complete, you go back and upload/transfer/react to the original file name. In this case, your files would all arrive in the monitored directory in the order file, file-complete, file2, file2-complete, . . .

Poll the size of the file until it has remained fixed for a suitable length of time. When it hasn't changed for long enough that you can be reasonably certain it is finished, react to it as normal.

Similarly, when a file is being uploaded to your directory in bits and pieces, it will generate a constant stream of file-modified Watchdog events. You can watch these instead of the file size, waiting until they've stopped for a reasonable length of time, and then assume the file is complete and proceed.
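A minimal sketch of that event-based idea, under stated assumptions: QuietPeriodTracker, touch and pop_ready are hypothetical names (not part of Watchdog), and the 2-second quiet period is an arbitrary choice. The handler would call touch(event.src_path) from both on_created and on_modified, and the main loop would call pop_ready() on each pass to collect files that have gone quiet.

```python
import threading
import time


class QuietPeriodTracker:
    """Hypothetical helper: remembers when each path last produced an event."""

    def __init__(self, quiet_seconds=2.0, clock=time.monotonic):
        self.quiet_seconds = quiet_seconds  # assumed threshold, tune as needed
        self._clock = clock
        self._last_seen = {}
        # Handlers run in the observer thread, so guard the dict with a lock.
        self._lock = threading.Lock()

    def touch(self, path):
        """Record an event for path (call from on_created / on_modified)."""
        with self._lock:
            self._last_seen[path] = self._clock()

    def pop_ready(self):
        """Return and forget every path with no events for quiet_seconds."""
        now = self._clock()
        with self._lock:
            ready = [path for path, seen in self._last_seen.items()
                     if now - seen >= self.quiet_seconds]
            for path in ready:
                del self._last_seen[path]
        return ready
```

In the question's main loop, which already wakes up once per second, each pass could process whatever pop_ready() returns. This is still a heuristic: a writer that pauses for longer than the quiet period will be treated as finished.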

None of these solutions is perfect, but this seems to be an issue inherent to Watchdog on Windows. Unfortunately the "perfect" solution seems to be "swap to Linux and use inotify".
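For reference, the size-polling option could look roughly like this. It is only a sketch: wait_until_stable is a hypothetical helper name, and the values for checks, interval and timeout are assumptions to be tuned to how the files are actually written.

```python
import os
import time


def wait_until_stable(path, checks=3, interval=1.0, timeout=60.0):
    """Return True once the size of path is unchanged for `checks` polls in a row.

    Returns False if that does not happen before `timeout` seconds elapse.
    """
    deadline = time.monotonic() + timeout
    last_size = -1
    stable = 0
    while time.monotonic() < deadline:
        try:
            size = os.path.getsize(path)
        except OSError:
            # File is missing or not accessible yet: treat as "not stable".
            size = -1
        if size >= 0 and size == last_size:
            stable += 1
            if stable >= checks:
                return True
        else:
            # Size changed (or file unavailable): restart the count.
            stable = 0
            last_size = size
        time.sleep(interval)
    return False
```

Inside on_created, this would replace the fixed time.sleep(1) with something like "if wait_until_stable(event.src_path): process the file". Note that it blocks the handler while it waits, so for many simultaneous files it may be better to run it in a worker thread.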
