Question 1

I have a class with various methods. One of them looks something like this:

    import asyncio


    class MyClass:
        async def master_method(self):
            tasks = [self.sub_method() for _ in range(10)]
            results = await asyncio.gather(*tasks)

        async def sub_method(self):
            subtasks = [self.my_task() for _ in range(10)]
            results = await asyncio.gather(*subtasks)

        async def my_task(self):
            return "task done"

So the questions here are:

Are there any issues, advantages or disadvantages with using asyncio.gather() inside coroutines that are themselves called from another asyncio.gather()? Are there any performance issues?
Are all tasks at all levels treated with the same priority by the asyncio loop? Would this give the same performance as calling all the coroutines with a single asyncio.gather() from master_method?

Question 2

TLDR: Using gather instead of returning tasks simplifies usage and makes code easier to maintain. While gather has some overhead, it is negligible for any practical application.

Why `gather`?

The point of using gather to accumulate child tasks before exiting a coroutine is to delay the completion of the coroutine until its child tasks are done. This encapsulates the implementation and ensures that the coroutine appears as one single entity "doing its thing".
The alternative is to return the child tasks and expect the caller to run them to completion.

For simplicity, let's look at a single layer – corresponding to the intermediate sub_method – but in different variations.

    import asyncio
    from typing import Awaitable, List


    async def child(i):
        await asyncio.sleep(0.2)  # some non-trivial payload
        print("child", i, "done")


    async def encapsulated() -> None:
        await asyncio.sleep(0.1)  # some preparation work
        children = [child(i) for i in range(10)]
        await asyncio.gather(*children)


    async def task_children() -> 'List[asyncio.Task]':
        await asyncio.sleep(0.1)  # some preparation work
        children = [asyncio.create_task(child(i)) for i in range(10)]
        return children


    async def coro_children() -> 'List[Awaitable[None]]':
        await asyncio.sleep(0.1)  # some preparation work
        children = [child(i) for i in range(10)]
        return children

All of encapsulated, task_children and coro_children in some way encode that there are sub-tasks. This allows the caller to run them in such a way that the actual goal is "done" reliably. However, each variant differs in how much it does by itself and how much the caller has to do:

The encapsulated variant is the "heaviest": all children are run in Tasks and there is an additional gather. However, the caller is not exposed to any of this:
```
await encapsulated()
```
This guarantees that the functionality works as intended, and its implementation can freely be changed.
The task_children variant is the intermediate one: all children are run in Tasks. The caller can decide if and how to wait for completion:
```
tasks = await task_children()
await asyncio.gather(*tasks)  # can add other tasks here as well
```
This guarantees that the functionality starts as intended. Its completion relies on the caller having some knowledge, though.
The coro_children variant is the "lightest": none of the children is actually run. The caller is responsible for their entire lifetime:
```
tasks = await coro_children()
# children don't actually run yet!
await asyncio.gather(*tasks)  # can add other tasks here as well
```
This completely relies on the caller to start and wait for the sub-tasks.

Using the encapsulated pattern is a safe default – it ensures that the coroutine "just works". Notably, a coroutine using an internal gather still appears like any other coroutine.
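As a minimal sketch of that last point, assuming a simplified encapsulated coroutine and a hypothetical, unrelated other_work coroutine (neither is taken verbatim from the question), the caller can gather the encapsulated coroutine like any other:

```python
import asyncio


async def child(i: int) -> int:
    await asyncio.sleep(0.01)  # stand-in for a non-trivial payload
    return i


async def encapsulated() -> list:
    # The internal gather is an implementation detail of this coroutine.
    return await asyncio.gather(*(child(i) for i in range(3)))


async def other_work() -> str:
    # Hypothetical unrelated coroutine, only here for illustration.
    await asyncio.sleep(0.01)
    return "other"


async def main() -> list:
    # The caller treats encapsulated() like any plain coroutine.
    return await asyncio.gather(encapsulated(), other_work())


results = asyncio.run(main())
print(results)  # [[0, 1, 2], 'other']
```

The outer gather neither knows nor cares that one of its arguments runs a gather of its own.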

`gather` speed?

The gather utility a) ensures that its arguments are run as Tasks and b) provides a Future that triggers once the tasks are done. Since gather is usually used when one would run the arguments as Tasks anyway, there is no additional overhead from this; likewise, these are regular Tasks and have the same performance/priority characteristics¹ as everything else.
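Both properties can be observed directly. The following is only an illustrative sketch, with my_task borrowed from the question and everything else assumed for the demonstration:

```python
import asyncio


async def my_task() -> str:
    return "task done"


async def main() -> tuple:
    coros = [my_task() for _ in range(3)]
    pending = asyncio.gather(*coros)  # schedules the coroutines as Tasks
    is_future = isinstance(pending, asyncio.Future)
    results = await pending  # resolves once all child Tasks are done
    return is_future, results


is_future, results = asyncio.run(main())
print(is_future, results)  # True ['task done', 'task done', 'task done']
```

Results come back in the order the awaitables were passed in, regardless of the order in which they finish.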

The only overhead is from the wrapping Future; this takes care of bookkeeping (ensuring the arguments are tasks) and then only waits, i.e. does nothing. On my machine, measuring the overhead shows that it takes on average about twice as long as running a no-op Task. This by itself should already be negligible for any real-world task.
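A rough way to check this on your own machine is sketched below. This is not the measurement referred to above, just one assumed way to compare a bare no-op Task against the same no-op wrapped in a gather; the helper names and repeat count are arbitrary, and absolute numbers will vary by machine and Python version.

```python
import asyncio
import time


async def noop() -> None:
    pass


async def run_as_task() -> None:
    # Baseline: a single no-op coroutine run as a Task.
    await asyncio.create_task(noop())


async def run_in_gather() -> None:
    # Same payload, but wrapped in a gather of one element.
    await asyncio.gather(noop())


async def timed(factory, repeats: int = 2000) -> float:
    # Average wall-clock seconds per call over `repeats` sequential runs.
    start = time.perf_counter()
    for _ in range(repeats):
        await factory()
    return (time.perf_counter() - start) / repeats


async def main() -> tuple:
    return await timed(run_as_task), await timed(run_in_gather)


task_time, gather_time = asyncio.run(main())
print(f"bare Task: {task_time * 1e6:.1f} us, gather: {gather_time * 1e6:.1f} us")
```

Either figure should be tiny compared to any payload that actually performs I/O.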

In addition, the pattern of gathering child tasks inherently means that there is a tree of gather nodes. Thus the number of gather nodes is usually much lower than the number of tasks. For example, for the case of 10 tasks per gather, a total of only 11 gathers is needed to handle a total of 100 tasks.

    master_method                                                  0
    sub_method         0          1          2          3          4          5 ...
    my_task       0123456789 0123456789 0123456789 0123456789 0123456789 0123456789 ...
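The count can be checked with a small sketch that mirrors the structure from the question. The counted_gather wrapper and the gather_calls counter are hypothetical helpers added purely for counting:

```python
import asyncio

gather_calls = 0  # counter used only for this illustration


async def counted_gather(*coros):
    # Thin hypothetical wrapper that counts how often gather is used.
    global gather_calls
    gather_calls += 1
    return await asyncio.gather(*coros)


async def my_task() -> str:
    return "task done"


async def sub_method() -> list:
    return await counted_gather(*(my_task() for _ in range(10)))


async def master_method() -> list:
    return await counted_gather(*(sub_method() for _ in range(10)))


nested = asyncio.run(master_method())
leaf_results = [r for group in nested for r in group]
print(gather_calls, len(leaf_results))  # 11 100
```

One gather in master_method plus one per sub_method gives 11 gathers for 100 leaf tasks.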

¹Which is to say, none. asyncio currently has no concept of Task priorities.
