Asynchronous or Synchronous

While learning the Python web framework FastAPI, I read its introduction, which lists advantages such as "supports asynchronous code, performs well, and is one of the fastest Python web frameworks".

The simplest FastAPI app looks like this:

from fastapi import FastAPI

app = FastAPI()

@app.get("/")
async def root():
    return {"message": "Hello World"}

Note the async def keyword. If you have used asynchronous Python libraries, it will look familiar. These libraries tell you to put the keyword await in front of their calls. But when you add it, the editor complains that await can only be used inside an asynchronous function, so you add async before def. Hover the mouse over the function after await and you will see that it returns a coroutine. If you call such an asynchronous function directly like a(), its body does not run: you only get a coroutine object back (plus a "coroutine ... was never awaited" RuntimeWarning when that object is discarded). To actually run it from synchronous code, use asyncio.run().
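A minimal sketch of that behaviour, using only the standard library. The function name a() follows the text above; the return value 42 is just an illustrative placeholder.

```python
import asyncio
import inspect


async def a():
    return 42


coro = a()                          # the body of a() has not run yet
print(inspect.iscoroutine(coro))    # True: we only hold a coroutine object
result = asyncio.run(coro)          # the event loop actually runs it
print(result)                       # 42
```

Inside another async function you would write await a() instead; asyncio.run() is only the entry point from synchronous code.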

Asynchronous code can tell the program that at some point during its execution it will have to "wait" (for disk IO or the network, for example). While it waits it doesn't need to occupy the CPU, so it can hand control over to other tasks. When the operation really completes, the program is told the result. In this way, the other tasks in the program do not have to wait for this slow task. In other words, synchronous code executes strictly in order, while asynchronous code switches between different tasks.

In Python, await a() suspends the current coroutine until a() has finished and then gives back its result; while it is suspended, the event loop is free to run other tasks. a() itself, or some point inside it, can be paused, and it will produce a result in the future. In the official Python documentation, what a() returns is called an "awaitable object" (https://docs.python.org/zh-cn/3/library/asyncio-task.html#id3).

If an object can be used in an await statement, it is an awaitable object.
Awaitable objects come in three main types: coroutines, tasks, and Futures.

A Future represents the result of an asynchronous computation. It is a low-level object that is not thread-safe.

In other words, what follows await can be a coroutine, a Task, or a Future.
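The three kinds of awaitable can be put side by side in a short sketch. The helper name compute() and the returned strings are made up for illustration; the asyncio calls are standard library.

```python
import asyncio


async def compute():
    await asyncio.sleep(0)      # give control back to the event loop once
    return "done"


async def main():
    loop = asyncio.get_running_loop()

    # 1. A coroutine: runs inside the current task when awaited.
    r1 = await compute()

    # 2. A Task: the coroutine is scheduled on the event loop right away.
    task = asyncio.create_task(compute())
    r2 = await task

    # 3. A Future: a low-level placeholder whose result is set by someone else.
    fut = loop.create_future()
    loop.call_soon(fut.set_result, "set later")
    r3 = await fut

    return r1, r2, r3


print(asyncio.run(main()))      # ('done', 'done', 'set later')
```

In application code you mostly deal with coroutines and tasks; Futures tend to show up inside libraries.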

Consecutive awaits still run one after another. When await is followed by a coroutine, the coroutine runs inside the current task, and the next line is not reached until it has finished. No new task is created, so the second coroutine does not even start until the first one is done. To run coroutines concurrently, wrap them in tasks with asyncio.create_task() before awaiting them, or pass them to asyncio.gather(), which schedules them as tasks for you.

import asyncio
import time


async def f1():
    print(f'f1--start-{time.time()}')
    await asyncio.sleep(1)
    print(f'f1--end-{time.time()}')


async def f2():
    print(f'f2--start-{time.time()}')
    await asyncio.sleep(1)
    print(f'f2--end-{time.time()}')


async def main():
    await f1()
    await f2()


if __name__ == '__main__':
    asyncio.run(main())

f1--start-1681478831.0485213
f1--end-1681478832.0549097
f2--start-1681478832.0549097
f2--end-1681478833.0619018

Change the main function to:

async def main():
    await asyncio.gather(f1(), f2())

f1--start-1681478900.615097
f2--start-1681478900.615097
f1--end-1681478901.6200216
f2--end-1681478901.6200216

Or create tasks at the same time like this:


async def main():
    task1 = asyncio.create_task(f1())
    task2 = asyncio.create_task(f2())
    await task1
    await task2
    # or await asyncio.gather(task1, task2)

f1--start-1681479408.9956422
f2--start-1681479408.9956422
f1--end-1681479410.0039244
f2--end-1681479410.0039244

await can take a coroutine or a task. If it is a coroutine, await runs it directly inside the current task; it does not create a new task. With consecutive awaits, f1 starts executing as soon as main() awaits it. When f1 reaches await asyncio.sleep(1), control goes back to the event loop, but main() stays suspended at await f1() and only continues to the next line once f1 has completed. While main() is waiting for f1, f2 has not been called yet.

If you create the two tasks but never await them, main() returns right after creating task1 and task2, without waiting for them to complete. asyncio.run() then cancels the tasks that are still pending before it closes the event loop. In other words, only f1--start and f2--start are printed, with no end lines.
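A sketch of the "created but never awaited" case. The helper names worker() and run_demo() and the events list are hypothetical, added so the outcome can be inspected; the behaviour shown is what CPython's asyncio.run() does when main() returns while tasks are still pending.

```python
import asyncio


def run_demo():
    events = []

    async def worker(name):
        events.append(f"{name} start")
        try:
            await asyncio.sleep(1)
            events.append(f"{name} end")
        except asyncio.CancelledError:
            # asyncio.run() cancels tasks that are still pending at shutdown
            events.append(f"{name} cancelled")
            raise

    async def main():
        asyncio.create_task(worker("f1"))
        asyncio.create_task(worker("f2"))
        # no await here: main() returns immediately

    asyncio.run(main())
    return events


events = run_demo()
print(events)
```

Both workers get as far as their start line and are then cancelled during the sleep, so no end entry is ever recorded. The demo also finishes almost instantly instead of taking one second.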

Asynchronous code has many advantages: it can handle multiple tasks in one thread without creating a new thread for each task, which saves the overhead of thread switching and improves concurrency. While one task waits for IO, the program can work on other tasks instead of blocking.

The point of asynchrony is to keep the CPU busy instead of letting it idle while waiting, which improves the program's efficiency. A compute-intensive program has little waiting time to fill, so asynchrony brings little benefit there and only adds complexity. It suits scenarios with a large number of IO operations, such as network programming and web development.
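To illustrate the compute-intensive case, here is a small sketch in which the coroutines never await anything. The name cpu_bound() and the workload are invented for the example. Even under asyncio.gather(), the second coroutine cannot start until the first has finished, because there is no await at which to switch.

```python
import asyncio


async def cpu_bound(name, log):
    log.append(f"{name} start")
    total = sum(i * i for i in range(200_000))   # pure computation, no await
    log.append(f"{name} end")
    return total


async def main():
    log = []
    await asyncio.gather(cpu_bound("f1", log), cpu_bound("f2", log))
    return log


print(asyncio.run(main()))
# ['f1 start', 'f1 end', 'f2 start', 'f2 end']
```

Compare this with the asyncio.sleep() version earlier, where both start lines appear before either end line.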

When I saw the benefits of asynchrony, it occurred to me that database operations are IO operations too, so I could use asynchrony with the sqlalchemy ORM (object-relational mapping) (~~string splicing is good~~).

To write an asynchronous Python web application, you need libraries that support it: aiofiles for asynchronous file reading and writing instead of the normal file API, and the aiomysql driver (which is built on pymysql) as the database engine. If a function performs the same whether it is synchronous or asynchronous, or it contains no waiting (IO) operation, write it synchronously, which is easier to understand and debug.

The pitfalls of asynchronous sqlalchemy. You have to use create_async_engine, async with Session() as session:, await session.execute(sql), and so on, instead of the normal way of creating an engine and session.

In sqlalchemy, a relationship lets you fetch data from another table through a table object, which is very convenient. In asynchronous code, however, it threw an error with a traceback hundreds of lines long, full of keywords such as await, loop, and send. At first I didn't know the relationship was the cause; the error message only said that the event loop had exited early, and so on. Debugging by extracting the code block separately was troublesome, since an asynchronous function can't simply be called with (). Eventually I found out that the relationship was the problem. I simply stopped using it and performed the CRUD operations on the multiple tables by hand, which was indeed troublesome and error-prone.

The Internet has examples of both synchronous and asynchronous sqlalchemy usage, but it is often unclear which functions are asynchronous. For example, synchronous code uses both query and select; asynchronous code has no query function, but it does have select. The names are the same, but they are imported from different paths in sqlalchemy. ~~Why not look at the official website~~ (I can only say that the sqlalchemy documentation is too messy, and as for the function names between versions, it's hard to say).

sqlalchemy 2.0 was released not long ago:

Release: 2.0.9 | Release Date: April 5, 2023

sqlalchemy supports asynchronous code (~~from the function name create_async_engine~~), but the documentation is incomplete, and the function names are exactly the same as the synchronous ones.

The disadvantages of asynchronous programming are as follows:

High complexity: Asynchronous programming requires the use of callback functions, coroutines, event loops, and a series of concepts and techniques, which have a high learning and usage cost.
More prone to errors and difficult to debug: Due to the complex execution flow of asynchronous programming, it is more difficult to debug errors than synchronous programming.
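One concrete reason the execution flow is harder to follow: tasks finish in whatever order their waits happen to end, not in the order you wrote them. A small sketch, where the name fetch() and the delays are made-up stand-ins for real network calls:

```python
import asyncio


async def fetch(name, delay, finished):
    await asyncio.sleep(delay)      # stand-in for a network call
    finished.append(name)           # record the order of completion
    return name


async def main():
    finished = []
    results = await asyncio.gather(
        fetch("slow", 0.2, finished),
        fetch("fast", 0.05, finished),
    )
    return results, finished


results, finished = asyncio.run(main())
print(results)     # ['slow', 'fast']  gather keeps the argument order
print(finished)    # ['fast', 'slow']  completion order depends on the delays
```

With real IO the delays are not known in advance, so any code that depends on the completion order has to be written with that in mind.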

The difficulty of asynchrony: it is hard to reason about the code you write, because the order in which tasks run is not fixed in advance. It squeezes the CPU time and doesn't let the CPU idle.

Time is like a sponge full of water. As long as you are willing to squeeze, there will always be some. - Lu Xun

References:

https://fastapi.tiangolo.com/zh/async/#is-concurrency-better-than-parallelism

https://docs.python.org/zh-cn/3/library/asyncio-task.html#coroutines