Gunicorn

Every day feels like being stuck in a quagmire. Keep going.

Copied from docs/source/design.

Gunicorn ‘Green Unicorn’ is a Python WSGI HTTP Server for UNIX. It’s a pre-fork worker model ported from Ruby’s Unicorn project.

Server Model

Gunicorn is based on the pre-fork worker model. This means that there is a central master process that manages a set of worker processes. The master never knows anything about individual clients. All requests and responses are handled completely by worker processes.

Note

Each worker is a separate process.

Master

The master process is a simple loop that listens for various process signals and reacts accordingly. It manages the list of running workers by listening for signals like TTIN, TTOU, and CHLD. TTIN and TTOU tell the master to increase or decrease the number of running workers. CHLD indicates that a child process has terminated; in this case the master process automatically restarts the failed worker.
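
A toy version of that loop, just to make the shape concrete. This is a sketch, not gunicorn's Arbiter: it only tracks a target worker count and reaps dead children.

import os
import signal

class ToyMaster:

    def __init__(self):
        self.num_workers = 2
        signal.signal(signal.SIGTTIN, self.handle_ttin)
        signal.signal(signal.SIGTTOU, self.handle_ttou)
        signal.signal(signal.SIGCHLD, self.handle_chld)

    def handle_ttin(self, signum, frame):
        self.num_workers += 1                              # grow the pool

    def handle_ttou(self, signum, frame):
        self.num_workers = max(self.num_workers - 1, 0)    # shrink the pool

    def handle_chld(self, signum, frame):
        # Reap any dead children; the run loop would then fork replacements
        # so the pool climbs back to num_workers.
        try:
            while os.waitpid(-1, os.WNOHANG)[0] > 0:
                pass
        except ChildProcessError:
            pass

    def run(self):
        while True:
            signal.pause()   # sleep until a signal arrives
            # ...compare live workers against num_workers, fork or kill...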

Note

What exactly does "restart the failed worker" involve? What state is a failed worker process in, how is the failed state detected, and what does the restart actually do?

Sync Workers

The most basic and the default worker type is a synchronous worker class that handles a single request at a time. This model is the simplest to reason about, as any errors will affect at most a single request. Though, as we describe below, only processing a single request at a time requires some assumptions about how applications are programmed.
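
The whole worker really is just a blocking accept-and-handle loop. A minimal sketch (handle_request is a hypothetical callback, not gunicorn's SyncWorker):

import socket

def sync_worker(listener, handle_request):
    # One request at a time: nothing else is served while handle_request
    # runs, so a single slow request stalls the entire worker.
    while True:
        conn, addr = listener.accept()   # blocks until a client connects
        with conn:
            handle_request(conn, addr)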

Note

What exactly are these assumptions?

Async Workers

The asynchronous workers available are based on Greenlets (via Eventlet and Gevent). Greenlets are an implementation of cooperative multi-threading for Python. In general, an application should be able to make use of these worker classes with no changes.
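
Cooperative multi-threading means the greenlets hand control to one another explicitly, usually at I/O or sleep points, instead of being preempted by the OS. A minimal illustration with gevent (assumes gevent is installed):

import gevent

def task(name):
    for i in range(3):
        print(name, i)
        gevent.sleep(0)   # explicit yield point: another greenlet may run now

# The two greenlets interleave only because each one yields voluntarily.
gevent.joinall([gevent.spawn(task, "a"), gevent.spawn(task, "b")])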

Note

What is cooperative multi-threading? And if an ordinary application can use these worker classes with no changes, why can't we?

Tornado Workers

There’s also a Tornado worker class. It can be used to write applications using the Tornado framework. Although the Tornado workers are capable of serving a WSGI application, this is not a recommended configuration.

AsyncIO Workers

These workers are compatible with Python 3. There are two kinds of workers.

The gthread worker is a threaded worker. It accepts connections in the main loop; accepted connections are added to the thread pool as a connection job. On keepalive, connections are put back in the loop to wait for an event. If no event happens before the keepalive timeout, the connection is closed.

The worker gaiohttp is a full asyncio worker using aiohttp.
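
A rough sketch of the gthread idea using plain sockets and a thread pool. This is not gunicorn's worker: keepalive handling is omitted and every connection is closed after one response.

import socket
from concurrent.futures import ThreadPoolExecutor

def handle(conn):
    # A "connection job": read a request, write a canned response, close.
    with conn:
        conn.recv(65536)
        conn.sendall(b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok")

def serve(host="127.0.0.1", port=8000, threads=4):
    with ThreadPoolExecutor(max_workers=threads) as pool, \
         socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        listener.bind((host, port))
        listener.listen(128)
        while True:
            conn, _ = listener.accept()   # accept in the main loop
            pool.submit(handle, conn)     # the accepted socket is just an object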

Note

An accepted connection is really just a socket object, so it can be handed off to anything. Is that the idea?

A connection corresponds to exactly one client, so what is being kept alive here is the connection on the server side.

How does this keepalive differ in purpose from HTTP keepalive, and where is HTTP keepalive actually implemented?

Choosing a Worker Type

The default synchronous workers assume that your application is resource-bound in terms of CPU and network bandwidth. Generally this means that your application shouldn't do anything that takes an undefined amount of time. For instance, a request to the internet meets this criterion. At some point the external network will fail in such a way that clients will pile up on your servers.

This resource-bound assumption is why we require a buffering proxy in front of a default-configuration Gunicorn. If you exposed synchronous workers to the internet, a DoS attack would be trivial: just create a load that trickles data to the servers. For the curious, Slowloris is an example of this type of load.

Some examples of behavior requiring asynchronous workers:

  • Applications making long blocking calls (e.g., external web services)
  • Serving requests directly to the internet
  • Streaming requests and responses
  • Long polling
  • Web sockets
  • Comet

Note

An nginx reverse proxy is effectively a buffering proxy.

I still don't quite see why a buffering proxy prevents this kind of DoS: even with a buffer, once all the workers are occupied, normal requests are starved just the same.

The difference is only that without a buffer you would need far too many workers, or CPU and memory load would climb until the whole system falls over, or requests would have to be dropped to keep the load in check; with a buffer, at least some requests can still be handled normally.

Another document explains the reason in detail: nginx buffers slow clients.

It also looks as though, in async mode, socket events are watched by polling rather than by hanging on a blocking call as in sync mode. To be confirmed.

Why nginx & buffering

Although there are many HTTP proxies available, we strongly advise that you use Nginx. If you choose another proxy server you need to make sure that it buffers slow clients when you use default Gunicorn workers. Without this buffering Gunicorn will be easily susceptible to denial-of-service attacks. You can use slowloris to check if your proxy is behaving properly.

What is Slowloris?

Slowloris is basically an HTTP Denial of Service attack that affects threaded servers. It works like this:

  • We start making lots of HTTP requests.
  • We send headers periodically (every ~15 seconds) to keep the connections open.
  • We never close the connection unless the server does so. If the server closes a connection, we create a new one and keep doing the same thing.

This exhausts the server's thread pool and the server can't reply to other people.
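
A trickle client is only a few lines; this sketch is for checking a proxy you own, nothing more. It sends an incomplete request and keeps adding headers so the server never gets to respond:

import socket
import time

def trickle(host="127.0.0.1", port=8000, interval=15):
    s = socket.create_connection((host, port))
    s.sendall(b"GET / HTTP/1.1\r\nHost: test\r\n")   # request is never completed
    n = 0
    while True:
        time.sleep(interval)
        s.sendall(b"X-Trickle-%d: 1\r\n" % n)        # keeps the connection open
        n += 1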

Note

Still a little unclear. Is the point that keepalive is handled by nginx, so gunicorn itself doesn't hold connections open and therefore isn't hit by this kind of DoS? Conversely, when gunicorn runs as a standalone server in sync mode, does it support keepalive at all?

Note

Weak spots

  • Operating-system process management; what coroutines are, and so on.
  • Networking: how is TCP keepalive implemented, and what do TCP keepalive and gunicorn's gthread keepalive each take care of?
  • How push works, and whether long polling is the same thing as the "long-lived connection" people usually talk about.

How Many Workers?

DO NOT scale the number of workers to the number of clients you expect to have. Gunicorn should only need 4-12 worker processes to handle hundreds or thousands of requests per second.

Gunicorn relies on the operating system to provide all of the load balancing when handling requests. Generally we recommend (2 x $num_cores) + 1 as the number of workers to start off with. While not overly scientific, the formula is based on the assumption that for a given core, one worker will be reading or writing from the socket while the other worker is processing a request.
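
In a config file (plain Python, loaded with gunicorn -c gunicorn.conf.py) the formula is usually written as:

import multiprocessing

# (2 x $num_cores) + 1: a starting point, to be tuned under load
workers = multiprocessing.cpu_count() * 2 + 1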

Obviously, your particular hardware and application are going to affect the optimal number of workers. Our recommendation is to start with the above guess and tune using TTIN and TTOU signals while the application is under load.

Always remember, there is such a thing as too many workers. After a point your worker processes will start thrashing system resources, decreasing the throughput of the entire system.

Note

We need roughly 350 workers to handle 1000 qps, which is a long way from the 4-12 suggested here. How many workers do we currently run in each container?

How Many Threads?

Since Gunicorn 19, a threads option can be used to process requests in multiple threads. Using threads assumes use of the sync worker. One benefit from threads is that requests can take longer than the worker timeout while notifying the master process that it is not frozen and should not be killed. Depending on the system, using multiple threads, multiple worker processes, or some mixture, may yield the best results. For example, CPython may not perform as well as Jython when using threads, as threading is implemented differently by each. Using threads instead of processes is a good way to reduce the memory footprint of Gunicorn, while still allowing for application upgrades using the reload signal, as the application code will be shared among workers but loaded only in the worker processes (unlike when using the preload setting, which loads the code in the master process).
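
For example, a config-file starting point that mixes processes and threads; the numbers are only a guess to be tuned under load:

workers = 4   # separate processes: memory cost scales with this number
threads = 2   # threads per process: with the default worker class, a value
              # above 1 makes gunicorn use the threaded (gthread) worker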

Note

What exactly does "memory footprint" mean here?

Config

This way of wrapping the config is fairly clean:

class Config(object):
    # Excerpted from gunicorn/config.py; make_settings() builds a dict
    # mapping setting names to Setting objects.

    def __init__(self, usage=None, prog=None):
        self.settings = make_settings()

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails, so any known
        # setting name falls through to the settings dict.
        if name not in self.settings:
            raise AttributeError("No configuration setting for: %s" % name)
        return self.settings[name].get()

    def __setattr__(self, name, value):
        # Direct assignment to a known setting is rejected; anything else
        # (including the settings dict itself) is set normally.
        if name != "settings" and name in self.settings:
            raise AttributeError("Invalid access!")
        super(Config, self).__setattr__(name, value)
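
A quick usage sketch, assuming gunicorn is installed: reads fall through __getattr__ to the Setting objects, while direct assignment to a known setting is refused.

from gunicorn.config import Config

cfg = Config()
print(cfg.workers)       # default value, via settings["workers"].get()
try:
    cfg.workers = 4      # __setattr__ rejects direct assignment
except AttributeError as exc:
    print(exc)           # "Invalid access!"
cfg.set("workers", 4)    # settings are changed through set() instead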

Arbiter

The Arbiter is the actual entry point when gunicorn runs.

The Arbiter keeps the worker processes alive, launching or killing them as needed. It also manages application reloading via SIGHUP/USR2.

run() starts by calling start(), which sets up the listening master. Whether client or server, in the end there is no escaping sockets; the code that creates the master and its workers is likewise just the familiar signal and socket work.

After the master is up, self.manage_workers() forks a number of workers. Each worker is managed by a separate gunicorn.workers.base.Worker class. Once a Worker is initialized, it is passed as an argument into the forked worker process. One interesting point: because the Worker is constructed before the fork, its initializer cannot do anything specific to the worker process; instead a self.init_process() hook is provided for per-process setup. The Worker also carries a max_requests attribute.
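
A bare sketch of that pattern (not gunicorn's code): the object is built in the master, the fork happens, and only the child runs the per-process setup hook.

import os

class ToyWorker:

    def __init__(self, app):
        # Runs in the master, before the fork: keep this cheap and fork-safe.
        self.app = app

    def init_process(self):
        # Runs only in the child, after the fork: per-process setup goes here
        # (signal handlers, event loop, loading the application, ...).
        print("worker pid:", os.getpid())

def spawn(app):
    worker = ToyWorker(app)   # constructed before the fork
    pid = os.fork()
    if pid == 0:              # child process
        worker.init_process()
        os._exit(0)
    return pid                # the master remembers the child's pid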

Socket

First decide what kind of socket to create: if addr is a tuple, create a TCP socket; if it is a string, create a Unix socket.
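
In code terms the decision amounts to something like this (a sketch with a hypothetical helper name):

import socket

def socket_family(addr):
    # ("host", port) tuple -> TCP; a filesystem path string -> Unix socket
    if isinstance(addr, tuple):
        return socket.AF_INET
    return socket.AF_UNIX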

Then comes a loop that retries creating the socket up to 5 times. If GUNICORN_FD is already set, the socket is rebuilt with socket.fromfd; otherwise a plain socket.socket is created. Each iteration is the classic socket-server setup:

import socket  # needed if this snippet is run outside gunicorn.sock

# FAMILY is AF_INET (or AF_INET6 / AF_UNIX) depending on the address type
sock = socket.socket(self.FAMILY, socket.SOCK_STREAM)
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)   # no Nagle delay on small writes
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)   # allow quick rebind after restart
sock.bind(addr)
sock.setblocking(0)      # non-blocking: accept() is driven by the worker loop
sock.listen(backlog)     # backlog bounds the queue of not-yet-accepted connections

nginx & gunicorn & quixote

The components of WSGI.
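
For reference, the WSGI contract that ties nginx, gunicorn and quixote (or any other framework) together is just a callable taking environ and start_response. A minimal sketch:

def app(environ, start_response):
    # environ: a dict describing the request, built by the server (gunicorn)
    # start_response: a callable the app uses to begin the HTTP response
    body = b"hello"
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]

# gunicorn is pointed at this callable, e.g. run as: gunicorn module:app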