Tuesday, October 19, 2021

Tuning uWSGI performance with ab

Overview
Environment
App
Measuring the current performance
Enabling multithreading
Disabling logging
Setting the worker lifetime
Bonus: running compiled Python
Wrapping up
ab troubleshooting
Reference sites

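Overview

I came across an article that tunes uWSGI performance by actually measuring it with ab, so I tried the same thing myself.

Measuring step by step like this may be the only way to arrive at the optimal values.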

Environment

Ubuntu 18.04
Python 3.8.3
flask 2.0.2
uWSGI 2.0.19.1
ApacheBench 2.3

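App

As in the article, I use an app that computes the Fibonacci sequence.
It is a very simple app with no access to a database, external services, or a cache, so the ab numbers reflect the performance of uWSGI itself.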

vim app.py

from flask import Flask
import json
import fib

app = Flask(__name__)

@app.route("/<number>", methods=['GET'])
def get_fib(number):
    return json.dumps(fib.get(int(number))), 200

if __name__ == '__main__':
    app.run(host="0.0.0.0", port=8080)

vim fib.py

def get(number):
    sequence = [0, 1]
    while sequence[-1] < number:
        sequence.append(sequence[-2] + sequence[-1])
    return sequence

Configure uwsgi.ini as follows.

vim uwsgi.ini

[uwsgi]
http=:80
wsgi-file=app.py
callable=app

Check that it works.

pipenv run uwsgi uwsgi.ini
curl localhost/9000

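Confirm that this returns the Fibonacci sequence for 9000. The loop in fib.py stops at the first value of 9000 or more, so the list ends with 10946.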

[0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584, 4181, 6765, 10946]
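
To see why the list ends at 10946 rather than at a value below 9000, here is a small standalone check. It copies the get function from fib.py above; the calls and the printed values are my own additions for illustration.

```python
def get(number):
    # Same logic as fib.py: keep appending until the last value reaches `number`
    sequence = [0, 1]
    while sequence[-1] < number:
        sequence.append(sequence[-2] + sequence[-1])
    return sequence


result = get(9000)
# The loop only stops once the last element is >= 9000, so one value
# at or past the threshold is always included.
print(result[-2], result[-1])  # 6765 10946
```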

Measuring the current performance

Use the ab command to measure how much performance the current uwsgi.ini delivers.

ab -c 500 -n 5000 -s 90 192.168.1.2/9000

Requests per second comes out at 643.03.

Concurrency Level:      500
Time taken for tests:   7.776 seconds
Complete requests:      5000
Failed requests:        0
Total transferred:      880000 bytes
HTML transferred:       485000 bytes
Requests per second:    643.03 [#/sec] (mean)
Time per request:       777.565 [ms] (mean)
Time per request:       1.555 [ms] (mean, across all concurrent requests)
Transfer rate:          110.52 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    9  69.8      0    1023
Processing:    30  434 1310.9     70    7712
Waiting:       24  434 1310.9     70    7712
Total:         48  443 1326.2     70    7746
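
The headline figures in the ab output are related by simple arithmetic, which helps when reading the later results. The sketch below recomputes them from the request count, the concurrency, and the elapsed time. The function name ab_summary is hypothetical, and the elapsed time of 7.77565 s is back-calculated from the reported Time per request, since ab prints Time taken rounded to three decimals.

```python
def ab_summary(requests, concurrency, seconds):
    """Recompute ab's headline numbers from the raw totals."""
    # Throughput: completed requests divided by wall-clock time
    rps = requests / seconds
    # Mean time per request as seen by one of the concurrent clients
    per_request_ms = concurrency * seconds * 1000 / requests
    # Mean time per request across all concurrent clients
    per_request_all_ms = seconds * 1000 / requests
    return rps, per_request_ms, per_request_all_ms


rps, per_req, per_req_all = ab_summary(5000, 500, 7.77565)
print(f"{rps:.2f} {per_req:.3f} {per_req_all:.3f}")
```

The three printed values should line up with the Requests per second and the two Time per request lines in the output above.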

Enabling multithreading

Let's add processes=4, which starts four worker processes.

vim uwsgi.ini

[uwsgi]
http=:80
wsgi-file=app.py
callable=app
processes=4

Start uWSGI again with this setting and the startup log now contains lines like these.


spawned uWSGI worker 1 (pid: 60526, cores: 1)
spawned uWSGI worker 2 (pid: 60566, cores: 1)
spawned uWSGI worker 3 (pid: 60567, cores: 1)
spawned uWSGI worker 4 (pid: 60568, cores: 1)

Running ab in this state gave the following result.
Requests per second rose to 1250.73.

Concurrency Level:      500
Time taken for tests:   3.998 seconds
Complete requests:      5000
Failed requests:        0
Total transferred:      880000 bytes
HTML transferred:       485000 bytes
Requests per second:    1250.73 [#/sec] (mean)
Time per request:       399.768 [ms] (mean)
Time per request:       0.800 [ms] (mean, across all concurrent requests)
Transfer rate:          214.97 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0   22 128.7      0    1047
Processing:    15  154 370.7     53    3836
Waiting:       14  152 370.5     52    3836
Total:         22  176 403.6     54    3843

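As the referenced article also notes, use lshw or a similar tool and tune the value to the number of CPUs and cores on the machine you are running on.

For example, on a 2-core CPU, also enabling threads improves performance further.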

# lshw -short -class cpu
H/W path           Device       Class      Description
======================================================
/0/1                            processor  Intel(R) Xeon(R) CPU E5-2690 v2 @ 3.00GHz
/0/2                            processor  Intel(R) Xeon(R) CPU E5-2690 v2 @ 3.00GHz

vim uwsgi.ini

[uwsgi]
http=:80
wsgi-file=app.py
callable=app
processes=4
threads=2
enable-threads=True

Requests per second rose to 2309.60.
That is about 1.85 times the previous result, so close to double.

Concurrency Level:      500
Time taken for tests:   2.165 seconds
Complete requests:      5000
Failed requests:        0
Total transferred:      880000 bytes
HTML transferred:       485000 bytes
Requests per second:    2309.60 [#/sec] (mean)
Time per request:       216.488 [ms] (mean)
Time per request:       0.433 [ms] (mean, across all concurrent requests)
Transfer rate:          396.96 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0   45 182.4     10    1046
Processing:     5  123 231.1     49    1285
Waiting:        5  117 230.9     44    1273
Total:         12  168 313.5     58    1700

Disabling logging

Set disable-logging to True.
During the ab test a log line was printed for every request; with this setting those lines are no longer printed.

vim uwsgi.ini

[uwsgi]
http=:80
wsgi-file=app.py
callable=app
processes=4
threads=2
enable-threads=True
disable-logging=True

Run ab again.
This setting did not affect performance much.
Requests per second was about 2311.59, essentially the same as before.
The process is to change values like this and find out which ones actually affect performance.

Concurrency Level:      500
Time taken for tests:   2.163 seconds
Complete requests:      5000
Failed requests:        0
Total transferred:      880000 bytes
HTML transferred:       485000 bytes
Requests per second:    2311.59 [#/sec] (mean)
Time per request:       216.301 [ms] (mean)
Time per request:       0.433 [ms] (mean, across all concurrent requests)
Transfer rate:          397.30 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0   33 168.8      1    1045
Processing:     2  111 230.4     42    1789
Waiting:        2  109 230.1     40    1789
Total:         14  144 307.3     44    1886
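
Repeating this change-and-measure loop by hand gets tedious, so it can help to script it. The following is a minimal sketch, assuming ab is on the PATH and reusing the -c, -n, and -s values from the commands above. The function names parse_rps and run_ab are hypothetical.

```python
import re
import subprocess


def parse_rps(ab_output):
    """Extract the 'Requests per second' value from ab's text output."""
    match = re.search(r"Requests per second:\s+([\d.]+)", ab_output)
    if match is None:
        raise ValueError("Requests per second not found in ab output")
    return float(match.group(1))


def run_ab(url, concurrency=500, requests=5000, timeout=90):
    """Run ab against url and return the measured requests per second."""
    cmd = ["ab", "-c", str(concurrency), "-n", str(requests),
           "-s", str(timeout), url]
    completed = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_rps(completed.stdout)


# Parsing a line copied from the output above
sample = "Requests per second:    2311.59 [#/sec] (mean)"
print(parse_rps(sample))  # 2311.59
```

After each change to uwsgi.ini and a restart of uWSGI, a call such as run_ab("192.168.1.2/9000") would return the new figure for comparison.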

Setting the worker lifetime

Earlier, processes=4 made uWSGI start four workers.
max-worker-lifetime is the setting that automatically restarts these workers after the specified time has elapsed.
Here I set it to 30 (seconds).

vim uwsgi.ini

[uwsgi]
http=:80
wsgi-file=app.py
callable=app
processes=4
threads=2
enable-threads=True
disable-logging=True
max-worker-lifetime=30

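Requests per second came to 2336.88.
It does not seem to affect performance much, though there is a slight improvement of about 1%, which may be within run-to-run variation.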

Concurrency Level:      500
Time taken for tests:   2.140 seconds
Complete requests:      5000
Failed requests:        0
Total transferred:      880000 bytes
HTML transferred:       485000 bytes
Requests per second:    2336.88 [#/sec] (mean)
Time per request:       213.960 [ms] (mean)
Time per request:       0.428 [ms] (mean, across all concurrent requests)
Transfer rate:          401.65 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0   49 198.0      4    1050
Processing:     7  120 254.4     44    1296
Waiting:        4  117 253.2     40    1292
Total:         11  169 362.1     46    1866
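
Putting the five runs side by side makes it easier to see which settings mattered. The sketch below uses only the Requests per second values reported in this post and computes each one relative to the baseline. The labels are my own shorthand.

```python
# Requests per second as reported by ab in each section of this post
results = [
    ("baseline", 643.03),
    ("processes=4", 1250.73),
    ("+ threads=2", 2309.60),
    ("+ disable-logging", 2311.59),
    ("+ max-worker-lifetime=30", 2336.88),
]

baseline = results[0][1]
# Ratio of each run's throughput to the baseline run
speedups = {label: rps / baseline for label, rps in results}

for label, ratio in speedups.items():
    print(f"{label:<26} {ratio:.2f}x")
```

Almost all of the gain comes from processes and threads; the last two settings each change the result by about 1% or less.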

Bonus: running compiled Python

This also appears to have some effect on performance.
I do not cover it here, but compiling with Cython and running natively reportedly improves performance further.

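Wrapping up

I showed how to tune uWSGI performance using ab.
There are many more parameters than the ones covered here, so it comes down to finding the parameters that work best for your own setup.

In a real service, databases and external services are involved, so Requests per second will probably be lower.
At that point the work shifts from tuning uWSGI to tuning the database and cache, so a feel for that side is needed as well.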

ab troubleshooting

These errors occur when running ab on macOS.
It is a hassle in several ways, so I recommend simply running ab on Ubuntu or similar.

apr_socket_recv: Connection reset by peer (54)

Update ab to the latest version.
The ab that ships with macOS is 2.3, which triggers the error above, so install the latest version with brew install httpd and use that one instead.

socket: Too many open files (24)

On macOS the default file descriptor limit is 256, so run ab with the -c option set to 200.
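
The current descriptor limit can be checked from Python before choosing -c. This is a sketch for Unix-like systems using the standard resource module. The helper safe_concurrency and its headroom of 56 are assumptions, chosen so that a limit of 256 yields the 200 suggested above.

```python
import resource


def safe_concurrency(soft_limit, headroom=56):
    # Leave some descriptors for stdin/stdout and libraries (assumed margin)
    return max(1, soft_limit - headroom)


# Soft and hard limits on open file descriptors for this process
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"soft limit: {soft}, hard limit: {hard}")
print(f"suggested -c for a limit of 256: {safe_concurrency(256)}")
```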

apr_socket_connect(): Invalid argument (22)

localhost cannot be specified here, so specify the machine's local IP address instead.

hawksnowlog