feat(tests): add comprehensive benchmarks for auth and performance-critical endpoints

- Introduced benchmarks for password hashing, verification, and JWT token operations.
- Added latency tests for `/register`, `/refresh`, `/sessions`, and `/users/me` endpoints.
- Updated `BENCHMARKS.md` with new tests, thresholds, and execution details.
This commit is contained in:
2026-03-01 17:01:44 +01:00
parent ce4d0c7b0d
commit 0760a8284d
2 changed files with 195 additions and 9 deletions

View File

@@ -72,8 +72,8 @@ The test suite includes two categories of performance tests:
| Type | How it works | Examples |
|------|-------------|----------|
| **pytest-benchmark tests** | Uses the `benchmark` fixture for precise, multi-round timing | `test_health_endpoint_performance`, `test_openapi_schema_performance` |
| **Manual latency tests** | Uses `time.perf_counter` with explicit thresholds (for async endpoints that pytest-benchmark doesn't support natively) | `test_login_latency`, `test_get_current_user_latency` |
| **pytest-benchmark tests** | Uses the `benchmark` fixture for precise, multi-round timing | `test_health_endpoint_performance`, `test_openapi_schema_performance`, `test_password_hashing_performance`, `test_password_verification_performance`, `test_access_token_creation_performance`, `test_refresh_token_creation_performance`, `test_token_decode_performance` |
| **Manual latency tests** | Uses `time.perf_counter` with explicit thresholds (for async endpoints that pytest-benchmark doesn't support natively) | `test_login_latency`, `test_get_current_user_latency`, `test_register_latency`, `test_token_refresh_latency`, `test_sessions_list_latency`, `test_user_profile_update_latency` |
### 3. Regression detection
@@ -130,12 +130,21 @@ This is a **relative ranking within the current run** — red does NOT mean the
For this project's current endpoints:
| Endpoint | Expected range | Why |
|----------|---------------|-----|
| Test | Expected range | Why |
|------|---------------|-----|
| `GET /health` | ~11.5ms | Minimal logic, mocked DB check |
| `GET /api/v1/openapi.json` | ~1.52.5ms | Serializes entire API schema |
| `POST /api/v1/auth/login` | < 500ms threshold | Includes bcrypt password hashing |
| `get_password_hash` | ~200ms | CPU-bound bcrypt hashing |
| `verify_password` | ~200ms | CPU-bound bcrypt verification |
| `create_access_token` | ~1720µs | JWT encoding with HMAC-SHA256 |
| `create_refresh_token` | ~1720µs | JWT encoding with HMAC-SHA256 |
| `decode_token` | ~2025µs | JWT decoding and claim validation |
| `POST /api/v1/auth/login` | < 500ms threshold | Includes bcrypt password verification |
| `POST /api/v1/auth/register` | < 500ms threshold | Includes bcrypt password hashing |
| `POST /api/v1/auth/refresh` | < 200ms threshold | Token rotation + DB session update |
| `GET /api/v1/users/me` | < 200ms threshold | DB lookup + token validation |
| `GET /api/v1/sessions/me` | < 200ms threshold | Session list query + token validation |
| `PATCH /api/v1/users/me` | < 200ms threshold | DB update + token validation |
---
@@ -297,6 +306,6 @@ If StdDev is high relative to the Mean, results may be unreliable. Common causes
Try running benchmarks on an idle system or increasing `min_rounds` in `pyproject.toml`.
### Only 2 of 4 tests run
### Only 7 of 13 tests run
The async tests (`test_login_latency`, `test_get_current_user_latency`) are skipped during `--benchmark-only` runs because they don't use the `benchmark` fixture. They run as part of the normal test suite (`make test`) with manual threshold assertions.
The async tests (`test_login_latency`, `test_get_current_user_latency`, `test_register_latency`, `test_token_refresh_latency`, `test_sessions_list_latency`, `test_user_profile_update_latency`) are skipped during `--benchmark-only` runs because they don't use the `benchmark` fixture. They run as part of the normal test suite (`make test`) with manual threshold assertions.

View File

@@ -2,7 +2,7 @@
Performance Benchmark Tests.
These tests establish baseline performance metrics for critical API endpoints
and detect regressions when response times degrade significantly.
and core operations, detecting regressions when response times degrade.
Usage:
make benchmark # Run benchmarks and save baseline
@@ -20,10 +20,21 @@ import pytest
import pytest_asyncio
from fastapi.testclient import TestClient
from app.core.auth import (
create_access_token,
create_refresh_token,
decode_token,
get_password_hash,
verify_password,
)
from app.main import app
pytestmark = [pytest.mark.benchmark]
# Pre-computed hash for sync benchmarks (avoids hashing in every iteration)
_BENCH_PASSWORD = "BenchPass123!"
_BENCH_HASH = get_password_hash(_BENCH_PASSWORD)
# =============================================================================
# Fixtures
@@ -55,6 +66,50 @@ def test_openapi_schema_performance(sync_client, benchmark):
assert result.status_code == 200
# =============================================================================
# Core Crypto & Token Benchmarks (no DB required)
#
# These benchmark the CPU-intensive operations that underpin auth:
# password hashing, verification, and JWT creation/decoding.
# =============================================================================
def test_password_hashing_performance(benchmark):
"""Benchmark: bcrypt password hashing (CPU-bound, ~100ms expected)."""
result = benchmark(get_password_hash, _BENCH_PASSWORD)
assert result.startswith("$2b$")
def test_password_verification_performance(benchmark):
"""Benchmark: bcrypt password verification against a known hash."""
result = benchmark(verify_password, _BENCH_PASSWORD, _BENCH_HASH)
assert result is True
def test_access_token_creation_performance(benchmark):
"""Benchmark: JWT access token generation."""
user_id = str(uuid.uuid4())
token = benchmark(create_access_token, user_id)
assert isinstance(token, str)
assert len(token) > 0
def test_refresh_token_creation_performance(benchmark):
"""Benchmark: JWT refresh token generation."""
user_id = str(uuid.uuid4())
token = benchmark(create_refresh_token, user_id)
assert isinstance(token, str)
assert len(token) > 0
def test_token_decode_performance(benchmark):
"""Benchmark: JWT token decoding and validation."""
user_id = str(uuid.uuid4())
token = create_access_token(user_id)
payload = benchmark(decode_token, token, "access")
assert payload.sub == user_id
# =============================================================================
# Database-dependent Endpoint Benchmarks (async, manual timing)
#
@@ -65,12 +120,15 @@ def test_openapi_schema_performance(sync_client, benchmark):
MAX_LOGIN_MS = 500
MAX_GET_USER_MS = 200
MAX_REGISTER_MS = 500
MAX_TOKEN_REFRESH_MS = 200
MAX_SESSIONS_LIST_MS = 200
MAX_USER_UPDATE_MS = 200
@pytest_asyncio.fixture
async def bench_user(async_test_db):
"""Create a test user for benchmark tests."""
from app.core.auth import get_password_hash
from app.models.user import User
_test_engine, AsyncTestingSessionLocal = async_test_db
@@ -102,6 +160,17 @@ async def bench_token(client, bench_user):
return response.json()["access_token"]
@pytest_asyncio.fixture
async def bench_refresh_token(client, bench_user):
"""Get a refresh token for the benchmark user."""
response = await client.post(
"/api/v1/auth/login",
json={"email": "bench@example.com", "password": "BenchPass123!"},
)
assert response.status_code == 200, f"Login failed: {response.text}"
return response.json()["refresh_token"]
@pytest.mark.asyncio
async def test_login_latency(client, bench_user):
"""Performance: POST /api/v1/auth/login must respond under threshold."""
@@ -148,3 +217,111 @@ async def test_get_current_user_latency(client, bench_token):
assert mean_ms < MAX_GET_USER_MS, (
f"Get user latency regression: {mean_ms:.1f}ms exceeds {MAX_GET_USER_MS}ms threshold"
)
@pytest.mark.asyncio
async def test_register_latency(client):
"""Performance: POST /api/v1/auth/register must respond under threshold."""
iterations = 3
total_ms = 0.0
for i in range(iterations):
start = time.perf_counter()
response = await client.post(
"/api/v1/auth/register",
json={
"email": f"benchreg{i}@example.com",
"password": "BenchRegPass123!",
"first_name": "Bench",
"last_name": "Register",
},
)
elapsed_ms = (time.perf_counter() - start) * 1000
total_ms += elapsed_ms
assert response.status_code == 201, f"Register failed: {response.text}"
mean_ms = total_ms / iterations
print(
f"\n Register mean latency: {mean_ms:.1f}ms (threshold: {MAX_REGISTER_MS}ms)"
)
assert mean_ms < MAX_REGISTER_MS, (
f"Register latency regression: {mean_ms:.1f}ms exceeds {MAX_REGISTER_MS}ms threshold"
)
@pytest.mark.asyncio
async def test_token_refresh_latency(client, bench_refresh_token):
"""Performance: POST /api/v1/auth/refresh must respond under threshold."""
iterations = 5
total_ms = 0.0
for _ in range(iterations):
start = time.perf_counter()
response = await client.post(
"/api/v1/auth/refresh",
json={"refresh_token": bench_refresh_token},
)
elapsed_ms = (time.perf_counter() - start) * 1000
total_ms += elapsed_ms
assert response.status_code == 200, f"Refresh failed: {response.text}"
# Use the new refresh token for the next iteration
bench_refresh_token = response.json()["refresh_token"]
mean_ms = total_ms / iterations
print(
f"\n Token refresh mean latency: {mean_ms:.1f}ms (threshold: {MAX_TOKEN_REFRESH_MS}ms)"
)
assert mean_ms < MAX_TOKEN_REFRESH_MS, (
f"Token refresh latency regression: {mean_ms:.1f}ms exceeds {MAX_TOKEN_REFRESH_MS}ms threshold"
)
@pytest.mark.asyncio
async def test_sessions_list_latency(client, bench_token):
"""Performance: GET /api/v1/sessions must respond under threshold."""
iterations = 10
total_ms = 0.0
for _ in range(iterations):
start = time.perf_counter()
response = await client.get(
"/api/v1/sessions/me",
headers={"Authorization": f"Bearer {bench_token}"},
)
elapsed_ms = (time.perf_counter() - start) * 1000
total_ms += elapsed_ms
assert response.status_code == 200
mean_ms = total_ms / iterations
print(
f"\n Sessions list mean latency: {mean_ms:.1f}ms (threshold: {MAX_SESSIONS_LIST_MS}ms)"
)
assert mean_ms < MAX_SESSIONS_LIST_MS, (
f"Sessions list latency regression: {mean_ms:.1f}ms exceeds {MAX_SESSIONS_LIST_MS}ms threshold"
)
@pytest.mark.asyncio
async def test_user_profile_update_latency(client, bench_token):
"""Performance: PATCH /api/v1/users/me must respond under threshold."""
iterations = 5
total_ms = 0.0
for i in range(iterations):
start = time.perf_counter()
response = await client.patch(
"/api/v1/users/me",
headers={"Authorization": f"Bearer {bench_token}"},
json={"first_name": f"Bench{i}"},
)
elapsed_ms = (time.perf_counter() - start) * 1000
total_ms += elapsed_ms
assert response.status_code == 200, f"Update failed: {response.text}"
mean_ms = total_ms / iterations
print(
f"\n User update mean latency: {mean_ms:.1f}ms (threshold: {MAX_USER_UPDATE_MS}ms)"
)
assert mean_ms < MAX_USER_UPDATE_MS, (
f"User update latency regression: {mean_ms:.1f}ms exceeds {MAX_USER_UPDATE_MS}ms threshold"
)