add live stress harness, app-level admin login rate limit

tests/stress/live_accuracy.mjs: classroom-scale accuracy + latency
test that targets the deployed server (single-session, sid=main).
Logs in as admin via /admin/login, resets the session, joins N
students serially over HTTP, opens N student WebSockets in batches
of 8 (250ms apart) plus the instructor WS, then drives every
question through the admin "next" command. Each student picks
uniformly random A-D, sends the submit, waits for the submit_ack,
and records the round-trip latency. After session_ended, the script
verifies that every student whose pick == correct got score > 0,
every other submission got score == 0, and reports p50/p95/p99
ack latency. First live run: 50 students, 100 submits, 100% acks,
100% accuracy match, p99 555ms (≈intercontinental RTT to HK).
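
The post-run checks described above can be sketched as follows. This is a minimal Python illustration, not the .mjs script's actual code: the record fields (`id`, `pick`, `correct`, `score`) and both function names are hypothetical, and nearest-rank is only one common way to define p50/p95/p99.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of latencies."""
    ordered = sorted(samples)
    # Rank is 1-based: the smallest rank covering pct% of the samples.
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def accuracy_mismatches(results):
    """Return ids of submissions whose score disagrees with the pick.

    A correct pick must score > 0; any other pick must score exactly 0.
    """
    bad = []
    for r in results:
        if r["pick"] == r["correct"]:
            ok = r["score"] > 0
        else:
            ok = r["score"] == 0
        if not ok:
            bad.append(r["id"])
    return bad
```

An empty list from `accuracy_mismatches` corresponds to the "100% accuracy match" reported for the first live run.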

tests/stress/live_loop.sh: tmux-friendly loop that runs the live
test every 60s and appends a JSONL summary line per cycle to
runs/live_summary.jsonl. Mirrors the morning's api_stress run_loop
shape so per-cycle aggregates are easy to scrape.

app/rate_limit.py: tiny in-memory token bucket. Capacity in tokens,
refill in tokens/minute, keyed by client IP via X-Forwarded-For (with
a fallback to request.client.host). State is process-local; admin
login is the only caller.

POST /admin/login: rate-limited at 10 attempts/minute/IP. Generous
for the legit instructor (who succeeds in 1-2 tries) and prohibitive
for brute force from a single attacker IP. Student endpoints are
deliberately NOT rate-limited because campus students share NAT
gateways and IP-level limits would false-positive a whole class.

The bucket is per-app-instance (instantiated inside the router
factory), so test apps each get a fresh one and tests don't poison
each other.
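
A framework-free sketch of the per-instance idea, assuming a condensed copy of the bucket. `make_login_handler`, the injectable `clock` parameter, and the 401/429 status codes are illustrative assumptions; the real router is a FastAPI one and may differ in all three.

```python
import time
from typing import Callable, Dict, Tuple


class TokenBucket:
    """Condensed per-key token bucket; `clock` is an illustrative addition."""

    def __init__(self, capacity: int, refill_per_minute: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = float(capacity)
        self.rate_per_sec = refill_per_minute / 60.0
        self.clock = clock
        # key -> (tokens remaining, timestamp of last update)
        self.buckets: Dict[str, Tuple[float, float]] = {}

    def take(self, key: str) -> bool:
        now = self.clock()
        tokens, last_ts = self.buckets.get(key, (self.capacity, now))
        # Refill for the elapsed time, never beyond capacity.
        tokens = min(self.capacity, tokens + (now - last_ts) * self.rate_per_sec)
        allowed = tokens >= 1.0
        self.buckets[key] = (tokens - 1.0 if allowed else tokens, now)
        return allowed


def make_login_handler(admin_password: str,
                       clock: Callable[[], float] = time.monotonic):
    """Hypothetical stand-in for the router factory.

    Each call builds its own bucket, so two handlers never share state.
    """
    limiter = TokenBucket(capacity=10, refill_per_minute=10, clock=clock)

    def login(ip: str, password: str) -> int:
        if not limiter.take(ip):
            return 429  # assumed status code for a throttled attempt
        return 200 if password == admin_password else 401

    return login
```

With this shape, ten rapid attempts from one IP drain that IP's bucket and the eleventh is rejected, while a different IP, or a handler built by a second `make_login_handler` call, still starts with a full bucket.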
ameer
2026-05-03 00:23:07 +08:00
parent 7a483ad3ee
commit 2136286275
5 changed files with 483 additions and 1 deletions

72
app/rate_limit.py Normal file

@@ -0,0 +1,72 @@
"""Tiny in-memory token-bucket rate limiter.

Used for `/admin/login` only. The student endpoints intentionally have
no IP-based throttling because a campus deployment puts ~40 students
behind one or a few NAT IPs; rate-limiting at the IP level would
false-positive the entire class.

For the admin login endpoint, IP-based limiting is appropriate: the
instructor logs in from a single device, and brute-force attempts
generally come from a few attacker IPs. Per-IP token bucket of
10 attempts / minute is generous for the legitimate user, hostile
to a guesser.
"""
from __future__ import annotations

import time
from dataclasses import dataclass
from typing import Optional

from fastapi import Request


@dataclass(slots=True)
class _Bucket:
    tokens: float
    last_ts: float


class TokenBucket:
    """Per-key (e.g., per-IP) token bucket.

    `capacity` tokens accrue at `rate_per_sec`. Each call to `take()`
    consumes one token; if the bucket is empty, returns False.

    State is process-local. An app restart resets all buckets, which
    is acceptable for the threat model (slows attackers; doesn't
    permanently lock anyone out).
    """

    def __init__(self, capacity: int, refill_per_minute: float) -> None:
        self.capacity = float(capacity)
        self.rate_per_sec = refill_per_minute / 60.0
        self.buckets: dict[str, _Bucket] = {}

    def take(self, key: str) -> bool:
        now = time.monotonic()
        b = self.buckets.get(key)
        if b is None:
            b = _Bucket(tokens=self.capacity, last_ts=now)
            self.buckets[key] = b
        elapsed = now - b.last_ts
        b.tokens = min(self.capacity, b.tokens + elapsed * self.rate_per_sec)
        b.last_ts = now
        if b.tokens < 1.0:
            return False
        b.tokens -= 1.0
        return True


def client_ip(request: Request) -> str:
    """Best-effort client IP extraction.

    Caddy puts the real client in `X-Forwarded-For`; uvicorn behind a
    127.0.0.1-only proxy will see `request.client.host == "127.0.0.1"`
    for every request, so trusting X-F-F is necessary for any per-client
    behaviour at all.
    """
    xff = request.headers.get("x-forwarded-for")
    if xff:
        return xff.split(",")[0].strip()
    return request.client.host if request.client else "unknown"
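
The extraction rule in `client_ip` can be exercised without FastAPI. The sketch below mirrors the same logic on a plain mapping; `client_ip_from` and its parameters are hypothetical names, and the header key is assumed to be lower-case already (a Starlette `Request.headers` lookup is case-insensitive, a plain dict is not).

```python
from typing import Mapping, Optional


def client_ip_from(headers: Mapping[str, str], peer_host: Optional[str]) -> str:
    """Pick the leftmost X-Forwarded-For entry, else the socket peer."""
    xff = headers.get("x-forwarded-for")
    if xff:
        # The header is a comma-separated chain; the first hop is leftmost.
        return xff.split(",")[0].strip()
    return peer_host if peer_host else "unknown"


# Proxied request: the proxy's own address is ignored in favour of the chain.
print(client_ip_from({"x-forwarded-for": "203.0.113.7, 10.0.0.1"}, "127.0.0.1"))
# Direct request with no header: fall back to the socket peer.
print(client_ip_from({}, "192.0.2.44"))
```

The leftmost entry is only trustworthy when every request reaches the app through a proxy that sets or sanitises the header; a client that can reach uvicorn directly could supply any value it likes.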