add live stress harness, app-level admin login rate limit

tests/stress/live_accuracy.mjs: classroom-scale accuracy + latency test that targets the deployed server (single-session, sid=main). Logs in as admin via /admin/login, resets the session, joins N students serially over HTTP, opens N student WebSockets in batches of 8 (250ms apart) plus the instructor WS, then drives every question through the admin "next" command. Each student picks uniformly random A-D, sends the submit, waits for the submit_ack, and records the round-trip latency. After session_ended, the script verifies that every student whose pick == correct got score > 0, every other submission got score == 0, and reports p50/p95/p99 ack latency. First live run: 50 students, 100 submits, 100% acks, 100% accuracy match, p99 555ms (≈intercontinental RTT to HK).

tests/stress/live_loop.sh: tmux-friendly loop that runs the live test every 60s and appends a JSONL summary line per cycle to runs/live_summary.jsonl. Mirrors the morning's api_stress run_loop shape so per-cycle aggregates are easy to scrape.

app/rate_limit.py: tiny in-memory token bucket. Capacity + refill in tokens/minute, keyed by client IP via X-Forwarded-For (with a fallback to request.client.host). Process-local state; admin login is the only user.

POST /admin/login: rate-limited at 10 attempts/minute/IP. Generous for the legit instructor (who succeeds in 1-2 tries) and prohibitive for brute force from a single attacker IP. Student endpoints deliberately NOT rate-limited because campus students share NAT gateways and IP-level limits would false-positive a whole class. The bucket is per-app-instance (instantiated inside the router factory), so test apps each get a fresh one and tests don't poison each other.
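
For reference, the post-run check that live_accuracy.mjs is described as performing can be sketched in Python. The harness itself is JavaScript and is not shown here, so the record fields (`pick`, `correct`, `score`, `ack_ms`), the helper names, the nearest-rank percentile method, and the sample values are all assumptions made up for illustration.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of
    all samples at or below it. (Assumed method; the harness may differ.)"""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def accuracy_mismatches(submissions):
    """Return the submissions that violate the rule from the commit message:
    pick == correct must score > 0, anything else must score exactly 0."""
    bad = []
    for s in submissions:
        if s["pick"] == s["correct"]:
            ok = s["score"] > 0
        else:
            ok = s["score"] == 0
        if not ok:
            bad.append(s)
    return bad


# Made-up records, one per acknowledged submit (hypothetical field names).
submissions = [
    {"pick": "A", "correct": "A", "score": 100, "ack_ms": 120.0},
    {"pick": "C", "correct": "A", "score": 0, "ack_ms": 180.0},
    {"pick": "B", "correct": "B", "score": 80, "ack_ms": 240.0},
    {"pick": "D", "correct": "B", "score": 0, "ack_ms": 300.0},
]

latencies = [s["ack_ms"] for s in submissions]
summary = {
    "mismatches": len(accuracy_mismatches(submissions)),
    "p50": percentile(latencies, 50),
    "p95": percentile(latencies, 95),
    "p99": percentile(latencies, 99),
}
print(summary)
```

With only four samples the p95 and p99 both collapse onto the slowest ack, which is why a run of 100 submits is closer to the minimum needed for a meaningful p99.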
2026-05-03 00:23:07 +08:00
parent 7a483ad3ee
commit 2136286275
5 changed files with 483 additions and 1 deletions
--- a/app/rate_limit.py
+++ b/app/rate_limit.py
@@ -0,0 +1,72 @@
+"""Tiny in-memory token-bucket rate limiter.
+
+Used for `/admin/login` only. The student endpoints intentionally have
+no IP-based throttling because a campus deployment puts ~40 students
+behind one or a few NAT IPs; rate-limiting at the IP level would
+false-positive the entire class.
+
+For the admin login endpoint, IP-based limiting is appropriate: the
+instructor logs in from a single device, and brute-force attempts
+generally come from a few attacker IPs. Per-IP token bucket of
+10 attempts / minute is generous for the legitimate user, hostile
+to a guesser.
+"""
+
+from __future__ import annotations
+
+import time
+from dataclasses import dataclass
+from typing import Optional
+
+from fastapi import Request
+
+
+@dataclass(slots=True)
+class _Bucket:
+    tokens: float
+    last_ts: float
+
+
+class TokenBucket:
+    """Per-key (e.g., per-IP) token bucket.
+
+    Buckets start full and refill at `rate_per_sec`, capped at `capacity`.
+    Each `take()` consumes one token, or returns False if none is left.
+
+    State is process-local. An app restart resets all buckets, which
+    is acceptable for the threat model (slows attackers; doesn't
+    permanently lock anyone out).
+    """
+
+    def __init__(self, capacity: int, refill_per_minute: float) -> None:
+        self.capacity = float(capacity)
+        self.rate_per_sec = refill_per_minute / 60.0
+        self.buckets: dict[str, _Bucket] = {}
+
+    def take(self, key: str) -> bool:
+        now = time.monotonic()
+        b = self.buckets.get(key)
+        if b is None:
+            b = _Bucket(tokens=self.capacity, last_ts=now)
+            self.buckets[key] = b
+        elapsed = now - b.last_ts
+        b.tokens = min(self.capacity, b.tokens + elapsed * self.rate_per_sec)
+        b.last_ts = now
+        if b.tokens < 1.0:
+            return False
+        b.tokens -= 1.0
+        return True
+
+
+def client_ip(request: Request) -> str:
+    """Best-effort client IP extraction.
+
+    Caddy puts the real client in `X-Forwarded-For`; uvicorn behind a
+    127.0.0.1-only proxy will see `request.client.host == "127.0.0.1"`
+    for every request, so trusting `X-Forwarded-For` is necessary for any
+    per-client behaviour at all.
+    """
+    xff = request.headers.get("x-forwarded-for")
+    if xff:
+        return xff.split(",")[0].strip()
+    return request.client.host if request.client else "unknown"
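
The timing behaviour of the 10 attempts/minute/IP setting can be checked without waiting on a real clock. The sketch below is a condensed restatement of the token bucket above, not the committed class: the injectable `clock` parameter and the `FakeClock` helper are assumptions added only so the example is deterministic, and the IP addresses are documentation-range placeholders.

```python
import time


class TokenBucket:
    """Condensed per-key token bucket with an injectable clock (assumption)."""

    def __init__(self, capacity, refill_per_minute, clock=time.monotonic):
        self.capacity = float(capacity)
        self.rate_per_sec = refill_per_minute / 60.0
        self.clock = clock
        self.buckets = {}  # key -> (tokens, last_ts)

    def take(self, key):
        now = self.clock()
        # A key seen for the first time starts with a full bucket.
        tokens, last_ts = self.buckets.get(key, (self.capacity, now))
        # Refill for the elapsed time, never beyond capacity.
        tokens = min(self.capacity, tokens + (now - last_ts) * self.rate_per_sec)
        allowed = tokens >= 1.0
        if allowed:
            tokens -= 1.0
        self.buckets[key] = (tokens, now)
        return allowed


class FakeClock:
    """Manually advanced clock so the demo does not sleep."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now

    def advance(self, seconds):
        self.now += seconds


clock = FakeClock()
limiter = TokenBucket(capacity=10, refill_per_minute=10, clock=clock)

# Eleven back-to-back attempts from one IP: the first ten drain the bucket.
burst = [limiter.take("203.0.113.7") for _ in range(11)]

# 10 tokens/minute is one token every 6 s; wait slightly longer than that.
clock.advance(7.0)
after_wait = limiter.take("203.0.113.7")
immediately_again = limiter.take("203.0.113.7")

# A different IP has its own, still full, bucket.
other_ip = limiter.take("198.51.100.9")

print(burst, after_wait, immediately_again, other_ip)
```

Under these assumptions a guesser from a single IP gets a burst of ten attempts and then roughly one more every six seconds, while an instructor who logs in within a couple of tries never notices the limiter.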