4e4e4f92a0
* fix(security): harden auth system and fix run journal logic bug
- Fix inverted condition in RunJournal.on_chat_model_start that prevented
first human message capture (not messages → messages)
- Pre-hash passwords with SHA-256 before bcrypt to avoid silent 72-byte
truncation vulnerability
- Move load_dotenv() from module scope into get_auth_config() to prevent
import-time os.environ mutation breaking test isolation
- Return generic ‘Invalid token’ instead of exposing specific error
variants (expired, malformed, invalid_signature) to clients
- Make @require_auth independently enforce 401 instead of silently
passing through when AuthMiddleware is absent
- Rate-limit /setup-status endpoint with per-IP cooldown to mitigate
initialization-state information leak
- Document in-process rate limiter limitation for multi-worker deployments
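
The 72-byte truncation issue behind the pre-hash bullet above can be illustrated with the standard library alone. This is a minimal sketch, assuming the same construction as `_pre_hash_v2` in password.py (base64 of the SHA-256 digest). The helper name `pre_hash` and the sample passwords are hypothetical, and bcrypt itself is not called here.

```python
import base64
import hashlib

BCRYPT_MAX_BYTES = 72  # bcrypt only uses the first 72 bytes of its input


def pre_hash(password: str) -> bytes:
    """Base64-encoded SHA-256 digest (hypothetical stand-in for _pre_hash_v2)."""
    return base64.b64encode(hashlib.sha256(password.encode("utf-8")).digest())


# Two long passwords that share their first 72 bytes and differ only afterwards.
a = "x" * 72 + "first"
b = "x" * 72 + "second"

# Without a pre-hash, the bytes bcrypt would actually use are identical.
same_when_truncated = a.encode("utf-8")[:BCRYPT_MAX_BYTES] == b.encode("utf-8")[:BCRYPT_MAX_BYTES]

# With the pre-hash, the two inputs differ and both fit within the limit.
differ_after_pre_hash = pre_hash(a) != pre_hash(b)
fits_in_limit = len(pre_hash(a)) <= BCRYPT_MAX_BYTES
```

A SHA-256 digest is 32 bytes, so its base64 encoding is always 44 bytes, well under the limit regardless of password length.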
* fix(security): return 429+Retry-After on setup-status rate limit, bound cooldown dict
Agent-Logs-Url: https://github.com/bytedance/deer-flow/sessions/070d0be8-99a5-46c8-85bb-6b81b5284021
Co-authored-by: WillemJiang <219644+WillemJiang@users.noreply.github.com>
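
The per-IP cooldown with a 429 + Retry-After response and a bounded cooldown dict could look roughly like the sketch below. It is not the project's implementation: the class name `CooldownLimiter`, the function `setup_status_response`, the 60-second cooldown, and the entry bound are all assumptions made for illustration.

```python
import math
import time
from collections import OrderedDict
from typing import Dict, Optional, Tuple


class CooldownLimiter:
    """Per-key cooldown with a bounded number of tracked keys (in-process only)."""

    def __init__(self, cooldown_seconds: float = 60.0, max_entries: int = 1000):
        self.cooldown_seconds = cooldown_seconds
        self.max_entries = max_entries
        self._last_seen = OrderedDict()  # key -> time of the last allowed call

    def check(self, key: str, now: Optional[float] = None) -> int:
        """Return 0 if the call is allowed, else whole seconds left to wait."""
        now = time.monotonic() if now is None else now
        last = self._last_seen.get(key)
        if last is not None and now - last < self.cooldown_seconds:
            return max(1, math.ceil(self.cooldown_seconds - (now - last)))
        # Record the allowed call, then evict the oldest keys beyond the bound.
        self._last_seen[key] = now
        self._last_seen.move_to_end(key)
        while len(self._last_seen) > self.max_entries:
            self._last_seen.popitem(last=False)
        return 0


def setup_status_response(
    limiter: CooldownLimiter, client_ip: str, now: Optional[float] = None
) -> Tuple[int, Dict[str, str]]:
    """Return (status_code, headers) for a hypothetical /setup-status handler."""
    retry_after = limiter.check(client_ip, now)
    if retry_after:
        return 429, {"Retry-After": str(retry_after)}
    return 200, {}
```

Because the state lives in one process, each worker in a multi-worker deployment keeps its own dict, which is the limitation the first commit documents.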
* fix(security): add versioned password hashes with auto-migration on login
The SHA-256 pre-hash change silently broke verification for any existing
bcrypt-only password hashes. Introduce a $dfv<N>$ prefix scheme so hashes
are self-describing:
- v2 (current): bcrypt(b64(sha256(password))) with $dfv2$ prefix
- v1 (legacy): plain bcrypt, prefixed $dfv1$ or bare (no prefix)
verify_password auto-detects the version and falls back to v1 for older
hashes. LocalAuthProvider.authenticate() now rehashes legacy hashes to v2
on successful login via needs_rehash(), so existing users upgrade
transparently without a dedicated migration step.
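
The version auto-detection described above can be sketched without bcrypt. The prefixes match the constants in password.py. The function names `split_versioned_hash` and `needs_upgrade` and the placeholder hash string are hypothetical and exist only for this example.

```python
from typing import Tuple

PREFIX_V1 = "$dfv1$"
PREFIX_V2 = "$dfv2$"

# Placeholder only: shaped like the start of a bcrypt hash, not a real one.
PLACEHOLDER = "$2b$12$placeholder"


def split_versioned_hash(stored: str) -> Tuple[int, str]:
    """Return (version, bcrypt_hash); values without a prefix count as v1."""
    if stored.startswith(PREFIX_V2):
        return 2, stored[len(PREFIX_V2):]
    if stored.startswith(PREFIX_V1):
        return 1, stored[len(PREFIX_V1):]
    return 1, stored


def needs_upgrade(stored: str) -> bool:
    """True for any stored value that is not already v2."""
    version, _ = split_versioned_hash(stored)
    return version < 2
```

A bare bcrypt hash begins with its own `$2b$`-style marker, so it can never be mistaken for a `$dfv<N>$` prefix.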
* fix(auth): harden verify_password, best-effort rehash, update require_auth docstring, downgrade journal logging
- password.py: wrap bcrypt.checkpw in try/except → return False for malformed/corrupt hashes instead of crashing
- local_provider.py: wrap auto-rehash update_user() in try/except so transient DB errors don't fail valid logins
- authz.py: update require_auth docstring to reflect independent 401 enforcement
- journal.py: downgrade on_chat_model_start from INFO to DEBUG, log only metadata (batch_count, message_counts) instead of full serialized/messages content
Agent-Logs-Url: https://github.com/bytedance/deer-flow/sessions/48c5cf31-a4ab-418a-982a-6343c37bb299
Co-authored-by: WillemJiang <219644+WillemJiang@users.noreply.github.com>
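
The best-effort rehash on login can be sketched as follows. The real `LocalAuthProvider.authenticate()` and `update_user()` signatures are not shown in this commit, so every parameter name here is hypothetical and the dependencies are passed in as plain callables.

```python
import logging
from typing import Callable

logger = logging.getLogger(__name__)


def authenticate(
    password: str,
    stored_hash: str,
    verify: Callable[[str, str], bool],
    needs_rehash: Callable[[str], bool],
    rehash: Callable[[str], str],
    save_hash: Callable[[str], None],
) -> bool:
    """Verify a login and upgrade a legacy hash without failing the login."""
    if not verify(password, stored_hash):
        return False
    if needs_rehash(stored_hash):
        try:
            save_hash(rehash(password))
        except Exception:
            # Best effort: a failed write must not reject a valid password.
            logger.warning("Password rehash failed; keeping legacy hash", exc_info=True)
    return True
```

If the write fails, the legacy hash stays in place and `needs_rehash` still returns True for it, so the upgrade is attempted again on the user's next successful login.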
* fix(auth): address code review - narrow ValueError catch, add rehash warning log, rename num_batches
Agent-Logs-Url: https://github.com/bytedance/deer-flow/sessions/48c5cf31-a4ab-418a-982a-6343c37bb299
Co-authored-by: WillemJiang <219644+WillemJiang@users.noreply.github.com>
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
82 lines
2.8 KiB
Python
"""Password hashing utilities with versioned hash format.

Hash format: ``$dfv<N>$<bcrypt_hash>`` where ``<N>`` is the version.

- **v1** (legacy): ``bcrypt(password)``, plain bcrypt, susceptible to
  72-byte silent truncation.
- **v2** (current): ``bcrypt(b64(sha256(password)))``, where the SHA-256
  pre-hash avoids the 72-byte truncation limit so the full password
  contributes to the hash.

Verification auto-detects the version and falls back to v1 for hashes
without a prefix, so existing deployments upgrade transparently on next
login.
"""

import asyncio
import base64
import hashlib

import bcrypt

_CURRENT_VERSION = 2
_PREFIX_V2 = "$dfv2$"
_PREFIX_V1 = "$dfv1$"


def _pre_hash_v2(password: str) -> bytes:
    """SHA-256 pre-hash to bypass bcrypt's 72-byte limit."""
    return base64.b64encode(hashlib.sha256(password.encode("utf-8")).digest())


def hash_password(password: str) -> str:
    """Hash a password (current version: v2, SHA-256 + bcrypt)."""
    raw = bcrypt.hashpw(_pre_hash_v2(password), bcrypt.gensalt()).decode("utf-8")
    return f"{_PREFIX_V2}{raw}"


def verify_password(plain_password: str, hashed_password: str) -> bool:
    """Verify a password, auto-detecting the hash version.

    Accepts v2 (``$dfv2$…``), v1 (``$dfv1$…``), and bare bcrypt hashes
    (treated as v1 for backward compatibility with pre-versioning data).
    """
    try:
        if hashed_password.startswith(_PREFIX_V2):
            bcrypt_hash = hashed_password[len(_PREFIX_V2) :]
            return bcrypt.checkpw(_pre_hash_v2(plain_password), bcrypt_hash.encode("utf-8"))

        if hashed_password.startswith(_PREFIX_V1):
            bcrypt_hash = hashed_password[len(_PREFIX_V1) :]
        else:
            bcrypt_hash = hashed_password

        return bcrypt.checkpw(plain_password.encode("utf-8"), bcrypt_hash.encode("utf-8"))
    except ValueError:
        # bcrypt raises ValueError for malformed or corrupt hashes (e.g., invalid salt).
        # Fail closed rather than crashing the request.
        return False


def needs_rehash(hashed_password: str) -> bool:
    """Return True if the hash uses an older version and should be rehashed."""
    return not hashed_password.startswith(_PREFIX_V2)


async def hash_password_async(password: str) -> str:
    """Hash a password using bcrypt (non-blocking).

    Wraps the blocking bcrypt operation in a thread pool to avoid
    blocking the event loop during password hashing.
    """
    return await asyncio.to_thread(hash_password, password)


async def verify_password_async(plain_password: str, hashed_password: str) -> bool:
    """Verify a password against its hash (non-blocking).

    Wraps the blocking bcrypt operation in a thread pool to avoid
    blocking the event loop during password verification.
    """
    return await asyncio.to_thread(verify_password, plain_password, hashed_password)
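
The async wrappers at the end of the file rely on `asyncio.to_thread` to move the blocking hash off the event loop. The pattern can be tried without bcrypt installed. In this sketch PBKDF2 from the standard library stands in for the blocking bcrypt call; the names `slow_hash` and `slow_hash_async`, the salt, and the iteration count are hypothetical.

```python
import asyncio
import hashlib


def slow_hash(password: str) -> str:
    """Stand-in for a blocking bcrypt call (PBKDF2, not the project's scheme)."""
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), b"demo-salt", 50_000)
    return digest.hex()


async def slow_hash_async(password: str) -> str:
    """Run the blocking hash in a worker thread, like hash_password_async."""
    return await asyncio.to_thread(slow_hash, password)


async def main() -> list:
    # The hashes run in worker threads, so the event loop stays free meanwhile.
    return await asyncio.gather(*(slow_hash_async(p) for p in ("alpha", "beta", "gamma")))


results = asyncio.run(main())
```

`asyncio.gather` returns the results in the order the awaitables were passed, so `results` lines up with the input passwords.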