fix(streaming): quota error detection, error persistence, stream_end session_id fix (#767)

- quota_exhausted error type: distinguishes credit exhaustion from rate limits
- Streaming errors persisted to session file so they survive page reload
- _error flag excludes persisted errors from subsequent LLM API calls
- stream_end and title SSE events use original session_id (not s.session_id which rotates during context compaction)

Fixes #739, #652, #653
This commit is contained in:
nesquena-hermes
2026-04-20 15:48:19 -07:00
committed by GitHub
parent 76e602af25
commit 765d8520d4
7 changed files with 278 additions and 70 deletions

View File

@@ -37,7 +37,8 @@ class TestSilentErrorDetection:
"""streaming.py must return after emitting apperror (not also emit done)."""
# The return statement must come after the put('apperror') for no_response
no_resp_pos = STREAMING_PY.find("'no_response'")
return_pos = STREAMING_PY.find("return # Don't emit done", no_resp_pos)
# Comment updated: "apperror already closes the stream on the client side"
return_pos = STREAMING_PY.find("return # apperror already closes the stream", no_resp_pos)
assert no_resp_pos != -1, "no_response type not found in streaming.py"
assert return_pos != -1, (
"streaming.py must return after emitting apperror to prevent also emitting done (#373)"