All major LLM provider SDKs were loaded eagerly at startup, penalizing
users regardless of which provider they actually use. This change defers
SDK loading until first API call for:
- @anthropic-ai/sdk (anthropic.ts)
- openai (openai-responses.ts, openai-completions.ts, azure-openai-responses.ts)
- @google/genai (google-vertex.ts)
The Bedrock provider already used this pattern. Now all 5 remaining
providers use the same async lazy-loader pattern:
- Static import changed to `import type` (erased at compile time)
- Module-level `let _SdkClass` cache variable
- `async function getSdkClass()` loader with singleton caching
- `createClient()` made async, uses `await getSdkClass()`
- Call sites updated with `await createClient()`
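In TypeScript, the combined pattern looks roughly like this (a sketch with illustrative names; each provider substitutes its own SDK class and a real `(await import("…")).default` inside the loader):

```typescript
// Sketch of the lazy-loader pattern; names are illustrative. A real provider
// keeps `import type Anthropic from "@anthropic-ai/sdk"` for typing only and
// performs the dynamic import inside getSdkClass().
type SdkClient = { apiKey: string };
type SdkCtor = new (opts: { apiKey: string }) => SdkClient;

let _SdkClass: SdkCtor | undefined; // module-level singleton cache
let loadCount = 0;                  // stands in for the cost of the dynamic import

async function getSdkClass(): Promise<SdkCtor> {
  if (!_SdkClass) {
    loadCount++; // the dynamic import happens here, once, on first API call
    _SdkClass = class {
      apiKey: string;
      constructor(opts: { apiKey: string }) {
        this.apiKey = opts.apiKey;
      }
    };
  }
  return _SdkClass;
}

async function createClient(apiKey: string): Promise<SdkClient> {
  const SdkClass = await getSdkClass(); // createClient becomes async
  return new SdkClass({ apiKey });
}
```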
For google-vertex.ts, ThinkingLevel enum usage replaced with equivalent
string literals to eliminate the runtime import entirely.
All packages build cleanly. The startup improvement is proportional to
how many provider SDKs are installed — on typical installs this eliminates
eager loading of 30-40MB of SDK code.
gpt-5.x models (via Copilot/OpenAI/Azure) don't support 'minimal' as a
reasoning effort level — they only accept 'none', 'low', 'medium',
'high', and 'xhigh'. Setting /thinking minimal with gpt-5.4 causes a
400 error.
The openai-codex-responses provider already had this clamping, but the
openai-responses and azure-openai-responses providers passed the value
through unclamped.
Add clampReasoningForModel() to both providers that maps 'minimal' to
'low' for gpt-5.x models, matching the existing behavior in
openai-codex-responses.
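A minimal sketch of that clamp (the effort union and the gpt-5 prefix test are assumptions about the surrounding types, not the exact implementation):

```typescript
// Sketch of clampReasoningForModel; the model-matching rule is an assumption.
type ReasoningEffort = "none" | "minimal" | "low" | "medium" | "high" | "xhigh";

function clampReasoningForModel(
  modelId: string,
  effort: ReasoningEffort,
): ReasoningEffort {
  // gpt-5.x models reject 'minimal', so map it to the nearest supported level.
  if (effort === "minimal" && /^gpt-5/.test(modelId)) {
    return "low";
  }
  return effort;
}
```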
Fixes the bug portion of #688
- Add @smithy/node-http-handler to pi-ai
- Add @types/proper-lockfile, @types/hosted-git-info, @types/sql.js to pi-coding-agent
- These were causing typecheck:extensions to fail due to missing type declarations
- Restore exhaustive never check in mapStopReason (throw on unhandled FinishReason)
- Add 12 unit tests for sanitizeSchemaForGoogle covering patternProperties removal,
const→enum conversion at various depths, arrays, deeply nested objects, pass-through
- Simplify redundant recursion branches into single typeof object catch-all
- Fix misleading comment ("only in anyOf/oneOf") — conversion happens everywhere
- Drop unnecessary (p: Part) annotation; TypeScript infers it from @google/genai types
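The restored exhaustive check follows the standard never-narrowing idiom; a sketch with illustrative FinishReason variants (the real enum comes from @google/genai):

```typescript
// Sketch of the exhaustive-never pattern; these FinishReason variants are
// illustrative stand-ins for the @google/genai enum.
type FinishReason = "STOP" | "MAX_TOKENS" | "TOOL_CALLS";

function mapStopReason(reason: FinishReason): string {
  switch (reason) {
    case "STOP": return "stop";
    case "MAX_TOKENS": return "length";
    case "TOOL_CALLS": return "toolUse";
    default: {
      // If a new FinishReason is added upstream, this assignment stops
      // compiling, and unhandled values throw instead of passing silently.
      const exhaustive: never = reason;
      throw new Error(`Unhandled FinishReason: ${exhaustive}`);
    }
  }
}
```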
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
* docs: add startup performance analysis and optimization plan
Profiled GSD CLI startup, finding 2.2s for --version and ~3.8s for
interactive mode. Identified 5 root causes with measured timings and
created a phased optimization plan targeting <0.2s for --version
and ~0.8s for interactive startup.
* perf: speed up GSD startup with lazy loading and fast paths
- Fast-path --version/-v and --help/-h in loader.ts before importing
any heavy dependencies (2.2s → 0.15s, 14x faster)
- Lazy-load undici (~200ms) only when HTTP_PROXY env vars are set
- Skip initResources cpSync when managed-resources.json version
matches current GSD version (~128ms saved per launch)
- Lazy-load Mistral SDK (~369ms) on first API call instead of startup
- Lazy-load Google GenAI SDK (~186ms) on first API call instead of
startup
- Parallelize extension loading with Promise.all() instead of
sequential for-loop
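The --version/--help fast path amounts to checking argv before any heavy import; a sketch (VERSION is inlined here for illustration; the real loader would read it from package.json):

```typescript
// Sketch of the loader fast path: handle --version/--help before any heavy
// import so trivial invocations skip SDK and TUI loading entirely.
const VERSION = "0.0.0"; // illustrative; really read from package.json

function fastPath(argv: string[]): string | undefined {
  if (argv.includes("--version") || argv.includes("-v")) return VERSION;
  if (argv.includes("--help") || argv.includes("-h")) return "usage: gsd [options]";
  return undefined; // fall through to the full (heavy) startup path
}
```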
---------
Co-authored-by: TÂCHES <afromanguy@me.com>
Add Ollama Cloud (ollama.com) as a built-in provider with both model
hosting and web search/fetch capabilities.
Model provider:
- 13 curated models via OpenAI-compatible API (Llama 3.1, Qwen 3,
DeepSeek R1, Gemma 3, Mistral, Phi-4, GPT-OSS)
- Auth via OLLAMA_API_KEY environment variable
- Registered in onboarding, env hydration, and model resolver
Web tool provider:
- Search via POST ollama.com/api/web_search
- Page fetch via POST ollama.com/api/web_fetch (fallback after Jina)
- Added as third search provider option alongside Tavily and Brave
- /search-provider command updated with ollama option
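The search call can be sketched like this; the request body and response shape are assumptions inferred from the endpoint named above, not verified API documentation:

```typescript
// Sketch of the web search call; bearer auth via OLLAMA_API_KEY and a
// { query } JSON body are assumptions about the endpoint's contract.
async function ollamaWebSearch(query: string, apiKey: string): Promise<unknown> {
  const res = await fetch("https://ollama.com/api/web_search", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ query }),
  });
  if (!res.ok) throw new Error(`web_search failed: ${res.status}`);
  return res.json();
}
```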
Closes #430
Adds a "Custom (OpenAI-compatible)" provider option to the API key
flow in the onboarding wizard. When selected, prompts for base URL,
API key, and model ID, then writes the config to models.json.
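The exact schema of models.json is not shown in the change; a hypothetical entry (all field names here are assumptions) might look like:

```json
{
  "custom-provider": {
    "baseUrl": "https://llm.example.com/v1",
    "apiKey": "sk-...",
    "model": "my-model-id"
  }
}
```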
* fix(auto): prevent hang when dispatch chain breaks after slice tasks complete (#381)
After the last task in a slice completes, dispatchNextUnit() can throw
(e.g. template mismatch, branch error, or any unprotected operation).
The error propagates to the pi event emitter which silently swallows
async rejections, leaving auto-mode active but permanently stalled —
no dispatch, no stop, no recovery.
Three defensive layers added:
1. Try-catch around dispatchNextUnit in handleAgentEnd — catches errors,
shows them to the user, and schedules a retry via the gap watchdog.
2. Dispatch gap watchdog (30s timer) — fires when auto-mode is active
but no unit was dispatched after a unit completion. Re-derives state
and retries. If retry fails, stops auto-mode with diagnostics.
3. Error boundary in the agent_end event handler — last-resort catch
that pauses auto-mode if handleAgentEnd itself throws.
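Layers 1 and 2 can be sketched as follows (dispatchNextUnit and the 30s gap come from the change; the wiring around them is illustrative):

```typescript
// Sketch of defensive layers 1 and 2; function names come from the change,
// the surrounding wiring is illustrative.
const DISPATCH_GAP_MS = 30_000;
let gapTimer: ReturnType<typeof setTimeout> | undefined;

async function handleAgentEnd(
  dispatchNextUnit: () => Promise<void>,
  showError: (e: unknown) => void,
) {
  try {
    await dispatchNextUnit();           // layer 1: never let this throw upward
    if (gapTimer) clearTimeout(gapTimer);
  } catch (err) {
    showError(err);                     // surface instead of silently swallowing
    // layer 2: the gap watchdog retries after a delay instead of stalling forever
    gapTimer = setTimeout(() => {
      dispatchNextUnit().catch(showError);
    }, DISPATCH_GAP_MS);
  }
}
```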
Closes #381
* fix: improve Cloud Code Assist 404 error with actionable model guidance (#368)
When a model like gemini-2.0-flash isn't available via Cloud Code Assist,
the 404 error now names the model and suggests using the google provider
with GOOGLE_API_KEY or switching to a supported model.
Model variants like `claude-opus-4-6[1m]` use bracket suffixes to
differentiate context window configurations internally, but the
Anthropic API only accepts base model IDs (e.g. `claude-opus-4-6`).
Sending the full variant ID via OAuth (Claude Max/Pro) causes a 404:
{"type":"not_found_error","message":"model: claude-opus-4-6[1m]"}
Strip any `[...]` suffix from model.id for OAuth requests only.
API key auth is left unchanged since the behavior there is unverified.
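The strip itself can be a one-line regex (sketch):

```typescript
// Sketch: strip a trailing [...] variant suffix to recover the base model ID.
function stripVariantSuffix(modelId: string): string {
  return modelId.replace(/\[[^\]]*\]$/, "");
}
```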
Anthropic's 429 responses include retry-after and x-ratelimit-reset-*
headers that tell us exactly when to retry. Previously we ignored these
and used exponential backoff (2s, 4s, 8s), which is both wrong and
misleading in the UI countdown.
- Add retryAfterMs to AssistantMessage as the structured carrier
- Extract retry-after / x-ratelimit-reset-requests / x-ratelimit-reset-tokens
from Anthropic SDK APIError.headers in the provider catch block
- Session uses retryAfterMs when present (capped by maxDelayMs=60s),
falls back to exponential backoff for errors with no timing hint
The UI countdown now shows the actual Anthropic reset time. No UI changes needed.
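The extraction can be sketched like this; treating retry-after as whole seconds and the x-ratelimit-reset-* values as RFC 3339 timestamps are assumptions about the header formats:

```typescript
// Sketch of header extraction; retry-after is assumed to be in seconds, and
// the x-ratelimit-reset-* values are assumed to be RFC 3339 timestamps.
function extractRetryAfterMs(headers: Record<string, string>): number | undefined {
  const retryAfter = headers["retry-after"];
  if (retryAfter !== undefined) {
    const seconds = Number(retryAfter);
    if (Number.isFinite(seconds)) return seconds * 1000;
  }
  for (const name of ["x-ratelimit-reset-requests", "x-ratelimit-reset-tokens"]) {
    const value = headers[name];
    if (value) {
      const ms = Date.parse(value) - Date.now();
      if (Number.isFinite(ms) && ms > 0) return ms;
    }
  }
  return undefined; // no timing hint: caller falls back to exponential backoff
}
```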
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add the 1M context variant of Claude Opus 4.6 to the model registry
and fix model resolver to try exact match before glob detection, so
model IDs containing bracket characters (like [1m]) are not
misinterpreted as glob patterns.
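The exact-before-glob order can be sketched like this (the glob matcher is a minimal '*'-only stand-in, not the real one):

```typescript
// Sketch: try an exact lookup before glob detection, so bracket characters
// in IDs like "claude-opus-4-6[1m]" are never treated as glob syntax.
function resolveModel(id: string, registry: Set<string>): string | undefined {
  if (registry.has(id)) return id;            // exact match wins
  if (/[*?[\]]/.test(id)) {
    // only now fall back to glob matching
    for (const known of registry) {
      if (globMatch(id, known)) return known;
    }
  }
  return undefined;
}

// Minimal '*'-only glob matcher, purely for illustration.
function globMatch(pattern: string, value: string): boolean {
  const re = new RegExp(
    "^" +
      pattern
        .split("*")
        .map((s) => s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"))
        .join(".*") +
      "$",
  );
  return re.test(value);
}
```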
* feat: add native Rust streaming JSON parser for LLM tool call argument parsing
Replaces the JS partial-json library with a Rust implementation exposed via napi-rs.
The parser handles incomplete JSON from streaming deltas by closing unclosed
strings, objects, and arrays, removing trailing commas, and completing
truncated literals.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: handle truncated numbers and remove dead partial-json dependency
Adds truncated number recovery (e.g. `{"key": 12`, `{"key": 3.`, `{"key": 1e`)
to the Rust streaming JSON parser, and removes the now-unused `partial-json`
npm dependency from pi-ai.
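To illustrate the recovery rules (the real implementation is Rust behind napi-rs; this TypeScript is only a behavioral sketch covering the number cases above plus brace closing):

```typescript
// Behavioral sketch of truncated-number recovery; the real parser is Rust.
function completeTruncatedNumber(json: string): string {
  let fixed = json;
  // "3." or "1e" / "1e+" end mid-number: pad with a zero to make it valid.
  if (/[\d](\.|[eE][+-]?)$/.test(fixed)) fixed += "0";
  // A bare trailing "-" is not a number yet; make it one.
  if (/-$/.test(fixed)) fixed += "0";
  // Close any unclosed objects so the examples below parse.
  const opens = (fixed.match(/{/g) ?? []).length;
  const closes = (fixed.match(/}/g) ?? []).length;
  fixed += "}".repeat(Math.max(0, opens - closes));
  return fixed;
}
```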
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Vendor all 4 Pi packages (tui, ai, agent-core, coding-agent) from
pi-mono v0.57.1 as @gsd/* workspace packages under packages/. This
replaces the compiled npm dependency (@mariozechner/pi-coding-agent)
and patch-package workflow, giving direct source access for
modifications.
- Copy Pi source from pi-mono v0.57.1 into packages/
- Create workspace package.json + tsconfig.json for each package
- Rename ~240 imports from @mariozechner/pi-* to @gsd/pi-*
- Apply existing patches as source edits (setModel persist, VT input)
- Remove @mariozechner/pi-coding-agent dep and patch-package
- Update build pipeline to build packages in dependency order
- Add pi-upstream git remote for future selective syncing
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>