Download Witsy from the releases page.
On macOS you can also `brew install --cask witsy`.
Witsy is a BYOK (Bring Your Own Keys) AI application: this means you need API keys for the LLM providers you want to use. Alternatively, you can use Ollama to run models locally on your machine for free and use them in Witsy.
It is among the first (and still very few) universal MCP clients:
Witsy allows you to run MCP servers with virtually any LLM!
| Capability | Providers |
|---|---|
| Chat | OpenAI, Anthropic, Google (Gemini), xAI (Grok), Meta (Llama), Ollama, LM Studio, MistralAI, DeepSeek, OpenRouter, Groq, Cerebras, Azure OpenAI, and any provider that supports the OpenAI API standard (together.ai, for instance) |
| Image Creation | OpenAI, Google, xAI, Replicate, fal.ai, HuggingFace, Stable Diffusion WebUI |
| Video Creation | OpenAI, Google, Replicate, fal.ai |
| Text-to-Speech | OpenAI, ElevenLabs, Groq, fal.ai |
| Speech-to-Text | OpenAI (Whisper), fal.ai, Fireworks.ai, Gladia, Groq, nVidia, Speechmatics, Local Whisper, Soniox (realtime and async), and any provider that supports the OpenAI API standard |
| Search Engines | Perplexity, Tavily, Brave, Exa, Local Google Search |
| MCP Repositories | Smithery.ai |
| Embeddings | OpenAI, Ollama |
Non-exhaustive feature list:
- Chat completion with vision model support (describe an image)
- Text-to-image and text-to-video
- Image-to-image (image editing) and image-to-video
- LLM plugins to augment the model: execute Python code, search the Internet...
- Anthropic MCP server support
- Scratchpad to interactively create the best content with any model!
- Prompt Anywhere lets you generate content directly in any application
- AI commands runnable on highlighted text in almost any application
- Expert prompts to specialize your bot on a specific topic
- Long-term memory plugin to increase relevance of LLM answers
- Read aloud of assistant messages
- Read aloud of any text in other applications
- Chat with your local files and documents (RAG)
- Transcription/Dictation (Speech-to-Text)
- Realtime Chat aka Voice Mode
- Anthropic Computer Use support
- Local history of conversations (with automatic titles)
- Formatting and copy to clipboard of generated code
- Conversation PDF export
- Image copy and download
You can download a binary from the releases page or build it yourself:

```sh
npm ci
npm start
```
To use OpenAI, Anthropic, Google, or Mistral AI models, you need to enter your API key.
To use Ollama models, you need to install Ollama and download some models.
To use speech features (text-to-speech and speech-to-text), you need an API key from one of:
- OpenAI
- fal.ai
- Fireworks.ai
- Groq
- Speechmatics
- Gladia
To use Internet search, you need a Tavily API key.
Generate content in any application:
- From any editable content in any application
- Hit the Prompt anywhere shortcut (Shift+Control+Space / ^⇧Space)
- Enter your prompt in the window that pops up
- Watch Witsy enter the text directly in your application!
On Mac, you can define an expert that will automatically be triggered depending on the foreground application. For instance, if you have an expert that generates Linux commands, you can have it automatically selected when you trigger Prompt Anywhere from the Terminal application!
AI commands are quick helpers, accessible from a shortcut, that leverage LLMs to boost your productivity:
- Select any text in any application
- Hit the AI command shortcut (Alt+Control+Space / ⌃⌥Space)
- Select one of the commands and let the LLM do its magic!
You can also create custom commands with the prompt of your liking!
Commands inspired by https://the.fibery.io/@public/Public_Roadmap/Roadmap_Item/AI-Assistant-via-ChatGPT-API-170.
From https://github.com/f/awesome-chatgpt-prompts.
https://www.youtube.com/watch?v=czcSbG2H-wg
You can connect each chat with a document repository: Witsy will first search for relevant documents in your local files and provide this info to the LLM. To do so:
- Click on the database icon on the left of the prompt
- Click Manage and then create a document repository
- OpenAI embeddings require an API key; Ollama requires an embedding model
- Add documents by clicking the + button on the right hand side of the window
- Once your document repository is created, click on the database icon once more and select the document repository you want to use. The icon should turn blue
You can transcribe audio recorded from the microphone to text. Transcription can be done using a variety of state-of-the-art speech-to-text models (which require an API key) or using a local Whisper model (which requires downloading large files).
Currently Witsy supports the following speech-to-text models:
- GPT4o-Transcribe
- Gladia
- Speechmatics (Standard + Enhanced)
- Groq Whisper V3
- Fireworks.ai Realtime Transcription
- fal.ai Wizper V3
- fal.ai ElevenLabs
- nVidia (Microsoft Phi-4 Multimodal)
Witsy supports quick shortcuts, so your transcript is always only one button press away.
Once the text is transcribed you can:
- Copy it to your clipboard
- Summarize it
- Translate it to any language
- Insert it in the application that was running before you activated the dictation
https://www.youtube.com/watch?v=vixl7I07hBk
Witsy provides a local HTTP API that allows external applications to trigger various commands and features. The API server runs on localhost by default on port 8090 (or the next available port if 8090 is in use).
Security Note: The HTTP server runs on localhost only by default. If you need external access, consider using a reverse proxy with proper authentication.
The current HTTP server port is displayed in the tray menu below the Settings option:
- macOS/Linux: Check the fountain pen icon in the menu bar
- Windows: Check the fountain pen icon in the system tray
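If you script against the API, you can also discover the port by probing the health endpoint. Here is a minimal TypeScript sketch (Node 18+), assuming the server falls back to ports just above 8090 (an assumption; check the tray menu if the probe fails):

```ts
// Minimal sketch: locate the Witsy API server by probing /api/health.
// Assumes the server starts at 8090 and, if that port is busy, moved to a
// nearby port (an assumption; the tray menu shows the actual port).
async function findWitsyPort(start = 8090, tries = 10): Promise<number> {
  for (let port = start; port < start + tries; port++) {
    try {
      const res = await fetch(`http://localhost:${port}/api/health`);
      if (res.ok) return port; // server answered the health check
    } catch {
      // nothing listening here, try the next port
    }
  }
  throw new Error("Witsy API server not found");
}

findWitsyPort().then((port) => console.log(`Witsy API on port ${port}`));
```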
All endpoints support both GET (with query parameters) and POST (with JSON or form-encoded body) requests.
| Endpoint | Description | Optional Parameters |
|---|---|---|
| `GET /api/health` | Server health check | - |
| `GET/POST /api/chat` | Open main window in chat view | `text` - Pre-fill chat input |
| `GET/POST /api/scratchpad` | Open scratchpad | - |
| `GET/POST /api/settings` | Open settings window | - |
| `GET/POST /api/studio` | Open design studio | - |
| `GET/POST /api/forge` | Open agent forge | - |
| `GET/POST /api/realtime` | Open realtime chat (voice mode) | - |
| `GET/POST /api/prompt` | Trigger Prompt Anywhere | `text` - Pre-fill prompt |
| `GET/POST /api/command` | Trigger AI command picker | `text` - Pre-fill command text |
| `GET/POST /api/transcribe` | Start transcription/dictation | - |
| `GET/POST /api/readaloud` | Start read aloud | - |
| `GET /api/engines` | List available AI engines | Returns configured chat engines |
| `GET /api/models/:engine` | List models for an engine | Returns available models for specified engine |
| `POST /api/complete` | Run chat completion | `stream` (default: true), `engine`, `model`, `thread` (Message[]) |
| `GET/POST /api/agent/run/:token` | Trigger agent execution via webhook | Query params passed as prompt inputs |
| `GET /api/agent/status/:token/:runId` | Check agent execution status | Returns status, output, and error |
```sh
# Health check
curl http://localhost:8090/api/health

# Open chat with pre-filled text (GET with query parameter)
curl "http://localhost:8090/api/chat?text=Hello%20World"

# Open chat with pre-filled text (POST with JSON)
curl -X POST http://localhost:8090/api/chat \
  -H "Content-Type: application/json" \
  -d '{"text":"Hello World"}'

# Trigger Prompt Anywhere with text
curl "http://localhost:8090/api/prompt?text=Write%20a%20poem"

# Trigger AI command on selected text
curl -X POST http://localhost:8090/api/command \
  -H "Content-Type: application/json" \
  -d '{"text":"selected text to process"}'

# Trigger agent via webhook with parameters
curl "http://localhost:8090/api/agent/run/abc12345?input1=value1&input2=value2"

# Trigger agent with POST JSON
curl -X POST http://localhost:8090/api/agent/run/abc12345 \
  -H "Content-Type: application/json" \
  -d '{"input1":"value1","input2":"value2"}'

# Check agent execution status
curl "http://localhost:8090/api/agent/status/abc12345/run-uuid-here"

# List available engines
curl http://localhost:8090/api/engines

# List models for a specific engine
curl http://localhost:8090/api/models/openai

# Run non-streaming chat completion
curl -X POST http://localhost:8090/api/complete \
  -H "Content-Type: application/json" \
  -d '{
    "stream": "false",
    "engine": "openai",
    "model": "gpt-4",
    "thread": [
      {"role": "user", "content": "Hello, how are you?"}
    ]
  }'

# Run streaming chat completion (SSE)
curl -X POST http://localhost:8090/api/complete \
  -H "Content-Type: application/json" \
  -d '{
    "stream": "true",
    "thread": [
      {"role": "user", "content": "Write a short poem"}
    ]
  }'
```

Witsy includes a command-line interface that allows you to interact with the AI assistant directly from your terminal.
Installation
The CLI is automatically installed when you launch Witsy for the first time:
- macOS: Creates a symlink at `/usr/local/bin/witsy` (requires admin password)
- Windows: Adds the CLI to your user PATH (restart terminal for changes to take effect)
- Linux: Creates a symlink at `/usr/local/bin/witsy` (uses pkexec if needed)
Usage
Once installed, you can use the `witsy` command from any terminal:

```sh
witsy
```

The CLI will connect to your running Witsy application and provide an interactive chat interface. It uses the same configuration (engine, model, API keys) as your desktop application.
Available Commands
- `/help` - Show available commands
- `/model` - Select engine and model
- `/port` - Change server port (default: 4321)
- `/clear` - Clear conversation history
- `/history` - Show conversation history
- `/exit` - Exit the CLI
Requirements
- Witsy desktop application must be running
- HTTP API server enabled (default port: 4321)
The /api/complete endpoint provides programmatic access to Witsy's chat completion functionality, enabling command-line tools and scripts to interact with any configured LLM.
Endpoint: `POST /api/complete`
Request body:
```json
{
  "stream": "true",   // Optional: "true" (default) for SSE streaming, "false" for JSON response
  "engine": "openai", // Optional: defaults to configured engine in settings
  "model": "gpt-4",   // Optional: defaults to configured model for the engine
  "thread": [         // Required: array of messages
    {"role": "user", "content": "Your prompt here"}
  ]
}
```

Response (non-streaming, `stream: "false"`):
```json
{
  "success": true,
  "response": {
    "content": "The assistant's response text",
    "usage": {
      "promptTokens": 10,
      "completionTokens": 20,
      "totalTokens": 30
    }
  }
}
```
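The same call can be scripted without curl; here is a minimal TypeScript sketch (Node 18+ for built-in fetch), assuming the server listens on port 8090:

```ts
// Minimal sketch: non-streaming completion against the local Witsy API.
// Assumes the server is on port 8090 (check the tray menu for the real port).
async function complete(prompt: string): Promise<string> {
  const res = await fetch("http://localhost:8090/api/complete", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      stream: "false", // ask for a single JSON response instead of SSE
      thread: [{ role: "user", content: prompt }],
    }),
  });
  const data = await res.json();
  if (!data.success) throw new Error("completion failed");
  return data.response.content; // token counts are in data.response.usage
}

complete("Hello, how are you?").then(console.log).catch(console.error);
```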
Response (streaming, `stream: "true"`):
Server-Sent Events (SSE) format with chunks:
data: {"type":"content","text":"Hello","done":false}
data: {"type":"content","text":" world","done":false}
data: [DONE]
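Consuming the stream amounts to reading `data:` lines until the `[DONE]` marker; here is a minimal TypeScript sketch (Node 18+), assuming port 8090:

```ts
// Minimal sketch: stream a completion from the local Witsy API and print
// content chunks as they arrive. Assumes the server is on port 8090 and a
// Node 18+ runtime (built-in fetch, web streams, TextDecoderStream).
async function streamCompletion(prompt: string): Promise<void> {
  const res = await fetch("http://localhost:8090/api/complete", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      stream: "true",
      thread: [{ role: "user", content: prompt }],
    }),
  });
  if (!res.ok || !res.body) throw new Error(`HTTP ${res.status}`);

  const reader = res.body.pipeThrough(new TextDecoderStream()).getReader();
  let buffer = "";
  for (;;) {
    const { value, done } = await reader.read();
    if (done) break;
    buffer += value ?? "";
    const lines = buffer.split("\n");
    buffer = lines.pop() ?? ""; // keep a partial line for the next read
    for (const line of lines) {
      if (!line.startsWith("data: ")) continue; // skip blank separator lines
      const payload = line.slice(6).trim();
      if (payload === "[DONE]") return; // end-of-stream marker
      const chunk = JSON.parse(payload);
      if (chunk.type === "content") process.stdout.write(chunk.text);
    }
  }
}

streamCompletion("Write a short poem").catch(console.error);
```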
List Engines:
```sh
curl http://localhost:8090/api/engines
```

Response:
```json
{
  "engines": [
    {"id": "openai", "name": "OpenAI"},
    {"id": "anthropic", "name": "Anthropic"},
    {"id": "google", "name": "Google"}
  ]
}
```

List Models for an Engine:
```sh
curl http://localhost:8090/api/models/openai
```

Response:
```json
{
  "engine": "openai",
  "models": [
    {"id": "gpt-4", "name": "GPT-4"},
    {"id": "gpt-3.5-turbo", "name": "GPT-3.5 Turbo"}
  ]
}
```

Witsy includes a command-line interface for interacting with AI models directly from your terminal.
Requirements:
- Witsy application must be running (for the HTTP API server)
Launch the CLI:

```sh
npm run cli
```

Enter `/help` to show the list of commands.
Agent webhooks allow you to trigger agent execution via HTTP requests, enabling integration with external systems, automation tools, or custom workflows.
Setting up a webhook:
- Open the Agent Forge and select or create an agent
- Navigate to the "Invocation" tab (last step in the wizard)
- Check the "🌐 Webhook" checkbox
- A unique 8-character token is automatically generated for your agent
- Copy the webhook URL displayed (format: `http://localhost:{port}/api/agent/run/{token}`)
- You can regenerate the token at any time using the refresh button
Using the webhook:
- Send GET or POST requests to the webhook URL
- Include parameters as query strings (GET) or JSON body (POST)
- Parameters are automatically passed to the agent's prompt as input variables
- The agent must have prompt variables defined (e.g., `{task}`, `{name}`) to receive the parameters
- The webhook returns immediately with a `runId` and `statusUrl` for checking execution status
Example agent prompt:
```
Please process the following task: {task}
User: {user}
Priority: {priority}
```
Triggering the agent:
```sh
# Using GET with query parameters
curl "http://localhost:8090/api/agent/run/abc12345?task=backup&user=john&priority=high"

# Using POST with JSON
curl -X POST http://localhost:8090/api/agent/run/abc12345 \
  -H "Content-Type: application/json" \
  -d '{"task":"backup","user":"john","priority":"high"}'
```

Run response:
```json
{
  "success": true,
  "runId": "550e8400-e29b-41d4-a716-446655440000",
  "statusUrl": "/api/agent/status/abc12345/550e8400-e29b-41d4-a716-446655440000"
}
```

Checking execution status:
```sh
# Use the statusUrl from the webhook response (relative path)
curl "http://localhost:8090/api/agent/status/abc12345/550e8400-e29b-41d4-a716-446655440000"
```

Status response (running):
```json
{
  "success": true,
  "runId": "550e8400-e29b-41d4-a716-446655440000",
  "agentId": "agent-uuid",
  "status": "running",
  "createdAt": 1234567890000,
  "updatedAt": 1234567900000,
  "trigger": "webhook"
}
```

Status response (success):
```json
{
  "success": true,
  "runId": "550e8400-e29b-41d4-a716-446655440000",
  "agentId": "agent-uuid",
  "status": "success",
  "createdAt": 1234567890000,
  "updatedAt": 1234567950000,
  "trigger": "webhook",
  "output": "Backup completed successfully for user john with high priority"
}
```

Status response (error):
```json
{
  "success": true,
  "runId": "550e8400-e29b-41d4-a716-446655440000",
  "agentId": "agent-uuid",
  "status": "error",
  "createdAt": 1234567890000,
  "updatedAt": 1234567999000,
  "trigger": "webhook",
  "error": "Failed to connect to backup server"
}
```
- Workspaces / Projects (whatever the name is)
- Proper database (SQLite3) storage (??)
- OpenAI Sora support
- Google Nano Banana support
- Command line interface
- HTTP Server for commanding Witsy, triggering Agents
- Table rendering as artifact, download as CSV and XLSX
- Web apps in menu bar
- Perplexity Search API support
- Design Studio drawing
- MCP Authorization support
- Implement Soniox for STT
- OpenAI GPT-5 support
- Agents (multi-step, scheduling...)
- Document Repository file change monitoring
- OpenAI Responses API (o3-pro)
- ChatGPT history import
- Onboarding experience
- Add, Edit & Delete System Prompts
- Backup/Restore of data and settings
- Transcribe Local Audio Files
- DeepResearch
- Local filesystem access plugin
- Close markdown when streaming
- Multiple attachments
- Custom OpenAI STT support
- AI Commands copy/insert/replace shortcuts
- Defaults at folder level
- Tool selection for chat
- Realtime STT with Speechmatics
- Meta/Llama AI support
- Realtime STT with Fireworks
- OpenAI image generation
- Azure AI support
- Brave Search plugin
- Allow user-input models for embeddings
- User defined parameters for custom engines
- Direct speech-to-text checkbox
- Quick access buttons on home
- fal.ai support (speech-to-text, text-to-image and text-to-video)
- Debug console
- Design Studio
- i18n
- Mermaid diagram rendering
- Smithery.ai MCP integration
- Model Context Protocol
- Local Web Search
- Model defaults
- Speech-to-text language
- Model parameters (temperature...)
- Favorite models
- ElevenLabs Text-to-Speech
- Custom engines (OpenAI compatible)
- Long-term memory plugin
- OpenRouter support
- DeepSeek support
- Folder mode
- All instructions customization
- Fork chat (with optional LLM switch)
- Realtime chat
- Replicate video generation
- Together.ai compatibility
- Gemini 2.0 Flash support
- Groq LLama 3.3 support
- xAI Grok Vision Model support
- Ollama function-calling
- Replicate image generation
- AI Commands redesign
- Token usage report
- OpenAI o1 models support
- Groq vision support
- Image resize option
- Llama 3.2 vision support
- YouTube plugin
- RAG in Scratchpad
- Hugging face image generation
- Show prompt used for image generation
- Redesigned Prompt window
- Anthropic Computer Use
- Auto-update refactor (still not Windows)
- Dark mode
- Conversation mode
- Google function calling
- Anthropic function calling
- Scratchpad
- Dictation: OpenAI Whisper + Whisper WebGPU
- Auto-select expert based on foremost app (Mac only)
- Cerebras support
- Local files RAG
- Groq model update (8-Sep-2024)
- PDF Export of chats
- Prompts renamed to Experts. Now editable.
- Read aloud
- Import/Export commands
- Anthropic Sonnet 3.5
- Ollama base URL as settings
- OpenAI base URL as settings
- DALL-E as tool
- Google Gemini API
- Prompt anywhere
- Cancel commands
- GPT-4o support
- Different default engine/model for commands
- Text attachments (TXT, PDF, DOCX, PPTX, XLSX)
- MistralAI function calling
- Auto-update
- History date sections
- Multiple selection delete
- Search
- Groq API
- Custom prompts
- Sandbox & contextIsolation
- Application Menu
- Prompt history navigation
- Ollama model pull
- macOS notarization
- Fix when long text is highlighted
- Shortcuts for AI commands
- Shift to switch AI command behavior
- User feedback when running a tool
- Download internet content plugin
- Tavily Internet search plugin
- Python code execution plugin
- LLM Tools support (OpenAI only)
- Mistral AI API integration
- Latex rendering
- Anthropic API integration
- Image generation as b64_json
- Text-to-speech
- Log file (electron-log)
- Conversation language settings
- Paste image in prompt
- Run commands with default models
- Models refresh
- Edit commands
- Customized commands
- Conversation menu (info, save...)
- Conversation depth setting
- Save attachment on disk
- Keep running in system tray
- Nicer icon (still temporary)
- Rename conversation
- Copy/edit messages
- New chat window for AI command
- AI Commands with shortcut
- Auto-switch to vision model
- Run at login
- Shortcut editor
- Chat font size settings
- Image attachment for vision
- Stop response streaming
- Save/Restore window position
- Ollama support
- View image full screen
- Status/Tray bar icon + global shortcut to invoke
- Chat themes
- Default instructions in settings
- Save DALL-E images locally (and delete properly)
- OpenAI links in settings
- Copy code button
- Chat list ordering
- OpenAI model choice
- CSS variables