Speech-to-Code is a web application that leverages Large Language Models (LLMs) to convert spoken language into executable code. This project aims to streamline the code generation process by allowing developers to express their ideas verbally and have them translated into functional code.
Combine speech, repository files, and manual text input in our intuitive prompt composer:
Sidestep rate limits and outages by using LLM APIs directly:
Easily review and copy the generated code:
Customize and organize your system prompts:
- Combine speech, repository files, and manual text input
- Real-time audio visualization for voice input
- Smart file suggestions based on context
- Preview and edit prompts before submission
- Integration with multiple LLM providers (OpenAI, Anthropic)
- Customizable model parameters
- Cost tracking and display
- Model-specific optimizations
- Interactive file viewer for repository navigation
- Smart file combinations for context
- File-based suggestions
- Repository structure visualization
- Persistent chat history with automatic saving
- Session organization and management
- Soft delete functionality for chat sessions
- Automatic chat naming based on context
- Local timestamp display for all messages
- Real-time speech-to-text conversion
- Transcription editing and refinement
- Voice input visualization
- Create and edit system prompts
- Organize prompts by category
- Quick prompt selection
- Version control for prompts
- Dark/Light mode toggle
- Two-column layout for better workflow
- Responsive design
- Copy-to-clipboard functionality
- Environment variable management
- Repository path configuration
- API key management
- Port configuration
- Comprehensive session logging
- Chat history persistence
- Automatic log directory management
- Easy access to past conversations
Before you begin, ensure you have installed:
- Node.js and npm (latest version)
- Python (version 3.7 or later)
- A Windows/Linux/Mac machine with command line access
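The prerequisites above can be checked before running the build script. The following is a minimal sketch, not part of the project: the function name `check_prerequisites` is hypothetical, and it only tests that `node` and `npm` are on the PATH and that the running Python interpreter is 3.7 or later.

```python
import shutil
import sys

MIN_PYTHON = (3, 7)  # minimum version stated in the prerequisites


def check_prerequisites(tools=("node", "npm")):
    """Return a dict mapping each prerequisite name to True or False."""
    results = {"python": sys.version_info >= MIN_PYTHON}
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH
        results[tool] = shutil.which(tool) is not None
    return results


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'found' if ok else 'MISSING'}")
```

It does not verify that Node.js or npm are the latest version, only that they are installed.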
- Clone the repository:
    git clone https://github.com/dharllc/speech-to-code.git
    cd speech-to-code
- Make the build script executable:
    chmod +x build.sh
- Run the build script:
    ./build.sh
The script will:
- Install necessary dependencies
- Set up a Python virtual environment
- Create .env files with placeholders
- Create required directories for logs and chat sessions
- Set appropriate permissions
- Configure environment variables. Navigate to the Settings page to configure:
  - OpenAI API Key
  - Google API Key
  - Anthropic API Key
  - Repository Path
- Configure ports (optional). Edit config.json in the root directory:
    {
      "frontend": { "port": 3000 },
      "backend": { "port": 8000 }
    }
- Start the frontend:
    cd frontend
    npm start
- Start the backend (in a second terminal, from the repository root):
    cd backend
    source venv/bin/activate
    uvicorn main:app --reload --log-level debug
Access the application at http://localhost:3000
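The optional config.json from the setup steps has a simple shape: one object per server, each with a `port`. As an illustration of how such a file can be read, here is a minimal sketch. The source does not show how the application itself loads this file, so the function name `load_ports` and the fallback to the default ports 3000 and 8000 are assumptions.

```python
import json
from pathlib import Path

# Defaults taken from the example config.json in the setup steps
DEFAULT_PORTS = {"frontend": 3000, "backend": 8000}


def load_ports(config_path="config.json"):
    """Read frontend/backend ports, keeping the default for any missing entry."""
    ports = dict(DEFAULT_PORTS)
    path = Path(config_path)
    if path.is_file():
        config = json.loads(path.read_text(encoding="utf-8"))
        for name in ports:
            # Each section is expected to look like {"port": 3000}
            ports[name] = int(config.get(name, {}).get("port", ports[name]))
    return ports


if __name__ == "__main__":
    print(load_ports())
```

If you change the frontend port, the address you open in the browser changes with it.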
speech-to-code/
├── backend/
│   ├── .env
│   ├── main.py
│   ├── llm_interaction.py
│   ├── model_config.py
│   ├── system_prompts.json
│   ├── context_maps/
│   └── utils/
├── frontend/
│   ├── public/
│   └── src/
│       ├── components/
│       ├── services/
│       └── config/
├── logs/
│   └── chat_sessions/    # Persistent chat history storage
└── README.md
Backend dependencies failing to install:
If you encounter build errors with packages like grpcio, tiktoken, or tokenizers on macOS:
- Upgrade pip and build tools:
    cd backend
    source venv/bin/activate
    pip install --upgrade pip setuptools wheel
- Install problematic packages with precompiled wheels:
    pip install --only-binary=:all: grpcio tiktoken tokenizers
- Install remaining dependencies:
    pip install uvicorn fastapi python-dotenv openai anthropic google-generativeai
"uvicorn not found" error:
- Ensure you've activated the virtual environment:
    source venv/bin/activate
- Install uvicorn directly:
    pip install uvicorn
"REPO_PATH environment variable is not set" error:
- Create a .env file in the backend/ directory
- Add:
    REPO_PATH=/path/to/your/repositories
- Example:
    REPO_PATH=/Users/username/Documents/GitHub
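To confirm that the .env file is picked up and that REPO_PATH points at a real directory, a small check like the one below can help. It is a sketch under stated assumptions, not the backend's own loading code: it uses a tiny standard-library parser so it runs without extra packages, and the names `read_env_file` and `check_repo_path` are hypothetical.

```python
import os
from pathlib import Path


def read_env_file(env_path):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in Path(env_path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def check_repo_path(env_path="backend/.env"):
    """Return the configured REPO_PATH, or raise with a hint about what is wrong."""
    env_file = Path(env_path)
    values = read_env_file(env_file) if env_file.is_file() else {}
    # A value already exported in the shell takes precedence (an assumption)
    repo_path = os.environ.get("REPO_PATH") or values.get("REPO_PATH")
    if not repo_path:
        raise RuntimeError(f"REPO_PATH is not set; add it to {env_path}")
    if not Path(repo_path).is_dir():
        raise RuntimeError(f"REPO_PATH points to a missing directory: {repo_path}")
    return repo_path
```

Run `check_repo_path()` from the repository root. A `RuntimeError` tells you whether the variable is missing or the directory does not exist.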
If you encounter other issues:
- Verify API keys in Settings
- Check dependencies
- Ensure both servers are running
- Check the logs directory for detailed session logs
- Verify proper permissions on the logs directory
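One quick way to confirm that both servers are running is to test whether anything is accepting TCP connections on their ports. The sketch below assumes the default ports from config.json (3000 and 8000) and that both servers are reachable on 127.0.0.1. The helper name `is_listening` is hypothetical.

```python
import socket


def is_listening(port, host="127.0.0.1", timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    # Default ports from the example config.json; adjust if you changed them
    for name, port in {"frontend": 3000, "backend": 8000}.items():
        status = "up" if is_listening(port) else "down"
        print(f"{name} (port {port}): {status}")
```

A port reported as "down" usually means the corresponding server was not started or exited with an error. Check its console output next.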
For detailed logs, check:
- Console output of both servers
- Session logs in logs/sessions/
- Application logs for debugging
We welcome contributions! Check our issues page for current tasks or suggest new features.
Have suggestions? Email me at sachin@dharllc.com
This project is licensed under the MIT License.