
Speech-to-Code

Speech-to-Code is a web application that leverages Large Language Models (LLMs) to convert spoken language into executable code. This project aims to streamline the code generation process by allowing developers to express their ideas verbally and have them translated into functional code.

πŸ“Έ Application Screenshots

🎯 Compose Your Prompt

Combine speech, repository files, and manual text input in our intuitive prompt composer:

Prompt Composer

πŸ’¬ Use LLM APIs Directly

Sidestep rate limits and outages by using LLM APIs directly:

Prompt UI

πŸ“‹ Review and Copy Code

Easily review and copy the generated code:

Copy Code

βš™οΈ Manage System Prompts

Customize and organize your system prompts:

System Prompts

✨ Key Features

🎯 Advanced Prompt Composer

  • Combine speech, repository files, and manual text input
  • Real-time audio visualization for voice input
  • Smart file suggestions based on context
  • Preview and edit prompts before submission

πŸ€– Multi-Model Support

  • Integration with multiple LLM providers (OpenAI, Anthropic)
  • Customizable model parameters
  • Cost tracking and display
  • Model-specific optimizations
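
As a rough illustration of how multi-provider support can be wired up, the sketch below routes a prompt to either OpenAI or Anthropic with per-call parameters. It is not the repository's actual llm_interaction.py; the function, model names, and parameters are placeholders, and it assumes the official openai and anthropic Python SDKs.

    # Hypothetical provider-dispatch sketch; not the project's actual implementation.
    from openai import OpenAI
    import anthropic

    openai_client = OpenAI()                  # reads OPENAI_API_KEY from the environment
    anthropic_client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

    def complete(provider, model, prompt, temperature=0.2):
        """Route a prompt to the selected provider with adjustable parameters."""
        if provider == "openai":
            resp = openai_client.chat.completions.create(
                model=model,
                messages=[{"role": "user", "content": prompt}],
                temperature=temperature,
            )
            return resp.choices[0].message.content
        if provider == "anthropic":
            resp = anthropic_client.messages.create(
                model=model,
                max_tokens=1024,
                messages=[{"role": "user", "content": prompt}],
                temperature=temperature,
            )
            return resp.content[0].text
        raise ValueError(f"Unknown provider: {provider}")

Both SDKs return token usage on the response object, which is the natural hook for the cost tracking and display feature.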

πŸ“ Repository Integration

  • Interactive file viewer for repository navigation
  • Smart file combinations for context
  • File-based suggestions
  • Repository structure visualization
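
Conceptually, the file viewer and suggestions are built on an enumeration of the configured repository. The snippet below is only a minimal sketch, assuming REPO_PATH is set; the backend's real traversal, filtering, and ranking logic may differ.

    # Minimal sketch: list files under REPO_PATH for a tree-style viewer (illustrative only).
    import os
    from pathlib import Path

    def list_repo_files(root, skip=(".git", "node_modules", "venv")):
        """Yield file paths relative to the repository root, skipping common noise."""
        for path in sorted(root.rglob("*")):
            if any(part in skip for part in path.parts):
                continue
            if path.is_file():
                yield path.relative_to(root)

    repo_root = Path(os.environ["REPO_PATH"])  # configured via backend/.env or Settings
    for rel_path in list_repo_files(repo_root):
        print(rel_path)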

πŸ’¬ Chat Sessions Management

  • Persistent chat history with automatic saving
  • Session organization and management
  • Soft delete functionality for chat sessions
  • Automatic chat naming based on context
  • Local timestamp display for all messages
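
Persisted sessions live under logs/chat_sessions/ (see the project structure below). The sketch that follows shows what a saved session might look like, assuming a simple JSON-per-session layout; the actual on-disk format may differ.

    # Illustrative only: persist a chat session as JSON with a soft-delete flag.
    import json
    from datetime import datetime, timezone
    from pathlib import Path

    SESSIONS_DIR = Path("logs/chat_sessions")

    def save_session(session_id, messages, deleted=False):
        """Write a session to logs/chat_sessions/ with a UTC timestamp and soft-delete flag."""
        SESSIONS_DIR.mkdir(parents=True, exist_ok=True)
        record = {
            "id": session_id,
            "updated_at": datetime.now(timezone.utc).isoformat(),
            "deleted": deleted,    # soft delete: hidden in the UI but kept on disk
            "messages": messages,  # e.g. [{"role": "user", "content": "...", "timestamp": "..."}]
        }
        path = SESSIONS_DIR / f"{session_id}.json"
        path.write_text(json.dumps(record, indent=2))
        return path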

🎀 Transcription Management

  • Real-time speech-to-text conversion
  • Transcription editing and refinement
  • Voice input visualization
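
A minimal sketch of the speech-to-text step, assuming OpenAI's Whisper transcription endpoint is the provider (the project's actual pipeline and model choice may differ):

    # Assumed transcription flow via OpenAI's Whisper API; illustrative, not the project's code.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    def transcribe(audio_path):
        """Upload a recorded audio file and return the transcribed text."""
        with open(audio_path, "rb") as audio_file:
            result = client.audio.transcriptions.create(model="whisper-1", file=audio_file)
        return result.text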

πŸ’‘ System Prompt Management

  • Create and edit system prompts
  • Organize prompts by category
  • Quick prompt selection
  • Version control for prompts
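
System prompts are stored in backend/system_prompts.json (see the project structure below). The entry here is purely illustrative; the actual schema used by the application may differ.

    {
        "name": "Code Reviewer",
        "category": "Review",
        "content": "You are a senior engineer. Review the provided code for bugs and style issues.",
        "version": 1
    }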

🎨 User Experience

  • Dark/Light mode toggle
  • Two-column layout for better workflow
  • Responsive design
  • Copy-to-clipboard functionality

βš™οΈ Advanced Settings

  • Environment variable management
  • Repository path configuration
  • API key management
  • Port configuration

πŸ“ Logging and History

  • Comprehensive session logging
  • Chat history persistence
  • Automatic log directory management
  • Easy access to past conversations

πŸš€ Getting Started

πŸ“‹ Prerequisites

Before you begin, ensure you have installed:

  • Node.js and npm (latest version)
  • Python (version 3.7 or later)
  • A Windows/Linux/Mac machine with command line access

πŸ”§ Installation

  1. Clone the repository

    git clone https://github.com/dharllc/speech-to-code.git
    cd speech-to-code
  2. Make the build script executable

    chmod +x build.sh
  3. Run the build script

    ./build.sh

    The script will:

    • Install necessary dependencies
    • Set up a Python virtual environment
    • Create .env files with placeholders
    • Create required directories for logs and chat sessions
    • Set appropriate permissions
  4. Configure Environment Variables. Navigate to the Settings page to configure (an illustrative backend/.env is shown below):

    • OpenAI API Key
    • Google API Key
    • Anthropic API Key
    • Repository Path
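
The build script creates the .env files with placeholders, and the Settings page fills in the values. For reference, a backend/.env might end up looking like this; REPO_PATH is required, while the exact names of the API-key variables are assumptions here:

    # backend/.env (illustrative values)
    OPENAI_API_KEY=sk-...
    GOOGLE_API_KEY=...
    ANTHROPIC_API_KEY=sk-ant-...
    REPO_PATH=/Users/username/Documents/GitHub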

πŸš€ Running the Application

  1. Configure ports (optional). Edit config.json in the root directory:

    {
        "frontend": {
            "port": 3000 
        },
        "backend": {
            "port": 8000 
        }
    }
  2. Start the frontend

    cd frontend
    npm start
  3. Start the backend (in a separate terminal)

    cd backend
    source venv/bin/activate
    uvicorn main:app --reload --log-level debug

Access the application at http://localhost:3000 🌐

πŸ“ Project Structure

speech-to-code/
β”œβ”€β”€ backend/
β”‚   β”œβ”€β”€ .env
β”‚   β”œβ”€β”€ main.py
β”‚   β”œβ”€β”€ llm_interaction.py
β”‚   β”œβ”€β”€ model_config.py
β”‚   β”œβ”€β”€ system_prompts.json
β”‚   β”œβ”€β”€ context_maps/
β”‚   └── utils/
β”œβ”€β”€ frontend/
β”‚   β”œβ”€β”€ public/
β”‚   └── src/
β”‚       β”œβ”€β”€ components/
β”‚       β”œβ”€β”€ services/
β”‚       └── config/
β”œβ”€β”€ logs/
β”‚   └── chat_sessions/      # Persistent chat history storage
└── README.md

πŸ”§ Troubleshooting

Common Installation Issues

Backend dependencies failing to install: If you encounter build errors with packages like grpcio, tiktoken, or tokenizers on macOS:

  1. Upgrade pip and build tools:

    cd backend
    source venv/bin/activate
    pip install --upgrade pip setuptools wheel
  2. Install problematic packages with precompiled wheels:

    pip install --only-binary=:all: grpcio tiktoken tokenizers
  3. Install remaining dependencies:

    pip install uvicorn fastapi python-dotenv openai anthropic google-generativeai

"uvicorn not found" error:

  • Ensure you've activated the virtual environment: source venv/bin/activate
  • Install uvicorn directly: pip install uvicorn

"REPO_PATH environment variable is not set" error:

  • Create a .env file in the backend/ directory
  • Add: REPO_PATH=/path/to/your/repositories
  • Example: REPO_PATH=/Users/username/Documents/GitHub

General Issues

If you encounter other issues:

  1. Verify API keys in Settings
  2. Check dependencies
  3. Ensure both servers are running
  4. Check the logs directory for detailed session logs
  5. Verify proper permissions on the logs directory

For detailed logs, check:

  • Console output of both servers
  • Session logs in logs/sessions/
  • Application logs for debugging

🀝 Contributing

We welcome contributions! Check our issues page for current tasks or suggest new features.

πŸ“¬ Feedback

Have suggestions? Email me at sachin@dharllc.com

πŸ“„ License

This project is licensed under the MIT License πŸ“œ
