Deployment

This section outlines the deployment strategies and operational considerations for the Vaani Voice Assistant. As a real-time system involving audio processing, network requests, and hardware integration, proper deployment is critical for ensuring low latency and reliability.

System Architecture in Deployment

Vaani operates as a hybrid edge-cloud application. The deployment environment must support:

- Local Processing: Vosk ASR and Wake Word detection require local CPU resources.
- Network I/O: Gemini API and Google Speech fallback require stable internet connectivity.
- Audio Hardware: Direct access to ALSA (Linux), CoreAudio (macOS), or WASAPI (Windows).

Environment Setup

To ensure reproducibility and stability, we support two primary deployment methodologies: an automated shell-script based approach for standard environments, and a comprehensive manual process for custom configurations.

Method A: Automated Deployment (Recommended)

For supported POSIX-like systems (macOS, Debian/Ubuntu, Fedora, Arch), the setup.sh script provides a “one-click” provisioning experience. It handles system-level dependencies, Python environment creation, and model acquisition.

Execution:

# 1. Ensure execution permissions
chmod +x setup.sh

# 2. Run the provisioner
./setup.sh

Operations Performed:

1. System Detection: Identifies the OS and package manager (apt, dnf, brew, pacman).
2. Dependency Installation: Installs portaudio, ffmpeg, and vlc via the native package manager.
3. Environment Isolation: Creates a local .venv and installs pinned Python dependencies.
4. Asset Acquisition: Downloads and validates the Vosk speech models into the models/ directory.
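The system detection step can be illustrated with a short Python sketch. This is not the logic of setup.sh itself (which is a shell script whose internals are not shown here); the function name `detect_package_manager` and the probing order are assumptions made for illustration.

```python
import shutil

# Candidate package managers, in the order the operations list names them.
PACKAGE_MANAGERS = ("apt", "dnf", "brew", "pacman")


def detect_package_manager(which=shutil.which):
    """Return the first known package manager found on PATH, or None.

    `which` is injectable so the lookup can be exercised without touching
    the real system (hypothetical design choice for this sketch).
    """
    for name in PACKAGE_MANAGERS:
        if which(name) is not None:
            return name
    return None


if __name__ == "__main__":
    print(detect_package_manager())
```

A provisioner built this way would fall back to the manual procedure when the function returns None.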

Method B: Manual Deployment (Step-by-Step)

For environments requiring granular control or non-standard paths, follow this strictly ordered procedure.

1. System Dependencies

Ensure the following components are installed and discoverable (shared libraries on the linker path, executables on PATH):

- PortAudio: Required for PyAudio microphone access.
- FFmpeg: Required for audio transcoding (via yt-dlp).
- VLC (libvlc): Required for the media playback engine.

Example (Ubuntu):

sudo apt-get update
sudo apt-get install -y portaudio19-dev ffmpeg vlc libportaudio2
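After installing the packages, a quick check from Python can confirm that the three dependencies are discoverable. The sketch below uses only the standard library; the function names are hypothetical, and the library names passed to `find_library` ("portaudio", "vlc") are assumptions about how the shared libraries are named on the target system.

```python
import ctypes.util
import shutil


def check_system_dependencies():
    """Map each dependency to True if it can be located on this machine."""
    return {
        # Shared libraries are looked up via the platform's linker search rules.
        "portaudio": ctypes.util.find_library("portaudio") is not None,
        "libvlc": ctypes.util.find_library("vlc") is not None,
        # ffmpeg is an executable, so it is looked up on PATH instead.
        "ffmpeg": shutil.which("ffmpeg") is not None,
    }


def missing_dependencies(status):
    """Return the sorted names of dependencies that were not found."""
    return sorted(name for name, found in status.items() if not found)


if __name__ == "__main__":
    print(missing_dependencies(check_system_dependencies()))
```

An empty list means all three were found; any names printed indicate packages to install before continuing.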

2. Python Environment

We enforce the use of virtual environments to prevent sys.path pollution.

# Create the virtual environment
python3 -m venv .venv

# Activate the environment (Bash/Zsh)
source .venv/bin/activate
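One way to enforce the virtual-environment requirement at runtime is to compare `sys.prefix` with `sys.base_prefix`, which differ inside an environment created by `python3 -m venv`. This is a minimal sketch; whether Vaani's main.py performs such a check is not stated here, and the function name is hypothetical.

```python
import sys


def in_virtualenv():
    """True when the interpreter runs inside a venv-created environment."""
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    if not in_virtualenv():
        print("warning: not running inside a virtual environment", file=sys.stderr)
```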

3. Application Dependencies

Install the application-specific libraries defined in the manifest.

pip install -r requirements.txt

4. Model Provisioning

The application requires local ASR models to function.

1. Create a models/ directory in the project root.
2. Download a compatible Vosk model (e.g., vosk-model-small-en-in-0.4).
3. Extract the archive such that the model folder is directly inside models/.
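A common extraction mistake is ending up with the model one level too deep (for example models/NAME/NAME/). The sketch below checks the layout described in the steps above using only `pathlib`; the helper names are hypothetical, and it does not validate the contents of the model folder itself.

```python
from pathlib import Path


def list_model_dirs(models_root):
    """Return the names of directories sitting directly inside models_root."""
    root = Path(models_root)
    if not root.is_dir():
        return []
    return sorted(p.name for p in root.iterdir() if p.is_dir())


def looks_double_nested(models_root, model_name):
    """Detect models/<name>/<name>/, the result of extracting one level too deep."""
    return (Path(models_root) / model_name / model_name).is_dir()


if __name__ == "__main__":
    for name in list_model_dirs("models"):
        state = "double-nested" if looks_double_nested("models", name) else "ok"
        print(f"{name}: {state}")
```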

Service Daemonization

For an “always-on” assistant experience, Vaani should be run as a background daemon. We provide configurations for the two most common init systems: systemd (Linux) and launchd (macOS).

Linux: Systemd Service

Systemd offers robust process supervision. The following unit file ensures Vaani starts after the network and sound subsystems are initialized.

File Location: /etc/systemd/system/vaani.service

[Unit]
Description=Vaani Voice Assistant Service
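# Critical: Wait for network and audio to be ready to avoid startup race conditions.
# network.target alone does not wait for connectivity; network-online.target does.
Wants=network-online.target
After=network-online.target sound.target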

[Service]
Type=simple
User=myuser
# Set the working directory to the project root to ensure relative paths (like models/) resolve correctly
WorkingDirectory=/home/myuser/vaani
# Unbuffered output ensures logs appear immediately in journalctl
Environment="PYTHONUNBUFFERED=1"
ExecStart=/home/myuser/vaani/.venv/bin/python /home/myuser/vaani/main.py
# Auto-restart logic for resilience
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target

Management Commands:

sudo systemctl daemon-reload      # Reload configuration
sudo systemctl enable vaani       # Enable on boot
sudo systemctl start vaani        # Start immediately
journalctl -u vaani -f            # Follow logs in real-time
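With Restart=on-failure, systemd restarts the service after a non-zero exit code but not after a clean exit, and `systemctl stop` sends SIGTERM by default. The following sketch shows how an entry point could cooperate with that policy. It is an illustration under assumptions: the names `run`, `handle_sigterm`, and `main` are hypothetical, and the actual structure of Vaani's main.py is not shown in this section.

```python
import signal
import sys
import threading

stop_requested = threading.Event()


def handle_sigterm(signum, frame):
    # systemctl stop sends SIGTERM by default; request a clean shutdown.
    stop_requested.set()


def run(step, should_stop=stop_requested.is_set):
    """Call step() until a stop is requested and return a process exit code."""
    try:
        while not should_stop():
            step()
    except Exception as exc:
        print(f"fatal: {exc}", file=sys.stderr)
        return 1  # non-zero: Restart=on-failure restarts the service
    return 0  # zero: treated as a clean exit, no restart


def main():
    signal.signal(signal.SIGTERM, handle_sigterm)
    # Placeholder step; a real loop would listen for the wake word here.
    sys.exit(run(lambda: stop_requested.wait(1.0)))
```

Calling `main()` from the script's entry point gives the unit file's restart policy a clear signal: unhandled errors trigger a restart after RestartSec, while an operator-initiated stop does not.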

macOS: LaunchAgent

On macOS, launchd manages user-session agents. Unlike system daemons, a LaunchAgent has access to the graphical user session, which is often required for audio permission validation.

File Location: ~/Library/LaunchAgents/com.vaani.assistant.plist

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>com.vaani.assistant</string>

    <key>ProgramArguments</key>
    <array>
        <string>/Users/myuser/vaani/.venv/bin/python</string>
        <string>/Users/myuser/vaani/main.py</string>
    </array>

    <key>WorkingDirectory</key>
    <string>/Users/myuser/vaani</string>

    <key>RunAtLoad</key>
    <true/>

    <key>KeepAlive</key>
    <true/>

    <!-- Log redirection for debugging -->
    <key>StandardOutPath</key>
    <string>/Users/myuser/vaani/logs/vaani.out</string>
    <key>StandardErrorPath</key>
    <string>/Users/myuser/vaani/logs/vaani.err</string>
</dict>
</plist>

Management Commands:

mkdir -p /Users/myuser/vaani/logs    # Ensure the log directory exists before loading
launchctl load ~/Library/LaunchAgents/com.vaani.assistant.plist
launchctl start com.vaani.assistant
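Because the plist hardcodes /Users/myuser/vaani, it can be convenient to generate it for a given install path. The sketch below uses the standard-library `plistlib` module to produce an equivalent property list; the function names are hypothetical, and the keys simply mirror the example file above.

```python
import plistlib
from pathlib import PurePosixPath


def build_launch_agent(project_root, label="com.vaani.assistant"):
    """Return the LaunchAgent definition as a dict for the given install path."""
    root = PurePosixPath(project_root)
    return {
        "Label": label,
        "ProgramArguments": [str(root / ".venv/bin/python"), str(root / "main.py")],
        "WorkingDirectory": str(root),
        "RunAtLoad": True,
        "KeepAlive": True,
        "StandardOutPath": str(root / "logs/vaani.out"),
        "StandardErrorPath": str(root / "logs/vaani.err"),
    }


def render_launch_agent(project_root):
    """Serialize the definition to XML plist bytes."""
    return plistlib.dumps(build_launch_agent(project_root))


if __name__ == "__main__":
    print(render_launch_agent("/Users/myuser/vaani").decode("utf-8"))
```

The output can be written to ~/Library/LaunchAgents/com.vaani.assistant.plist and loaded with the management commands above.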

Operational Considerations

Security & API Keys

Vaani relies on external APIs (Google Gemini). Never hardcode API keys in the source. We utilize a .env file strategy:

- Production keys are injected via the environment or a secure execution context.
- File permissions on .env should be restricted (chmod 600 .env).
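Both points can be checked at startup. The sketch below reads the key from the process environment and verifies that .env grants no permissions to group or others, as `chmod 600` would ensure on POSIX systems. The variable name `GEMINI_API_KEY` and both function names are assumptions for illustration; the name Vaani actually reads is not given in this section.

```python
import os
import stat


def env_file_is_private(path=".env"):
    """True if neither group nor others have any permission bits on the file."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return (mode & 0o077) == 0


def load_api_key(var_name="GEMINI_API_KEY", environ=os.environ):
    """Return the API key from the environment or fail loudly if it is absent.

    GEMINI_API_KEY is a hypothetical variable name used for this sketch.
    """
    key = environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"{var_name} is not set")
    return key
```

Failing fast on a missing key keeps the error visible in the service logs instead of surfacing later as a failed API request.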

Model Management

The ASR (Automatic Speech Recognition) models are significant artifacts (roughly 50-100 MB).

- Update Checks: The setup.sh script includes checksum logic to download models only if they are missing or corrupted.
- Path Resolution: The codebase dynamically resolves model paths relative to the project root. Ensure the working directory is set correctly in service files.
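The "download only if missing or corrupted" idea can be sketched in Python as follows. This is not the code in setup.sh; the use of SHA-256 and the function names are assumptions, since the checksum algorithm the script uses is not specified here.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large model archives are not loaded into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_download(archive_path, expected_sha256):
    """True if the archive is missing or its checksum does not match."""
    path = Path(archive_path)
    if not path.is_file():
        return True
    return sha256_of(path) != expected_sha256.lower()
```

A provisioner would call `needs_download` before fetching each archive and skip the transfer when it returns False.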