Aurix Project: Task Checklist
This document outlines the tasks required to build the Aurix application, based on the voice-first architecture.
Phase 0: Core Setup & Environment
- Initialize Electron Forge project with Vite + TypeScript + React.
- Establish secure IPC bridge between main and renderer processes using contextBridge.
- Set up basic UI shell with Tailwind CSS and initialize shadcn/ui.
- Integrate electron-store for local encrypted storage (instead of SQLite).
- Set up a basic logging service for debugging.
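One way to keep the IPC bridge narrow is to expose only an allowlist of channels to the renderer. The sketch below is an assumption about how that could look, not the project's actual bridge: the channel names, the "aurix" key, and the helper names (IpcLike, createRendererApi) are hypothetical. It takes an ipcRenderer-shaped object so the logic can be exercised without Electron.

```typescript
// Hypothetical channel names; the checklist does not specify them.
const ALLOWED_CHANNELS: readonly string[] = [
  "transcription:start",
  "transcription:stop",
];

// Minimal shape of the part of Electron's ipcRenderer used here.
interface IpcLike {
  invoke(channel: string, ...args: unknown[]): Promise<unknown>;
}

// The object the renderer would see on `window`.
interface RendererApi {
  invoke(channel: string, ...args: unknown[]): Promise<unknown>;
}

function isAllowed(channel: string): boolean {
  return ALLOWED_CHANNELS.indexOf(channel) !== -1;
}

// Wraps an ipcRenderer-like object so that only allowlisted channels pass.
function createRendererApi(ipc: IpcLike): RendererApi {
  return {
    invoke(channel: string, ...args: unknown[]): Promise<unknown> {
      if (!isAllowed(channel)) {
        return Promise.reject(new Error(`Blocked IPC channel: ${channel}`));
      }
      return ipc.invoke(channel, ...args);
    },
  };
}

// Assumed wiring in a preload script (not executed in this sketch):
//   import { contextBridge, ipcRenderer } from "electron";
//   contextBridge.exposeInMainWorld("aurix", createRendererApi(ipcRenderer));
```

With this shape the renderer never receives ipcRenderer itself, only a wrapper that rejects any channel outside the list.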
Phase 1: Hybrid Voice Capture & Transcription
- Implement the core HybridTranscriptionService in the main process to manage transcription sources.
- Implement connectivity monitoring to switch between online/offline modes automatically.
- Renderer Process: Implement the WebSpeechHandler to manage the Web Speech API.
- Main Process: Implement the Vosk model manager to download and load the local model (stubbed due to native module issues).
- Main Process: Integrate the Vosk engine for offline transcription (stubbed due to native module issues).
- Create the IPC channels for the renderer to start/stop transcription and for the main process to receive results from both sources.
- Implement the UI for recording and displaying the live transcript from the hybrid service.
- Add a "Privacy Mode" toggle in the UI that allows the user to force offline transcription.
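The switching rule implied by the tasks above (use the Web Speech API when online, fall back to Vosk when offline, and always use Vosk in Privacy Mode) can be sketched as a small pure function plus a tracker that reports changes. The names TranscriptionSource, SourceInputs, selectSource, and SourceSwitcher are hypothetical, and this is only one possible shape for that part of HybridTranscriptionService.

```typescript
type TranscriptionSource = "web-speech" | "vosk";

interface SourceInputs {
  online: boolean; // latest result from connectivity monitoring
  privacyMode: boolean; // state of the "Privacy Mode" toggle
}

// Privacy Mode wins over connectivity; otherwise prefer the online source.
function selectSource(inputs: SourceInputs): TranscriptionSource {
  if (inputs.privacyMode) return "vosk";
  return inputs.online ? "web-speech" : "vosk";
}

// Tracks the active source and notifies a callback only when it changes.
class SourceSwitcher {
  private inputs: SourceInputs;
  private current: TranscriptionSource;
  private onSwitch: (source: TranscriptionSource) => void;

  constructor(
    initial: SourceInputs,
    onSwitch: (source: TranscriptionSource) => void
  ) {
    this.inputs = initial;
    this.onSwitch = onSwitch;
    this.current = selectSource(initial);
  }

  update(patch: Partial<SourceInputs>): TranscriptionSource {
    this.inputs = { ...this.inputs, ...patch };
    const next = selectSource(this.inputs);
    if (next !== this.current) {
      this.current = next;
      this.onSwitch(next);
    }
    return this.current;
  }
}
```

Keeping the decision in a pure function makes the rule easy to test independently of the network events and UI toggle that feed it.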
Phase 2: AI Workflow (LangGraph)
- Define the WorkflowState object that will be passed through the graph.
- Build the transcription_node to handle the incoming data from the HybridTranscriptionService.
- Build the analysis_node using a local LLM (via Ollama) to classify content and calculate a complexity score.
- Build the document_generation_node to generate structured Markdown from the transcript.
- Build the diagram_generation_node to detect diagram descriptions and generate Mermaid syntax.
- Build the assembly_node to combine all parts into a final document.
- Build the cognitive_load_index_node to calculate the final θ score.
- Wire all nodes together in a LangGraph workflow with appropriate conditional logic for error handling.
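To make the idea of a shared state passed from node to node concrete, here is a library-free sketch. It does not use LangGraph: a plain sequential runner stands in for the graph, and stopping at the first failure stands in for the conditional error edge. The WorkflowState fields and the two stub nodes are assumptions for illustration only; the real nodes would call the transcription service and the LLM.

```typescript
// Hypothetical state shape; the checklist does not define the fields.
interface WorkflowState {
  transcript: string;
  markdown?: string;
  diagrams: string[];
  document?: string;
  error?: string;
}

// A node reads the current state and returns only the fields it changes.
type WorkflowNode = (state: WorkflowState) => Promise<Partial<WorkflowState>>;

// Runs nodes in order, merging each partial update into the state.
// On the first thrown error it stops and records "<node name>: <message>".
async function runWorkflow(
  nodes: Array<[string, WorkflowNode]>,
  initial: WorkflowState
): Promise<WorkflowState> {
  let state = initial;
  for (const [name, node] of nodes) {
    try {
      const update = await node(state);
      state = { ...state, ...update };
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      return { ...state, error: `${name}: ${message}` };
    }
  }
  return state;
}

// Stub standing in for document_generation_node (no LLM call).
const documentGenerationStub: WorkflowNode = async (state) => ({
  markdown: `# Notes\n\n${state.transcript}`,
});

// Stub standing in for assembly_node: joins Markdown and any diagrams.
const assemblyStub: WorkflowNode = async (state) => ({
  document: [state.markdown || "", ...state.diagrams].join("\n\n"),
});
```

Having each node return a partial update rather than mutating the state keeps nodes independent, which is also the style LangGraph encourages.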
Phase 3: Core UI Implementation
- Build the main application layout (e.g., sidebar for sessions, main view for content).
- Implement the real-time UI for an active recording session, showing the live transcript and the progressively generated document.
- Create the dashboard view for browsing and managing past sessions/documents.
- Implement the UI for displaying the Cognitive Load Index and its historical trends.
- Build the user feedback mechanism (e.g., a slider) to capture subjective workload.
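For the historical trend view, one simple option is to smooth the per-session Cognitive Load Index values with a trailing moving average before plotting. This is an assumed approach, not something the checklist specifies; the function name movingAverage and the choice of window are hypothetical, and no particular scale for θ is assumed.

```typescript
// Trailing moving average: element i is the mean of the last `window`
// values up to and including i (fewer at the start of the series).
function movingAverage(values: number[], window: number): number[] {
  if (!Number.isInteger(window) || window < 1) {
    throw new RangeError("window must be a positive integer");
  }
  const smoothed: number[] = [];
  let sum = 0;
  for (let i = 0; i < values.length; i++) {
    sum += values[i];
    if (i >= window) {
      sum -= values[i - window]; // drop the value that left the window
    }
    smoothed.push(sum / Math.min(i + 1, window));
  }
  return smoothed;
}

// Example with made-up scores: movingAverage([2, 4, 6, 8], 2) gives [2, 3, 5, 7].
```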
Phase 4: Finalization & Stretch Goals
- Implement the optional authentication flow using Auth0 for cloud features.
- Implement the data synchronization service using PouchDB/CouchDB for authenticated users.
- Add support for additional recording modes (PTT, VAD).
- Implement advanced voice commands.
- Build out the settings page (e.g., select microphone, choose AI models, configure hotkeys).
- Test and finalize the production build and auto-update process.
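For the VAD recording mode, a basic energy-based detector is a common starting point: a frame counts as speech when its RMS level crosses a threshold, and a short hangover keeps the detector open across brief pauses. The sketch below assumes audio frames arrive as Float32Array samples in the range [-1, 1]. The class name EnergyVad and the default threshold and hangover values are hypothetical and would need tuning against real microphone input.

```typescript
// Root-mean-square level of one audio frame.
function rms(frame: Float32Array): number {
  if (frame.length === 0) return 0;
  let sumSquares = 0;
  for (let i = 0; i < frame.length; i++) {
    sumSquares += frame[i] * frame[i];
  }
  return Math.sqrt(sumSquares / frame.length);
}

class EnergyVad {
  private threshold: number;
  private hangoverFrames: number;
  private silentFrames = 0;
  private speaking = false;

  // Defaults are placeholder values, not measured ones.
  constructor(threshold = 0.02, hangoverFrames = 10) {
    this.threshold = threshold;
    this.hangoverFrames = hangoverFrames;
  }

  // Returns true while speech is considered active. Speech ends only after
  // more than `hangoverFrames` consecutive frames fall below the threshold.
  process(frame: Float32Array): boolean {
    if (rms(frame) >= this.threshold) {
      this.speaking = true;
      this.silentFrames = 0;
    } else if (this.speaking) {
      this.silentFrames += 1;
      if (this.silentFrames > this.hangoverFrames) {
        this.speaking = false;
      }
    }
    return this.speaking;
  }
}
```

Push-to-talk needs no detector at all, so the same recording pipeline could accept either a hotkey state or the boolean returned by process as its gate.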