Story
1. Introduction & Key Features
My project is a real-time AI voice assistant built on an ESP32. It listens to your question, transcribes it via Deepgram, queries ChatGPT for an up-to-date answer, then speaks the reply through an I²S amplifier, showing each step on a 16×2 LCD along the way. The result is a hands-free, web-connected assistant that does not need a PC.
- Live Web Data: Fetch news, weather, stock quotes, or any current event.
- Local Logging: Records every question as a WAV file on the microSD card.
- Clear UI: 16×2 LCD shows “Ready for Ans...”, “record start...”, “Speech to text”, and the actual text.
- Portable Power: Runs from any USB charger or power bank, with no onboard battery to manage.
2. Hardware Design & Assembly
All pin-out and wiring details can be found in the Schematic and Layout section. In brief, the core modules are:
- ESP32-WROOM-32: Main controller with Wi-Fi & I²S
- INMP441 I²S Mic: Digital microphone
- MAX98357A Amp: I²S audio output to speaker
- microSD Module: WAV recording & logs
- 16×2 I²C LCD: Status display
- Pushbutton: Record trigger
3. Project Demonstration Video
See it in action:
https://drive.google.com/file/d/1I4ZKJhOAJsSllxNhRCoU3JL7TKS1CD7C/view?usp=drivesdk
4. Software Workflow & Code
st=>start: Button Press  
rec=>operation: Record WAV to SD  
stt=>operation: Deepgram STT → Text  
gpt=>operation: ChatGPT API → Reply Text  
tts=>operation: OpenAI TTS → Audio  
play=>end: Play via I²S → LCD shows steps  
st->rec->stt->gpt->tts->play
- Initialize Peripherals (I²S, SD, LCD, Wi-Fi).
- Record Audio: Hold button → WAV saved to SD.
- Transcribe: Upload WAV to Deepgram → receive plain text.
- Chat & TTS: Send text to ChatGPT → get reply → send reply to OpenAI TTS.
- Playback: Stream audio through MAX98357A → display messages.
- Loop: Ready for next query.
Code Highlights:
- lib_audio_recording.ino: I²S + SD card WAV writer
- lib_audio_transcription.ino: HTTPS POST to Deepgram STT
- lib_OpenAI_Chat.ino: ChatGPT Completions API handler
- lib_audio_tts.ino: OpenAI TTS playback routine
- main.ino: Orchestrates button, LCD, and state machine
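To illustrate what a WAV writer has to produce, here is a minimal sketch of the canonical 44-byte PCM header. It is not taken from `lib_audio_recording.ino`; the helper name `makeWavHeader` is hypothetical, and the 16 kHz, 16-bit, mono settings used in the example afterwards are assumptions, since the write-up does not state the recording format.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Store 16-bit and 32-bit values in little-endian byte order, as RIFF requires.
static void putLE16(uint8_t* p, uint16_t v) {
    p[0] = static_cast<uint8_t>(v & 0xFF);
    p[1] = static_cast<uint8_t>((v >> 8) & 0xFF);
}
static void putLE32(uint8_t* p, uint32_t v) {
    for (int i = 0; i < 4; ++i) p[i] = static_cast<uint8_t>((v >> (8 * i)) & 0xFF);
}

// Build the 44-byte header for an uncompressed PCM WAV file.
// dataBytes is the size of the audio payload that follows the header.
std::array<uint8_t, 44> makeWavHeader(uint32_t sampleRate, uint16_t bitsPerSample,
                                      uint16_t channels, uint32_t dataBytes) {
    std::array<uint8_t, 44> h{};
    const uint16_t blockAlign = static_cast<uint16_t>(channels * bitsPerSample / 8);
    const uint32_t byteRate = sampleRate * blockAlign;

    std::memcpy(&h[0], "RIFF", 4);
    putLE32(&h[4], 36 + dataBytes);   // file size minus the first 8 bytes
    std::memcpy(&h[8], "WAVE", 4);
    std::memcpy(&h[12], "fmt ", 4);
    putLE32(&h[16], 16);              // size of the fmt chunk for PCM
    putLE16(&h[20], 1);               // audio format 1 = PCM
    putLE16(&h[22], channels);
    putLE32(&h[24], sampleRate);
    putLE32(&h[28], byteRate);        // bytes per second
    putLE16(&h[32], blockAlign);      // bytes per sample frame
    putLE16(&h[34], bitsPerSample);
    std::memcpy(&h[36], "data", 4);
    putLE32(&h[40], dataBytes);
    return h;
}
```

With the assumed settings, `makeWavHeader(16000, 16, 1, 32000)` describes one second of audio (32,000 bytes). Because the recording length is unknown until the button is released, one common approach is to write a placeholder header first, stream the samples, then seek back to the start of the file and rewrite the header with the final `dataBytes`.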
5. Reference Code
Find the complete source code and documentation on GitHub:
https://github.com/kaloprojects/KALO-ESP32-Voice-ChatGPT
6. Step-by-Step Build Tutorial
- Solder headers to each module.
- Wire modules per the PDF’s pin diagram.
- Flash firmware via Arduino IDE.
- Insert a FAT32-formatted microSD card.
- Power via USB charger and watch the LCD boot.
- Hold the button and speak, then watch and listen!
7. Lessons Learned & Pitfalls
- I²S Buffering: Tune buffer sizes to prevent underruns.
- Network Timeouts: Implement retries for STT and GPT calls.
- Power Stability: Use a reliable USB supply to avoid drops during Wi-Fi.
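The first two pitfalls lend themselves to small helpers. The sketch below is illustrative only and does not come from the project's source: `dmaBufferedMs` and `retryWithBackoff` are hypothetical names, and the numbers in the example afterwards are assumptions. The first helper estimates how much audio the I²S DMA buffers hold (in the legacy ESP-IDF driver these sizes correspond to `dma_buf_count` and `dma_buf_len`), and the second retries a network call with exponential backoff.

```cpp
#include <cstdint>
#include <functional>

// Milliseconds of audio held by bufCount DMA buffers of bufLenFrames frames each.
// A larger value tolerates longer stalls before an underrun, at the cost of RAM.
uint32_t dmaBufferedMs(uint32_t bufCount, uint32_t bufLenFrames, uint32_t sampleRate) {
    return static_cast<uint32_t>(
        static_cast<uint64_t>(bufCount) * bufLenFrames * 1000u / sampleRate);
}

// Calls attempt() up to maxTries times, waiting between failures and doubling
// the wait each time, capped at maxDelayMs. sleepMs is injected so the helper
// can be tested off-device; on the ESP32 it could simply wrap delay().
bool retryWithBackoff(const std::function<bool()>& attempt,
                      const std::function<void(uint32_t)>& sleepMs,
                      int maxTries, uint32_t baseDelayMs, uint32_t maxDelayMs) {
    uint32_t wait = baseDelayMs < maxDelayMs ? baseDelayMs : maxDelayMs;
    for (int i = 0; i < maxTries; ++i) {
        if (attempt()) return true;          // success: stop retrying
        if (i + 1 < maxTries) {              // no point sleeping after the last try
            sleepMs(wait);
            wait = (wait > maxDelayMs / 2) ? maxDelayMs : wait * 2;
        }
    }
    return false;                            // every attempt failed
}
```

For example, 8 buffers of 256 frames at an assumed 16 kHz sample rate hold 128 ms of audio. For the network side, wrapping the STT or GPT request in a lambda that returns `true` on success and calling `retryWithBackoff(request, sleeper, 5, 500, 4000)` would wait 500 ms, 1 s, 2 s, and 4 s between the five attempts.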
Conclusion
This project integrates embedded audio I/O, cloud-based AI, and real-time web access into a user-friendly device, ideal for anyone looking to explore AI assistants on the go.