This ESP32 Voice Recorder Has No Buttons and No Touch – Here’s How It Works

Last Updated on May 19, 2026 by Engr. Shahzada Fahad

Can an ESP32 really work as a voice recorder?

In this article, I turn an ESP32 into a real, gesture-controlled voice recorder and show how it records audio and saves proper WAV files to an SD card, with no buttons and no touch controls at all.

Minimal ESP32 Voice Recorder setup with no touch controls or physical buttons

Normally, audio recording projects mean extra modules, complex wiring, and messy code. You need a separate microphone, an amplifier, power management, and even after all that, you still fight noise and stability issues.

But this board changes everything. Recording is controlled entirely by gestures, so the recorder responds naturally without any physical interaction.

Makerfabs MaTouch 3.16-inch TFT display with ESP32-S3 for building an ESP32 voice recorder

This is the Makerfabs MaTouch ESP32-S3 Parallel TFT 3.16-inch ST7701S development kit. It already includes a digital microphone, a powerful ESP32-S3, a large display, and proper power management, all in one clean design.

ESP32 Voice Recorder board with onboard components on Makerfabs MaTouch ESP32-S3 TFT module

No extra modules. No wiring mess. That’s why it’s perfect for building a gesture-controlled ESP32 voice recorder.

One thing I really appreciate about Makerfabs is that they don’t hide the design.

All the important connections are exposed and clearly laid out.

Wiring schematic of ESP32 Voice Recorder with MaTouch TFT display module

You can actually see how the microphone, display, power circuit, and ESP32 are connected.

This makes it much easier to learn, understand the signal flow, and eventually build your own designs from scratch. Don’t worry if you can’t see the connections clearly; this circuit diagram is available in the folder.

Portable ESP32 Voice Recorder with rechargeable lithium battery connection

Portability is another big deal for a real voice recorder.

This board supports a battery with built-in charging, so with a small Li-ion battery, the project becomes fully portable.

No power banks. No USB cables.

Just switch it on and record anywhere.

Close-up of INMP441 I2S MEMS microphone connected to ESP32 voice recorder circuit

The microphone used here is the INMP441 digital I2S microphone.

It’s one of the best choices for ESP32 audio projects.

Because it’s digital, the audio goes straight to the ESP32 over I2S, avoiding analog noise, unstable ADC readings, and signal issues.

The INMP441 isn’t meant for studio music production, and its sensitivity is fixed, but for speech, learning, and embedded voice recorders, it’s an excellent choice.

And the best part?

Everything is already integrated properly.

Microphone, display, power system, and controls are all ready to go.

That means less time fighting hardware and more time understanding how audio recording actually works on the ESP32.

If you want ready-to-use project folders and extra resources, you can support my work on Patreon.

With just a $1 subscription, you get access to all my premium projects.

ESP32 Voice Recorder Software Setup – Arduino IDE, Boards & Libraries

Before we start writing any code, let me quickly share my software setup so you can follow along without any issues.

Installing board packages and libraries for ESP32 Voice Recorder project

I am using Arduino IDE version 2.3.6 along with the ESP32 boards package version 2.0.11; if you want, you can also try 2.0.16. This setup works reliably with the ESP32-S3 and avoids compatibility problems.

For libraries, you will need two important ones.

The first is the QMI8658 library version 1.0.1, which is used for gesture and motion input on this board.

The second is the Arduino_GFX_Library. The video shows the installation, and the steps are as follows.

Go into the “lib” folder.

Installing Arduino GFX library for ESP32 voice recorder display on MaTouch TFT 3.16-inch board

From here, copy “GFX_Library_for_Arduino”… and paste it into your Arduino libraries folder.

If you already have an older GFX library installed, make sure you delete it first.

Otherwise, you may run into “multiple libraries found” errors during compilation, and that can cause unnecessary confusion.

Once you have the Arduino IDE installed, the ESP32 board package set correctly, and both of these libraries added, the software side is fully ready. At this point, we are officially prepared to move forward and start building the voice recorder.

Arduino IDE:

Arduino IDE 2.3.6

ESP32 Boards package:

ESP32 V2.0.11 or ESP32 V2.0.16

Libraries:

QMI8658 v1.0.1 library

GFX_Library_for_Arduino

JPEGDecoder v2.0.0 library

ESP32 Voice Recorder Full Code

#include <driver/i2s.h>
#include "FS.h"
#include "SD.h"
#include "SPI.h"
#include <Arduino_GFX_Library.h>
#include <QMI8658.h>

// ================= I2S MIC CONFIG =================
// 22050 Hz = AM Radio Quality. 
// Clear speech, but not "Studio" quality. Good for V1.
#define SAMPLE_RATE 22050 
#define I2S_BCLK    44
#define I2S_LRC     43
#define I2S_DIN      2


// ================= SD CARD ========================
#define SD_CS    42
#define SD_MOSI  40
#define SD_MISO  38
#define SD_SCK   39

// ================= TFT DISPLAY ====================
#define TFT_BLK  46
#define SCREEN_W 320
#define SCREEN_H 820

// ================= IMU (ACCEL) ====================
#define SDA 17
#define SCL 18

// ================= AUDIO BUFFERING ================
#define BUFFER_SIZE 1024 

// ================= SHAKE SETTINGS =================
// Asymmetric Thresholds: Easy to start, Harder to stop
#define SHAKE_THRESHOLD_START 1200   
#define SHAKE_THRESHOLD_STOP  1800 
#define SHAKE_DEBOUNCE        1000  

// ================= UI COLORS ======================
#define COLOR_BG      0x10A2 
#define COLOR_CARD    0x2124 
#define COLOR_ACCENT  0x035A 
#define COLOR_SUCCESS 0x0600 
#define COLOR_DANGER  0xF800 
#define COLOR_TEXT    0xFFFF 
#define COLOR_MUTED   0x8410 

// ================= TFT INIT =======================
Arduino_DataBus *bus = new Arduino_SWSPI(GFX_NOT_DEFINED, 45, 39, 40, GFX_NOT_DEFINED);
Arduino_ESP32RGBPanel *rgbpanel = new Arduino_ESP32RGBPanel(
    7,4,5,6,12,11,8,16,15,0,14,10,9,3,13,48,47,1,21,41,
    1,10,8,50,1,10,8,20
);
Arduino_RGB_Display *gfx = new Arduino_RGB_Display(
    SCREEN_W, SCREEN_H, rgbpanel, 0, true,
    bus, GFX_NOT_DEFINED,
    st7701_type9_init_operations,
    sizeof(st7701_type9_init_operations)
);

// ================= GLOBALS ========================
QMI8658 imu;
File recFile;
bool isRecording = false;
unsigned long lastShake = 0;
float lastAccelX = 0;
float lastAccelY = 0;
float lastAccelZ = 0;
uint32_t totalBytes = 0;
int recordingNumber = 1;
bool imuCalibrated = false;

// ================= UI HELPERS =====================
void centerPrint(String text, int y, int size, uint16_t color) {
  gfx->setTextSize(size);
  gfx->setTextColor(color);
  int16_t textWidth = text.length() * 6 * size; 
  int16_t x = (SCREEN_W - textWidth) / 2;
  if (x < 0) x = 0;
  gfx->setCursor(x, y);
  gfx->print(text);
}

void drawBaseUI() {
  gfx->fillScreen(COLOR_BG);
  gfx->fillRect(0, 0, SCREEN_W, 60, COLOR_ACCENT);
  centerPrint("VOICE RECORDER", 20, 3, COLOR_TEXT);
}

void drawStandbyUI() {
  drawBaseUI();
  gfx->fillRoundRect(20, 100, SCREEN_W - 40, 300, 15, COLOR_CARD);
  
  // Blue Circle
  gfx->fillCircle(SCREEN_W / 2, 220, 60, COLOR_ACCENT);
  gfx->fillCircle(SCREEN_W / 2, 220, 50, COLOR_CARD); 
  
  centerPrint("STANDBY", 320, 3, COLOR_TEXT);
  
  char nextFile[32];
  sprintf(nextFile, "Next: rec_%03d.wav", recordingNumber);
  centerPrint(nextFile, 360, 2, COLOR_MUTED);

  centerPrint("Shake to Record", 430, 2, COLOR_MUTED);
}

void drawRecordingUI() {
  drawBaseUI();
  gfx->fillRoundRect(20, 100, SCREEN_W - 40, 300, 15, COLOR_CARD);
  
  // Red circle (drawn once; this sketch does not animate it)
  gfx->fillCircle(SCREEN_W / 2, 220, 60, COLOR_DANGER);
  centerPrint("REC", 212, 2, COLOR_TEXT);
  
  centerPrint("RECORDING...", 320, 3, COLOR_TEXT);
  centerPrint("22kHz Quality", 360, 2, COLOR_MUTED);
  centerPrint("Shake to Stop", 430, 2, COLOR_MUTED);
}

void drawSavedUI(String filename) {
  drawBaseUI();
  gfx->fillRoundRect(20, 100, SCREEN_W - 40, 300, 15, COLOR_CARD);
  
  // Green Circle
  gfx->fillCircle(SCREEN_W / 2, 220, 60, COLOR_SUCCESS);
  
  centerPrint("SAVED!", 320, 3, COLOR_TEXT);
  centerPrint(filename, 360, 2, COLOR_MUTED);
  centerPrint("Shake to record again", 430, 2, COLOR_MUTED);
}

// ================= WAV HEADER =====================
void writeWavHeader(File &file, uint32_t dataSize) {
  uint32_t chunkSize = 36 + dataSize;
  uint16_t audioFormat = 1;
  uint16_t numChannels = 1;
  uint32_t sampleRate = SAMPLE_RATE;
  uint16_t bitsPerSample = 16;
  uint32_t byteRate = sampleRate * numChannels * bitsPerSample / 8;
  uint16_t blockAlign = numChannels * bitsPerSample / 8;

  file.seek(0);
  file.write((const uint8_t*)"RIFF", 4);
  file.write((const uint8_t*)&chunkSize, 4);
  file.write((const uint8_t*)"WAVE", 4);
  file.write((const uint8_t*)"fmt ", 4);
  uint32_t subChunk1Size = 16;
  file.write((const uint8_t*)&subChunk1Size, 4);
  file.write((const uint8_t*)&audioFormat, 2);
  file.write((const uint8_t*)&numChannels, 2);
  file.write((const uint8_t*)&sampleRate, 4);
  file.write((const uint8_t*)&byteRate, 4);
  file.write((const uint8_t*)&blockAlign, 2);
  file.write((const uint8_t*)&bitsPerSample, 2);
  file.write((const uint8_t*)"data", 4);
  file.write((const uint8_t*)&dataSize, 4);
}

// ================= MIC INIT =======================
void MicInit() {
  i2s_config_t cfg = {
    .mode = (i2s_mode_t)(I2S_MODE_MASTER | I2S_MODE_RX),
    .sample_rate = SAMPLE_RATE,
    .bits_per_sample = I2S_BITS_PER_SAMPLE_16BIT,
    .channel_format = I2S_CHANNEL_FMT_ONLY_RIGHT,
    .communication_format = I2S_COMM_FORMAT_I2S,
    .intr_alloc_flags = ESP_INTR_FLAG_LEVEL1,
    .dma_buf_count = 8,      
    .dma_buf_len = 512,      
    .use_apll = true         
  };

  i2s_pin_config_t pin = {
    .bck_io_num = I2S_BCLK,
    .ws_io_num  = I2S_LRC,
    .data_out_num = I2S_PIN_NO_CHANGE,
    .data_in_num  = I2S_DIN
  };

  i2s_driver_install(I2S_NUM_0, &cfg, 0, NULL);
  i2s_set_pin(I2S_NUM_0, &pin);
}

// ================= START RECORD ===================
void startRecording() {
  char filename[32];
  sprintf(filename, "/rec_%03d.wav", recordingNumber);
  
  // Smart Indexing Check
  while(SD.exists(filename)) {
    recordingNumber++;
    sprintf(filename, "/rec_%03d.wav", recordingNumber);
  }

  recFile = SD.open(filename, FILE_WRITE);
  if(!recFile){
    drawBaseUI();
    centerPrint("ERROR", 200, 4, COLOR_DANGER);
    centerPrint("SD Card Write Failed", 250, 2, COLOR_TEXT);
    return;
  }

  for (int i=0;i<44;i++) recFile.write((uint8_t)0); // reserve header
  totalBytes = 0;
  isRecording = true;
  lastShake = millis(); // Reset debounce

  MicInit();
  drawRecordingUI();
}

// ================= STOP RECORD ====================
void stopRecording() {
  isRecording = false;
  i2s_driver_uninstall(I2S_NUM_0);
  writeWavHeader(recFile, totalBytes);
  recFile.close();

  char filename[32];
  sprintf(filename, "rec_%03d.wav", recordingNumber);
  drawSavedUI(String(filename));
  
  recordingNumber++; // Prepare for next
  lastShake = millis(); // Reset debounce
}

// ================= SETUP ==========================
void setup() {
  Serial.begin(115200);

  pinMode(TFT_BLK, OUTPUT);
  digitalWrite(TFT_BLK, LOW);
  

  gfx->begin();
  drawBaseUI();
  centerPrint("BOOTING...", 200, 2, COLOR_MUTED);

  SPI.begin(SD_SCK, SD_MISO, SD_MOSI, SD_CS);
  if (!SD.begin(SD_CS)) {
    centerPrint("SD FAIL", 250, 3, COLOR_DANGER);
    while(1);
  }

  // === SMART INDEXING ===
  drawBaseUI();
  centerPrint("INDEXING...", 200, 2, COLOR_MUTED);
  while(true) {
    char testName[32];
    sprintf(testName, "/rec_%03d.wav", recordingNumber);
    if (!SD.exists(testName)) {
      break; 
    }
    recordingNumber++;
  }

  if (!imu.begin(SDA, SCL)) {
    centerPrint("IMU FAIL", 250, 3, COLOR_DANGER);
    while(1);
  }

  imu.setAccelRange(QMI8658_ACCEL_RANGE_2G);
  imu.setAccelODR(QMI8658_ACCEL_ODR_500HZ);
  imu.setAccelUnit_mg(true);
  imu.enableSensors(QMI8658_ENABLE_ACCEL);

  drawBaseUI();
  centerPrint("Calibrating...", 200, 2, COLOR_MUTED);
  delay(1500);
  QMI8658_Data d;
  if (imu.readSensorData(d)) {
    lastAccelX = d.accelX;
    lastAccelY = d.accelY;
    lastAccelZ = d.accelZ;
  }
  imuCalibrated = true;

  drawStandbyUI();
}

// ================= LOOP ===========================
void loop() {
  uint8_t buffer[BUFFER_SIZE];
  size_t bytesRead;

  // ---------- RECORD DATA ----------
  if (isRecording) {
    // RAW Recording: No DSP, No Gain, No Loop Processing
    i2s_read(I2S_NUM_0, buffer, BUFFER_SIZE, &bytesRead, portMAX_DELAY);
    if (bytesRead > 0) {
      recFile.write(buffer, bytesRead);
      totalBytes += bytesRead;
    }
  }

  // ---------- SHAKE DETECT ----------
  if (imuCalibrated) {
    QMI8658_Data d;
    if (imu.readSensorData(d)) {
      float delta = abs(d.accelX - lastAccelX) + abs(d.accelY - lastAccelY) + abs(d.accelZ - lastAccelZ);
      lastAccelX = d.accelX;
      lastAccelY = d.accelY;
      lastAccelZ = d.accelZ;

      // START: Standard Threshold
      // STOP: Higher Threshold (Prevents accidental stops)
      int currentThreshold = isRecording ? SHAKE_THRESHOLD_STOP : SHAKE_THRESHOLD_START;

      if (delta > currentThreshold && millis() - lastShake > SHAKE_DEBOUNCE) {
        lastShake = millis();
        if (!isRecording) startRecording();
        else stopRecording();
      }
    }
  }
}

This code turns the ESP32-S3 into a complete, gesture-controlled voice recorder. Audio from the digital I2S microphone is captured and written directly to the SD card in standard WAV format, so every recording can be played on practically any device without conversion. The ESP32 handles this in real time: the I2S driver buffers incoming samples in the background while the main loop writes them to the card in 1024-byte blocks.
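To make the “standard WAV format” part concrete, here is a minimal sketch of the 44-byte header that the sketch's writeWavHeader() produces. It is written as plain C++ with no Arduino dependencies so it can be checked on a PC. The names buildWavHeader and putLE are illustrative and are not part of the original code; the field values follow the canonical PCM WAV layout with the same settings the project uses (mono, 16-bit, 22050 Hz).

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Hypothetical helper: store `value` as `bytes` little-endian bytes at `dst`.
static void putLE(uint8_t *dst, uint32_t value, int bytes) {
  for (int i = 0; i < bytes; i++) {
    dst[i] = static_cast<uint8_t>((value >> (8 * i)) & 0xFF);
  }
}

// Build a canonical 44-byte PCM WAV header for `dataSize` bytes of audio.
// Defaults match the sketch: 22050 Hz, mono, 16-bit.
std::array<uint8_t, 44> buildWavHeader(uint32_t dataSize,
                                       uint32_t sampleRate = 22050,
                                       uint16_t numChannels = 1,
                                       uint16_t bitsPerSample = 16) {
  std::array<uint8_t, 44> h{};
  const uint16_t blockAlign =
      static_cast<uint16_t>(numChannels * bitsPerSample / 8);
  const uint32_t byteRate = sampleRate * blockAlign;

  std::memcpy(&h[0], "RIFF", 4);
  putLE(&h[4], 36 + dataSize, 4);   // file size minus the first 8 bytes
  std::memcpy(&h[8], "WAVE", 4);
  std::memcpy(&h[12], "fmt ", 4);
  putLE(&h[16], 16, 4);             // size of the fmt chunk for PCM
  putLE(&h[20], 1, 2);              // audio format 1 = uncompressed PCM
  putLE(&h[22], numChannels, 2);
  putLE(&h[24], sampleRate, 4);
  putLE(&h[28], byteRate, 4);       // bytes of audio per second
  putLE(&h[32], blockAlign, 2);     // bytes per sample frame
  putLE(&h[34], bitsPerSample, 2);
  std::memcpy(&h[36], "data", 4);
  putLE(&h[40], dataSize, 4);       // number of audio bytes that follow
  return h;
}
```

For one second of audio, buildWavHeader(44100) stores 44100 in the data-size field and 44136 in the RIFF size field. The original sketch writes the variables' raw memory instead of packing bytes by hand, which gives the same result on the ESP32 because it is a little-endian CPU. It also reserves the 44 bytes when recording starts and fills them in on stop, since the data size is only known at the end.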

The TFT display gives clear visual feedback at every stage. In standby, the system waits for input. When recording starts, the screen switches to a bold red “REC” indicator, and when recording stops, it turns green and confirms that the file has been saved. You always know exactly what the recorder is doing.
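The three screens can be read as a tiny state machine. The sketch below is one way to model it in plain C++; the Screen enum and the functions onShake and indicatorColor are hypothetical names introduced for illustration, while the color values are the RGB565 constants from the sketch. It does not model the SD error screen.

```cpp
#include <cstdint>

enum class Screen { Standby, Recording, Saved };

// RGB565 colors taken from the sketch's #defines.
constexpr uint16_t kColorAccent  = 0x035A;  // blue ring in standby
constexpr uint16_t kColorDanger  = 0xF800;  // red circle while recording
constexpr uint16_t kColorSuccess = 0x0600;  // green circle after saving

// Screen shown after a shake: a shake while recording saves the file,
// a shake on any other screen starts a new recording.
constexpr Screen onShake(Screen current) {
  return current == Screen::Recording ? Screen::Saved : Screen::Recording;
}

// Color of the large status circle on each screen.
constexpr uint16_t indicatorColor(Screen s) {
  return s == Screen::Recording ? kColorDanger
       : s == Screen::Saved     ? kColorSuccess
                                : kColorAccent;
}
```

Starting from Screen::Standby, repeated shakes cycle Recording, Saved, Recording, and so on, which matches the “Shake to record again” prompt on the saved screen.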

Instead of buttons, the QMI8658 motion sensor is used for control. After a short calibration at startup, a simple shake gesture starts or stops recording. This makes the whole system fast, intuitive, and completely touch-free.
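The shake logic is easier to follow when it is pulled out of loop(). The following is a minimal sketch in plain C++ that mirrors it: the change between two consecutive accelerometer readings is summed over all three axes, compared against a start or stop threshold, and gated by a debounce time. The ShakeDetector class and the Accel struct are hypothetical wrappers, not part of the original code; the thresholds and debounce value are the sketch's #defines, and time is passed in so the logic can run without millis().

```cpp
#include <cmath>
#include <cstdint>

// Values copied from the sketch's #defines (accelerometer units: mg, time: ms).
constexpr float kStartThresholdMg = 1200.0f;
constexpr float kStopThresholdMg  = 1800.0f;
constexpr uint32_t kDebounceMs    = 1000;

struct Accel { float x, y, z; };

// Hypothetical wrapper around the shake logic found in loop().
class ShakeDetector {
 public:
  explicit ShakeDetector(Accel baseline) : last_(baseline) {}

  // Feed one reading. Returns true when the shake should toggle recording.
  bool update(Accel now, uint32_t nowMs, bool isRecording) {
    // Sum of absolute per-axis changes since the previous reading.
    const float delta = std::fabs(now.x - last_.x) +
                        std::fabs(now.y - last_.y) +
                        std::fabs(now.z - last_.z);
    last_ = now;

    // Stopping needs a harder shake than starting.
    const float threshold = isRecording ? kStopThresholdMg : kStartThresholdMg;
    if (delta > threshold && nowMs - lastShakeMs_ > kDebounceMs) {
      lastShakeMs_ = nowMs;
      return true;
    }
    return false;
  }

 private:
  Accel last_;
  uint32_t lastShakeMs_ = 0;
};
```

With a baseline of {0, 0, 1000} mg, a small wobble of about 20 mg is ignored, a jump of roughly 1300 mg starts a recording, and a second strong shake only 300 ms later is rejected by the debounce. The higher stop threshold is what keeps normal handling from ending a recording by accident. In the real sketch, startRecording() and stopRecording() also reset the debounce timer after they finish.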

How to Upload Code to ESP32-S3 Voice Recorder (Correct Settings)

To upload the program, here is what you need to do:

First, go to Tools > Board > ESP32 and select ESP32S3 Dev Module.

Next, go to Tools > Port and choose the correct communication port.

Then go to Tools > Flash Size and select 16MB.

After that, go to Tools > Partition Scheme and select 16M Flash.

Finally, go to Tools > PSRAM and select OPI PSRAM.

Once these settings are done, you can simply click the Upload button.

Uploading program to MaTouch ESP32-S3 parallel TFT 3.16-inch board for voice recorder firmware

As you can see, the program has been successfully uploaded.

Real ESP32 Voice Recorder Demo – Shake to Record, Save & Play Audio

As soon as we flip the power switch, the system takes about two seconds to initialize.

Calibrating IMU and gyroscope sensors for gesture-controlled ESP32 voice recorder project

During this time, the accelerometer settles and the sketch stores a baseline reading that later movement is compared against. Pro tip here: keep the device completely still during this phase. If you move it too much, the shake detection can become a little too sensitive later.

Once the calibration is done, the system enters standby mode.

ESP32 Voice Recorder running in low power standby mode on TFT display

At this point, it’s fully armed and ready to go. Now, instead of fumbling around with tiny buttons, all you need to do is give the device a quick shake to start recording.

ESP32 Voice Recorder quick shake feature with accelerometer sensor

The moment recording starts, the interface changes instantly.

Interactive UI design for ESP32 Voice Recorder with touchscreen display

A big red circle appears on the screen with the word “REC” stamped right in the center. You simply cannot miss it. That large recording indicator makes it very clear that audio is being captured, even if the recorder is placed a few feet away from you.

Sound recording storage process in ESP32 Voice Recorder using SD card

When you are done recording, just shake the device again. The recording stops immediately, the screen turns green, and a clear “SAVED!” message appears.

SD card audio storage feature in ESP32 Voice Recorder setup

The file name is shown right there on the display, so there’s no guessing and no risk of losing your audio.
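Recordings are not overwritten because the sketch looks for the first unused rec_NNN.wav name before it opens a file. Here is a minimal sketch of that search in plain C++. The function name nextFreeRecording is hypothetical, and the exists callback is a stand-in for SD.exists() so the logic can be tried without a card.

```cpp
#include <cstdio>
#include <functional>
#include <string>

// Return the first "/rec_NNN.wav" name, counting up from `index`, that is
// not present yet. `index` is updated to the number that was chosen.
// `exists` is a hypothetical stand-in for SD.exists().
std::string nextFreeRecording(
    int &index, const std::function<bool(const std::string &)> &exists) {
  char name[32];
  std::snprintf(name, sizeof(name), "/rec_%03d.wav", index);
  while (exists(name)) {
    index++;
    std::snprintf(name, sizeof(name), "/rec_%03d.wav", index);
  }
  return name;
}
```

If the card already holds /rec_001.wav and /rec_002.wav, a search that starts at 1 returns /rec_003.wav. Because the search stops at the first missing number, deleting a file in the middle of the sequence means that a search reaching that number will reuse it.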

Before we play the recording, let me set the right expectations.

This project uses very basic audio settings: a buffer size of 1024 bytes, 16-bit audio, and a 22.05 kHz sample rate. That puts us around AM-radio to podcast-draft quality: clear enough for speech, but definitely not studio grade. And that’s completely by design.
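These settings translate into concrete numbers. The short C++ sketch below works them out at compile time; the constant and function names are made up for illustration, and the values come from the sketch (SAMPLE_RATE 22050, 16-bit mono, BUFFER_SIZE 1024).

```cpp
#include <cstdint>

constexpr uint32_t kSampleRateHz  = 22050;  // SAMPLE_RATE in the sketch
constexpr uint32_t kBitsPerSample = 16;
constexpr uint32_t kChannels      = 1;      // mono
constexpr uint32_t kBufferBytes   = 1024;   // BUFFER_SIZE in the sketch

// Bytes of audio produced per second of recording.
constexpr uint32_t bytesPerSecond() {
  return kSampleRateHz * kChannels * kBitsPerSample / 8;
}

// Number of 16-bit samples that fit in one read buffer.
constexpr uint32_t samplesPerBuffer() {
  return kBufferBytes / (kBitsPerSample / 8);
}

// Duration of one full buffer in microseconds (rounded down).
constexpr uint32_t bufferDurationUs() {
  return static_cast<uint32_t>(1000000ULL * samplesPerBuffer() / kSampleRateHz);
}

// Size of the WAV file for `seconds` of audio, including the 44-byte header.
constexpr uint64_t wavFileBytes(uint32_t seconds) {
  return 44ULL + static_cast<uint64_t>(bytesPerSecond()) * seconds;
}

static_assert(bytesPerSecond() == 44100, "22050 Hz mono 16-bit is 44100 B/s");
```

So the recorder produces 44,100 bytes per second, each 1024-byte buffer holds 512 samples or about 23 ms of sound, and a one-minute recording comes to roughly 2.6 MB.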

Right now, the goal is not to chase perfect sound. The goal is to build a reliable, gesture-controlled voice recorder and really understand the fundamentals: how the ESP32 captures digital audio, how that data is written cleanly to an SD card, and how motion control replaces physical buttons. This is version one, and it’s doing exactly what it’s meant to do.

Alright, with that out of the way, let’s play the recording. For the practical demonstration, watch the video tutorial.

Alright, so what you are hearing right now is coming directly from the ESP32 voice recorder. This is raw audio: 16-bit, 22 kilohertz, written straight to the SD card.

So yes, it sounds a bit thin, and it doesn’t have that studio warmth or punch.

And that’s completely expected.

This version is focused on functionality, not perfection.

It’s here to prove that the ESP32 can record audio reliably, save proper WAV files, and can be controlled entirely using gestures.

In the next video and article, we are taking a big step forward. I will be building a true studio-style voice recorder specifically for voiceovers.

That means proper DSP processing, digital gain control, limiters, noise handling, and all the techniques that actually shape and polish audio. If sound quality matters to you, that’s a video you absolutely don’t want to miss.

So, that’s all for now.

Watch Video Tutorial:

ESP32 Voice Recorder | No Buttons, No Touch

Frequently Asked Questions

Can an ESP32 really work as a voice recorder?

Yes, the ESP32-S3 can be used as a complete voice recorder, capable of capturing audio and saving it as WAV files on an SD card.

How does this ESP32 voice recorder work without buttons or touch?

This project uses gesture control: a simple shake starts and stops recording, so no physical buttons are needed.

What is the shake-to-record feature in this ESP32 project?

The shake feature uses motion detection to trigger recording, so the recorder needs no buttons at all.

Which ESP32 board is used in this voice recorder project?

The project uses the Makerfabs MaTouch ESP32-S3 TFT development board, which integrates display, power, and audio support.

What microphone is used in this ESP32 audio recording system?

The project uses the INMP441 I2S microphone, which provides clean digital audio input.

Why is an I2S microphone used in ESP32 audio recording?

An I2S microphone like the INMP441 sends digital audio directly to the ESP32, reducing noise and improving stability.

Does the ESP32 save WAV files directly?

Yes, the sketch writes a standard WAV header along with the audio data, so properly formatted files are stored on the SD card.

Where are audio files stored in this ESP32 voice recorder?

All recordings are saved on the SD card as rec_001.wav, rec_002.wav, and so on.

What format is used for recorded audio files?

The system saves audio in standard WAV format, so files can be played on any device without conversion.

What sensor is used for gesture detection?

The QMI8658 accelerometer/IMU is used to detect shaking for controlling the ESP32 voice recorder without buttons.

What libraries are required for ESP32 audio recording?

This ESP32 audio recording project uses:

QMI8658 library
Arduino_GFX_Library
JPEGDecoder
SD and SPI libraries

What Arduino IDE version is used in this project?

The project uses Arduino IDE 2.3.6 for programming the ESP32.

Which ESP32 board package is required?

You need ESP32 board package v2.0.11 or v2.0.16 for stable compilation.

What sample rate is used in ESP32 audio recording?

The system uses a 22.05 kHz sample rate, which gives clear speech quality suitable for voice recording.

What type of ESP32 project is this?

This is a gesture-controlled ESP32 voice recorder project with SD card storage and I2S audio input.

Can ESP32 handle real-time audio recording?

Yes, the ESP32 can handle real-time audio recording by streaming I2S data to SD card storage.
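How much slack the recorder has depends on the I2S DMA buffers, which keep collecting samples while the SD card is busy. The following is a rough estimate in plain C++, not a measurement. It assumes that dma_buf_len is counted in frames and that one frame is a single mono 16-bit sample (2 bytes); both assumptions should be checked against the ESP-IDF documentation for the core version in use. The function names are hypothetical.

```cpp
#include <cstdint>

constexpr uint32_t kDmaBufCount   = 8;          // dma_buf_count in MicInit()
constexpr uint32_t kDmaBufLen     = 512;        // dma_buf_len in MicInit(), assumed to be frames
constexpr uint32_t kBytesPerFrame = 2;          // assumption: one mono 16-bit sample per frame
constexpr uint32_t kBytesPerSec   = 22050 * 2;  // 22050 Hz, 16-bit mono

// Total audio the DMA ring can hold before old data is overwritten.
constexpr uint32_t dmaCapacityBytes() {
  return kDmaBufCount * kDmaBufLen * kBytesPerFrame;
}

// Longest pause in milliseconds (rounded down) that the ring can absorb.
constexpr uint32_t dmaHeadroomMs() {
  return dmaCapacityBytes() * 1000 / kBytesPerSec;
}

// True if an SD write stall of `stallMs` should fit inside the headroom.
constexpr bool stallIsSafe(uint32_t stallMs) {
  return stallMs <= dmaHeadroomMs();
}
```

Under these assumptions the ring holds 8192 bytes, or about 185 ms of audio. An SD write that occasionally takes 100 ms would be absorbed, while a 250 ms stall would likely drop samples. Raising dma_buf_count is the usual way to buy more margin.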

Is this project suitable for beginners?

It is an intermediate-level project because it combines an I2S microphone, SD card storage, an IMU sensor, and a display UI.

What makes this ESP32 voice recorder different?

It has:

No buttons
No touch input
Shake control
SD card WAV recording
Integrated display UI
