Lab 7: Build a Voice Changer/Voice Shield#
Last updated 8/12/24
00. Content #
Mathematics#
N/A
Programming Skills#
flow structures (try/finally, for/else)
Timing conflicts
Buffers
Embedded Systems#
N/A
0. Required Hardware #
Headphones
Write your name and email below:
Name: me
Email: me @purdue.edu
import numpy as np
import matplotlib.pyplot as plt
1. Introduction #
In a previous lab, we learned about modulating signals, and in a later one, we learned how to capture and manipulate audio using Python. We now combine the ideas from those labs to create a real-time voice changer. If processing each block of audio takes more than a few milliseconds, we will be forced to either skip sections of audio or accept a substantial lag between input and output. In this lab, we will explore ways to minimize latency in producing the modulated output in order to process the audio in real time. But first, let us look at audio recording and playback with Python.
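To make the timing constraint concrete: when audio is handled in fixed-size blocks, each block represents a fixed span of time, and processing one block has to finish before the next one arrives. The sketch below only does that arithmetic. The block sizes and the 44100 Hz sample rate are assumed values for illustration, not values taken from the lab scripts.

```python
def chunk_duration_ms(frames_per_chunk: int, sample_rate_hz: int) -> float:
    """Return how many milliseconds of audio one chunk of frames represents."""
    return 1000.0 * frames_per_chunk / sample_rate_hz


# Hypothetical chunk sizes and sample rate, chosen only for illustration.
for frames in (256, 1024, 4096):
    duration = chunk_duration_ms(frames, 44100)
    print(f"{frames:5d} frames at 44100 Hz = {duration:.1f} ms")
```

Smaller chunks leave less time to process each block but add less delay; larger chunks do the opposite.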
2. Audio Recorder #
We will be using PyAudio to build our voice changer. To explore PyAudio and its features in more detail, refer to the PyAudio API documentation.
To record audio with Python, we do the following:
Open a data stream to get audio data frames from the microphone.
Iterate over the stream and append each frame to a list of frames.
Stop and close the data stream.
Save the data frames as a .wav file.
Close and terminate the audio stream.
The script voice_recorder.py follows the pseudocode above. It records 5 s of mono audio and saves it as ‘my_recording.wav’.
Let us go through the code in voice_recorder.py.
p = pyaudio.PyAudio()
stream = p.open(format = FORMAT,
                channels = CHANNELS,
                rate = RATE,
                input = True,
                frames_per_buffer = chunk)
Here we initialize an audio stream. Because input is set to True, PyAudio opens it as an input stream that reads from the default input device, and frames_per_buffer sets the number of frames per buffer.
for i in range(0, RATE//chunk * RECORD_SECONDS):
    frame = stream.read(chunk, exception_on_overflow = False)
    frames.append(frame)
This is where we read chunk frames at a time from the input audio stream. Each block of frames is appended to a list, which can then be saved as a WAV file.
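The loop count and the final save step can be sketched without a microphone. This is only an illustration: the constants are assumed values (the real ones are defined in voice_recorder.py), the helper name save_frames_as_wav is hypothetical, and chunks of silence stand in for the data that stream.read() would return.

```python
import os
import tempfile
import wave

# Assumed values for illustration; the real ones are set in voice_recorder.py.
RATE = 44100          # frames per second
CHUNK = 1024          # frames read per call to stream.read()
RECORD_SECONDS = 5
CHANNELS = 1          # mono
SAMPLE_WIDTH = 2      # bytes per sample, i.e. 16-bit audio

# Integer division drops the final partial chunk of each second.
num_reads = RATE // CHUNK * RECORD_SECONDS
recorded_seconds = num_reads * CHUNK / RATE


def save_frames_as_wav(frames, path, channels=CHANNELS,
                       sample_width=SAMPLE_WIDTH, rate=RATE):
    """Write a list of raw byte chunks to a WAV file (hypothetical helper)."""
    with wave.open(path, "wb") as wf:
        wf.setnchannels(channels)
        wf.setsampwidth(sample_width)
        wf.setframerate(rate)
        wf.writeframes(b"".join(frames))


# Stand-in for the recorded data: num_reads chunks of silence.
silent_chunk = b"\x00" * (CHUNK * CHANNELS * SAMPLE_WIDTH)
frames = [silent_chunk for _ in range(num_reads)]

out_path = os.path.join(tempfile.gettempdir(), "example_silence.wav")
save_frames_as_wav(frames, out_path)
print(num_reads, "reads,", round(recorded_seconds, 3), "seconds")
```

With these assumed values the recording comes out slightly shorter than 5 s, because RATE is not a whole multiple of the chunk size.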
Exercise 1#
What role does exception_on_overflow = False play in frame = stream.read(chunk, exception_on_overflow = False) in voice_recorder.py? Run the code after setting exception_on_overflow = True and report your observations. What does the result indicate?
2. Audio Player #
The idea behind an audio player mirrors that of an audio recorder. Since the audio file is already saved in some format, a naive approach would be to load the audio as a single NumPy array and play it using the Audio() function.
But this approach is not optimal when dealing with huge files.
A better approach is to read small chunks from the audio file one at a time, so that we do not exhaust system memory.
The pseudocode for audio playback using Python and PyAudio is as follows:
Initialize and open an output audio stream.
Iterate and read frames of data from the audio file.
Write the data frames to the output stream.
Close and terminate the audio stream.
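The steps above can be sketched as follows. This is a minimal illustration, not the contents of audio_player.py: the function names iter_wav_chunks and play_wav are hypothetical, and the chunk size of 1024 frames is an assumed value. The demo at the end only exercises the chunked reading, so it runs without PyAudio or a sound card.

```python
import os
import tempfile
import wave


def iter_wav_chunks(path, chunk=1024):
    """Yield successive blocks of at most `chunk` frames from a WAV file."""
    with wave.open(path, "rb") as wf:
        data = wf.readframes(chunk)
        while data:
            yield data
            data = wf.readframes(chunk)


def play_wav(path, chunk=1024):
    """Play a WAV file through the default output device, chunk by chunk."""
    import pyaudio  # imported here so the reader above works without PyAudio

    with wave.open(path, "rb") as wf:
        width = wf.getsampwidth()
        channels = wf.getnchannels()
        rate = wf.getframerate()

    p = pyaudio.PyAudio()
    try:
        stream = p.open(format=p.get_format_from_width(width),
                        channels=channels, rate=rate, output=True)
        try:
            for data in iter_wav_chunks(path, chunk):
                stream.write(data)
        finally:
            stream.stop_stream()
            stream.close()
    finally:
        p.terminate()


# Demo without audio hardware: write a short silent file, read it back in chunks.
demo_path = os.path.join(tempfile.gettempdir(), "chunk_demo.wav")
with wave.open(demo_path, "wb") as wf:
    wf.setnchannels(1)
    wf.setsampwidth(2)
    wf.setframerate(8000)
    wf.writeframes(b"\x00\x00" * 2500)

chunk_sizes = [len(data) for data in iter_wav_chunks(demo_path, chunk=1024)]
print(chunk_sizes)

# play_wav("my_recording.wav")  # uncomment on a machine with an audio output
```

Only one chunk is held in memory at a time, which is the point of reading the file piecewise instead of loading it whole.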
The script audio_player.py follows the pseudocode above.
Exercise 2#
Record a 10s audio clip using voice_recorder.py and then play the 10s audio using audio_player.py. Report your observations.
3. Megaphone - a Pythonic Approach #
A megaphone is a portable, cone-shaped device used to amplify a person’s voice. It is a simple device that amplifies input sounds in real time. In this exercise, we will use Python to implement a megaphone in real time.
To implement such a device, we follow these steps:
Initialize and open an input and output stream.
Read frames from the input stream.
Write the frames from the input stream to the output stream.
Close and terminate the data streams.
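One way these steps might look in code is sketched below. It is not the contents of Megaphone.py: the function names copy_chunks and run_megaphone are hypothetical, and the format, sample rate, chunk size, and duration are assumed values. The try/finally block ensures the streams are closed even if the loop is interrupted.

```python
def copy_chunks(in_stream, out_stream, chunk, num_chunks):
    """Read `num_chunks` blocks from in_stream and write each one to out_stream."""
    for _ in range(num_chunks):
        data = in_stream.read(chunk, exception_on_overflow=False)
        out_stream.write(data)


def run_megaphone(seconds=5, rate=44100, chunk=1024):
    """Open one input and one output stream and pass audio straight through."""
    import pyaudio  # needs PyAudio and working audio devices

    p = pyaudio.PyAudio()
    in_stream = out_stream = None
    try:
        in_stream = p.open(format=pyaudio.paInt16, channels=1, rate=rate,
                           input=True, frames_per_buffer=chunk)
        out_stream = p.open(format=pyaudio.paInt16, channels=1, rate=rate,
                            output=True, frames_per_buffer=chunk)
        copy_chunks(in_stream, out_stream, chunk, rate // chunk * seconds)
    finally:
        # Runs even if the loop is interrupted, so the devices are released.
        for stream in (in_stream, out_stream):
            if stream is not None:
                stream.stop_stream()
                stream.close()
        p.terminate()


# run_megaphone()  # uncomment on a machine with a microphone and headphones
```

Keeping the copy loop separate from the stream setup means it can be tried with any pair of objects that offer read() and write(), not only real audio devices.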
Megaphone.py is implemented following the steps above.
Exercise 3#
Run Megaphone.py and report your observations. How well does this program perform when you are not using headphones?
4. Voice Changer #
Voice changers are one of those gadgets that you might have seen in spy movies or as a part of Halloween costumes.
When you speak, your throat produces a vibration that travels as a wave. Essentially, your voice is a sound wave. A voice changer alters the shape of this sound wave. This can be done in multiple ways.
One way to implement a voice changer is to find the frequency components of your voice and replace them with components at other frequencies. However, this is computationally expensive and may require specialized hardware to run in real time.
A simpler way is to multiply the received audio signal by a wave of known frequency. This shifts the frequency components of the signal, altering your voice.
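To see why multiplication shifts frequencies, consider a single component of the voice at frequency f and, as an assumption for illustration only, a cosine wave at frequency f_c as the multiplying wave (the lab script may use a different shape). The product-to-sum identity gives:

```latex
\cos(2\pi f t)\,\cos(2\pi f_c t)
  = \tfrac{1}{2}\Bigl[\cos\bigl(2\pi (f - f_c)\, t\bigr) + \cos\bigl(2\pi (f + f_c)\, t\bigr)\Bigr]
```

Under this assumption, each component at f is replaced by two components, one at f - f_c and one at f + f_c.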
To implement such a device, we will do the following:
Initialize and open an input and output stream.
Read frames from the input stream.
Multiply each audio frame by a wave of known frequency.
Write these modified audio frames to the output stream.
Close and terminate the data streams.
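The multiplication step can be tried on synthetic data without any audio hardware. The sketch below is an illustration under stated assumptions, not the contents of Real_time_voice_changer.py: the function name modulate_chunk is hypothetical, the multiplying wave is assumed to be a sine, the audio is assumed to be 16-bit mono, and the 8000 Hz rate, 1000 Hz test tone, and 300 Hz wave are arbitrary demo values.

```python
import numpy as np


def modulate_chunk(data, carrier_hz, rate, start_frame):
    """Multiply one chunk of 16-bit mono audio by a sine wave.

    `start_frame` is the index of the chunk's first sample in the whole
    stream, so the sine wave's phase stays continuous from chunk to chunk.
    """
    samples = np.frombuffer(data, dtype=np.int16).astype(np.float64)
    t = (start_frame + np.arange(samples.size)) / rate
    carrier = np.sin(2 * np.pi * carrier_hz * t)
    # |carrier| <= 1, so the product stays inside the int16 range.
    return np.round(samples * carrier).astype(np.int16).tobytes()


# Demo on synthetic data: a 1000 Hz tone multiplied by a 300 Hz sine wave.
RATE, CHUNK = 8000, 800
n = np.arange(RATE)  # one second of sample indices
tone = (10000 * np.cos(2 * np.pi * 1000 * n / RATE)).astype(np.int16)

out = b"".join(
    modulate_chunk(tone[i:i + CHUNK].tobytes(), 300, RATE, i)
    for i in range(0, tone.size, CHUNK)
)

# The signal is exactly one second long, so FFT bin k corresponds to k Hz.
spectrum = np.abs(np.fft.rfft(np.frombuffer(out, dtype=np.int16)))
peaks = sorted(int(k) for k in np.argsort(spectrum)[-2:])
print(peaks)
```

In this demo the two strongest bins should land at 700 Hz and 1300 Hz, that is, the tone frequency minus and plus the frequency of the multiplying wave. In a real-time loop, the same function would be applied to each chunk between the read and the write, with start_frame advanced by the chunk size each time.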
Real_time_voice_changer.py is implemented using this approach.
Exercise 4#
Run Real_time_voice_changer.py and report your observations. Identify the shape and frequency of the wave used in altering the input.
Exercise 5#
Change the shape of the waveform used in Real_time_voice_changer.py. Report your code and observations.
Exercise 6#
How does Real_time_voice_changer.py perform when you:
increase the frequency of the input altering wave?
decrease the frequency of the input altering wave?
Reflection #
Do not skip this section! Lab will be graded only on completion of this section.
1. What parts of the lab, if any, do you feel you did well?
2. What are some things you learned today?
3. Are there any topics that could use more clarification?
4. Do you have any suggestions on parts of the lab to improve?