Rade Kutil - VO+PS Audio Processing (SS15)

Documents for the VO

script

list of questions

PS-Exercises

Here is some guitar sound to use as test input, and also some speech sound.

Implement the bandpass filter with configurable $f_c$ and $f_d$.
Implement a two-channel equalizer with two peak filters using the above bandpass filter. Modulate the gain parameters via an LFO (low-frequency oscillator), i.e. a sine function at, say, 2 Hz.
Implement a phaser with only one allpass. Modulate $f_c$ with a low-frequency oscillator.
Extend the phaser to four allpasses with separate parameters for each allpass. Modulate the $f_c$ parameters independently with low frequencies that are not harmonically related. Also try to implement the feedback loop. The result should sound like this.
Implement a 4-fold Wah-Wah effect. Modulate $f_c$ with a low-frequency oscillator, and calculate $f_d$ in a constant-q manner. The result should sound like this.
Generate a 5-second sine tone and resample it to a slightly different sampling rate using linear, Lanczos, and allpass interpolation. Listen for audible differences at high frequencies.
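As a starting point for the linear case only, here is a minimal Python sketch (standard library only). The function names `sine_tone` and `resample_linear` and the `ratio` parameter (new rate divided by old rate) are illustrative assumptions, not taken from the script. Lanczos and allpass interpolation are not covered here.

```python
import math

def sine_tone(freq_hz, duration_s, fs):
    """Return a sine tone as a list of samples (hypothetical helper)."""
    n_samples = int(round(duration_s * fs))
    return [math.sin(2.0 * math.pi * freq_hz * n / fs) for n in range(n_samples)]

def resample_linear(x, ratio):
    """Resample x by linear interpolation; ratio = new_rate / old_rate (assumed convention)."""
    n_out = int((len(x) - 1) * ratio) + 1
    out = []
    for m in range(n_out):
        pos = m / ratio              # fractional read position in the input
        i = int(pos)                 # index of the left neighbour
        frac = pos - i               # distance from the left neighbour
        nxt = x[i + 1] if i + 1 < len(x) else x[i]
        out.append((1.0 - frac) * x[i] + frac * nxt)
    return out

# Example: a high-frequency tone resampled from 44100 Hz to 44000 Hz.
tone = sine_tone(15000.0, 0.1, 44100.0)
resampled = resample_linear(tone, 44000.0 / 44100.0)
```

Linear interpolation acts as a weak lowpass, so any audible difference is most likely to appear for tones close to half the sampling rate.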
Implement a stereo rotary speaker effect. The result should sound like this.
Implement a primitive vocoder based on the Hilbert transform: Read two sounds, e.g. guit3.wav and fox.wav. Transform both by a truncated Hilbert transform. Calculate the instantaneous amplitude of both. Then replace the amplitude of guit3 with the amplitude of fox. The result should sound like this.
Implement a compressor that uses a squarer as detector and limits the level at -30dB. To test it, read guit3.wav and fox.wav, and mix them together with guit3.wav divided by 10 (radio host situation). Experiment with attack- and release-time parameters. The result should sound like this.

Some caveats: (1) Because the output of the squarer is already squared, it is converted to dB with $10\log_{10}$; for an unsquared amplitude the factor would be $20\log_{10}$. (2) For the second averager, the roles of attack and release are reversed.
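Caveat (1) can be illustrated with a small Python sketch of the detector stage alone: a squarer followed by an attack/release averager, with the level converted to dB. The one-pole coefficient formula $1-e^{-1/(T f_s)}$, the function names and the default time constants are assumptions made for this illustration; the script may define them differently. The gain computation and the second averager are not included.

```python
import math

def smoothing_coeff(time_s, fs):
    """One-pole averaging coefficient for a time constant (assumed formula)."""
    return 1.0 - math.exp(-1.0 / (time_s * fs))

def squarer_level_db(x, fs, attack_s=0.005, release_s=0.1, floor=1e-12):
    """Squarer detector plus attack/release averager; returns the level in dB per sample."""
    at = smoothing_coeff(attack_s, fs)
    rt = smoothing_coeff(release_s, fs)
    level = 0.0
    out = []
    for sample in x:
        power = sample * sample                  # squarer
        coeff = at if power > level else rt      # attack while rising, release while falling
        level += coeff * (power - level)         # averager
        # The level is a squared quantity, hence 10*log10 rather than 20*log10.
        out.append(10.0 * math.log10(max(level, floor)))
    return out

# A constant amplitude of 0.1 settles at 10*log10(0.01) = 20*log10(0.1) = -20 dB.
levels = squarer_level_db([0.1] * 2000, fs=1000.0)
```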
Implement a vocoder based on STFT.
Implement time-stretching based on STFT.
Implement pitch-shifting by resampling the output of time-stretching. Try to find the fundamental frequency in each frame, e.g. by choosing the lowest peak larger than a certain amplitude. Then pitch-shift the voice signal on a frame-by-frame basis to the same target frequency to get a monotone voice.
Implement an oscillator based on the digital resonator. Control the frequency with an LFO (low-frequency oscillator). For large and fast frequency variations there should be audible amplitude variations. Now determine the amplitude by a squarer-detector and an averager (equal attack and release). Correct the amplitude by dividing $x[t]$ and $x[t-1]$ by $(a-\bar{a})/10+1$, where $a$ is the detected amplitude and $\bar{a}$ is the desired expected amplitude.
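A minimal Python sketch of the oscillator core, assuming the digital resonator is the common two-term recursion $x[t]=2\cos(\omega)\,x[t-1]-x[t-2]$ (the script may state it in a different form). The names `resonator` and `lfo_frequencies` and their parameters are hypothetical. Amplitude detection and correction are left out.

```python
import math

def lfo_frequencies(center_hz, depth_hz, lfo_hz, n_samples, fs):
    """Per-sample oscillator frequency modulated by a sine LFO (hypothetical helper)."""
    return [center_hz + depth_hz * math.sin(2.0 * math.pi * lfo_hz * n / fs)
            for n in range(n_samples)]

def resonator(freqs_hz, fs):
    """Generate one sample per frequency entry with x[t] = 2cos(w)x[t-1] - x[t-2]."""
    if not freqs_hz:
        return []
    w0 = 2.0 * math.pi * freqs_hz[0] / fs
    x_prev2 = -math.sin(w0)   # x[-1], chosen so that a constant frequency yields sin(w*t)
    x_prev1 = 0.0             # x[0]
    out = [x_prev1]
    for f in freqs_hz[1:]:
        w = 2.0 * math.pi * f / fs
        x = 2.0 * math.cos(w) * x_prev1 - x_prev2
        out.append(x)
        x_prev2, x_prev1 = x_prev1, x
    return out

# Example: 440 Hz modulated by +-200 Hz at 5 Hz; the amplitude is expected to drift.
fs = 8000.0
wobble = resonator(lfo_frequencies(440.0, 200.0, 5.0, 8000, fs), fs)
```

With a constant frequency the recursion reproduces a pure sine of amplitude 1, which gives a simple reference for checking the amplitude detector.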
No MATLAB this time. For the signal $$x=(\ldots,0,0,1,2,1,0,-1,-2,-1,0,0,\ldots)\, ,$$ calculate the optimal linear prediction coefficients with the Levinson-Durbin algorithm by hand. No window function is used, i.e. the window is constantly 1. The prediction order $m$ is however far we get in 0.75 hours. Check the result after each step. Calculate the residual signal.
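To check hand calculations afterwards, the recursion can be sketched in Python with exact fractions. This assumes the prediction convention $\hat{x}[t]=\sum_{k=1}^{m} a_k\,x[t-k]$ and the autocorrelation method; the sign convention in the script may differ. All function names are illustrative.

```python
from fractions import Fraction

def autocorrelation(x, max_lag):
    """r[lag] = sum over n of x[n] * x[n + lag], for a signal that is zero outside x."""
    return [sum(x[n] * x[n + lag] for n in range(len(x) - lag))
            for lag in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Return ([a_1, ..., a_order], prediction error) as exact fractions."""
    a = []
    err = Fraction(r[0])
    for m in range(1, order + 1):
        # Reflection coefficient of step m.
        acc = r[m] - sum(a[k] * r[m - 1 - k] for k in range(m - 1))
        k_m = Fraction(acc) / err
        # Update the previous coefficients and append the new one.
        a = [a[k] - k_m * a[m - 2 - k] for k in range(m - 1)] + [k_m]
        err *= 1 - k_m * k_m
    return a, err

def residual(x, a):
    """e[t] = x[t] - sum_k a_k x[t-k], with x extended by zeros on both sides."""
    padded = list(x) + [0] * len(a)
    return [padded[t] - sum(a[k] * padded[t - 1 - k]
                            for k in range(len(a)) if t - 1 - k >= 0)
            for t in range(len(padded))]

x = [1, 2, 1, 0, -1, -2, -1]
r = autocorrelation(x, 3)
coeffs, err = levinson_durbin(r, 2)
e = residual(x, coeffs)
```

A useful consistency check at each order is that the energy of the residual equals the prediction error returned by the recursion.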
Convolve the input signal with two different white-noise signals (length about 4000 samples) to get a decorrelated stereo output. Test it with our test signals and also with white noise as the input signal. Play the output followed by a convolved but non-decorrelated signal (use the same white noise for left and right) to hear the difference.
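The decorrelation step can be sketched in plain Python as follows. Gaussian noise scaled by $1/\sqrt{N}$, the seed values and all function names are assumptions chosen for this illustration. The direct convolution is slow for long signals, so a real solution would probably use an FFT-based convolution.

```python
import math
import random

def white_noise(length, seed):
    """Gaussian white noise with roughly unit energy (assumed normalisation)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) / math.sqrt(length) for _ in range(length)]

def convolve(x, h):
    """Direct-form convolution; output length is len(x) + len(h) - 1."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        if xi == 0.0:
            continue
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

def decorrelated_stereo(x, noise_len=4000, seeds=(1, 2)):
    """Convolve x with one noise signal per channel; equal seeds give identical channels."""
    left = convolve(x, white_noise(noise_len, seeds[0]))
    right = convolve(x, white_noise(noise_len, seeds[1]))
    return left, right

def correlation(a, b):
    """Normalised correlation at lag zero, between -1 and 1."""
    num = sum(u * v for u, v in zip(a, b))
    den = math.sqrt(sum(u * u for u in a) * sum(v * v for v in b))
    return num / den

# Different seeds for the stereo version, the same seed for the comparison signal.
left, right = decorrelated_stereo([1.0, 0.0, 0.5], seeds=(1, 2))
mono_l, mono_r = decorrelated_stereo([1.0, 0.0, 0.5], seeds=(1, 1))
```

The normalised correlation between `left` and `right` should be close to 0, whereas `mono_l` and `mono_r` are identical, which corresponds to the non-decorrelated comparison signal.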