Rade Kutil - VO+PS Audio Processing (SS13)

Documents for the VO

script

list of questions

PS-Exercises

The exercises use Csound. Here is some guitar sound to use as test input, and also some speech sound.

Install and test Csound. Implement the parametric first-order lowpass filter as an opcode in Csound with $f_c$ as a k-rate parameter. Test it with an instrument that filters white noise and controls $f_c$ with a low-frequency oscillator (LFO).
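As a reference to compare a Csound opcode against, here is a minimal Python sketch of one common allpass-based first-order lowpass. That the parametric filter in the script has exactly this form is an assumption, and the names `lowpass1`, `fc`, and `fs` are only illustrative. The coefficient is recomputed for every sample so that `fc` can be modulated, loosely mimicking a k-rate parameter.

```python
import math

def lowpass1(x, fc, fs=44100.0):
    """First-order lowpass built from a first-order allpass (assumed form).

    x  : list of input samples
    fc : cutoff in Hz, either a number or a list with one value per sample
    """
    y = []
    x1 = 0.0   # previous input sample
    ap1 = 0.0  # previous allpass output
    for n, xn in enumerate(x):
        f = fc[n] if isinstance(fc, (list, tuple)) else fc
        t = math.tan(math.pi * f / fs)
        c = (t - 1.0) / (t + 1.0)       # allpass coefficient
        ap = c * xn + x1 - c * ap1      # A(z) = (c + z^-1) / (1 + c z^-1)
        y.append(0.5 * (xn + ap))       # lowpass = (input + allpass) / 2
        x1, ap1 = xn, ap
    return y
```

A constant input should pass unchanged once the transient has died out, while a signal alternating at the Nyquist frequency should be cancelled.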
Implement a phaser with only one allpass and with $f_c$ and $q$ as parameters. Calculate $f_d$ from $f_c$ and $q$ in a constant-q manner, perhaps in a separate k-rate opcode. In an instrument, modulate $f_c$ with a low-frequency oscillator.
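The single-allpass phaser can be sketched in Python as follows. Several things here are assumptions and not taken from the script: that $f_d$ is the width of the notch, that "constant-q" means $f_d = f_c/q$, that the allpass is the usual second-order form with coefficients `c` and `d`, and that dry signal and allpass output are mixed with equal weights. The function names are hypothetical.

```python
import math

def constant_q_fd(fc, q):
    """Assumed constant-q rule: the width f_d scales with f_c."""
    return fc / q

def phaser1(x, fc, q, fs=44100.0):
    """One second-order allpass mixed with the dry signal (notch at fc).

    fc may be a number or a list with one value per sample.
    """
    x1 = x2 = y1 = y2 = 0.0   # delayed inputs and delayed allpass outputs
    out = []
    for n, xn in enumerate(x):
        f = fc[n] if isinstance(fc, (list, tuple)) else fc
        t = math.tan(math.pi * constant_q_fd(f, q) / fs)
        c = (t - 1.0) / (t + 1.0)                 # sets the notch width
        d = -math.cos(2.0 * math.pi * f / fs)     # sets the notch frequency
        ap = -c * xn + d * (1.0 - c) * x1 + x2 - d * (1.0 - c) * y1 + c * y2
        out.append(0.5 * (xn + ap))               # dry + allpass, equal weights
        x2, x1 = x1, xn
        y2, y1 = y1, ap
    return out
```

With these assumptions the allpass shifts the phase by 180 degrees at `fc`, so a sine at exactly `fc` is cancelled after the transient, while a constant input passes unchanged.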
Extend the phaser to four allpasses with separate parameters for each allpass. Also implement the feedback loop. Modulate the $f_c$ parameters independently with LFOs whose frequencies are not harmonically related.
Implement a 4-fold Wah-Wah effect. Modulate $f_c$ with a low-frequency oscillator, and calculate $f_d$ in a constant-q manner. The result should sound like this.
Implement a stereo rotary speaker effect. Use opcodes delayr, delayw, deltapi for the delay line. Use lfo to produce a-rate delay-modulation for deltapi and amplitude. Set nchnls to 2, and read audio with soundin "guit3.wav" (in does not work with mono input when nchnls=2), and use outs for audio output. The result should sound like this.
Implement a primitive vocoder based on the Hilbert transform: Read two sounds, e.g. guit3.wav and fox.wav. Transform both with the function hilbert. Calculate the instantaneous amplitude of both. Then replace the amplitude of guit3 with the amplitude of fox. The result should sound like this.
Implement a compressor that uses a squarer as detector and limits the level at -30dB. To test it, read guit3.wav and fox.wav with soundin, and mix them together with guit3.wav divided by 10 (radio host situation). Experiment with attack- and release-time parameters. The result should sound like this.

Some caveats: (1) Because conditional expressions don't work at a-rate, something like c=ceil((x+abs(x))*0.5) has to be used, which gives 1 for x>0 and 0 for x<=0, provided x is in the range -1..+1; multiply the if-yes expression by c and the else expression by (1-c), then add them. (2) The output of the squarer is converted to dB with 10*log10 because it is squared; for a non-squared signal it would be 20*log10. The reverse conversion from dB to an amplitude factor is therefore 10^(x/20), written as exp(x*log(10)/20) because there is no base-10 exponential function. (3) Signals are in the range -32768..+32767; divide them by 32768 so that a peak of 1 corresponds to 0 dB. (4) For the second averager, the roles of attack and release are reversed.
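Caveats (1) and (2) can be checked numerically outside Csound. The following Python sketch only mirrors the arithmetic; the function names are hypothetical and nothing here is Csound code.

```python
import math

def step(x):
    """Caveat (1): 1 for 0 < x <= 1, 0 for -1 <= x <= 0, without branching."""
    return math.ceil((x + abs(x)) * 0.5)

def select(x, if_yes, if_no):
    """Branch-free 'if x > 0 then if_yes else if_no'."""
    c = step(x)
    return c * if_yes + (1 - c) * if_no

def squared_to_db(p):
    """Caveat (2): level in dB of a squared signal, hence the factor 10."""
    return 10.0 * math.log10(p)

def db_to_gain(db):
    """Caveat (2): dB back to an amplitude factor, 10^(db/20) via exp/log."""
    return math.exp(db * math.log(10.0) / 20.0)
```

For example, `db_to_gain(-20.0)` gives an amplitude factor of about 0.1, and `squared_to_db(a * a)` agrees with `20 * log10(a)` for a positive amplitude `a`.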
Look at the opcodes pvsanal, pvsynth, pvscale, and pvsmorph and use them on our input sounds. Choose another interesting opcode starting with pvs... to play around with.
It is said that female voices move the formant (the spectral shape) along with the frequency, while male voices do not. Try to build a feminizer effect based on this: Calculate the fundamental frequency with pvspitch, check its range with printk, limit it with a useful lower and upper bound, and move the formant with pvscale by the quotient of the fundamental frequency and its lower bound. Also, move the frequency up a bit. It didn't really work for me, but maybe you will have more luck.
Implement an oscillator based on the digital resonator. Control the frequency with an LFO. For large and fast frequency variations there should be audible amplitude variations. Now determine the amplitude with a squarer detector and an averager (equal attack and release). Correct the amplitude by dividing $x[t]$ and $x[t-1]$ by $(a-\bar{a})/10+1$, where $a$ is the detected amplitude and $\bar{a}$ is the desired expected amplitude.
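A minimal Python sketch of the oscillator without the amplitude correction, assuming that "digital resonator" refers to the two-term recursion $x[t] = 2\cos(\omega)\,x[t-1] - x[t-2]$. The function name and the per-sample frequency list are illustrative choices.

```python
import math

def resonator(freqs, fs=44100.0, amp=1.0):
    """Sine oscillator via x[t] = 2*cos(w[t]) * x[t-1] - x[t-2].

    freqs holds one frequency value in Hz per output sample.
    """
    w0 = 2.0 * math.pi * freqs[0] / fs
    x1 = -amp * math.sin(w0)        # x[-1], chosen so that x[0] = 0
    x2 = -amp * math.sin(2.0 * w0)  # x[-2]
    out = []
    for f in freqs:
        w = 2.0 * math.pi * f / fs
        x0 = 2.0 * math.cos(w) * x1 - x2
        out.append(x0)
        x1, x2 = x0, x1
    return out
```

With a constant frequency the output follows `amp * sin(w0 * t)`. Passing a frequency list that varies quickly lets you observe the amplitude drift that the detector and correction are meant to remove.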
No Csound this time. For the signal $$x=(\ldots,0,0,1,2,1,0,-1,-2,-1,0,0,\ldots)\, ,$$ calculate the optimal linear prediction coefficients with the Levinson-Durbin algorithm by hand. No window function is used, i.e. the window is constant 1. The order $m$ is as high as we can get in 45 minutes. Check the result after each step. Calculate the residual signal. (Note that there was a mistake in the script; the correct formula is $y^{(n)} = r_{xx}[1\ldots n]$.)
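To check the hand calculation, the recursion can be run with exact fractions. This is a sketch under an assumed sign convention, where the prediction is $\hat{x}[t] = \sum_i a_i\, x[t-i]$ and the coefficients solve the Toeplitz system with right-hand side $r_{xx}[1\ldots n]$; the script's notation may differ. All function names are hypothetical.

```python
from fractions import Fraction

def autocorr(x, max_lag):
    """r[k] = sum_t x[t] * x[t+k] for the zero-extended signal."""
    return [sum(Fraction(x[t] * x[t + k]) for t in range(len(x) - k))
            for k in range(max_lag + 1)]

def levinson_durbin(r, m):
    """Return [(a, err)] for orders 1..m, with a = [a_1, ..., a_n]."""
    r = [Fraction(v) for v in r]
    a = []        # coefficients of the current order
    err = r[0]    # prediction error energy of order 0
    steps = []
    for n in range(1, m + 1):
        # reflection coefficient of order n
        k = (r[n] - sum(a[i] * r[n - 1 - i] for i in range(n - 1))) / err
        # update the old coefficients, then append the new one
        a = [a[i] - k * a[n - 2 - i] for i in range(n - 1)] + [k]
        err *= 1 - k * k
        steps.append((list(a), err))
    return steps

def residual(x, a):
    """e[t] = x[t] - sum_i a_i * x[t-i], including the tail after the signal."""
    padded = list(x) + [0] * len(a)
    return [padded[t] - sum(a[i] * padded[t - 1 - i]
                            for i in range(len(a)) if t - 1 - i >= 0)
            for t in range(len(padded))]
```

For the signal above, `autocorr([1, 2, 1, 0, -1, -2, -1], 3)` starts with 12, 8, 1, -4. One way to check each step is that the energy of the residual equals the error value returned for that order.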
Convolve the input signal with two different white-noise signals to get a decorrelated stereo output. Generate a function table of length 4096 or so containing the white noise (f-statement, GEN21). Test it with our test signals and also with white noise as input signal. Play the output followed by a convolved but non-decorrelated signal (use the same white noise for left and right) to hear the difference.
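The idea can be illustrated offline with a short Python sketch that convolves one input with two noise sequences and measures how similar the channels are. The noise length, the seeds, and all function names are illustrative assumptions; direct convolution is used for clarity, so keep the lengths small.

```python
import random

def white_noise(length, seed):
    """Uniform white noise in -1..+1 from a seeded generator."""
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(length)]

def convolve(x, h):
    """Direct (slow) linear convolution of two sample lists."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

def correlation(a, b):
    """Normalized correlation at lag 0: 1 means identical up to scale."""
    num = sum(p * q for p, q in zip(a, b))
    den = (sum(p * p for p in a) * sum(q * q for q in b)) ** 0.5
    return num / den

def decorrelate(x, noise_len=256, seed_left=1, seed_right=2):
    """Convolve x with two noise sequences; equal seeds give a mono result."""
    left = convolve(x, white_noise(noise_len, seed_left))
    right = convolve(x, white_noise(noise_len, seed_right))
    return left, right
```

With different seeds the correlation between left and right should be close to 0, and with equal seeds it is 1, which corresponds to the non-decorrelated comparison signal.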
Choose a room size and positions for the sound source and the listener, then calculate the delay of the direct sound and the first few reflections (in 2D, maybe approximately and graphically). Also calculate the period of the first few modes of the room (in 2D). Then implement Moorer's reverberator with delayr/w, comb, and alpass.
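The delays and mode periods can be cross-checked with a small Python sketch. It assumes a rectangular 2D room with one corner at the origin, a speed of sound of 343 m/s, first-order reflections computed by mirroring the source at each wall (image sources), and the standard rigid-wall mode formula; the function names are hypothetical.

```python
import math

C = 343.0  # speed of sound in m/s (assumed value for air at room temperature)

def first_reflections(room, src, lis):
    """Delays in seconds of the direct sound and the four first-order reflections.

    room = (lx, ly) in metres, src and lis = (x, y) positions inside the room.
    """
    lx, ly = room
    sx, sy = src
    images = {
        "direct": (sx, sy),
        "wall x=0": (-sx, sy),             # source mirrored at the left wall
        "wall x=lx": (2.0 * lx - sx, sy),  # mirrored at the right wall
        "wall y=0": (sx, -sy),             # mirrored at the bottom wall
        "wall y=ly": (sx, 2.0 * ly - sy),  # mirrored at the top wall
    }
    return {name: math.hypot(px - lis[0], py - lis[1]) / C
            for name, (px, py) in images.items()}

def mode_period(room, nx, ny):
    """Period in seconds of room mode (nx, ny), f = c/2 * sqrt((nx/lx)^2 + (ny/ly)^2)."""
    lx, ly = room
    return 1.0 / (0.5 * C * math.hypot(nx / lx, ny / ly))
```

For a 5 m by 4 m room with the source at (1, 2) and the listener at (4, 2), the direct path is 3 m long and all four first-order reflection paths are 5 m long, which gives a first impression of suitable delay times for the early reflections.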