Convolver Panned binaural demonstration
by Peter Fischer

In this demonstration a monophonic sound source is produced and panned 360° around the listener. The resulting stereo track is binaural and is designed to be played back over headphones, where the listener should hear the sound move around them in a circle. This effect is achieved in three parts:
In summary, the left ear HRIRs corresponding to the azimuth of their respective speaker feed are convolved with that feed and the results are summed to give the binaural signal for the left ear. The procedure is repeated with the right ear HRIRs to produce the binaural signal for the right ear. Eight separate convolutions are required for four virtual speakers: one per speaker feed per ear.

The inspiration for this particular demonstration came from reading both of the papers listed as references and applying those concepts using freely available command line tools and datasets.

The walkthrough

This is a demonstration of taking a mono sound source and panning that sound around the listener in a 360° circle. There are four parts to this example:
Tools required
Method

Generation of the mono sound source

Using either Adobe Audition or the freeware Audacity (the stable version of Audacity, not the beta version, is recommended), generate a mono sound file to be panned. The following walkthrough is for Audacity.
Alternatively you may use a pre-existing mono sound file or generate one using the command line tool sox.

Mono panning

Use the command line tool abfpan from MCTools to pan the mono noise file:

    abfpan noise.wav noisepan.wav 0.0 1.0
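As a rough illustration of what this panning step computes, the sketch below encodes a mono signal into horizontal first-order B Format while sweeping the azimuth through one full turn. It is not abfpan's implementation. The function name pan_to_bformat, the reading of the arguments 0.0 and 1.0 as start and end positions in fractions of a full rotation, the anticlockwise direction and the 1/sqrt(2) weighting on W are all assumptions made for the example.

```python
import math

def pan_to_bformat(samples, start_turns=0.0, end_turns=1.0):
    """Encode mono samples as horizontal first-order B Format frames (W, X, Y, Z).

    Hypothetical helper: positions are assumed to be fractions of a full
    rotation, with 0.0 at centre front and positive values anticlockwise.
    """
    n = len(samples)
    frames = []
    for i, s in enumerate(samples):
        # Fraction of the way through the file: 0.0 at the first sample, 1.0 at the last.
        t = i / (n - 1) if n > 1 else 0.0
        azimuth = 2.0 * math.pi * (start_turns + t * (end_turns - start_turns))
        w = s / math.sqrt(2.0)     # omnidirectional component (assumed -3 dB convention)
        x = s * math.cos(azimuth)  # front-back component
        y = s * math.sin(azimuth)  # left-right component
        z = 0.0                    # no height information for a horizontal pan
        frames.append((w, x, y, z))
    return frames

# Five samples of a constant signal, panned through one full circle.
frames = pan_to_bformat([1.0] * 5)
```

With this toy input the second frame sits a quarter turn round (X near 0, Y near 1) and the last frame is back at centre front.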
Remember to add the MCTools directory to your path or run the command line from the MCTools directory. The resulting noisepan.wav is a B Format file.

B Format to speaker decode

Use another useful tool, this time abfdcode, to decode the B Format file to a square speaker arrangement:

    abfdcode noisepan.wav noisesq.wav
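The decode step can be pictured as a small set of gains applied to W, X and Y for each speaker. The sketch below is a minimal, illustrative decode for a square layout. The gains, the speaker azimuths and the name decode_square are assumptions for the example and are not necessarily what abfdcode uses.

```python
import math

# Assumed speaker azimuths in degrees, anticlockwise from centre front,
# in the order Front Left, Front Right, Back Left, Back Right.
SPEAKERS = [("FL", 45.0), ("FR", -45.0), ("BL", 135.0), ("BR", -135.0)]

def decode_square(w, x, y):
    """Return four speaker feeds for one horizontal B Format frame.

    Hypothetical basic decode: each feed is a weighted sum of W, X and Y,
    with the weights set by the direction of that speaker.
    """
    feeds = []
    for _name, az_deg in SPEAKERS:
        az = math.radians(az_deg)
        feed = 0.5 * (math.sqrt(2.0) * w + x * math.cos(az) + y * math.sin(az))
        feeds.append(feed)
    return feeds

# A unit-amplitude source encoded at 45 degrees (the front left speaker position),
# using the assumed convention W = s / sqrt(2), X = s cos(az), Y = s sin(az).
az = math.radians(45.0)
feeds = decode_square(1.0 / math.sqrt(2.0), math.cos(az), math.sin(az))
# feeds is approximately [1.0, 0.5, 0.5, 0.0]
```

With these gains the speaker nearest the source gets the largest feed and the opposite speaker gets none, which is the behaviour a simple first-order decode is expected to show.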
noisesq.wav is a four channel file with speaker feeds Front Left, Front Right, Back Left and Back Right.

Square speaker feeds to Binaural

For this step use convolvercmd to convolve the square speaker feeds with the HRIRs representing the position of those speakers. The convolvercmd filter is given a text configuration file. You can download a text version of the file but will have to alter the file paths in it to reflect the location of the HRIRs on your disk.
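The arithmetic behind this step can be sketched in a few lines. The example below convolves each of four speaker feeds with a left ear and a right ear impulse response and sums the results per ear, giving the eight convolutions mentioned earlier. It uses direct time-domain convolution for clarity rather than the FFT method Convolver uses, and the function names and toy impulse responses are made up for illustration; real HRIRs would be read from a measured dataset.

```python
def convolve(signal, impulse_response):
    """Direct (time-domain) convolution of two sample lists."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out

def binaural_mix(feeds, hrirs_left, hrirs_right):
    """Convolve each speaker feed with its HRIR for each ear and sum per ear.

    feeds, hrirs_left and hrirs_right are lists of equal length, one entry
    per virtual speaker (hypothetical layout for this sketch).
    """
    def ear(hrirs):
        length = max(len(f) + len(h) - 1 for f, h in zip(feeds, hrirs))
        total = [0.0] * length
        for feed, hrir in zip(feeds, hrirs):
            for k, v in enumerate(convolve(feed, hrir)):
                total[k] += v
        return total
    return ear(hrirs_left), ear(hrirs_right)

# Toy data: four two-sample feeds (FL, FR, BL, BR) and made-up impulse responses.
feeds = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0], [0.0, 0.0]]
hrirs_left = [[1.0, 0.5], [0.25], [1.0], [1.0]]
hrirs_right = [[0.25], [1.0, 0.5], [1.0], [1.0]]
left, right = binaural_mix(feeds, hrirs_left, hrirs_right)
# left  == [1.0, 0.75, 0.0]
# right == [0.25, 1.0, 0.5]
```

Interleaving left and right sample by sample would give the two channel binaural file.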
If the config file is called binaural.txt, run the following from the command line:

    convolvercmd 4 1 -9 binaural.txt noisesq.wav noisebin.wav
where
That's it

Play noisebin.wav in your favourite player. Enjoy.

References
Acknowledgments
Further reading

What is all this Ambisonics/B Format stuff?

Ambisonics is a method of surround sound encoding developed by Michael Gerzon, Peter Fellgett and John Hayes in the 1970s as an alternative to quadraphonic sound systems. The basic principle of ambisonics is a mathematical decomposition of a 3D sound field, specifically a spherical harmonic decomposition.

To use an analogy, any complex single waveform can be decomposed into an infinite series of sine and cosine terms. This is Fourier theory, which gives us the Fourier Series and the massively useful Fourier Transform. The Fourier Transform is used in its discrete and fast form as the Fast Fourier Transform (FFT). It is the basis on which convolution programs such as Convolver perform their calculations. In essence a time domain signal is transformed to the frequency domain by FFT. A time domain Impulse Response (IR) is also transformed to the frequency domain by FFT. These two frequency domain signals are multiplied, which is equivalent to convolution in the time domain, and the result is transformed back to the time domain by iFFT (inverse Fast Fourier Transform).

Sound pressure waves can be described as an infinite series of spherical waves, the Bessel-Fourier series. The Bessel functions describe the radial part. The angular part is described by the spherical harmonics, which at low orders can be written in terms of the Cartesian directions X, Y and Z. Like the Fourier Series, the Bessel-Fourier series starts at 0th order and extends to infinite order. The traditional ambisonic B Format consists of the spherical harmonic terms up to and including order 1. These channels are W (0th order), X (1st order, vector direction X), Y (1st order, vector direction Y) and Z (1st order, vector direction Z).

A physical analogy: a single omnidirectional microphone records the sound pressure of a sound field (0th order component, B Format W). A ribbon microphone aligned to pick up signals from the front and rear records the 1st order pressure gradient in the X direction (B Format X). A second ribbon microphone, at 90 degrees to the first in the horizontal plane, picks up signals from the left and right and records the 1st order pressure gradient in the Y direction (B Format Y). Finally, a third ribbon microphone at 90 degrees to the first two, aligned to pick up signals from above and below, records the 1st order pressure gradient in the Z direction (B Format Z).

First order ambisonics is a highly truncated representation of a 3D sound field, although the order can in theory be extended to whatever level of detail is required. B Format is therefore a highly compact representation of a 3D sound field. It is easily manipulated mathematically (for example, rotated) and is useful for generating synthetic sound fields, as in the panned binaural demonstration. A B Format signal is decoded to a regular polyhedral arrangement of speakers by multiplying each B Format channel by a gain specific to that channel and speaker, and summing the results for each speaker.

Further references

Two good if somewhat mathematical papers are: