What is the VOX audio file format used for?
Well, that’s a fine question for ya to have to research! But why look around when the answer’s right here in the form of another illumination on the subject of telephony audio?
First of all, “vox” is Latin for voice. And it’s also a file extension for what are primarily voice files – and an excellent extension at that. In fact, VOX has so many meanings that even within the subculture of telecom, and counting only audio file formats, there are more variations of .vox files used to store IVR prompts than I can cover here. So I will touch only briefly on three of the most prevalent forms.
All three share similar properties of size and clarity, depending on the bit rate used.
When I first started here at Marketing Messages in the summer of 1998, I really didn’t know anything about the VOX audio format. Then my immersion began in earnest. At that time, nearly all of the voice prompts I edited as linear PCM audio were destined to become VOX. All of the recordings were encoded as 8000 Hz, 16-bit linear PCM and converted after editing to the final, specific VOX format. (These days all of our recordings are done at either 48 or 44.1 kHz, and I have to say that it’s much nicer to work with higher fidelity!)
Dialogic VOX
To me, the most readily approachable is the Dialogic .vox file. That’s likely because Marketing Messages recorded the system library for Artisoft’s TeleVantage phone system, and we had many, many clients order custom voice prompts which were delivered as…64 kbps (kilobits per second) Dialogic .vox files. This format is an 8-bit, 8000 Hz, mu-law (µ-law) encoded PCM (Pulse Code Modulation) audio format saved with a .vox extension. A variation of this format is the indexed .vap version, in which multiple audio source files are combined as a library in one indexed .vap file. In my day I’ve produced thousands upon thousands of Dialogic .vox files.
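For the curious, here is a minimal Python sketch of how a file like that could be made playable on an ordinary computer. It assumes the .vox file holds nothing but raw 8-bit, 8000 Hz mu-law samples with no header, and it uses the standard G.711 mu-law expansion. The function names are hypothetical, chosen only for illustration.

```python
import wave

SAMPLE_RATE_HZ = 8000  # telephony sampling rate


def ulaw_to_linear(code: int) -> int:
    """Expand one 8-bit mu-law code (G.711) to a signed 16-bit PCM sample."""
    code = ~code & 0xFF              # mu-law bytes are stored bit-inverted
    sign = code & 0x80               # top bit carries the sign
    exponent = (code >> 4) & 0x07    # 3-bit segment number
    mantissa = code & 0x0F           # 4-bit step within the segment
    magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -magnitude if sign else magnitude


def ulaw_to_wav(ulaw_bytes: bytes, wav_file) -> None:
    """Write raw mu-law bytes as a 16-bit mono WAV (path or file object).

    Assumption: the input is headerless mu-law data at 8000 Hz.
    """
    pcm = bytearray()
    for code in ulaw_bytes:
        pcm += ulaw_to_linear(code).to_bytes(2, "little", signed=True)
    with wave.open(wav_file, "wb") as wav:
        wav.setnchannels(1)                # mono
        wav.setsampwidth(2)                # 16-bit samples
        wav.setframerate(SAMPLE_RATE_HZ)
        wav.writeframes(bytes(pcm))
```

Reading the bytes of a .vox file and passing them to `ulaw_to_wav` along with an output path should then yield a .wav that any media player can open. If the file turns out to carry a header or use ADPCM instead, this sketch will produce noise, not speech.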
Natural MicroSystems VOX
Another biggie for us is the NMS (Natural MicroSystems) .vox. This proprietary format contains a header that describes the contents of the audio file (sampling rate, bit rate, length, index number, and sometimes the text of the audio itself) and can be saved at a variety of compression levels, including 64 kbps, 32 kbps, 24 kbps, and even 16 kbps. The most commonly requested NMS .vox format is 32 kbps: a 4-bit, 8000 Hz ADPCM (Adaptive Differential Pulse Code Modulation) encoded audio file.
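The bit rates quoted above are just bits per sample multiplied by the sampling rate, and a few lines of Python make the arithmetic concrete. This is only a sketch: it assumes the 24 kbps and 16 kbps variants use 3 and 2 bits per sample at the same 8000 Hz rate, which the figures imply but which I have not confirmed here, and the per-minute sizes ignore the NMS header.

```python
def bit_rate_kbps(bits_per_sample: int, sample_rate_hz: int = 8000) -> float:
    """Bit rate of mono audio in kilobits per second."""
    return bits_per_sample * sample_rate_hz / 1000


def bytes_per_minute(kbps: float) -> int:
    """Approximate audio payload per minute, ignoring any file header."""
    return int(kbps * 1000 / 8 * 60)


# 8-bit mu-law PCM, 4-bit ADPCM, then the assumed 3-bit and 2-bit variants
for bits in (8, 4, 3, 2):
    rate = bit_rate_kbps(bits)
    print(f"{bits}-bit at 8000 Hz: {rate:g} kbps, "
          f"about {bytes_per_minute(rate):,} bytes per minute")
```

So a one-minute prompt at 32 kbps needs roughly half the storage of the same prompt at 64 kbps, which is much of the appeal of ADPCM on a phone system.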
Rhetorex VOX
Rhetorex .vox files are near and dear to Marketing Messages because our own Comdial phone system stores its audio in this native format. If you’d care to listen to a Rhetorex file, give us a call at 1-800-4-VOICES (486-4237) after hours and listen to the Auto-Attendant that answers the phone. You’ll hear a 32 kbps Rhetorex-encoded .vox file on the other end.
Now you may well ask, why is such a low-bandwidth audio file being used in this high-fidelity day and age? Because phone lines were designed to deliver human speech, not a symphony orchestra. Human speech falls within a pretty tight frequency range, mostly from 350 to 2000 Hz. These .vox file formats were designed specifically to reproduce human speech with as little overhead as possible. And they do it well. Marvelously well, provided you’re listening the way it was intended to be heard: with a telephone handset at your ear.
I find .vox files interesting because most people can’t play them on a computer even though they share similar properties with .wav files. Being able to work with such a “mysterious” format makes me feel like I’m in charge of a secret or a spell of some sort. Oftentimes, when we deliver .vox files, our client will request a Windows-friendly format (.wav or .mp3) as well so they can preview their new custom audio prior to loading it onto their phone system. Being the audio wizards that we are, we happily oblige them.
Until my next post, vox on!