The new flurry of next-generation handhelds has brought with it a portable avalanche of bells and whistles.
Among them, enhanced audio capabilities, long awaited by many potential handheld buyers, are now standard equipment.
In this article, we'll discuss the complexities of creating, using, and faithfully reproducing digital
sounds, and how audio hardware capabilities may affect your decision to purchase a new device.
To understand how sound is reproduced on a handheld, it's useful to cover some basics on how digital
sound waves are created and perceived. In the real, non-digital world, sounds propagate through the air as pressure waves,
and are strictly analog, real-time phenomena. In non-geek terms, as pressure waves strike our
eardrums (and transfer energy to them), the energy in any microsecond is not limited to a fixed set of
possible values, but can be anywhere within a continuous range of values. Furthermore, the pressure exerted
is constantly changing with time, and the speed at which these changes can occur is nearly unlimited.
When these same sounds are converted to the digital world, however, limits appear which don't exist in the real world.
On a computer, a sound wave is converted into a stream of numbers by the process of sampling. Sampled sounds approximate
their true-life counterparts by taking snapshots, called "samples," of a sound wave's energy at regular quick intervals,
and storing each energy value as a number in the sampled data stream. When played back, the numbers are re-used to replicate
the original sound wave's shape.
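The sampling process described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the 440 Hz test tone, and the symmetric scaling to +/-127 are choices made for the example, not anything taken from a particular handheld or sound library.

```python
import math

def sample_wave(wave, duration_s, sample_rate_hz, bits=8):
    """Take regular snapshots of wave(t), where wave returns values in -1.0..1.0,
    and store each one as a signed integer of the given bit width."""
    max_value = 2 ** (bits - 1) - 1              # 127 for 8-bit samples
    count = round(duration_s * sample_rate_hz)   # number of snapshots to take
    samples = []
    for n in range(count):
        t = n / sample_rate_hz                   # time of the n-th snapshot
        samples.append(round(wave(t) * max_value))
    return samples

def tone_440(t):
    """A hypothetical test signal: a full-volume 440 Hz sine wave."""
    return math.sin(2 * math.pi * 440 * t)

# 10 milliseconds of the tone, sampled 8000 times per second
samples = sample_wave(tone_440, duration_s=0.01, sample_rate_hz=8000)
print(len(samples), samples[:8])
```

Playing the sound back amounts to walking the same list at the same rate and turning each number back into a speaker position.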
The quality of the reproduction depends on how accurately it mimics the original sound wave. This depends primarily on two factors:
the dynamic range of the sound samples and the frequency response.
The dynamic range of the digital wave is determined by the range of possible values associated with each sample point.
For instance, an 8-bit number can store 256 values ranging from 0 to 255 (or -128 to +127). When a sound wave is sampled,
the sound energy is mapped so that the loudest part of the wave uses the full range of numbers. Thus, the values in an
8-bit sound sample go back and forth between -128 and +127, with silence represented by a stream of zeroes.
When the dynamic range is too small to accurately reproduce the shape of the original sound wave, the result is perceived
as noise in the playback. Human ears have a difficult time noticing a volume difference of 1/255th, the step size of 8-bit precision.
Yet 16 bits (taking twice as much space and yielding 65536 values) is considered the standard for high-quality audio.
The difference exists because when the loudest part of a waveform is mapped to the full range of sampled values, the quieter
parts of the wave don't get the benefit of all the bits of data. For instance, if you were to sample the 1812 Overture, you'd
want to adjust your mapping so the loud "boomy" bits of the piece gave you values that ranged from -128 to +127. These parts
would use all 8 bits, and would probably sound just fine. The quieter bits, however, would not fare so well. If they were, say,
only 1/16th the volume of the loudest part of the music, then they would range from -8 to +7, effectively using
only 4 bits of the full 8-bit sample. By resampling the entire wave in 16 bits, however, even the quiet part above would still get
12 bits of precision, yielding much better overall sound.
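The arithmetic behind the quiet-passage example can be checked with a short Python sketch. The helper names are made up for illustration; the only assumption is that a passage at some fraction of full volume uses the same fraction of the available number range.

```python
import math

def effective_bits(sample_bits, relative_volume):
    """Bits actually exercised by a passage whose peak is relative_volume
    (between 0 and 1) of the loudest part of the recording."""
    return sample_bits + math.log2(relative_volume)

def value_range(sample_bits, relative_volume):
    """Smallest and largest sample values such a passage can reach."""
    full_scale = 2 ** (sample_bits - 1)          # 128 for 8-bit, 32768 for 16-bit
    peak = int(full_scale * relative_volume)
    return (-peak, peak - 1)

for bits in (8, 16):
    print(bits, "bit:", value_range(bits, 1 / 16), effective_bits(bits, 1 / 16), "bits used")
```

Each halving of volume costs one bit, so a passage at 1/16th volume gives up four bits regardless of the sample size. The 16-bit format simply has more bits to spare.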
The second component, frequency response, refers to the speed at which the digital samples are taken, and is measured
in samples per second, or Hertz (Hz). When the
sound values are changing rapidly, the sample rate must increase to accurately reflect the shape of the original
sound wave. A fast-changing waveform represents not only Ella-Fitzgerald-wine-glass-shattering high-frequency notes,
but also the high-frequency components of drums, cymbals, and other
percussion instruments. When the sampling rate is insufficient to reproduce the
original waveform, these sounds are perceived as "muddied" or less "crisp" than the originals.
A sampled wave can play back its highest frequency simply by alternating between two different values. Since the pattern of the
two values repeats itself every two samples, the highest frequency a digital waveform can reproduce, called the Nyquist frequency,
is one half the sampling rate. Since the range of human hearing extends to about 20000 Hz, the minimum sampling rate for
high-fidelity sounds is about 40000 Hz. The minimum sampling rate for normal-quality speech is about 8000 Hz.
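The relationship between sampling rate and highest reproducible frequency is simple enough to express directly. The following Python sketch uses hypothetical helper names and only restates the reasoning above: a pattern that repeats every two samples has a frequency of half the sampling rate.

```python
def pattern_frequency(sample_rate_hz, samples_per_cycle):
    """Frequency of a waveform whose shape repeats every samples_per_cycle samples."""
    return sample_rate_hz / samples_per_cycle

def nyquist_frequency(sample_rate_hz):
    """Highest reproducible frequency: the two-sample alternating pattern."""
    return pattern_frequency(sample_rate_hz, 2)

def minimum_sample_rate(highest_frequency_hz):
    """Lowest sampling rate that can still represent the given frequency."""
    return 2 * highest_frequency_hz

print(nyquist_frequency(8000))        # ceiling for an 8000 Hz voice recording
print(minimum_sample_rate(20000))     # rate needed to cover human hearing
```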
Digital sound formats used in computing devices vary in quality and size. Audio CDs store 16-bit stereo sounds at 44100 Hz,
good enough that differences from the original source are practically imperceptible. Yet, like an SUV in a parking garage,
CD-quality audio occupies a vast amount of space, filling up about 10 megabytes of storage per minute of audio.
When sound quality is not critical, dynamic range and/or sampling frequency are often compromised to reduce file size. Eight-bit,
8000 Hz sounds are often considered fine for voice and non-musical applications.
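To see how much space these trade-offs save, the storage cost of uncompressed audio can be worked out from the sample rate, sample size, and channel count. The Python sketch below uses a hypothetical helper name, assumes the standard CD rate of 44,100 samples per second, and assumes the voice-quality format is mono.

```python
def bytes_per_minute(sample_rate_hz, bits_per_sample, channels):
    """Storage used by one minute of uncompressed sampled sound."""
    bytes_per_sample = bits_per_sample // 8
    return sample_rate_hz * bytes_per_sample * channels * 60

cd_quality = bytes_per_minute(44100, 16, 2)    # 16-bit stereo, as on an audio CD
voice_quality = bytes_per_minute(8000, 8, 1)   # 8-bit mono at 8000 Hz

print(cd_quality, "bytes per minute at CD quality")
print(voice_quality, "bytes per minute at voice quality")
print("CD quality takes", cd_quality / voice_quality, "times the space")
```

On a handheld with only a few megabytes of memory, that gap of roughly twenty to one is the difference between a sound effect that fits and one that does not.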
Compression can also be used to reduce file size. Like video, however, audio data is not easily compressed, and effectively doing so
always requires modifying the raw sound data. This type of compression, called "lossy", results
in distortion of the sound, usually perceived as background noise in the final output. For example, the optional ADPCM compression
supported by TealMovie can cut the audio file size in half, but adds a noticeable amount of noise to the sound sample.
Advanced compression, such as MP3, uses computing power and clever algorithms to compress sounds to as little as one-tenth their original size.
While MP3 files still employ "lossy" compression, the changes made are difficult to hear with most sounds. The downside is that
a great deal of computing power is needed to play back MP3 files; so much, in fact, that most PalmOS handhelds which support
MP3 files cannot do so without specialized hardware. Which brings us to...
Surprisingly, even the first PalmOS handhelds had the capability to support sampled sound playback. This capability came from the
Motorola Dragonball processor, which had built-in hardware to play 8-bit sampled sounds at about a 10000 Hz playback rate.
Unfortunately, in order to play back anything more than clicks, beeps, and alarm sounds, additional filtering hardware was needed.
Since audio was not considered a high-priority feature, the extra hardware was not considered worth the small additional expense.
Today, all Dragonball-based PalmOS handhelds still have basic built-in sound capabilities. Some, like the TRGpro, HandEra 330,
and Sony CLIE handhelds, have the proper filtering to deliver distortion-free sound. Many Sony handhelds also have headphone jacks,
a nice feature for using the device in multimedia applications.
Most new ARM-based handhelds also include support for higher quality audio. Both the Sony NX70 and Palm Tungsten T offer
headphone jacks and sampled sound playback. The important difference, however, is that the Tungsten T has the capability to
play back uncompressed sampled sounds in a variety of formats and sample rates up to CD quality levels. While the NX70 can
play high-quality MP3 files off a Memory Stick, it can only play back sampled sounds in a proprietary 8-bit, 8000Hz compressed
format, limiting the sound quality of games and applications that either build their sound data on-the-fly or store their sound data
The Sony NX70, NR70 and similar models do have some other special capabilities, however, supporting multi-voice synthesized
instruments. While not capable of playing back speech, their little-used hardware creates sounds from basic waveforms which sound
a lot like bells and harpsichords. Used for the devices' musical alarms, the multi-voice capability can create music sounding
like a room full of instruments, or a room full of harpsichords, anyway.
The Beat Plus Springboard module adds Sony-like audio to Handspring
Visors, while other Springboard modules and the Kyocera 7135 Smartphone offer MP3 playback capability. In most cases, MP3
playback is done using specialized hardware chips that read the data directly off SD card or Memory Stick. Essentially, they resemble
standalone MP3 players which are only loosely integrated into the handheld. This approach can help maximize battery life, but can
limit their accessibility from software applications, which we'll soon see.
A very important consideration in choosing a new device is not only what sound capabilities the device offers, but how likely
it is that third-party software will be able to take advantage of those capabilities. This is often not so much an issue of what hardware
is present as of how accessible the information needed to use that hardware is to developers.
Supporting sound for Dragonball-based (OS4 and earlier) handhelds is largely a no-brainer, as the necessary documentation
is freely available from Motorola. Almost all devices in this class support basic audio of varying
sound quality, though it's important to note that the original Dragonball processor in the original Pilot, PalmPilot, and Palm III did not
have enough sound buffering to simultaneously play sampled sounds and do anything else, like, um, play video. Later models, starting
with the PalmIIIx, not only improved the original clarity-challenged display, but added the Dragonball-EZ, -VZ, and
-SuperVZ chips. These allowed for buffered sound, and more importantly, TealMovie sound support.
Even with Dragonball hardware, however, software sound support often depends on what documentation is available to developers.
The TRGpro and HandEra 330, for instance, include a software-controlled amplifier which must be explicitly turned on by a program
before playing a sampled sound. Fortunately, HandEra (formerly TRG) made this information available to developers months prior
to release. Audio support on some Sony handhelds, however, is not so easy. For instance, playing sampled sounds on an NR70 or NX70
requires manipulation of no fewer than three software-controlled volume controls; public documentation is readily available for only one.
PalmOS 5 promises to alleviate many of these problems with new support for playback of sampled sounds. The
improved sound support was added in PalmOS 5.1, but Palm back-ported these changes into the version of PalmOS which shipped with the
Tungsten T. The Sony NX70, however, does not support the new sound library. In addition, unlike the Sony NR70, the
NX70 no longer includes the Dragonball hardware, so sampled sound playback is severely limited, and adding it requires specialized documentation
which is difficult to come by. This can severely hamper third-party software support.
But what about MP3 playback? Indeed, playing back MP3 files is a nifty feature which can add much to the handheld-owning experience.
Unfortunately, since most MP3 playback uses custom hardware chips, software applications usually are limited to just starting and
stopping playback of SD or Memory Stick-based MP3 files. Direct access to the sound hardware is generally unavailable, so high
quality MP3 playback capability does not necessarily translate into higher quality sound for games, movies, and other applications which
require faster access to sounds or playback of sound data from main memory.
If audio is an important buying consideration, decide what types of audio applications you'll be using the most. Games?
Movies? Music? Consider purchasing a device with a headphone jack and good sampled sound playback capability. If you'll want
to play MP3 files, make sure the manufacturer specifies an adequate battery life when playing the files.
Keep in mind that sound quality varies from model to model, and that a device with "enhanced" audio for MP3 playback
may only play back voice-quality sampled sounds. Lastly, consider software support in your buying decision. If the software
accompanying the device suits all your purposes, then add-on programs may be irrelevant. But if you try a lot of third-party games
and applications, you'll want to make sure that all your device's new capabilities will be fully utilized. Check the manuals of
programs you might use to see what is compatible. Popular devices will tend to enjoy better software support, as will devices
from manufacturers with strong, helpful developer programs. Here are some examples of features to consider:
Palm Tungsten T
Developer Documentation: Excellent
PalmOS Sound Library: Yes
Sampled Sound: Flexible, up to 16-bit, 44100 Hz CD-Quality
MP3 Playback: Via add-on software only (availability unknown)
Other: Voice Recorder

Sony NX70
Developer Documentation: Poor
PalmOS Sound Library: No
Sampled Sound: 8-bit, 8000 Hz ADPCM Compressed Only