Carnegie Mellon
Sound
Sound Sampling Basics
• Common Sampling Rates
  – 8 kHz (phone) or 8.012820513 kHz (phone, NeXT)
  – 11.025 kHz (1/4 CD std)
  – 16 kHz (G.722 std)
  – 22.05 kHz (1/2 CD std)
  – 44.1 kHz (CD, DAT)
  – 48 kHz (DAT)
• Bits per Sample
  – 8 or 16
• Number of Channels
  – mono/stereo/quad/etc.
Common Sound File Formats
• Mu-law (Sun, NeXT): .au
• RIFF Wave (MS WAV): .wav
• MPEG Audio Layer (MPEG): .mp2, .mp3
• AIFC (Apple, SGI): .aiff, .aif
• HCOM (Mac): .hcom
• SND (Sun, NeXT): .snd
• VOC (Soundblaster card proprietary standard): .voc
• AND MANY OTHERS!
What’s in a Sound File Format
• Header Information
  – Magic Cookie
  – Sampling Rate
  – Bits/Sample
  – Channels
  – Byte Order (endian)
  – Compression type
• Data
Example File Format (NIST SPHERE)
    NIST_1A
       1024
    sample_rate -i 16000
    channel_count -i 1
    sample_n_bytes -i 2
    sample_byte_format -s2 10
    sample_sig_bits -i 16
    sample_count -i 594400
    sample_coding -s3 pcm
    sample_checksum -i 20129
    end_head
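A text header like this one can be read with a few lines of code. The sketch below is a minimal illustration, not a full SPHERE reader: the function name `parse_sphere_header` is hypothetical, and it assumes each field line has the form `name -type value`, where `-i` marks an integer, `-r` a real number, and `-sN` a string of N characters.

```python
def parse_sphere_header(text):
    """Parse the text part of a NIST SPHERE header into (header_size, fields)."""
    lines = text.strip().splitlines()
    if lines[0].strip() != "NIST_1A":
        raise ValueError("not a NIST SPHERE header")
    header_size = int(lines[1])          # total header length in bytes
    fields = {}
    for line in lines[2:]:
        line = line.strip()
        if line == "end_head":           # marks the end of the field list
            break
        if not line:
            continue
        name, type_tag, value = line.split(None, 2)
        if type_tag == "-i":
            fields[name] = int(value)
        elif type_tag == "-r":
            fields[name] = float(value)
        else:                            # assumed to be a -sN string field
            fields[name] = value
    return header_size, fields


EXAMPLE = """NIST_1A
   1024
sample_rate -i 16000
channel_count -i 1
sample_n_bytes -i 2
sample_byte_format -s2 10
sample_sig_bits -i 16
sample_count -i 594400
sample_coding -s3 pcm
sample_checksum -i 20129
end_head
"""

header_size, fields = parse_sphere_header(EXAMPLE)
duration_sec = fields["sample_count"] / fields["sample_rate"]
print(header_size, fields["sample_coding"], duration_sec)
```

For the header shown on the slide, the sample count and sample rate give a recording of about 37 seconds.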
WAVE file format (Microsoft)
RIFF: a collection of data chunks. Each chunk has a 32-bit ID, followed by a 32-bit chunk length, followed by the chunk data.

    Offset  Field
    0x00    chunk id 'RIFF'
    0x04    chunk size (32 bits)
    0x08    wave chunk id 'WAVE'
    0x0C    format chunk id 'fmt '
    0x10    format chunk size (32 bits)
    0x14    format tag (currently pcm)
    0x16    number of channels (1 = mono, 2 = stereo)
    0x18    sample rate in Hz
    0x1C    average bytes per second
    0x20    number of bytes per sample (1 = 8-bit mono; 2 = 8-bit stereo or 16-bit mono; 4 = 16-bit stereo)
    0x22    number of bits in a sample
    0x24    data chunk id 'data'
    0x28    length of data chunk (32 bits)
    0x2C    sample data
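The fixed offsets above map directly onto a single `struct` format string. The following sketch decodes the 44-byte header exactly as laid out on this slide; the function name `parse_wav_header` is hypothetical, and the code assumes a plain PCM file in which the 'data' chunk immediately follows a 16-byte 'fmt ' chunk. Real files may carry extra chunks, which this sketch does not handle.

```python
import io
import struct
import wave


def parse_wav_header(data):
    """Decode the 44-byte PCM WAVE header (little-endian fields)."""
    (riff_id, riff_size, wave_id,
     fmt_id, fmt_size, format_tag, channels, sample_rate,
     bytes_per_sec, bytes_per_sample, bits_per_sample,
     data_id, data_size) = struct.unpack("<4sI4s4sIHHIIHH4sI", data[:44])
    if (riff_id, wave_id, fmt_id, data_id) != (b"RIFF", b"WAVE", b"fmt ", b"data"):
        raise ValueError("not a simple PCM WAVE header")
    return {
        "format_tag": format_tag,              # 1 means PCM
        "channels": channels,
        "sample_rate": sample_rate,
        "bytes_per_sec": bytes_per_sec,
        "bytes_per_sample": bytes_per_sample,  # all channels of one sample frame
        "bits_per_sample": bits_per_sample,
        "data_size": data_size,
    }


# Build a tiny 16-bit stereo file in memory with the standard library, then parse it.
buf = io.BytesIO()
with wave.open(buf, "wb") as w:
    w.setnchannels(2)
    w.setsampwidth(2)             # 2 bytes = 16 bits per channel
    w.setframerate(44100)
    w.writeframes(b"\x00" * 400)  # 100 frames of silence

header = parse_wav_header(buf.getvalue())
print(header)
```

Note that "average bytes per second" is simply the sample rate multiplied by the bytes per sample frame.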
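Mu-Law
u-law (or mu-law) compression is

\[ y = \frac{\operatorname{sgn}(x)}{\ln(1+\mu)} \, \ln\bigl(1+\mu\,|x|\bigr) \]

where \mu (written u above) = 100 or 255, A = 87.6 is the corresponding A-law parameter, and m_p is the peak message value, so that x is the message amplitude normalized by m_p.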
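The compression curve can be tried out numerically. This is a minimal sketch of the continuous formula only, assuming the input has already been normalized to the range [-1, 1]; the function names are hypothetical, and the expansion function is the algebraic inverse of the formula rather than something stated on the slide. It does not produce the 8-bit codes used in actual .au files.

```python
import math


def mu_law_compress(x, mu=255.0):
    """y = sgn(x) * ln(1 + mu*|x|) / ln(1 + mu), for x in [-1, 1]."""
    if not -1.0 <= x <= 1.0:
        raise ValueError("x must be normalized to [-1, 1]")
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def mu_law_expand(y, mu=255.0):
    """Inverse of mu_law_compress: x = sgn(y) * ((1 + mu)**|y| - 1) / mu."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)


for x in (0.0, 0.01, 0.1, 0.5, 1.0):
    y = mu_law_compress(x)
    print(f"x={x:5.2f}  y={y:.4f}  back={mu_law_expand(y):.4f}")
```

Small amplitudes are stretched over a large part of the output range, which is why quiet passages survive coarse quantization better than with linear coding.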
Compression
• u-law
• Silence detection
• ADPCM (adaptive differential PCM, 24/32/40 kbps)
• LPC-10E (Linear Predictive Coding, 2.4 kbps)
• CELP (4.8 kbps): builds on LPC
• GSM (European cell phones, RPE-LPC): 1650 bytes/sec (at 8000 samples/sec)
• RealAudio (builds on CELP, GSM; proprietary)
• MPEG Audio Layers (builds on ADPCM)
  – Layer-2: from 32 kbps to 384 kbps, target bit rate of 128 kbps
  – Layer-3: from 32 kbps to 320 kbps, target bit rate of 64 kbps
  – Complex compression, using perceptual models
Sound Editing
• GoldWave
  – requires a sound card
  – digital audio sound player, recorder and editor
  – can load, play and edit many different file formats
    • .wav, .au, .voc, .snd
  – displays separate graphics for the left and right channels
  – very easy to use
  – good sound quality
• Others: WHAM, Cool Edit, SOX, WINPLANY, Digital Audio Playback Facility, MOD4Win, etc.
Tips for Audio on the Web
There is no generic audio standard on the Web. Few systems on the Web have 16-bit sound capabilities, and listening to 16-bit sounds on an 8-bit system results in strange effects. Users will be annoyed if they spend a lot of time downloading a sound and then can’t play it.
• Distribute only 8-bit sounds on your Web page
• Or provide each sound file in both 8-bit and 16-bit versions
• Record at the highest sampling rate and sample size you can, and then process down to 8-bit
• Keep file size small
  – reduce samples to 8-bit
  – use a lower sampling rate
  – use mono sounds
• Describe what format the sounds are in
  – WAVE, AIFF, or other format
• Providing the file size in the description is a courtesy that helps users estimate download times
• If you need high sound quality and have large audio files:
  – Use a smaller sound clip in mu-law format as a preview, or for those who can’t listen to the higher-quality sample.
Check out http://www.realaudio.com/help/content/audiohints.html.
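The three size-reduction steps in the list (8-bit samples, lower sampling rate, mono) can be sketched on raw sample values. This is a deliberately naive illustration with a hypothetical function name: it assumes interleaved signed 16-bit stereo input, produces unsigned 8-bit output as used in 8-bit WAVE files, and lowers the rate by simply dropping frames with no low-pass filter, which can introduce aliasing. A dedicated tool such as SoX is the better choice for real material.

```python
def downconvert(samples, keep_every=2):
    """Interleaved signed 16-bit stereo samples -> unsigned 8-bit mono bytes.

    samples: sequence of ints [L0, R0, L1, R1, ...], each in -32768..32767.
    keep_every: keep one frame out of this many (2 halves the sampling rate).
    """
    out = bytearray()
    for i in range(0, len(samples) - 1, 2 * keep_every):
        left, right = samples[i], samples[i + 1]
        mono = (left + right) // 2       # mix both channels down to one
        out.append((mono >> 8) + 128)    # keep the top 8 bits, shift to 0..255
    return bytes(out)


# Four stereo frames (16 bytes as 16-bit data) shrink to two 8-bit mono samples.
frames = [0, 0, 100, 100, 32767, 32767, 5, 5]
small = downconvert(frames, keep_every=2)
print(len(frames) * 2, "bytes ->", len(small), "bytes")
```

Each of the three steps halves the data, so together they cut the size by a factor of eight.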
Space Requirements
Storage Requirements for One Minute of Sound
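For uncompressed PCM, one minute of sound takes sampling rate × bytes per sample × channels × 60 bytes. The sketch below applies that arithmetic to a few combinations of the common rates listed earlier; the function name and the chosen combinations are illustrative, and the figures ignore file headers and any compression.

```python
def bytes_per_minute(sample_rate_hz, bits_per_sample, channels):
    """Uncompressed PCM storage for 60 seconds of sound, in bytes."""
    return sample_rate_hz * (bits_per_sample // 8) * channels * 60


combos = [
    (8000, 8, 1),     # phone quality, mono
    (11025, 8, 1),
    (22050, 8, 1),
    (22050, 16, 2),
    (44100, 16, 2),   # CD quality, stereo
]
for rate, bits, chans in combos:
    size = bytes_per_minute(rate, bits, chans)
    print(f"{rate:6d} Hz  {bits:2d}-bit  {chans} ch  {size / 1_000_000:6.2f} MB")
```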
References
• http://www.nlc-bnc.ca/pubs/netnotes/notes24.htm
• http://www.spies.com/sox (conversion tool)
• http://freebsd.cdrom.com/.5/cica/sounds/gldwav21.zip
Sound
That’s all for today