RFC 1890 AV Profile January 1996
For sample-based encodings producing one or more octets per sample,
samples from different channels sampled at the same sampling instant
are packed in consecutive octets. For example, for a two-channel
encoding, the octet sequence is (left channel, first sample), (right
channel, first sample), (left channel, second sample), (right
channel, second sample), .... For multi-octet encodings, octets are
transmitted in network byte order (i.e., most significant octet
first).
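The interleaving rule above can be sketched in C for a 16-bit encoding. This is a minimal illustration, not part of the profile: the function name pack_interleaved16 and its parameters are hypothetical, and the caller is assumed to supply a large enough output buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Interleave per-channel 16-bit samples into a payload buffer.
 * chan[c][i] is sample i of channel c (c = 0 is left for stereo).
 * out must hold at least 2 * channels * samples octets.
 * Returns the number of octets written. */
size_t pack_interleaved16(const int16_t *const *chan, size_t channels,
                          size_t samples, uint8_t *out)
{
    size_t n = 0;
    assert(chan != NULL && out != NULL && channels > 0);
    for (size_t i = 0; i < samples; i++) {        /* one sampling instant */
        for (size_t c = 0; c < channels; c++) {   /* channels in order */
            uint16_t v = (uint16_t)chan[c][i];
            out[n++] = (uint8_t)(v >> 8);         /* most significant octet first */
            out[n++] = (uint8_t)(v & 0xff);
        }
    }
    return n;
}
```

For two channels the output order is left sample 1, right sample 1, left sample 2, right sample 2, each as two octets in network byte order.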
The packing of sample-based encodings producing less than one octet
per sample is encoding-specific.
4.3 Guidelines for Frame-Based Audio Encodings
Frame-based encodings encode a fixed-length block of audio into
another block of compressed data, typically also of fixed length. For
frame-based encodings, the sender may choose to combine several such
frames into a single message. The receiver can tell the number of
frames contained in a message since the frame duration is defined as
part of the encoding.
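As a rough sketch of how a receiver might derive the frame count, assume the codec produces encoded frames of a fixed, known size. The function names and the frame_octets and frame_duration_us parameters below are hypothetical; the actual values come from the definition of each encoding.

```c
#include <assert.h>
#include <stddef.h>

/* Number of whole frames in a payload, or -1 if the payload length is
 * not a multiple of the frame size. Assumes fixed-length frames. */
long frames_in_payload(size_t payload_octets, size_t frame_octets)
{
    assert(frame_octets > 0);
    if (payload_octets % frame_octets != 0)
        return -1;
    return (long)(payload_octets / frame_octets);
}

/* Audio duration covered by a packet, in microseconds, given the frame
 * duration defined by the encoding (e.g. 20000 for a 20 ms frame). */
long packet_duration_us(long frames, long frame_duration_us)
{
    return frames * frame_duration_us;
}
```

For an illustrative codec with 33-octet frames of 20 ms each, a 99-octet payload would hold 3 frames covering 60 ms.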
For frame-based codecs, the channel order is defined for the whole
block. That is, for two-channel audio, right and left samples are
coded independently, with the encoded frame for the left channel
preceding that for the right channel.
All frame-oriented audio codecs should be able to encode and decode
several consecutive frames within a single packet. Since the frame
size for the frame-oriented codecs is given, there is no need to use
a separate designation for the same encoding, but with different
number of frames per packet.
Schulzrinne Standards Track [Page 7]
4.4 Audio Encodings
encoding    sample/frame    bits/sample    ms/frame
____________________________________________________
1016        frame           N/A            30
DVI4        sample          4
G721        sample          4
G722        sample          8
G728        frame           N/A            2.5
GSM         frame           N/A            20
L8          sample          8
L16         sample          16
LPC         frame           N/A            20
MPA         frame           N/A
PCMA        sample          8
PCMU        sample          8
VDVI        sample          var.
Table 1: Properties of Audio Encodings
The characteristics of standard audio encodings are shown in Table 1
and their payload types are listed in Table 2.
4.4.1 1016
Encoding 1016 is a frame-based encoding using code-excited linear
prediction (CELP) and is specified in Federal Standard FED-STD 1016
[2,3,4,5].
The U.S. DoD's Federal-Standard-1016 based 4800 bps code excited
linear prediction voice coder version 3.2 (CELP 3.2) Fortran and C
simulation source codes are available for worldwide distribution at
no charge (on DOS diskettes, but configured to compile on Sun SPARC
stations) from: Bob Fenichel, National Communications System,
Washington, D.C. 20305, phone +1-703-692-2124, fax +1-703-746-4960.
4.4.2 DVI4
DVI4 is specified, with pseudo-code, in [6] as the IMA ADPCM wave
type. A specification titled "DVI ADPCM Wave Type" can also be found
in the Microsoft Developer Network Development Library CD ROM
published quarterly by Microsoft. The relevant section is found under
Product Documentation, SDKs, Multimedia Standards Update, New
Multimedia Data Types and Data Techniques, Revision 3.0, April 15,
1994. However, the encoding defined here as DVI4 differs in two
respects from these recommendations:
o The header contains the predicted value rather than the first
sample value.
o IMA ADPCM blocks contain an odd number of samples, since the
first sample of a block is contained just in the header
(uncompressed), followed by an even number of compressed
samples. DVI4 has an even number of compressed samples only,
using the 'predict' word from the header to decode the first
sample.
Each packet contains a single DVI block. The profile only defines the
4-bit-per-sample version, while IMA also specifies a 3-bit-per-sample
encoding.
The "header" word for each channel has the following structure:
    int16  predict;   /* predicted value of first sample
                         from the previous block (L16 format) */
    u_int8 index;     /* current index into stepsize table */
    u_int8 reserved;  /* set to zero by sender, ignored by receiver */
Packing of samples for multiple channels is for further study.
The document, "IMA Recommended Practices for Enhancing Digital Audio
Compatibility in Multimedia Systems (version 3.0)", contains the
algorithm description. It is available from:
Interactive Multimedia Association
48 Maryland Avenue, Suite 202
Annapolis, MD 21401-8011
USA
phone: +1 410 626-1380
4.4.3 G721
G721 is specified in ITU recommendation G.721. Reference
implementations for G.721 are available as part of the CCITT/ITU-T
Software Tool Library (STL) from the ITU General Secretariat, Sales
Service, Place des Nations, CH-1211 Geneva 20, Switzerland. The
library is covered by a license.
4.4.4 G722
G722 is specified in ITU-T recommendation G.722, "7 kHz audio-coding
within 64 kbit/s".
4.4.5 G728
G728 is specified in ITU-T recommendation G.728, "Coding of speech at
16 kbit/s using low-delay code excited linear prediction".
4.4.6 GSM
GSM (Groupe Spécial Mobile) denotes the European GSM 06.10
provisional standard for full-rate speech transcoding, prI-ETS 300
036, which is based on RPE/LTP (residual pulse excitation/long term
prediction) coding at a rate of 13 kb/s [7,8,9]. The standard can be
obtained from
ETSI (European Telecommunications Standards Institute)
ETSI Secretariat: B.P.152
F-06561 Valbonne Cedex
France
Phone: +33 92 94 42 00
Fax: +33 93 65 47 16
4.4.7 L8
L8 denotes linear audio data, using 8 bits of precision with an
offset of 128, that is, the most negative signal is encoded as zero.
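The offset-128 representation can be illustrated with a pair of conversion helpers. The function names are hypothetical and the code is only a sketch of the mapping described above.

```c
#include <assert.h>
#include <stdint.h>

/* Convert one L8 octet (offset of 128) to a signed sample value. */
int l8_to_signed(uint8_t octet)
{
    return (int)octet - 128;     /* 0 -> -128, 128 -> 0, 255 -> 127 */
}

/* Convert a signed sample value in [-128, 127] to an L8 octet. */
uint8_t signed_to_l8(int value)
{
    assert(value >= -128 && value <= 127);
    return (uint8_t)(value + 128);
}
```

The most negative signal, -128, maps to the octet value zero, and silence maps to 128.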
4.4.8 L16
L16 denotes uncompressed audio data, using 16-bit signed
representation with 65535 equally divided steps between minimum and
maximum signal level, ranging from -32768 to 32767. The value is
represented in two's complement notation and network byte order.
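A receiver might convert L16 payload octets to host-order samples along these lines. The function name l16_decode is hypothetical, and the sketch avoids depending on the byte order of the host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode count L16 samples from payload into host-order values. */
void l16_decode(const uint8_t *payload, size_t count, int16_t *out)
{
    assert(payload != NULL && out != NULL);
    for (size_t i = 0; i < count; i++) {
        /* The most significant octet comes first on the wire. */
        int32_t v = ((int32_t)payload[2 * i] << 8) | payload[2 * i + 1];
        if (v >= 0x8000)
            v -= 0x10000;        /* two's complement: restore the sign */
        out[i] = (int16_t)v;
    }
}
```

The octet pairs 0x80 0x00 and 0x7f 0xff decode to the extremes of the range, -32768 and 32767.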
4.4.9 LPC
LPC designates an experimental linear predictive encoding contributed
by Ron Frederick, Xerox PARC, which is based on an implementation
written by Ron Zuckerman, Motorola, posted to the Usenet group
comp.dsp on June 26, 1992.
4.4.10 MPA
MPA denotes MPEG-I or MPEG-II audio encapsulated as elementary
streams. The encoding is defined in ISO standards ISO/IEC 11172-3 and
13818-3. The encapsulation is specified in work in progress [10],
Section 3. The authors can be contacted at
Don Hoffman
Sun Microsystems, Inc.
Mail-stop UMPK14-305
2550 Garcia Avenue
Mountain View, California 94043-1100
USA
electronic mail: don.hoffman@eng.sun.com
Sampling rate and channel count are contained in the payload. MPEG-I
audio supports sampling rates of 32000, 44100, and 48000 Hz (ISO/IEC
11172-3, section 1.1; "Scope"). MPEG-II additionally supports sampling
rates of 16000, 22050, and 24000 Hz (ISO/IEC 13818-3).
4.4.11 PCMA
PCMA is specified in CCITT/ITU-T recommendation G.711. Audio data is
encoded as eight bits per sample, after logarithmic scaling. Code to
convert between linear and A-law companded data is available in [6].
A detailed description is given by Jayant and Noll [11].
4.4.12 PCMU
PCMU is specified in CCITT/ITU-T recommendation G.711. Audio data is
encoded as eight bits per sample, after logarithmic scaling. Code to
convert between linear and mu-law companded data is available in [6].
PCMU is the encoding used for the Internet media type audio/basic. A
detailed description is given by Jayant and Noll [11].
4.4.13 VDVI
VDVI is a variable-rate version of DVI4, yielding speech bit rates
between 10 and 25 kb/s. It is specified for single-channel operation
only. It uses the following encoding:
DVI4 codeword    VDVI bit pattern
__________________________________
            0    00
            1    010
            2    1100
            3    11100
            4    111100
            5    1111100
            6    11111100
            7    11111110
            8    10
            9    011
           10    1101
           11    11101
           12    111101
           13    1111101
           14    11111101
           15    11111111
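The mapping in the table can be applied with a small lookup-based encoder. This is a sketch only: the names vdvi_code and vdvi_encode are hypothetical, and the bit order within each octet (most significant bit first, with zero padding in the final octet) is an assumption that the text above does not state.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* VDVI bit patterns from the table, indexed by DVI4 codeword. */
static const struct { uint8_t bits; uint8_t len; } vdvi_code[16] = {
    {0x00, 2}, {0x02, 3}, {0x0c, 4}, {0x1c, 5},
    {0x3c, 6}, {0x7c, 7}, {0xfc, 8}, {0xfe, 8},
    {0x02, 2}, {0x03, 3}, {0x0d, 4}, {0x1d, 5},
    {0x3d, 6}, {0x7d, 7}, {0xfd, 8}, {0xff, 8},
};

/* Encode n DVI4 codewords (each 0..15) into out.
 * Assumption: bits are packed most significant bit first and the
 * unused bits of the last octet are zero.
 * out must hold at least n octets. Returns the number of bits written. */
size_t vdvi_encode(const uint8_t *codewords, size_t n, uint8_t *out)
{
    size_t bitpos = 0;
    assert(codewords != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        assert(codewords[i] < 16);
        uint8_t bits = vdvi_code[codewords[i]].bits;
        int len = vdvi_code[codewords[i]].len;
        for (int b = len - 1; b >= 0; b--) {
            size_t byte = bitpos / 8;
            if (bitpos % 8 == 0)
                out[byte] = 0;             /* start a fresh octet */
            if ((bits >> b) & 1)
                out[byte] |= (uint8_t)(0x80 >> (bitpos % 8));
            bitpos++;
        }
    }
    return bitpos;
}
```

No pattern in the table is a prefix of another, so a decoder can recover the codewords by reading bits until one pattern matches. The shortest patterns are assigned to codewords 0 and 8.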
5. Video
The following video encodings are currently defined, with their
abbreviated names used for identification:
5.1 CelB
The CELL-B encoding is a proprietary encoding proposed by Sun
Microsystems. The byte stream format is described in work in
progress [12]. The author can be contacted at
Michael F. Speer
Sun Microsystems Computer Corporation
2550 Garcia Ave MailStop UMPK14-305
Mountain View, CA 94043
United States
electronic mail: michael.speer@eng.sun.com
5.2 JPEG
The encoding is specified in ISO Standards 10918-1 and 10918-2. The
RTP payload format is as specified in work in progress [13]. Further
information can be obtained from
Steven McCanne
Lawrence Berkeley National Laboratory
M/S 46A-1123
One Cyclotron Road
Berkeley, CA 94720
United States
Phone: +1 510 486 7520
electronic mail: mccanne@ee.lbl.gov
5.3 H261
The encoding is specified in CCITT/ITU-T standard H.261. The
packetization and RTP-specific properties are described in work in
progress [14]. Further information can be obtained from
Thierry Turletti
Office NE 43-505
Telemedia, Networks and Systems
Laboratory for Computer Science
Massachusetts Institute of Technology
545 Technology Square
Cambridge, MA 02139
United States
electronic mail: turletti@clove.lcs.mit.edu