<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<!-- saved from url=(0058)http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/voicebox.html -->
<HTML><HEAD><TITLE>VOICEBOX: Speech Processing Toolbox for MATLAB</TITLE>
<META http-equiv=Content-Type content="text/html; charset=windows-1252">
<META content="MSHTML 6.00.2800.1106" name=GENERATOR>
<META content="E:\Program Files\Microsoft Office\Office\html.dot"
name=Template></HEAD>
<BODY vLink=#800080 link=#0000ff>
<H1>VOICEBOX: Speech Processing Toolbox for MATLAB</H1>
<H2>Introduction</H2>
<P>VOICEBOX is a speech processing toolbox consisting of MATLAB routines that
are maintained by, and mostly written by, <A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/dmb.html">Mike Brookes</A>, <A
href="http://www.ee.ic.ac.uk/">Department of Electrical &amp; Electronic
Engineering</A>, <A href="http://www.ic.ac.uk/">Imperial College</A>, Exhibition
Road, London SW7 2BT, UK. Several of the routines require MATLAB V5.</P>
<P>The routines are available as a <A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/voicebox.tar.Z">compressed
tar file</A> or as a <A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/voicebox.zip">zip archive</A>
and are made available under the terms of the <A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/copying.txt">GNU General Public
License</A>.</P>
<P>The routine voicebox.m contains various installation-dependent parameters
that may need to be altered before you use the toolbox.</P>
<P>For reading compressed SPHERE format files, you will need the <A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/shorten.zip">SHORTEN</A>
program written by Tony Robinson and SoftSound Limited (<A
href="http://www.softsound.com/">http://www.softsound.com/</A>). The path to the
SHORTEN executable must be set in voicebox.m.</P>
<P>Please send any comments, suggestions, bug reports, etc. to <A
href="mailto:mike.brookes@ic.ac.uk">mike.brookes@ic.ac.uk</A>. </P>
<HR>
<H2>Contents</H2>
<HR>
<DL>
<DT><A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/voicebox.html#file">Audio
File Input/Output </A>
<DD>Read and write WAV and other speech file formats
<DT><A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/voicebox.html#frequency">Frequency
Scales </A>
<DD>Convert between Hz, mel, ERB and MIDI frequency scales
<DT><A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/voicebox.html#fourier">Fourier/DCT/Hartley
Transforms</A>
<DD>Various related transforms
<DT><A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/voicebox.html#random">Random
Number Generation</A>
<DD>Generate random vectors and noise signals
<DT><A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/voicebox.html#distance">Vector
Distances</A>
<DD>Calculate distances between vector lists
<DT><A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/voicebox.html#analysis">Speech
Analysis</A>
<DD>Active level estimation, Spectrograms
<DT><A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/voicebox.html#lpc">LPC
Analysis of Speech</A>
<DD>Linear Predictive Coding routines
<DT><A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/voicebox.html#synthesis">Speech
Synthesis</A>
<DD>Glottal waveform models
<DT><A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/voicebox.html#enhance">Speech
Enhancement</A>
<DD>Spectral noise subtraction
<DT><A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/voicebox.html#coding">Speech
Coding</A>
<DD>PCM coding, Vector quantisation
<DT><A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/voicebox.html#recog">Speech
Recognition</A>
<DD>Front-end processing for recognition
<DT><A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/voicebox.html#utility">Utility
Functions</A>
<DD>Miscellaneous utility functions </DD></DL>
<HR>
<HR>
<H2><A name=file>Audio File Input/Output</A></H2>
<BLOCKQUOTE>
<P>Routines are available to read and, in some cases, write a variety of file
formats:</P>
<TABLE cellPadding=2 width="100%" border=0>
<TBODY>
<TR>
<TD width=50><B>Read</B></TD>
<TD width=50><B>Write</B></TD>
<TD width=30><B>Suffix</B></TD>
<TD><B>Description</B></TD></TR>
<TR>
<TD width=50><A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/txt/readwav.txt">readwav</A></TD>
<TD width=50><A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/txt/writewav.txt">writewav</A></TD>
<TD width=30>.wav</TD>
<TD>These routines allow an arbitrary number of channels and can deal
with linear PCM (any precision up to 32 bits), A-law PCM and Mu-law PCM.
Large files can be read and written in small chunks.</TD></TR>
<TR>
<TD width=50><A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/txt/readhtk.txt">readhtk</A></TD>
<TD width=50><A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/txt/writehtk.txt">writehtk</A></TD>
<TD width=30>.htk</TD>
<TD>Read and write waveform files used by Entropic's Hidden Markov Model
Toolkit (HTK).</TD></TR>
<TR>
<TD width=50><A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/txt/readsfs.txt">readsfs</A></TD>
<TD width=50> </TD>
<TD width=30>.sfs</TD>
<TD>Speech Filing System (SFS) files from Mark Huckvale at UCL.</TD></TR>
<TR>
<TD width=50><A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/txt/readsph.txt">readsph</A></TD>
<TD width=50> </TD>
<TD width=30>.sph</TD>
<TD>NIST SPHERE format files (including TIMIT). Needs <A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/shorten.zip">SHORTEN</A>
for compressed files.</TD></TR>
<TR>
<TD width=50><A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/txt/readaif.txt">readaif</A></TD>
<TD width=50> </TD>
<TD width=30>.aif</TD>
<TD>AIFF format (Audio Interchange File Format) used by Mac
users.</TD></TR></TBODY></TABLE></BLOCKQUOTE>
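<P>The chunked access mentioned for the WAV routines can be illustrated outside
MATLAB. The following is a minimal Python sketch using the standard
<CODE>wave</CODE> module, not the readwav implementation: the helper name
<CODE>read_wav_in_chunks</CODE> and its <CODE>frames_per_chunk</CODE> parameter
are hypothetical, and the <CODE>wave</CODE> module handles linear PCM only, so
A-law and Mu-law files are outside its scope.</P>

```python
import wave


def read_wav_in_chunks(path, frames_per_chunk=4096):
    """Yield successive blocks of raw PCM bytes from a linear-PCM WAV file.

    Hypothetical helper for illustration; it is not part of VOICEBOX.
    Each block holds at most frames_per_chunk frames, so the whole file
    never has to be held in memory at once.
    """
    with wave.open(path, "rb") as wav:
        while True:
            chunk = wav.readframes(frames_per_chunk)
            if not chunk:  # an empty bytes object signals end of file
                break
            yield chunk
```

<P>Each yielded block contains the interleaved samples of all channels as raw
bytes; decoding them to numbers depends on the sample width reported by the
file header.</P>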
<HR>
<H2><A name=frequency>Frequency Scale Conversion</A></H2>
<UL>
<LI>The <I>mel scale</I> is based on the human perception of sinewave pitch.
The routines <A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/txt/mel2frq.txt">mel2frq</A>
and <A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/txt/frq2mel.txt">frq2mel</A>
convert between this scale and frequency in Hz.
<LI>The <I>ERB</I> scale is based on the equivalent rectangular bandwidths of
the human ear. The routines <A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/txt/erb2frq.txt">erb2frq</A>
and <A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/txt/frq2erb.txt">frq2erb</A>
convert between the ERB-rate scale and frequency in Hz.
<LI>The <I>MIDI standard</I> specifies a numbering of <I>semitones</I> with
middle C being 60. The routines <A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/txt/frq2midi.txt">frq2midi</A>
and <A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/txt/midi2frq.txt">midi2frq</A>
convert between this musical frequency scale and Hz. <A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/txt/frq2midi.txt">frq2midi</A>
can additionally output note names as character strings. <A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/txt/midi2frq.txt">midi2frq</A>
can use either the normal equal-tempered scale or the Pythagorean scale of just
intonation. </LI></UL>
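<P>The mel and MIDI conversions described above can be sketched in a few lines
of Python. This is an illustration under stated assumptions, not the VOICEBOX
source: the function names are hypothetical, the mel formula is one common
definition scaled so that 1000 Hz maps to 1000 mel (the constants used by
frq2mel should be checked in its own documentation), and the MIDI mapping
assumes equal temperament with A4 = 440 Hz = note 69, which places middle C at
note 60.</P>

```python
import math

# Assumed mel definition: logarithmic above about 700 Hz and scaled so that
# 1000 Hz corresponds to 1000 mel. VOICEBOX's own constants may differ.
MEL_K = 1000.0 / math.log(1.0 + 1000.0 / 700.0)


def hz_to_mel(frq_hz):
    """Convert a non-negative frequency in Hz to mel (assumed formula)."""
    return MEL_K * math.log(1.0 + frq_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (math.exp(mel / MEL_K) - 1.0)


def hz_to_midi(frq_hz):
    """Convert a positive frequency in Hz to a (fractional) MIDI note number.

    Equal-tempered scale, 12 semitones per octave, A4 = 440 Hz = note 69.
    """
    return 69.0 + 12.0 * math.log2(frq_hz / 440.0)


def midi_to_hz(midi):
    """Inverse of hz_to_midi for the equal-tempered scale."""
    return 440.0 * 2.0 ** ((midi - 69.0) / 12.0)
```

<P>With these definitions, <CODE>midi_to_hz(60)</CODE> gives middle C at roughly
261.63 Hz, and each pair of functions round-trips to within floating-point
error.</P>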
<HR>
<H2><A name=fourier>Fourier, DCT and Hartley Transforms</A></H2>
<UL>
<LI>The routines <A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/txt/rfft.txt">rfft</A> and
<A href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/txt/irfft.txt">irfft</A>
perform forward and inverse Fourier transforms on real data. The forward
routine, rfft, generates only half of the conjugate-symmetric transform. For
even-length data, the inverse routine, irfft, is asymptotically twice as fast
as the built-in inverse FFT routine, ifft. The routine <A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/txt/rsfft.txt">rsfft</A>
performs the forward transform on real symmetric data.
<LI>The routines <A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/txt/rdct.txt">rdct</A> and
<A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/txt/irdct.txt">irdct</A>
perform forward and inverse discrete cosine transforms on real data. The
routines are asymptotically twice as fast as the complex-data routines in the
image-processing and signal-processing toolboxes.
<LI>The routine <A
href="http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/txt/rhartley.txt">rhartley</A>
performs a forward or inverse Hartley transform. </LI></UL>
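<P>The reason only half of the transform needs to be generated for real data
can be shown with a small Python sketch. It uses a direct O(n&sup2;) DFT for
clarity rather than an FFT, and it is not the rfft or irfft implementation; all
function names here are hypothetical. For a real sequence of length n, bin n-k
is the complex conjugate of bin k, so bins 0 to floor(n/2) are enough to rebuild
the rest.</P>

```python
import cmath


def dft(x):
    """Direct discrete Fourier transform (O(n^2), for illustration only)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def half_spectrum(x):
    """Keep only bins 0 .. n//2 of the DFT of a real sequence."""
    return dft(x)[: len(x) // 2 + 1]


def full_from_half(half, n):
    """Rebuild all n bins from the half spectrum using X[n-k] = conj(X[k])."""
    full = list(half)
    for k in range(len(half), n):
        full.append(half[n - k].conjugate())
    return full


def real_from_half(half, n):
    """Inverse transform of a half spectrum back to n real samples."""
    full = full_from_half(half, n)
    return [
        (sum(full[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]
```

<P>The original length n has to be supplied to the inverse because an even
length n and the odd length n+1 both produce the same number of retained
bins.</P>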