Difference between revisions of "RT LPC"

From CSWiki
[http://www.cs.princeton.edu/~amisra/ Ananya Misra] + [http://www.cs.princeton.edu/~gewang/ Ge Wang]
Assignment 1 for [http://www.cs.princeton.edu/courses/archive/fall04/cos597E/ 597E - Digital Speech Processing],
[http://www.research.att.com/~mazin/ Professor Mazin Rahim]

Revision as of 18:59, 8 May 2007

rt_lpc : real-time LPC analysis + synthesis + visualization

Ananya Misra + Ge Wang


We released this under GPL. It can be found at:



rt_lpc is a light-weight application that performs real-time LPC analysis and synthesis. It features the following:

  • real-time LPC analysis
  • real-time LPC synthesis
  • visualization of original, predicted, and error waveforms
  • visualization of vocal tract shape from LPC coefficients
  • adjustable LPC analysis order
  • adjustable synthesis pitch shift
  • other options (pitch pulse source selection, emphasis filter)
  • STFT plot
  • modular LPC library
  • available on MacOS X, Linux, and Windows under GPL
  • part of the sndtools distribution

it looks like this:



you can start rt_lpc from the command line with the following optional arguments:

   --srate<N> - sets the real-time audio sample rate
   --ola<N> - enables (N==1) or disables (N==0) overlap add

for example:

   rt_lpc --srate22050 --ola1
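
Note that the value is attached directly to the flag name, with no space or '=' in between. The following is only a minimal sketch of how arguments in that form could be parsed; the Options struct, the function names, and the default values are hypothetical and are not taken from the rt_lpc source.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical option holder; the defaults are assumptions, not rt_lpc's.
struct Options
{
    long srate = 44100;   // assumed default sample rate
    bool ola = false;     // assumed default: overlap-add disabled
};

// If arg starts with prefix, parse the integer attached to it into value.
static bool parse_attached_int( const std::string & arg, const char * prefix, long & value )
{
    const std::size_t n = std::strlen( prefix );
    if( arg.compare( 0, n, prefix ) != 0 || arg.size() == n ) return false;
    char * end = nullptr;
    const long v = std::strtol( arg.c_str() + n, &end, 10 );
    if( *end != '\0' ) return false;   // trailing junk, e.g. --srate22k
    value = v;
    return true;
}

// Returns false on the first unrecognized or malformed argument.
bool parse_args( const std::vector<std::string> & args, Options & opts )
{
    for( const std::string & arg : args )
    {
        long v = 0;
        if( parse_attached_int( arg, "--srate", v ) && v > 0 )
            opts.srate = v;
        else if( parse_attached_int( arg, "--ola", v ) && ( v == 0 || v == 1 ) )
            opts.ola = ( v == 1 );
        else
            return false;
    }
    return true;
}
```

With this sketch, the arguments from the example above would set srate to 22050 and turn overlap-add on, while something like --ola2 would be rejected.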

When the program is running, the following keyboard and mouse controls are available (make sure the window is in focus when you press a key):

   's' - toggle fullscreen
   '=' - increase pitch factor
   '-' - decrease pitch factor
   'p' - increase order
   'o' - decrease order
   't' - toggle vocal tract
   'v' - select vocal tract rendering mode
   'g' - toggle using impulse train / glottal pulse
   'b' - toggle preemphasis and deemphasis filter
   'm' - use MIDI input as pitch
   'w' - toggle waterfall plot
   'd' - toggle dB plot for spectrum
   'f' - move spectrum + z
   'j' - move spectrum - z
   'e' - spacing more!
   'i' - spacing less!
   L, R mouse button - rotate left or right
   'h' - print this help message
   'q' - quit

Software Design

the system "architecture" is shown below:


The basic steps involved are:

  • retrieve the next buffer of audio from the soundcard/mic
  • preemphasis filtering (optional)
  • do LPC analysis
    • perform autocorrelation on the frame of audio signal
    • construct the NxN autocorrelation matrix R, where N is the LPC order
    • invert R and obtain the N coefficients
    • use the coefficients to find MSE/LPC residual/power
    • determine the pitch from the autocorrelation
  • do LPC synthesis (given pitch, power, coefficients, and order N)
    • generate a source based on the pitch information
      • if unpitched, use white noise
      • if pitched, use impulse train or glottal pulse train (optional)
    • pass source through all-pole filter with the N coefficients
    • do minimal state-tracking to enhance transitions between frames
  • deemphasis filtering (optional)
  • render graphics
    • input waveform
    • prediction
    • residue
    • STFT
    • vocal tract shape from coefficients
    • order/pitch
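
The analysis and synthesis steps above can be sketched in a few self-contained functions. This is an illustration under stated assumptions, not the rt_lpc code: all function names are hypothetical, double precision is used for clarity, and the system is solved with the Levinson-Durbin recursion (which exploits the Toeplitz structure of R) rather than an explicit matrix inversion. Pitch detection and source generation are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Autocorrelation lags r[0..order] of one frame of audio.
std::vector<double> autocorr_lags( const std::vector<double> & x, int order )
{
    assert( order >= 0 && static_cast<std::size_t>( order ) < x.size() );
    std::vector<double> r( static_cast<std::size_t>( order ) + 1, 0.0 );
    for( int k = 0; k <= order; k++ )
        for( std::size_t n = static_cast<std::size_t>( k ); n < x.size(); n++ )
            r[k] += x[n] * x[n - k];
    return r;
}

// Levinson-Durbin recursion: finds a[1..order] such that
// x[n] is approximated by sum_k a[k] * x[n-k]. a[0] is unused (0).
// err receives the final prediction error energy.
std::vector<double> levinson_durbin( const std::vector<double> & r, int order, double & err )
{
    std::vector<double> a( static_cast<std::size_t>( order ) + 1, 0.0 ), prev;
    err = r[0];
    for( int i = 1; i <= order; i++ )
    {
        if( err <= 0.0 ) break;   // silent or degenerate frame
        double acc = r[i];
        for( int j = 1; j < i; j++ ) acc -= a[j] * r[i - j];
        const double k = acc / err;   // reflection coefficient
        prev = a;
        a[i] = k;
        for( int j = 1; j < i; j++ ) a[j] = prev[j] - k * prev[i - j];
        err *= ( 1.0 - k * k );
    }
    return a;
}

// Residual e[n] = x[n] - sum_k a[k] * x[n-k]; samples before the frame count as 0.
std::vector<double> lpc_residual_sketch( const std::vector<double> & x, const std::vector<double> & a )
{
    std::vector<double> e( x.size() );
    for( std::size_t n = 0; n < x.size(); n++ )
    {
        double pred = 0.0;
        for( std::size_t k = 1; k < a.size() && k <= n; k++ ) pred += a[k] * x[n - k];
        e[n] = x[n] - pred;
    }
    return e;
}

// All-pole synthesis filter: y[n] = src[n] + sum_k a[k] * y[n-k].
std::vector<double> all_pole_sketch( const std::vector<double> & src, const std::vector<double> & a )
{
    std::vector<double> y( src.size() );
    for( std::size_t n = 0; n < src.size(); n++ )
    {
        double fb = 0.0;
        for( std::size_t k = 1; k < a.size() && k <= n; k++ ) fb += a[k] * y[n - k];
        y[n] = src[n] + fb;
    }
    return y;
}
```

Feeding the residual back through the all-pole filter with the same coefficients reproduces the input frame, up to rounding. In synthesis, the residual is replaced by white noise, an impulse train, or a glottal pulse train scaled to the measured power.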

Our LPC library

In our implementation, the LPC library contains all functionality related to LPC analysis and synthesis, in the hope that it may be useful to other applications. The interface looks like the following:

   // init
   lpc_data lpc_create( );
   // analysis
   void lpc_analyze( lpc_data instance, SAMPLE * x, int len, float * coefs, 
                     int order, float * power, float * pitch, 
                     SAMPLE * residue = NULL );
   // synthesis
   void lpc_synthesize( lpc_data instance, SAMPLE * y, int len, float * coefs,
                        int order, float power, float pitch, int alt = 0 );
   // done
   void lpc_destroy( lpc_data & instance );
   // helper -- autocorrelation
   float autocorrelate( SAMPLE * x, int len, SAMPLE * y );
   // helper -- lpc prediction 
   float lpc_predict( lpc_data lpc, SAMPLE * x, int len, float * coefs, int order );
   // helper -- preemphasis
   void lpc_preemphasis( SAMPLE * x, int len, float alpha );
   // helper -- deemphasis
   void lpc_deemphasis( SAMPLE * y, int len, float alpha );
   // helper -- set alt src
   void lpc_alt( lpc_data lpc, SAMPLE * buffer, int len );
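
To illustrate what the emphasis helpers do, here is a minimal sketch of a first-order preemphasis filter and its inverse, using the same argument shapes as lpc_preemphasis and lpc_deemphasis. The function names are hypothetical, SAMPLE is assumed to be float (its definition is not shown here), and the library's own implementation may differ, for example by carrying filter state across buffers.

```cpp
#include <cassert>

typedef float SAMPLE;   // assumption: the library's SAMPLE type is not shown on this page

// First-order preemphasis in place: x[n] <- x[n] - alpha * x[n-1].
// The sample before the buffer is taken as 0 (no state kept across buffers).
void preemphasis_sketch( SAMPLE * x, int len, float alpha )
{
    assert( x != nullptr && len >= 0 );
    SAMPLE prev = 0;
    for( int n = 0; n < len; n++ )
    {
        const SAMPLE cur = x[n];   // keep the unfiltered sample for the next step
        x[n] = cur - alpha * prev;
        prev = cur;
    }
}

// Matching deemphasis in place: y[n] <- y[n] + alpha * y[n-1],
// where y[n-1] is the already deemphasized previous sample.
void deemphasis_sketch( SAMPLE * y, int len, float alpha )
{
    assert( y != nullptr && len >= 0 );
    SAMPLE prev = 0;
    for( int n = 0; n < len; n++ )
    {
        y[n] = y[n] + alpha * prev;
        prev = y[n];
    }
}
```

Applying the deemphasis sketch to a preemphasized buffer with the same alpha restores the original samples, up to rounding. That is why the two filters bracket the analysis and synthesis stages.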



This homey veranda scene had become commonplace in the U.S.
by 1914, a year when Americans bought half a million talking
machines - which were now, regardless of make, known popularly
as Victrolas. As new models were introduced, their amplifying
horns became more and more elaborate and decorative, with names
like "Morning Glory" and "Cygnet".