Kevin Chou's Final Project: A Speech-Centric Ambience Generator
This project is a first attempt to generate a soundscape from the audio qualities of speech. In this initial implementation, selected audio features are sonified as a complement to an input speech file (see Future Work). You can listen to a sonified version of Martin Luther King Jr.'s "I Have a Dream" speech here.
An important feature of this speech-based soundscape generator is a sliding buffer that holds the data for the four features extracted with ChucK's unit analyzers--Centroid, Flux, RMS, and RollOff. The raw data, taken at the analysis rate (more precisely, once per FFT window), is not terribly useful or coherent on its own--it is much more useful to find how the data changes over time. Toward this end, averages and standard deviations are computed in real time using a running-total method, since recalculating each average and standard deviation from scratch after every new data point could not keep up with the sampling rate. This buffer-plus-statistics model is the basis for the sonic qualities of the project.
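The project's own source is not reproduced here, but the buffer-plus-statistics idea can be sketched in ChucK roughly as follows. This is a minimal, illustrative sketch, not the actual project code: the analyzer choice (RMS only, one of the four chains), the buffer length N, and all variable names are assumptions. It shows the key trick--updating a running sum and a running sum of squares as each value enters and leaves the buffer, so the mean and standard deviation cost O(1) per frame instead of a full rescan.

```chuck
// Sketch: one feature chain (RMS) feeding a sliding buffer with
// running-total statistics. Names and sizes are illustrative.
adc => FFT fft =^ RMS rms => blackhole;
1024 => fft.size;
Windowing.hann(1024) => fft.window;

64 => int N;             // sliding-buffer length (assumed)
float buf[N];            // starts zero-filled; stats warm up over N frames
0 => int pos;
0.0 => float sum;        // running total of buffered values
0.0 => float sumSq;      // running total of squared values

while( true )
{
    rms.upchuck();                // compute the next RMS frame
    rms.fval(0) => float x;

    // remove the oldest value from the running totals...
    buf[pos] -=> sum;
    buf[pos] * buf[pos] -=> sumSq;
    // ...then overwrite it with the new value and add it back in
    x => buf[pos];
    x +=> sum;
    x * x +=> sumSq;
    (pos + 1) % N => pos;

    // O(1) mean and standard deviation over the buffer
    sum / N => float mean;
    Math.sqrt( Math.max( 0.0, sumSq / N - mean * mean ) ) => float stdev;

    // advance one FFT window per frame (no overlap in this sketch)
    fft.size()::samp => now;
}
```

The same pattern would be duplicated (or parameterized) for the Centroid, Flux, and RollOff chains; the computed mean and stdev are what would drive the sonic mappings described above.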
Running the Code
- Make sure you have miniAudicle or ChucK installed.
- Careful with directories! The home directory should point to the folder containing the ChucK files, or simply run the program from the command line/Terminal.
- To run the program: % chuck adder.ck will automatically add all the needed files. Enjoy!
Download the Code
Find it here: FinalProject.zip
Future Work
The original goal for this project was to implement a speech-centric emotion musicalizer (is that a word?). In other words, the program would analyze a speech from a file or a microphone and output an ambience that follows the tone, emotion, or other attributes of the speech. Thus, the program would sonically convey a passionate or boring speech in a way that makes musical sense.
Toward this goal, remaining work includes:
- A more thorough investigation of speech qualities and their correlation with audio features. This will entail examining audio from many sources--speech by many different people--in order to find which audio features correlate with particular emotions.
- An exploration of other sonic descriptors for emotions; potential sources include natural recordings.
- An evaluation of the current code, streamlining it to prevent memory overflow. The implementation of the sounds also needs rethinking.