Difference between revisions of "ChucK/ISMIR2008"
m (→Chuck (ge))
|Line 146:||Line 146:|
Mention that an example of learning appears in Section AAA.
Mention that an example of learning appears in Section AAA.
== introduce some "working pipelines"==
== introduce some "working pipelines"==
Revision as of 18:02, 29 March 2008
Our fancy ISMIR paper outline.
- 1 Notes for us
- 2 Abstract
- 3 scratch
- 4 Introduction
- 5 Background & Motivation
- 6 Recent additions to ChucK to allow for prototyping and realtime MIR
- 7 introduce some "working pipelines"
- 8 case studies
- 9 related work
- 10 conclusion & map out future work
Notes for us
6 page limit.
Title: Support for MIR prototyping and real-time applications in the ChucK programming language
- There is no general purpose language for MIR prototyping that both gives access to analysis building blocks and allows for low-level coding
- This sort of framework can be really useful for fast prototyping, flexible coding, and education
- There is furthermore no such framework that does this that combines MIR and performance / synthesis; ChucK does this.
- We're going to show what is available in the language, and examples of how to work with it to accomplish MIR tasks
(re + ge)
In this paper, we discuss the recent additions of audio analysis and machine learning infrastructure to the ChucK music programming language that make it a suitable and unique tool both for music information retrieval system prototyping and for applying music information retrieval algorithms in real-time music performance contexts. From its inception, ChucK has offered both high-level control over building block components as well as fine-grained, low-level sample-synchronous manipulation; the new analysis and learning capabilities of the language were designed to preserve this breadth of control options and "do it yourself" approach to creating algorithms and systems for audio, allowing the programmer to experiment with new features, signal processing techniques, and learning algorithms with ease and flexibility. Additionally, these new capabilities are tightly integrated into ChucK's synthesis framework, making it trivial to use the results of analysis and learning tasks to drive music creation and interaction in real time.
Designed as a language for composers and performers, ChucK offers programmers high-level control over building block components as well as fine-grained, sample-level control of audio data.
Something about bridging gap between MIR and performance
In this paper, we discuss recent additions to the ChucK music programming language that make it a suitable and unique tool for music information retrieval prototyping tasks and for applying music information retrieval algorithms to music in live performance settings. Fills a niche w/ low & highlevel control => prototyping. Also, as a language originally designed for the live performer, ChucK
the new analysis and learning capabilities of ChucK are exposed at both these levels, and tightly integrated with its synthesis and control capabilities to facilitate analysis and learning in real-time contexts. We discuss the language capabilities in some detail, outline examples of how an MIR researcher can use ChucK for prototyping and applying music analysis in real time, and present a toolkit written in ChucK to facilitate these tasks.
ChucK is a computer music programming language whose primary design goals included
tight control object oriented & built in objects -> high level good for prototyping in music synthesis (e.g., can both build filters from scratch and experiment with re-networking unit generators)
Catchy first paragraph:
(does the ICMC paper start like this? if so, we should change it)
ChucK began as a high-level programming language for music and sound synthesis, whose design goals included offering the musician user a wide breadth of programmable control-- from the structural level down to the sample level, using a clear and concise syntax, and employing a set of abstractions and built-in objects to facilitate rapid prototyping and live coding. We have recently expanded the language to provide support for audio analysis and machine learning, with two primary goals: first, to offer real-time and on-the-fly analysis and learning capabilities to computer music composers and performers; and second, to offer music information retrieval (MIR) researchers a new platform for rapid prototyping and for easily porting algorithms to a real-time performance context. Our previous papers \ref[icmc2007] and \ref[icmc2008] focus on former goal, and in this paper we deal with the latter.
We begin in section AAA by suggesting that music performance can and should be more significant among the focal points and application domains of MIR research, and motivate the need for additional shared tools between MIR and performance. We also touch on the state of prototyping toolkits in MIR, then describe the ChucK language as it is used for music creation, including prototyping and live coding systems. In section AAA, we describe in some detail how we have incorporated analysis and learning into the language, with attention to preserving the flexible and powerful control that makes ChucK suited for prototyping and experimenting with new algorithms, and to tightly and naturally integrating the new functionality with ChucK's synthesis framework. Sections AAA and AAA illustrate the new potential of ChucK as an MIR tool, introducing a working pipeline for MIR tasks and presenting three case studies of how we have been using ChucK to perform music analysis. In Section AAA, we discuss related work in MIR and computer music, and finally in Section AAA we discuss ongoing directions for ChucK as an MIR tool and announce the release of a repository of examples and supporting code for MIR researchers desiring to experiment with the language.
(ge:) something about the importance of rapid prototyping, and also how rapid prototyping enables new MIR-based performance practices.
____. (symbolic representations of "recorded"/non-performed; reviews; etc.).
Why is the focus so narrow? One reason for excluding performed music is the difficulty of translating tools for MIR into a real-time context.
Background & Motivation
MIR and performance (re)
- MIR & performance
- How peformers/composers have used MIR-like algorithms, & what tools they use
- Why performance should get more attention from MIR!!!
- What is MIR, really?
- What good are MIR tools doing for the world?
- How is an MIR researcher supposed to evaluate and improve his/her work?
- MIR prototyping
- Define prototyping for our purposes; why is it important? (+ side benefits)
- What tools exist
Music information retrieval and music performance
Research in music information retrieval has primarily focused on analyzing and understanding recorded music and other non-performative musical representations and metadata. However, many music information retrieval tasks, such as mood and style analysis, instrumentation and harmony identification, and transcription and score alignment are directly relevant to real-time interactive performance. A core focus of MIR is building computer systems that understand musical audio at a semantic level (\cite[downie?]), so that humans can search through, retrieve, visualize, and otherwise interact with musical data in a meaningful way. Making sense of audio data at this higher level is also essential to </i>machine musicianship</i>, wherein the performing computer-- like any musically trained human collaborator-- is charged with interacting with other performers in musically appropriate ways \cite[rowe].
Despite the shared need to bridge the "semantic gap" between low-level audio features and higher-level musical properties, there does not exist a shared tool set for accomplishing this task in computer music and MIR. MIR researchers employ an abundance of tools for performing signal processing, feature extraction, and machine learning and computer modeling to better understand audio data. Some of these tools are general-purpose signal processing or machine learning packages, such as Matlab or Weka AAAcite, and others such as CLAM, MARSYAS, M2K, and jAAA have been designed specifically for MIR AAAcite. Most of these languages and frameworks were not designed to be used for computer music performance, and most do not perform synthesis and do not suffice for real-time computation; as a result, none have widespread use among computer musicians. On the other hand, most computer music languages do not readily accommodate analysis of audio in the language; for example, spectral analysis and processing tasks must be coded as C++ externals in order to be used in SuperCollider or Max/MSP. Enterprising composers and performers have of course been writing such externals and standalone code to accomplish pitch tracking, score following, harmonic analysis, and other tasks for many years. However, the requirement that specialized code be pushed into externals is a barrier to rapid development and experimentation by programming novices. So, despite the many shared tasks of MIR researchers and computer musicians, the dominant programming paradigms in the two fields do not include natural avenues for code sharing or collaboration between these groups.
Computer musicians would undoubtedly benefit from lower barriers to adapting state-of-the-art MIR algorithms for their real-time performance needs. We additionally posit that MIR researchers can benefit from increased collaboration with real musicians. Many standard MIR research tasks mentioned above face challenges including copyright restrictions on obtaining data or releasing systems to the public, and difficulty or expense in obtaining ground truth. MIR systems for tasks such as transcription or mood or style classification in the context of a particular musical composition or performance paradigm can circumvent such problems: the relevant data is freely available (the composer or ensemble wants you to have it!), the ground truth may be well-defined, or it may be easy to construct (composers and performers have built-in incentive for the system to perform well). There is also the benefit that an MIR researcher can make an impact on people's experiences with music, which is sadly still hard for those researchers unaffiliated with the music industry.
In summary, one major goal of our work with ChucK is to provide a tool that meets the needs of computer musicians requiring MIR-like analysis algorithms, and of MIR researchers interested in producing tools that can be used in real-time, possibly in conjunction with sound synthesis and performance. We also hope that making work at the intersection of MIR and computer music easier, we will encourage more work in this area, and facilitate a richer cross-pollination of these two fields than has been happening.
Prototyping in MIR
Let us briefly digress and motivate our other major goal in this work, to provide a new rapid prototyping environment for MIR research. We summarize our basic requirements for an MIR prototyping environment as follows: the ability to design new signal processing algorithms, audio features, and learning algorithms; the ability to apply new and existing signal processing, feature extraction, and learning algorithms in new ways; and the ability to do these tasks quickly by taking advantage of high-level building blocks for common tasks, and by specifying the system either via a GUI or concise and clear code. There do exist several programming environments for MIR that accommodate many of the above requirements, including Matlab, M2K, Weka, and AAAmore?Marsyas?Clam?jAudio?, and their popularity suggests that they meet the needs of many MIR research tasks. We propose that ChucK meets all of these requirements and inhabits a unique place in the palette of tools at the MIR researcher's disposal, not only because of its easy accommodation of real-time and performance tasks, but also because of its particular approach to dealing with time. AAAchange?
Scratch: Research in music information retrieval has primarily focused on analyzing and understanding recorded music and other non-performative musical representations and metadata. However, many music information retrieval tasks, such as analyzing the mood, rhythm, style, harmony, or instrumentation of music are directly relevant to real-time interactive performance. Furthermore, many established approaches to these tasks are appropriate to a real-time context. However, there has been a paucity of tools and environments that accommodate MIR-style analysis and learning in addition to real-time synthesis and interactive performance. We have recently augmented the ChucK music programming language with analysis and learning capabilities in an effort to begin to bridge the gap between MIR and music performance, and to allow MIR researchers and computer music performers to leverage each other's expertise, tools, and experiences.
Cite Raphael in here somewhere.
- ChucK (ge)
- A short history
- Examples of how to use the language (esp. UGens & time control)
- OTF & timeliness make suited for prototyping, but not originally for analysis
- In summary, our goal was to modify ChucK & build tools in it to foster a tighter connection between MIR and performance, and to provide fast prototyping capabability in a new framework for MIRers
ChucK is an ongoing, open-source research experiment in designing a computer music programming language from the "ground-up". A main focus of the design was the precise programmability of time and concurrency, with a emphasis on encouraging concise, readable code. System throughput for real-time audio remains an important consideration, but first and foremost, the language was designed to provide maximal control and flexibility for the programmer. In particular, the various components of the design are as follows:
- flexibility: allows programmers to specify both high-level and low-level time-based operations, in a single unified and well-defined mechanism
- concurrency: programmers can craft and precisely synchronize parallel code modules that share both data and time
- readability: the language attempts to provide a strong correspondence between code structure, time, audio building blocks; chuck is fairly good at doing this, as the language is increasingly being used a teaching tool in computer music programs, including at Princeton, Stanford, Georgia Tech, CalArts.
- a do-it-yourself approach: by combining the ease of high-level computer music environments with the expressiveness of lower-level languages, ChucK is able to support high-level musical/sonic representations, as well as the prototyping and implementation of low-level, "white-box" signal-processing elements in the same language.
- on-the-fly: by leverage the ChucKian approach to programming, it is possible and often beneficial to write and experiment with code on-the-fly, allowing programs to be edited as they run.
More specifically, ChucK enables time itself to be computable (as a first-class citizen of the language), and allows a program to be self-aware in time and can control the rate of its own progress through time. Furthermore, many processes can share a central notion of time, making it possible to naturally reason about parallel code based on time. Next, the timing mechanism lends itself directly to a concurrent programming model, which is essential to expressively capture parallelism. Multiple processes (called shreds), each advancing time in its own manner, can be synchronized and serialized directly from the timing information. Thus arises our concept of a strongly-timed language, in which processes have precise control over their own timing and synchronization. Using this timing/concurrency model, on-the-fly programming can be carried out by exchanging time-aware code segments. To further facilitate this, the Audicle provides a graphical environment in which to write ChucK programs on-the-fly, and to visualize the programs in terms of code, audio synthesis, concurrency, and timing, all in real-time. Together, ChucK, on-the-fly programming, and the Audicle form a system and workbench for experimenting with sound synthesis for composition and performance, as more recently for creation of real-time analysis and MIR based programs.
Recent additions to ChucK to allow for prototyping and realtime MIR
- Say what they are, ...
- Available features!
(re) Reference ICMC paper and make note of cool areas this opens up, but which we don't have time to discuss here (side benefit: make it clear that this isn't a copy of the ICMC paper)
A natural consequence of ChucK's new analysis capabilities is that analysis results can be computed and used as features whose relationship to high-level musical concepts can be learned via labeled examples and standard classification algorithms.
-This is what's done in MIR, and what we hope to see done more often in computer music
-Weka is a popular framework for applied machine learning in MIR and other fields, so we have designed ChucK's learning infrastructure based on Weka to make it object oriented and extensible in the future, and to reduce the learning curve for MIR users familiar with Weka.
Describe FeatureCollector, Instance, Instances, Classifier.
Make note that this is all implemented in chuck, so users can not only change classification by adjusting parameters (e.g., number of rounds for boosting), but also by changing the classifiers themselves (e.g., AdaBoost.M1 can easily be adapted to AdaBoost.MH).
Mention that an example of learning appears in Section AAA.
One consequence of making standard classification algorithms available in a real-time context is that the training stage itself can be executed on-the-fly, during a rehearsal or even a performance. Given an appropriate control interface for providing class labels in real-time, a user can choose the concept to be learned on the spot. For example, in a live-coding improvisation with a violinist and a vocalist, the programmer may wish the computer to execute one function whenever the singer sings, and another whenever the violinist plays. After specifying the desired features and classifier in the code, the programmer can indicate
This is explored in depth in a previous paper \cite; we underscore that design goals of the learning component include making learning implementation easy enough to do in such a context, which is even more time-critical than a prototyping context. At the same time, the facts that feature extraction can be performed in ChucK with sample level control, and learning algorithms themselves are written in ChucK, ensure that specificity and flexibility are not sacrificed in the pursuit of supporting rapid development.
introduce some "working pipelines"
intent: give reader a feel for "if I have an idea and want to code it quickly, how would this work in chucK?" (and why would I not use matlab, m2k, etc.) ge? the working pipeline to my brain is busted
- intent: show that chuck indeed works for MIR tasks, that it is indeed a concise and nice syntax, flexible, powerful, etc.
- case study 1: bergstra artist/genre ID (shows above specifically) (re)
- case study 2: examples from Stanford class (shows some real uses by performers) (ge)
- case study 3: AYYYYYYYGY
For each, say what it does, why it is good, but why ours is different
- Matlab real-time stuff
- Supercollider & Max/MSP externals
conclusion & map out future work
Discussion /take-home points
- ChucK has expanded to a new frontier
- We will be continuing to work on it, driven by our own applications as well as requests from the ChucK user community
- Future work
- file I/O is coming soon
- online repository of templates and examples (like SMELT) that can get people started
- modeling (HMMs)