Bartek Plichta, Michigan State University
This paper will provide a demonstration of Akustyk. Akustyk (http://bartus.org/akustyk) is an open-source software package for speech analysis, synthesis, and data management. It is available for MS Windows, Linux, and Mac OS X. Akustyk is fully integrated with Praat (http://praat.org), a popular open-source program from the University of Amsterdam.
Since its introduction in November 2003, Akustyk has enjoyed growing popularity among sociolinguists, phoneticians, and field linguists. Thanks to user contributions, Akustyk has been extended to support the vowel systems of many languages and dialects, including several less commonly spoken languages such as Amharic, Eton, Hmong, Maori, and Wutung. Akustyk is committed to best practices in the areas of acoustic analysis, synthesis, and linguistic data management. Akustyk has been successfully integrated into linguistics curricula in several university research laboratories and has been reported to offer an attractive alternative to costly and unwieldy commercial software such as Kay Elemetrics Multi-speech.
Akustyk is designed to work in multi-user networked environments. Akustyk sessions are managed according to the principle of inheritance, with the user environment at the top of the data management hierarchy and token-level data at the bottom. At any point, users can hold any number of active sessions, with any number of speakers and any number of tokens within each session. For example, a field linguist studying an English/Maori bilingual community can set up one session per language and switch between them as needed. Within each of these sessions, the linguist can set up any number of speakers and switch among them with ease.
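The inheritance principle described above can be illustrated with a minimal Python sketch. It is not Akustyk's actual implementation: the Node class, the resolve method, and the setting names (max_formant_hz, language, vowel) are hypothetical and chosen only to show how a value set higher in the hierarchy is inherited by the levels below it unless one of them overrides it.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class Node:
    """One level of a hypothetical user > session > speaker > token hierarchy."""
    name: str
    settings: Dict[str, Any] = field(default_factory=dict)
    parent: Optional["Node"] = None

    def resolve(self, key: str) -> Any:
        """Return the nearest value for key, walking up toward the user level."""
        node: Optional[Node] = self
        while node is not None:
            if key in node.settings:
                return node.settings[key]
            node = node.parent
        raise KeyError(key)


# Hypothetical settings, used for illustration only.
user = Node("user environment", {"max_formant_hz": 5500})
session = Node("Maori session", {"language": "Maori"}, parent=user)
speaker = Node("speaker 01", {"max_formant_hz": 5000}, parent=session)
token = Node("token 0001", {"vowel": "a"}, parent=speaker)

inherited_language = token.resolve("language")        # set at the session level
overridden_ceiling = token.resolve("max_formant_hz")  # overridden by the speaker
default_ceiling = session.resolve("max_formant_hz")   # falls back to the user level
```

In this sketch a token stores only what is specific to it, and everything else is looked up from the speaker, the session, and finally the user environment.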
Akustyk manages all data and metadata in an SQL-compliant database (tab- and comma-delimited text files). Built-in data integrity mechanisms ensure that no data is lost and no entity relationship is broken. Akustyk's data tables can be easily imported into relational database software. A Microsoft Access template is available for download, while help with other database software is provided on a case-by-case basis.
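As a rough illustration of importing delimited tables into relational database software, the following Python sketch loads two tab-delimited tables into an in-memory SQLite database. The table layout, column names, and values are assumptions made for this example and do not reproduce Akustyk's actual files. SQLite stands in for any relational database, and its foreign key constraint stands in for the kind of integrity check that keeps a token from losing its speaker.

```python
import csv
import io
import sqlite3

# Hypothetical tab-delimited exports; the columns and values are illustrative only.
SPEAKERS_TSV = "speaker_id\tname\n1\tspeaker_a\n2\tspeaker_b\n"
TOKENS_TSV = (
    "token_id\tspeaker_id\tvowel\tf1_hz\tf2_hz\n"
    "1\t1\ti\t300\t2300\n"
    "2\t1\ta\t750\t1300\n"
    "3\t2\tu\t320\t900\n"
)


def read_tsv(text):
    """Parse tab-delimited text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforce the speaker/token relationship
conn.execute(
    "CREATE TABLE speakers (speaker_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
conn.execute(
    """CREATE TABLE tokens (
           token_id   INTEGER PRIMARY KEY,
           speaker_id INTEGER NOT NULL REFERENCES speakers(speaker_id),
           vowel      TEXT NOT NULL,
           f1_hz      REAL,
           f2_hz      REAL
       )"""
)
conn.executemany(
    "INSERT INTO speakers VALUES (:speaker_id, :name)", read_tsv(SPEAKERS_TSV)
)
conn.executemany(
    "INSERT INTO tokens VALUES (:token_id, :speaker_id, :vowel, :f1_hz, :f2_hz)",
    read_tsv(TOKENS_TSV),
)

# Mean F1 per speaker, computed with an ordinary SQL join.
mean_f1 = conn.execute(
    """SELECT s.name, AVG(t.f1_hz)
       FROM tokens AS t JOIN speakers AS s USING (speaker_id)
       GROUP BY s.name ORDER BY s.name"""
).fetchall()

# A token that points at a nonexistent speaker is rejected by the database.
orphan_rejected = False
try:
    conn.execute("INSERT INTO tokens VALUES (4, 99, 'e', 450, 2000)")
except sqlite3.IntegrityError:
    orphan_rejected = True
```

Once the tables are in a relational database, per-speaker or per-vowel summaries become ordinary SQL queries.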
Akustyk takes advantage of IPA phonetic symbols. Each token is encoded with an ASCII code, an Akustyk-specific code, a numerical code, and a Unicode code. All of the encoding is done behind the scenes, so users do not need to worry about it. Each session generates its own unique vowel inventory, and users simply pick the right symbol from a drop-down menu. As a result, all acoustic tokens are correctly encoded and can thus be used for data analysis, plotting, long-term preservation, and other purposes.
Data and metadata
Akustyk generates a large amount of data (over 100 parameters) at each analysis point. The data are stored as both raw and processed/computed values. In addition, Akustyk provides easy metadata encoding tools. Metadata is encoded at each analysis level: there is session-specific, speaker-specific, and token-level metadata. Most of the metadata categories can be customized.
XML and SMIL
Akustyk takes advantage of powerful open standards, such as XML and SMIL, for speech transcription. It can be used to create time-aligned audio/text corpora and produce web-ready multimedia presentations with QuickTime, RealMedia, and Windows Media. In addition, Akustyk provides a simple way to convert popular Praat TextGrid objects into XML-encoded text files.
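To give a sense of what a TextGrid-to-XML conversion involves, the following Python sketch turns a single interval tier in Praat's long text format into a small XML document. It is not Akustyk's converter: the function name textgrid_to_xml and the output element and attribute names (tier, interval, start, end) are invented for this example. The parser is deliberately minimal and assumes one interval tier whose labels contain no quotation marks.

```python
import re
import xml.etree.ElementTree as ET

# A small TextGrid in Praat's long text format, with one interval tier.
TEXTGRID = '''File type = "ooTextFile"
Object class = "TextGrid"

xmin = 0
xmax = 1.5
tiers? <exists>
size = 1
item []:
    item [1]:
        class = "IntervalTier"
        name = "vowels"
        xmin = 0
        xmax = 1.5
        intervals: size = 2
        intervals [1]:
            xmin = 0
            xmax = 0.7
            text = "i"
        intervals [2]:
            xmin = 0.7
            xmax = 1.5
            text = "a"
'''

# Matches one "intervals [n]:" block with its start time, end time, and label.
INTERVAL_RE = re.compile(
    r'intervals \[\d+\]:\s*xmin = ([\d.]+)\s*xmax = ([\d.]+)\s*text = "(.*?)"'
)
TIER_NAME_RE = re.compile(r'name = "(.*?)"')


def textgrid_to_xml(textgrid: str) -> str:
    """Convert one interval tier to XML (the output schema is hypothetical)."""
    name_match = TIER_NAME_RE.search(textgrid)
    root = ET.Element("tier", name=name_match.group(1) if name_match else "")
    for start, end, label in INTERVAL_RE.findall(textgrid):
        interval = ET.SubElement(root, "interval", start=start, end=end)
        interval.text = label
    return ET.tostring(root, encoding="unicode")


xml_text = textgrid_to_xml(TEXTGRID)
```

Because each interval keeps its start and end times as attributes, the resulting XML remains time-aligned and can be processed further with standard XML tools.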