Hermann Ackermann, Wolfram Ziegler
Audra Ames, Sara Wielandt, Dianne Cameron, Stan Kuczaj
David Ardell, Noelle Anderson, Bodo Winter
Rie Asano, Edward Ruoyang Shi
Mark Atkinson, Kenny Smith, Simon Kirby
Andreas Baumann, Christina Prömer, Kamil Kazmierski, Nikolaus Ritt
Christian Bentz
Aleksandrs Berdicevskis, Hanne Eckhoff
Richard A. Blythe, Alistair H. Jones, Jessica Renton
Cedric Boeckx, Constantina Theofanopoulou, Antonio Benítez-Burraco
Megan Broadway, Jamie Klaus, Billie Serafin, Heidi Lyn
Jon W. Carr, Kenny Smith, Hannah Cornish, Simon Kirby
Federica Cavicchio, Livnat Leemor, Simone Shamay-Tsoory, Wendy Sandler
Zanna Clay, Jahmaira Archbold, Klaus Zuberbuhler
Katie Collier, Andrew N. Radford, Balthasar Bickel, Marta B. Manser, Simon W. Townsend
Jennifer Culbertson, Simon Kirby, Marieke Schouwstra
Christine Cuskley, Vittorio Loreto
Christine Cuskley, Bernardo Monechi, Pietro Gravino, Vittorio Loreto
Dan Dediu, Scott Moisik
Sabrina Engesser, Amanda R. Ridley, Simon W. Townsend
Dankmar Enke, Roland Mühlenbernd, Igor Yanovich
Kerem Eryilmaz, Hannah Little, Bart de Boer
Nicolas Fay, Shane Rogers
Maryia Fedzechkina, Becky Chu, T. Florian Jaeger, John Trueswell
Olga Feher, Kenny Smith, Elizabeth Wonnacott, Nikolaus Ritt
Piera Filippi, Sebastian Ocklenburg, Daniel Liu Bowling, Larissa Heege, Albert Newen, Onur Güntürkün, Bart de Boer
Piera Filippi, Jenna V. Congdon, John Hoang, Daniel Liu Bowling, Stephan Reber, Andrius Pašukonis, Marisa Hoeschele, Sebastian Ocklenburg, Bart de Boer, Christopher B. Sturdy, Albert Newen, Onur Güntürkün
Molly Flaherty, Katelyn Stangl, Susan Goldin-Meadow
Marlen Fröhlich, Paul H Kuchenbuch, Gudrun Müller, Barbara Fruth, Takeshi Furuichi, Roman M Wittig, Simone Pika
Victor Gay, Daniel Hicks, Estefania Santacreu-Vasut
Andreea Geambasu, Michelle J. Spierings, Carel ten Cate, Clara C. Levelt
Matt Hall, Russell Richie, Marie Coppola
Stefan Hartmann, Peeter Tinits, Jonas Nölle, Thomas Hartmann, Michael Pleyer
Wolfram Hinzen, Joana Rosselló
Rick Janssen, Bodo Winter, Dan Dediu, Scott Moisik, Sean Roberts
Rick Janssen, Dan Dediu, Scott Moisik
Jasmeen Kanwal, Kenny Smith, Jennifer Culbertson, Simon Kirby
Deborah Kerr, Kenny Smith
Buddhamas Kriengwatana, Paola Escudero, Anne Kerkhoven, Carel ten Cate
Adriano Lameira, Jeremy Kendal, Marco Gamba
Molly Lewis, Michael C. Frank
Casey Lister, Tiarn Burtenshaw, Nicolas Fay, Bradley Walker, Jeneva Ohan
Hannah Little, Kerem Eryılmaz, Bart de Boer
Hannah Little, Kerem Eryılmaz, Bart de Boer
Giuseppe Longobardi, Armin Buch, Andrea Ceolin, Aaron Ecay, Cristina Guardiano, Monica Irimia, Dimitris Michelioudakis, Nina Radkevich, Gerhard Jaeger
Heidi Lyn, Stephanie Jett, Megan Broadway, Mystera Samuelson
Michael Mcloughlin, Luca Lamoni, Ellen Garland, Simon Ingram, Alexis Kirke, Michael Noad, Luke Rendell, Eduardo Miranda
Adrien Meguerditchian, Damien Marie, Konstantina Margiotoudi, Scott A. Love, Alice Bertello, Romain Lacoste, Muriel Roth, Bruno Nazarian, Jean-Luc Anton, Olivier Coulon
Jérôme Michaud
Ashley Micklos
Marie Montant, Johannes Ziegler, Benny Briesemeister, Tila Brink, Bruno Wicker, Aurélie Ponz, Mireille Bonnard, Arthur Jacobs, Mario Braun
Yasamin Motamedi, Marieke Schouwstra, Kenny Smith, Simon Kirby
Roland Mühlenbernd, Johannes Wahle
Tomoya Nakai, Kazuo Okanoya
Savithry Namboodiripad, Daniel Lenzen, Ryan Lepic, Tessa Verhoef
Alan Nielsen, Dieuwke Hupkes, Simon Kirby, Kenny Smith
Bill Noble, Raquel Fernández
Irene M. Pepperberg, Katia Zilber-Izhar, Scott Smith
Lynn Perry, Marcus Perlman, Gary Lupyan, Bodo Winter, Dominic Massaro
Ljiljana Progovac
Andrea Ravignani, Tania Delgado, Simon Kirby
Terry Regier, Alexandra Carstensen, Charles Kemp
Lilia Rissman, Laura Horton, Molly Flaherty, Marie Coppola, Annie Senghas, Diane Brentari, Susan Goldin-Meadow
Gareth Roberts, Mariya Fedzechkina
Carmen Saldana, Simon Kirby, Kenny Smith
Carlos Santana
William Schueller, Pierre-Yves Oudeyer
Catriona Silvey, Christos Christodoulopoulos
Katie Slocombe, Stuart Watson, Anne Schel, Claudia Wilke, Emma Wallace, Leveda Cheng, Victoria West, Simon Townsend
Ruth Sonnweber, Andrea Ravignani
Michelle Spierings, Carel ten Cate
Kevin Stadler, Elyse Jamieson, Kenny Smith, Simon Kirby
Monica Tamariz, Joleana Shurley
Monica Tamariz, Jon W. Carr
Bill Thompson, Heikki Rasilo
Oksana Tkachman, Carla L. Hudson Kam
Simon Townsend, Andrew Russell, Sabrina Engesser
Francesca Tria, Vittorio Loreto, Vito Servedio, Salikoko S. Mufwene
Anu Vastenius, Jordan Zlatev, Joost Van de Weijer
Tessa Verhoef, Carol Padden, Simon Kirby
Slawomir Wacewicz, Przemyslaw Zywiczynski, Arkadiusz Jasinski
Bodo Winter, David Ardell
Bodo Winter, Lynn Perry, Marcus Perlman, Gary Lupyan
Marieke Woensdregt, Kenny Smith, Chris Cummins, Simon Kirby
Eva Zehentner, Andreas Baumann, Nikolaus Ritt, Christina Prömer
Keywords: agent modelling, anatomical biasing, evolutionary computation, neural networks
Short description: Simple neural network agents are able to replicate speech sounds using a 3D vocal tract model. Investigating anatomical biases at the population level is now feasible.
Abstract:
Many factors have been proposed to explain why groups of people use different speech sounds in their languages. These range from cultural, cognitive, and environmental factors (e.g., Everett et al., 2015) to anatomical ones (e.g., vocal tract (VT) morphology). How could such anatomical properties have led to the similarities and differences in speech sound distributions between human languages?
It is known that hard palate profile variation can induce different articulatory strategies in speakers (e.g., Brunner et al., 2009). That is, different hard palate profiles might induce a kind of bias on speech sound production, easing some types of sounds while impeding others. In a population of speakers in which a proportion of individuals share certain anatomical properties, even subtle VT biases might become expressed at the population level (e.g., through bias amplification; Kirby et al., 2007). However, before we look into population-level effects, we should first look at within-individual anatomical factors. For that, we have developed a computer-simulated analogue of a human speaker: an agent. Our agent is designed to replicate speech sounds using a production module and a cognition module in a computationally tractable manner.
Previous agent models have often used more abstract (e.g., symbolic) signals (e.g., Kirby et al., 2007). We have equipped our agent with a three-dimensional model of the VT (the production module, based on Birkholz, 2005), to which we made numerous adjustments. Specifically, we used a 4th-order Bézier curve that is able to capture hard palate variation in the mid-sagittal plane (XXX, 2015). Using an evolutionary algorithm, we were able to fit the model to human hard palate MRI tracings, yielding high-accuracy fits with as few as two parameters. Finally, we show that the samples are well dispersed across the parameter space, demonstrating that the model cannot generate unrealistic profiles. We can thus use this procedure to import palate measurements into our agent’s production module to investigate the effects on acoustics. We can also exaggerate existing biases or introduce novel ones.
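As an illustration of the curve family mentioned above, the following minimal sketch evaluates a 4th-order (five control point) Bézier curve using Bernstein polynomials. The control points are hypothetical values chosen for illustration only; the abstract does not state how the two fitted parameters map onto control points, so no such mapping is attempted here.

```python
from math import comb

def bezier_point(control_points, t):
    """Evaluate a Bezier curve at t in [0, 1] via Bernstein polynomials."""
    n = len(control_points) - 1  # curve order (4 for five control points)
    x = y = 0.0
    for i, (px, py) in enumerate(control_points):
        basis = comb(n, i) * (t ** i) * ((1 - t) ** (n - i))
        x += basis * px
        y += basis * py
    return x, y

def palate_profile(control_points, samples=50):
    """Sample the curve at evenly spaced parameter values."""
    return [bezier_point(control_points, k / (samples - 1)) for k in range(samples)]

# HYPOTHETICAL control points (front to back, arbitrary units), not from the paper.
control = [(0.0, 0.0), (0.25, 0.6), (0.5, 1.0), (0.75, 0.9), (1.0, 0.3)]
profile = palate_profile(control)
```

The curve always starts at the first control point and ends at the last one, so those two points pin down the extent of the profile while the inner points shape its doming.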
Our agent is able to control the VT model using the cognition module.
Previous research has focused on detailed neurocomputation (e.g., Kröger et al., 2014) that highlights, for example, neurobiological principles or speech recognition performance. However, the brain is not the focus of our current study. Furthermore, present-day computing throughput likely does not allow for large-scale deployment of these architectures, as required by the population model we are developing. Thus, the question of whether a very simple cognition module is able to replicate sounds in a computationally tractable manner, and even generalize to novel stimuli, is one worthy of attention in its own right.
Our agent’s cognition module is based on running an evolutionary algorithm on a large population of feed-forward neural networks (NNs). As such, (anatomical) bias strength can be thought of as an attractor basin area within the parameter space the agent has to explore. The NN we used consists of a three-layered, fully connected, directed graph. The input layer (three neurons) receives the formant frequencies of a target sound. The output layer (12 neurons) projects to the articulators in the production module. A hidden layer (seven neurons) enables the network to deal with nonlinear dependencies. The Euclidean distance (over the first three formants) between target and replication is used as the fitness measure. Results show that sound replication is indeed possible, with the Euclidean distance quickly approaching a close-to-zero asymptote.
Statistical analysis should reveal whether the agent can also: a) Generalize: Can it replicate sounds it was not exposed to during learning? b) Replicate consistently: Do different, isolated agents always converge on the same sounds? c) Deal with consolidation: Can it still learn new sounds after an extended learning phase (‘infancy’) has been terminated? Finally, a comparison with more complex models will be used to demonstrate robustness.
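The setup described above can be sketched roughly as follows. This is not the authors' implementation: the layer sizes (3, 7, 12) and the Euclidean fitness come from the text, but the activation functions, bias terms, truncation selection with Gaussian mutation, normalised formant targets, and especially `toy_vocal_tract` (a trivial stand-in for the 3D VT model) are all assumptions made so that the example runs on its own.

```python
import math
import random

N_IN, N_HID, N_OUT = 3, 7, 12  # layer sizes given in the text
N_WEIGHTS = N_HID * (N_IN + 1) + N_OUT * (N_HID + 1)  # bias terms are an assumption

def forward(weights, formants):
    """Feed-forward pass: 3 inputs -> 7 tanh units -> 12 outputs in (0, 1)."""
    idx, hidden, out = 0, [], []
    for _ in range(N_HID):
        s = weights[idx + N_IN] + sum(weights[idx + j] * formants[j] for j in range(N_IN))
        hidden.append(math.tanh(s))
        idx += N_IN + 1
    for _ in range(N_OUT):
        s = weights[idx + N_HID] + sum(weights[idx + j] * hidden[j] for j in range(N_HID))
        out.append(0.5 * (1.0 + math.tanh(0.5 * s)))  # logistic sigmoid, overflow-safe
        idx += N_HID + 1
    return out

def toy_vocal_tract(articulators):
    """HYPOTHETICAL production module: averages articulator groups into 3 pseudo-formants."""
    return [sum(articulators[k:k + 4]) / 4.0 for k in (0, 4, 8)]

def distance(weights, target):
    """Euclidean distance between target and replicated (pseudo-)formants."""
    return math.dist(toy_vocal_tract(forward(weights, target)), target)

def evolve(targets, pop_size=40, generations=60, sigma=0.1, seed=0):
    """Truncation selection plus Gaussian mutation (an assumed, simple scheme)."""
    rng = random.Random(seed)
    error = lambda w: sum(distance(w, t) for t in targets) / len(targets)
    population = [[rng.gauss(0.0, 0.5) for _ in range(N_WEIGHTS)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=error)
        history.append(error(population[0]))
        parents = population[: pop_size // 4]  # elite survive unchanged
        children = [[w + rng.gauss(0.0, sigma) for w in rng.choice(parents)]
                    for _ in range(pop_size - len(parents))]
        population = parents + children
    population.sort(key=error)
    history.append(error(population[0]))
    return population[0], history

# HYPOTHETICAL targets: first three formants, normalised to [0, 1].
targets = [(0.2, 0.5, 0.8), (0.7, 0.3, 0.6), (0.4, 0.9, 0.5)]
best, history = evolve(targets)
```

Because the best networks are carried over unchanged each generation, the best error in `history` can never increase; how quickly it falls in the real system depends on the actual VT model, which this toy mapping does not capture.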
Citation:
Janssen R., Dediu D. and Moisik S. (2016). Simple Agents Are Able To Replicate Speech Sounds Using 3d Vocal Tract Model. In S.G. Roberts, C. Cuskley, L. McCrohon, L. Barceló-Coblijn, O. Fehér & T. Verhoef (eds.) The Evolution of Language: Proceedings of the 11th International Conference (EVOLANG11). Available online: http://evolang.org/neworleans/papers/97.html