This Summer School offers the opportunity to learn about different aspects of speech technology evaluation from a group of prestigious researchers.

Keynote speaker:

David van Leeuwen
Netherlands Forensic Institute and
Radboud University Nijmegen (Netherlands)

Keynote Speech:
Speaker Bio

David A. van Leeuwen (MS 1984, Delft University of Technology; PhD 1993, University of Leiden) joined TNO Human Factors in 1994 and has been a professor at Radboud University Nijmegen since 2008. He has worked in various areas of speech technology, with a special interest in the evaluation of systems. Evaluations he has organized include the EU FP5 SQALE project (1995), which evaluated large vocabulary speech recognition systems in four languages; the NFI-TNO Forensic Speaker Recognition Evaluation, with 11 international participants, in 2003; and the N-Best evaluation of Dutch speech recognition systems in 2008.

He has also participated successfully in various NIST Rich Transcription, Speaker and Language Recognition Evaluations. In 2008 he co-authored the EU FP7 Marie Curie ITN project Bayesian Biometrics for Forensics (BBfor2), which he is currently leading. In recent years he has focused on speaker diarization and automatic speaker and language recognition, with a special interest in highly accurate and efficient systems, forensic application scenarios and calibration. In 2012 he was re-appointed professor in the Forensic Application of Speech and Language Technology at Radboud University.

 


 

Lecturers:

Laura Docío-Fernández
Multimedia Technologies Group (GTM)
AtlantTIC Research Center
University of Vigo  

Session: Audio Segmentation Evaluation
Speaker Bio

Laura Docío-Fernández received the Telecommunication Engineering and Ph.D. degrees from the University of Vigo (Spain) in 1995 and 2001, respectively. She has participated in more than 15 research projects funded by national or international public institutions and companies. She is the author of more than 40 papers published in international conference proceedings. In 2002 she was a postdoctoral fellow at the International Computer Science Institute (ICSI) in Berkeley, USA. She is currently an associate professor in the Department of Signal Theory and Communications at the University of Vigo, Spain, and a member of the Multimedia Technologies Group (GTM). Her research interests lie in the broad field of speech and audio processing, especially the analysis, modeling and recognition of speech, speaker and audio signals in general. She has participated in several Albayzin Evaluation Campaigns organized by the Spanish National Network on Speech Technology: the Albayzin 2010 and 2012 Audio Segmentation Evaluations, the Albayzin Speaker Diarization Evaluation, and the Albayzin 2010 and 2012 Language Recognition Evaluations.

 


 

Javier González-Domínguez 
ATVS
Universidad Autónoma de Madrid

 

 

Session: Speaker Recognition Evaluation
Speaker Bio

Javier Gonzalez-Dominguez received the M.Sc. degree in Computer Science in 2005 and the Ph.D. degree "cum laude" in Electrical Engineering in 2011, both from Universidad Autonoma de Madrid, Spain, where he currently works as an assistant professor and is a member of the ATVS Biometric Recognition Group. He has completed research internships at leading biometric recognition groups worldwide, including SAIVT-QUT (2008, Brisbane, Australia), TNO (2009, Utrecht, The Netherlands) and Google Inc. Research (2010, New York, USA). His research interests focus mainly on machine learning applied to automatic speech, speaker and language recognition. Since 2006 he has actively participated in and co-led several ATVS systems submitted to the NIST automatic speaker and language recognition evaluations. He has received several awards, including the Microsoft Best Student Paper award at the SIG-IL 2009 conference and the Google Best Thesis award at IberSPEECH 2012. During the 2013-2014 academic year he will join the Google Speech Processing team (New York, USA) as a visiting professor.

 


Joaquin González-Rodríguez 
ATVS
Universidad Autónoma de Madrid 

Session: Speaker Recognition Evaluation 
Speaker Bio

Joaquin Gonzalez-Rodriguez received the M.S. degree in 1994 and the Ph.D. degree "cum laude" in 1999, both in electrical engineering, from Univ. Politecnica de Madrid (UPM), Spain. Dr. Gonzalez-Rodriguez is founder and co-director of the Biometric Recognition Group - ATVS. After 15 years of research and lecturing at UPM, he has been, since May 2011, a Professor in the Computer Science Department at Univ. Autonoma de Madrid, Spain, where he leads the Speech group of ATVS. He has led ATVS participations in the NIST Speaker (2001, 2002, 2004, 2005, 2006, 2008, 2010 and 2012) and Language (2005, 2007, 2009 and 2011) Recognition Evaluations, and in the 2003 NFI-TNO Forensic Speaker Recognition Evaluation. Since 2000 Dr. Gonzalez-Rodriguez has been an invited member of the FSAAWG (Forensic Speech and Audio Analysis Working Group) in ENFSI (European Network of Forensic Science Institutes). His research interests focus on speaker and language recognition, forensics and biometrics. He is a member of ISCA and the Signal Processing Society of IEEE, and is also a member of the Program Committee of the ISCA Odyssey conferences on Speaker and Language Recognition, having served as vice-chair of Odyssey 2004 in Toledo (Spain). During the 2010-2011 academic term, he was a Visiting Scholar at ICSI (International Computer Science Institute) at the University of California at Berkeley.

 


Juan M Montero 
Speech Technology Group
Universidad Politécnica de Madrid

Session: Speech Synthesis Evaluation
Speaker Bio

Juan M Montero is an Associate Professor in the Electronic Engineering Department at UPM and a researcher in the Speech Technology Group (http://gth.die.upm.es). Dr. Montero received an M.S. degree in Telecommunication Engineering and a Ph.D. degree "cum laude" from Universidad Politecnica de Madrid (UPM). His main research areas are parametric speech synthesis, affective computing, speaking style modeling and Project Based Learning. He is a member of ISCA and several IEEE societies. Dr. Montero was a visiting researcher at the International Computer Science Institute in Berkeley and at DFKI in Saarbrucken, and will visit CSTR in Edinburgh in 2013. He has participated in more than 40 research projects funded by national or international public institutions and companies. He is the author of more than 100 papers published in scientific journals or conference proceedings. He has participated in all the Albayzin evaluations on speech synthesis, and was the organiser of the 2012 evaluation campaign.

 


Alfonso Ortega
University of Zaragoza
Session: Audio Segmentation Evaluation
Speaker Bio

Alfonso Ortega was born in Teruel, Spain. He received the Telecommunication Engineering and Ph.D. degrees from the University of Zaragoza in 2000 and 2005, respectively. His Ph.D. thesis, advised by Dr. Eduardo Lleida, received the PhD Extraordinary Award and the Telefónica Chair Award for the best technological Ph.D. He has participated in more than 40 research projects funded by national or international public institutions and companies. He is the author of more than 50 papers published in scientific journals or international conference proceedings. In 2006 he was a visiting researcher at the Center for Robust Speech Systems at the University of Texas at Dallas (USA). He is presently an Associate Professor in the Department of Electronic Engineering and Communications, University of Zaragoza. His research interests span the areas of digital speech and audio processing, analysis and modeling of speech and speaker, and robust automatic speech recognition.

 


Paula López-Otero
Multimedia Technologies Group (GTM)
AtlantTIC Research Center
University of Vigo
Session: Audio Segmentation Evaluation
Speaker Bio

Paula López-Otero was born in A Coruña. She received the degree in Telecommunication Engineering and the Master of Advanced Studies in 2008 and 2010, respectively, both from the Universidade de Vigo. She is currently finishing her Ph.D. (advised by Carmen García Mateo and Laura Docío Fernández) at the Multimedia Technologies Group under a research grant. Her research interests focus on audio and speaker segmentation and clustering, a field in which she has published several research papers in international conference proceedings. She participated in the Albayzin Audio Segmentation Evaluations in 2010 and 2012, and in the Albayzin Speaker Diarization Evaluation in 2010. She is also interested in language recognition, and participated in the Albayzin Language Recognition Evaluations in 2010 and 2012.

 


Mikel Penagarikano
Departamento de Electricidad y Electrónica
Facultad de Ciencia y Tecnología ZTF/FCT
Universidad del País Vasco UPV/EHU
Barrio Sarriena s/n
48940 Leioa, Spain
Session: Language Recognition Evaluation
Speaker Bio

Mikel Penagarikano was born in Zumarraga, Spain, in 1973. He received the M.Sc. degree in Physics from the University of the Basque Country (UPV/EHU) in 1996. From 1997 to 2000 he was at the Department of Electricity and Electronics (UPV/EHU) under a research grant, and started the Software Technologies Working Group (GTTS, http://gtts.ehu.es) with Germán Bordel. Since 2000, he has been an Assistant Professor of Computer Science in the same department. His research interests focus on developing efficient software architectures for speech processing applications, such as ASR, spoken language recognition and speaker recognition. As part of GTTS, he has participated in all NIST Language and Speaker Recognition Evaluations from 2007 to date, and organized the Albayzin 2008, 2010 and 2012 Language Recognition Evaluations, with the sponsorship of the Spanish National Network on Speech Technology. He has published more than 70 research papers in national and international conferences and journals.

  


Luis Javier Rodriguez-Fuentes
Departamento de Electricidad y Electrónica
Facultad de Ciencia y Tecnología ZTF/FCT
Universidad del País Vasco UPV/EHU
Barrio Sarriena s/n
48940 Leioa, Spain

 

 

Session: Language Recognition Evaluation
Speaker Bio

Luis Javier Rodriguez-Fuentes was born in Bilbao in 1968. He received the M.Sc. and Ph.D. degrees in Physics from the University of the Basque Country (UPV/EHU) in 1991 and 2004, respectively. From 1993 to 1996 he was at the Department of Electricity and Electronics (UPV/EHU) under a research grant. Since 1996, he has been an Associate Professor of Computer Science in the same department. His past research activities include acoustic modeling, spontaneous speech modeling and speaker adaptation for ASR. In 2006, he joined the Software Technologies Working Group (GTTS, http://gtts.ehu.es) and started research activities with Germán Bordel and Mikel Penagarikano (Amparo Varona joined the group later, in 2009). His current research interests include spoken document retrieval, spoken language recognition and speaker recognition. As part of GTTS, he has participated in all NIST Language and Speaker Recognition Evaluations from 2007 to date, and organized the Albayzin 2008, 2010 and 2012 Language Recognition Evaluations, with the sponsorship of the Spanish National Network on Speech Technology. He has published more than 80 research papers in national and international conferences and journals.