Brief Curriculum Vitae

Employment

2015-
Professor of Speech Science, University College London
1997-2015
Senior Lecturer in Speech Sciences, University College London
1990-1997
Lecturer in Speech Sciences, University College London
1989-2014
Technical Director, Marquesa Search Systems Ltd
1986-1990
SERC Advanced Research Fellow, UCL
1983-1986
Research Assistant, UCL
1979-1983
Software Engineer, Scicon Consultancy International
1976-1979
Postgraduate Research Student, UCL

Qualifications

1984 PhD University of London
Title: "An Interactive Speech Pattern Audiometer"
1976 BSc (Hons) University of Warwick
Physics

Prizes, Awards and Other Honours

2009
Provost's Teaching Award, UCL

Grants

1983-1996
Acoustic Modelling of Phonetic Segments, MoD University Research Agreement with RSRE, £100k.
1986-1991
Fundamental relations between acoustic and phonetic descriptions of speech, SERC Advanced Fellowship, £125k.
1990-1993
Computational modelling of aspects of speech perception, MRC Cognitive Science Initiative, £150k. (with Andrew Faulkner and Stuart Rosen)
1996-1997
Development of an automatic parsing system, EPSRC £176k. (with Sydney Greenbaum)
1997-2000
Integrated prosodic approach to speech synthesis, EPSRC £190k. (with Jill House, York University and Cambridge University)
1997-2000
Automatic enhancement of speech, EPSRC £170k. (with Valerie Hazan)
1998-2001
Enhanced Language Modelling, EPSRC £160k.
2003-2005
Marie Curie Fellowship, €110k. (with Santi Fernandez)
2007-2012
Centre for Law-Enforcement Audio Research (CLEAR), Home Office £1.1M (£400k to UCL). (with Mike Brookes and Patrick Naylor at Imperial College)
2010-2013
Performance-based Measures of Speech Quality, Research in Motion, £70k.
2012-2015
Avatar Therapy, Wellcome Trust, £1.3M.
2013-2014
Integrated voice analysis of satellite communications embedded in time and safety-critical environment (iVOICE), European Space Agency, €250k. (with Iya Whiteley)
2015-2019
Environment and Listener-Optimised Binaural Enhancement of Speech (E-LOBES), EPSRC, £1.2M (£500k to UCL). (with Stuart Rosen, and also Mike Brookes and Patrick Naylor at Imperial College)
2016-2017
Embedded Psychological Support Integrated for LONg duration missions - EPSILON - (phase 1 - VULCAN), European Space Agency, €250k.

Research Summary

My research has been at the intersection of Phonetics and Speech Technology: looking at how technological solutions to speech processing problems can improve our understanding of human speech processing, and how modern phonological theories might be applied in speech synthesis and recognition. Recently I have been exploiting speech technologies in novel clinical applications.

Speech recognition
I have been concerned with how phonological knowledge is exploited in speech recognition (Huckvale, 1990; Huckvale, 1998) and whether more modern non-linear phonological representations could be used as the basis for speech recognition (Huckvale, 1993; Huckvale, 1995). I have considered why particular technological solutions to speech recognition are successful (Holmes & Huckvale, 1994; Huckvale, 1996; Huckvale, 1998), and what this tells us about the human speech recognition task (Huckvale, 1997). I showed how the introduction of a morpho-phonological component improved a speech recognition system's vocabulary coverage (Huckvale & Fang, 2002).
Speech synthesis
I developed a speech synthesis-by-rule system within the ProSynth project (Hawkins et al, 1998; House et al, 1999; Huckvale, 1999; Ogden et al, 2000; Heid et al, 2000), being responsible for system design and implementation. This used novel representational structures for data and knowledge to support "all-prosodic" synthesis. I was vice-chair of the COST 258 project "Naturalness of Synthetic Speech", helping to direct the consortium's research into improving the expressiveness of synthetic speech (Keller et al, 2001; Huckvale, 2001). The outcome of this work led me to question whether speech synthesis technology was really concerned with human speech production at all, rather than with the separate problem of simulating human speech (Huckvale, 2002).
Accents
I contributed to the scientific study of accents through ACCDIST, a metric which computes an accent similarity measure even across two different speakers (Huckvale, 2004; Huckvale, 2007a; Huckvale, 2007b). The ACCDIST algorithm has been shown to give state-of-the-art performance on accent recognition (Hanani et al, 2013), and has also proved useful in adapting speech recognition systems to accented speech (Najafian, 2014). My colleague Paul Iverson has shown how ACCDIST can be used to predict the mutual intelligibility of second-language learners of English (Pinet et al, 2011). With Kayoko Yanagisawa, I helped create the concept of accent morphing, in which the accent of a speaker can be modified without affecting their speaker identity (Huckvale & Yanagisawa, 2007; Yanagisawa & Huckvale, 2008; Yanagisawa & Huckvale, 2010).
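
The core of ACCDIST can be sketched in a few lines: each speaker is characterised by the table of acoustic distances between the vowel segments of their own speech, and two speakers' accents are compared by correlating their tables. Because all distances are measured within a single speaker's own vowel space, speaker-specific characteristics largely cancel, which is what permits comparison across speakers. The sketch below is an illustrative reconstruction only; the feature choice (mean MFCCs per segment), the Euclidean metric and the segment inventory are assumptions, not the published implementation.

```python
# Illustrative reconstruction of the core ACCDIST idea: characterise each
# speaker by the distances between vowel segments of their own speech,
# then compare accents by correlating the resulting tables. Features,
# metric and segment inventory are assumptions for illustration.

import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import pearsonr

def distance_table(segments: dict[str, np.ndarray]) -> np.ndarray:
    """One speaker's inter-segment distance table.

    segments maps a vowel-in-word label to an acoustic vector for that
    segment (e.g. the mean MFCC vector over the segment's frames).
    """
    labels = sorted(segments)                   # same ordering for every speaker
    vectors = np.stack([segments[k] for k in labels])
    return pdist(vectors, metric="euclidean")   # condensed pairwise distances

def accdist_similarity(speaker_a: dict[str, np.ndarray],
                       speaker_b: dict[str, np.ndarray]) -> float:
    """Correlate the two distance tables: high when the speakers place their
    vowels in a similar *relative* pattern, i.e. share an accent, regardless
    of absolute differences between their voices."""
    assert sorted(speaker_a) == sorted(speaker_b), "need a common inventory"
    r, _ = pearsonr(distance_table(speaker_a), distance_table(speaker_b))
    return float(r)
```
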
Infant speech acquisition
With Ian Howard I showed how modern machine learning methods could be applied to the computational modelling of infant speech acquisition (Huckvale & Howard, 2005; Howard & Huckvale, 2005). My frustration with the difficulty of performing experiments in the social acquisition of language led to the development of KLAIR: a virtual infant for speech acquisition research (Huckvale, Howard & Fagel, 2009; Huckvale, 2011; Huckvale & Sharma, 2013). KLAIR is a 3D animated head with the ability to hear through a real-time auditory analysis system and speak through a real-time articulatory synthesizer. It is designed to be the computer’s "interface" with caregivers for machine learning of language through social interactions. See www.phon.ucl.ac.uk/project/klair.
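
At its heart, this architecture is a sensorimotor loop between a learning agent and a caregiver. The toy sketch below shows only the shape of that loop; the hear/speak callbacks and the learner are hypothetical stand-ins, not the KLAIR toolkit's actual API.

```python
# Toy sketch of the sensorimotor loop that KLAIR mediates: the agent hears
# the caregiver through auditory analysis and replies through articulatory
# synthesis. All names here (hear, speak, ToyLearner) are hypothetical
# stand-ins, not the KLAIR toolkit's actual API.

import numpy as np

class ToyLearner:
    """Stand-in agent: keeps crude traces of what it hears, babbles back."""
    def __init__(self, n_params: int = 8):
        self.n_params = n_params
        self.memory: list[np.ndarray] = []

    def perceive(self, frames: np.ndarray) -> None:
        # frames: (time, channels) auditory analysis of the caregiver's turn
        self.memory.append(frames.mean(axis=0))

    def respond(self) -> np.ndarray:
        # Articulatory parameters for the next turn: babble biased by memory.
        bias = np.mean(self.memory, axis=0)[: self.n_params] if self.memory else 0.0
        return bias + 0.1 * np.random.randn(self.n_params)

def interaction_loop(hear, speak, turns: int = 10) -> None:
    """hear() returns auditory frames; speak(params) drives the synthesizer.
    In KLAIR these would be served by the real-time toolkit."""
    agent = ToyLearner()
    for _ in range(turns):
        agent.perceive(hear())
        speak(agent.respond())

if __name__ == "__main__":
    # Dummy I/O in place of the real-time components:
    interaction_loop(hear=lambda: np.random.rand(100, 16),
                     speak=lambda p: print("articulating", np.round(p, 2)))
```
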
Speech signal enhancement
In collaboration with Imperial College London I helped set up the Centre for Law-Enforcement Audio Research (CLEAR) with funding from the UK Home Office. Here we studied methods for the enhancement of the poor-quality speech recordings found in law-enforcement work. While Imperial College focussed on the speech signal processing aspects, UCL studied the effects of noise and signal enhancement on the intelligibility of speech to human listeners. The CLEAR centre has led to many publications, which can be seen at www.clear-labs.com. UCL's contribution was to the methodology used for collecting and modelling intelligibility data (Hilkhuysen et al, 2012; Hilkhuysen et al, 2014), and also to the idea of performance-based measures of speech quality (Huckvale & Leak, 2009; Huckvale & Frasi, 2010; Huckvale & Hilkhuysen, 2012).
Avatar therapy
With Julian Leff and Geoff Williams I designed and developed an avatar system for use within a novel therapy for patients with schizophrenia who hear voices. The system was highly innovative in that both the avatar face and the avatar voice could be customised to suit each patient. As part of this I had to develop a flexible real-time voice conversion system (Huckvale & Williams, 2013). Avatar therapy was shown to be highly effective in a pilot study (Leff et al, 2013; Leff et al, 2014), with some patients losing their voice hallucinations even after suffering them for many years. A larger trial funded by the Wellcome Trust is currently under way, and I am now responsible for developing the technical elements into a portable product. See avatartherapy.co.uk.
Voice analysis
A number of voice-related projects have recently been undertaken in co-operation with the UCL Centre for Space Medicine. With funding from the European Space Agency and in collaboration with the Russian Gagarin Cosmonaut Training Centre and the Russian Institute of Biomedical Problems, we investigated how characteristics of the voice change with fatigue and cognitive load (Huckvale, 2014; Baykaner et al, 2015a, 2015b). A separate project looked at very long-term changes in the voice using recordings of cosmonauts on a 500-day simulated mission to Mars (Huckvale et al, submitted). Most recently we have looked into exploiting our findings about the effect of fatigue on voice in a commercial venture with Wombatt Fatigue Management Ltd.
Computational paralinguistics
The work on voice has spun out into attempts to use machine learning methods to extract paralinguistic information from speech recordings. So far we have looked at cognitive load (Huckvale, 2014), age (Huckvale & Webb, 2015), fatigue (Baykaner et al, 2015), native language (Huckvale, 2016), the effect of the common cold (Huckvale & Beke, 2017) and infant cry (Huckvale, 2018). This work also shows promise in the clinical domain, and I am interested in using voice to help with the diagnosis and rehabilitation of patients with Parkinson’s disease, aphasia and dementia.
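
These studies share a common recipe: summarise each recording as a fixed-length acoustic feature vector and train a conventional classifier on labelled examples. The sketch below illustrates that recipe only; the feature set (MFCC summary statistics via librosa) and the classifier (an SVM) are assumptions for illustration, not the specific systems used in the papers cited above.

```python
# Sketch of the generic computational-paralinguistics recipe: fixed-length
# acoustic summary per recording, then an ordinary classifier. The features
# and the SVM are illustrative assumptions, not the cited systems.

import numpy as np
import librosa
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def summary_features(wav_path: str) -> np.ndarray:
    """Collapse a recording into per-coefficient MFCC means and std devs."""
    signal, rate = librosa.load(wav_path, sr=16000)
    mfcc = librosa.feature.mfcc(y=signal, sr=rate, n_mfcc=13)
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

def train_classifier(wav_paths: list[str], labels: list[str]):
    """wav_paths and labels (e.g. "cold" vs "healthy") are hypothetical
    inputs; any per-recording paralinguistic label works the same way."""
    features = np.stack([summary_features(p) for p in wav_paths])
    model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
    model.fit(features, labels)
    return model
```
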

For a list of publications, see my publications page.