News and Updates

  • 10 January 2023

    HEAR-VR: a video database for intelligibility testing in Virtual Reality?

    Announcing our new audiovisual corpus for building listening tests in VR.

    VR Video Demo

    The corpus consists of 200 video and audio files from 10 speakers, suitable for compositing into new 360° videos for VR.

  • 1 January 2021

    A new year and a new daily puzzle!

    Introducing my third daily puzzle: The Knight's Tour. Find the path of the knight as he traverses the chessboard, visiting each cell exactly once. The Knight's Tour puzzle joins my Loopy Puzzle and Starmino Puzzle, each offering a new free puzzle every day.

    These puzzle pages are now Progressive Web Applications, so they can be installed on your phone or tablet without needing to go through the Play Store.

  • 25 August 2020

    VTDEMO vocal tract synthesis and other educational tools

    One popular Windows program of mine is VTDEMO, an implementation of Shinji Maeda's articulatory synthesizer. The recent push to move all our Phonetics lab teaching on-line for the coming term has encouraged me to re-implement this for the web. With the help of emscripten (see below) I've been able to take Maeda's vocal tract synthesis code and run it in the browser. Together with a bit of web audio trickery, the result is a vocal tract whose shape you can change while seeing and hearing the consequences for the sound produced. You can find the web version of VTDEMO here.
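
    For the curious, here is a rough sketch of the kind of glue code involved, assuming the synthesis routine has been compiled to WebAssembly with emscripten and its heap and malloc/free exported to Javascript; the function name maeda_synthesize and its arguments are invented for illustration and are not VTDEMO's actual interface. The samples it produces are played through the web audio API:

      // Minimal sketch: play one block of synthesized samples via web audio.
      // Assumes emscripten produced a Module object with an exported C function
      // _maeda_synthesize(float* out, int nsamples) - an invented name, for illustration only.
      const ctx = new AudioContext();

      function playBlock(nsamples) {
        // allocate space in the WASM heap for the output samples
        const ptr = Module._malloc(nsamples * 4);
        Module._maeda_synthesize(ptr, nsamples);

        // copy the samples out of the WASM heap into an AudioBuffer
        const samples = new Float32Array(Module.HEAPF32.buffer, ptr, nsamples);
        const buffer = ctx.createBuffer(1, nsamples, ctx.sampleRate);
        buffer.copyToChannel(new Float32Array(samples), 0);
        Module._free(ptr);

        // play the buffer
        const src = ctx.createBufferSource();
        src.buffer = buffer;
        src.connect(ctx.destination);
        src.start();
      }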

    I have also created an up-to-date list of all my browser-based tools for Phonetics and Speech Science education.

  • 29 June 2020

    More WASP updates

    At UCL we are preparing to deliver lab classes on-line for the coming term while social distancing because of coronavirus is still in place. To move things like phonetic transcription on-line, I've updated the annotation facilities in WASP 1.8 to allow saving of annotations in .WAV files and the input of phonetic symbols.
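
    As background, WAV files are RIFF containers, so extra information such as annotations can sit alongside the audio in additional chunks. The sketch below just walks the chunks of a WAV file held in an ArrayBuffer; it illustrates the container format only and makes no claim about the chunk layout WASP actually uses:

      // List the RIFF chunks in a WAV file held in an ArrayBuffer.
      // Illustrative only - the id and layout of WASP's annotation chunk are not shown here.
      function listWavChunks(arrayBuffer) {
        const view = new DataView(arrayBuffer);
        const tag = p => String.fromCharCode(view.getUint8(p), view.getUint8(p + 1),
                                             view.getUint8(p + 2), view.getUint8(p + 3));
        if (tag(0) !== 'RIFF' || tag(8) !== 'WAVE') throw new Error('not a WAV file');

        const chunks = [];
        let pos = 12;                                 // first chunk starts after the RIFF header
        while (pos + 8 <= view.byteLength) {
          const id = tag(pos);                        // four-character chunk id, e.g. 'fmt ' or 'data'
          const size = view.getUint32(pos + 4, true); // chunk size, little-endian
          chunks.push({ id, size, offset: pos + 8 });
          pos += 8 + size + (size % 2);               // chunks are word-aligned
        }
        return chunks;
      }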

    At the same time I have given the web version of WASP an overhaul. This is what it looks like now:

    The web version has very similar functionality to the Windows version, and uses the same pitch analysis code: RAPT and REAPER, written by David Talkin. This has been made possible by the wonderful emscripten system for converting C++ code into WebAssembly, which can be called from Javascript.
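
    For anyone tempted to try the same route, the usual pattern is to export the C/C++ entry points and wrap them for Javascript with emscripten's cwrap. The function name estimate_f0 and its signature below are invented for illustration; they are not the actual RAPT/REAPER interface used in WASP:

      // Sketch of calling a WASM-compiled pitch tracker from Javascript via emscripten's cwrap.
      // Assumes the build exported a C function
      //   int estimate_f0(const float* samples, int nsamples, float srate, float* f0out);
      // which is an invented signature for illustration only.
      const estimateF0 = Module.cwrap('estimate_f0', 'number',
                                      ['number', 'number', 'number', 'number']);

      function trackPitch(samples, sampleRate) {
        const inPtr = Module._malloc(samples.length * 4);
        const outPtr = Module._malloc(samples.length * 4);
        Module.HEAPF32.set(samples, inPtr / 4);            // copy audio into the WASM heap

        const nframes = estimateF0(inPtr, samples.length, sampleRate, outPtr);
        const f0 = new Float32Array(Module.HEAPF32.buffer, outPtr, nframes).slice();

        Module._free(inPtr);
        Module._free(outPtr);
        return f0;                                         // one F0 estimate per analysis frame
      }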

    On a separate note, a Chinese translation of WASP for Windows has been created by the Education University of Hong Kong for acoustic analysis in Chinese clinical settings.

  • 29 September 2019

    WASP 1.60 - updated speech recording and display program

    WASP (Waveforms, Annotations, Spectrograms and Pitch) is probably the most popular speech program that I've written. It's a Windows program for recording, analysing and displaying speech signals. It is lightweight, with few analysis options, which makes it ideal for quick demonstrations or for making quick figures.

    The new version 1.60, released today, incorporates a pitch period marker method that identifies the locations and durations of glottal cycles. Version 1.60 also contains a new statistics dialogue that displays information about fundamental frequency and voice quality.
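
    As a rough illustration of the kind of statistics that can be derived from glottal cycle durations (these are common textbook definitions, not necessarily the ones WASP reports), mean fundamental frequency is the reciprocal of the mean period and local jitter compares successive periods:

      // Toy statistics from a list of glottal cycle (pitch period) durations in seconds.
      // The definitions here are common textbook ones, not necessarily those used by WASP.
      function voiceStats(periods) {
        const mean = a => a.reduce((s, x) => s + x, 0) / a.length;
        const meanPeriod = mean(periods);

        // local jitter: mean absolute difference between consecutive periods,
        // expressed as a percentage of the mean period
        let diffSum = 0;
        for (let i = 1; i < periods.length; i++) diffSum += Math.abs(periods[i] - periods[i - 1]);
        const jitterLocal = 100 * (diffSum / (periods.length - 1)) / meanPeriod;

        return { meanF0: 1 / meanPeriod, jitterPercent: jitterLocal };
      }

      // e.g. periods around 8 ms give a mean F0 near 125 Hz
      console.log(voiceStats([0.0080, 0.0082, 0.0079, 0.0081]));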

    The program is still free to download and use; I hope that people find it useful. WASP will also run on MacOS using a Windows emulation layer.

  • 22 April 2019

    Notakto - three-board misère Noughts and Crosses

    I've written an implementation of a logic game in the spirit of Tic-Tac-Toe and Nim called Notakto. You can play against another person or against the computer. The computer is pretty good though - can you find out how to beat it?

  • 21 February 2019

    So what is this thing called 'Deep Learning' anyway?

    My talk at the Speech Science Forum today was an introduction to Deep Learning:

    Recent years have seen huge improvements in the performance of artificially intelligent systems for recognising speech, translating between languages, captioning images, driving cars, playing video games, and so on. Underlying these improvements has been an innovation in machine learning from data called "deep learning". This new approach to machine learning puts emphasis on dealing with naturalistic data rather than relying on carefully curated data sets, makes less use of human knowledge about the best way to perform feature extraction or inference, and creates systems that work "end-to-end" from raw input to usable output. The approach has become known as deep learning because it exploits a hierarchical structure of representations over many levels of processing at increasing degrees of abstraction. Deep learning has benefitted from advances in the availability of machine-readable data, the increasing power of computers, improved methods for functional optimisation, and the popularity of competitions with standard training data, test data and evaluation criteria. In this introductory talk, I'll put deep learning in context, show some applications of deep learning, and give some pointers as to how you can get started using deep learning methods for your own applications.

    The dedicated can even look at the slides.

  • 5 January 2019

    Web applications for teaching Speech Science

    I have now converted a couple of my teaching applications to run in the browser:

    • RoboVoice: a tool for creating new utterances by cutting and pasting together excerpts from recordings. Used to demonstrate how speech sounds change in form depending upon their context. You can't just cut out a syllable or word from one place and paste it into another; you have to think about coarticulation, stress, pitch and timing.
    • Pitch Laboratory: a tool for experimenting with the pitch of sounds. You can record a single note, a speech sound or a sentence and make measurements of pitch. Use it to look at the fundamental frequency of common sounds or a note from a musical instrument. You can also change the pitch of a recording by playing it back slower or faster, and you can generate periodic sounds by combining pure tones to create beats (see the sketch below).
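
    As a taster of the beats idea mentioned in the Pitch Laboratory entry above, here is a tiny web audio sketch, unrelated to the Pitch Laboratory code itself, that mixes two pure tones a few hertz apart so that the combined sound waxes and wanes at the difference frequency:

      // Two sine oscillators at 220 Hz and 224 Hz produce beats at 4 Hz.
      const ctx = new AudioContext();
      const gain = ctx.createGain();
      gain.gain.value = 0.2;                  // keep the level comfortable
      gain.connect(ctx.destination);

      [220, 224].forEach(freq => {
        const osc = ctx.createOscillator();
        osc.type = 'sine';
        osc.frequency.value = freq;
        osc.connect(gain);
        osc.start();
        osc.stop(ctx.currentTime + 3);        // play for three seconds
      });
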
  • 1 April 2018

    The Find-A-Word Android App

    Not an April Fools' joke: my first Android app is now available for download from the Play Store.

    Find-A-Word finds English words that match a given letter pattern or anagram. It contains a huge dictionary of over 500,000 word forms packed into a tiny 3MB download. It also searches super-fast thanks to its use of a finite-state automaton.
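
    The finite-state idea is roughly this: the whole dictionary is compiled into a network of states, so a letter pattern is matched by walking the network rather than scanning a word list. The sketch below uses a plain trie, a less compact cousin of the minimised automaton, purely to show the principle; it is not the app's actual data structure:

      // Simple trie-based pattern matcher: '?' matches any letter.
      // A trie is used here for clarity; a minimised automaton (DAWG) is far more compact.
      function addWord(root, word) {
        let node = root;
        for (const ch of word) node = node[ch] ?? (node[ch] = {});
        node.end = true;                              // mark a complete word
      }

      function match(node, pattern, prefix = '', out = []) {
        if (pattern.length === 0) {
          if (node.end) out.push(prefix);
          return out;
        }
        const ch = pattern[0];
        const next = ch === '?' ? Object.keys(node).filter(k => k !== 'end') : [ch];
        for (const k of next) {
          if (node[k]) match(node[k], pattern.slice(1), prefix + k, out);
        }
        return out;
      }

      // usage
      const root = {};
      ['cat', 'cot', 'cut', 'car'].forEach(w => addWord(root, w));
      console.log(match(root, 'c?t'));                // -> ['cat', 'cot', 'cut']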

    Find-A-Word also exists as a web site of mine, findaword.net, so I used the app as a way to learn how to build apps. Over the years I've tried a number of toolkits designed to make app development "easier", but mostly they are a pain to use. A couple of years ago I experimented with Cordova, a toolkit that allows you to write apps in HTML, CSS and Javascript for multiple mobile platforms. It comes with a library (Phonegap) that allows you to access the phone hardware: GPS, compass, accelerometer, etc. However, because it is cross-platform it is also least-common-denominator: you get the most basic access supported by all devices. I found it very unhelpful when trying to build an app that recorded high-quality audio.

    More recently I looked at a toolkit called Crosswalk, a cut-down version of Cordova that bundles a complete web browser with your own code. This means that you can rely on the browser environment in which your app is running without having to tailor your code to Android or iOS. This worked OK, but (i) the downloads were huge (>20MB) and (ii) the project stopped being developed in 2017.

    For the Find-A-Word app I've switched over to building on top of Android WebView, the web browser component found on all Android devices. With help from the many tutorial guides on the web (thank you Stack Overflow!) I was able to take my web site code and package it using Android Studio. I've learned a lot by attempting this. It's not so bad when you dive in.

  • 30 December 2017

    How alert are you?

    Needing a simple psychophysical test to generate reference values of alertness, I looked around for existing tests. The most commonly referenced was the Psychomotor Vigilance Task, which has been used in a number of studies of sleepiness and seems to correlate well with subjective scores of sleepiness as well as with EEG measurements. However it is basically just a reaction time task and it is not clear how best to deal with errors of omission or false alarms. It also seems to me to be the kind of thing that wakes you up - exactly the opposite of what you need when testing alertness!

    Inspired by an earlier vigilance task based on watching a clock which occasionally skips ahead (the Mackworth Clock), I've instead created an alertness test of my own, based on tracking a moving object on a touch screen. In this test, the object appears to follow a sinusoidal path, but in fact speeds up and slows down at random times so that the subject needs to be vigilant to maintain tracking accuracy. The test is easy to perform and deliberately monotonous, which I hope will allow for a good measure of alertness. The test provides simple metrics in the form of mean lag, overall accuracy, accuracy in the non-standard cycles, and maximum tracking error. Early results show encouraging correlation of tracking accuracy with self-reported sleepiness.
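
    For anyone curious about the mechanics, the sketch below (much simplified, and not the code of the real test) shows the basic idea: the target's phase advances at a rate that occasionally changes at random, and the tracking error is simply the distance between the touch position and the target:

      // Simplified model of the tracking target: nominally sinusoidal motion whose
      // speed is perturbed at random moments, plus a running record of tracking error.
      // Not the code of the real test - just an illustration of the idea.
      let phase = 0;
      let rate = 2 * Math.PI * 0.25;                 // nominal 0.25 Hz oscillation
      const errors = [];

      function step(dt, touchX, screenWidth) {
        if (Math.random() < 0.01) {                  // occasionally speed up or slow down
          rate = 2 * Math.PI * (0.15 + 0.2 * Math.random());
        }
        phase += rate * dt;
        const targetX = screenWidth * (0.5 + 0.4 * Math.sin(phase));
        errors.push(Math.abs(touchX - targetX));     // tracking error for this frame
        return targetX;
      }

      function meanError() {
        return errors.reduce((s, e) => s + e, 0) / errors.length;
      }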

    You can try out the alertness test here.

  • 24 November 2017

    Avatar Therapy in the News

    Avatar Therapy is a new approach to the treatment of certain mental health conditions, particularly auditory hallucinations (hearing voices). The therapy was invented by Prof. Julian Leff to improve the lives of schizophrenic patients suffering from persecutory voice hallucinations despite the best available drug treatment.

    We have recently completed a major clinical trial of Avatar Therapy at the Institute of Psychiatry funded by the Wellcome Trust. This trial involved 150 patients with persecutory voices divided between two therapy approaches. The results of this trial have recently been published in The Lancet Psychiatry. A short video about the Avatar Therapy trial can be seen in this BBC News report.

    To find out more about the current state of Avatar Therapy, go to www.avatartherapy.co.uk.

  • 13 November 2016

    Ripple tank simulation

    A new animated ripple tank demonstration in Javascript. It demonstrates the propagation of wave motion in two dimensions, and includes slit obstacles and target shapes.
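
    The core of a ripple tank simulation is typically a finite-difference update of the two-dimensional wave equation on a grid. The sketch below is a generic version of that update, not necessarily the scheme used in this demonstration:

      // One time step of a simple finite-difference scheme for the 2D wave equation.
      // u = current field, uPrev = previous field; c is the Courant number (< ~0.7 for stability).
      // Generic illustration, not necessarily the scheme used in the ripple tank page.
      function waveStep(u, uPrev, width, height, c) {
        const uNext = new Float32Array(u.length);
        const c2 = c * c;
        for (let y = 1; y < height - 1; y++) {
          for (let x = 1; x < width - 1; x++) {
            const i = y * width + x;
            const laplacian = u[i - 1] + u[i + 1] + u[i - width] + u[i + width] - 4 * u[i];
            uNext[i] = 2 * u[i] - uPrev[i] + c2 * laplacian;
          }
        }
        return uNext;
      }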

  • 25 September 2016

    Starmino - a new daily logic puzzle

    Following on from Loopy Puzzle is another logic puzzle with a daily challenge. Starmino puzzles are variants of Fillomino puzzles with a new twist and a simple, clean user interface. The challenge is to cover the puzzle board with polyomino tiles labelled with digits.

  • 18 September 2016

    Find-a-Word - find English words for crosswords and other puzzles.

    Fed up with the existing on-line tools for finding words for crosswords, I've built my own system at findaword.net. You select the length of the word you are looking for, enter whatever letters you know already, and it reports the words that match from a 500,000 word dictionary. You can specify letters in given positions, in any position, or some combination. It's also super-fast - at least 10× faster than other on-line systems.

  • 18 August 2016

    Embedding web tools into presentations

    For a few years now I've been writing my lecture handouts as web pages, but have somehow remained stuck with PowerPoint for lecture slides. Recently I've been exploring how lecture and talk presentations can be created in HTML using reveal.js. A great benefit is the possibility of integrating my web-based speech analysis tools directly into the slides using iframes. Here is a demonstration.

  • 26 June 2016

    Bayesian Statistics in Javascript

    I've become excited about Bayesian methods for statistical analysis of data after reading John Kruschke's book on the subject. To encourage and enthuse others, I thought I might build some web demonstrations of how Bayesian sampling can be used to estimate the range of credible model parameters that fit a data set. I then came across bayes.js, a Javascript library that allows you to perform Bayesian sampling.

    The Bayesian sampling demonstrations use the bayes.js library to show how a t-distribution can be fitted to a sample of data, to show how to compare the means of two samples, and to show how linear regression can be performed. The demonstrations feature live animations of the sampling process, and you can even cut and paste your own data.
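
    To give a flavour of what Bayesian sampling means without leaning on the bayes.js API, here is a bare-bones Metropolis sampler for the mean of some data with a known standard deviation and a flat prior; the posterior is explored by proposing random moves and accepting them probabilistically:

      // Bare-bones Metropolis sampler for the mean of normally-distributed data
      // (known sd, flat prior on the mean). Illustration only - bayes.js is more general.
      function logLikelihood(mu, data, sd) {
        return data.reduce((s, x) => s - 0.5 * ((x - mu) / sd) ** 2, 0);
      }

      function sampleMean(data, sd, steps) {
        const samples = [];
        let mu = 0, logp = logLikelihood(mu, data, sd);
        for (let i = 0; i < steps; i++) {
          const muNew = mu + (Math.random() - 0.5);            // random-walk proposal
          const logpNew = logLikelihood(muNew, data, sd);
          if (Math.log(Math.random()) < logpNew - logp) {      // Metropolis accept/reject
            mu = muNew; logp = logpNew;
          }
          samples.push(mu);
        }
        return samples.slice(steps / 2);                       // discard burn-in
      }

      // the spread of the returned samples approximates the credible range of the mean
      const posterior = sampleMean([4.8, 5.1, 5.3, 4.9, 5.0], 0.5, 10000);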

  • 1 May 2016

    Loopy Puzzles - a daily challenge

    Loopy Puzzles are simple pen-and-paper puzzles in which you must connect lines to create a loop within the given grid of cells. The LoopyPuzzle.com web site now has a different puzzle every day that you can solve using your phone.

  • 1 March 2016

    Audio3D - a virtual audio simulation system

    Audio3D is a free Windows program for simulating 3D audio. Audio3D takes a specification of a room, the position of the listener, and some sound sources, and generates a binaural audio signal that simulates what the listener would hear in the room. You can then experience the sound by listening over headphones. In addition, Audio3D supports the use of a head tracker, so that the room stays stationary while you move your head.
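
    Audio3D itself is a native Windows program, but the underlying idea of binaural spatialisation can be illustrated in the browser with the web audio API's HRTF panner; the sketch below is unrelated to the Audio3D implementation:

      // Place a sound source to the listener's left using web audio's HRTF panning.
      // Unrelated to the Audio3D code - just an illustration of binaural spatialisation.
      const ctx = new AudioContext();
      const panner = ctx.createPanner();
      panner.panningModel = 'HRTF';          // head-related transfer function rendering
      panner.positionX.value = -2;           // two metres to the left of the listener
      panner.positionY.value = 0;
      panner.positionZ.value = -1;           // slightly in front
      panner.connect(ctx.destination);

      const osc = ctx.createOscillator();    // any source node would do here
      osc.frequency.value = 440;
      osc.connect(panner);
      osc.start();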

    I wrote Audio3D as part of our E-Lobes project into advanced hearing aids. We plan to run listening experiments in the virtual room, which will simulate the kinds of problematic listening environments faced by hearing-impaired listeners. Our goal is to develop "3D-aware" signal processing for hearing aids which will unlock the ability of the brain to deal with audio coming from different directions even when the listener's hearing is impaired.

  • 17 December 2015

    VULCAN and Principia Mission

    The iVOICE and VULCAN projects are mentioned on the Principia Mission web pages that describe the science experiments that will be performed by Tim Peake on his mission to the International Space Station.

    We hope that Tim will contribute to VULCAN by making some test recordings for us to explore the practicalities of obtaining high-quality audio recordings in space and to analyse how microgravity affects the voice.

    The excitement over Tim's mission has led to media interest in the voice analysis work. See this article from the Daily Telegraph.

  • 15 December 2015

    VULCAN voice analysis project

    The VULCAN project is a new feasibility study also funded by the European Space Agency under the Artes 20 programme. The project partners are UCL Speech, Hearing and Phonetic Sciences, UCL Mullard Space Sciences Laboratory Centre for Space Medicine and the Institute for Biomedical Problems (IBMP) in Moscow, Russia. It will run from January 2016 to January 2017.

    The VULCAN project is part of a larger endeavour investigating how psychological support may be given to astronauts undertaking a long-term mission, for example a mission to Mars that might take up to two years. VULCAN builds on the outcomes of the iVOICE project that showed how signal analysis and machine learning methods may be applied to the prediction of speaker fatigue and cognitive load from voice recordings. The idea of VULCAN is to develop a technology capable of monitoring the general health and well-being of astronauts on long-term missions from speech recordings.

    At the heart of VULCAN is a new technology for Longitudinal Voice Analysis. This combines innovative signal analysis methods with statistical modelling of a sequence of recordings to uncover either anomalous recordings or long-term trends in the voice. We will demonstrate the effectiveness of the technique by applying it to several thousand spoken messages recorded as part of the Mars500 simulated mission to Mars experiment conducted by IBMP in 2010/11.
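
    The statistical side can be imagined as follows (a deliberately simplified sketch, not the method being developed in VULCAN): given one summary feature per recording, flag recordings that deviate strongly from the history, and estimate any long-term drift with a straight-line fit:

      // Toy longitudinal analysis of one voice feature measured once per recording.
      // Deliberately simplified - not the VULCAN method, just the flavour of the idea.
      function analyse(series) {
        const n = series.length;
        const mean = series.reduce((s, x) => s + x, 0) / n;
        const sd = Math.sqrt(series.reduce((s, x) => s + (x - mean) ** 2, 0) / n);

        // anomalies: recordings more than 3 standard deviations from the mean
        const anomalies = series
          .map((x, i) => ({ index: i, z: (x - mean) / sd }))
          .filter(p => Math.abs(p.z) > 3);

        // long-term trend: least-squares slope of feature value against recording index
        const tMean = (n - 1) / 2;
        let num = 0, den = 0;
        series.forEach((x, i) => { num += (i - tMean) * (x - mean); den += (i - tMean) ** 2; });
        const slope = num / den;

        return { anomalies, trendPerRecording: slope };
      }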

    Read more about our applied voice research projects.

  • 23 July 2015

    ESYSTEM web application

    The web version of ESystem, the signals & systems learning tool, has been updated with the ability to upload signals and implement user-designed systems. You can find it on SpeechAndHearing.net.

  • 11 July 2015

    AmPitch web application

    The combination of more powerful computers and the web audio API means we can do much more signal processing within web applications. I've been meaning to update my RTPitch program for a while, so I've taken the opportunity to re-imagine it as a web application. AmPitch is a real-time scrolling amplitude and pitch display designed for speech. It works best when configured for the speaker's normal speaking pitch range. Try it out here.
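
    The real-time display side relies on the web audio API's analyser node; a stripped-down version of the amplitude meter (pitch tracking needs rather more work) might look like the sketch below, which is not the AmPitch code itself:

      // Minimal live amplitude meter using getUserMedia and an AnalyserNode.
      // Not the AmPitch code - just the skeleton of a real-time web audio display.
      async function startMeter(onLevel) {
        const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
        const ctx = new AudioContext();
        const analyser = ctx.createAnalyser();
        ctx.createMediaStreamSource(stream).connect(analyser);

        const buf = new Float32Array(analyser.fftSize);
        function tick() {
          analyser.getFloatTimeDomainData(buf);                 // latest block of samples
          const rms = Math.sqrt(buf.reduce((s, x) => s + x * x, 0) / buf.length);
          onLevel(20 * Math.log10(rms + 1e-10));                // level in dB
          requestAnimationFrame(tick);
        }
        tick();
      }

      // usage: startMeter(db => console.log(db.toFixed(1)));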

  • 26 June 2015

    Seahaven Towers Solitaire

    My favourite solitaire game is Seahaven Towers, so I've written a web version using only HTML, CSS and Javascript. It has some novel features, including a guarantee that all games can be solved. Try it out here. Click on the logo for playing instructions.