A joint project among five Australian universities, called AusKidTalk, has been launched to improve the performance of voice recognition systems when used by children.
The universities involved in AusKidTalk are the University of New South Wales (UNSW), the University of Sydney, Western Sydney University, Macquarie University, and the University of Melbourne.
As part of the project, they aim to build a database of Australian children's voices by recording 750 children aged between three and 12. The samples will cover typical speech, including the repetition of words, digits, and sentences, as well as disordered speech such as unscripted storytelling.
UNSW School of Electrical Engineering and Telecommunications senior lecturer Beena Ahmed explained that, until now, speech recognition software, which underpins virtual assistant technologies such as Google Assistant, Alexa, and Siri, has always relied on samples of adult voices, and the accuracy of these systems has been poor when interacting with children.
"The main reason for this is because children's speech is quite different from adults' speech," she said.
"Children's language skills aren't as sophisticated as adults'. They might mispronounce or leave sounds or words out, or change the expected order of words. Then there are physiological differences -- their vocal tract isn't fully developed, and until they hit puberty, they speak in much higher pitches. All this makes their speech very different from adults and therefore harder for speech recognition systems to process."
Ahmed added that improved speech recognition systems could potentially be used to detect whether a child is encountering speaking difficulties, as well as to provide immediate and ongoing feedback in speech training.
"Speech therapy is a very costly business," she said.
"You've got parents spending up to AU$200 for a session