PMID- 26388721
OWN - NLM
STAT- PubMed-not-MEDLINE
DCOM- 20150921
LR  - 20231104
IS  - 1662-4548 (Print)
IS  - 1662-453X (Electronic)
IS  - 1662-453X (Linking)
VI  - 9
DP  - 2015
TI  - Sound stream segregation: a neuromorphic approach to solve the "cocktail
      party problem" in real-time.
PG  - 309
LID - 10.3389/fnins.2015.00309 [doi]
LID - 309
AB  - The human auditory system has the ability to segregate complex auditory
      scenes into a foreground component and a background, allowing us to
      listen to specific speech sounds from a mixture of sounds. Selective
      attention plays a crucial role in this process, colloquially known as the
      "cocktail party effect." It has not been possible to build a machine that
      can emulate this human ability in real-time. Here, we have developed a
      framework for the implementation of a neuromorphic sound segregation
      algorithm in a Field Programmable Gate Array (FPGA). This algorithm is
      based on the principles of temporal coherence and uses an attention
      signal to separate a target sound stream from background noise. Temporal
      coherence implies that auditory features belonging to the same sound
      source are coherently modulated and evoke highly correlated neural
      response patterns. The basis for this form of sound segregation is that
      responses from pairs of channels that are strongly positively correlated
      belong to the same stream, while channels that are uncorrelated or
      anti-correlated belong to different streams. In our framework, we have
      used a neuromorphic cochlea as a frontend sound analyser to extract
      spatial information of the sound input, which then passes through band
      pass filters that extract the sound envelope at various modulation rates.
      Further stages include feature extraction and mask generation, which is
      finally used to reconstruct the targeted sound. Using sample tonal and
      speech mixtures, we show that our FPGA architecture is able to segregate
      sound sources in real-time.
      The accuracy of segregation is indicated by the high signal-to-noise
      ratio (SNR) of the segregated stream (90, 77, and 55 dB for simple tone,
      complex tone, and speech, respectively) as compared to the SNR of the
      mixture waveform (0 dB). This system may be easily extended for the
      segregation of complex speech signals, and may thus find various
      applications in electronic devices such as for sound segregation and
      speech recognition.
FAU - Thakur, Chetan Singh
AU  - Thakur CS
AD  - Biomedical Engineering and Neuroscience, The MARCS Institute, University
      of Western Sydney, Sydney, NSW, Australia.
FAU - Wang, Runchun M
AU  - Wang RM
AD  - Biomedical Engineering and Neuroscience, The MARCS Institute, University
      of Western Sydney, Sydney, NSW, Australia.
FAU - Afshar, Saeed
AU  - Afshar S
AD  - Biomedical Engineering and Neuroscience, The MARCS Institute, University
      of Western Sydney, Sydney, NSW, Australia.
FAU - Hamilton, Tara J
AU  - Hamilton TJ
AD  - Biomedical Engineering and Neuroscience, The MARCS Institute, University
      of Western Sydney, Sydney, NSW, Australia.
FAU - Tapson, Jonathan C
AU  - Tapson JC
AD  - Biomedical Engineering and Neuroscience, The MARCS Institute, University
      of Western Sydney, Sydney, NSW, Australia.
FAU - Shamma, Shihab A
AU  - Shamma SA
AD  - Department of Electrical and Computer Engineering and Institute for
      Systems Research, University of Maryland, College Park, MD, USA.
FAU - van Schaik, Andre
AU  - van Schaik A
AD  - Biomedical Engineering and Neuroscience, The MARCS Institute, University
      of Western Sydney, Sydney, NSW, Australia.
LA  - eng
GR  - R01 DC007657/DC/NIDCD NIH HHS/United States
PT  - Journal Article
DEP - 20150902
PL  - Switzerland
TA  - Front Neurosci
JT  - Frontiers in neuroscience
JID - 101478481
PMC - PMC4557082
OTO - NOTNLM
OT  - FPGA
OT  - cochlea
OT  - cocktail party problem
OT  - machine-based speech recognition
OT  - temporal coherence
EDAT- 2015/09/22 06:00
MHDA- 2015/09/22 06:01
PMCR- 2015/01/01
CRDT- 2015/09/22 06:00
PHST- 2015/05/19 00:00 [received]
PHST- 2015/08/18 00:00 [accepted]
PHST- 2015/09/22 06:00 [entrez]
PHST- 2015/09/22 06:00 [pubmed]
PHST- 2015/09/22 06:01 [medline]
PHST- 2015/01/01 00:00 [pmc-release]
AID - 10.3389/fnins.2015.00309 [doi]
PST - epublish
SO  - Front Neurosci. 2015 Sep 2;9:309. doi: 10.3389/fnins.2015.00309.
      eCollection 2015.