Acoustic Scene and Room Classification for Real-Time Applications

Seeking optimal speech intelligibility for the hearing impaired

To provide optimal speech intelligibility in constantly changing environments, hearing aid parameters must be adapted in real time to acoustic scenes and soundscapes.

Acoustic signal processing for hearing aids seeks optimal speech intelligibility in constantly changing acoustic scenes and soundscapes, which requires the parameters of the processing algorithms to be adjusted in real time. This work introduces a system that continuously recognizes acoustic environments using Artificial Intelligence (AI) in the form of a deep Convolutional Neural Network (CNN), with a focus on real-time implementation. Inspired by VGGNet-16, the CNN architecture was modified into a multi-label, multi-output model that predicts combinations of scene and soundscape labels simultaneously while sharing the same feature extraction. For training, we acquired a custom dataset consisting of 23.8 hours of high-quality binaural audio, with five classes per label that are clearly distinguishable by humans. Using a manual grid search, we optimized three models of differing complexity, offering a choice of trade-offs between accuracy and throughput. The CNNs were then quantized to 8 bits after training, achieving an overall accuracy of 99.07% in the best case. After the number of Multiply-Accumulate (MAC) operations was reduced by a factor of 154 and the number of parameters by a factor of 18, the classifier still detected scenes and soundscapes with an acceptable accuracy of 94.82%. This compressed model allows real-time inference at the edge on discrete low-cost hardware with a clock speed of 10 MHz and one inference per second.
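The multi-output idea and the real-time budget can be illustrated with a toy sketch. The code below is not the VGGNet-16-inspired network from this work: it assumes a single 1-D convolution layer with global average pooling as the shared feature extraction, random weights, and a hypothetical target that executes one MAC per clock cycle. All function names are made up for illustration. Only the five classes per label, the 10 MHz clock, and the rate of one inference per second are taken from the text.

```python
import math
import random

NUM_CLASSES = 5            # five classes per label (from the text)
CLOCK_HZ = 10_000_000      # 10 MHz clock (from the text)
INFERENCES_PER_SECOND = 1  # one inference per second (from the text)


def conv1d_relu(signal, kernels):
    """Valid 1-D convolution followed by ReLU; one feature map per kernel."""
    maps = []
    for kernel in kernels:
        k = len(kernel)
        fmap = []
        for start in range(len(signal) - k + 1):
            acc = sum(signal[start + i] * kernel[i] for i in range(k))
            fmap.append(max(0.0, acc))
        maps.append(fmap)
    return maps


def global_average_pool(maps):
    """Reduce each feature map to its mean value."""
    return [sum(fmap) / len(fmap) for fmap in maps]


def dense_softmax(features, weights):
    """Linear layer (one weight row per class) followed by a stable softmax."""
    logits = [sum(w * f for w, f in zip(row, features)) for row in weights]
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(signal, kernels, scene_weights, soundscape_weights):
    """Compute the shared features once, then feed both output heads."""
    features = global_average_pool(conv1d_relu(signal, kernels))
    return (dense_softmax(features, scene_weights),
            dense_softmax(features, soundscape_weights))


def count_macs(signal_len, num_kernels, kernel_len):
    """MACs of the toy model: the convolution plus the two dense heads."""
    conv = (signal_len - kernel_len + 1) * kernel_len * num_kernels
    heads = 2 * NUM_CLASSES * num_kernels
    return conv + heads


def cycles_per_inference():
    """Clock cycles available for each inference."""
    return CLOCK_HZ // INFERENCES_PER_SECOND


# Toy usage with random data; the sizes are arbitrary assumptions.
rng = random.Random(0)
signal = [rng.uniform(-1.0, 1.0) for _ in range(64)]
kernels = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(4)]
scene_weights = [[rng.uniform(-1.0, 1.0) for _ in range(4)] for _ in range(NUM_CLASSES)]
soundscape_weights = [[rng.uniform(-1.0, 1.0) for _ in range(4)] for _ in range(NUM_CLASSES)]

scene_probs, soundscape_probs = predict(signal, kernels, scene_weights, soundscape_weights)
macs = count_macs(len(signal), len(kernels), len(kernels[0]))
# Assumption: one MAC per cycle, so the model fits if macs <= cycle budget.
print("MACs:", macs, "budget:", cycles_per_inference(),
      "fits:", macs <= cycles_per_inference())
```

Because both heads read the same feature vector, the convolution cost is paid once per inference regardless of how many labels are predicted. Under the assumed one MAC per cycle, a 10 MHz clock at one inference per second leaves a budget of about 10 million MACs per inference; real hardware may need more or fewer cycles per MAC.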



Prof. Dr. Jürgen Wassner

Co-Head, CC Intelligent Sensors and Networks

+41 41 349 33 56


Hochschule Luzern

Technik & Architektur

Technikumstrasse 21
CH- 6048 Horw

+41 41 349 33 11

technik-architektur@hslu.ch
