Transfer Learning for Automatic Speech Recognition in Low-Resource Languages: A Case Study of Palestinian Arabic

Supervisor Name

Hamed Abdelhaq

Supervisor Email

hamed@najah.edu

University

An-Najah National University

Research field

Data Science

Bio

Hamed Abdelhaq is an Assistant Professor of Computer Science at An-Najah National University, Palestine. He received his PhD in 2016 from Heidelberg University in Germany, where his research focused on spatio-temporal data analysis, social media mining, and event detection. His doctoral thesis, supervised by Prof. Dr. Michael Gertz, explored methods for mining spatio-temporal patterns from social media streams to support the real-time identification of localized real-world events. Hamed earned both his BSc (2005) and MSc (2007) degrees in Computer Science from the University of Jordan. He then served for about three years as a lecturer in the Computer Science Department at An-Najah National University before pursuing his PhD under a DAAD scholarship. His current research interests include the application of Large Language Models (LLMs) and generative AI across various domains, with a particular emphasis on intelligent systems for healthcare. Hamed also worked as a senior data analyst at moovel group GmbH, a company that provides a wide range of mobility services, where his main role was to build recommendation systems that improve mobility. In addition, he worked remotely as a part-time data mining consultant at SocialDice, USA, with the main goal of building a smart resume ranking system.

Description

State-of-the-art automatic speech recognition (ASR) systems achieve strong performance in high-resource languages; however, they remain underdeveloped for many Arabic dialects, including Palestinian Arabic. The lack of curated, high-quality speech datasets and the limited number of deployment-oriented studies hinder the advancement of practical, locally deployable speech technologies for applications such as education, legal proceedings, and privacy-sensitive environments. This research aims to develop a carefully curated Palestinian Arabic speech dataset and to systematically investigate methods for adapting and optimizing modern ASR models to enable efficient, real-time, and privacy-aware local deployment. The outcomes of this research enable impactful applications such as: (1) Healthcare and other privacy-sensitive environments, where hospitals can preserve patient privacy by transcribing voice messages locally. (2) Live transcription tools for courts and legal environments. (3) A unified Palestinian university audio hub containing searchable lectures and recordings. (4) Foundations for multimodal educational archives supporting Palestinian students and researchers.

Dr. Yousef Najajreh

Dr. Reham Nazal

Dr. Rana Samara

Dr. Ahmed Bassalat

Dr. Nidal Farhat

Prof. Haynes Miller