Abstract
Large Language Models (LLMs) have significantly advanced Automatic Speech Recognition (ASR) by improving transcription accuracy, handling diverse linguistic contexts, and enabling cross-lingual and low-resource applications. This paper reviews the integration of LLMs into ASR systems, analyzing 19 research papers and 24 datasets. The aim is to examine key methodologies, including fine-tuning, transfer learning, and prompt engineering, and to highlight their impact on phoneme recognition and contextual understanding. The datasets reviewed span diverse languages, tasks, and domains, reflecting the growing emphasis on building inclusive ASR systems. The review also outlines the core ASR architecture and its four main modules, offering a concise synthesis of current advancements, identifying existing limitations, and suggesting future research directions to enhance the robustness, efficiency, and accessibility of LLM-powered ASR systems.
Keywords
Automatic Speech Recognition, Large Language Models, Fine-tuning, Transfer Learning, Prompt Engineering, Low-resource Languages