Optimizing TinyBERT-Vosk for Real-Time Voice Control of a Raspberry Pi Mecanum Robot

Gogor Christtmas Setyawan (1*), Surjawirawan Dwiputranto (2), Kristian Juri Damai Lase (3)
(1) Universitas Kristen Immanuel
(2) Universitas Kristen Immanuel
(3) Universitas Kristen Immanuel
(*) Corresponding Author
DOI: 10.35889/jutisi.v14i3.3327

Keywords


Vosk/ASR; TinyBERT; INT8 quantization; Voice control; Raspberry Pi.
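
The abstract and full text are in the PDF, but the title and keywords already outline the pipeline: Vosk streaming ASR on the Raspberry Pi, an INT8-quantized TinyBERT intent classifier, and motion commands for the mecanum base. The sketch below illustrates that kind of pipeline and is not the authors' implementation; the Vosk model directory, the "intent-tinybert" checkpoint, the intent label list, and the drive() helper are all assumptions.

# Minimal illustrative sketch: Vosk streaming ASR feeding a dynamically
# INT8-quantized TinyBERT intent classifier that drives a mecanum robot.
# Assumed/hypothetical: Vosk model path, "intent-tinybert" checkpoint,
# the INTENTS label list, and the drive() placeholder.
import json
import pyaudio
import torch
from vosk import Model, KaldiRecognizer
from transformers import AutoTokenizer, AutoModelForSequenceClassification

asr_model = Model("vosk-model-small-id")              # assumed local Vosk model directory
tokenizer = AutoTokenizer.from_pretrained("intent-tinybert")        # hypothetical checkpoint
classifier = AutoModelForSequenceClassification.from_pretrained("intent-tinybert")
# Post-training dynamic INT8 quantization of the Linear layers (PyTorch API)
classifier = torch.quantization.quantize_dynamic(
    classifier, {torch.nn.Linear}, dtype=torch.qint8)
classifier.eval()

INTENTS = ["maju", "mundur", "geser kiri", "geser kanan", "putar", "berhenti"]  # example labels

def classify(text):
    """Map an ASR transcript to one motion intent."""
    with torch.no_grad():
        logits = classifier(**tokenizer(text, return_tensors="pt")).logits
    return INTENTS[int(logits.argmax(dim=-1))]

def drive(intent):
    """Placeholder for the mecanum wheel command (motor driver / PWM)."""
    print("intent:", intent)

pa = pyaudio.PyAudio()
stream = pa.open(format=pyaudio.paInt16, channels=1, rate=16000,
                 input=True, frames_per_buffer=4000)
recognizer = KaldiRecognizer(asr_model, 16000)

while True:
    chunk = stream.read(4000, exception_on_overflow=False)
    if recognizer.AcceptWaveform(chunk):              # end of utterance detected
        text = json.loads(recognizer.Result()).get("text", "")
        if text:
            drive(classify(text))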

References


A. Baevski, H. Zhou, A. Mohamed, and M. Auli, “wav2vec 2.0: A framework for self-supervised learning of speech representations,” Adv. Neural Inf. Process. Syst., 2020, doi: 10.48550/arXiv.2006.11477.

A. Radford, J. W. Kim, T. Xu, G. Brockman, C. McLeavey, and I. Sutskever, “Robust Speech Recognition via Large-Scale Weak Supervision,” Dec. 06, 2022, arXiv: arXiv:2212.04356. doi: 10.48550/arXiv.2212.04356.

M. Sharma, S. Joshi, T. Chatterjee, and R. Hamid, “A comprehensive empirical review of modern voice activity detection approaches for movies and TV shows,” Neurocomputing, vol. 494, pp. 116–131, July 2022, doi: 10.1016/j.neucom.2022.04.084.

G. Xiao, J. Lin, M. Seznec, H. Wu, J. Demouth, and S. Han, “SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models,” Mar. 29, 2024, arXiv: arXiv:2211.10438. doi: 10.48550/arXiv.2211.10438.

X. Jiao et al., “TinyBERT: Distilling BERT for natural language understanding,” in Findings of EMNLP, 2020, doi: 10.48550/arXiv.1909.10351.

Z. Sun, H. Yu, X. Song, R. Liu, Y. Yang, and D. Zhou, “MobileBERT: a Compact Task-Agnostic BERT for Resource-Limited Devices,” Apr. 14, 2020, arXiv: arXiv:2004.02984. doi: 10.48550/arXiv.2004.02984.

S. Shen et al., “Q-BERT: Hessian based ultra low precision quantization of BERT,” 2019, doi: 10.48550/arXiv.1909.05840.

P. Kusnerik et al., “Intent detection and slot filling: A survey,” ACM Comput. Surv., vol. 57, no. 6, 2024, doi: 10.1145/3547138.

M. Firdaus, A. Ekbal, and E. Cambria, “Multitask learning for multilingual intent detection and slot filling in dialogue systems,” Inf. Fusion, vol. 91, pp. 299–315, Mar. 2023, doi: 10.1016/j.inffus.2022.09.029.

A. Laskar et al., “Automatic speech recognition: A survey,” Eng. Appl. Artif. Intell., 2025.

M. Minderer et al., “Revisiting the Calibration of Modern Neural Networks,” Oct. 26, 2021, arXiv: arXiv:2106.07998. doi: 10.48550/arXiv.2106.07998.

N. R. Ke et al., “Sparsity in deep learning: Pruning and growth for efficient inference and training,” J. Mach. Learn. Res., vol. 22, no. 241, pp. 1–124, 2021, doi: 10.48550/arXiv.2102.00554.

V. Sanh, L. Debut, J. Chaumond, and T. Wolf, “DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter,” Mar. 01, 2020, arXiv: arXiv:1910.01108. doi: 10.48550/arXiv.1910.01108.

A. Fan et al., “EdgeBERT: Sentence-Level Energy Optimizations for On-Device NLP Inference,” in IEEE/ACM MICRO, 2021, doi: 10.48550/arXiv.2011.14203.

E. Frantar et al., “GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers,” 2022, doi: 10.48550/arXiv.2210.17323.

J. Li et al., “On-Device End-to-End Automatic Speech Recognition for Multilingual Spoken Queries,” in Interspeech, 2022, doi: 10.21437/Interspeech.2022-10006.

Y. Shangguan et al., “Analyzing the Quality and Stability of a Streaming End-to-End On-Device Speech Recognizer,” 2020, doi: 10.48550/arXiv.2006.01416.

B. Desplanques et al., “Voice Activity Detection with Self-Supervised Representations,” 2022, doi: 10.48550/arXiv.2209.11061.

W. Wang et al., “MiniLM: Deep Self-Attention Distillation for Task-Agnostic Compression of Pre-Trained Transformers,” 2020, doi: 10.48550/arXiv.2002.10957.

Z. Yao et al., “ZeroQuant: Efficient and Affordable Post-Training Quantization for Large Language Models,” NeurIPS, 2022, doi: 10.48550/arXiv.2206.01861.

L. Hou et al., “DynaBERT: Dynamic BERT with Adaptive Width and Depth,” 2020, doi: 10.48550/arXiv.2004.04037.

Y. He et al., “Streaming End-to-End Speech Recognition for Mobile Devices,” 2018, doi: 10.48550/arXiv.1811.06621.

U. Evci et al., “Rigging the Lottery: Making All Tickets Winners (RigL),” in ICLR, 2020, doi: 10.48550/arXiv.1911.11134.

T. Dettmers et al., “LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale,” NeurIPS, 2022, doi: 10.48550/arXiv.2208.07339.

V. Sanh, T. Wolf, and A. Rush, “Movement Pruning: Adaptive Sparsity by Fine-Tuning,” in NeurIPS, 2020, doi: 10.48550/arXiv.2005.07683.

O. Zafrir et al., “Q8BERT: Quantized 8-Bit BERT,” 2019, doi: 10.48550/arXiv.1910.06188.

M. Kull et al., “Beyond Temperature Scaling: Dirichlet Calibration,” in NeurIPS, 2019, doi: 10.48550/arXiv.1910.12656.

