
SLLM: A Memory-Efficient Fine-Tuning and Evaluation Pipeline for Medical Large Language Models

  • Seohyun Yoo
  • Joonseo Hyeon
  • Jaehyuk Cho* (*corresponding author for this work)
  • Eunkyung Shin

Jeonbuk National University

Research output: Contribution to conference > Conference paper > peer-review

Abstract

Recent large language models (LLMs) have shown strong performance in medical question answering (QA), but their use remains limited by challenges such as training and inference costs, prompt sensitivity in the medical domain, and a lack of evaluation frameworks. To address this, a system has been built for fine-tuning and evaluating medical LLMs. The system uses QLoRA for low-memory fine-tuning and integrates the Hugging Face Accelerate framework with multi-GPU distributed training. The lm-eval-harness provides robust automated evaluation. The validity of the system is demonstrated using the MedGemma 2B model and KorMedMCQA benchmarks. The experimental results show that sLLM can achieve 78.87% accuracy on MedQA while maintaining training efficiency. This suggests that prompt engineering can outperform meticulously calibrated models, offering a cost-effective way to implement medical LLMs. This work presents a scalable, efficient, and reproducible approach for developing high-performance LLMs, laying the foundation for future clinical integration using transparent systems.
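As a rough illustration of why QLoRA-style fine-tuning is memory-efficient, the sketch below does back-of-the-envelope arithmetic only. It is not taken from the paper: the 2-billion parameter count is a rounded assumption based on the "2B" model name, the hidden size of 2048 and rank of 16 are hypothetical values chosen for the example, and the estimate ignores quantization constants, activations, and optimizer state.

```python
def weight_memory_gib(n_params: int, bits_per_param: int) -> float:
    """Memory needed to store n_params weights at the given precision, in GiB."""
    return n_params * bits_per_param / 8 / 2**30


def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one d_out x d_in linear layer.

    LoRA learns two low-rank factors, A (rank x d_in) and B (d_out x rank),
    while the original weight matrix stays frozen.
    """
    return rank * (d_in + d_out)


# Assumption: a base model with roughly 2 billion parameters.
BASE_PARAMS = 2_000_000_000

fp16_gib = weight_memory_gib(BASE_PARAMS, 16)  # frozen weights kept in 16-bit
nf4_gib = weight_memory_gib(BASE_PARAMS, 4)    # frozen weights quantized to 4-bit

# Hypothetical layer shape and LoRA rank, for illustration only.
adapter_params = lora_trainable_params(d_in=2048, d_out=2048, rank=16)
full_layer_params = 2048 * 2048

print(f"16-bit base weights: {fp16_gib:.2f} GiB")
print(f"4-bit base weights:  {nf4_gib:.2f} GiB")
print(f"LoRA params per layer: {adapter_params} of {full_layer_params}")
```

Under these assumptions, the frozen base weights take a quarter of the 16-bit footprint, and each adapted layer trains only a small fraction of the parameters it would under full fine-tuning.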

Original language: English
Title of host publication: 2025 International Conference on Platform Technology and Service, PlatCon 2025 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 6-8
Number of pages: 3
Edition: 2025
ISBN (Electronic): 9798331576226
State: Published - 2025
Event: 2025 International Conference on Platform Technology and Service, PlatCon 2025 - Jeju, Korea, Republic of
Duration: 2025.08.25 to 2025.08.27

Conference

Conference: 2025 International Conference on Platform Technology and Service, PlatCon 2025
Country/Territory: Korea, Republic of
City: Jeju
Period: 25.08.25 to 25.08.27

Keywords

  • Distributed Training (FSDP)
  • Medical Large Language Models (Medical LLMs)
  • Medical Question Answering (Medical QA)
  • Parameter-Efficient Fine-Tuning (PEFT)
  • QLoRA
