PromptLessSAM: From Foundational Model to Domain Expert via Lightweight Decoder Adaptation for Crack Segmentation

Mohammed Al-Mustafa Hendo; Israa H. Ali

doi:10.31436/iiumej.v27i2.4124

Authors

Mohammed Al-Mustafa Hendo, University of Babylon, https://orcid.org/0009-0006-8237-1623
Israa H. Ali, University of Babylon, https://orcid.org/0000-0003-0173-6071


Keywords:

Crack Semantic Segmentation, Segment Anything Model (SAM), Deep Learning, Structural Health Monitoring

Abstract

Maintaining infrastructure such as roads and bridges is essential for safety; the practice of monitoring its condition is known as Structural Health Monitoring (SHM). A key SHM task is the automatic detection of cracks in pavements. Recent foundation models, such as the Segment Anything Model (SAM), can segment a wide range of object types. However, they are not well suited to specialized tasks such as detecting thin cracks in complex environments, and they require user prompts to function, which makes them unsuitable for fully automatic inspection systems. This paper therefore proposes PromptLessSAM, which transforms the general-purpose SAM into a domain expert for crack segmentation. We use a lightweight adaptation in which SAM’s image encoder is frozen, except for the encoder’s neck, which remains trainable, and a new, small decoder is added. The resulting model has only about 1.1 million trainable parameters. We trained and tested the model on the public Crack500 dataset, where it achieved a Dice score of 72.7% and an IoU of 57.11%, outperforming many state-of-the-art models, including TransUNet and PaveSAM. The model also generalized well to the DeepCrack dataset, which it did not see during training (Dice score of 74.7% and IoU of 62.43%). PromptLessSAM thus shows how a large foundation model can be adapted efficiently to a specific task while training only a very small number of parameters.
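For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how a Dice score and an IoU can be computed for one pair of binary masks. It is illustrative only: the function name `dice_and_iou`, the toy masks, and the empty-mask convention are assumptions made for this example, and the abstract does not state how the paper aggregates scores over a dataset.

```python
def dice_and_iou(pred, target):
    """Return (dice, iou) for two binary masks given as nested lists of 0/1."""
    # Flatten both masks so they can be compared pixel by pixel.
    p = [int(v) for row in pred for v in row]
    t = [int(v) for row in target for v in row]
    if len(p) != len(t):
        raise ValueError("masks must have the same number of pixels")

    intersection = sum(a & b for a, b in zip(p, t))  # crack pixels in both masks
    pred_area = sum(p)
    target_area = sum(t)
    union = pred_area + target_area - intersection

    # Assumed convention: two empty masks count as a perfect match.
    if union == 0:
        return 1.0, 1.0

    dice = 2 * intersection / (pred_area + target_area)
    iou = intersection / union
    return dice, iou


# Hypothetical 4x4 example: a thin diagonal "crack" as ground truth, and a
# prediction that misses one crack pixel and adds one false positive.
target = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]
pred = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
]

dice, iou = dice_and_iou(pred, target)
print(f"Dice = {dice:.3f}, IoU = {iou:.3f}")  # Dice = 0.750, IoU = 0.600
```

For a single pair of masks the two metrics are linked by Dice = 2·IoU / (1 + IoU), so the reported Crack500 figures agree with each other (2 × 0.5711 / 1.5711 ≈ 0.727). The identity does not have to hold when scores are averaged image by image.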


