
A Comparative Study of Deep Learning Models for Symbol Detection in Technical Drawings

  • Benedikt Faltin
  • Damaris Gann
  • Markus König

Symbols are a universal way to convey complex information in technical drawings, as they can represent a wide range of elements, including components, materials, and relationships, in a concise and space-saving manner. Accurate symbol detection is therefore a crucial step towards the digital and automatic interpretation of pixel-based drawings. To enhance the efficiency of the digitization process, current research focuses on automating symbol detection using deep learning models. However, the ever-growing repertoire of model architectures makes it difficult for researchers and practitioners alike to maintain an overview of the latest advancements and to select the most suitable architecture for their respective use cases. To provide guidance, this contribution conducts a comparative study of prevalent and state-of-the-art model architectures for symbol detection in pixel-based construction drawings. To this end, six object detection architectures are evaluated: YOLOv5, YOLOv7, YOLOv8, Swin Transformer, ConvNeXt, and Faster R-CNN. The models are trained and tested on two distinct datasets from the bridge and residential building domains, both representing substantial sub-sectors of the construction industry. Furthermore, the models are evaluated against five criteria: detection accuracy, robustness to data scarcity, training time, inference time, and model size. In summary, our comparative study highlights the performance and capabilities of different deep learning models for symbol detection in construction drawings. Through this comprehensive evaluation and the practical insights derived from it, the research facilitates the advancement of automated symbol detection by exposing the strengths and weaknesses of the model architectures, thus providing users with valuable guidance in choosing the most appropriate model for their real-world applications.
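
As a purely illustrative sketch of how such a comparison could be set up for one of the six candidates, the snippet below fine-tunes and evaluates a YOLOv8 model with the Ultralytics Python API and times its inference. The dataset file bridge_symbols.yaml, the image folder, and the hyperparameters are hypothetical placeholders, not the authors' actual experimental configuration.

# Hypothetical sketch: fine-tuning and evaluating one candidate architecture
# (YOLOv8) with the Ultralytics API. The dataset YAML, image folder, and
# hyperparameters below are placeholders, not the study's actual configuration.
import time

from ultralytics import YOLO

model = YOLO("yolov8s.pt")  # start from COCO-pretrained weights

# Fine-tune on a YOLO-format symbol-detection dataset
model.train(data="bridge_symbols.yaml", epochs=100, imgsz=1280)

# Detection accuracy: COCO-style mAP on the validation split
metrics = model.val()
print("mAP@0.5:0.95 =", metrics.box.map)
print("mAP@0.5      =", metrics.box.map50)

# Inference time: average latency over a folder of drawing tiles
start = time.perf_counter()
results = model.predict(source="drawings/val_tiles", imgsz=1280, verbose=False)
elapsed = time.perf_counter() - start
print(f"average inference time: {elapsed / max(len(results), 1):.3f} s per image")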

  • Keywords: Computer Vision, Technical Drawings, Symbol Detection, Comparative Study

Benedikt Faltin

Ruhr-University Bochum, Germany - ORCID: 0000-0003-1354-7817

Damaris Gann

Ruhr-University Bochum, Germany

Markus König

Ruhr-University Bochum, Germany - ORCID: 0000-0002-2729-7743

PDF
  • Publication Year: 2023
  • Pages: 877-886

XML
  • Publication Year: 2023

Chapter Information

  • Chapter Title: A Comparative Study of Deep Learning Models for Symbol Detection in Technical Drawings
  • Authors: Benedikt Faltin, Damaris Gann, Markus König
  • DOI: 10.36253/979-12-215-0289-3.87
  • Peer Reviewed
  • Publication Year: 2023
  • Copyright Information: © 2023 Author(s)
  • Content License: CC BY-NC 4.0
  • Metadata License: CC0 1.0

Bibliographic Information

  • Book Title: CONVR 2023 - Proceedings of the 23rd International Conference on Construction Applications of Virtual Reality
  • Book Subtitle: Managing the Digital Transformation of Construction Industry
  • Editors: Pietro Capone, Vito Getuli, Farzad Pour Rahimian, Nashwan Dawood, Alessandro Bruttini, Tommaso Sorbi
  • Peer Reviewed
  • Publication Year: 2023
  • Copyright Information: © 2023 Author(s)
  • Content License: CC BY-NC 4.0
  • Metadata License: CC0 1.0
  • Publisher Name: Firenze University Press
  • DOI: 10.36253/979-12-215-0289-3
  • eISBN (pdf): 979-12-215-0289-3
  • eISBN (xml): 979-12-215-0257-2
  • Series Title: Proceedings e report
  • Series ISSN: 2704-601X
  • Series E-ISSN: 2704-5846
