Enhancing book genre classification with BERT and InceptionV3: a deep learning approach for libraries
Xinting Yang, Zehua Zhang
PeerJ Computer Science, vol. 11, e2934. Published 2025-06-05. DOI: https://doi.org/10.7717/peerj-cs.2934
Impact factor 3.5; JCR Q2 (Computer Science, Artificial Intelligence)
Accurate book genre classification is essential for library organization, information retrieval, and personalized recommendations. Traditional classification methods, often reliant on manual categorization and metadata-based approaches, struggle with the complexities of hybrid genres and evolving literary trends. To address these limitations, this study proposes a hybrid deep learning model that integrates visual and textual features for enhanced genre classification. Specifically, we employ InceptionV3, an advanced convolutional neural network architecture, to extract visual features from book cover images and bidirectional encoder representations from transformers (BERT) to analyze textual data from book titles. A scaled dot-product attention mechanism is used to effectively fuse these multimodal features, dynamically weighting their contributions based on contextual relevance. Experimental results on the BookCover30 dataset demonstrate that our proposed model outperforms baseline approaches, achieving a balanced accuracy of 0.7951 and an F1-score of 0.7920, surpassing both standalone image- and text-based classifiers. This study highlights the potential of deep learning in improving automated genre classification, offering a scalable and adaptable solution for libraries and digital platforms. Future research may focus on expanding dataset diversity, optimizing computational efficiency, and addressing biases in classification models.
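The fusion step described in the abstract can be illustrated with a minimal sketch of scaled dot-product attention over two modality embeddings. This is not the authors' implementation: the 4-dimensional feature values, the choice of the text embedding as the query, and the function names `attention_fuse`, `softmax`, and `dot` are all assumptions made for illustration. It also assumes the InceptionV3 and BERT outputs have already been projected to a common dimension.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def attention_fuse(query, keys, values):
    """Scaled dot-product attention for a single query vector.

    weights_i = softmax_i(query . key_i / sqrt(d))
    fused     = sum_i weights_i * value_i
    """
    d = len(query)
    scores = [dot(query, k) / math.sqrt(d) for k in keys]
    weights = softmax(scores)
    dim = len(values[0])
    fused = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]
    return fused, weights


# Hypothetical embeddings, assumed already projected to the same dimension.
image_feat = [0.2, 0.1, 0.4, 0.3]  # stands in for an InceptionV3 cover feature
text_feat = [0.9, 0.7, 0.1, 0.5]   # stands in for a BERT title feature

# Assumption: the text embedding acts as the query over both modalities.
modalities = [image_feat, text_feat]
fused, weights = attention_fuse(text_feat, modalities, modalities)

print("modality weights (image, text):", weights)
print("fused feature:", fused)
```

The attention weights sum to one, so the fused vector is a convex combination of the two modality features. A learned model would typically add trainable projection matrices for queries, keys, and values before this step, which the sketch omits.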
About the journal:
PeerJ Computer Science is the new open access journal covering all subject areas in computer science, with the backing of a prestigious advisory board and more than 300 academic editors.