Faizhal Arif Santosa, Manika Lamba, Crissandra George, J. Stephen Downie
arXiv - CS - Digital Libraries, published 2024-06-10. DOI: arxiv-2406.05949 (https://doi.org/arxiv-2406.05949)
Coconut Libtool: Bridging Textual Analysis Gaps for Non-Programmers
In the era of big and ubiquitous data, professionals and students alike
increasingly need to perform a range of textual analysis tasks.
Historically, the general lack of statistical expertise and programming skills
has stopped many with humanities or social sciences backgrounds from performing
and fully benefiting from such analyses. Thus, we introduce Coconut Libtool
(www.coconut-libtool.com/), an open-source, web-based application that utilizes
state-of-the-art natural language processing (NLP) technologies. Coconut
Libtool analyzes text data from customized files and bibliographic databases
such as Web of Science, Scopus, and Lens. Users can verify which functions can
be performed with the data they have. Coconut Libtool deploys multiple
algorithmic NLP techniques at the backend, including topic modeling (LDA,
Biterm, and BERTopic algorithms), network graph visualization, keyword
lemmatization, and sunburst visualization. Coconut Libtool is a people-first
web application designed to be used by professionals, researchers, and students
in the information sciences, digital humanities, and computational social
sciences domains to promote transparency, reproducibility, accessibility,
reciprocity, and responsibility in research practices.
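
To illustrate the kind of data preparation that typically sits behind a keyword network graph, the sketch below counts how often pairs of keywords appear together in the same bibliographic record. It is a minimal example only, not Coconut Libtool's actual backend: the function name `cooccurrence_edges`, the `keywords` field, the `;` separator, and the sample records are all assumptions made for illustration, and the light normalization shown (trimming and lowercasing) is a stand-in for, not an implementation of, lemmatization.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(records, field="keywords", sep=";"):
    """Count how often each pair of keywords appears in the same record."""
    edges = Counter()
    for record in records:
        raw = record.get(field, "")
        # Normalize lightly (trim + lowercase); this is not lemmatization.
        keywords = sorted({k.strip().lower() for k in raw.split(sep) if k.strip()})
        # Each unordered pair within one record contributes one co-occurrence.
        for pair in combinations(keywords, 2):
            edges[pair] += 1
    return edges


# Hypothetical records, shaped loosely like a bibliographic export.
records = [
    {"keywords": "Topic Modeling; NLP; Digital Libraries"},
    {"keywords": "NLP; topic modeling"},
    {"keywords": "Digital Libraries"},
]

edges = cooccurrence_edges(records)
for (a, b), weight in edges.most_common():
    print(f"{a} -- {b}: {weight}")
```

The resulting weighted edge list is the usual input to a graph-drawing library: each keyword becomes a node and each pair's count becomes the edge weight. Sorting the keywords within a record keeps every pair in a single canonical order, so `("nlp", "topic modeling")` and its reverse are never counted separately.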