Title: METODE VECTOR SPACE MODEL UNTUK WEB SCRAPING PADA WEBSITE FREELANCE (Vector Space Model Method for Web Scraping on Freelance Websites)
Authors: Andi Nurkholis, Yusra Fernando, Faris Arkans Ans
Journal: INTI Nusa Mandiri
Published: 2023-08-02
DOI: https://doi.org/10.33480/inti.v18i1.4266
Citation count: 0
Abstract
In the era of digitalization, the internet is central to nearly every community activity, including employment. Many platforms now list job vacancies, especially for freelancers, and users usually have to open several websites to find suitable ones. Web scraping offers a solution to this problem. Based on prior research, the BeautifulSoup and Selenium libraries are used to collect the data. To search the data, the vector space model method is used to measure the similarity between the query and each document. In the search evaluation, average recall is near-perfect at 98%, while average precision is 56%. Precision is lower because the search uses three parameters, which raises the chance of retrieving irrelevant data: a document is retrieved if it contains a word from the user's query, even when the context does not match. A Python web application built with the Streamlit framework displays the processing results and guides users through web scraping, data processing, and data search. This study aims to implement web scraping to retrieve data from the freelance websites Freelance, Project, and Sribulancer. By applying the vector space model method, users can search data from several websites without opening each freelance website one by one. Presenting the scraped results through the Streamlit web application also puts them in a more useful form and saves the user's time.
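
The abstract does not say how documents and queries are weighted, so the following is only a minimal sketch of a vector space model search, assuming TF-IDF weights and cosine similarity (a common formulation, not necessarily the authors'). The function names, the sample job titles, and the relevance labels are all hypothetical, and the three search parameters mentioned in the abstract are not modeled. The example also illustrates the reported pattern of high recall with lower precision: any document sharing a single word with the query is retrieved.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def tfidf_vectors(docs):
    """Build one sparse TF-IDF vector (term -> weight) per document.

    Assumption: idf = ln(N / df) + 1, so terms found in every document
    still keep a small positive weight.
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    idf = {term: math.log(n_docs / df) + 1.0 for term, df in doc_freq.items()}
    vectors = [
        {term: tf * idf[term] for term, tf in Counter(tokens).items()}
        for tokens in tokenized
    ]
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query, docs):
    """Return (doc_index, score) pairs with score > 0, best match first."""
    vectors, idf = tfidf_vectors(docs)
    # Query terms that never occur in the corpus carry no weight.
    query_vec = {
        term: tf * idf[term]
        for term, tf in Counter(tokenize(query)).items()
        if term in idf
    }
    scored = [(i, cosine(query_vec, vec)) for i, vec in enumerate(vectors)]
    hits = [pair for pair in scored if pair[1] > 0.0]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


def precision_recall(retrieved, relevant):
    """Precision = hits / retrieved, recall = hits / relevant."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical scraped job titles, not data from the study.
docs = [
    "python web scraping developer",
    "logo design for python conference",
    "mobile app developer android",
]
ranked = search("python developer", docs)
retrieved_ids = [doc_id for doc_id, _ in ranked]

# Assume only document 0 is truly relevant to the query.
precision, recall = precision_recall(retrieved_ids, relevant=[0])
print(ranked)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy corpus every document shares at least one word with the query, so all three are retrieved: the single relevant document is found (full recall), but two of the three results are off-topic, which pulls precision down. Applying a minimum similarity threshold in `search` would be one way to trade some recall for precision.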