Joshua Sobel, Noah Bertram, C. Ding, F. Nargesian, D. Gildea
AWLCO: All-Window Length Co-Occurrence
Annual Symposium on Combinatorial Pattern Matching. Publication date: 2020-11-29.
DOI: 10.4230/LIPIcs.CPM.2021.24 (https://doi.org/10.4230/LIPIcs.CPM.2021.24)
Analyzing patterns in a sequence of events has applications in text analysis, computer programming, and genomics research. In this paper, we consider the all-window-length analysis model, which analyzes a sequence of events with respect to windows of all lengths. We study the exact co-occurrence counting problem under this model. Our first algorithm is an offline algorithm that counts all-window-length co-occurrences by making multiple passes over the sequence, computing the co-occurrences for a single window length in each pass. For a sequence of size $n$ and an itemset of size $|I|$, this algorithm takes $O(n)$ time per window length, and thus $O(n^2)$ time in total, and uses $O(|I|)$ space. We then propose AWLCO, an online algorithm that computes all-window-length co-occurrences in a single pass with expected time complexity $O(n)$ and space complexity $O(\sqrt{n|I|})$. Following this, we generalize our use case to patterns, for which we propose an algorithm that computes all-window-length co-occurrences with expected time complexity $O(n|I|)$ and space complexity $O(\sqrt{n|I|} + e_{max}|I|)$, where $e_{max}$ is the length of the largest pattern.
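The offline multi-pass baseline described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's code and it is not AWLCO, whose single-pass construction the abstract does not spell out. The sketch assumes that the co-occurrence of an itemset for window length $w$ means the number of length-$w$ windows containing every item of the itemset; the function names `count_windows` and `all_window_lengths` are hypothetical.

```python
from typing import Dict, Hashable, Iterable, Sequence


def count_windows(seq: Sequence[Hashable], itemset: Iterable[Hashable], w: int) -> int:
    """Count the length-w windows of seq that contain every item of itemset.

    Assumed definition of co-occurrence (not taken verbatim from the paper).
    One pass over seq: O(n) time, O(|I|) working space.
    """
    n = len(seq)
    if w < 1 or w > n:
        return 0
    counts = dict.fromkeys(itemset, 0)  # occurrences of each item in the current window
    missing = len(counts)               # items of the itemset absent from the window
    total = 0
    for i, x in enumerate(seq):
        # Item entering the window on the right.
        if x in counts:
            if counts[x] == 0:
                missing -= 1
            counts[x] += 1
        # Item leaving the window on the left.
        if i >= w:
            y = seq[i - w]
            if y in counts:
                counts[y] -= 1
                if counts[y] == 0:
                    missing += 1
        # The window seq[i-w+1 .. i] is complete once i >= w - 1.
        if i >= w - 1 and missing == 0:
            total += 1
    return total


def all_window_lengths(seq: Sequence[Hashable], itemset: Iterable[Hashable]) -> Dict[int, int]:
    """Run one O(n) pass per window length 1..n, for O(n^2) time in total."""
    items = list(itemset)
    return {w: count_windows(seq, items, w) for w in range(1, len(seq) + 1)}


if __name__ == "__main__":
    print(all_window_lengths("abcab", {"a", "b"}))
    # → {1: 0, 2: 2, 3: 3, 4: 2, 5: 1}
```

Each pass keeps only one counter per item, which matches the $O(|I|)$ working space quoted for the baseline; the returned dictionary of results is additional output, not working state. The sliding window reads `seq[i - w]`, so the sequence must be available for random access, which is why this approach is offline.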