Faster and Deterministic Subtrajectory Clustering
Ivor van der Hoog, Thijs van der Horst, Tim Ophelders
arXiv - CS - Computational Geometry, 2024-02-20. https://doi.org/arxiv-2402.13117

Given a trajectory $T$ and a distance $\Delta$, we wish to find a set $C$ of
curves of complexity at most $\ell$, such that we can cover $T$ with subcurves
that are each within Fr\'echet distance $\Delta$ of at least one curve in $C$.
We call $C$ an $(\ell,\Delta)$-clustering and aim to find an
$(\ell,\Delta)$-clustering of minimum cardinality. This problem was introduced
by Akitaya et al. (2021) and shown to be NP-complete. The main focus has
therefore been on bicriteria approximation algorithms, allowing for the
clustering to be an $(\ell, \Theta(\Delta))$-clustering of roughly optimal
size. We present algorithms that construct $(\ell,4\Delta)$-clusterings of
$\mathcal{O}(k \log n)$ size, where $k$ is the size of the optimal $(\ell,
\Delta)$-clustering. For the discrete Fr\'echet distance, we use $\mathcal{O}(n
\ell \log n)$ space and $\mathcal{O}(k n^2 \log^3 n)$ deterministic worst-case
time. For the continuous Fr\'echet distance, we use $\mathcal{O}(n^2 \log n)$
space and $\mathcal{O}(k n^3 \log^3 n)$ time. Our algorithms significantly
improve upon the clustering quality (improving the approximation factor in
$\Delta$) and size (whenever $\ell \in \Omega(\log n)$). We offer deterministic
running times comparable to known expected bounds. Additionally, in the
continuous setting, we give a near-linear improvement upon the space usage.
When compared only to deterministic results, we offer a near-linear speedup and
a near-quadratic improvement in the space usage. When we may restrict ourselves
to only considering clusters where all subtrajectories are vertex-to-vertex
subcurves, we obtain even better results under the continuous Fr\'echet
distance. Our algorithm then runs in near-quadratic time and uses space that is near
linear in $n \ell$.
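
To make the problem statement concrete, the following minimal sketch checks whether a candidate set of curves is an $(\ell,\Delta)$-clustering under the discrete Fr\'echet distance. It is not the algorithm of the paper: it uses the standard quadratic dynamic program for the discrete Fr\'echet distance and a brute-force verifier. The names `discrete_frechet` and `is_clustering`, the representation of a cluster as a center curve plus inclusive vertex index ranges of $T$, and the choice to check coverage on vertices only are all assumptions made for illustration.

```python
from math import dist


def discrete_frechet(P, Q):
    """Discrete Frechet distance between point sequences P and Q.

    Standard O(|P| * |Q|) dynamic program; not the paper's data structure.
    """
    n, m = len(P), len(Q)
    # dp[i][j] = discrete Frechet distance between P[:i+1] and Q[:j+1]
    dp = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = dist(P[i], Q[j])
            if i == 0 and j == 0:
                dp[i][j] = d
            elif i == 0:
                dp[i][j] = max(dp[0][j - 1], d)
            elif j == 0:
                dp[i][j] = max(dp[i - 1][0], d)
            else:
                best_prev = min(dp[i - 1][j], dp[i][j - 1], dp[i - 1][j - 1])
                dp[i][j] = max(best_prev, d)
    return dp[-1][-1]


def is_clustering(T, clusters, ell, delta):
    """Check a candidate (ell, delta)-clustering of trajectory T.

    clusters: list of (center, ranges), where center is a point sequence and
    ranges is a list of inclusive vertex index pairs (a, b) into T.
    Assumption: coverage is verified on the vertices of T only.
    """
    covered = [False] * len(T)
    for center, ranges in clusters:
        if len(center) > ell:  # center curve has too many vertices
            return False
        for a, b in ranges:
            if discrete_frechet(T[a:b + 1], center) > delta:
                return False
            for i in range(a, b + 1):
                covered[i] = True
    return all(covered)


# Hypothetical example data: four collinear vertices, one center of complexity 2.
T = [(0, 0), (1, 0), (2, 0), (3, 0)]
center = [(0, 0), (1, 0)]
clusters = [(center, [(0, 1), (2, 3)])]
```

With this example data, the second subcurve lies at discrete Fr\'echet distance 2 from the center, so the single cluster is a valid clustering for $\ell = 2$ and $\Delta = 2$ but not for $\Delta = 1$.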