arXiv - CS - Information Theory: Latest Papers (Page 3)

Properties of Shannon and Rényi entropies of the Poisson distribution as the functions of intensity parameter

arXiv - CS - Information Theory Pub Date : 2024-02-06 DOI: arxiv-2403.08805

Volodymyr Braiman, Anatoliy Malyarenko, Yuliya Mishura, Yevheniia Anastasiia Rudyk

Citations: 0

TexShape: Information Theoretic Sentence Embedding for Language Models

arXiv - CS - Information Theory Pub Date : 2024-02-05 DOI: arxiv-2402.05132

H. Kaan Kale, Homa Esfahanizadeh, Noel Elias, Oguzhan Baser, Muriel Medard, Sriram Vishwanath

Abstract: With the exponential growth in data volume and the emergence of data-intensive applications, particularly in the field of machine learning, concerns related to resource utilization, privacy, and fairness have become paramount. This paper focuses on the textual domain of data and addresses challenges regarding encoding sentences to their optimized representations through the lens of information theory. In particular, we use empirical estimates of mutual information, using the Donsker-Varadhan definition of Kullback-Leibler divergence. Our approach leverages this estimation to train an information-theoretic sentence embedding, called TexShape, for (task-based) data compression or for filtering out sensitive information, enhancing privacy and fairness. In this study, we employ a benchmark language model for initial text representation, complemented by neural networks for information-theoretic compression and mutual information estimations. Our experiments demonstrate significant advancements in preserving maximal targeted information and minimal sensitive information over adverse compression ratios, in terms of predictive accuracy of downstream models that are trained using the compressed data.
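
For reference, the Donsker-Varadhan representation named in the TexShape abstract can be written in its standard textbook form as below. The symbol $T$ for the test function is an assumption of this note, and the abstract does not state which function class or estimator the authors actually use.

```latex
D_{\mathrm{KL}}(P \,\|\, Q)
  = \sup_{T} \left\{ \mathbb{E}_{P}[T] - \log \mathbb{E}_{Q}\!\left[e^{T}\right] \right\},
\qquad
I(X;Y) = D_{\mathrm{KL}}\!\left(P_{XY} \,\|\, P_X \otimes P_Y\right).
```

Restricting the supremum to a parametric family (for example, a neural network) and replacing the expectations with sample averages yields an empirical estimate of the mutual information.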

Citations: 0

Characterization of the Distortion-Perception Tradeoff for Finite Channels with Arbitrary Metrics

arXiv - CS - Information Theory Pub Date : 2024-02-03 DOI: arxiv-2402.02265

Dror Freirich, Nir Weinberger, Ron Meir

Citations: 0

The Information of Large Language Model Geometry

arXiv - CS - Information Theory Pub Date : 2024-02-01 DOI: arxiv-2402.03471

Zhiquan Tan, Chenghai Li, Weiran Huang

Citations: 0

Extracting and visualizing a new classification system for Colombia's National Administrative Department of Statistics. A visual analytics framework case study

arXiv - CS - Information Theory Pub Date : 2024-01-29 DOI: arxiv-2401.15994

Pierre Raimbaud (UNIANDES), Jaime Camilo Espitia Castillo (UNIANDES), John Guerra-Gomez (Northeastern University, Silicon Valley Campus)

Abstract: In a world filled with data, it is expected for a nation to take decisions informed by data. However, countries need to first collect and publish such data in a way meaningful for both citizens and policy makers. A good thematic classification could be instrumental in helping users navigate and find the right resources on a rich data repository such as the one collected by Colombia's National Administrative Department of Statistics (DANE). The Visual Analytics Framework is a methodology for conducting visual analysis developed by T. Munzner et al. [T. Munzner, Visualization Analysis and Design, A K Peters Visualization Series, 1, 2014] that could help with this task. This paper presents a case study applying such a framework, conducted to help the DANE better visualize their data repository and present a more understandable classification of it. It describes three main analysis tasks identified, the proposed solutions and the collection of insights generated from them.

Citations: 0

Predictability and Randomness

arXiv - CS - Information Theory Pub Date : 2024-01-23 DOI: arxiv-2401.13066

Lenhart K. Schubert

Abstract: Algorithmic theories of randomness can be related to theories of probabilistic sequence prediction through the notion of a predictor, defined as a function which supplies lower bounds on initial-segment probabilities of infinite sequences. An infinite binary sequence $z$ is called unpredictable iff its initial-segment "redundancy" $n+\log p(z(n))$ remains sufficiently low relative to every effective predictor $p$. A predictor which maximizes the initial-segment redundancy of a sequence is called optimal for that sequence. It turns out that a sequence is random iff it is unpredictable. More generally, a sequence is random relative to an arbitrary computable distribution iff the distribution is itself an optimal predictor for the sequence. Here "random" can be taken in the sense of Martin-Löf by using weak criteria of effectiveness, or in the sense of Schnorr by using stronger criteria of effectiveness. Under the weaker criteria of effectiveness it is possible to construct a universal predictor which is optimal for all infinite sequences. This predictor assigns nonvanishing limit probabilities precisely to the recursive sequences. Under the stronger criteria of effectiveness it is possible to establish a law of large numbers for sequences random relative to a computable distribution, which may be useful as a criterion of "rationality" for methods of probabilistic prediction. A remarkable feature of effective predictors is the fact that they are expressible in the special form first proposed by Solomonoff. In this form sequence prediction reduces to assigning high probabilities to initial segments with short and/or numerous encodings. This fact provides the link between theories of randomness and Solomonoff's theory of prediction.
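
As a rough illustration of the initial-segment redundancy $n+\log p(z(n))$ from the abstract above, the sketch below evaluates it for one very simple predictor. It assumes a base-2 logarithm and an i.i.d. Bernoulli predictor; both are choices made for this note, and the function name `redundancy` is hypothetical rather than taken from the paper.

```python
import math


def redundancy(prefix, p_one):
    """Return n + log2 p(prefix) for an i.i.d. Bernoulli(p_one) predictor.

    prefix : list of 0/1 bits, the initial segment z(n)
    p_one  : probability the (assumed) predictor assigns to a 1, with 0 < p_one < 1
    """
    n = len(prefix)
    ones = sum(prefix)
    # log2-probability of the prefix under the product measure
    log_p = ones * math.log2(p_one) + (n - ones) * math.log2(1.0 - p_one)
    return n + log_p


# The uniform predictor (p_one = 0.5) has zero redundancy on every prefix.
uniform_value = redundancy([0, 1, 1, 0], 0.5)

# A predictor biased toward 0 accumulates redundancy on an all-zero prefix.
biased_value = redundancy([0] * 10, 0.1)
```

Under these assumptions, a predictor whose redundancy grows without bound along a sequence is exploiting a regularity in it, which is the intuition behind calling such a sequence predictable.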

Citations: 0

Rate-Distortion-Perception Tradeoff Based on the Conditional-Distribution Perception Measure

arXiv - CS - Information Theory Pub Date : 2024-01-22 DOI: arxiv-2401.12207

Sadaf Salehkalaibar, Jun Chen, Ashish Khisti, Wei Yu

Abstract: We study the rate-distortion-perception (RDP) tradeoff for a memoryless source model in the asymptotic limit of large block-lengths. Our perception measure is based on a divergence between the distributions of the source and reconstruction sequences conditioned on the encoder output, which was first proposed in [1], [2]. We consider the case when there is no shared randomness between the encoder and the decoder. For the case of discrete memoryless sources we derive a single-letter characterization of the RDP function, thus settling a problem that remains open for the marginal metric introduced in Blau and Michaeli [3] (with no shared randomness). Our achievability scheme is based on lossy source coding with a posterior reference map proposed in [4]. For the case of continuous valued sources under squared error distortion measure and squared quadratic Wasserstein perception measure we also derive a single-letter characterization and show that a noise-adding mechanism at the decoder suffices to achieve the optimal representation. For the case of zero perception loss, we show that our characterization interestingly coincides with the results for the marginal metric derived in [5], [6] and again demonstrate that zero perception loss can be achieved with a $3$-dB penalty in the minimum distortion. Finally we specialize our results to the case of Gaussian sources. We derive the RDP function for vector Gaussian sources and propose a waterfilling type solution. We also partially characterize the RDP function for a mixture of vector Gaussians.

Citations: 0

Near-Field Localization with $1$-bit Quantized Hybrid A/D Reception

arXiv - CS - Information Theory Pub Date : 2024-01-22 DOI: arxiv-2401.12029

Ioannis Gavras, Italo Atzeni, George C. Alexandropoulos

Citations: 0

Generalization and Informativeness of Conformal Prediction

arXiv - CS - Information Theory Pub Date : 2024-01-22 DOI: arxiv-2401.11810

Matteo Zecchin, Sangwoo Park, Osvaldo Simeone, Fredrik Hellström

Abstract: The safe integration of machine learning modules in decision-making processes hinges on their ability to quantify uncertainty. A popular technique to achieve this goal is conformal prediction (CP), which transforms an arbitrary base predictor into a set predictor with coverage guarantees. While CP certifies the predicted set to contain the target quantity with a user-defined tolerance, it does not provide control over the average size of the predicted sets, i.e., over the informativeness of the prediction. In this work, a theoretical connection is established between the generalization properties of the base predictor and the informativeness of the resulting CP prediction sets. To this end, an upper bound is derived on the expected size of the CP set predictor that builds on generalization error bounds for the base predictor. The derived upper bound provides insights into the dependence of the average size of the CP set predictor on the amount of calibration data, the target reliability, and the generalization performance of the base predictor. The theoretical insights are validated using simple numerical regression and classification tasks.
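
To make the conformal prediction abstract above more concrete, here is a minimal sketch of how a base predictor can be turned into a set predictor. It assumes the split (inductive) variant with an absolute-residual score for regression; the abstract does not say which variant or score the authors analyze, and the function names `conformal_radius` and `predict_interval` are hypothetical.

```python
import math


def conformal_radius(cal_residuals, alpha):
    """Radius of a split-conformal interval at miscoverage level alpha.

    cal_residuals : absolute errors |y_i - f(x_i)| of the base predictor f
                    on held-out calibration data
    alpha         : tolerated miscoverage, e.g. 0.1 for 90% coverage
    """
    n = len(cal_residuals)
    # Rank of the calibration score used as the threshold
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: the set is unbounded
        return math.inf
    return sorted(cal_residuals)[k - 1]


def predict_interval(point_prediction, radius):
    """Prediction set [f(x) - radius, f(x) + radius] for a new input."""
    return (point_prediction - radius, point_prediction + radius)


# Example: three calibration residuals, 50% target coverage
radius = conformal_radius([0.3, 0.1, 0.2], alpha=0.5)
interval = predict_interval(1.0, radius)
```

In this sketch the interval width is twice the radius, so it shrinks when the base predictor's calibration residuals are small and becomes infinite when the calibration set is too small for the requested reliability. These are the same three factors the abstract lists: calibration data, target reliability, and generalization performance.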

Citations: 0

Data-oriented Coordinated Uplink Transmission for Massive IoT System

arXiv - CS - Information Theory Pub Date : 2024-01-22 DOI: arxiv-2401.11761

Jyri Hämäläinen, Rui Dinis, Mehmet C. Ilter

Abstract: Recently, the paradigm of massive ultra-reliable low-latency IoT communications (URLLC-IoT) has gained growing interest. Reliable delay-critical uplink transmission in IoT is a challenging task since low-complexity devices typically do not support multiple antennas or demanding signal processing tasks. However, in many IoT services the data volumes are small and deployments may include a massive number of devices. We consider clustered uplink transmission with two cooperation approaches: First, we focus on a scenario where a location-based channel knowledge map (CKM) is applied to enable cooperation. Second, we consider a scenario where scarce channel side-information is applied in transmission. In both scenarios we also model and analyse the impact of erroneous information. In the performance evaluation we apply the recently introduced data-oriented approach that has gathered significant attention in the context of short-packet transmissions. Specifically, it introduces a transient performance metric for small data transmissions, where the amount of data and available bandwidth play crucial roles. Results show that cooperation between clustered IoT devices may provide notable benefits in terms of increased range. The performance depends heavily on the strength of the static channel component in the CKM-based cooperation. The channel side-information based cooperation is robust against changes in the radio environment but sensitive to possible errors in the channel side-information. Even with large IoT device clusters, side-information errors may set a limit for the use of services assuming high reliability and low latency. Analytic results are verified against simulations, showing only minor differences at low probability levels.

Citations: 0