arXiv (Cornell University): latest papers

Can Authorship Attribution Models Distinguish Speakers in Speech Transcripts?
arXiv (Cornell University) Pub Date : 2023-11-13 DOI: 10.48550/arxiv.2311.07564
Aggazzotti, Cristina, Andrews, Nicholas, Smith, Elizabeth Allyn
{"title":"Can Authorship Attribution Models Distinguish Speakers in Speech Transcripts?","authors":"Aggazzotti, Cristina, Andrews, Nicholas, Smith, Elizabeth Allyn","doi":"10.48550/arxiv.2311.07564","DOIUrl":"https://doi.org/10.48550/arxiv.2311.07564","url":null,"abstract":"Authorship verification is the problem of determining if two distinct writing samples share the same author and is typically concerned with the attribution of written text. In this paper, we explore the attribution of transcribed speech, which poses novel challenges. The main challenge is that many stylistic features, such as punctuation and capitalization, are not available or reliable. Therefore, we expect a priori that transcribed speech is a more challenging domain for attribution. On the other hand, other stylistic features, such as speech disfluencies, may enable more successful attribution but, being specific to speech, require special-purpose models. To better understand the challenges of this setting, we contribute the first systematic study of speaker attribution based solely on transcribed speech. Specifically, we propose a new benchmark for speaker attribution focused on conversational speech transcripts. To control for spurious associations of speakers with topic, we employ both conversation prompts and speakers participating in the same conversation to construct challenging verification trials of varying difficulties. 
We establish the state of the art on this new benchmark by comparing a suite of neural and non-neural baselines, finding that although written text attribution models achieve surprisingly good performance in certain settings, they struggle in the hardest settings we consider.","PeriodicalId":496270,"journal":{"name":"arXiv (Cornell University)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136353016","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Testing importance sampling on a quantum annealer for strong coupling SU(3) gauge theory
arXiv (Cornell University) Pub Date : 2023-11-13 DOI: 10.48550/arxiv.2311.07209
Kim, Jangho, Luu, Thomas, Unger, Wolfgang
{"title":"Testing importance sampling on a quantum annealer for strong coupling SU(3) gauge theory","authors":"Kim, Jangho, Luu, Thomas, Unger, Wolfgang","doi":"10.48550/arxiv.2311.07209","DOIUrl":"https://doi.org/10.48550/arxiv.2311.07209","url":null,"abstract":"$SU(N_c)$ gauge theories in the strong coupling limit can be described by integer variables representing monomers, dimers and baryon loops. We demonstrate how the D-Wave quantum annealer can perform importance sampling on $U(N_c)$ gauge theory in the strong coupling formulation of this theory. In addition to causing a sign problem in importance sampling, baryon loops induce a complex QUBO matrix which cannot be optimized by the D-Wave annealer. Instead we show that simulating the sign-problem free quenched action on the D-Wave is sufficient when combined with a sign reweighting method. As the first test on $SU(3)$ gauge theory, we simulate on a $2 \\times 2$ lattice and compare the results with its analytic solutions.","PeriodicalId":496270,"journal":{"name":"arXiv (Cornell University)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136353122","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
MonoDiffusion: Self-Supervised Monocular Depth Estimation Using Diffusion Model
arXiv (Cornell University) Pub Date : 2023-11-13 DOI: 10.48550/arxiv.2311.07198
Shao, Shuwei, Pei, Zhongcai, Chen, Weihai, Sun, Dingchi, Chen, Peter C. Y., Li, Zhengguo
{"title":"MonoDiffusion: Self-Supervised Monocular Depth Estimation Using Diffusion Model","authors":"Shao, Shuwei, Pei, Zhongcai, Chen, Weihai, Sun, Dingchi, Chen, Peter C. Y., Li, Zhengguo","doi":"10.48550/arxiv.2311.07198","DOIUrl":"https://doi.org/10.48550/arxiv.2311.07198","url":null,"abstract":"Over the past few years, self-supervised monocular depth estimation that does not depend on ground-truth during the training phase has received widespread attention. Most efforts focus on designing different types of network architectures and loss functions or handling edge cases, e.g., occlusion and dynamic objects. In this work, we introduce a novel self-supervised depth estimation framework, dubbed MonoDiffusion, by formulating it as an iterative denoising process. Because the depth ground-truth is unavailable in the training phase, we develop a pseudo ground-truth diffusion process to assist the diffusion in MonoDiffusion. The pseudo ground-truth diffusion gradually adds noise to the depth map generated by a pre-trained teacher model. Moreover, the teacher model allows applying a distillation loss to guide the denoised depth. Further, we develop a masked visual condition mechanism to enhance the denoising ability of the model. Extensive experiments are conducted on the KITTI and Make3D datasets and the proposed MonoDiffusion outperforms prior state-of-the-art competitors. 
The source code will be available at https://github.com/ShuweiShao/MonoDiffusion.","PeriodicalId":496270,"journal":{"name":"arXiv (Cornell University)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136353128","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Time-Frequency Localization Characteristics of the Delay-Doppler Plane Orthogonal Pulse
arXiv (Cornell University) Pub Date : 2023-11-13 DOI: 10.48550/arxiv.2311.07238
Shafie, Akram, Yuan, Jinhong, Yang, Nan, Lin, Hai
{"title":"Time-Frequency Localization Characteristics of the Delay-Doppler Plane Orthogonal Pulse","authors":"Shafie, Akram, Yuan, Jinhong, Yang, Nan, Lin, Hai","doi":"10.48550/arxiv.2311.07238","DOIUrl":"https://doi.org/10.48550/arxiv.2311.07238","url":null,"abstract":"The orthogonal delay-Doppler (DD) division multiplexing (ODDM) modulation has recently been proposed as a promising solution for ensuring reliable communications in high mobility scenarios. In this work, we investigate the time-frequency (TF) localization characteristics of the DD plane orthogonal pulse (DDOP), which is the prototype pulse of ODDM modulation. The TF localization characteristics examine how concentrated or spread out the energy of a pulse is in the joint TF domain. We first derive the TF localization metric, TF area (TFA), for the DDOP. Based on this result, we provide insights into the energy spread of the DDOP in the joint TF domain. Then, we delve into the potential advantages of the DDOP due to its energy spread, particularly in terms of leveraging both time and frequency diversities, and enabling high-resolution sensing. Furthermore, we determine the TFA for the recently proposed generalized design of the DDOP. 
Finally, we validate our analysis based on numerical results and show that the energy spread for the generalized design of the DDOP in the joint TF domain exhibits a step-wise increase as the duration of sub-pulses increases.","PeriodicalId":496270,"journal":{"name":"arXiv (Cornell University)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136353129","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
High Rectification Ratio at Room Temperature in Rhenium(I) Compound
arXiv (Cornell University) Pub Date : 2023-11-13 DOI: 10.48550/arxiv.2311.07258
Rajbangshi, Subas, Pal, Nila, Rahman, Robinur, Nesterov, Vladimir N., Roy, Lisa, Ghosh, Shishir, Mondal, Prakash Chandra
{"title":"High Rectification Ratio at Room Temperature in Rhenium(I) Compound","authors":"Rajbangshi, Subas, Pal, Nila, Rahman, Robinur, Nesterov, Vladimir N., Roy, Lisa, Ghosh, Shishir, Mondal, Prakash Chandra","doi":"10.48550/arxiv.2311.07258","DOIUrl":"https://doi.org/10.48550/arxiv.2311.07258","url":null,"abstract":"Electrical current rectification is an interesting electronic feature, popularly known as a diode. Achieving a high rectification ratio in a molecular junction has been a long-standing goal in molecular electronics. The present work describes mimicking electrical current rectification with a pi-stacked rhenium(I) compound sandwiched between two electrical contacts. Among the two mononuclear rhenium compounds studied here, [Re(CO)4(PPh3){(N)-saccharinate}] (1) and [Re(CO)3(phen){(N)-saccharinate}] (2), the latter shows a high rectification ratio of ~4000 at 2.0 V at room temperature, induced by strong pi-pi interactions. Alternating current (AC)-based electrical measurements confirm AC-to-DC electrical signal conversion at a frequency f of 1 kHz, showing that 2 can act as an excellent half-wave rectifier. The asymmetric charge injection barrier height at the electrode/Re(I) interfaces of the devices with a stacking configuration of p++-Si/Re compound31nm(2)/ITO gives rise to the unidirectional flow of electrical current. The charge transport mechanism is governed by thermally activated hopping phenomena, and charge carrier propagation is explained through an energy profile considering the Fermi levels of the two electrodes and the energies of the frontier molecular orbitals, HOMO and LUMO, confirming that rectification is of a molecular origin. 
The present work paves the way to combine different organometallic compounds as circuit elements in nanoelectronic devices to achieve numerous exciting electronic features.","PeriodicalId":496270,"journal":{"name":"arXiv (Cornell University)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136353131","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
FIRST: A Million-Entry Dataset for Text-Driven Fashion Synthesis and Design
arXiv (Cornell University) Pub Date : 2023-11-13 DOI: 10.48550/arxiv.2311.07414
Huang, Zhen, Li, Yihao, Pei, Dong, Zhou, Jiapeng, Ning, Xuliang, Han, Jianlin, Han, Xiaoguang, Chen, Xuejun
{"title":"FIRST: A Million-Entry Dataset for Text-Driven Fashion Synthesis and Design","authors":"Huang, Zhen, Li, Yihao, Pei, Dong, Zhou, Jiapeng, Ning, Xuliang, Han, Jianlin, Han, Xiaoguang, Chen, Xuejun","doi":"10.48550/arxiv.2311.07414","DOIUrl":"https://doi.org/10.48550/arxiv.2311.07414","url":null,"abstract":"Text-driven fashion synthesis and design is an extremely valuable part of artificial intelligence generative content (AIGC), which has the potential to propel a tremendous revolution in the traditional fashion industry. To advance the research on text-driven fashion synthesis and design, we introduce a new dataset comprising a million high-resolution fashion images with rich structured textual (FIRST) descriptions. In FIRST, there is a wide range of attire categories and each image-paired textual description is organized at multiple hierarchical levels. Experiments on prevalent generative models trained over FIRST show the necessity of FIRST. We invite the community to further develop more intelligent fashion synthesis and design systems that make fashion design more creative and imaginative based on our dataset. The dataset will be released soon.","PeriodicalId":496270,"journal":{"name":"arXiv (Cornell University)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136353141","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Machine Learning For Beamline Steering
arXiv (Cornell University) Pub Date : 2023-11-13 DOI: 10.48550/arxiv.2311.07519
Kante, Isaac
{"title":"Machine Learning For Beamline Steering","authors":"Kante, Isaac","doi":"10.48550/arxiv.2311.07519","DOIUrl":"https://doi.org/10.48550/arxiv.2311.07519","url":null,"abstract":"Beam steering is the process involving the calibration of the angle and position at which a particle accelerator's electron beam is incident upon the x-ray target with respect to the rotation axis of the collimator. Beam Steering is an essential task for light sources. In the case under study, the LINAC To Undulator (LTU) section of the beamline is difficult to aim. Each use of the accelerator requires re-calibration of the magnets in this section. This involves a substantial amount of time and effort from human operators, while reducing scientific throughput of the light source. We investigate the use of deep neural networks to assist in this task. The deep learning models are trained on archival data and then validated on simulation data. The performance of the deep learning model is contrasted against that of trained human operators.","PeriodicalId":496270,"journal":{"name":"arXiv (Cornell University)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136353152","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Lattice relaxation, electronic structure and continuum model for twisted bilayer MoTe$_2$
arXiv (Cornell University) Pub Date : 2023-11-13 DOI: 10.48550/arxiv.2311.07533
Mao, Ning, Xu, Cheng, Li, Jiangxu, Bao, Ting, Liu, Peitao, Xu, Yong, Felser, Claudia, Fu, Liang, Zhang, Yang
{"title":"Lattice relaxation, electronic structure and continuum model for twisted bilayer MoTe$_2$","authors":"Mao, Ning, Xu, Cheng, Li, Jiangxu, Bao, Ting, Liu, Peitao, Xu, Yong, Felser, Claudia, Fu, Liang, Zhang, Yang","doi":"10.48550/arxiv.2311.07533","DOIUrl":"https://doi.org/10.48550/arxiv.2311.07533","url":null,"abstract":"We investigate the lattice relaxation effect on moiré band structures in twisted bilayer MoTe$_2$ with two approaches: (a) large-scale plane-wave basis first principle calculation down to $2.88^{\\circ}$, (b) transfer learning structure relaxation + local-basis first principles calculation down to $1.1^{\\circ}$. Two types of van der Waals corrections have been examined: the D2 method of Grimme and the density-dependent energy correction. We note the density-dependent energy correction yields a continuous evolution of bandwidth with twist angles. Including second harmonic of intralayer potential/interlayer tunneling and the strain induced gauge field, we develop a more complete continuum model with a single set of parameters for a wide range of twist angles, providing a useful starting point for many body simulation.","PeriodicalId":496270,"journal":{"name":"arXiv (Cornell University)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136353159","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
CASTER: A Computer-Vision-Assisted Wireless Channel Simulator for Gesture Recognition
arXiv (Cornell University) Pub Date : 2023-11-13 DOI: 10.48550/arxiv.2311.07169
Ren, Zhenyu, Li, Guoliang, Ji, Chenqing, Yu, Chao, Wang, Shuai, Wang, Rui
{"title":"CASTER: A Computer-Vision-Assisted Wireless Channel Simulator for Gesture Recognition","authors":"Ren, Zhenyu, Li, Guoliang, Ji, Chenqing, Yu, Chao, Wang, Shuai, Wang, Rui","doi":"10.48550/arxiv.2311.07169","DOIUrl":"https://doi.org/10.48550/arxiv.2311.07169","url":null,"abstract":"In this paper, a computer-vision-assisted simulation method is proposed to address the issue of training dataset acquisition for wireless hand gesture recognition. In the existing literature, in order to classify gestures via the wireless channel estimation, massive training samples should be measured in a consistent environment, consuming significant efforts. In the proposed CASTER simulator, however, the training dataset can be simulated via existing videos. Particularly, a gesture is represented by a sequence of snapshots, and the channel impulse response of each snapshot is calculated via tracing the rays scattered off a primitive-based hand model. Moreover, CASTER simulator relies on the existing videos to extract the motion data of gestures. Thus, the massive measurements of wireless channel can be eliminated. The experiments demonstrate a 90.8% average classification accuracy of simulation-to-reality inference.","PeriodicalId":496270,"journal":{"name":"arXiv (Cornell University)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136353278","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
SponTTS: modeling and transferring spontaneous style for TTS
arXiv (Cornell University) Pub Date : 2023-11-13 DOI: 10.48550/arxiv.2311.07179
Li, Hanzhao, Zhu, Xinfa, Xue, Liumeng, Song, Yang, Chen, Yunlin, Xie, Lei
{"title":"SponTTS: modeling and transferring spontaneous style for TTS","authors":"Li, Hanzhao, Zhu, Xinfa, Xue, Liumeng, Song, Yang, Chen, Yunlin, Xie, Lei","doi":"10.48550/arxiv.2311.07179","DOIUrl":"https://doi.org/10.48550/arxiv.2311.07179","url":null,"abstract":"Spontaneous speaking style exhibits notable differences from other speaking styles due to various spontaneous phenomena (e.g., filled pauses, prolongation) and substantial prosody variation (e.g., diverse pitch and duration variation, occasional non-verbal speech like smile), posing challenges to modeling and prediction of spontaneous style. Moreover, the limitation of high-quality spontaneous data constrains spontaneous speech generation for speakers without spontaneous data. To address these problems, we propose SponTTS, a two-stage approach based on bottleneck (BN) features to model and transfer spontaneous style for TTS. In the first stage, we adopt a Conditional Variational Autoencoder (CVAE) to capture spontaneous prosody from a BN feature and involve the spontaneous phenomena by the constraint of spontaneous phenomena embedding prediction loss. Besides, we introduce a flow-based predictor to predict a latent spontaneous style representation from the text, which enriches the prosody and context-specific spontaneous phenomena during inference. In the second stage, we adopt a VITS-like module to transfer the spontaneous style learned in the first stage to target speakers. Experiments demonstrate that SponTTS is effective in modeling spontaneous style and transferring the style to the target speakers, generating spontaneous speech with high naturalness, expressiveness, and speaker similarity. 
The zero-shot spontaneous style TTS test further verifies the generalization and robustness of SponTTS in generating spontaneous speech for unseen speakers.","PeriodicalId":496270,"journal":{"name":"arXiv (Cornell University)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136353280","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0