{"title":"On Searching for Minimal Integer Representation of Undirected Graphs","authors":"V. Parque, Tomoyuki Miyashita","doi":"10.1007/978-981-99-8132-8_7","DOIUrl":"https://doi.org/10.1007/978-981-99-8132-8_7","url":null,"abstract":"","PeriodicalId":281152,"journal":{"name":"International Conference on Neural Information Processing","volume":"261 4","pages":"82-94"},"PeriodicalIF":0.0,"publicationDate":"2023-12-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139005871","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"FIT: Frequency-based Image Translation for Domain Adaptive Object Detection","authors":"Siqi Zhang, Lu Zhang, Zhiyong Liu, Hangtao Feng","doi":"10.48550/arXiv.2303.03698","DOIUrl":"https://doi.org/10.48550/arXiv.2303.03698","url":null,"abstract":"Domain adaptive object detection (DAOD) aims to adapt the detector from a labelled source domain to an unlabelled target domain. In recent years, DAOD has attracted massive attention since it can alleviate performance degradation due to the large shift of data distributions in the wild. To align distributions between domains, adversarial learning is widely used in existing DAOD methods. However, the decision boundary for the adversarial domain discriminator may be inaccurate, causing the model biased towards the source domain. To alleviate this bias, we propose a novel Frequency-based Image Translation (FIT) framework for DAOD. First, by keeping domain-invariant frequency components and swapping domain-specific ones, we conduct image translation to reduce domain shift at the input level. Second, hierarchical adversarial feature learning is utilized to further mitigate the domain gap at the feature level. Finally, we design a joint loss to train the entire network in an end-to-end manner without extra training to obtain translated images. Extensive experiments on three challenging DAOD benchmarks demonstrate the effectiveness of our method.","PeriodicalId":281152,"journal":{"name":"International Conference on Neural Information Processing","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116915653","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Emotion Detection in Unfix-length-Context Conversation","authors":"Xiaochen Zhang, Daniel Tang","doi":"10.48550/arXiv.2302.06029","DOIUrl":"https://doi.org/10.48550/arXiv.2302.06029","url":null,"abstract":"We leverage different context windows when predicting the emotion of different utterances. New modules are included to realize variable-length context: 1) two speaker-aware units, which explicitly model inner- and inter-speaker dependencies to form distilled conversational context, and 2) a top-k normalization layer, which determines the most proper context windows from the conversational context to predict emotion. Experiments and ablation studies show that our approach outperforms several strong baselines on three public datasets.","PeriodicalId":281152,"journal":{"name":"International Conference on Neural Information Processing","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-02-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116988744","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Rethinking Voxelization and Classification for 3D Object Detection","authors":"Youshaa Murhij, A. Golodkov, D. Yudin","doi":"10.48550/arXiv.2301.04058","DOIUrl":"https://doi.org/10.48550/arXiv.2301.04058","url":null,"abstract":"The main challenge in 3D object detection from LiDAR point clouds is achieving real-time performance without affecting the reliability of the network. In other words, the detecting network must be confident enough about its predictions. In this paper, we present a solution to improve network inference speed and precision at the same time by implementing a fast dynamic voxelizer that works on fast pillar-based models in the same way a voxelizer works on slow voxel-based models. In addition, we propose a lightweight detection sub-head model for classifying predicted objects and filter out false detected objects that significantly improves model precision in a negligible time and computing cost. The developed code is publicly available at: https://github.com/YoushaaMurhij/RVCDet.","PeriodicalId":281152,"journal":{"name":"International Conference on Neural Information Processing","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125084288","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
D. Yudin, Yaroslav K. Solomentsev, R. Musaev, A. Staroverov, A. Panov
{"title":"HPointLoc: Point-based Indoor Place Recognition using Synthetic RGB-D Images","authors":"D. Yudin, Yaroslav K. Solomentsev, R. Musaev, A. Staroverov, A. Panov","doi":"10.48550/arXiv.2212.14649","DOIUrl":"https://doi.org/10.48550/arXiv.2212.14649","url":null,"abstract":"We present a novel dataset named as HPointLoc, specially designed for exploring capabilities of visual place recognition in indoor environment and loop detection in simultaneous localization and mapping. The loop detection sub-task is especially relevant when a robot with an on-board RGB-D camera can drive past the same place (``Point\") at different angles. The dataset is based on the popular Habitat simulator, in which it is possible to generate photorealistic indoor scenes using both own sensor data and open datasets, such as Matterport3D. To study the main stages of solving the place recognition problem on the HPointLoc dataset, we proposed a new modular approach named as PNTR. It first performs an image retrieval with the Patch-NetVLAD method, then extracts keypoints and matches them using R2D2, LoFTR or SuperPoint with SuperGlue, and finally performs a camera pose optimization step with TEASER++. Such a solution to the place recognition problem has not been previously studied in existing publications. The PNTR approach has shown the best quality metrics on the HPointLoc dataset and has a high potential for real use in localization systems for unmanned vehicles. The proposed dataset and framework are publicly available: https://github.com/metra4ok/HPointLoc.","PeriodicalId":281152,"journal":{"name":"International Conference on Neural Information Processing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124640567","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Rong Qin, Luwen Huangfu, Devon Hood, James Ma, Shengyue Huang
{"title":"Kernel Inversed Pyramidal Resizing Network for Efficient Pavement Distress Recognition","authors":"Rong Qin, Luwen Huangfu, Devon Hood, James Ma, Shengyue Huang","doi":"10.48550/arXiv.2212.01790","DOIUrl":"https://doi.org/10.48550/arXiv.2212.01790","url":null,"abstract":"Pavement Distress Recognition (PDR) is an important step in pavement inspection and can be powered by image-based automation to expedite the process and reduce labor costs. Pavement images are often in high-resolution with a low ratio of distressed to non-distressed areas. Advanced approaches leverage these properties via dividing images into patches and explore discriminative features in the scale space. However, these approaches usually suffer from information loss during image resizing and low efficiency due to complex learning frameworks. In this paper, we propose a novel and efficient method for PDR. A light network named the Kernel Inversed Pyramidal Resizing Network (KIPRN) is introduced for image resizing, and can be flexibly plugged into the image classification network as a pre-network to exploit resolution and scale information. In KIPRN, pyramidal convolution and kernel inversed convolution are specifically designed to mine discriminative information across different feature granularities and scales. The mined information is passed along to the resized images to yield an informative image pyramid to assist the image classification network for PDR. We applied our method to three well-known Convolutional Neural Networks (CNNs), and conducted an evaluation on a large-scale pavement image dataset named CQU-BPDD. Extensive results demonstrate that KIPRN can generally improve the pavement distress recognition of these CNN models and show that the simple combination of KIPRN and EfficientNet-B3 significantly outperforms the state-of-the-art patch-based method in both performance and efficiency.","PeriodicalId":281152,"journal":{"name":"International Conference on Neural Information Processing","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124517554","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yiran Huang, Yexu Zhou, Michael Hefenbrock, T. Riedel, Likun Fang, M. Beigl
{"title":"Universal Distributional Decision-based Black-box Adversarial Attack with Reinforcement Learning","authors":"Yiran Huang, Yexu Zhou, Michael Hefenbrock, T. Riedel, Likun Fang, M. Beigl","doi":"10.48550/arXiv.2211.08384","DOIUrl":"https://doi.org/10.48550/arXiv.2211.08384","url":null,"abstract":"The vulnerability of the high-performance machine learning models implies a security risk in applications with real-world consequences. Research on adversarial attacks is beneficial in guiding the development of machine learning models on the one hand and finding targeted defenses on the other. However, most of the adversarial attacks today leverage the gradient or logit information from the models to generate adversarial perturbation. Works in the more realistic domain: decision-based attacks, which generate adversarial perturbation solely based on observing the output label of the targeted model, are still relatively rare and mostly use gradient-estimation strategies. In this work, we propose a pixel-wise decision-based attack algorithm that finds a distribution of adversarial perturbation through a reinforcement learning algorithm. We call this method Decision-based Black-box Attack with Reinforcement learning (DBAR). Experiments show that the proposed approach outperforms state-of-the-art decision-based attacks with a higher attack success rate and greater transferability.","PeriodicalId":281152,"journal":{"name":"International Conference on Neural Information Processing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129416203","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
E. Lemaire, Loic Cordone, A. Castagnetti, Pierre-Emmanuel Novac, Jonathan Courtois, Benoît Miramond
{"title":"An Analytical Estimation of Spiking Neural Networks Energy Efficiency","authors":"E. Lemaire, Loic Cordone, A. Castagnetti, Pierre-Emmanuel Novac, Jonathan Courtois, Benoît Miramond","doi":"10.1007/978-3-031-30105-6_48","DOIUrl":"https://doi.org/10.1007/978-3-031-30105-6_48","url":null,"abstract":"","PeriodicalId":281152,"journal":{"name":"International Conference on Neural Information Processing","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115397943","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Methodology for the Prediction of Drug Target Interaction using CDK Descriptors","authors":"Tanya Liyaqat, T. Ahmad, Chandni Saxena","doi":"10.48550/arXiv.2210.11482","DOIUrl":"https://doi.org/10.48550/arXiv.2210.11482","url":null,"abstract":"Detecting probable Drug Target Interaction (DTI) is a critical task in drug discovery. Conventional DTI studies are expensive, labor-intensive, and take a lot of time, hence there are significant reasons to construct useful computational techniques that may successfully anticipate possible DTIs. Although certain methods have been developed for this cause, numerous interactions are yet to be discovered, and prediction accuracy is still low. To meet these challenges, we propose a DTI prediction model built on molecular structure of drugs and sequence of target proteins. In the proposed model, we use Simplified Molecular Input Line Entry System (SMILES) to create CDK descriptors, Molecular ACCess System (MACCS) fingerprints, Electrotopological state (Estate) fingerprints and amino acid sequences of targets to get Pseudo Amino Acid Composition (PseAAC). We target to evaluate performance of DTI prediction models using CDK descriptors. For comparison, we use benchmark data and evaluate models performance on two widely used fingerprints, MACCS fingerprints and Estate fingerprints. The evaluation of performances shows that CDK descriptors are superior at predicting DTIs. The proposed method also outperforms other previously published techniques significantly.","PeriodicalId":281152,"journal":{"name":"International Conference on Neural Information Processing","volume":"70 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114310406","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}