{"title":"Direct minimization on the complex Stiefel manifold in Kohn-Sham density functional theory for finite and extended systems","authors":"Kai Luo , Tingguang Wang , Xinguo Ren","doi":"10.1016/j.cpc.2025.109596","DOIUrl":"10.1016/j.cpc.2025.109596","url":null,"abstract":"<div><div>Direct minimization method on the complex Stiefel manifold in Kohn-Sham density functional theory is formulated to treat both finite and extended systems in a unified manner. This formulation is well-suited for scenarios where straightforward iterative diagonalization becomes challenging, especially when the Aufbau principle is not applicable. We present the theoretical foundation and numerical implementation of the Riemannian conjugate gradient (RCG) within a localized non-orthogonal basis set. Riemannian Broyden-Fletcher-Goldfarb-Shanno (RBFGS) method is tentatively implemented. Extensive testing compares the performance of the proposed methods and highlights that the quasi-Newton method is more efficient. However, for extended systems, the computational time required grows rapidly with respect to the number of <strong>k</strong>-points.</div></div>","PeriodicalId":285,"journal":{"name":"Computer Physics Communications","volume":"312 ","pages":"Article 109596"},"PeriodicalIF":7.2,"publicationDate":"2025-03-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143738237","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DeepFlame 2.0: A new version for fully GPU-native machine learning accelerated reacting flow simulations under low-Mach conditions","authors":"Runze Mao , Xinyu Dong , Xuan Bai , Ziheng Wu , Guanlin Dang , Han Li , Zhi X. Chen","doi":"10.1016/j.cpc.2025.109595","DOIUrl":"10.1016/j.cpc.2025.109595","url":null,"abstract":"<div><div>This paper presents <em>DeepFlame</em> v2.0, a significant computational framework upgrade designed for high-performance combustion simulations on GPU-based heterogeneous architectures. The updated version implements a comprehensive CUDA-accelerated architecture incorporating fundamental combustion modelling components, including: implicit/explicit finite volume method (FVM) discretisation schemes, chemical kinetics integrators, thermophysical property models, and subgrid-scale closures for both fluid dynamics and combustion processes. The redesigned code supports diverse boundary conditions and discretisation schemes for broad applicability across combustion configurations. Key performance optimisations integrate advanced CUDA features including data coalescing techniques, CUDA Graphs for kernel scheduling, and NCCL-based multi-GPU communication. Validation studies employing the fully-implicit low-Mach solver demonstrate two-order-of-magnitude acceleration compared to conventional CPU implementations across canonical test cases, while maintaining numerical accuracy.</div></div>","PeriodicalId":285,"journal":{"name":"Computer Physics Communications","volume":"312 ","pages":"Article 109595"},"PeriodicalIF":7.2,"publicationDate":"2025-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143705879","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"QUANTUM ESPRESSO implementation of the RPA-based functional","authors":"Angel Rosado, Mario Benites, Efstratios Manousakis","doi":"10.1016/j.cpc.2025.109594","DOIUrl":"10.1016/j.cpc.2025.109594","url":null,"abstract":"<div><div>We detail our implementation of the random-phase-approximation based functional (RPAF) derived in Ref. <span><span>[1]</span></span> for the QUANTUM ESPRESSO (QE) package. We also make available in the <em>Computer Physics Communications</em> library the source files which are required in order to apply this functional within QE. We also provide the corresponding RPAF projector augmented wave (PAW) and ultrasoft pseudopotentials for most elements. Lastly, we benchmark the performance of the RPAF by calculating the equilibrium lattice constant and bulk modulus of a set of the same 60 crystals used by other authors to benchmark other functionals for both PAW and ultrasoft pseudopotentials. We find that the RPAF performs better overall as compared to the other most popular functionals.</div></div><div><h3>Program summary</h3><div><em>Program Title:</em> Implementation of RPAF functional in QUANTUM ESPRESSO</div><div><em>CPC Library link to program files:</em> <span><span>https://doi.org/10.17632/y96kpb2dpd.1</span><svg><path></path></svg></span></div><div><em>Developer's repository link:</em> <span><span>https://data.mendeley.com/datasets/bg45fjkz2t</span><svg><path></path></svg></span></div><div><em>Licensing provisions:</em> GPLv3</div><div><em>Programming language:</em> Fortran 90</div><div><em>Nature of problem:</em> To make the RPAF available to be used in DFT calculations.</div><div><em>Solution method:</em> Implementation of RPAF in QUANTUM ESPRESSO.</div></div>","PeriodicalId":285,"journal":{"name":"Computer Physics Communications","volume":"312 ","pages":"Article 109594"},"PeriodicalIF":7.2,"publicationDate":"2025-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143738238","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fractal-constrained deep learning for super-resolution of turbulence with zero or few label data","authors":"Jiaxin Wu , Min Luo , Boo Cheong Khoo , Dunhui Xiao , Pengzhi Lin","doi":"10.1016/j.cpc.2025.109548","DOIUrl":"10.1016/j.cpc.2025.109548","url":null,"abstract":"<div><div>The super-resolution of turbulence is of paramount importance and still remains challenging due to the inefficiency of the current technologies in retaining the intrinsic physics like the multi-scale flow structures and energy cascades. To address this challenge, this work proposes a fractal-constrained deep learning super-resolution model termed SKFSR-FIC. The model is characterized by two distinctive designs: (1) a SKip-connected Feature-reuse Super-Resolution (SKFSR) network that learns and retains multi-scale flow structures and multi-frequency dynamics, achieving efficient upscaling of flow fields while reconstructing self-similar physics; (2) a fractal invariance constraint (FIC) that utilizes the self-similarities of flow properties invariant to scales to substitute label data information in super resolution, thus achieving accurate reconstruction of multi-scale dynamics and energy cascades. The SKFSR-FIC model, for the first time, leverages fractal dimensions to guide the turbulent flow reconstruction and significantly reduces the reliance on label data and, especially, achieves zero-shot (i.e., unsupervised) super-resolution that cannot be handled by existing deep learning models. The results from five self-affine fractal images and two turbulent flow cases demonstrate the enhanced efficiency (up to 2 times) and accuracy (up to 100 times) of the unsupervised SKFSR-FIC model compared to the conventional interpolation method and deep learning models. Moreover, the SKFSR network is compatible with both FIC and label data, thereby adaptively enabling unsupervised, supervised and semi-supervised learning strategies. In particular, the semi-supervised SKFSR-FIC model, even by using one snapshot, achieves the best accuracy among the three learning strategies due to the combination of physics and data.</div></div>","PeriodicalId":285,"journal":{"name":"Computer Physics Communications","volume":"312 ","pages":"Article 109548"},"PeriodicalIF":7.2,"publicationDate":"2025-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143679487","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hybrid multi-head physics-informed neural network for depth estimation in terahertz imaging","authors":"Mingjun Xiang , Hui Yuan , Kai Zhou , Hartmut G. Roskos","doi":"10.1016/j.cpc.2025.109586","DOIUrl":"10.1016/j.cpc.2025.109586","url":null,"abstract":"<div><div>Terahertz (THz) imaging is a topic in the field of optics, that is intensively investigated not least due to its potential for recording three-dimensional (3D) images, useful e.g., for the detection of hidden objects, nondestructive testing, and radar-like imaging in conjunction with automotive systems. Depth information retrieval is a key factor to recover the three-dimensional shape of objects. Impressive results for depth determination in the visible and infrared spectral range have been demonstrated through deep learning (DL). Among them, most DL methods are merely data-driven, lacking relevant physical priors, which thus requires a large amount of experimental data to train the DL models. However, acquiring large training data in the THz domain is challenging due to the time-consuming data acquisition process and environmental and system stability requirements during this lengthy process. To overcome this limitation, this paper incorporates a complete physical model representing the THz image formation process into a DL neural network(NN). Having addressed phase retrieval and image reconstruction of planar objects in an earlier paper, we focus here on the task to retrieve the distance information of objects. A significant goal of our work is to be able to use the DL NNs without pre-training, eliminating the need for tens of thousands of labeled data. Through experimental validation, we demonstrate that by providing diffraction patterns of planar objects, with their upper and lower halves sequentially masked to overcome the trapping of the NN's computational iterations in local minima, the proposed physics-informed NN can automatically reconstruct the depth of the object through interaction between the NN and the physical model. Compared to traditional DL methods and back-propagation methods, our approach not only reduces data dependency and operational costs but also improves imaging speed and stability. The obtained results also represent the initial steps towards achieving fast holographic THz imaging using reference-free beams and low-cost power detection.</div></div>","PeriodicalId":285,"journal":{"name":"Computer Physics Communications","volume":"312 ","pages":"Article 109586"},"PeriodicalIF":7.2,"publicationDate":"2025-03-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143697067","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Learning spatiotemporal dynamics from sparse data via a high-order physics-encoded network","authors":"Pu Ren , Jialin Song , Chengping Rao , Qi Wang , Yike Guo , Hao Sun , Yang Liu","doi":"10.1016/j.cpc.2025.109582","DOIUrl":"10.1016/j.cpc.2025.109582","url":null,"abstract":"<div><div>Learning unknown or partially known dynamics has gained significant attention in scientific machine learning (SciML). This research is mainly driven by the inherent sparsity and noise in scientific data, which poses challenges to accurately modeling spatiotemporal systems. While recent physics-informed learning strategies have attempted to address this problem by incorporating physics knowledge as soft constraints, they often encounter optimization and scalability issues. To this end, we present a novel physics-encoded learning framework for capturing the intricate dynamical patterns of spatiotemporal systems from limited sensor measurements. Our approach centers on a deep convolutional-recurrent network, termed Π<span>-block</span>, which hard-encodes known physical laws (e.g., PDE structure and boundary conditions) into the learning architecture. Moreover, the high-order time marching scheme (e.g., Runge-Kutta fourth-order) is introduced to model the temporal evolution. We conduct comprehensive numerical experiments on a variety of complex systems to evaluate our proposed approach against baseline algorithms across two tasks: reconstructing high-fidelity data and identifying unknown system coefficients. We also assess the performance of our method under various noisy levels and using different finite difference kernels. The comparative results demonstrate the superiority, robustness, and stability of our framework in addressing these critical challenges in SciML.</div></div>","PeriodicalId":285,"journal":{"name":"Computer Physics Communications","volume":"312 ","pages":"Article 109582"},"PeriodicalIF":7.2,"publicationDate":"2025-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143679491","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhancing the Nektar++ spectral/hp element framework for parallel-in-time simulations","authors":"Jacques Y. Xing , Chris D. Cantwell , David Moxey","doi":"10.1016/j.cpc.2025.109584","DOIUrl":"10.1016/j.cpc.2025.109584","url":null,"abstract":"<div><div>We describe the efficient implementation of the Parareal algorithm in the <em>Nektar++</em> software, an open-source spectral/hp element framework for the solution of partial differential equations, which has been designed to achieve high-scalability on high-performance computing (HPC) clusters using distributed parallelism. Recently, time-parallel integration techniques are being recognized as a potential solution to further increase concurrency and computational speed-up beyond the limits of strong scaling obtained from a pure spatial domain decomposition. Amongst the various time-parallel approaches proposed in the literature, the Parareal algorithm is a non-intrusive and iterative approach, exploiting a fine and a coarse solvers to achieve time-parallelism, and can be applied to both linear and non-linear problems. We discuss the details of the implementation and discuss the specific techniques used to adapt the code to a time-parallel framework. We demonstrate the application of these methods to multiple linear and non-linear problems provided by the existing <em>Nektar++</em> solvers.</div></div>","PeriodicalId":285,"journal":{"name":"Computer Physics Communications","volume":"312 ","pages":"Article 109584"},"PeriodicalIF":7.2,"publicationDate":"2025-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143679477","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Grid-free evaluation of phonon-limited electronic relaxation times and transport properties","authors":"Nenad Vukmirović","doi":"10.1016/j.cpc.2025.109583","DOIUrl":"10.1016/j.cpc.2025.109583","url":null,"abstract":"<div><div>Present calculations of electrical transport properties of materials require evaluations of electron-phonon coupling constants on dense predefined grids of electron and phonon momenta and performing the sums over these momenta. In this work, we present the methodology for calculation of carrier relaxation times and electrical transport properties without the use of a predefined grid. The relaxation times are evaluated by integrating out the delta function that ensures energy conservation and performing an average over the angular components of phonon momentum. The charge carrier mobility is then evaluated as a sum over appropriately sampled electronic momenta. We illustrate our methodology by applying to the Fröhlich model and to a real semiconducting material ZnTe. We find that rather accurate results can be obtained with a modest number of electron and phonon momenta, on the order of one hundred each, regardless of the carrier effective mass.</div></div>","PeriodicalId":285,"journal":{"name":"Computer Physics Communications","volume":"312 ","pages":"Article 109583"},"PeriodicalIF":7.2,"publicationDate":"2025-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143697066","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"GPU acceleration of overbridging boundary matching method without Green's functions based on real-space finite-difference method","authors":"Takanori Akamatsu , Mitsuharu Uemoto , Yoshiyuki Egami , Tomoya Ono","doi":"10.1016/j.cpc.2025.109585","DOIUrl":"10.1016/j.cpc.2025.109585","url":null,"abstract":"<div><div>We present the graphics processing unit (GPU) acceleration of the overbridging boundary matching method for electron-transport property calculations, which is based on the density functional theory using the real-space finite-difference method. The execution of the implemented code using OpenACC and CUDA libraries on GPU is computationally more efficient and faster than a central processing unit. Furthermore, we achieve the ideal scalability in parallel execution from one to thirty-two nodes by adopting multiprocess parallelization schemes for two types of supercomputer with different configurations. To demonstrate the applicability of the accelerated code, the complex band structures of graphene and armchair carbon nanotubes with chiral indices (6, 6), (9, 9), and (12, 12) are calculated and compared with those obtained by the tight-binding method. We discuss the effects of the dispersion of evanescent waves on roll-up and axial strain in the carbon nanotubes.</div></div>","PeriodicalId":285,"journal":{"name":"Computer Physics Communications","volume":"312 ","pages":"Article 109585"},"PeriodicalIF":7.2,"publicationDate":"2025-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143679584","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SDGMPS: A spin-dependent Glauber model program for elastic proton-nucleus scattering","authors":"J.T. Zhang, X.L. Tu, Y. Huang, L.Y. Li, G.Q. Zhang, Z.H. Li","doi":"10.1016/j.cpc.2025.109587","DOIUrl":"10.1016/j.cpc.2025.109587","url":null,"abstract":"<div><div>SDGMPS is a Fortran program that calculates differential cross sections of elastic proton-nucleus scattering at intermediate energies based on the spin-dependent Glauber model. In the program, the Glauber model explicitly takes into account spin effects by using the spin-dependent nucleon-nucleon scattering amplitude, where the spin-orbit amplitude parameters are needed as input. It is particularly useful for analyses of the elastic proton scattering at both low and high momentum transfers and studies of the inner density distributions in nuclei. Such studies are an important part of the physics research program of the radiation beam facilities, such as the Heavy Ion Research Facility in Lanzhou (HIRFL).</div></div>","PeriodicalId":285,"journal":{"name":"Computer Physics Communications","volume":"312 ","pages":"Article 109587"},"PeriodicalIF":7.2,"publicationDate":"2025-03-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143679481","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}