Polynomial preconditioning accelerates iterative methods for large-scale sparse linear systems by improving the spectral distribution and reducing the communication overhead of global reductions. The Neumann polynomial is notable for its simple construction and stable performance, which make it easy to combine with other preconditioners and widely used in high-performance computing. The choice of scaling parameter in the Neumann series is critical for polynomial acceleration and requires an accurate estimate of the eigenvalue bounds of the preconditioned system. In preconditioned systems, however, the clustering of the largest eigenvalues often slows the convergence of the iterative methods used to estimate the maximum eigenvalue, so the scaling parameter is systematically underestimated. We address this issue with a least-squares model under linear inequality constraints that learns effective combination weights of Ritz values from training samples. The Rayleigh-Ritz process, the current best eigenvalue-estimation approach, requires 20 to 30 iterations and systematically underestimates extremal eigenvalues because Ritz values lie in the interior of the spectrum. Our constrained optimization approach achieves comparable accuracy in 10 iterations by learning optimal combination weights from the Ritz value distribution, and it corrects the systematic underestimation while preserving positive definiteness, a critical stability requirement for robust preconditioning across diverse problem configurations. Our implementation of the Neumann polynomial with the proposed scaling scheme achieved acceleration ratios of 2.61 and 3.52 for ILU (incomplete LU factorization) and block-ILU preconditioned systems, respectively. In ILU-preconditioned systems it matches the acceleration of the recent state-of-the-art minimum-residual polynomial and frequently provides better convergence acceleration in practical scenarios.
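To make the pipeline concrete, the following is a minimal Python sketch, not the paper's implementation: it obtains Ritz values from a short Lanczos run, forms a maximum-eigenvalue estimate from a nonnegative combination of the largest Ritz values, sets the Neumann scaling parameter from that estimate, and applies the scaled Neumann polynomial. The 2-D Laplacian test matrix, the helper names `lanczos_ritz` and `neumann_apply`, and the weights `w` with the 1.1 inflation factor are illustrative assumptions; the paper instead learns the combination weights from training samples via least squares with linear inequality constraints.

```python
import numpy as np
import scipy.sparse as sp


def lanczos_ritz(A, k, rng):
    """Run k Lanczos steps and return the Ritz values, i.e. the eigenvalues of the
    k-by-k tridiagonal projection T_k (no reorthogonalization; illustration only)."""
    n = A.shape[0]
    q = rng.standard_normal(n)
    q /= np.linalg.norm(q)
    q_prev = np.zeros(n)
    alphas, betas = [], []
    beta = 0.0
    for _ in range(k):
        w = A @ q - beta * q_prev
        alpha = q @ w
        w -= alpha * q
        beta = np.linalg.norm(w)
        alphas.append(alpha)
        betas.append(beta)
        if beta < 1e-12:
            break
        q_prev, q = q, w / beta
    T = np.diag(alphas) + np.diag(betas[:-1], 1) + np.diag(betas[:-1], -1)
    return np.linalg.eigvalsh(T)


def neumann_apply(A, v, theta, m):
    """Apply p_m(A) v = theta * sum_{j=0}^{m} (I - theta*A)^j v, which approximates
    A^{-1} v whenever the spectral radius of (I - theta*A) is below 1."""
    s = v.copy()
    x = theta * s
    for _ in range(m):
        s = s - theta * (A @ s)   # s <- (I - theta*A) s
        x += theta * s
    return x


# Stand-in SPD test problem: 2-D Dirichlet Laplacian on an nx-by-nx grid.
nx = 50
T1 = sp.diags([-1, 2, -1], [-1, 0, 1], shape=(nx, nx))
A = (sp.kron(sp.identity(nx), T1) + sp.kron(T1, sp.identity(nx))).tocsr()
lam_true = 8.0 * np.sin(np.pi * nx / (2 * (nx + 1))) ** 2  # analytic lambda_max

rng = np.random.default_rng(0)
ritz = lanczos_ritz(A, k=10, rng=rng)  # 10-iteration budget, as in the abstract

# Placeholder correction: a nonnegative combination of the two largest Ritz values,
# inflated by a safety factor. The paper instead learns such weights from training
# samples via constrained least squares.
w = np.array([0.2, 0.8])                  # hypothetical weights, w >= 0
lam_est = 1.1 * (w @ np.sort(ritz)[-2:])  # hypothetical inflation factor

theta = 1.0 / lam_est                     # Neumann scaling parameter
v = rng.standard_normal(A.shape[0])
x = neumann_apply(A, v, theta, m=4)

print(f"largest Ritz value : {ritz.max():.4f}")
print(f"corrected estimate : {lam_est:.4f}   (true lambda_max = {lam_true:.4f})")
print(f"theta * lambda_max : {theta * lam_true:.3f}   (< 2 keeps p_m(A) positive definite)")
```

The last check reflects the standard sufficient condition 0 < theta * lambda < 2 for every eigenvalue, which guarantees convergence of the Neumann series and keeps the polynomial preconditioner symmetric positive definite; an underestimated maximum eigenvalue makes theta too large and is what puts this condition at risk in practice.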