Localization of tea shoots is essential for intelligent plucking; however, accurately identifying the plucking point in an unstructured field environment remains challenging. This study proposes a three-dimensional (3D) tea shoot localization method based on binocular stereo vision for robotic plucking in such environments. First, tea shoot masks are extracted from each binocular image using the You Only Look Once segmentation network and paired by computing image similarity with a combination of Scale-Invariant Feature Transform features and color histograms. A Selective AD-Census-HSI stereo-matching algorithm is then developed to generate disparity maps for the instance-segmented tea shoots, incorporating improvements to the initial cost calculation and cross-construction modules. The point cloud is reconstructed via triangulation, and the plucking points are identified using V-shaped template matching. Disparity evaluation results indicate that the proposed stereo-matching algorithm is more accurate than the original AD-Census algorithm, especially in scenarios with significant luminance contrast between the left and right views. An indoor 3D localization experiment shows an average plucking-point localization error of 5.78 mm, and a robotic tea shoot plucking experiment conducted in the field achieves a success rate of 62%. These results demonstrate that the proposed tea shoot localization method satisfies the requirements of robotic tea plucking, providing a novel solution for intelligent tea harvesting.
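
To illustrate the mask-pairing step described above (image similarity from SIFT features combined with color histograms), the following Python/OpenCV sketch scores one candidate left/right mask pair. It is not the authors' implementation: the equal weighting, the ratio-test threshold, and the H-S histogram binning are illustrative assumptions.

```python
# Minimal sketch of left/right tea-shoot mask pairing: combine a SIFT
# feature-match score with an HSV color-histogram correlation.
# Weighting and thresholds are assumptions, not the paper's exact values.
import cv2

sift = cv2.SIFT_create()
bf = cv2.BFMatcher(cv2.NORM_L2)

def sift_similarity(img_a, img_b, mask_a=None, mask_b=None, ratio=0.75):
    """Fraction of SIFT keypoints in image A with a good (ratio-test) match in image B."""
    kp_a, des_a = sift.detectAndCompute(img_a, mask_a)
    kp_b, des_b = sift.detectAndCompute(img_b, mask_b)
    if des_a is None or des_b is None or len(des_b) < 2:
        return 0.0
    matches = bf.knnMatch(des_a, des_b, k=2)
    good = [m for m, n in matches if m.distance < ratio * n.distance]
    return len(good) / max(len(kp_a), 1)

def hist_similarity(img_a, img_b, mask_a=None, mask_b=None):
    """Correlation of hue-saturation histograms computed inside the instance masks."""
    def hs_hist(img, mask):
        hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
        hist = cv2.calcHist([hsv], [0, 1], mask, [30, 32], [0, 180, 0, 256])
        return cv2.normalize(hist, hist).flatten()
    return cv2.compareHist(hs_hist(img_a, mask_a), hs_hist(img_b, mask_b),
                           cv2.HISTCMP_CORREL)

def pair_score(left_img, right_img, left_mask, right_mask, w=0.5):
    """Combined similarity used to decide whether two masks depict the same shoot."""
    return (w * sift_similarity(left_img, right_img, left_mask, right_mask)
            + (1 - w) * hist_similarity(left_img, right_img, left_mask, right_mask))
```

In use, each segmented shoot in the left view would be paired with the right-view mask giving the highest combined score, with unmatched masks discarded before stereo matching.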