We develop and evaluate a novel hybrid tractography algorithm for improved segmentation of complex fiber bundles from diffusion magnetic resonance imaging datasets. We propose an approach inspired by reinforcement learning that combines the strengths of probabilistic and deterministic tractography to better resolve pathways dominated by crossing fibers. Given a fiber bundle query, our approach first explores an array of possible pathways probabilistically, and then exploits this information with streamline tractography using globally optimal fiber compartment assignment in a conditional random field. We quantitatively compared our approach with deterministic and probabilistic approaches using a realistic phantom with Tractometer and 88 test-retest scans from the Human Connectome Project. We found that the proposed hybrid method offers improved accuracy on phantom data and, on in vivo data, more biologically plausible topographic organization and higher reliability. These results demonstrate the benefits of combining tractography approaches and indicate opportunities for integrating reinforcement learning strategies into tractography algorithms.