Species that are evolutionarily distinct have long been valued for their unique and irreplaceable contribution to biodiversity. About 30 years ago, this idea was extended to the concept of phylogenetic diversity (PD): a quantitative, continuous-scale index of conservation value for a set of species, calculated by summing the phylogenetic branch lengths that connect them. This way of capturing evolutionary history has opened new opportunities for analysis and has generated a huge academic literature, but to date it has had only limited impact on conservation practice or policy. In this review, I present a brief historical overview of PD research. I then examine the empirical evidence for the primary rationale for PD: that it is the best proxy for “feature diversity,” which includes both known and unknown phenotypic characters and contributes to utilitarian value, ecosystem function, future resilience, and evolutionary potential. Surprisingly, this rationale has only relatively recently been subject to systematic empirical scrutiny, and to date the results on the connection between PD and phenotypic diversity are mixed. Finally, I examine the least well-studied, but potentially greatest, challenge for PD: its dependence on the reliability of phylogenetic inference itself. The very few studies that have investigated this so far show that the ranking of species assemblages by their PD values can vary substantially under alternative but routine phylogenetic methods and assumptions. If PD is to become more widely adopted in conservation decision-making, it will be important to better understand the conditions under which it performs well and those under which it performs poorly.
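As a concrete illustration of the PD calculation described above, the following minimal Python sketch sums the branch lengths connecting a set of species on a rooted tree. The toy tree, its node names, its branch lengths, and the function name `phylogenetic_diversity` are all hypothetical. The sketch also assumes the common convention that the connecting branches include the path back to the root; other conventions exist.

```python
# Toy rooted tree: each node maps to (parent, length of the branch above it).
# Node names and branch lengths are hypothetical, for illustration only.
TREE = {
    "A": ("n1", 1.0),
    "B": ("n1", 1.0),
    "n1": ("root", 2.0),
    "C": ("root", 3.0),
}


def phylogenetic_diversity(tree, species):
    """Sum the branch lengths on the paths from each species to the root.

    Branches shared by several species are counted only once.
    Assumption: the root is included in the connecting subtree.
    """
    seen = set()   # nodes whose parent branch has already been counted
    total = 0.0
    for node in species:
        # Walk towards the root until reaching it or a branch already counted.
        while node in tree and node not in seen:
            seen.add(node)
            parent, length = tree[node]
            total += length
            node = parent
    return total


pd_sisters = phylogenetic_diversity(TREE, ["A", "B"])  # close relatives
pd_distant = phylogenetic_diversity(TREE, ["A", "C"])  # distant relatives
```

In this toy tree, the pair of distant relatives (A and C) has a higher PD than the pair of sister species (A and B), because the sisters share most of their evolutionary history. Note that the result depends entirely on the tree topology and branch lengths supplied, which is why uncertainty in phylogenetic inference can change how assemblages rank by PD.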