Abstract
We describe rbstar, a software toolkit for measuring how similar a system observation is to a gold-standard reference output. The toolkit covers all four combinations that arise when each of the observation and the reference can be either a finite set, in which element ordering is unimportant, or a finite prefix of an arbitrarily long ranking, in which early elements are more important than later ones. Specifically, the package implements four “rank-biased” measurement approaches that have been presented in a sequence of papers over a 15-year span, bringing them together in a single location with a uniform interface and efficient reference implementations. Providing rank-biased precision, rank-biased overlap, rank-biased recall, and rank-biased alignment together, the latter two being recent additions to the family, allows a wide range of measurement scenarios to be handled in a consistent manner.