Abstract. Data on patent families are used in economic and statistical studies for many purposes, including the analysis of applicants' patenting strategies, the monitoring of the globalization of inventions, and the comparison of the inventive performance and stock of technological knowledge of different countries. Most of these studies take family data as given, as a sort of black box, without going into the details of the underlying methodologies and patent linkages. However, different definitions of patent families may lead to different results. One purpose of this paper is to compare the most commonly used definitions of patent families and to identify the factors that cause differences in family outcomes. Another is to shed light on the internal structure of patent families and to see how it affects family outcomes under different definitions. An automated characterization of the internal structures of all extended families with earliest priorities in the 1990s, as recorded in PATSTAT, found that family counts are not affected by the choice of patent family definition in 75% of families. However, the choice of definition can matter for the 25% of families with complex structures, leading to different family compositions, which might affect, for instance, econometric studies that use family size as a proxy for patent value.