Models of tolerance are commonly derived on empirical grounds, either because the mechanism of tolerance is unknown or because complex physiological processes are difficult to simplify appropriately. The present study was performed to evaluate the interchangeability of tolerance models used in the literature and to address some determinants for the selection of an appropriate design and data analysis strategy. Seven models, together with their corresponding parameter estimates, were chosen to represent a wide range of empirical models: the noncompetitive antagonist model, partial agonist model, reverse agonist model, direct moderator model, indirect moderator model, pool model and adaptive pool model. To evaluate the performance of the models on various data sets, data were simulated from each original model and then analysed with the other models. The effect-time course of every data set could be described well by at least 2 different empirical tolerance models, but no model could describe all the data sets adequately. However, every model could adequately describe at least 2 different data sets. This indicates that, without additional knowledge or assumptions, it is unlikely that reliable mechanistic information can be deduced from the mere fact that 1 (or more) of these models can describe the data. Generally, data expressing only limited tolerance can be described by a wide variety of models, whereas few models will be appropriate for data characterised by extensive tolerance. The models that adequately described a data set were then selected for a further study of their predictive capacity, based on the previously determined parameters. Predictions were made for 4 different administration schemes. The selected models gave similar predictions for the extended designs of 3 data sets for which the original study designs characterised tolerance well. For the other 4 data sets, the selected models gave disparate predictions, even though they described the original data set well. Thus, the predictive capability of a model was linked to the original study design, whereas the correlation between predictive performance and the type of model was weak or absent. Based on the results, factors of importance for the design and evaluation of studies of tolerance were identified and discussed.