Species distribution models (SDMs) are widely used in ecology and conservation. Presence-only SDMs such as MaxEnt frequently use natural history collections (NHCs) as occurrence data, given their large numbers and accessibility. NHCs are often spatially biased, which may generate inaccuracies in SDMs. Here, we test how the distribution of NHC records and MaxEnt predictions relate to a spatial abundance model built with inverse distance weighting (IDW) from a large plot dataset for Amazonian tree species. We also propose a new pipeline to deal with inconsistencies in NHCs and to limit the area of occupancy of the species. We found a significant but weak positive relationship between the distribution of NHC records and IDW for 66% of the species. The relationship between SDMs and IDW was also significant but weakly positive for 95% of the species, and sensitivity for both analyses was high. Furthermore, the pipeline removed half of the NHC records. Presence-only SDM applications should consider this limitation, especially in large biodiversity assessment projects, where models are generated automatically without subsequent checking. Our pipeline provides a conservative estimate of a species' area of occupancy, within an area slightly larger than its extent of occurrence, compatible with, e.g., IUCN Red List assessments.
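
For readers unfamiliar with inverse distance weighting, the idea can be illustrated with a minimal sketch: the abundance predicted at a location is the average of plot abundances, each weighted by the inverse of its distance raised to a power. This is not the implementation used in the study. The function name `idw_predict`, the power of 2, the use of planar (Euclidean) coordinates and the example values are all assumptions made for illustration; a real analysis over Amazonia would likely need geographic distances and a tuned power parameter.

```python
import math


def idw_predict(plots, query, power=2.0):
    """Estimate abundance at `query` from plot data by inverse distance weighting.

    plots: iterable of (x, y, abundance) tuples (hypothetical planar coordinates).
    query: (x, y) location at which to predict.
    power: distance exponent; 2.0 is an assumed default, not the study's setting.
    """
    plots = list(plots)
    if not plots:
        raise ValueError("at least one plot is required")

    qx, qy = query
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, abundance in plots:
        distance = math.hypot(x - qx, y - qy)
        if distance == 0.0:
            # The query coincides with a plot: return the observed value
            # instead of dividing by zero.
            return float(abundance)
        weight = 1.0 / distance ** power
        weighted_sum += weight * abundance
        weight_total += weight
    return weighted_sum / weight_total


# Two hypothetical plots with abundances 10 and 20, located 2 units apart.
example_plots = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
midpoint_estimate = idw_predict(example_plots, (1.0, 0.0))
near_first_estimate = idw_predict(example_plots, (0.5, 0.0))
```

At the midpoint both plots receive equal weight, so the estimate is their mean. Closer to the first plot the estimate is pulled towards that plot's abundance.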