Nanoformulations of therapeutic drugs are transforming our ability to effectively deliver drugs and treat a myriad of conditions. Often, however, they are complex to produce and exhibit low drug loading, except for nanoparticles formed via co-assembly of drugs and small molecular dyes, which display drug-loading capacities of up to 95%. It is currently not understood which of the millions of possible small-molecule combinations can form these nanoparticles. Here we report the integration of machine learning with high-throughput experimentation to enable the rapid and large-scale identification of such nanoformulations. We identified 100 self-assembling drug nanoparticles from 2.1 million pairings, each including one of 788 candidate drugs and one of 2,686 approved excipients. We further characterized two nanoparticles, sorafenib-glycyrrhizin and terbinafine-taurocholic acid, both ex vivo and in vivo. We anticipate that our platform can accelerate the development of safer and more efficacious nanoformulations with high drug-loading capacities for a wide range of therapeutics.