PubChem is a database of chemical molecules and their activities against biological assays. The system is maintained by the National Center for Biotechnology Information (NCBI), a component of the National Library of Medicine, which is part of the United States National Institutes of Health (NIH). PubChem can be accessed for free through a web user interface. Millions of compound structures and descriptive datasets can be freely downloaded via FTP. PubChem contains multiple substance descriptions and small molecules with fewer than 100 atoms and 1,000 bonds. More than 80 database vendors contr
PubChem is a free, publicly accessible database that contains information about millions of chemical molecules and how they interact with biological systems, maintained by the U.S. National Institutes of Health. It matters because researchers, students, and the public can freely access and download detailed chemical data to support drug discovery, scientific research, and education without cost or restrictions.
AI-generated from the Wikipedia summary — may contain errors.
via Wikipedia infobox
PubChem is a database of chemical molecules and their activities against biological assays. The system is maintained by the National Center for Biotechnology Information (NCBI), a component of the National Library of Medicine, which is part of the United States National Institutes of Health (NIH). PubChem can be accessed for free through a web user interface. Millions of compound structures and descriptive datasets can be freely downloaded via FTP. PubChem contains multiple substance descriptions and small molecules with fewer than 100 atoms and 1,000 bonds. More than 80 database vendors contribute to the growing PubChem database.
==History== PubChem was released in 2004 as a component of the Molecular Libraries Program (MLP) of the NIH. As of November 2015, PubChem contains more than 150 million depositor-provided substance descriptions, 60 million unique chemical structures, and 225 million biological activity test results (from over 1 million assay experiments performed on more than 2 million small-molecules covering almost 10,000 unique protein target sequences that correspond to more than 5,000 genes). It also contains RNA interference (RNAi) screening assays that target over 15,000 genes.
Discovered by embedding cosine similarity (sentence-transformers MiniLM, 384-dim).