PubChem is a free chemical database and an open archive of the biological activities of millions of substances. PubChem is a part of the National Center for Biotechnology Information (NCBI), a division of the U.S. National Library of Medicine.
The periodic table of chemical elements is one of the most recognized tools in science. As we mark the 150th anniversary of the periodic table, the scientific community has declared 2019 to be “The International Year of the Periodic Table”. PubChem is celebrating by launching the PubChem Periodic Table and corresponding Element pages.
PubChem gives each chemical its own page, and elements are no exception. However, those pages are not suited for displaying information specific to elements (such as electronegativity and electron configuration). The PubChem Periodic Table and Element pages help you navigate the abundant chemical element data available within PubChem, while providing a convenient entry point to explore additional information, such as bioactivities and health and safety data, available in PubChem Compound pages for specific elements and their isotopes.
The PubChem Periodic Table provides three distinct views. Table View is the traditional periodic table any scientist would instantly recognize. List View provides a summary view, allowing you to see all properties available for each element at once. Game View, added as an educational feature, helps test your knowledge of element names and symbols.
Clicking an element in the PubChem Periodic Table directs you to the corresponding Element page. This page presents a wide variety of element information, including atomic properties (electron affinity, electronegativity, ionization potential, oxidation states, electron configuration, etc.) as well as isotopes, history, uses, and, most importantly, information sources. The Element page can also be reached directly via URLs that include the atomic number, symbol, or name (all case insensitive). For example, the following URLs are for the Element page for carbon:
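Such URLs can also be assembled programmatically. The sketch below is only an illustration: it assumes Element pages live under `https://pubchem.ncbi.nlm.nih.gov/element/`, which the text above does not confirm, and `element_url` is a hypothetical helper name.

```python
from urllib.parse import quote

# Assumed base path for Element pages; verify against the live site.
ELEMENT_BASE = "https://pubchem.ncbi.nlm.nih.gov/element"


def element_url(identifier):
    """Build an Element page URL from an atomic number, symbol, or name."""
    # Identifiers are described as case insensitive, so no case folding is done.
    return f"{ELEMENT_BASE}/{quote(str(identifier).strip())}"


# Three ways of addressing carbon: atomic number, symbol, and name.
carbon_urls = [element_url(x) for x in (6, "C", "Carbon")]
```

Because the identifiers are case insensitive, `element_url("carbon")` and `element_url("CARBON")` should lead to the same page if the assumed path is right.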
We’ve redesigned PubChem’s homepage to give you easier access to the information you need, where you need it. The mobile-friendly, responsive design works on the device you want to use. And the streamlined, intuitive interface puts the data you need at your fingertips.
Here are some of the changes you can expect to see at the new PubChem homepage:
The menus at the top of the page and the sidebar have been replaced with a minimal set of important links. These links include “About,” “Blog,” “Submit,” and “Contact.” The “About” link will bring you to the PubChem Docs site, where you can find an exhaustive list of PubChem services and documentation.
In addition, data count and data source statistics have been highlighted. Each also includes a link you can follow to get more information on these statistics.
Finally, we’ve improved PubChem’s search capabilities. The three search boxes for compounds, substances and bioassays have been replaced with a single search box that covers all search types. Search results from the formerly separate search types (compound, substance, and bioassay) have also been integrated into a single search results display. In addition, search in PubChem now directly supports formula and structure search. We have many more details to share with you about the new PubChem search in a separate blog post, so keep your eyes open for that!
We want to know what you think!
PubChem’s new look and feel is a big step forward, and we’re excited to share all of the improvements we’re making with you!
On March 31-April 4, 2019, the 257th American Chemical Society National Meeting will be held in Orlando, FL, the theme of which is “Chemistry for New Frontiers”. The PubChem team will be at the ACS meeting to present new developments and recent changes in PubChem. Below is a list of presentations that will be given by the PubChem staff.
We’ve redesigned PubChem’s summary and record pages with a series of updates both behind the scenes and to your own user experience. These changes will make finding the information you’re looking for easier and faster.
Behind the scenes, we’ve changed the way PubChem models and serves information. Most people won’t see these changes, but the changes do allow us to create pages better suited to your needs, faster. If you’re a programmatic user and you need information on how the data model changes affect you, you can find more information in our blog.
To go along with the behind-the-scenes changes, we’ve redesigned your user experience with an all-new look and feel. To begin, we’ve started color-theming our pages to make it easier to recognize what kind of page you’re on. For example, compound pages may have a light blue theme, while substance pages may have a yellow theme. There will be themes for bioassay and other page types as well.
We’ve also looked over volumes of usage data and user feedback to improve page layout and navigation, and many of the changes you’ll see are a direct response to your feedback. For example, the table of contents has been moved and improved. It now appears on the right side of the page, and higher up on the page. Redundant navigational icons are being removed.
Another result of your feedback is that we’re emphasizing chemical safety information in the summary area at the top of the record. We’re also adding thumbnails for all the available structure types of a given compound (for example, 2D, 3D, crystal) to the summary area. In addition, the graphics quality of the compound 3D Conformer interactive model has been significantly improved.
PubChem is updating the data model for objects returned by the PUG View server. These objects are used by both programmatic users and by PubChem web pages. PubChem web users will not be directly affected by the data model changes. Programmatic users, however, will need to update the programs that retrieve and interpret data from PUG View. The following major changes are being made to the data model of the PUG View JSON/XML blobs:
No more HTML markup within strings; instead, we will have an explicit markup object that separates primary strings from the various markup types.
All values are lists, instead of having separate fields for single values and lists.
No more embedded tables in the data blobs.
No more HTML markup within strings.
PubChem is making a major effort to remove all embedded HTML from within the various strings in the data blobs. Such embedded markup is difficult for parsers to deal with when only a plain string is desired. For example, this is the old model:
"StringValue": "Flipo RM: [Are the NSAIDs able to compromising the cardio-preventive efficacy of <a class=\"pubchem-internal-link CID-2244\" href=\"https://pubchem.ncbi.nlm.nih.gov/compound/aspirin\">aspirin</a>?]. Presse Med. 2006 Sep;35(9 Spec No 1):1S53-60.",
In the new data model, the main string is in plain text, and the URL links (or other types of markup) are separate, with the character location of the markup on the original string indicated by start and length values. For example:
"String": "Flipo RM: [Are the NSAIDs able to compromising the cardio-preventive efficacy of aspirin?]. Presse Med. 2006 Sep;35(9 Spec No 1):1S53-60. [PMID: 17078596]",
"Type": "General link"
"Type": "PubChem Internal Link",
This new format will make it easier for parsers to get at the relevant text data without a lot of programming overhead. Please note that this removal of embedded HTML also covers escaped HTML entities. These will instead be represented by a single UTF-8 character within the base string (for example, “&deg;” → “°”).
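To illustrate how start and length values let a program re-attach links to a plain string, here is a minimal sketch. The key names `Markup`, `Start`, `Length`, and `URL` are assumptions made for this example (only `String` and `Type` appear in the fragments above), and `apply_markup` is a hypothetical helper.

```python
import html


def apply_markup(item):
    """Rebuild an HTML string from a plain string plus markup spans.

    Assumes spans do not overlap and that Start/Length index characters
    of the plain string.
    """
    text = item["String"]
    pieces = []
    pos = 0
    for span in sorted(item.get("Markup", []), key=lambda s: s["Start"]):
        start = span["Start"]
        end = start + span["Length"]
        pieces.append(html.escape(text[pos:start]))  # text before the span
        inner = html.escape(text[start:end])
        url = span.get("URL")
        if url:
            pieces.append(f'<a href="{html.escape(url)}">{inner}</a>')
        else:
            pieces.append(inner)
        pos = end
    pieces.append(html.escape(text[pos:]))  # text after the last span
    return "".join(pieces)


example = {
    "String": "efficacy of aspirin?",
    "Markup": [
        {
            "Start": 12,
            "Length": 7,
            "URL": "https://pubchem.ncbi.nlm.nih.gov/compound/aspirin",
            "Type": "PubChem Internal Link",
        }
    ],
}
linked = apply_markup(example)
```

A parser that only wants the text can ignore the markup entirely and read `item["String"]`, which is the point of the new layout.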
All values are lists.
In the new PubChem data model, all values are being converted to list types. We’ve done this to avoid the cumbersome necessity for data parsers to have to check separate fields for single values vs. lists. We’ll use JSON format for the following examples, but the XML data model is parallel to the JSON. Here are two examples of the old format:
In the examples above, note the fields “NumValue” vs. “NumValueList” – this is cumbersome to code against. In the new system, the examples above would look like this, “Number” being used in both cases within a list structure:
These changes will make it easier to code your data parsers.
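To see why the single list form is easier to code against, compare what a parser has to do for each style. This is a minimal sketch that assumes flat dictionaries carrying the field names mentioned above; the surrounding record layout is hypothetical, as is the helper name `numbers_from`.

```python
def numbers_from(record):
    """Return numeric values as a list, whichever field style the record uses."""
    if "Number" in record:          # new style: always a list
        return list(record["Number"])
    if "NumValueList" in record:    # old style: several values
        return list(record["NumValueList"])
    if "NumValue" in record:        # old style: a single value
        return [record["NumValue"]]
    return []


# Old-style records need two branches; new-style records need only one.
old_single = {"NumValue": 1.5}
old_many = {"NumValueList": [1.5, 2.5]}
new_style = {"Number": [1.5]}
```

With only the new model to support, the function collapses to the first branch.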
No more embedded tables in the data blobs.
The use of embedded tables in the old system made it difficult for programmatic users to extract specific fields from within the table. The format required you to dig down into the rows and cells of the table to try and find the needed value. For example:
In the new data model, shown below, the fields are more explicitly labeled with section names, the same way as other (non-table) values in the data:
"TOCHeading": "Molecular Weight",
"Description": "Molecular weight or molecular mass refers to the mass of a molecule. It is calculated as the sum of the mass of each constituent atom multiplied by the number of atoms of that element in the molecular formula.",
"Name": "Molecular Weight",
These changes will make it easier to retrieve data from tables without a lot of programming overhead.
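Because values are labeled with section names, a short recursive search by `TOCHeading` is enough to locate one. The sketch below walks any nested structure of dictionaries and lists. The sample record's nesting keys (`Record`, `Section`) are assumptions made for this example, and `find_sections` is a hypothetical helper.

```python
def find_sections(node, heading):
    """Yield every dictionary whose TOCHeading equals `heading`."""
    if isinstance(node, dict):
        if node.get("TOCHeading") == heading:
            yield node
        for child in node.values():
            yield from find_sections(child, heading)
    elif isinstance(node, list):
        for child in node:
            yield from find_sections(child, heading)


# Hypothetical, heavily trimmed record used only to exercise the function.
sample = {
    "Record": {
        "Section": [
            {
                "TOCHeading": "Chemical and Physical Properties",
                "Section": [
                    {"TOCHeading": "Molecular Weight", "Name": "Molecular Weight"}
                ],
            }
        ]
    }
}
matches = list(find_sections(sample, "Molecular Weight"))
```

The walker does not depend on the exact nesting keys, so it should keep working if the real layout differs from the sample.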
We want to know what you think!
In summary, PubChem’s new data model makes it easier to retrieve the data you need. As the data model is updated and released, you’ll be able to find detailed information on the schema here: https://pubchemdocs.ncbi.nlm.nih.gov/pug-view.
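For programmatic users, a request against PUG View might look like the sketch below. The endpoint path used here is an assumption and should be checked against the schema documentation linked above; `pug_view_url` and `fetch_record` are illustrative names, not part of any PubChem library.

```python
import json
from urllib.request import urlopen

# Assumed PUG View base path; verify against the PUG View documentation.
PUG_VIEW_BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug_view/data"


def pug_view_url(cid, record_type="compound", fmt="JSON"):
    """Build the URL for one record, e.g. the compound with the given CID."""
    return f"{PUG_VIEW_BASE}/{record_type}/{int(cid)}/{fmt}"


def fetch_record(cid):
    """Download and decode one compound record (performs a network request)."""
    with urlopen(pug_view_url(cid), timeout=30) as response:
        return json.load(response)
```

Calling `fetch_record(2244)` would request the record for aspirin (CID 2244), if the assumed path is right.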
More than a million links to scientific articles with a focus on chemical synthesis have been added to PubChem, thanks to contributions from the publisher Thieme Chemistry with support from their technology partner InfoChem. (Read Thieme’s press release about it.)
The Thieme Chemistry information in PubChem covers nearly 700,000 chemical substance records, nearly 700,000 scientific article descriptions, and over 1.2 million links between chemicals and articles. The document descriptions include information such as a digital object identifier (DOI), publication title, name of the journal or book, publication type, language, and publication year.
The Thieme Chemistry contribution dramatically increases the number of chemical structures in PubChem with links to the scientific literature from nearly 1.0 million to 1.6 million. Of the approximately 700,000 Thieme Chemistry chemical structures contributed to PubChem, 42% are new to PubChem, and 89% previously lacked literature links.
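As a rough check on these figures, the percentages can be turned into approximate counts. The inputs are the rounded numbers quoted above, so the results are estimates only.

```python
contributed = 700_000                      # "approximately 700,000" structures
new_to_pubchem = contributed * 42 // 100   # 42% are new to PubChem
lacked_links = contributed * 89 // 100     # 89% previously lacked literature links
growth = 1_600_000 - 1_000_000             # change in structures with literature links

print(new_to_pubchem)  # 294000
print(lacked_links)    # 623000
print(growth)          # 600000
```

The roughly 623,000 structures gaining their first literature link line up with the reported growth from nearly 1.0 million to 1.6 million, allowing for rounding in the quoted figures.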
Within the PubChem Classification Browser, the PubChem Compound TOC (Table of Contents) classification tree allows you to find all chemicals with a given annotation section. You can click “Literature” to view the subset fields under literature and find the “Thieme References” section. Clicking on the number will then show compound records with that section.
Each chemical record with a Literature / Thieme References section includes a table containing document links from Thieme Chemistry. The figure below shows the Thieme References section of the Compound record for ciprofloxacin (CID 2764).
Figure 1. The Literature / Thieme References section of the ciprofloxacin Compound record (CID 2764). Clicking the title (red circle) loads the article at the Thieme Chemistry site.
The article title links to the article on the Thieme-Chemistry site. You can download all references for a chemical record in CSV format through the “Download” button at the top right of the table. You can also expand to the full table by clicking the icon, where you can see additional data columns. By default, the articles are ordered by Publication Date as provided by Thieme Chemistry, but you can easily change the sorting order through the pulldown menu.
PubChem, along with contributors such as Thieme Chemistry, is helping to fuel a modern, data-driven research ecosystem. Literature links from Thieme Chemistry dramatically expand the findability, accessibility, interoperability, and reusability (FAIR) of synthesis-related chemical information. In addition, this contributed content helps to further enhance global open science by allowing researchers to locate key information about chemicals.
On August 19-23, 2018, the 256th American Chemical Society National Meeting will be held in Boston, MA, the theme of which is “Nanoscience, Nanotechnology & Beyond”. The PubChem team will be at the ACS meeting to present new developments and recent changes in PubChem. Below is a list of presentations that will be given by the PubChem staff.
PubChem BioAssay Tools (https://pubchem.ncbi.nlm.nih.gov/assay/assay.cgi), a legacy collection of bioactivity analysis services, are being retired as part of an ongoing technology refresh. Alternative approaches now exist to perform most of the same tasks, and these newer technologies provide expanded features and capabilities. The PubChem BioAssay Tools services will no longer be accessible after November 1, 2018.
Why are we phasing out the BioAssay Tools?
The BioAssay tools were developed when PubChem was much smaller. At PubChem’s current size, the tools do not scale sufficiently to handle most analysis tasks. Consequently, many users download data for use in their assay analysis workflows instead of using the tools. Deprecating these BioAssay Tools will free up resources allowing PubChem to develop new and better approaches to accessing bioactivity content.
What replaces the BioAssay Tools?
There are easier-to-find PubChem services that offer the same or similar functionalities for most BioAssay Tools. For instance, let’s say you have a compound, like aspirin, for which you would like to analyze all reported bioactivity data. In the past, you would need to first find the dedicated tool (the BioActivity Summary service) to retrieve the data. Now you will find this information in a BioAssay Results section on the widely accessed Compound Summary page:
Though some tools (like Structure Clustering and Structure-Activity Relationship Analysis) do not currently have direct alternatives in PubChem, most PubChem pages indicate related records such as structurally similar chemicals or assays performed against a given target. They can be further aggregated using commonly available third-party tools.
Will my old link work to the BioAssay Tools?
URL redirection from BioAssay Tools to their corresponding replacement will be provided for a period beyond November 1, 2018. Eventually, the redirection links will be removed.
On March 18-22, 2018, the 255th American Chemical Society National Meeting will be held in New Orleans, LA, the theme of which is “Nexus of Food, Energy & Water”. The PubChem team will be at the ACS meeting to present new developments and recent changes in PubChem. Below is a list of presentations that will be given by the PubChem staff.
PubChem added more than 26 million links to scientific articles, thanks to contributions from the publisher Springer Nature. Of these, 1.6 million links point to open access or free-to-read documents! (Read Springer Nature’s press release and presentation about it.)
Springer Nature includes the SpringerLink, SpringerOpen, and BioMed Central research platforms as well as the nature.com website. Combined, they include more than 10 million scientific documents spanning the primary literature, book chapters, and reference works. InfoChem, a subsidiary of Springer Nature, identified the chemicals mentioned in these scientific articles using a proprietary approach.
What was contributed?
The Springer Nature data collection in PubChem covers over 600 thousand chemical substance records, and contains nearly 4 million scientific article descriptions (of which almost 300 thousand are open- or free-access) and 26.8 million links between chemicals and articles. The document descriptions include information such as a digital object identifier (DOI), publication title, name of the journal or book, document type, subject matter classification, language, open/free access availability, and publication year.
Why is this important?
This contribution, which doubles the number of chemical structures in PubChem with links to the scientific literature, improves the accessibility and discoverability of information about chemicals. Nearly all link content provided by Springer Nature is novel to PubChem, with only 10% of the provided chemical structures having a previous link to the scientific literature.
Integration of the Springer Nature links and data within PubChem has opened new possibilities for organizations and researchers. As a result of the contribution, PubChem added the capability to handle DOI-based annotation content. Additional appropriate DOI-based linked content (articles, data sets, and more) can now be added to PubChem.
What is Springer Nature?
Springer Nature is a scientific publishing company and a leading global research, educational and professional publisher formed through the merger of Nature Publishing Group, Palgrave Macmillan, Macmillan Education, and Springer Science+Business Media.
The SpringerLink research platform provides access to more than 6 million journal articles, 3.7 million book chapters, and more than 480,000 reference works primarily in the areas of science, technology, and medicine.
Where can I access the contributed content?
Each chemical record with a Literature “Springer Nature References” section includes a table containing document links from Springer Nature. As an example, below are the links to the Springer Nature References section for aspirin (Compound ID 2244 and Substance ID 341138876). (Read this blog if you are not familiar with how Compounds and Substances in PubChem are different from each other.)
Click an article title to access the document on a Springer Nature website. To download all the contributed document data for a chemical record in CSV format, click the “Download” button at the top right of the table (see image). You can also open a full table view by clicking the icon, where you can see additional data columns such as the DOI. By default, the articles are ordered by degree of “relevance” to the chemical as provided by Springer Nature, but you can change the sorting field through a pulldown menu, and you can also change the sort direction.
How can I find chemical records with Springer Nature references?
There are multiple ways to get a complete list of PubChem Substance or Compound records with “Springer Nature References”. One can:
The PubChem Classification Browser provides the means to navigate PubChem contents using various hierarchical classification trees. The PubChem Compound TOC (Table of Contents) classification tree allows you to find all chemicals with a given annotation section. In this case, one can click ‘Literature’ to view the subset fields under literature and find the ‘Springer Nature References’ section. Clicking on the number will then show compound records with that section.