Scientific Computing World is the only global publication dedicated to the computing and information technology needs of scientists and engineers. It covers computing for engineering, science, and technology, grouped under the headings of Informatics; High-Performance Computing; and Application software.
DeepMind has announced a new tool in AI research. The system, called AlphaFold, uses vast genomic data to predict protein structure. Although AlphaFold itself has been developed over the last two years, it builds on many years of prior genomics research.
This technology could have significant implications for healthcare and medicine, as it will enable scientists to gain insight into how diseases develop and how they might be prevented. The ability to predict a protein’s shape is useful to scientists because that shape is fundamental to understanding the protein’s role within the body, as well as to diagnosing and treating diseases believed to be caused by misfolded proteins, such as Alzheimer’s, Parkinson’s, Huntington’s and cystic fibrosis.
A protein’s properties are determined by its 3D structure. For example, antibody proteins that make up our immune systems are ‘Y-shaped’, and are akin to unique hooks. By latching on to viruses and bacteria, antibody proteins are able to detect and tag disease-causing microorganisms for extermination.
Similarly, collagen proteins are shaped like cords, which transmit tension between cartilage, ligaments, bones, and skin. Other examples include Cas9 proteins, which, guided by CRISPR sequences, act like scissors to cut and paste DNA; antifreeze proteins, whose 3D structure allows them to bind to ice crystals and prevent organisms from freezing; and ribosomes, which act like a programmed assembly line, helping to build proteins themselves.
But figuring out the 3D shape of a protein purely from its genetic sequence is a complex task that scientists have found challenging for decades. The challenge is that DNA only contains information about the sequence of a protein’s building blocks, called amino acid residues, which form long chains. Predicting how those chains will fold into the intricate 3D structure of a protein is what’s known as the ‘protein folding problem’.
An understanding of protein folding will also assist in protein design, which could unlock a number of benefits. For example, advances in biodegradable enzymes—which can be enabled by protein design—could help manage pollutants like plastic and oil, helping us break down waste in ways that are more friendly to our environment. In fact, researchers have already begun engineering bacteria to secrete proteins that will make waste biodegradable, and easier to process.
Over the past five decades, scientists have been able to determine shapes of proteins in labs using experimental techniques like cryo-electron microscopy, nuclear magnetic resonance or X-ray crystallography, but each method depends on a lot of trial and error, which can take years and cost tens of thousands of dollars per structure. This is why biologists are turning to AI methods as an alternative to this long and laborious process for difficult proteins.
Fortunately, the field of genomics is quite rich in data thanks to the rapid reduction in the cost of genetic sequencing. As a result, deep learning approaches to the prediction problem that rely on genomic data have become increasingly popular in the last few years.
Our team focused specifically on the hard problem of modelling target shapes from scratch, without using previously solved proteins as templates. We achieved a high degree of accuracy when predicting the physical properties of a protein structure, and then used two distinct methods to construct predictions of full protein structures.
Both of these methods relied on deep neural networks that are trained to predict properties of the protein from its genetic sequence. The properties our networks predict are: (a) the distances between pairs of amino acids and (b) the angles between chemical bonds that connect those amino acids. The first development is an advance on commonly used techniques that estimate whether pairs of amino acids are near each other.
We trained a neural network to predict a separate distribution of distances between every pair of residues in a protein. These probabilities were then combined into a score that estimates how accurate a proposed protein structure is. We also trained a separate neural network that uses all distances in aggregate to estimate how close the proposed structure is to the right answer.
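As a rough illustration of this kind of scoring (a toy sketch, not DeepMind’s actual implementation), a predicted distance distribution for each residue pair can be turned into a log-likelihood score for a candidate structure. The array shapes, bin layout and function name below are all hypothetical:

```python
import numpy as np

def distance_score(coords, predicted_probs, bin_edges):
    """Score a candidate structure against predicted pairwise distance
    distributions (higher = better agreement with the predictions).

    coords          : (N, 3) array of residue coordinates
    predicted_probs : (N, N, B) per-pair probability over B distance bins
    bin_edges       : (B+1,) edge values defining the distance bins
    """
    n = coords.shape[0]
    # Pairwise distances between all residues
    diffs = coords[:, None, :] - coords[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(-1))
    # Map each distance to its bin index
    bins = np.clip(np.digitize(dists, bin_edges) - 1, 0,
                   predicted_probs.shape[-1] - 1)
    score = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            # Sum log-probabilities of the observed distances
            score += np.log(predicted_probs[i, j, bins[i, j]] + 1e-9)
    return score
```

A structure whose pairwise distances fall into high-probability bins receives a higher score, which is the sense in which the score ‘estimates how accurate a proposed protein structure is’.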
Using these scoring functions, we were able to search the protein landscape to find structures that matched our predictions. Our first method built on techniques commonly used in structural biology, and repeatedly replaced pieces of a protein structure with new protein fragments. We trained a generative neural network to invent new fragments, which were used to continually improve the score of the proposed protein structure.
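In outline, that search can be sketched as a simple accept-if-better loop: propose a new conformation for a short window of residues and keep it only when the score improves. This toy version uses random perturbations in place of the trained generative network; all names and parameters are illustrative:

```python
import numpy as np

def fragment_search(coords, score_fn, n_iters=200, frag_len=3, seed=0):
    """Toy fragment-replacement search. A real system would draw
    candidate fragments from a generative model rather than from
    random perturbations, but the accept-if-better loop is the same."""
    rng = np.random.default_rng(seed)
    best = coords.copy()
    best_score = score_fn(best)
    n = len(best)
    for _ in range(n_iters):
        start = rng.integers(0, n - frag_len + 1)
        candidate = best.copy()
        # Replace a short fragment with a perturbed version
        candidate[start:start + frag_len] += rng.normal(0, 0.5, (frag_len, 3))
        s = score_fn(candidate)
        if s > best_score:
            best, best_score = candidate, s
    return best, best_score
```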
The second method optimised scores through gradient descent—a mathematical technique commonly used in machine learning for making small, incremental improvements—which resulted in highly accurate structures. This technique was applied to entire protein chains rather than to pieces that must be folded separately before being assembled, reducing the complexity of the prediction process.
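A minimal sketch of folding as gradient descent, under invented target distances (nothing here reflects the actual networks or potentials used): start from random coordinates and repeatedly step downhill on the squared error between the structure’s pairwise distances and the predicted targets.

```python
import numpy as np

def fold_by_gradient_descent(target_d, n_steps=1000, lr=0.002, seed=0):
    """Toy folding-as-optimisation: take gradient steps that pull every
    residue pair toward its target distance. target_d is an (N, N)
    matrix of predicted distances (illustrative stand-in values)."""
    rng = np.random.default_rng(seed)
    n = target_d.shape[0]
    coords = rng.normal(0, 1, (n, 3))
    for _ in range(n_steps):
        diffs = coords[:, None, :] - coords[None, :, :]     # (N, N, 3)
        dists = np.sqrt((diffs ** 2).sum(-1)) + np.eye(n)   # eye avoids /0
        # Gradient of sum_{i!=j} (dists_ij - target_ij)^2 w.r.t. coords
        err = (dists - target_d) / dists
        np.fill_diagonal(err, 0.0)
        grad = 4 * (err[:, :, None] * diffs).sum(axis=1)
        coords -= lr * grad
    return coords
```

Because the whole chain is optimised at once, there is no separate fragment-assembly stage, mirroring the reduction in complexity described above.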
The success of the team’s first foray into protein folding is indicative of how machine learning systems can integrate diverse sources of information to help scientists come up with creative solutions to complex problems at speed. Just as we’ve seen how AI can help people master complex games through systems like AlphaGo and AlphaZero, we similarly hope that one day, AI breakthroughs will help us master fundamental scientific problems, too.
This work was done in collaboration with Richard Evans, John Jumper, James Kirkpatrick, Laurent Sifre, Tim Green, Chongli Qin, Augustin Zidek, Sandy Nelson, Alex Bridgland, Hugo Penedones, Stig Petersen, Karen Simonyan, David Jones, David Silver, Koray Kavukcuoglu, Demis Hassabis, and Andrew Senior.
Dolphin Interconnect Solutions, a provider of low latency, industry-standard PCIe adapter cards, scalable switches, and advanced PCI Express software, has announced the launch of its high-end, Microsemi Switchtec-based MXS824 24-port PCIe scalable switch for I/O expansion and PCIe fabric. The switch is the latest addition to Dolphin’s expanding roster of PCIe products.
The MXS824 is the linchpin of Dolphin’s PCIe Fabric Architecture. PCIe fabrics can enable a variety of systems, including composable architectures, PCIe clusters, I/O expansion, and reflective memory systems based on PCI Express.
‘The MXS824 adds to Dolphin’s solution for composable architectures and provides new levels of flexibility,’ said Herman Paraison, VP Sales and Marketing at Dolphin Interconnect Solutions. ‘The switch is targeted at clients looking to scale PCI Express. It enables them to scale up to larger clusters, build targeted applications, or add more devices to their PCIe environment.’
Composable architectures allow users to flexibly build systems on the fly. A wide variety of hardware can be added or removed from these systems as needed, such as NVMe drives, GPUs, processors, and FPGAs. Dolphin’s unique approach to building composable architectures is called device lending. Unlike other concepts for building composable architectures, device lending allows access to devices installed in servers as well as in expansion boxes or JBoFs.
This creates a pool of transparent I/O resources that can then be shared among computers without any application-specific distribution mechanisms or requiring any modifications to drivers. Just as importantly, the resources can easily be reallocated whenever required, allowing for extremely flexible and ever-changing distributions of resources.
The MXS824 is a key component in this environment for connecting a large number of systems and expansion boxes to a PCIe fabric. All of these devices can be integrated into the PCIe fabric and made available to systems, not just the devices attached to the switch via an expansion box. Thus, multiple servers connected in the fabric can borrow devices in connected servers, expansion boxes, or JBoFs.
With device lending, applications such as TensorFlow can allocate a large number of GPUs, significantly speeding application performance, and release them when a simulation is complete. This kind of flexibility can maximize limited resources and is highly useful for customers who consistently need to move workloads around to better utilize their very expensive resources, such as universities and laboratories.
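The borrow-and-release lifecycle described above can be modelled in a few lines. This is purely a conceptual sketch of the pooling behaviour (real device lending operates at the PCIe fabric level, not in application code, and the class and method names here are invented):

```python
class DevicePool:
    """Toy model of device lending: a shared pool of PCIe devices that
    hosts can borrow on demand and return when a job completes."""

    def __init__(self, devices):
        self.free = set(devices)
        self.borrowed = {}  # device -> borrowing host

    def borrow(self, host, count):
        """Allocate `count` free devices to `host`."""
        if count > len(self.free):
            raise RuntimeError("not enough free devices")
        taken = [self.free.pop() for _ in range(count)]
        for d in taken:
            self.borrowed[d] = host
        return taken

    def release(self, host):
        """Return all of `host`'s devices to the free pool."""
        returned = [d for d, h in self.borrowed.items() if h == host]
        for d in returned:
            del self.borrowed[d]
            self.free.add(d)
        return returned
```

The point of the sketch is the reallocation step: once a host releases its devices, they immediately become available to any other host on the fabric.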
The MXS824 is the high-end switching component in Dolphin’s new PCIe fabric implementation. It is supported in Dolphin’s latest version of eXpressWare software, the PCIe fabric software stack, and allows clients to build even larger topologies. The 24-port Microsemi PFX-based 1U cluster switch delivers 32 GT/s of non-blocking bandwidth per port at ultra-low latency. The switch supports configurations in which up to four ports can be combined into a single x16 / 128 GT/s port for higher bandwidth. Ports can be configured as 24 x4 ports, 12 x8 ports, 6 x16 ports, or combinations of these. Multiple switches can be connected to create larger port counts. Each connection is fully compliant with PCI Express Gen1, Gen2, and Gen3 I/O specifications.
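The port arithmetic above follows directly from PCIe Gen3 signalling at 8 GT/s per lane: each physical x4 port carries 32 GT/s, and bonding four ports yields one x16 port at 128 GT/s. A small sketch of that arithmetic (illustrative only, not vendor tooling):

```python
# Port-bonding arithmetic for a 24-port x4 switch at PCIe Gen3.
LANES_PER_PORT = 4
GT_PER_LANE = 8       # PCIe Gen3 raw signalling rate per lane
TOTAL_PORTS = 24

def config(ports_bonded):
    """Return (logical port count, lanes per logical port, GT/s per
    logical port) when `ports_bonded` physical x4 ports are combined."""
    lanes = LANES_PER_PORT * ports_bonded
    return TOTAL_PORTS // ports_bonded, lanes, lanes * GT_PER_LANE

for bonded in (1, 2, 4):
    count, lanes, rate = config(bonded)
    print(f"{count} x{lanes} ports at {rate} GT/s each")
```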
Beyond composable architectures, the MXS824 expands capabilities of PCI Express in markets like simulation, military, automotive, and financial services — industries that can take advantage of the new scaling capabilities and performance delivered by the switch. Applications such as scientific simulations, which use reflective memory and many nodes to acquire and distribute data, benefit from the expanded topologies and ultra-low latency.
Another key market for the MXS824 is testing and development. For customers seeking to prototype various system architectures based on PCI Express, the MXS824 can enable building topologies and development environments that mimic their end systems. These development environments can be used for software development and performance testing. The switch is also useful for customers connecting a large number of devices in production testing, as these devices can be connected to an external PC running test programs in production environments.
Visit Dolphin at booth 443 at SC18 in Dallas, Texas November 12–15 for a demo of how the MXS824 switch enables low-latency composable architecture.
Researchers at the University of Birmingham will soon be able to carry out research on the largest IBM POWER9 Artificial Intelligence (AI) cluster in the UK, as the University has now announced the deployment alongside HPC integrator OCF. OCF and the university will integrate a total of 11 IBM POWER9-based IBM Power Systems servers into its existing high-performance computing (HPC) infrastructure, the Birmingham Environment for Academic Research (BEAR).
Birmingham initially deployed two IBM Power Systems AC922 servers, powered by POWER9 CPUs with the industry’s only CPU-to-GPU NVIDIA NVLink interconnect, in September 2018. However, the Advanced Research Computing (ARC) team soon realised that it needed more computational power tailored to the ever-increasing AI workloads generated by the University’s researchers, who are delivering ground-breaking computational vision analysis and solving life sciences challenges, such as improving cancer diagnosis.
‘It’s very important to us as a research-led institution that we are at the forefront of data research which means we are always looking at ways to make AI quicker and more accessible for our researchers,’ said Simon Thompson, research computing infrastructure architect at the University of Birmingham. ‘With the sheer amount of data, the common questions from researchers are how can we analyse it fast enough and how can we make the process even quicker? With our early deployment of the two IBM POWER9 servers we have seen what is possible. By scaling up, we can keep pace with the escalating demand and offer the computational capacity and capability to attract leading researchers to the University.’
The University will now add an additional nine IBM Power Systems AC922 warm water-cooled nodes, each equipped with four NVIDIA Tesla V100 16GB Tensor Core GPUs, 1TB of system memory, dual 18 core POWER9 CPUs and Mellanox 100Gb EDR InfiniBand. The solution uses IBM PowerAI Enterprise software, unlocking potential for accelerated computing, capitalising on the largest IBM POWER9 cluster in the UK. IBM will also support use of the new systems by providing comprehensive training and support to Birmingham’s researchers in partnership with ARC.
This significant enhancement to BEAR will provide an even more powerful and versatile computing environment for researchers. For example, fellows from The Alan Turing Institute studying early diagnosis of, and new therapies for, heart disease and cancer will use AI to run faster diagnostics in the future. Meanwhile, researchers in the physical sciences are using machine learning and data science approaches to quantify the 4D (3D plus time) microstructures of advanced materials collected at national synchrotron facilities such as the Diamond Light Source. These researchers expect to use the large model support provided by IBM PowerAI software to analyse the terabytes of data generated daily, currently an almost impossible task.
‘We are thrilled that the University of Birmingham has decided to invest in building the UK's largest POWER9 AI cluster’, said Simon Robertson, director, IBM Servers, UK & Ireland. ‘We are proud to see the practical application of IBM technology used by researchers across the University and beyond.’
‘We’re delighted to work with the University on this initiative,’ said Julian Fielden, managing director of OCF. ‘AI workloads are driving data intensive challenges that can only be met with accelerated infrastructure, such as IBM’s POWER9. The University is leading the way with this impressive project and will continue to attract world-class researchers with this type of innovation.’
Altair has announced a definitive merger agreement under which Altair has agreed to acquire Datawatch. Altair will pay $13.10 per share in cash, representing a fully diluted equity value of approximately $176 million. The transaction was unanimously approved by the Boards of Directors of both companies.
James Scapa, Altair’s Founder, Chairman, and Chief Executive Officer, commented: ‘Bringing Datawatch into Altair should result in a powerful offering consistent with our vision to transform product design and decision making by applying simulation, data science and optimisation throughout product lifecycles. We see a convergence of simulation with the application of machine learning technology to live and historical sensor data as essential to creating better products, marketing them efficiently, and optimising their in-service performance. Datawatch is a great team of people with best-in-class products, and we look forward to their joining us.’
Altair believes the acquisition of Datawatch will be useful in a number of different markets that want to leverage data analytics and data science technologies. Datawatch’s solutions, which include data prep, data prediction, and real-time high-volume data visualisation technologies, are highly relevant and applicable to almost any company and vertical market. Altair also reports that there is opportunity to cross-sell Datawatch products into Altair’s primarily manufacturing customer base.
Michael Morrison, Chief Executive Officer of Datawatch, added, ‘The Datawatch team is excited to join Altair and benefit from its long track record of success with developing and bringing to market highly differentiated software technology across diverse industry verticals. We feel great about the cultural alignment and look forward to driving continued innovation in our market-leading solutions as an integral part of Altair’s vision.’
Autoscribe Informatics has announced the release of v5.4.6 of Matrix Gemini LIMS. New features include improved visualisation of data and information, particularly for hierarchical data structures and storage locations such as freezers, well plates and other storage types, together with advanced analytical quality control (AQC) functions.
The new ‘Container View’ allows container types to be represented in the LIMS including well plates, storage racks, cryogenic storage boxes and chemical inventory storage facilities. Each point, or location, within the container grid can represent a sample (or item) and may be configured to show any information about the item that is desired.
The location of individual items or groups of items within the storage system may be changed using drag and drop features. The system will also allow entire containers (and their contents) to be moved. All changes can be recorded in an audit trail for later recall, providing a complete audit history of the sample location during storage. The Container View, in conjunction with the Matrix Tree View functionality, allows laboratories to define and manage any hierarchically defined information within the laboratory.
The ‘Container View’ is not limited to displaying items in a storage facility. The inherent flexibility means it can be used to visualise many types of information, for example data heat maps or the mapping of contamination hot spots within a facility. It can also be used for resource mapping and allocation including the creation of Kanban boards. Samples, instruments or people (or any resources) can be defined and allocated using the built-in drag and drop methodology. Like all Matrix Gemini LIMS features the container view is configured using Matrix Gemini’s powerful built-in graphical configuration tools, allowing maximum flexibility without the need for software coding skills.
Advanced AQC provides highly flexible run sheet functionality, allowing users to create run sheet templates that meet the needs of today’s analytical laboratory. AQC samples including blanks, duplicates, replicates, spikes, controls, spike duplicates and control duplicates and their position within a run can be defined based on user defined patterns. Advanced AQC allows complex rules to be used when placing control samples amongst standard samples within a run sheet. These rules include set positioning of individual QC samples, leading and trailing sets of QC samples as well as repeating patterns and randomised positioning within sets of unknown samples. Advanced AQC allows the laboratory to maximise sample throughput while meeting regulatory and quality standards and without compromising the quality or accuracy of data. It can also drive improved efficiency and maximise the use of resources across the laboratory.
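The placement rules described above can be illustrated with a toy run-sheet builder that leads with QC samples, inserts a control at a repeating interval among the unknowns, and closes with trailing QC. This is a hypothetical sketch only, and does not reflect Matrix Gemini’s actual rule engine or configuration interface:

```python
def build_run_sheet(unknowns, qc_every=5, leading=("blank", "control"),
                    trailing=("control",)):
    """Toy run-sheet builder: leading QC set, a control inserted after
    every `qc_every` unknown samples, and a trailing QC set.
    (Illustrative only -- not Autoscribe's rule engine.)"""
    run = list(leading)
    for i, sample in enumerate(unknowns, start=1):
        run.append(sample)
        # Insert a repeating-pattern control, but not right before
        # the trailing QC set
        if i % qc_every == 0 and i < len(unknowns):
            run.append("control")
    run.extend(trailing)
    return run

sheet = build_run_sheet([f"S{i:03d}" for i in range(1, 13)], qc_every=5)
```

A real implementation would also cover blanks, duplicates, spikes and randomised positioning, as the article notes; the sketch shows only the simplest repeating-pattern rule.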
Requested by customers, the ‘Container View’ and Advanced AQC demonstrate how Matrix Gemini LIMS continues to deliver innovative solutions for laboratories across a broad range of industries worldwide. Matrix Gemini v5.4.6 is available today for immediate delivery to all Autoscribe Informatics customers.
The Mars lander, InSight, short for Interior Exploration using Seismic Investigations, Geodesy and Heat Transport, began its 205-day journey into space on May 5, 2018. InSight’s landing is expected to occur at 3:00 pm Eastern Time on Monday, November 26, as space enthusiasts watch online and at live viewing parties scheduled in 70 nationwide locations and four sites abroad.
'Jet Propulsion Laboratory has been a longstanding customer of DDN, and we feel truly privileged to play a part in the space exploration they are conducting. We look forward to the landing of InSight and are happy to share that experience with others around the globe. The data collected from the core of Mars will surely result in a better understanding of the red planet and potentially in life-changing outcomes for all of us here on Earth,' said Paul Bloch, president at DDN.
Landing on Mars is no simple feat. Entry, descent and landing (EDL) is a terrifying event that begins approximately 80 miles above the surface and lasts about six minutes. As it unfolds, the cruise stage separates and the aeroshell descends through the atmosphere; a parachute and retrorockets then deploy to slow the spacecraft. Suspended legs extend to absorb some of the shock of touchdown, and the EDL phase is complete.
Many lessons have been learned from past Mars missions, and EDL techniques are honed using Monte Carlo simulations. Every movement of the spacecraft is precisely calculated using machine learning algorithms that record data and re-run Monte Carlo simulations after each manoeuvre during EDL. All the data generated is stored on DDN EXAScaler appliances. Even though the EDL sequence is exhaustively simulated, it retains flexibility to handle shifting weather: the mission team can tweak when InSight's parachute deploys and use radar to find the landing surface, which is quite an astonishing feat.
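In spirit, a Monte Carlo dispersion study samples the uncertain entry conditions many times, propagates each draw through a model, and summarises the spread of outcomes. The sketch below is purely illustrative; every number and the crude downrange model are invented, not JPL's:

```python
import numpy as np

def edl_monte_carlo(n_runs=10_000, seed=42):
    """Toy Monte Carlo dispersion study: sample uncertain entry
    conditions and summarise the landing-point spread. All figures
    are illustrative placeholders."""
    rng = np.random.default_rng(seed)
    # Uncertain inputs: entry angle (deg), atmospheric density factor,
    # parachute-deploy altitude (km)
    angle = rng.normal(-12.0, 0.2, n_runs)
    density = rng.normal(1.0, 0.05, n_runs)
    deploy_alt = rng.normal(11.0, 0.5, n_runs)
    # Crude downrange model: shallower entry and thinner air carry the
    # lander further; a lower (later) chute deploy adds downrange
    downrange = (600 + 25 * (angle + 12.0) - 80 * (density - 1.0)
                 + 4 * (11.0 - deploy_alt))
    lo, hi = np.percentile(downrange, [1, 99])
    return lo, hi

lo, hi = edl_monte_carlo()
```

The 1st-to-99th percentile interval is the kind of dispersion ellipse statistic that informs the size of the landing target, such as the 81-mile oval described below.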
The landing will take place in a landscape called Elysium Planitia, within an 81-mile-long, 17-mile-wide oval that is located on the western edge of a flat, smooth expanse of lava plain, similar to the topography of the southwestern United States. Unlike other highly promoted Mars rover missions, InSight will remain where it lands and collect data from there, sending the data 91 million miles back to the DDN EXAScaler systems on Earth, where deep analysis and simulation will occur.
InSight is the first robot of its kind to conduct this type of exploration. Its work represents the first time drilling has occurred on Mars and will provide the means to measure the planet's seismology and heat flow and to carry out precision tracking. Having this data pushes new boundaries in space exploration and will give insight into the formation of our solar system. By comparing the interiors of Earth and Mars, scientists hope to better understand the universe and aid the discovery of other planets that could support life.