Large research infrastructures for brain imaging
Brain imaging in the age of big data will require large research infrastructures for acquisition and analysis.
Over the last 30 years, brain imaging has become a powerful tool for clinical, cognitive and fundamental neuroscience. The growth of the field is staggering. It has now started to join the world of ‘big data’, either through huge increases in the quantity of data collected on each subject, or through the study of very large numbers of subjects. For instance, to amass tens of thousands of subjects, the ENIGMA consortium (see: http://enigma.ini.usc.edu/) is pooling many existing studies, while the UK Biobank is scanning 100,000 volunteers in a single study (Miller et al., 2016).
Because of the enormous storage and computing needs resulting from this evolution, the field of brain imaging has become a new use case for High Performance Computing (HPC) centres. This use case is on the roadmap of the Human Brain Project (HBP), one of the European flagships, which has released several dedicated computing platforms, including tools to simulate the brain, with a view to eventually becoming a permanent, pan-European research infrastructure. In North America, CBRAIN is a web-based platform connecting neuroimaging researchers to HPC facilities across Canada.
Large infrastructures
But the phase transition occurring in the brain imaging field is not limited to IT. Large infrastructures dedicated to image acquisition are also emerging, echoing what happened in particle physics decades ago. For instance, where a typical laboratory might use only one or two imaging systems to slice up mouse brains, the new HUST-Suzhou Institute for Brainsmatics in China boasts 50 automated machines that snap high-definition pictures of each slice and reconstruct them into a 3D picture. In Germany, the Jülich research centre is leveraging its unique know-how in postmortem neuroimaging (Amunts et al., 2013) to build a high-throughput microscopy facility dedicated to the human brain, with data rates of multiple terabytes a day.
Some of these emerging facilities are simply too expensive for every country to build its own, even a wealthy one. They are therefore founded as service platforms that scientists from around the world can access, much as astronomers share telescope time. Some of these large research infrastructures are centralised facilities, in the spirit of CERN in Geneva, such as the 11.7 tesla human MRI scanner under construction at the Neurospin institute (see Fig. 1). Others are physically distributed resources, in the spirit of seismographic networks, such as CATI’s distributed network for population imaging.
Ultra High Field MRI
Large scientific instruments are often built to open up new ‘discovery spaces’, i.e., they are unique in terms of sensitivity and resolution. The most important discoveries that are made using such instruments are often ones that were not foreseen in the original science case. Ultra High Field MRI is a challenging but promising way to improve the spatial and temporal resolutions of brain images, as well as to investigate novel contrast mechanisms.
While 3T magnets (high field) are the standard for human MRI, the last decade has seen a great deal of development to tame 7T MRI (ultra high field), for instance fighting severe signal inhomogeneities with parallel transmission and reception, or developing novel compressed sensing acquisition schemes to speed up scans. 7T systems have now matured to the point that key manufacturers have obtained FDA approval, turning them into genuine clinical devices. Iron-related contrast is probably the main novelty brought by clinical 7T MRI, enhancing the contrast of deep brain structures known to contain iron at various concentrations.
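The compressed sensing idea mentioned above can be made concrete with a toy example: acquire only a fraction of k-space and recover the image by alternating a data-consistency step with a sparsity-promoting step. The sketch below is a deliberately simplified NumPy illustration (random Cartesian undersampling, sparsity assumed in image space, iterative soft-thresholding); actual 7T pipelines use far more sophisticated sampling patterns, sparsifying transforms and solvers.

```python
# Toy compressed sensing reconstruction: a sparse 64x64 "image" is recovered
# from roughly 25% of its k-space samples. Illustrative only; not a real MRI pipeline.
import numpy as np

rng = np.random.default_rng(0)
n = 64

# Sparse synthetic image: a handful of bright voxels on a dark background.
image = np.zeros((n, n))
image[rng.integers(0, n, 40), rng.integers(0, n, 40)] = 1.0

# Simulate a faster scan: keep only ~25% of k-space (random Cartesian mask).
mask = rng.random((n, n)) < 0.25
kspace = np.fft.fft2(image) * mask

def soft_threshold(x, t):
    """Shrink values towards zero; this is the sparsity-promoting step."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

recon = np.zeros((n, n))
for _ in range(200):
    k = np.fft.fft2(recon)
    k[mask] = kspace[mask]                                   # enforce data consistency
    recon = soft_threshold(np.real(np.fft.ifft2(k)), 0.02)   # enforce sparsity

print("relative reconstruction error:",
      np.linalg.norm(recon - image) / np.linalg.norm(image))
```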
Next generation
A few research centres around the world are now exploring paths to the next generation, which will probably belong to the large instrument world. At Neurospin, the French Atomic Energy Commission, which hosts the physicists at the origin of the CERN magnets, is developing an Extreme High Field (EHF) 11.7T MRI for humans (Schild et al., 2016). The goal is to reach the brain’s mesoscopic scale of around a hundred micrometres, which probably corresponds to the size of the cortical building blocks. In addition, while conventional MRI relies on the observation of water protons, EHF MRI will also give access to endogenous chemical species such as sodium or potassium, bringing new insights into brain metabolism and the pathophysiology of diseases. It will also open the door to exogenous chemical species such as lithium, which would bring insights into the mechanisms of action of this drug used to treat psychiatric disorders.
This outstanding scanner, which is expected to generate exciting serendipitous discoveries, will be embedded in an institute hosting multiple instruments, including the world’s strongest MRI magnet for rodents (17T), and interdisciplinary teams of researchers. This is fostering a vibrant research ecosystem that allowed, for instance, Neurospin to become the birthplace of Scikit-learn (see: http://scikit-learn.org/), a widely used machine learning library.
During the last decade, projects involving the multi-centre collection of neuroimaging data have multiplied for clinical research, therapeutic trials and massive data mining with an artificial intelligence perspective. Because of scanner heterogeneity, these projects often stumble on the cross-centre harmonisation of acquisitions required to make optimal use of such datasets. In addition, the logistical complexity of collecting and analysing multicentre data generally involves prohibitive costs beyond the reach of most projects, as evidenced by the experience of the pioneering North American ADNI project (USD $100m (~€85.57m) invested to follow a thousand North American subjects in the context of Alzheimer’s disease).
The harmonisation problem
In epidemiological studies of ageing, the harmonisation problem has a simple solution thanks to very open inclusion criteria: it is sufficient to install identical dedicated machines in a few large population basins, as the UK Biobank programme does. Unfortunately, this approach is not appropriate when more stringent inclusion criteria require the use of many pre-existing scanners.
One might hope that, by accumulating millions of images, the magic of big data will overcome the confounds linked to scanner specificities. This is one of the challenges tackled by the Human Brain Project, which aims at aggregating images acquired in a hundred European hospitals. A third approach was developed in France, which consisted of taking advantage of the launch of Memento, a large cohort dedicated to the study of the natural history of Alzheimer’s disease, to build a platform capable of meeting the needs of various studies. This platform, called CATI, relied on experts scattered all over France to iteratively build what physicists would call a large infrastructure dedicated to cohort imaging, i.e. a network of a hundred scanners whose acquisition parameters (MRI sequences, PET reconstruction parameters, etc.) were set so as to minimise site effects (a toy illustration of such a site effect is sketched below). Today, the network is constantly monitored to react to hardware or software updates. The images are collected by a secure web service, quality checked, and analysed in a centralised facility.
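To make the notion of a site effect concrete, the following sketch (not CATI’s actual pipeline) simulates a volumetric measurement that carries an additive offset depending on the scanner, and removes it with a simple linear model while preserving the biological signal of interest. All volumes, offsets and effect sizes here are invented for illustration.

```python
# Simulated example of an additive site effect on a hippocampal volume measurement,
# estimated and removed by regressing on site indicators alongside age.
import numpy as np

rng = np.random.default_rng(1)
n_subjects = 300

site = rng.integers(0, 3, n_subjects)          # which of three scanners was used
age = rng.uniform(55, 85, n_subjects)
site_offset = np.array([0.0, 120.0, -80.0])    # simulated scanner biases (mm^3)

# Simulated volume: true age effect + additive site bias + measurement noise.
volume = (4000.0 - 15.0 * (age - 70.0)
          + site_offset[site]
          + rng.normal(0.0, 100.0, n_subjects))

# Design matrix: intercept, centred age, and site dummies (reference = site 0).
X = np.column_stack([np.ones(n_subjects), age - 70.0,
                     (site == 1).astype(float), (site == 2).astype(float)])
beta, *_ = np.linalg.lstsq(X, volume, rcond=None)

# Subtract the estimated site offsets, keeping the age-related signal intact.
harmonised = volume - X[:, 2:] @ beta[2:]
print("estimated offsets of sites 1 and 2 relative to site 0:", beta[2:].round(1))
```

In practice, harmonising acquisition parameters across the network, as described above, aims at keeping such offsets small in the first place; statistical corrections of this kind only address what remains.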
The role of CATI
CATI acts as a hub for image analysis technologies, accelerating the availability of advances made in France or elsewhere for clinical research. CATI aims at playing a role in academic clinical research similar to that played by Contract Research Organisations in clinical trials. In addition to using analysis software developed by the community, CATI conducts its own R&D programmes to minimise, as much as possible, the biases associated with multiple types of scanners and to improve the robustness of analysis algorithms. It also develops original approaches to optimise or even automate the quality control of analyses.
By making multicentre imaging affordable through economies of scale, CATI has accelerated the emergence of new clinical studies. Today, more than thirty studies call upon its services, across a wide spectrum of pathologies: Alzheimer’s disease, Lewy body dementias, fronto-temporal dementias, Parkinson’s and Huntington’s diseases, amyotrophic lateral sclerosis, bipolar disorders, and so on.
CATI has already analysed images from more than 10,000 subjects, generating a harmonised multi-pathology database with no equivalent in the world. This database will gradually be made available to the machine learning community to feed the search for distributed biomarkers beyond the reach of conventional statistical approaches.
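As a hedged illustration of how such a database might be mined, the sketch below trains a cross-validated classifier on simulated features while grouping the folds by acquisition site, so that the reported accuracy is not inflated by residual site effects. The features, labels and site assignments are entirely synthetic, and scikit-learn is used only as an example toolkit; this is not a description of CATI’s analysis pipeline.

```python
# Simulated biomarker search: logistic regression with site-grouped cross-validation,
# so that subjects from the same scanner never appear in both training and test folds.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GroupKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(2)
n_subjects, n_features = 400, 50

X = rng.normal(size=(n_subjects, n_features))   # e.g. regional volumes or thicknesses
y = rng.integers(0, 2, n_subjects)              # e.g. patient vs control labels
X[y == 1, :5] += 0.4                            # weak signal distributed over 5 features
site = rng.integers(0, 10, n_subjects)          # acquisition site of each subject

model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
scores = cross_val_score(model, X, y, cv=GroupKFold(n_splits=5), groups=site)
print("site-grouped cross-validated accuracy:", scores.mean().round(3))
```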
The ultimate large infrastructure will probably be the worldwide network of clinical scanners, which will have to be harmonised and monitored to allow optimal use of biomarkers in the clinic, but also to feed the research community with population-level databases leading to constant refinement of these biomarkers, in the spirit of the HBP dream. The optimal management of this global imaging infrastructure will probably require the involvement of the scanner vendors, pushing them to adhere to common standards while preserving their capacity for innovation.
References
Miller et al., Nature Neuroscience, 2016
Amunts et al., Science, 2013
Schild et al., IEEE Transactions on Applied Superconductivity, 2016
Jean-Francois Mangin
CATI director
Cyril Poupon
UNIRS Director
CEA
+33 (1) 690 878 38
jfmangin@cea.fr
https://sites.google.com/view/jfmangin/
http://cati-neuroimaging.com/