SPECIAL THEME: BIOINFORMATICS
ERCIM News No.43 - October 2000

Human Brain Informatics – Understanding Causes of Mental Illness

by Stefan Arnborg, Ingrid Agartz, Mikael Nordström, Håkan Hall and Göran Sedvall


The Human Brain Informatics (HUBIN) project aims to develop and apply methodology for cross-domain investigations of mental illness, particularly schizophrenia. A comprehensive database with standardized information on individual patients and healthy control persons is being built and analyzed using modern data mining and statistical techniques.

Mental disease is a complex phenomenon whose causes are not known, and only symptomatic therapy exists today. Moreover, mental disease affects perhaps one third of all people at least once during their lifetime. Approximately 1% of all people suffer from schizophrenia. The more severe cases lead to tragic lifelong disability, and the annual costs run to many billions of euros even in moderate-sized countries like Sweden. On the other hand, some of our greatest artists and scientists suffered from schizophrenia. Large research projects have mapped mental diseases so that their typical manifestations are well known within each of several specialized medical domains, such as genetics, physiology, psychiatry, neurology and brain morphology. But little is known about the relationships between the different domains.

The Project
The HUBIN project started in 1998 and is carried out at the Clinical Neurosciences Department of the Karolinska Institute, with participation by the Swedish Institute of Computer Science (SICS), IBM and several medical schools: Lund and Umeå in Sweden, Cardiff in the UK, and Iowa City in the USA. It is financed by a grant from the Swedish Wallenberg foundation, with supplementary funding from SICS, NUTEK, Teknikbrostiftelsen and some private foundations.

Project Goals
The long-term aim of the project is to find causes of and effective therapies for mental illness. The short-term aims are to develop methodology for intra-domain and cross-domain investigations of schizophrenia by building a comprehensive database with information about individual patients and control persons, and by applying modern statistical and data mining technology to it.

Possible Causes of Schizophrenia
Schizophrenia is believed to develop as a result of disturbed signalling within the human brain. Causes of these disturbances can be anomalous patterns of connections between neurons of the functional regions, loss or anomalous distribution of neurons, or disturbances in the biochemical signalling complex. The disease is known to be significantly influenced by genetic factors, although in a complex way which has not yet been conclusively attributed to individual genes. It is also related to disturbances in early (pre-natal) brain development, in ways not yet known but statistically confirmed by investigations of birth and maternity records. The shape of the brain is also affected, but it is difficult to trace the complex interactions between disease, medication and brain morphology.

Complex Sets of Medical Information
Modern medical investigations create enormous and complex data sets, whose analysis and cross-analysis pose significant statistical and computing challenges. Some of the data used in HUBIN are:

• Genetic micro-array technology can measure the activity of many thousands of genes from a single measurement on a microscopic (post mortem) brain sample. These data are noisy and extremely high-dimensional, so standard regression methods fail miserably. Support vector and mixture modelling techniques are being investigated. It is necessary to combine the hard data with subjective information in the form of hypotheses on the role of different genes. Heredity investigations based on disease manifestation and marker maps of relatives also yield large amounts of data, typically with extremely weak statistical signals when several genes are being hunted at once.

• Medical imaging methods can give precise estimates of the in vivo anatomy of the brain and the large individual variations in shape and size of white (axon, glia), gray (neurons) and wet (CSF in ventricles and outside the brain) matter in a large number of anatomical regions of the brain.

• Diffusion tomography gives approximate measures of the diffusion tensor in small cubes (ca 3 mm side). The tensor indicates the number and direction of the axons that define the long-distance signalling connections of the brain, and can thus be used to get approximate measures of the signalling connectivity of the brain.

• Functional MRI measures the metabolism (blood oxygenation) that approximates neural activity with high resolution. These investigations give extremely weak signals, and for inference it is usually necessary to pool several investigations.
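As an illustration of how a diffusion tensor can be turned into connectivity-related numbers, the sketch below computes two standard summaries for a single cube: the fractional anisotropy (how strongly diffusion prefers one direction) and the principal diffusion direction (a rough proxy for axon orientation). This is not taken from the HUBIN software; the function names and the tensor values are made up for the example.

```python
import math

def fractional_anisotropy(D):
    """Fractional anisotropy of a symmetric 3x3 tensor.

    Uses Frobenius norms, which for a symmetric matrix is equivalent
    to the usual eigenvalue formula and needs no eigen-solver.
    """
    mean_diffusivity = (D[0][0] + D[1][1] + D[2][2]) / 3.0
    norm_sq = sum(D[i][j] ** 2 for i in range(3) for j in range(3))
    if norm_sq == 0.0:
        return 0.0
    # Squared norm of the tensor with its isotropic part removed.
    dev_sq = sum(
        (D[i][j] - (mean_diffusivity if i == j else 0.0)) ** 2
        for i in range(3) for j in range(3)
    )
    return math.sqrt(1.5 * dev_sq / norm_sq)

def principal_direction(D, iterations=100):
    """Dominant eigenvector of D, found by simple power iteration."""
    v = [1.0, 1.0, 1.0]
    for _ in range(iterations):
        w = [sum(D[i][j] * v[j] for j in range(3)) for i in range(3)]
        length = math.sqrt(sum(x * x for x in w))
        if length == 0.0:
            break
        v = [x / length for x in w]
    return v

# Hypothetical tensor for one cube: diffusion is much faster along x.
voxel = [[1.7, 0.0, 0.0],
         [0.0, 0.3, 0.0],
         [0.0, 0.0, 0.3]]

fa = fractional_anisotropy(voxel)
direction = principal_direction(voxel)
print(f"fractional anisotropy: {fa:.3f}")
print("principal direction:", [round(c, 3) for c in direction])
```

A value near 0 means diffusion is equally easy in all directions, while a value near 1 suggests a coherent bundle of fibres running along the principal direction.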

Post-mortem whole brain investigations can give extremely high resolution maps of the biochemical signalling system of the brain, and of gene activity in the brain. More than 50% of the active human genome is believed to be related to brain development, and very little is known about the mechanisms involved.
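One way to see why noisy, high-dimensional gene activity data defeat naive analysis is to simulate measurements that contain no signal at all. The sketch below is purely illustrative: the sample and gene counts are invented, and it shows only the general statistical effect, not any HUBIN analysis. It generates random "gene activities" for 20 subjects and reports how strongly the best of 5000 noise genes appears to correlate with a patient/control label.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(0)           # fixed seed so the run is repeatable
n_samples, n_genes = 20, 5000    # hypothetical sizes: few subjects, many genes
labels = [1.0] * 10 + [0.0] * 10 # 10 "patients", 10 "controls"

# Every gene is pure Gaussian noise, unrelated to the labels.
best = max(
    abs(pearson([rng.gauss(0.0, 1.0) for _ in range(n_samples)], labels))
    for _ in range(n_genes)
)
print(f"largest |correlation| among {n_genes} pure-noise genes: {best:.2f}")
```

With so many candidate genes and so few subjects, some noise gene will look convincingly associated with the disease by chance alone, which is one reason prior hypotheses and regularized methods are needed.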

The patient's mental state is assessed by a psychiatrist using standardized questionnaires. It is vital that the subjective information entered is standardized and quality assured. Obviously, it is difficult to get high-quality answers to 500 questions from a patient, so the number of questions put to patients must be carefully balanced.

As a first sifting of this large information set, summary indices are computed and entered in a relational database, where standard data mining and statistical visualization techniques give a first set of promising lines of investigation. Information regarding the mental state of individuals is extremely sensitive, and its collection and use are regulated by the ethics councils of participating universities and hospitals. Identities of patients and controls are not stored in the database, but it must still be possible to correlate individuals across domains. This is accomplished with cryptographic methods.
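The article does not say which cryptographic methods are used. One common approach, shown here only as an assumption, is to replace each identity with a keyed hash (HMAC), so that the same person receives the same pseudonym in every domain while the identity itself never enters the database. The key, identifiers and record fields below are hypothetical.

```python
import hashlib
import hmac

def pseudonym(secret_key: bytes, personal_id: str) -> str:
    """Derive a stable pseudonym from an identity using HMAC-SHA256.

    Without the secret key, the pseudonym cannot feasibly be recomputed
    from a guessed identity, nor traced back to one.
    """
    return hmac.new(secret_key, personal_id.encode("utf-8"), hashlib.sha256).hexdigest()

# Hypothetical key, held by a trusted party outside the research database.
key = b"hypothetical-project-secret"

# Records from two domains carry only the pseudonym, never the identity.
imaging_record = {"subject": pseudonym(key, "patient-0001"), "ventricle_volume_ml": 23.5}
genetics_record = {"subject": pseudonym(key, "patient-0001"), "marker": "A"}

# The records can still be joined across domains.
same_person = imaging_record["subject"] == genetics_record["subject"]
print("records refer to the same subject:", same_person)
```

The design choice in this sketch is that linking requires no lookup table of identities: anyone holding the key can compute the pseudonym at data entry, and the database itself stores nothing that identifies a person directly.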

Conclusions
The current activities are aimed at showing the feasibility of our interdisciplinary approach when coupled with standardized recording of clinical information.

Presently, the main efforts go into collecting standardized clinical measurements and evaluating statistical and visual analysis methods for image and genetic information. Current activities aim at relating brain morphology (size, orientation and shape of various regions and tissue types of the brain) to physiological and psychiatric conditions, and at investigating relationships between the different domains.

Needless to say, the current project is only one, and as yet a small one, of many projects worldwide aimed at improving conditions for persons affected by mental illness. In order to connect these many groups and improve communication between them, our efforts also involve intelligent text mining for rapidly scanning the literature.

We are building a web site, http://hubin.org/about/index_en.html, for communication between researchers worldwide, and between medical experts and relatives of affected persons on a national (native-language) basis.

Links:
HUBIN: http://hubin.org/about/index_en.html 

Please contact:
Stefan Arnborg - Nada, KTH
Tel +46 8 790 71 94
E-mail: stefan@nada.kth.se
http://www.nada.kth.se/~stefan/