Structural Genomics Consortium, Phase IV

Overview

On June 12, 2017, the Honourable Kirsty Duncan, Minister of Science, announced a new $33 million investment to support the Structural Genomics Consortium (SGC) – a Canadian-led, international public-private partnership that conducts basic science on the structures of human proteins and releases the research to the public to accelerate drug discovery and help patients worldwide. This latest investment in the fourth phase of Canadian-based SGC activities, conducted principally at the University of Toronto, includes $11 million in federal funding through Genome Canada, $5 million through the Government of Ontario, and an additional $17 million through pharmaceutical companies. This funding will help translate scientific discoveries into cures for patients with a range of diseases such as cancer, ALS, Huntington’s disease, malaria and tuberculosis.

Proteins, of which humans have an estimated 100,000 different kinds, constitute the key functional and structural components of our own and indeed all species; they are critical to normal development and health. Each protein, which is a long linear chain of amino acids, is folded into an exact three-dimensional shape. This shape, the so-called protein structure, is essential data when designing molecules, such as drugs, that bind to a protein and modify its function.

Since 2004, the Structural Genomics Consortium (SGC) has been working to determine the three-dimensional structure of proteins relevant to human diseases. These structures are made available online in the public domain, without restrictions on their use by industry or academia, to support early-stage drug discovery.

To date, and with ongoing support from Genome Canada and other partners, the SGC is responsible for 13 per cent of all solved human protein structures, an internationally leading share. The SGC also leads an international program, in partnership with other scientists, research agencies and pharmaceutical companies, that generates small inhibitor molecules, called chemical probes, against proteins and makes them available in the public domain. The chemical probes help researchers understand the role of a protein in normal and disease physiology and play an essential role in the early drug-discovery process.

A key area of focus for the SGC is proteins that regulate epigenetics, the study of heritable modifications to gene expression. Understanding how proteins turn specific genes on and off is important for developing therapies to treat many debilitating diseases, such as cancer and neurodegenerative and inflammatory diseases. Based on SGC science, there are now more than 25 clinical trials ongoing, including in Canada.

Another example involves rare diseases. When genetic mutations cause proteins to be misformed, the result can be one of the estimated 7,000 different rare diseases, which together affect approximately one million Canadians. Here, SGC’s ability to define protein structure can make all the difference. Care 4 Rare, a Genome Canada-funded large-scale applied research project, working with the SGC, has identified the structure of a protein that may hold the key to therapy for a rare seizure disorder and has discovered small molecules that bind to this protein and could potentially serve as drugs. More recently, the world’s first genetic model of the condition has been created, and it is now poised to be used for pre-clinical testing of the SGC-enabled theory.

The SGC is considered a leader in open science. By sharing all research results and output with no restrictions on use, SGC’s open science ethos protects against waste and duplication of effort, and strengthens collaboration and innovation by facilitating scientific exchange that is unencumbered by intellectual property considerations.

In its current phase, the SGC is expanding its open science collaborative network to include disease and patient foundations. The SGC is also partnering with clinicians and research hospitals to test its chemical probes on patient samples, a more predictive approach to validating new targets for drug discovery. The project will also provide training for the next generation of Canadian researchers in early-stage drug discovery.

The Canadian arm of the SGC is co-supported by eight pharmaceutical companies and the Ontario Ministry of Research, Innovation and Science (MRIS). SGC has also established a robust collaborative research and training network among Canadian institutions with support from hospital research institutes, the Bill and Melinda Gates Foundation, the Natural Sciences and Engineering Research Council (NSERC), Canadian Institutes of Health Research (CIHR) and the Mitacs program.

The wider research program includes sister SGC sites at Oxford University (UK), the University of Campinas (Brazil), the Karolinska Institute (Sweden), the University of North Carolina (USA) and Goethe University Frankfurt (Germany).

Extracting Signal from Noise: Big Biodiversity Analysis from High-Throughput Sequence Data

Overview

Surveying biodiversity is critical for environmental health and for managing natural resources. It helps to assess the impact of resource development, but also to identify pests, invasive species, and pathogens in a rapid and cost-effective manner. It is essential to Canada’s economic growth in the forestry, agriculture, and fishery sectors and to decision-making in public health. Genetic methods of surveying biodiversity, such as high-throughput sequencing, are being broadly adopted, but bioinformatics has not kept pace with the data being generated. In addition, current methods are geared toward bacteria and similar organisms, rather than multi-celled plants and animals that need monitoring as well.

Drs. Sarah Adamowicz and Paul Hebert, along with colleagues from the University of Guelph, are creating new bioinformatics tools that will facilitate the rapid and accurate processing of DNA data resulting from high-throughput sequencing. The tools will enable the simultaneous analysis of bulk samples, which are made up of many different species. They will include a de-noising tool to detect errors; a method to cluster DNA sequences into species-like units to permit biodiversity analysis; and a method for assigning sequencing data to higher taxonomic categories to unlock functional biological information. The team will combine these various tools into a biodiversity informatics pipeline that can be incorporated into existing web-based platforms for uptake by a broad variety of users.
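To make the idea of clustering DNA sequences into species-like units more concrete, here is a minimal illustrative sketch in Python. It is not the team's method: it assumes the reads are already aligned to the same length, uses a simple greedy rule (each read joins the first cluster whose seed sequence it matches closely enough), and the function names and the 0.97 default threshold are hypothetical choices made only for this example.

```python
def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two pre-aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        return 1.0
    matches = sum(x == y for x, y in zip(a, b))
    return matches / len(a)


def greedy_cluster(seqs: list[str], threshold: float = 0.97) -> list[list[str]]:
    """Group sequences into clusters; the first member of each cluster is its seed.

    Assumption for illustration only: a read belongs to a cluster when its
    identity to that cluster's seed is at least `threshold`.
    """
    clusters: list[list[str]] = []
    for s in seqs:
        for cluster in clusters:
            if identity(s, cluster[0]) >= threshold:
                cluster.append(s)
                break
        else:
            # No existing seed was similar enough, so this read starts a new cluster.
            clusters.append([s])
    return clusters


if __name__ == "__main__":
    reads = ["ACGTACGTAC", "ACGTACGTAA", "TTTTGGGGCC", "TTTTGGGGCA"]
    for i, cluster in enumerate(greedy_cluster(reads, threshold=0.9)):
        print(i, cluster)
```

In this toy input the four reads fall into two clusters of two reads each at a 0.9 threshold. Real bulk-sample data would also need the de-noising and taxonomic-assignment steps described above, which this sketch does not attempt.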

The new biodiversity informatics tools will support large-scale biodiversity research by academics; efficient, accurate, and cost-effective environmental assessments for the mining and pulp-and-paper industries; enhanced capacity and accuracy of regulation; and more rapid and accurate biodiversity data for government and private-sector decision-makers.

CReSCENT: CanceR Single Cell ExpressioN Toolkit

Overview

Tumours are complex mixtures of cancer, immune, and normal cells that interact and change during treatment. The interplay of all three types of cells can dictate development of cancer over time, as well as response or resistance to treatments. Recent advances in microfluidic and DNA sequencing technologies have enabled researchers to simultaneously analyze tens of thousands of single cells from complex tissues, including tumours. Interpreting these data is challenging, due to the lack of high-quality reference sets of each cell type in the body and a lack of methods to link these data back to tumour biology.

Drs. Trevor Pugh of the Princess Margaret Cancer Centre and Michael Brudno of The Hospital for Sick Children are developing the CanceR Single Cell ExpressioN Toolkit (CReSCENT), a scalable and standardized set of novel algorithmic methods, tools, and a data portal deployed on cloud computing infrastructure. To allow comparison of cells in cancerous and healthy tissues, the system will aggregate single-cell genomic data generated by cancer researchers and connect them to international reference data generated by experts from around the world as part of the Human Cell Atlas. This data sharing and aggregation system is a key differentiating factor in CReSCENT that will increase researcher productivity by accelerating execution and comparison of computational methods, as well as providing contextual data for understanding how cells behave within tumour tissues.

This platform, which will be usable by any researcher on any computing platform, will assemble a crucial data resource to navigate the upcoming wave of single-cell cancer genomics research. CReSCENT will bring together researchers across a broad spectrum of scientific areas and disease types and increase the impact of data generated across research programs. In the long term, this system will pave the way for novel single-cell diagnostics and discovery of new drug strategies for improved health care.

Software for Peptide Identification and Quantification from Large Mass Spectrometry Data using Data Independent Acquisition

Overview

Precision medicine gives patients the opportunity to tailor medical and treatment decisions at the individual level to maximize outcomes and minimize adverse effects. It can be used to treat a wide variety of diseases, including cancer. Decisions are often based on the presence and quantity of biomarkers such as proteins in the blood or tissue samples.

Advances in mass spectrometry instruments have made it feasible to discover and measure protein biomarkers, but researchers lack the necessary bioinformatics software to analyze the data. Drs. Bin Ma of the University of Waterloo and Michael Moran of the Hospital for Sick Children are developing this software to enable more sensitive and accurate protein identification and quantification from the mass spectrometry data generated using a method called data independent acquisition (DIA). They expect that their software will significantly increase the total number of proteins identified and quantified in comparison to existing DIA analytical software. It will be especially effective with post-translational modifications (PTMs), which are critical biomarkers in a protein’s function and degradation.

The free availability of the software to academic labs coupled with its superior performance can help health researchers discover and trace disease biomarkers. Within the next decade, the software could become an indispensable tool for many proteomics labs performing DIA analysis throughout the world. The new software may also help commercial partners create value-added new products, services and jobs.

Ultimately, this will lead to improvements in human health and reduction in healthcare costs by enabling early disease detection and diagnosis and by facilitating the selection of optimal treatment for individual patients.

SYNERGx: a computational framework for drug combination synergy prediction

Overview

When just one drug is used to treat cancer, the patient may not respond, or may develop resistance to it. Combination therapy, where two or more drugs are used in treatment, is more likely to be successful. Yet, it is impossible to test all drug combinations in clinical trials due to the high cost of required resources and certain ethical considerations. Computational techniques are therefore required to model the large amount of available data to improve current cancer treatment strategies and propose more efficient combinations of drugs.

Dr. Benjamin Haibe-Kains of the Princess Margaret Cancer Centre is developing SYNERGx, a new computational platform that will integrate multiple pharmacogenomic datasets. These datasets will be used to predict possible combinations of known drugs that can act in synergy, meaning that their combined therapeutic efficacy is greater than the sum of their individual effects.
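The definition of synergy given above can be written as a small calculation. The sketch below is illustrative only and is not SYNERGx code: it assumes each effect is expressed as a fraction of tumour-cell growth inhibited (0 to 1), and all function names are hypothetical. The first score follows the "greater than the sum" wording directly; the second uses Bliss independence, a commonly used alternative reference model that is not mentioned in the text above.

```python
def additive_excess(effect_a: float, effect_b: float, effect_combo: float) -> float:
    """Observed combined effect minus the simple sum of the single-drug effects."""
    return effect_combo - (effect_a + effect_b)


def bliss_excess(effect_a: float, effect_b: float, effect_combo: float) -> float:
    """Observed combined effect minus the Bliss-independence expectation.

    Bliss independence expects a + b - a*b when the two drugs act independently.
    """
    expected = effect_a + effect_b - effect_a * effect_b
    return effect_combo - expected


def is_synergistic(effect_a: float, effect_b: float, effect_combo: float) -> bool:
    """True when the combination beats the sum of the individual effects."""
    return additive_excess(effect_a, effect_b, effect_combo) > 0


if __name__ == "__main__":
    # Hypothetical numbers: each drug alone inhibits 25% of growth, together 75%.
    print(additive_excess(0.25, 0.25, 0.75))
    print(bliss_excess(0.25, 0.25, 0.75))
    print(is_synergistic(0.25, 0.25, 0.75))
```

A positive score suggests synergy under the chosen reference model and a negative score suggests antagonism. Which reference model is appropriate depends on the data and is a modelling decision.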

The platform will implement analytic tools to improve modeling of synergistic drug effects. Users will have access to highly curated drug-combination pharmacogenetics data and an open-source machine-learning pipeline for drug synergy prediction. SYNERGx will also implement a new way to optimize drug-screening studies to identify novel synergistic combinations that can be further validated in preclinical studies and then in clinical trials.

SYNERGx will provide an efficient way to leverage massive investments in pharmacogenomics studies by allowing the integration of otherwise disparate datasets. It represents a major step forward in the design of new therapeutic strategies for cancer.

Computational tools for Data-Independent Acquisition (DIA) for quantitative proteomics and metabolomics

Overview

When cells lose control over their own behaviour or communication with other cells, diseases such as diabetes or cancer can arise. Proteins and small-molecule metabolites are responsible for cells’ behaviour, so identifying and quantifying these molecules is key to understanding how disease happens and how to prevent it.

Mass spectrometry has become the workhorse for proteomics and metabolomics. Drs. Anne-Claude Gingras of the Lunenfeld-Tanenbaum Research Institute and Hannes Röst of the Donnelly Centre for Cellular & Biomolecular Research at the University of Toronto are working with a technology called Data-Independent Acquisition (DIA), in which the mass spectrometer systematically identifies and quantifies the proteins and metabolites present in a sample. DIA has been shown to improve quantitative accuracy, reproducibility and throughput over other methods. Since its introduction, however, this approach has only been applied to small-scale studies and in a relatively small number of laboratories. Limitations to this method are due to the lack of user-friendly software that could enable a scalable analysis of the complex data generated in large-scale biomedical and medical research.

The project builds on the team’s proven strength in DIA data analysis and software development and will result in an integrated set of tools available under an open-source license. To encourage uptake of these tools, documentation, webinars and workshops will be made available to potential users. The results of the project could have long-lasting impact on the health sector in Canada by facilitating research into the root causes of disease and assisting with clinical questions such as patient stratification.

BridGE-SGA: A novel computational platform to discover genetic interactions underlying human disease

Overview

The ability to sequence the entire human genome at increasingly lower cost has led to a fundamental change in biomedical research. But there is a gap between the amount of data available and our ability to understand and interpret that data. Addressing this gap is essential to realize the promise of precision medicine.

Dr. Charles Boone and Dr. Brenda Andrews of the Donnelly Centre for Cellular and Biomolecular Research at the University of Toronto, and Dr. Chad Myers of the University of Minnesota, have worked together to discover that a significant part of our inability to interpret genomic data likely stems from the reality that disease generally arises from complex genetic interactions. While all humans essentially have the same set of genes, most have around five million unique genetic variants. The effect of any one variant depends on its interactions with other variants. So we need to understand not just the millions of genetic differences that affect gene function, but also how all those genes interact with each other. Current computational methods and technologies lack the statistical power to do so.
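To illustrate what it means for the effect of one variant to depend on another, the sketch below computes a pairwise interaction score under a multiplicative model, a convention often used in model-organism genetics. It is an assumption-laden illustration rather than the BridGE method: fitness values are hypothetical relative measures (1.0 means no effect), and the function name is invented for this example.

```python
def interaction_score(fitness_a: float, fitness_b: float, fitness_ab: float) -> float:
    """Observed double-variant fitness minus the multiplicative expectation.

    If the two variants act independently, the expected combined fitness is
    fitness_a * fitness_b. A score near zero means no interaction, a negative
    score means the combination is worse than expected, and a positive score
    means it is better than expected.
    """
    expected = fitness_a * fitness_b
    return fitness_ab - expected


if __name__ == "__main__":
    # Hypothetical values: each variant alone halves fitness; together they
    # reduce it to one eighth, which is worse than the expected one quarter.
    print(interaction_score(0.5, 0.5, 0.125))
```

With millions of variants per person, the number of possible pairs is enormous, which is one reason statistical power becomes the limiting factor.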

Drs. Boone, Andrews and Myers have developed the first complete genetic interaction map for any organism, and have built a computational method, BridGE, to discover genetic interactions. The team is now working to develop an innovative computational platform for genome sequencing data, BridGE-SGA, to enable the discovery of disease-associated genetic interactions from large-scale human genotype data. Their goal is to discover genetic interactions for a variety of diseases. Identifying and understanding these key genetic interactions will improve our ability to interpret data from whole genome sequencing and identify novel gene targets for drug discovery and development.

Synthetic antibody program: Commercial reagents and novel therapeutics (2010)

Overview

Cancer is now, or will shortly become, the number one cause of death in developed countries. Hence, there is an obvious and urgent need to accelerate the development and rational application of new therapies. The central premise of our program is that achieving this goal will require the identification of new therapeutic targets, the rapid development of specific and effective drugs directed against these targets, and the testing of these agents in relevant models of human cancer. Over the past decade, recombinant antibodies that target cancer-associated proteins have emerged as one of the most effective and major classes of targeted therapeutics in oncology. Moreover, while the production of small-molecule drugs remains a costly and slow process, technological advances have enabled the development of therapeutic grade antibodies in an academic setting, which now expands the cancer therapeutic domain beyond that of pharmaceutical companies.

To take advantage of these new developments, the Donnelly Centre at the University of Toronto has established the Toronto Recombinant Antibody Centre (TRAC), a state-of-the-art antibody platform that can be applied to the generation of therapeutic grade antibodies against hundreds of antigens in a high-throughput pipeline. In turn, the TRAC has partnered with the Centre for Drug Research and Development (CDRD) in Vancouver to leverage additional expertise in therapeutic antibody development. Importantly, we have assembled a consortium of leading cancer biologists from the Canadian research community, and together, we have compiled a panel of cancer related proteins that are high-value targets for next-generation cancer therapeutics.

Taken together, our program represents a unique and complete platform for the development of antibody therapeutics in a Canadian academic environment. In a three-year framework, we will generate and validate hundreds of antibodies against a host of cancer-associated targets. These antibodies will be powerful tools for discovery research and a significant subset will be candidates for new therapeutic entities. In summary, the program will have major impact on basic research in cancer biology, on therapeutic options for cancer treatment, and on the development of commercial biotechnology in Canada.

The microbiota at the intestinal mucosa-immune interface: A gateway for personalized health (2012)

Overview

Inflammatory bowel diseases (IBD), such as Crohn’s disease and ulcerative colitis, are incurable debilitating lifelong diseases that can affect children. Early detection is critical to avoiding complications and improving their quality of life. At the moment, however, there is no single test to determine the presence or type of IBD and the tests that exist are very uncomfortable for children. Drs. Alain Stintzi, David Mack and team are developing a simple, non-invasive approach to detecting IBD that will also be more cost effective. Using cutting-edge technology, the scientists will examine intestinal bacteria to develop better ways of identifying IBD and determining its severity. This work could also lead to new treatment, enhancing the quality of life for children everywhere.

Autism spectrum disorders: Genomes to outcomes (2012)

Overview

Genome Canada and CIHR-funded research has already led to some exciting breakthroughs in our understanding of autism spectrum disorder, a complex condition that affects normal brain development, social relationships, communication and behaviour. Among these breakthroughs is the identification of specific DNA anomalies associated with the illness. Now, Drs. Stephen Scherer, Peter Szatmari and team are going to the next level, aiming to identify the remaining genetic risk factors. This ground-breaking work will mark Canada’s contribution to an ambitious international initiative that aims to sequence and analyze the genomes of 10,000 people with autism spectrum disorder. With a more complete understanding of the genetic elements of autism, doctors will be able to make earlier diagnoses, provide better, more personalized care to patients and reduce the enormous cost autism imposes on our health care system.