Brown Group RNA Genomics, Department of Biochemistry, University of Otago, New Zealand

My research group at the University of Otago uses bioinformatics, genomics, and experiments to understand the RNA output of genomes.

We study and publish on a broad range of organisms: viruses, bacteria, archaea, fungi, plants and humans.

The common theme is development and use of bioinformatics and genomic tools, particularly to analyse RNA genes. These are complemented by experimental assays and collaborations in specific areas.

The applied aspects of this work are aimed at treating infectious disease and at preventing and responding to climate change.

We are located in the Biochemistry Department at the University of Otago and are involved in University, national and international consortia and initiatives (e.g. Genetics Otago, the Maurice Wilkins Centre and the Joint Genome Institute (JGI)).

Current vacancies (11/2023).

PhD and other students are welcome on any of the projects below, but particularly on antibiotic discovery to combat multidrug-resistant bacteria. Research funding is available, but students would need to be self-funded or apply for University of Otago Doctoral Scholarships. A doctoral scholarship is likely to be guaranteed if you have an 'A' in a research degree from a NZ university.

Current research.

Microbe-host interactions

We are mining genome-wide data from diverse organisms. These studies aim to identify non-coding RNAs and regulatory elements in these genomes, and protein-encoding genes with specific defence and pathogenicity functions.

Project areas include

Plant interactions with pathogens, fungal endophytes and their beneficial compounds (notably Fungal Volatile Organic Compounds, FVOC)
Human bacterial pathogen interactions with their viruses (bacteriophages)
Methanogen interactions with their viruses
Human interactions with their viruses
Discovery of new antibiotics from the viruses of human pathogens
Discovery of new antibiotic targets in antimicrobial resistant bacteria (AMR)

Discovery of new genes and gene regulation mechanisms.

The expression of genes as RNA or protein must be carefully regulated in all species. We are investigating this regulation using bioinformatics and wet-lab experiments in a range of systems. Software and datasets from the Brown group are available through our servers, by download, or through collaborators. Current projects use both computational and experimental tools to test for new types of gene expression control mechanisms.

Databases and tools for genomics.

We have a strong track record of delivering informatic tools to the research community and of applying custom solutions to problems. We have successfully mined genomes, metagenomes and viromes. Our first tool (TransTerm) was first published over 30 years ago; it includes data and tools related to the translation of RNA. Virus research tools and databases include HBVRegDB, CRISPRTarget, CRISPRSuite and PredVirusHost.

Current Members Brown/Lim 2023

David Chyou. Post-Doc. Bacterial RNA transcription termination, Mushrooms and Truffles, Protein stability, advanced bioinformatics and genomics
Andy Nilsen. Post-Doc. Mushrooms and Truffles (mainly with Botany)
Sravya Garimella. ARF
Sofia Moreira. PhD Student. Bacterial RNA transcription termination
Mandana Baharlou. PhD Student. Bacterial RNA transcription promoter regulation (submitted 2023)
Thomas Ware. PhD Student. Bacterial RNA gene expression
Anh Thu Phan. PhD Student. RNA-protein interactions (Lim primary supervisor)
Gabrielle Chieng. PhD Student. Introns and uORFs (Lim primary supervisor)
Finn Dobbie. MSc Student. Mushrooms and Truffles
Megan Addison. MSc Student. Methanogen viruses, CRISPR RNA (completed 2023)
Will Fox. MSc Student. Mushrooms and Truffles (completed 2023)
Nicole Guise. MSc Student. CRISPR RNA in Forensics (submitted 2023)

Brown group birthday lunch photo Nov 2023, Chris, Sofia, Anh, Sravya, Gabrielle, Finn, Lim, Mandana, missing David Chyou, Thomas Ware

Brown group lunch photo, Oct 2022.

Funding bodies

Thanks to current and prior funders for projects we contribute to.

MBIE. Fungal volatile organic compounds for sustainable agriculture in a changing environment (2022-2027).

HRC Explorer Grant. Novel Antibiotics (2023-2025)

The Marsden Fund (2019-2024).

University of Otago Research Committee.

Human Frontier Science Organisation.

Joint Genome Institute. Projects: Trichoderma genus wide analysis, Truffle-like fungi

NeSI. Projects 2013-2023: Fungal genomics, Greenshell Mussel Genomics, CRISPRDetect, Genomic data mining, Viruses and viral defences in the Biosphere, Data Mining of Viral Genomes, Massively parallel protein similarity searches, Gene discovery in microbial genomes.

Publications.

Google Scholar

Some current collaborators.

Chun Shen Lim (Biochemistry, Otago). Functions of introns in eukaryotic RNA.
Joe Wade (New York, USA). RNA transcription termination.
Peter Fineran (Microbiology, Otago). The discovery of CRISPR elements in bacterial genomes and their targets in viral (bacteriophage) genomes.
David Orlovich and Tina Summerfield (Botany, Otago). New Zealand native mushroom genomes (Taonga).
Artemio Mendoza and John Hampton (Lincoln, NZ). Genomics of fungal-plant interactions: gene expression bioinformatics, fungal non-coding RNAs, and natural fungal compounds beneficial to plants.
Herve Le Hir (Paris). Functions of introns in eukaryotic RNA.
Irina Druzhinina (London). Roles of RNA in pathogenic and beneficial fungi.

Software and Resources.

Resources from Chris Brown's Research Group at the University of Otago, New Zealand.

Major headings are at the top, with links to lower down the page, then links to the resources.

Most of the tools listed here have been described in publications, and the abstracts of those publications are shown.

Regulatory Genomics - translational control of gene expression:

Transterm, 3' UTR Regulatory Elements, Scan for Motifs

Prokaryotic non-coding RNA including CRISPR-Cas:

CRISPRSuite comprises: CRISPRDetect, CRISPRDirection, CRISPRTarget, tracrRNA prediction, CRISPRBank, and CRISPRHost

Link to a community resource for comparison between tools: Bioinformatics for CRISPR Biology

Draft standard CRISPR array interchange format in gff3 for array representation: CRISPR Array gff3

Viral non-coding RNA:

HBVRegDB, and the Viral part of TransTerm

Genomic studies:

Kina transcriptome, Genomics in the NZ Bush

TransTerm. A species centric database of key mRNA regions and features (e.g. stop codon usage, stop signal usage, initiation context).

More Information:

Transterm: a database to aid the analysis of regulatory sequences in mRNAs (Jacobs et al 2009).

Grant H. Jacobs, Augustine Chen, Stewart G. Stevens, Peter A. Stockwell, Michael A. Black, Warren P. Tate and Chris M. Brown

Messenger RNAs, in addition to coding for proteins, may contain regulatory elements that affect how the protein is translated. These include protein and microRNA-binding sites. Transterm (http://mRNA.otago.ac.nz/Transterm.html) is a database of regions and elements that affect translation with two major unique components. The first is integrated results of analysis of general features that affect translation (initiation, elongation, termination) for species or strains in Genbank, processed through a standard pipeline. The second is curated descriptions of experimentally determined regulatory elements that function as translational control elements in mRNAs. Transterm focuses on protein binding sites, particularly those in 3′-untranslated regions (3′-UTR). For this release the interface has been extensively updated based on user feedback. The data is now accessible by strain rather than species, for example there are 10 Escherichia coli strains (genomes) analysed separately. In addition to providing a repository of data, the database also provides tools for users to query their own mRNA sequences. Users can search sequences for Transterm or user defined regulatory elements, including protein or miRNA targets. Transterm also provides a central core of links to related resources for complementary analyses.
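To make one of the general features mentioned above more concrete, here is a minimal sketch of how stop codon usage could be tallied for a set of coding sequences. It is an illustration only, not the Transterm pipeline: the function name `stop_codon_usage` and the toy sequences are hypothetical, and only the three stop codons of the standard genetic code are considered.

```python
from collections import Counter

# Stop codons of the standard genetic code (DNA alphabet).
STOP_CODONS = {"TAA", "TAG", "TGA"}

def stop_codon_usage(cds_sequences):
    """Count which stop codon terminates each coding sequence."""
    counts = Counter()
    for cds in cds_sequences:
        seq = cds.upper().replace("U", "T")  # accept RNA or DNA input
        if len(seq) < 3 or len(seq) % 3 != 0:
            continue  # skip partial or out-of-frame entries
        last_codon = seq[-3:]
        if last_codon in STOP_CODONS:
            counts[last_codon] += 1
    return counts

# Hypothetical toy input: three complete CDS and one truncated entry.
toy_cds = ["ATGAAATAA", "ATGGGGTGA", "ATGCCCTAA", "ATGCC"]
usage = stop_codon_usage(toy_cds)
print(dict(usage))
```

A species-level table would simply aggregate such counts over all annotated coding sequences of a genome; extending the window beyond the last codon would give a comparable view of the wider stop signal context.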

Translation Efficiency. Understanding the relationship between proteins and the mRNAs that encode them. Genes under translational control. The poor correlation between protein and RNA levels in human cells.

More Information:

In Silico Estimation of Translation Efficiency in Human Cell Lines: Potential Evidence for Widespread Translational Control (Stevens and Brown, PLoS ONE 2013)

Stewart G. Stevens, Chris M. Brown

Recently large scale transcriptome and proteome datasets for human cells have become available. A striking finding from these studies is that the level of an mRNA typically predicts no more than 40% of the abundance of protein. This correlation represents the overall figure for all genes. We present here a bioinformatic analysis of translation efficiency – the rate at which mRNA is translated into protein. We have analysed those human datasets that include genome wide mRNA and protein levels determined in the same study. The analysis comprises five distinct human cell lines that together provide comparable data for 8,170 genes. For each gene we have used levels of mRNA and protein combined with protein stability data from the HeLa cell line to estimate translation efficiency. This was possible for 3,990 genes in one or more cell lines and 1,807 genes in all five cell lines. Interestingly, our analysis and modelling shows that for many genes this estimated translation efficiency has considerable consistency between cell lines. Some deviations from this consistency likely result from the regulation of protein degradation. Others are likely due to known translational control mechanisms. These findings suggest it will be possible to build improved models for the interpretation of mRNA expression data. The results we present here provide a view of translation efficiency for many genes. We provide an online resource allowing the exploration of translation efficiency in genes of interest within different cell lines (http://crispr.otago.ac.nz/TranslationEfficiency).
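The idea of combining mRNA level, protein level and protein stability can be sketched with a simple steady-state model. This is an assumption made for illustration and may differ from the modelling used in the paper: protein synthesis (rate constant k_tl times mRNA level) is taken to balance first-order degradation, dilution by cell growth is ignored, and the function name and input values are hypothetical.

```python
import math

def translation_efficiency(protein_level, mrna_level, protein_half_life_h):
    """Steady-state estimate of translation efficiency.

    Assumes dP/dt = k_tl * M - k_deg * P = 0, so k_tl = k_deg * P / M,
    with k_deg = ln(2) / half-life.
    """
    if mrna_level <= 0 or protein_half_life_h <= 0:
        raise ValueError("mRNA level and half-life must be positive")
    k_deg = math.log(2) / protein_half_life_h
    return k_deg * protein_level / mrna_level

# Hypothetical values: 1000 protein copies, 10 mRNA copies, 20 h half-life.
te = translation_efficiency(protein_level=1000.0, mrna_level=10.0,
                            protein_half_life_h=20.0)
print(f"proteins per mRNA per hour: {te:.3f}")
```

Under this model, two genes with the same mRNA and protein levels but different protein half-lives receive different efficiency estimates, which is why stability data are needed in addition to the abundance measurements.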

CRISPRTarget. Discovery of the functional targets of CRISPR RNA elements in viral (bacteriophage), mobile elements, or chromosomal DNA. Computational or bioinformatic prediction of CRISPR targets.

More Information: CRISPRTarget: Bioinformatic prediction and analysis of crRNA targets (Biswas et al, 2013)

Ambarish Biswas, Joshua N. Gagnon, Stan J.J. Brouns, Peter C. Fineran and Chris M. Brown

The bacterial and archaeal CRISPR/Cas adaptive immune system targets specific protospacer nucleotide sequences in invading organisms. This requires base pairing between processed CRISPR RNA and the target protospacer. For type I and II CRISPR/Cas systems, protospacer adjacent motifs (PAM) are essential for target recognition, and for type III, mismatches in the flanking sequences are important in the antiviral response. In this study, we examine the properties of each class of CRISPR. We use this information to provide a tool (CRISPRTarget) that predicts the most likely targets of CRISPR RNAs (http://crispr.otago.ac.nz/CRISPRTarget). This can be used to discover targets in newly sequenced genomic or metagenomic data. To test its utility, we discover features and targets of well-characterized Streptococcus thermophilus and Sulfolobus solfataricus type II and III CRISPR/Cas systems. Finally, in Pectobacterium species, we identify new CRISPR targets and propose a model of temperate phage exposure and subsequent inhibition by the type I CRISPR/Cas systems.
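The two ingredients described above, spacer-protospacer matching and a check of the flanking motif, can be sketched as follows. This is a toy illustration, not the CRISPRTarget scoring scheme: it scans one strand only, allows a fixed number of mismatches, and looks for a motif immediately 3' of the match. Which flank and which strand carry the PAM differ between CRISPR types, so the `NGG` default, the function names and the sequences are all assumptions.

```python
def hamming(a, b):
    """Number of mismatching positions between equal-length strings."""
    return sum(x != y for x, y in zip(a, b))

def pam_matches(flank, pam):
    """True if flank matches pam, where 'N' in pam matches any base."""
    return len(flank) == len(pam) and all(
        p in ("N", f) for f, p in zip(flank, pam)
    )

def find_protospacers(spacer, target, pam="NGG", max_mismatches=1):
    """Scan one strand of target for spacer-like sites and check the 3' flank."""
    hits = []
    n = len(spacer)
    for i in range(len(target) - n + 1):
        mismatches = hamming(spacer, target[i:i + n])
        if mismatches <= max_mismatches:
            flank = target[i + n:i + n + len(pam)]
            hits.append((i, mismatches, pam_matches(flank, pam)))
    return hits

# Hypothetical target: a perfect match followed by TGG, then a
# one-mismatch copy followed by TTT.
spacer = "GATTACAGGCTA"
target = "TTTT" + "GATTACAGGCTA" + "TGG" + "CCCC" + "GATTACTGGCTA" + "TTT"
hits = find_protospacers(spacer, target)
for start, mismatches, has_pam in hits:
    print(start, mismatches, has_pam)
```

Each hit is reported as (start position, mismatch count, flank matches motif), so candidate targets can be ranked by how well both criteria are met.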

CRISPRDetect

More Information: CRISPRDetect: A flexible algorithm to define CRISPR arrays (2016)

Biswas, A., R. H. Staals, S. E. Morales, P. C. Fineran and C. M. Brown

BACKGROUND: CRISPR (clustered regularly interspaced short palindromic repeats) RNAs provide the specificity for noncoding RNA-guided adaptive immune defence systems in prokaryotes. CRISPR arrays consist of repeat sequences separated by specific spacer sequences. CRISPR arrays have previously been identified in a large proportion of prokaryotic genomes. However, currently available detection algorithms do not utilise recently discovered features regarding CRISPR loci. RESULTS: We have developed a new approach to automatically detect, predict and interactively refine CRISPR arrays. It is available as a web program and command line from bioanalysis.otago.ac.nz/CRISPRDetect. CRISPRDetect discovers putative arrays, extends the array by detecting additional variant repeats, corrects the direction of arrays, refines the repeat/spacer boundaries, and annotates different types of sequence variations (e.g. insertion/deletion) in near identical repeats. Due to these features, CRISPRDetect has significant advantages when compared to existing identification tools. As well as further support for small medium and large repeats, CRISPRDetect identified a class of arrays with 'extra-large' repeats in bacteria (repeats 44-50 nt). The CRISPRDetect output is integrated with other analysis tools. Notably, the predicted spacers can be directly utilised by CRISPRTarget to predict targets. CONCLUSION: CRISPRDetect enables more accurate detection of arrays and spacers and its gff output is suitable for inclusion in genome annotation pipelines and visualisation. It has been used to analyse all complete bacterial and archaeal reference genomes.
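The repeat-spacer layout of an array can be illustrated with a deliberately naive sketch that looks for exact k-mers recurring at regular spacing. CRISPRDetect does considerably more than this (variant repeats, direction, boundary refinement); the function name, parameter defaults and toy sequence below are hypothetical.

```python
from collections import defaultdict

def candidate_repeats(seq, k=8, min_copies=3, min_gap=20, max_gap=60):
    """Return k-mers that occur at least min_copies times with every
    distance between consecutive copies inside [min_gap, max_gap]."""
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)
    found = {}
    for kmer, pos in positions.items():
        gaps = [b - a for a, b in zip(pos, pos[1:])]
        if len(pos) >= min_copies and all(min_gap <= g <= max_gap for g in gaps):
            found[kmer] = pos
    return found

# Hypothetical toy array: three exact repeats separated by two spacers.
repeat = "GTTTCAGA"
spacer1 = "ACCGTTAGGCATCGATTACGGA"
spacer2 = "TGCAAGCTTGGACCTAGTCCAT"
array = repeat + spacer1 + repeat + spacer2 + repeat

found = candidate_repeats(array)
starts = found[repeat]
# Spacers are the sequences between consecutive repeat copies.
spacers = [array[a + len(repeat):b] for a, b in zip(starts, starts[1:])]
print(found, spacers)
```

The extracted spacers are the part that would be passed on to a target-prediction step.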

CRISPRDirection

tracrRNA prediction: available at https://github.com/davidchyou and http://galaxy.otago.ac.nz:8080

More Information: Prediction and diversity of tracrRNAs from type II CRISPR-Cas systems (2018)

Te-Yuan Chyou and C. M. Brown

Type II CRISPR-Cas9 systems require a small RNA called the trans-activating CRISPR RNA (tracrRNA) in order to function. The prediction of these non-coding RNAs in prokaryotic genomes is challenging because they have dissimilar structures, having short stems (3–6 bp) and non-canonical base-pairs e.g. G-A. Much of the tracrRNA is involved in base-pairing interactions with the CRISPR RNA, or itself, or in RNA-protein interactions with Cas9. Here we develop a new bioinformatic tool to predict tracrRNAs. On an experimentally verified test set the algorithm achieved a high sensitivity and specificity, and a low false discovery rate (FDR) on genome analysis. Analysis of representative RefSeq genomes (5462) detected 275 tracrRNAs from 165 genera. These tracrRNAs could be grouped into 15 clusters which were used to build covariance models. These clusters included Streptococci and Staphylococci tracrRNAs from the CRISPR-Cas9 systems which are currently used for gene editing. Compensating base changes observed in the models were consistent with the experimental structures of single guide RNAs (sgRNAs). Other clusters, for which there are not yet structures available, were predicted to form novel tracrRNA folds. These clusters included a large and divergent tracrRNA set from Bacteroidetes. These computational models contribute to the understanding of CRISPR-Cas biology, and will assist in the design of further engineered CRISPR-Cas9 systems. The tracrRNA prediction software is available through a galaxy web server.

Keywords: tracrRNA software, tracrRNA gene, tracrRNA finder, sRNA, bacteria

Supplements and code at time of publication 8/2018 are here: http://bioanalysis.otago.ac.nz/tracrRNA/

Scan for Motifs. Computational models of experimentally determined translational control elements, which you can use to search your own sequences. MicroRNA and RNA-binding protein (RBP) sites are searched simultaneously.

More Information: Scan for Motifs: a webserver for the analysis of post-transcriptional regulatory elements in the 3' untranslated regions (3' UTRs) of mRNAs (Biswas and Brown 2014)

Biswas, A. and C. M. Brown

Scan for Motifs (SFM) simplifies the process of identifying a wide range of regulatory elements on alignments of vertebrate 3'UTRs. SFM includes identification of both RNA Binding Protein (RBP) sites and targets of miRNAs. In addition to searching pre-computed alignments, the tool provides users the flexibility to search their own sequences or alignments. The regulatory elements may be filtered by expected value cutoffs and are cross-referenced back to their respective sources and literature. The output is an interactive graphical representation, highlighting potential regulatory elements and overlaps between them. The output also provides simple statistics and links to related resources for complementary analyses. The overall process is intuitive and fast. As SFM is a free web-application, the user does not need to install any software or databases.
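As a rough illustration of searching a sequence for several elements at once, the sketch below scans a 3' UTR with a small table of motifs and reports all hits, including overlapping ones. It is not how SFM is implemented: the two motifs (the AUUUA pentamer of AU-rich elements and the AAUAAA polyadenylation signal) are simple sequence patterns chosen for the example, and the table, function name and test sequence are hypothetical.

```python
import re

# Hypothetical motif table: name -> regular expression over the RNA alphabet.
MOTIFS = {
    "ARE_pentamer": "AUUUA",
    "polyA_signal": "AAUAAA",
}

def scan_motifs(utr, motifs=MOTIFS):
    """Return sorted (start, name, matched sequence) for every motif hit."""
    seq = utr.upper().replace("T", "U")  # accept DNA or RNA input
    hits = []
    for name, pattern in motifs.items():
        # A lookahead reports overlapping occurrences as separate hits.
        for match in re.finditer(f"(?=({pattern}))", seq):
            hits.append((match.start(), name, match.group(1)))
    return sorted(hits)

motif_hits = scan_motifs("GGCAUUUAUUUAGCCAAUAAAGG")
for start, name, matched in motif_hits:
    print(start, name, matched)
```

Sorting the hits by position makes overlaps between elements easy to spot, which is the kind of relationship the graphical output described above highlights.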

CisRegRNA. Computational models (covariance/Rfam) of structured RNA cis-regulatory elements.

More Information: Global or local? Predicting secondary structure and accessibility in mRNAs (Lange et al 2012)

Lange, S. J., D. Maticzka, M. Mohl, J. N. Gagnon, C. M. Brown and R. Backofen

Determining the structural properties of mRNA is key to understanding vital post-transcriptional processes. As experimental data on mRNA structure are scarce, accurate structure prediction is required to characterize RNA regulatory mechanisms. Although various structure prediction approaches are available, it is often unclear which to choose and how to set their parameters. Furthermore, no standard measure to compare predictions of local structure exists. We assessed the performance of different methods using two types of data: transcriptome-wide enzymatic probing information and a large, curated set of cis-regulatory elements. To compare the approaches, we introduced structure accuracy, a measure that is applicable to both global and local methods. Our results showed that local folding was more accurate than the classic global approach. We investigated how the locality parameters, maximum base pair span and window size, influenced the prediction performance. A span of 150 provided a reasonable balance between maximizing the number of accurately predicted base pairs, while minimizing effects of incorrect long-range predictions. We characterized the error at artificial sequence ends, which we reduced by setting the window size sufficiently greater than the maximum span. Our method, LocalFold, diminished all border effects and produced the most robust performance.

CisRNA-SVM. Genome-wide predictions of novel structured RNA cis-regulatory elements in human 3' UTRs.

More Information: Computational identification of new structured cis-regulatory elements in the 3'-untranslated region of human protein coding genes (Chen and Brown 2012b)

Chen, X. S. and C. M. Brown

Messenger ribonucleic acids (RNAs) contain a large number of cis-regulatory RNA elements that function in many types of post-transcriptional regulation. These cis-regulatory elements are often characterized by conserved structures and/or sequences. Although some classes are well known, given the wide range of RNA-interacting proteins in eukaryotes, it is likely that many new classes of cis-regulatory elements are yet to be discovered. An approach to this is to use computational methods that have the advantage of analysing genomic data, particularly comparative data on a large scale. In this study, a set of structural discovery algorithms was applied followed by support vector machine (SVM) classification. We trained a new classification model (CisRNA-SVM) on a set of known structured cis-regulatory elements from 3'-untranslated regions (UTRs) and successfully distinguished these, and groups of cis-regulatory elements that it had not been trained on, from control genomic and shuffled sequences. The new method outperformed previous methods in classification of cis-regulatory RNA elements. This model was then used to predict new elements from cross-species conserved regions of human 3'-UTRs. Clustering of these elements identified new classes of potential cis-regulatory elements. The model, training and testing sets and novel human predictions are available at: http://mRNA.otago.ac.nz/CisRNA-SVM.

Iron Responsive Elements (Stevens et al 2011). Genome-wide predictions in human mRNAs.

Logopaint (Schreiber and Brown 2002). Exploring the visualisation of bias in regulatory DNA or RNA elements, using information content and modified logos.

Mammalian promoter analysis (Zadissa et al 2007). Software to analyse promoters of co-regulated genes. Example files to repeat the published analysis of muscle promoters.

Disease-associated 3' UTR variants (UTRPathDB): in preparation (unpublished).

HBVRegDB (Panjaworayan et al 2007). Genes and regulatory elements in hepatitis B virus genomes.

More Information: HBVRegDB: annotation, comparison, detection and visualization of regulatory elements in hepatitis B virus sequences (Panjaworayan et al 2007)

Panjaworayan, N., S. K. Roessner, A. E. Firth and C. M. Brown

BACKGROUND: The many Hepadnaviridae sequences available have widely varied functional annotation. The genomes are very compact (approximately 3.2 kb) but contain multiple layers of functional regulatory elements in addition to coding regions. Key regions are subject to purifying selection, as mutations in these regions will produce non-functional viruses. RESULTS: These genomic sequences have been organized into a structured database to facilitate research at the molecular level. HBVRegDB is a comparative genomic analysis tool with an integrated underlying sequence database. The database contains genomic sequence data from representative viruses. In addition to INSDC and RefSeq annotation, HBVRegDB also contains expert and systematically calculated annotations (e.g. promoters) and comparative genome analysis results (e.g. blastn, tblastx). It also contains analyses based on curated HBV alignments. Information about conserved regions - including primary conservation (e.g. CDS-Plotcon) and RNA secondary structure predictions (e.g. Alidot) - is integrated into the database. A large amount of data is graphically presented using the GBrowse (Generic Genome Browser) adapted for analysis of viral genomes. Flexible query access is provided based on any annotated genomic feature. Novel regulatory motifs can be found by analysing the annotated sequences. CONCLUSION: HBVRegDB serves as a knowledge database and as a comparative genomic analysis tool for molecular biologists investigating HBV. It is publicly available and complementary to other viral and HBV focused datasets and tools http://hbvregdb.otago.ac.nz. The availability of multiple and highly annotated sequences of viral genomes in one database combined with comparative analysis tools facilitates detection of novel genomic elements.

MLOGD (Firth and Brown 2005, 2006). Discovery of overlapping reading frames in viral and non-viral mRNAs. http://guinevere.otago.ac.nz/aef/MLOGD/

HBV Epsilon Structures (Chen and Brown 2012a). Computational models of the hepatitis B virus epsilon RNA regulatory element.

Viral Division of TransTerm: CDS sequences, compositional statistics, codon usages and biases in the viral division of GenBank, used for novel virus discovery and annotation.

HBV PRE: Computational models of the HBV Post-transcriptional Regulatory Element (PRE) (Panjaworayan et al 2007).

More Information: HBVRegDB: Annotation, comparison, detection and visualization of regulatory elements in hepatitis B virus sequences

Other software currently hosted by the Brown group: Firth AE, Patrick WM. GLUE-IT and PEDEL-AA: new programmes for analyzing protein diversity in randomized libraries. Nucleic Acids Res. 2008;36(Web Server issue):W281-W285. doi:10.1093/nar/gkn226

http://guinevere.otago.ac.nz/stats.html

Dr Chris Brown, Biochemistry and Genetics Otago