Using machine learning, especially deep learning, to facilitate biological research
is a fascinating research direction. However, beyond the standard classification
or regression problems, whose outputs are simple vectors or scalars, bioinformatics
often requires predicting more complex structured targets, such as 2D images
and 3D molecular structures. Such tasks are referred to as
structured prediction. Structured prediction is more complicated than traditional
classification but has much broader applications, especially in bioinformatics, since
most of the original bioinformatics problems have complex output
objects.
Due to the properties of these structured prediction problems, such as
problem-specific constraints and dependencies within the labeling space, the straightforward
application of existing deep learning models to these problems can lead to
unsatisfactory results. In this dissertation, we argue that the following two ideas
can help resolve a wide range of structured prediction problems in bioinformatics.
Firstly, we can combine deep learning with other classic algorithms, such as probabilistic
graphical models, which model the problem structure explicitly. Secondly,
we can design and train problem-specific deep learning architectures or methods by
considering the structured labeling space and problem constraints, either explicitly
or implicitly. We demonstrate our ideas with six projects from four bioinformatics
subfields, including sequencing analysis, structure prediction, function annotation,
and network analysis. The structured outputs cover 1D electrical signals, 2D images,
3D structures, hierarchical labeling, and heterogeneous networks. With the help of
the above ideas, all of our methods can achieve state-of-the-art performance on the
corresponding problems.
The success of these projects motivates us to extend our work towards other more
challenging but important problems, such as healthcare problems, which can directly
benefit people's health and wellness. We thus conclude this thesis by discussing such
future work, along with the potential challenges and opportunities.
Date of Award  Nov 1 2020 

Original language  English (US) 

Awarding Institution   Computer, Electrical and Mathematical Science and Engineering


Supervisor  Xin Gao (Supervisor) 

 Bioinformatics
 Structured prediction
 Deep learning