Diego Mollá Aliod

List of Undergraduate and Postgraduate Projects

Below is a list of possible projects for students at Macquarie University (ENGG, MIT, and MRes). Please contact me for further details. Also, if you have a project in mind which is not listed here but which is related to my research interests, contact me and chances are that I will be interested too.

If you are pursuing a PhD, some of these projects can be extended to PhD projects.

Natural Language Processing for Medical Text

  1. Topic Modelling for Query-based Summarisation of Clinical Publications (MIT, MRes)
  2. Deep Learning for Query-based Summarisation of Clinical Publications (MIT, MRes)
  3. Optimisation Techniques for Query-based Summarisation of Clinical Publications (MIT, MRes)
  4. Label the Clusters of Clinical Research Papers (MIT, MRes)
  5. Find Relevant Medical Publications (MIT, MRes)
  6. Evaluate the Quality of a Search of Medical Documents (MIT, MRes)
  7. Search and Display Clinical Evidence (MIT, ENGG805/806)

Natural Language Processing

  1. Deep Learning for Question Answering and Summarisation (MIT, MRes)
  2. Machine Learning and Cloud Computing for Text Processing (MIT, ENGG805/806)
  3. Make a Chatbot Speak like Donald Trump (MRes, ENGG460/411)

Natural Language Processing for Medical Text

For more background on the kind of research that I am doing on Natural Language Processing for Medical Text, look at my research webpage on this topic.

Topic Modelling for Query-based Summarisation of Clinical Publications

(MIT, MRes)

We have a corpus of clinical questions and answers. Each answer has a list of publication references arranged in groups according to how they address the answer. For example, if the question is “what is the best treatment for X”, and there are three possible treatments, then each treatment will have a group of references associated with it. The goal of this project is to use topic modelling in general, and variants of Latent Dirichlet Allocation (LDA) in particular, to automatically summarise the texts of the groups of abstracts. You will need to have knowledge of statistical methods and natural language processing, or willingness to learn these, and good programming skills, preferably using the Python programming language and Python packages such as NLTK and NumPy/SciPy.

Deep Learning for Query-based Summarisation of Clinical Publications

(MIT, MRes)

We have a corpus of clinical questions and answers. Each answer has a list of publication references arranged in groups according to how they address the answer. For example, if the question is “what is the best treatment for X”, and there are three possible treatments, then each treatment will have a group of references associated with it. The goal of this project is to use deep learning to automatically summarise the texts of the groups of abstracts. In particular, you will explore the use of convolutional and recurrent networks, as well as the use of word embeddings for text summarisation. You will need to have knowledge of statistical methods and natural language processing, or willingness to learn these, and good programming skills, preferably using the Python programming language and Python packages such as Google's TensorFlow.

Optimisation Techniques for Query-based Summarisation of Clinical Publications

(MIT, MRes)

We have a corpus of clinical questions and answers. Each answer has a list of publication references arranged in groups according to how they address the answer. For example, if the question is “what is the best treatment for X”, and there are three possible treatments, then each treatment will have a group of references associated with it. The goal of this project is to use optimisation techniques in general, and Integer Linear Programming (ILP) in particular, to automatically summarise the texts of the groups of abstracts. You will need to have knowledge of statistical methods and natural language processing, or willingness to learn these, and good programming skills, preferably using the Python programming language and Python packages such as NLTK and NumPy/SciPy. Ideally, you will also have experience with ILP.
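
As a rough illustration of the optimisation view, extractive summarisation can be posed as a 0/1 selection problem: choose the subset of sentences that covers the most weighted concepts without exceeding a length budget. The sketch below is only an assumption about how such a formulation might look, not the method used in the project. The toy sentences, the concept weights, and the `best_summary` helper are hypothetical, and exhaustive search stands in for a real ILP solver so that the example needs only the Python standard library.

```python
from itertools import combinations

def best_summary(sentences, concept_weights, budget):
    """Pick the subset of sentences maximising covered concept weight.

    sentences: list of (text, set_of_concepts) pairs (hypothetical format).
    concept_weights: dict mapping concept -> weight.
    budget: maximum total summary length in words.
    Exhaustive search over subsets; an ILP solver would replace this loop.
    """
    best_score, best_subset = 0.0, ()
    indices = range(len(sentences))
    for size in range(1, len(sentences) + 1):
        for subset in combinations(indices, size):
            length = sum(len(sentences[i][0].split()) for i in subset)
            if length > budget:
                continue  # violates the length constraint
            covered = set().union(*(sentences[i][1] for i in subset))
            # Each concept counts once, however many sentences mention it.
            score = sum(concept_weights.get(c, 0.0) for c in covered)
            if score > best_score:
                best_score, best_subset = score, subset
    return [sentences[i][0] for i in best_subset], best_score

# Toy data, invented for illustration only.
sentences = [
    ("Drug A reduced symptoms in two trials", {"drug_a", "symptoms"}),
    ("Drug B showed no benefit over placebo", {"drug_b", "placebo"}),
    ("Drug A and drug B were both well tolerated", {"drug_a", "drug_b"}),
]
weights = {"drug_a": 2.0, "drug_b": 2.0, "symptoms": 1.0, "placebo": 1.0}
summary, score = best_summary(sentences, weights, budget=15)
```

With a 15-word budget, the first two sentences are selected because together they cover all four concepts. Exhaustive search is exponential in the number of sentences, which is why a real system would hand the same objective and constraint to an ILP solver.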

Label the Clusters of Clinical Research Papers

(MIT, MRes)

We have a corpus of clinical questions and answers. Each answer has a list of publication references arranged in groups according to how they address the answer. For example, if the question is “what is the best treatment for X”, and there are three possible treatments, then each treatment will have a group of references associated with it. The goal of this project is to automatically produce the key concepts covered in each group of references by exploring the contents of the publication abstract in each reference and applying topic labelling methods. You will need to have knowledge of statistical methods and natural language processing, or willingness to learn these, and good programming skills, preferably using the Python programming language and Python packages such as NLTK and NumPy/SciPy.
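
To give a feel for the task, a very simple baseline might label each group with the terms that are frequent in that group but rare in the others. The sketch below uses a plain TF-IDF score for this. It is an assumed starting point rather than a proper topic labelling method, and the `label_groups` helper, the input format, and the toy abstracts are all hypothetical.

```python
import math
from collections import Counter

def label_groups(groups, top_n=2):
    """Return the top_n highest TF-IDF terms for each group of abstracts.

    groups: dict mapping a group name to a list of abstract strings
    (hypothetical input format). Each group is treated as one document.
    """
    term_counts = {
        name: Counter(word for text in texts for word in text.lower().split())
        for name, texts in groups.items()
    }
    n_groups = len(groups)
    # Document frequency: the number of groups in which each term appears.
    doc_freq = Counter(term for counts in term_counts.values() for term in counts)
    labels = {}
    for name, counts in term_counts.items():
        scores = {
            term: count * math.log(n_groups / doc_freq[term])
            for term, count in counts.items()
        }
        # Sort by descending score, breaking ties alphabetically.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        labels[name] = ranked[:top_n]
    return labels

# Toy abstracts, invented for illustration only.
groups = {
    "group 1": ["aspirin reduces pain", "aspirin trial shows pain relief"],
    "group 2": ["ibuprofen reduces pain", "ibuprofen trial shows swelling relief"],
}
labels = label_groups(groups, top_n=1)
```

Terms shared by every group, such as "pain" here, receive a score of zero, so the labels that survive are the ones that distinguish one group from another.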

Find Relevant Medical Publications

(MIT, MRes)

The goal of this project is to design and apply search technology so that a search retrieves published medical information that is relevant to the query asked by a medical doctor. Examples of medical queries are the titles of the clinical inquiries column of the Journal of Family Practice. You will use these clinical inquiries as your source questions, and you will aim to retrieve documents like those listed in the clinical inquiries, by accessing the MEDLINE database of medical publications through interfaces such as PubMed. You will preferably have knowledge of, and experience with, Web applications such as search services. Knowledge of search technology and text processing is also desirable but not essential.
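
One possible way to query PubMed programmatically is through the NCBI E-utilities ESearch endpoint, which returns the IDs of matching articles. The sketch below is a minimal example under that assumption and is not necessarily the interface the project will use. The helper names and the sample query are hypothetical, and `search_pubmed` performs a live network request when called.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def build_esearch_url(query, max_results=20):
    """Build an ESearch URL that asks PubMed for matching article IDs."""
    params = {
        "db": "pubmed",          # search the PubMed database
        "term": query,           # free-text query
        "retmax": max_results,   # maximum number of IDs to return
        "retmode": "json",       # ask for a JSON response
    }
    return ESEARCH_URL + "?" + urlencode(params)

def search_pubmed(query, max_results=20):
    """Return a list of PubMed IDs for the query (needs network access)."""
    with urlopen(build_esearch_url(query, max_results)) as response:
        data = json.load(response)
    return data["esearchresult"]["idlist"]

# Sample query, invented for illustration only; no request is sent here.
url = build_esearch_url("treatment for otitis media", max_results=5)
```

Keeping URL construction separate from the network call makes it easy to inspect or log the exact query that was sent, which helps when comparing retrieved documents against those listed in a clinical inquiry.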

Evaluate the Quality of a Search of Medical Documents

(MIT, MRes)

We have a collection of search queries related to clinical questions and the documents that the search engines returned as relevant. In this project you will implement methods to automatically determine how well the relevant documents fit the target topic, and whether there are any key aspects of the information that are missing.

Search and Display Clinical Evidence

(MIT, ENGG805/806)

We are developing technology for searching for medical evidence, extracting the specific evidence related to specific conditions, and appraising the quality of the evidence. In this project you will implement an interface to such a technology so that it can be used from a Web browser. You will need knowledge of Web technology and applications, and a good sense of Web application design and user interfaces. There are several aspects of the development that can be covered in multiple projects: making the system scale up to large results, enabling user login, and leveraging past queries by the user or by other users.


Natural Language Processing

Deep Learning for Question Answering and Summarisation

(MIT, MRes)

Deep learning has recently been used successfully in a wide range of applications, many of which are related to text processing tasks. In this project you will explore the use of deep learning techniques in question answering or summarisation, such as in the shared tasks organised by BioASQ. You will need to have knowledge of statistical methods and natural language processing, or willingness to learn these, and good programming skills, preferably using the Python programming language and Python packages such as Google's TensorFlow.

Machine Learning and Cloud Computing for Text Processing

(MIT, ENGG805/806)

There is an increasing range of cloud platforms such as Amazon Web Services, Google Compute Engine, and Microsoft Cloud Platform. Some of these platforms include specialised components for machine learning. In this project you will review these platforms and report on their possibilities for the development and deployment of machine learning applications for text processing.

Make a Chatbot Speak like Donald Trump

(MRes, ENGG460/411)

In this project you will adapt a full-fledged chatbot such as A.L.I.C.E. and make it speak in the style of well-known people with a strong Twitter presence, such as Donald Trump. You will apply machine learning technology to build language models based on the tweets by the target person, and modify the output of the chatbot so that it speaks in the style of the person.
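
As a small illustration of what a language model built from tweets might look like, the sketch below trains a word bigram model and samples from it. This is an assumed, deliberately simple starting point and covers only the language model part, not the adaptation of the chatbot output. The function names and the placeholder tweets are hypothetical.

```python
import random
from collections import Counter, defaultdict

def train_bigram_model(tweets):
    """Count word bigrams, with <s> and </s> marking tweet boundaries."""
    model = defaultdict(Counter)
    for tweet in tweets:
        words = ["<s>"] + tweet.split() + ["</s>"]
        for prev, nxt in zip(words, words[1:]):
            model[prev][nxt] += 1
    return model

def generate(model, rng, max_words=20):
    """Sample a sentence word by word from the bigram counts."""
    word, output = "<s>", []
    for _ in range(max_words):
        followers = model[word]
        # Choose the next word in proportion to how often it followed `word`.
        word = rng.choices(list(followers), weights=list(followers.values()))[0]
        if word == "</s>":
            break
        output.append(word)
    return " ".join(output)

# Placeholder tweets, invented for illustration only.
tweets = ["we will win big", "we will build it", "it will be great"]
model = train_bigram_model(tweets)
sentence = generate(model, random.Random(0))
```

A bigram model only captures which word tends to follow which, so its output is locally plausible but often incoherent. Larger n-gram or neural language models trained on real tweets would be the natural next step.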