The PhD topics I offer are in the area that combines machine vision with Natural Language Processing. Below are some example PhD topics; these are starting points and can be tuned to match our mutual interests.
If you are a self-funded student considering studying for a PhD in any of the topics below, please email me to discuss before applying. Click here for information on applying for Research Degrees at Loughborough University.
I am recruiting sponsored or self-funded PhD students who wish to undertake projects in data science and AI, including projects within the topics below and here:
Topic 1: Using AI to ethically describe images using text. *NEW* This project involves the development of algorithms for learning joint image and text embeddings. The aims of this project are to 1) develop methods towards accessible, explainable and ethical AI; and 2) develop algorithms that can describe images for users with visual disabilities. Deep learning and information retrieval methods will be implemented. The proposed methods and tools will be applied to a real-world application in collaboration with project partners.
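To illustrate the core idea of a shared image-text embedding space, here is a minimal sketch: once images and captions are projected into one vector space by trained encoders (not shown), describing or retrieving an image reduces to nearest-neighbour search by cosine similarity. The embeddings and names below are hypothetical toy values, not outputs of any real model.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

# Hypothetical embeddings: images and a caption projected into one shared
# space (in practice these come from trained vision and text encoders).
image_embeddings = {
    "img_dog":   [0.9, 0.1, 0.0],
    "img_beach": [0.1, 0.8, 0.3],
}
caption_embedding = [0.85, 0.15, 0.05]   # e.g. "a dog playing"

# Retrieve the image whose embedding best matches the caption.
best = max(image_embeddings,
           key=lambda k: cosine(image_embeddings[k], caption_embedding))
print(best)  # img_dog
```

In a full system the same similarity score can rank candidate textual descriptions for a given image, which is the direction this topic targets.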
Topic 2: Interactively Generating Synthetic Images from multi-modal data using Deep Learning, NLP and Information Retrieval Techniques *NEW* Significant progress has been made in generating visually realistic images using generative adversarial networks (GANs); however, semantically aligning the generated image with the input text remains a challenge. This project involves generating realistic images from textual descriptions. The project will commence with developing AI methods and tools that generate images from textual descriptions, and thereafter progress to interactive image generation using relevance feedback (and other techniques from the information retrieval domain) to improve the output of the algorithms. The proposed methods and tools will be applied to a real-world application in collaboration with project partners.
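The relevance-feedback component mentioned above can be sketched with the classic Rocchio update from information retrieval: the query (here, a text/conditioning embedding) is moved toward items the user marked relevant and away from those marked non-relevant. The vectors below are hypothetical toy embeddings; a real system would apply this in the learned embedding space of the generator.

```python
def rocchio(query, relevant, nonrelevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Classic Rocchio update: shift the query toward the centroid of
    relevant items and away from the centroid of non-relevant ones."""
    dim = len(query)

    def centroid(vecs):
        if not vecs:
            return [0.0] * dim
        return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]

    rel_c = centroid(relevant)
    non_c = centroid(nonrelevant)
    return [alpha * query[i] + beta * rel_c[i] - gamma * non_c[i]
            for i in range(dim)]

# Hypothetical conditioning embedding for a prompt; the user marks one
# generated image as relevant and one as off-target.
q = [0.5, 0.5, 0.0]
updated = rocchio(q,
                  relevant=[[0.9, 0.4, 0.0]],
                  nonrelevant=[[0.0, 0.2, 0.9]])
print(updated)
```

Feeding the updated embedding back into the generator biases the next round of images toward what the user approved, which is the interactive loop this topic proposes.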
Topic 3: Retrieval of video keyframes and textual video summarisation *NEW* This project involves the development of algorithms for generating summaries of events using key frames extracted from videos. The project will commence with developing methods for 1) selecting keyframes automatically and semi-automatically using text queries; and then 2) generating textual summaries describing the keyframes. The project will involve the development of methods from the AI (deep learning) and information retrieval domains. The proposed methods and tools will be applied to a real-world application in collaboration with project partners. Applications include human activity recognition and surveillance.
Topic 4: Feature selection of complex data for unsupervised learning. Projects under this topic concern the development of algorithms capable of selecting combinations of features from multi-modal datasets (image and text data). The work will also focus on developing methods that select features in an ethical/unbiased manner, so that selection is not biased towards a specific class or group of cases in the dataset. The project can focus on information-theoretic approaches, deep feature selection approaches, and hybrids of the two.
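As a flavour of the information-theoretic approach, here is a minimal sketch that scores discrete features by their mutual information with the labels and keeps the highest-scoring one. The feature values and names (f1, f2) are hypothetical; real pipelines would discretise continuous features and correct for estimation bias on small samples.

```python
import math
from collections import Counter

def mutual_information(feature, labels):
    """I(X;Y) for two discrete sequences of equal length, in nats."""
    n = len(feature)
    px = Counter(feature)
    py = Counter(labels)
    pxy = Counter(zip(feature, labels))
    mi = 0.0
    for (x, y), c in pxy.items():
        p_xy = c / n
        # p_xy / (p_x * p_y) == c * n / (count_x * count_y)
        mi += p_xy * math.log(c * n / (px[x] * py[y]))
    return mi

# Hypothetical discretised features: f1 tracks the class, f2 is noise.
labels = [0, 0, 1, 1, 0, 1]
f1 =     [0, 0, 1, 1, 0, 1]   # perfectly informative
f2 =     [1, 0, 1, 0, 1, 0]   # largely uninformative

scores = {"f1": mutual_information(f1, labels),
          "f2": mutual_information(f2, labels)}
best = max(scores, key=scores.get)
print(best)  # f1
```

Note that the ranking alone does not address the fairness aim of this topic; an ethical selector would additionally penalise features whose informativeness is concentrated in one class or group.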
Topic 5: Ethical Continual/Lifelong Deep Learning: Training deep neural networks to learn an accurate mapping from inputs (such as image data, sensor data, or text) to outputs (e.g. labels, also known as classes) requires large amounts of labelled data. Even when these models are trained, they have limited ability to generalise to conditions different from those used to train the model. Projects under this topic concern the development of continual/lifelong learning algorithms that learn continuously and adaptively, autonomously and incrementally developing complex skills and knowledge. Projects include the development of methods for recognising new behaviours in various environments, such as smart environments (e.g. cities, homes, healthcare settings), and continuous object recognition. A focus of this project will be on ethical AI.
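One standard building block for continual learning is rehearsal: a small memory of past examples is replayed alongside each new task's data to mitigate catastrophic forgetting. The sketch below shows a reservoir-sampling replay buffer with hypothetical task data; a real project would store (input, label) tensors and interleave them into training batches.

```python
import random

class ReplayBuffer:
    """Reservoir-style memory of past examples; mixing these into each
    new-task batch is the rehearsal strategy for continual learning."""
    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.seen = 0
        self.buffer = []
        self.rng = random.Random(seed)

    def add(self, example):
        """Reservoir sampling: every example seen so far has an equal
        chance of remaining in the fixed-size buffer."""
        self.seen += 1
        if len(self.buffer) < self.capacity:
            self.buffer.append(example)
        else:
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.buffer[j] = example

    def sample(self, k):
        return self.rng.sample(self.buffer, min(k, len(self.buffer)))

# Task 1 examples fill the memory; Task 2 training mixes old and new data.
memory = ReplayBuffer(capacity=4)
for x in ["t1_a", "t1_b", "t1_c", "t1_d"]:
    memory.add(x)
new_batch = ["t2_a", "t2_b"]
mixed_batch = new_batch + memory.sample(2)   # rehearse two old examples
print(mixed_batch)
```

The ethical-AI angle of this topic could, for example, constrain what the buffer retains so that no demographic group's data is disproportionately forgotten.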
Topic 6: Multimodal data fusion using Deep Learning: Deep learning is a subset of machine learning in artificial intelligence (AI) that learns representations directly from data. Deep learning is receiving a lot of attention due to its ability to achieve unprecedented levels of performance in terms of accuracy and speed, to the point where deep learning algorithms can outperform humans at decision making and at tasks such as image classification and real-time detection. Multi-modal data fusion is an important task for many machine learning applications, including human activity recognition, information retrieval, and real-time applications of AI. For example, datasets can include audio and visual data; image and sensor data; multi-sensor multi-modal data; and text and image data. Multi-modal data can be complex, noisy and imbalanced, particularly when collected from real environments, so creating deep learning models that can classify such data is a significant challenge. Machine learning models trained on imbalanced data are biased toward the more commonly occurring classes: models naturally learn best the classes with the most records, much as happens with human learning. The aim of this project is to: devise feature engineering approaches for multi-modal data; identify whether fusing multi-modal data improves results over uni-modal datasets in specific machine learning tasks; and develop computational approaches and methods for fusing imbalanced multi-modal data.
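Two of the ingredients above can be sketched in a few lines: late fusion (averaging per-class scores from modality-specific models) and inverse-frequency class weighting to counter the imbalance bias just described. The class names, scores, and label counts below are hypothetical toy values.

```python
from collections import Counter

def class_weights(labels):
    """Inverse-frequency weights so rare classes count more in training."""
    counts = Counter(labels)
    n = len(labels)
    return {c: n / (len(counts) * k) for c, k in counts.items()}

def late_fuse(scores_a, scores_b, w=0.5):
    """Weighted average of per-class scores from two modality models."""
    return {c: w * scores_a[c] + (1 - w) * scores_b[c] for c in scores_a}

# Hypothetical per-class probabilities from an image model and a text model.
image_scores = {"walking": 0.7, "falling": 0.3}
text_scores  = {"walking": 0.4, "falling": 0.6}
fused = late_fuse(image_scores, text_scores, w=0.5)
print(max(fused, key=fused.get))  # walking

# Imbalanced labels: "falling" is rare, so it receives a larger weight.
labels = ["walking"] * 9 + ["falling"]
weights = class_weights(labels)
print(weights)
```

Comparing the fused decision against each single-modality decision on held-out data is exactly the uni-modal-versus-multi-modal question this topic sets out to answer.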