Computer Science Department, MS Thesis Presentation: Xinyi Fang, "Enhancing Language Models for Classification on Text-Attributed Graphs through Multi-Model Data Augmentation"
11:00 am to 12:00 pm
Xinyi Fang
MS Student
WPI – Computer Science Department
Friday, April 18, 2025
Time: 11:00 AM – 12:00 PM
Location: Fuller Labs Lower Perreault Hall
Advisor: Prof. Kyumin Lee
Reader: Prof. Fabricio Murai
Abstract:
Graph representation learning has become a critical task across various domains such as social networks and recommender systems. Recently, the rise of large language models (LLMs) has opened up new possibilities for processing text-attributed graphs (TAGs), where nodes are associated with textual information. Despite promising progress, applying LLMs to TAGs faces significant challenges, including input window size limitations and the computational overhead of handling large-scale graphs with millions of nodes.
To address these challenges, we propose a novel method that uses multi-model profiling for data augmentation, thereby increasing the diversity and quantity of the training samples. The data generated by each model is then combined with the graph structure, prompts, and ground-truth labels to create a comprehensive and varied fine-tuning dataset. By strategically selecting profiling models and an appropriate number of neighboring nodes, and by constructing concise yet informative fine-tuning prompts, our proposed method enables LLMs to process more complex graphs while operating within limited computational resources. Notably, our experiments demonstrate that it is unnecessary to construct intricate graph structures for fine-tuning to achieve strong performance. Our approach outperforms nine state-of-the-art baselines, showcasing its effectiveness. Furthermore, we have made our model publicly available on Hugging Face for reference and use.
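As a rough illustration of the dataset construction described in the abstract, the sketch below combines node profiles from several profiling models with a capped number of neighbors and ground-truth labels to form prompt/label records. It is not the thesis implementation: the function name `build_finetuning_records`, the prompt wording, the record fields, and the toy data are all assumptions made for this example, and the profiles are hard-coded stand-ins for text that profiling models would generate.

```python
from typing import Dict, List


def build_finetuning_records(
    profiles_by_model: Dict[str, Dict[int, str]],
    neighbors: Dict[int, List[int]],
    labels: Dict[int, str],
    max_neighbors: int = 2,
) -> List[dict]:
    """Build one prompt/label record per (profiling model, labeled node) pair.

    Hypothetical helper: the prompt layout and field names are assumptions.
    """
    records = []
    for model_name, profiles in profiles_by_model.items():
        for node, label in labels.items():
            # Cap the neighbor count so the prompt stays short.
            picked = neighbors.get(node, [])[:max_neighbors]
            neighbor_lines = [f"- {profiles[n]}" for n in picked if n in profiles]
            prompt = (
                f"Target node: {profiles[node]}\n"
                "Neighboring nodes:\n"
                + "\n".join(neighbor_lines)
                + "\nCategory:"
            )
            records.append(
                {"source_model": model_name, "prompt": prompt, "label": label}
            )
    return records


# Toy text-attributed graph: adjacency lists and labels for two nodes.
neighbors = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
labels = {0: "databases", 1: "machine learning"}

# Stand-in profiles; in practice each profiling model would write its own.
profiles_by_model = {
    "model_a": {
        0: "Paper on query optimization.",
        1: "Paper on neural networks.",
        2: "Paper on index structures.",
        3: "Paper on transactions.",
    },
    "model_b": {
        0: "Study of faster SQL query plans.",
        1: "Study of deep learning training.",
        2: "Study of tree indexes.",
        3: "Study of concurrency control.",
    },
}

records = build_finetuning_records(
    profiles_by_model, neighbors, labels, max_neighbors=2
)
```

Each labeled node appears once per profiling model, so adding a model multiplies the number of training samples while varying their wording, and `max_neighbors` bounds the prompt length regardless of node degree.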