Computer Science Department, MS Thesis Presentation: Xinyi Fang, "Enhancing Language Models for Classification on Text-Attributed Graphs through Multi-Model Data Augmentation"

Friday, April 18, 2025
11:00 am to 12:00 pm

Xinyi Fang  

MS Student

WPI – Computer Science Department 

 

Friday, April 18, 2025 

Time: 11:00 AM – 12:00 PM

Location: Fuller Labs Lower Perreault Hall 

Advisor: Prof. Kyumin Lee

Reader: Prof. Fabricio Murai

 

Abstract:

Graph representation learning has become a critical task across various domains such as social networks and recommender systems. Recently, the rise of large language models (LLMs) has opened up new possibilities for processing text-attributed graphs (TAGs), where nodes are associated with textual information. Despite promising progress, applying LLMs to TAGs faces significant challenges, including input window size limitations and the computational overhead of handling large-scale graphs with millions of nodes.

To address these challenges, we propose a novel method that uses multi-model profiling for data augmentation, thereby increasing the diversity and quantity of the training samples. The data generated by each model is then combined with the graph structure, prompts, and ground-truth labels to create a comprehensive and varied fine-tuning dataset. By strategically selecting profiling models, choosing an appropriate number of neighboring nodes, and constructing concise yet informative fine-tuning prompts, our proposed method enables LLMs to process more complex graphs while operating within limited computational resources. Notably, our experiments demonstrate that it is unnecessary to construct intricate graph structures for fine-tuning to achieve strong performance. Our approach outperforms nine state-of-the-art baselines, showcasing its effectiveness. Furthermore, we have made our model publicly available on Hugging Face for reference and use.

Department(s): Computer Science