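(Correspondent: Guan Di) At 15:30 on June 23, 2025, Professor Pan Rui of the Central University of Finance and Economics visited the School of Mathematics and Statistics at Changchun University of Technology and delivered an academic talk. The talk was organized by the School of Mathematics and Statistics and held in Room 5307 of the New Library on the Nanhu Campus. Professor Yang Kai, Vice Dean of the School of Mathematics and Statistics, chaired the session, which was attended by some of the school's faculty and graduate students.
At the start of the session, Vice Dean Yang Kai thanked Professor Pan for coming on behalf of the school and introduced her and her research.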

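About the speaker: Pan Rui is a professor and doctoral supervisor at the School of Statistics and Mathematics, Central University of Finance and Economics, and a Longma Scholar (Young Scholar) of the university. Her main research areas include statistical modeling of network-structured data and statistical analysis of spatiotemporal data. She has published more than 30 papers in journals including Annals of Statistics, Journal of the American Statistical Association, and Journal of Business & Economic Statistics. She is the author of the Chinese-language monographs 《数据思维实践》 (Data Thinking in Practice) and 《网络结构数据分析与应用》 (Analysis and Applications of Network-Structured Data). She has led projects funded by the National Natural Science Foundation of China and the National Statistical Science Research Program, among others.
Talk title: Academic Literature Recommendation in Large-scale Citation Networks Enhanced by Large Language Models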
Abstract: Literature recommendation is essential for researchers to find relevant articles in an ever-growing academic field. However, traditional methods often struggle due to data limitations and methodological challenges. In this work, we construct a large citation network and propose a hybrid recommendation framework for scientific article recommendation. Specifically, the citation network contains 190,381 articles from 70 journals, covering statistics, econometrics, and computer science, spanning from 1981 to 2022. The recommendation mechanism integrates network-based citation patterns with content-based semantic similarities. To enhance content-based recommendations, we employ OpenAI's text-embedding-3-small model to generate an embedding vector for the abstract of each article. The model has two key advantages: computational efficiency and embedding stability during incremental updates, which is crucial for handling dynamic academic databases. Additionally, the recommendation mechanism is designed to allow users to adjust weights according to their preferences, providing flexibility and personalization. Extensive experiments have been conducted to verify the effectiveness of our approach. In summary, our work not only provides a complete data system for building and analyzing citation networks, but also introduces a practical recommendation method that helps researchers navigate the growing volume of academic literature, making it easier to find the most relevant and influential articles in the era of information overload.
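
The idea of blending network-based and content-based signals under a user-adjustable weight can be illustrated with a small sketch. This is not the method presented in the talk: the network score here is assumed to be a Jaccard overlap of reference lists (bibliographic coupling), the content score is cosine similarity between abstract embeddings, and the function names, the `weight` parameter, and the toy data are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def jaccard(a, b):
    """Overlap of two reference sets (an assumed stand-in for the network signal)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(query, embeddings, references, weight=0.5, top_k=3):
    """Rank candidate articles by weight * network + (1 - weight) * content.

    weight = 0.0 uses only semantic similarity; weight = 1.0 uses only
    citation overlap. The linear blend is an illustrative assumption.
    """
    scores = {}
    for pid in embeddings:
        if pid == query:
            continue
        content = cosine(embeddings[query], embeddings[pid])
        network = jaccard(references[query], references[pid])
        scores[pid] = weight * network + (1.0 - weight) * content
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Toy data: 2-dimensional vectors stand in for real abstract embeddings.
embeddings = {"A": [1.0, 0.0], "B": [0.9, 0.1], "C": [0.0, 1.0]}
references = {"A": {"r1", "r2"}, "B": {"r3"}, "C": {"r1", "r2"}}

content_first = recommend("A", embeddings, references, weight=0.0)
network_first = recommend("A", embeddings, references, weight=1.0)
```

In this toy example, article B is semantically closest to A but shares no references with it, while C cites the same works as A but has an unrelated abstract. Setting `weight=0.0` therefore ranks B first, and `weight=1.0` ranks C first, which is the kind of preference-driven adjustment the abstract describes.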

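Professor Pan organized the talk around three topics: the research background, the LMANStat dataset, and the LMANStat Embedding dataset.
The study builds LMANStat, a large-scale multilayer academic network dataset based on 42 statistics journals (1981-2021). It covers eight types of network structure, including collaboration, citation, and keyword co-occurrence networks, and contains 97,436 papers, 70,735 authors, and more than 5 million relational records. A four-step procedure resolves the problem of authors sharing the same name, and multidimensional author attributes such as research interests and productivity are extracted. The extended LMANStat Embedding dataset adds 28 interdisciplinary journals and uses an OpenAI model to generate 1536-dimensional embedding vectors for paper abstracts to support semantic analysis. Building on this dataset, the team has produced a series of results in community detection (for example, the D-SCORE algorithm), citation prediction, and academic recommendation, with related papers published in journals such as Statistics and Its Interface. The dataset has been released as open source and provides a multidimensional, dynamic foundation for academic network analysis, research evaluation, and knowledge discovery, advancing cross-disciplinary work between statistics and information science. Professor Pan closed with a summary of the talk, after which students discussed its content with her.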

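The talk broadened the academic horizons of the faculty and students in attendance, sparked the students' enthusiasm for learning, and gave the audience a deeper understanding of the various kinds of academic networks. Attendees said they benefited greatly.
First review: Guan Di
Second review: Yang Kai
Final review: Wang Dan, Wang Chunjie
School of Mathematics and Statistics
June 24, 2025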