
Quantile-Matched DC in Massive Data Regression

Published: 2024-10-11

Title: Quantile-Matched DC in Massive Data Regression

Time: October 12, 2024, 7:30 AM

Venue: Room 404, Teaching and Research Building, Nanhu Campus

Host: School of Mathematics and Statistics

Speaker: Lu Lin (林路)

About the speaker: Lu Lin is a professor and doctoral supervisor at the Zhongtai Securities Institute for Financial Studies, Shandong University. He has served on the first and second Ministry of Education Steering Committees for the Master of Applied Statistics program, is a member of the corresponding committee of the Shandong Provincial Department of Education, a counselor to the Shandong Provincial Government, and director of the Jinan Institute for Advanced Study in Applied Mathematics. His research covers big data, high-dimensional statistics, nonparametric and semiparametric statistics, and financial statistics. He has published more than 130 research papers in leading and other important journals in statistics, machine learning, and related applied disciplines at home and abroad, including Ann. Statist., JMLR, Stat. Comput., and Science China. Several of his policy reports on financial strategy have received positive written comments from the provincial governor. He has led many projects funded by the National Natural Science Foundation of China, major national statistical science research projects, Ministry of Education doctoral-program special funds, Ministry of Education New Liberal Arts projects, and key projects of the Shandong Provincial Natural Science Foundation. He has received first and second prizes of the National Excellent Statistical Research Achievement Award from the National Bureau of Statistics and a first prize of the Shandong Provincial Excellent Teaching Achievement Award (ranked first on all).

Abstract: Bias correction and robustness are crucial issues in the divide-and-conquer (DC) strategy for massive data sets. For models with asymmetric error distributions, weighted composite quantile regression (WCQR) can correct the non-negligible bias of composite quantile regression (CQR), but selecting the weights depends heavily on nonparametric pilot estimates of certain error-dependent nonparametric functions; the resulting estimation procedure is therefore inconvenient to implement and is not robust to outliers and fat-tailed distributions. This work explores a new DC-based WCQR method for nonparametric models with general error distributions. By fully utilizing the model structure, a quantile-matched composite is proposed. In the new framework, the non-negligible bias of the WCQR estimators is expressed explicitly through a parametric estimate together with robust statistics, rather than through the unknown error-dependent nonparametric functions, so the bias can be easily corrected by selecting robust weights. The optimal weights are obtained from the asymptotic properties. The theoretical properties of the new methods are systematically investigated, and their behavior is further illustrated by comprehensive simulation studies and real data analyses. Compared with competing methods, the new methods are favorable in estimation accuracy, robustness, applicability, and communication efficiency.
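To make the setting concrete, below is a minimal Python sketch of the plain DC-plus-composite baseline that the abstract builds on: quantile regressions at several levels are fitted on each data block, the slope estimates are combined with weights (uniform here, purely as an assumption), and the block estimates are averaged in one communication round. This illustrates only the general strategy, not the speaker's quantile-matched method, whose weight selection and bias correction are the subject of the talk.

# Illustrative sketch only (not the speaker's method): divide-and-conquer
# composite quantile regression for a linear slope, with uniform weights
# assumed across quantile levels.
import numpy as np
import statsmodels.api as sm
from statsmodels.regression.quantile_regression import QuantReg

rng = np.random.default_rng(0)
n, n_blocks = 20_000, 10
beta = 2.0
x = rng.normal(size=n)
# Asymmetric, heavy-tailed errors: the setting where naive composites are biased.
eps = rng.standard_t(df=3, size=n) + rng.exponential(1.0, size=n)
y = beta * x + eps

taus = np.array([0.25, 0.5, 0.75])           # composite quantile levels
weights = np.full(len(taus), 1 / len(taus))  # uniform weights (an assumption)

block_estimates = []
for xb, yb in zip(np.array_split(x, n_blocks), np.array_split(y, n_blocks)):
    X = sm.add_constant(xb)                  # intercept absorbs each quantile shift
    slopes = [QuantReg(yb, X).fit(q=t).params[1] for t in taus]
    block_estimates.append(weights @ np.array(slopes))

beta_dc = np.mean(block_estimates)           # one round of communication
print(f"DC composite slope estimate: {beta_dc:.3f} (true slope {beta})")

With uniform weights this baseline still carries the bias the abstract describes; the talk's contribution is choosing the weights so that the bias admits an explicit, robust correction.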

