Extensions to quantile regression forests for very high-dimensional data
Document type: Conference paper
Authors | Tung, Nguyen Thanh; Huang, Joshua Zhexue; Khan, Imran; Li, Mark Junjie; Williams, Graham |
Publication date | 2014 |
Conference name | 18th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining, PAKDD 2014 |
Conference location | Tainan, Taiwan |
Abstract | This paper describes new extensions to the state-of-the-art regression random forest method Quantile Regression Forests (QRF) for applications to high-dimensional data with thousands of features. We propose a new subspace sampling method that randomly samples a subset of features from two separate feature sets, one containing important features and the other containing less important features. The two feature sets partition the input data based on the importance measures of features. The partition is generated by first using feature permutation to produce raw feature importance scores and then applying p-value assessment to separate important features from the less important ones. The new subspace sampling method makes it possible to generate trees with smaller regression errors from bagged sample data. For point regression, we choose the prediction value of Y from the range between the two quantiles Q0.05 and Q0.95 instead of the conditional mean used in regression random forests. Our experimental results show that random forests with these extensions outperformed regression random forests and quantile regression forests in reducing root mean square residuals. © 2014 Springer International Publishing. (11 refs) |
Indexed by | EI |
Language | English |
Source URL | [http://ir.siat.ac.cn:8080/handle/172644/6040] |
Collection | Shenzhen Institutes of Advanced Technology (深圳先进技术研究院)_数字所 |
Author affiliation | 2014 |
Recommended citation (GB/T 7714) | Tung, Nguyen Thanh, Huang, Joshua Zhexue, Khan, Imran, et al. Extensions to quantile regression forests for very high-dimensional data[C]. In: 18th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining, PAKDD 2014. Tainan, Taiwan. |
Ingest method: OAI harvesting
Source: Shenzhen Institutes of Advanced Technology
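
The two-set subspace sampling described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the p-values are assumed to have been computed already (the permutation step is not shown), and the function names, the `alpha = 0.05` cutoff, and the per-set sample counts `n_important` and `n_less` are hypothetical choices, since the abstract does not specify them.

```python
import random


def partition_features(p_values, alpha=0.05):
    """Split feature names into (important, less_important) by p-value.

    `alpha` is an assumed cutoff; the abstract does not state one.
    """
    important = [f for f, p in p_values.items() if p < alpha]
    less_important = [f for f, p in p_values.items() if p >= alpha]
    return important, less_important


def sample_subspace(important, less_important, n_important, n_less, rng):
    """Draw a feature subspace from both sets, without replacement.

    The per-set counts are hypothetical parameters; requests larger than
    a set are capped at that set's size.
    """
    k_imp = min(n_important, len(important))
    k_less = min(n_less, len(less_important))
    return rng.sample(important, k_imp) + rng.sample(less_important, k_less)


# Made-up p-values for five features, for illustration only.
p_values = {"x1": 0.001, "x2": 0.20, "x3": 0.03, "x4": 0.70, "x5": 0.45}
important, less_important = partition_features(p_values)

# One subspace for one tree node: 1 important and 2 less important features.
subspace = sample_subspace(important, less_important, 1, 2, random.Random(0))
print(important, less_important, subspace)
```

In a forest, a fresh subspace of this kind would be drawn whenever a tree needs candidate split features, so every candidate set contains some features from the important group.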