A paper from Carnegie Mellon University on outliers and data cleaning, written entirely in English and completed in 2003.

Probabilistic Noise Identification and Data Cleaning

Real-world data is never as perfect as we would like it to be and can often suffer from corruptions that may affect interpretations of the data, models created from the data, and decisions made based on the data. One approach to this problem is to identify and remove records that contain corruptions. Unfortunately, if only certain fields in a record have been corrupted, then usable, uncorrupted data will be lost. In this paper we present LENS, an approach for identifying corrupted fields and using the remaining non-corrupted fields for subsequent modeling and analysis.
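
The contrast the abstract draws, between discarding whole records and discarding only the corrupted fields, can be illustrated with a small sketch. This is not the LENS method, whose details are not given in this listing. As a stand-in, it assumes a simple per-column outlier test based on the median and the median absolute deviation. The function names, the threshold of 3.5, and the toy data are all hypothetical and chosen only for illustration.

```python
from statistics import median


def flag_corrupted_fields(records, threshold=3.5):
    """Return a grid of booleans shaped like `records`; True marks a suspect field.

    Stand-in detector (an assumption, NOT the LENS model): a modified
    z-score computed from each column's median and median absolute
    deviation (MAD).
    """
    flags = [[False] * len(row) for row in records]
    for j, col in enumerate(zip(*records)):
        med = median(col)
        mad = median([abs(v - med) for v in col])
        if mad == 0:
            continue  # no spread in this column, so nothing is flagged
        for i, v in enumerate(col):
            if 0.6745 * abs(v - med) / mad > threshold:
                flags[i][j] = True
    return flags


def drop_corrupted_records(records, flags):
    """Record-level cleaning: discard every record with any flagged field."""
    return [row for row, f in zip(records, flags) if not any(f)]


def mask_corrupted_fields(records, flags):
    """Field-level cleaning: keep every record, replace flagged fields with None."""
    return [
        [None if bad else v for v, bad in zip(row, f)]
        for row, f in zip(records, flags)
    ]


# Hypothetical toy data: two numeric fields per record.
records = [
    [1.0, 10.0],
    [1.1, 11.0],
    [0.9, 9.0],
    [1.0, 500.0],  # second field looks corrupted
    [50.0, 10.0],  # first field looks corrupted
    [1.2, 10.5],
]

flags = flag_corrupted_fields(records)
dropped = drop_corrupted_records(records, flags)
masked = mask_corrupted_fields(records, flags)

print("records kept after dropping:", len(dropped))
print("records kept after masking:", len(masked))
print("usable values after masking:",
      sum(v is not None for row in masked for v in row))
```

On this toy data, dropping records keeps four of the six records, while masking keeps all six and discards only the two flagged values, so the uncorrupted field in each affected record remains available for later analysis.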
Tags: university, data
Uploaded: 2017-08-29
Uploaded by: thinode