What Ability of a Model Does AUC Actually Measure? Code Implementation


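When the ratio of positive to negative samples in the test set changes, the ROC curve stays the same, provided the score distribution within each class does not change. This is because the true positive rate and the false positive rate are each computed within a single class. Real datasets often show class imbalance, where negative samples far outnumber positive ones (or the reverse), and the positive/negative ratio in the test data may also drift over time.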

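AUC, the area under that curve, is likewise unaffected by class imbalance: changing the proportion of positive to negative samples does not change the AUC score.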
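The claim can be checked with a small experiment. The sketch below is not from the original post: the helper `pairwise_auc` and the scores are made up for illustration. It computes AUC directly as the fraction of (positive, negative) pairs ranked correctly, then repeats the calculation after duplicating every negative sample ten times, which changes the class ratio but not the score distribution within each class.

```python
def pairwise_auc(labels, preds):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive sample scores higher; a tied pair counts as half. O(P * N)."""
    pos = [p for y, p in zip(labels, preds) if y == 1]
    neg = [p for y, p in zip(labels, preds) if y == 0]
    wins = 0.0
    for sp in pos:
        for sn in neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical model scores, chosen only for illustration.
pos_scores = [0.9, 0.8, 0.35]
neg_scores = [0.7, 0.4, 0.3, 0.1]

# Roughly balanced test set: 3 positives, 4 negatives.
balanced_labels = [1] * 3 + [0] * 4
balanced_preds = pos_scores + neg_scores

# Imbalanced test set: same positives, every negative duplicated 10 times.
imbalanced_labels = [1] * 3 + [0] * 40
imbalanced_preds = pos_scores + neg_scores * 10

auc_balanced = pairwise_auc(balanced_labels, balanced_preds)
auc_imbalanced = pairwise_auc(imbalanced_labels, imbalanced_preds)
print(auc_balanced, auc_imbalanced)
```

Both calls give the same value, because duplicating the negatives multiplies the number of correctly ranked pairs and the total number of pairs by the same factor.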

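#author: wepon
def naive_auc(labels, preds):
    """
    The simplest brute-force approach:
    sort by predicted score, count the (positive, negative) pairs in which
    the positive sample scores higher than the negative one, then divide by
    the total number of (positive, negative) pairs.
    Complexity: O(N log N), where N is the number of samples.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    total_pair = n_pos * n_neg

    # Sort (label, pred) pairs by predicted score, ascending.
    labels_preds = sorted(zip(labels, preds), key=lambda x: x[1])
    accumulated_neg = 0  # negatives seen so far, i.e. placed lower in the sort
    satisfied_pair = 0
    # Note: tied scores get no special handling; how a tied pair is counted
    # depends on the input order.
    for label, _ in labels_preds:
        if label == 1:
            # Every negative already seen forms a satisfied pair with this positive.
            satisfied_pair += accumulated_neg
        else:
            accumulated_neg += 1

    return satisfied_pair / float(total_pair)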
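The function above does not treat tied scores specially. A common convention is to count a tied (positive, negative) pair as half a satisfied pair. The following is a minimal sketch of that variant, keeping the same sort-and-accumulate idea; the name `auc_with_ties` and the `ValueError` guard are assumptions made for illustration, not part of the original code.

```python
from itertools import groupby


def auc_with_ties(labels, preds):
    """Sort-based AUC in O(N log N); a tied (positive, negative) pair counts 0.5."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC is undefined unless both classes are present")

    # Sort (pred, label) pairs by score so equal scores are adjacent.
    pairs = sorted(zip(preds, labels))
    accumulated_neg = 0   # negatives with a strictly lower score
    satisfied_pair = 0.0
    for _, group in groupby(pairs, key=lambda x: x[0]):
        group_labels = [y for _, y in group]
        pos_in_group = sum(group_labels)
        neg_in_group = len(group_labels) - pos_in_group
        # Each positive here beats all lower negatives and ties with
        # the negatives that share its score.
        satisfied_pair += pos_in_group * (accumulated_neg + 0.5 * neg_in_group)
        accumulated_neg += neg_in_group

    return satisfied_pair / (n_pos * n_neg)


# One tie at score 0.5 between a positive and a negative.
print(auc_with_ties([1, 0, 1, 0], [0.5, 0.5, 0.9, 0.1]))  # 0.875
```

Processing each group of equal scores in one step removes the dependence on input order, at the cost of slightly more bookkeeping.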


Original article: http://www.outofmemory.cn/zaji/4881212.html
