I am trying to build an ML emotion model in PyTorch.
I have the emotion labels from the CMU-MOSEI dataset in a dataframe, like this:
| happy | sad | anger | surprise | disgust | fear |
| -- | -- | -- | -- | -- | -- |
| 1.33 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 2.0 | 0.0 | 0.0 | 0.33 | 0.0 | 0.0 |
| 0.0 | 0.0 | 1.33 | 0.33 | 2.0 | 0.0 |
Each emotion can range from 0.0 to 3.0.

The problem is: how do I normalize this data so that it lies in the range 0 to 1? The options I see are:
1. Normalize each column with:

    from sklearn.preprocessing import minmax_scale

    # Scale every emotion column independently to the range [0, 1]
    for emo in ['happy', 'sad', 'anger', 'surprise', 'disgust', 'fear']:
        mosei[emo] = minmax_scale(mosei[emo])

This gives me, for example:

    1.33, 0.0, 0.0, 0.0, 0.0, 0.0 -> 0.44, 0.0, 0.0, 0.0, 0.0, 0.0
    2.0, 0.0, 0.0, 0.33, 0.0, 0.0 -> 0.67, 0.0, 0.0, 0.11, 0.0, 0.0
    0.0, 0.0, 1.33, 0.33, 2.0, 0.0 -> 0.0, 0.0, 0.44, 0.11, 0.67, 0.0

But for the last example, sum() > 1.
2. Normalize each column, then apply softmax() in the data loader:

    >>> F.softmax(torch.tensor([0.44,0.0,0.0,0.0,0.0,0.0]), dim=0)
    tensor([0.2370, 0.1526, 0.1526, 0.1526, 0.1526, 0.1526])
    >>> F.softmax(torch.tensor([0.0,0.0,0.44,0.11,0.67,0.0]), dim=0)
    tensor([0.1312, 0.1312, 0.2037, 0.1464, 0.2564, 0.1312])

3. Normalize by row instead of by column:

    >>> minmax_scale([1.33,0.0,0.0,0.0,0.0,0.0])
    array([1., 0., 0., 0., 0., 0.])
    >>> minmax_scale([0.0,0.0,1.33,0.33,2.0,0.0])
    array([0.   , 0.   , 0.665, 0.165, 1.   , 0.   ])

But again, for the last example, sum() > 1.
- Maybe softmax again:

    >>> F.softmax(torch.tensor([0. , 0. , 0.665, 0.165, 1. , 0. ]), dim=0)
    tensor([0.1131, 0.1131, 0.2199, 0.1334, 0.3074, 0.1131])

Or is there a different or better way to normalize?
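
To make the comparison between the options concrete, here is a minimal sketch that applies column scaling and a row-wise softmax to the three sample rows. It assumes NumPy is available and that every column spans the full 0 to 3 range, so min-max scaling reduces to dividing by 3; `labels` and `softmax_rows` are hypothetical names used only for illustration. With real data, `minmax_scale` would use the observed minimum and maximum of each column instead.

```python
import numpy as np

# The three sample rows from the question
# (happy, sad, anger, surprise, disgust, fear).
labels = np.array([
    [1.33, 0.0, 0.0, 0.0, 0.0, 0.0],
    [2.0, 0.0, 0.0, 0.33, 0.0, 0.0],
    [0.0, 0.0, 1.33, 0.33, 2.0, 0.0],
])

# Option 1, under the assumption that each column covers the full 0..3 scale,
# so min-max scaling is just a division by 3.
by_column = labels / 3.0


def softmax_rows(x):
    """Softmax over each row (hypothetical helper, not part of any library)."""
    shifted = x - x.max(axis=1, keepdims=True)  # subtract the row max for numerical stability
    e = np.exp(shifted)
    return e / e.sum(axis=1, keepdims=True)


# Option 2: softmax applied to the column-scaled rows.
by_softmax = softmax_rows(by_column)

print(by_column.sum(axis=1))   # row sums after column scaling; the last one exceeds 1
print(by_softmax.sum(axis=1))  # every row sums to 1
print(by_softmax[0])           # emotions labeled 0.0 still receive non-zero mass
```

The sketch shows the trade-off already visible in the numbers above: column scaling keeps zeros at zero but does not bound the row sum, while softmax forces each row to sum to 1 at the cost of spreading mass onto emotions that were labeled 0.0.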
1 Answer
Softmax is commonly used for normalization in ML. However, you can also normalize your df directly, row by row, so that each row is normalized and sums to 1.
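
One way to get rows that sum to 1 is to divide each row by its own sum. The sketch below is an assumption about what such a normalization could look like, not necessarily the exact code this answer had in mind. It uses pandas, rebuilds the sample rows in a DataFrame called `df`, and the handling of all-zero rows (left as zeros rather than turned into NaN) is a choice made here for illustration.

```python
import pandas as pd

emotions = ["happy", "sad", "anger", "surprise", "disgust", "fear"]

# Sample rows from the question, rebuilt here so the sketch is self-contained.
df = pd.DataFrame(
    [
        [1.33, 0.0, 0.0, 0.0, 0.0, 0.0],
        [2.0, 0.0, 0.0, 0.33, 0.0, 0.0],
        [0.0, 0.0, 1.33, 0.33, 2.0, 0.0],
    ],
    columns=emotions,
)

# Sum of the emotion scores in each row.
row_sums = df[emotions].sum(axis=1)

# Assumption: a row with no emotion at all (sum 0) stays all-zero.
# Replacing its divisor with 1 avoids a 0/0 division that would produce NaN.
safe_sums = row_sums.where(row_sums > 0, 1.0)

# Divide every row by its own sum (axis=0 aligns the divisor with the row index).
normalized = df[emotions].div(safe_sums, axis=0)

print(normalized.round(3))
print(normalized.sum(axis=1))  # each non-zero row sums to 1
```

Unlike softmax, this keeps emotions labeled 0.0 at exactly 0 and preserves the ratios between the non-zero scores within a row, though it discards the absolute intensity (a row with a single 1.33 and a row with a single 3.0 both become 1.0).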