Python: How do I convert categorical values to numerical values and save the changes to the original data?

xeufq47z  posted on 2022-10-30 in Python

I have these 13 columns:

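I want to split off the "Category" column as the test set and use the rest as the training set. I am using sklearn, and sklearn works best with numerical values, so I want to make the "Sex" column numerical. I have written the following code to convert the "Sex" values (m or f) to numerical values (1 and 0):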


# Convert categorical values in 'sex' column to numerical

from sklearn import preprocessing
le=preprocessing.LabelEncoder()

sex_new=sex_new.apply(le.fit_transform)

# Check the numerical values

sex_new.Sex.unique()

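But I don't know what to do next: the original data does not seem to be affected by the categorical-to-numerical conversion.
Here is the complete code for what I have done so far: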

import sys
import pandas as pd
import numpy as np
import sklearn
import matplotlib
import keras

import matplotlib.pyplot as plt
from pandas.plotting import scatter_matrix

# Data location

url='https://archive.ics.uci.edu/ml/machine-learning-databases/00571/hcvdat0.csv'

df=pd.read_csv(url)
df.head(2)

df.info()

# Drop the unnamed column

df_=df.drop("Unnamed: 0",axis=1)

df_.info()

# Assign 'sex' column into a variable

sex_new=df_.iloc[:, 2:3]

# How many unique values in 'sex_new'?

sex_new.Sex.unique()

# Convert categorical values in 'sex' column to numerical

from sklearn import preprocessing
le=preprocessing.LabelEncoder()

sex_new=sex_new.apply(le.fit_transform)

# Check the numerical values

sex_new.Sex.unique()

Or should I just put both columns with dtype object into the test set?
If you know of any better way to train and test on this dataset, please share it with me.

aij0ehis  #1

Check the syntax of the label encoder.

Change:

sex_new=sex_new.apply(le.fit_transform)

To:

sex_new=le.fit_transform(sex_new)

The label encoder's fit_transform expects a one-dimensional array of labels, so the call should take the form fit_transform(<label>).
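To illustrate the point above, here is a minimal sketch on a small made-up Series. The values are hypothetical stand-ins for the Sex column, not rows from the real dataset, and the sketch assumes pandas and scikit-learn are installed. LabelEncoder sorts the distinct labels before numbering them, so the codes follow alphabetical order.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical stand-in for the 'Sex' column (not the real data)
sex = pd.Series(["m", "f", "f", "m"], name="Sex")

le = LabelEncoder()
# fit_transform takes a 1-D array of labels and returns a NumPy array of codes
encoded = le.fit_transform(sex)

print(encoded)      # one integer code per row
print(le.classes_)  # original labels; a label's position here is its code
```

The fitted encoder keeps the mapping in `le.classes_`, and `le.inverse_transform(encoded)` recovers the original labels if they are needed later.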

Code:

import sys
import pandas as pd
import numpy as np
import sklearn
import matplotlib
import keras

import matplotlib.pyplot as plt
from pandas.plotting import scatter_matrix

# Data location

url='https://archive.ics.uci.edu/ml/machine-learning-databases/00571/hcvdat0.csv'

df=pd.read_csv(url)
df.head()

# Drop the unnamed column

df_=df.drop("Unnamed: 0",axis=1)
df_.head()

# Assign 'sex' column into a variable

sex_new=df_.Sex
sex_new

# How many unique values in 'sex_new'?

sex_new.unique()

# Convert categorical values in 'sex' column to numerical

from sklearn import preprocessing
le=preprocessing.LabelEncoder()
sex_new=le.fit_transform(sex_new) #Edit is on this line
sex_new
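Note that fit_transform returns a new array and leaves df_ untouched, which is likely why the original data looked unchanged. For the change to persist, the result has to be assigned back to the column. Below is a minimal sketch of that step followed by a train/test split. It uses a small made-up frame instead of the real CSV (the rows are hypothetical), and it assumes that "Category" is meant to be the target and the remaining columns the features.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder

# Hypothetical rows mimicking a few columns of the dataset (not the real data)
df_ = pd.DataFrame({
    "Category": ["0=Blood Donor", "1=Hepatitis", "0=Blood Donor", "3=Cirrhosis"],
    "Age": [32, 45, 51, 60],
    "Sex": ["m", "f", "m", "f"],
})

le = LabelEncoder()
# Assign the result back so the DataFrame itself holds the numeric codes
df_["Sex"] = le.fit_transform(df_["Sex"])

# Assumption: 'Category' is the target (y), every other column is a feature (X)
X = df_.drop(columns="Category")
y = df_["Category"]

# The train/test split is over rows; each part keeps both features and target
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)
```

In this sketch the column split (features versus target) and the row split (training versus test) are two separate steps, which may be closer to what scikit-learn estimators expect than using one column as the test set.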

