Python: how to get predictions from string data in sklearn

Posted by cbwuti44 on 2023-03-28 in Python


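When I pass data from a pandas DataFrame to sklearn to make predictions, the string columns become a problem. I used LabelEncoder to work around this, but it seems to force me to predict with the encoded values rather than the original strings.
With sklearn's predict method, I want to predict on this input: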

learn_to_machine=dtc.fit(X,Y)
test=[
    [128, 6 ,50, 'mobile_phone', 'Samsung', 6000],
    [512, 8, 65, 'mobile_phone', 'Huawei',5000]
        ]
answer=learn_to_machine.predict(test)
print(answer[0])
print(answer[1])
# 11399000
# 15304000

instead of this:

learn_to_machine=dtc.fit(X,Y)
test=[
    [128, 6 ,50, 0, 2, 6000],
    [512, 8, 65, 0, 3,5000]
        ]
answer=learn_to_machine.predict(test)
print(answer[0])
print(answer[1])
# 11399000
# 15304000

In case it helps, here is my full code:

import sqlalchemy
import pandas as pd
read_engine=sqlalchemy.create_engine('mysql+mysqlconnector://root:@localhost/six')
conn = read_engine.connect()
df_new=pd.read_sql_table('mobile1' ,con= conn )
df_new['price']=df_new['price'].astype(int)
df_new['ram']=df_new['ram'].astype(int)
df_new['battery']=df_new['battery'].astype(int)
df_new['size']=df_new['size'].astype(float)
df_new['camera']=df_new['camera'].mask(df_new['camera'] == '')
df_new['camera']=df_new['camera'].mask(df_new['camera'] == ' ')
df_new['camera']=df_new['camera'].mask(df_new['camera'] == '  ')
df_new['camera']=df_new['camera'].fillna(0)
df_new['camera']=df_new['camera'].astype(float)

X=df_new[['ram','size','camera','product','Brand','battery']].values
Y=df_new[['price']].values

from sklearn import preprocessing
product_enc=preprocessing.LabelEncoder()
product_enc.fit([char for char in X[:,4]])
X[:,4]=product_enc.transform(X[:,4])
product_enc.fit([ char for char in X[:,3]])
X[:,3]=product_enc.transform(X[:,3])
from sklearn import tree
dtc=tree.DecisionTreeClassifier()
learn_to_machine=dtc.fit(X,Y)

# When I run it with this encoded input, it works
test=[
    [128, 6 ,50, 0, 2, 6000],
    [512, 8, 65, 0, 3,5000]
        ]

answer=learn_to_machine.predict(test)
print(answer[0])
print(answer[1])
# 11399000
# 15304000

But when I try to run it with this:

test=[
    [128, 6 ,50, 'mobile_phone', 'Samsung', 6000],
    [512, 8, 65, 'mobile_phone', 'Huawei',5000]
        ]

this error is raised: ValueError: could not convert string to float: 'mobile_phone'


Source: https://stackoverflow.com/questions/75856297/how-to-get-predict-from-string-data-in-sklearn

1 Answer


hmtdttj41#

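First, you should give your two LabelEncoders different names. Your code reuses product_enc for both columns, so the second fit overwrites the mapping learned for the first column: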

product_enc=preprocessing.LabelEncoder()
product_enc.fit([char for char in X[:,3]])
X[:,3]=product_enc.transform(X[:,3])

company_enc=preprocessing.LabelEncoder()
company_enc.fit([ char for char in X[:,4]])
X[:,4]=company_enc.transform(X[:,4])
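
To illustrate why separate encoders matter: a LabelEncoder only remembers the classes from its most recent fit, so reusing one object for two columns loses the first column's mapping. The following is a minimal sketch; the brand and product lists are made-up sample values, not the asker's data, and `shared_enc` is a name introduced only for this illustration.

```python
from sklearn.preprocessing import LabelEncoder

# Hypothetical sample values, for illustration only
brands = ["Apple", "Huawei", "Samsung", "Huawei"]
products = ["mobile_phone", "tablet", "mobile_phone", "tablet"]

# One shared encoder: the second fit replaces the classes learned by the first
shared_enc = LabelEncoder()
shared_enc.fit(brands)
shared_enc.fit(products)
shared_classes = list(shared_enc.classes_)  # only the product labels remain

try:
    shared_enc.transform(["Samsung"])
    shared_knows_brand = True
except ValueError:
    # 'Samsung' is no longer a known label for the shared encoder
    shared_knows_brand = False

# One encoder per column: each keeps its own string-to-code mapping
company_enc = LabelEncoder().fit(brands)
product_enc = LabelEncoder().fit(products)
samsung_code = int(company_enc.transform(["Samsung"])[0])
phone_code = int(product_enc.transform(["mobile_phone"])[0])
```

With one encoder per column, the same fitted objects can later translate raw strings in new rows into the codes the model was trained on.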

Then you can convert new raw data automatically before predicting:

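import numpy as np

# Use an object array so that column slicing such as test[:,3] works
# (a plain Python list does not support it)
test=np.array([
    [128, 6 ,50, 'mobile_phone', 'Samsung', 6000],
    [512, 8, 65, 'mobile_phone', 'Huawei',5000]
        ], dtype=object)
test_transform = test.copy()  # keep the raw strings in test untouched
test_transform[:,3] = product_enc.transform(test[:,3])
test_transform[:,4] = company_enc.transform(test[:,4])

answer=learn_to_machine.predict(test_transform)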
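
As an alternative to encoding each column by hand, the encoding step can be bundled with the model so that predict accepts the raw strings directly. The sketch below swaps in sklearn's ColumnTransformer, OrdinalEncoder, and Pipeline, which is not what the answer above uses. The training rows are hypothetical stand-ins for the `mobile1` table (the third row and its price are invented), and the column names are taken from the question.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OrdinalEncoder
from sklearn.tree import DecisionTreeClassifier

columns = ["ram", "size", "camera", "product", "Brand", "battery"]

# Hypothetical training rows standing in for the 'mobile1' table
train = pd.DataFrame(
    [
        [128, 6.0, 50.0, "mobile_phone", "Samsung", 6000],
        [512, 8.0, 65.0, "mobile_phone", "Huawei", 5000],
        [64, 5.5, 12.0, "mobile_phone", "Apple", 3000],
    ],
    columns=columns,
)
prices = [11399000, 15304000, 9000000]  # made-up target values

# Encode only the string columns; pass the numeric columns through unchanged
encode = ColumnTransformer(
    [("cat", OrdinalEncoder(), ["product", "Brand"])],
    remainder="passthrough",
)
model = Pipeline([
    ("encode", encode),
    ("tree", DecisionTreeClassifier(random_state=0)),
])
model.fit(train, prices)

# New rows keep their raw strings; the pipeline encodes them before predicting
test = pd.DataFrame(
    [
        [128, 6, 50, "mobile_phone", "Samsung", 6000],
        [512, 8, 65, "mobile_phone", "Huawei", 5000],
    ],
    columns=columns,
)
answer = model.predict(test)
print(answer[0])
print(answer[1])
```

Because the encoder is fitted inside the pipeline, there is no separate encoder object to keep in sync with the model. By default OrdinalEncoder raises an error for a category it did not see during fit; if unseen brands are expected, its `handle_unknown="use_encoded_value"` option with an `unknown_value` may be worth considering.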

Answered on 2023-03-28
