Jupyter kernel keeps dying when converting Pandas to NumPy

vulvrdjw, posted on 2023-02-06 in Other

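I have the code below, which I am trying to run line by line in a Jupyter notebook, but the kernel dies as soon as it reaches the line where the Pandas DataFrame data is converted to NumPy.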

#importing libraries
import sqlalchemy
import spacy
import numpy as np
import pandas as pd

#connecting to database and reading into dataframe with sqlalchemy
user_inputs = "SELECT * FROM t1"
rasa_questions = "SELECT * FROM o2"

server = 'DEM'
db = 's'

engine = sqlalchemy.create_engine('mssql+pyodbc://' + server + '/' + db + '?driver=SQL+Server')

user_inputs_df = pd.read_sql_query(user_inputs, engine)
rasa_questions_df = pd.read_sql_query(rasa_questions, engine)

#loading spacy
nlp = spacy.load("de_core_news_lg")

rasa_questions_list = rasa_questions_df["F"]
user_input_list = user_inputs_df["U"]

rasa_vector = [nlp(s).vector for s in rasa_questions_list]
user_vector = [nlp(s).vector for s in user_input_list]

similarity_scores = np.inner(rasa_vector, user_vector) / (np.linalg.norm(rasa_vector, axis=1) * np.linalg.norm(user_vector, axis=1))

data = []
for i in range(len(rasa_questions_list)):
    for j in range(len(user_input_list)):
        data.append([rasa_questions_list[i], user_input_list[j], similarity_scores[i][j]])

O2_Similarity_Scores = pd.DataFrame(data, columns=['RASA Frage', 'User Input', 'Similarity Score'])
print(O2_Similarity_Scores)

So this is the line that kills the kernel: similarity_scores = np.inner(rasa_vector, user_vector) / (np.linalg.norm(rasa_vector, axis=1) * np.linalg.norm(user_vector, axis=1))
I am using Windows 10 and Python 3.9.12. What am I doing wrong?

Answer #1, by c3frrgcw

Expanding the code from your comment to make it readable:

dot_similarity_scores = np.matmul(rasa_vector, user_vector) 
rasa_vector_norm = np.linalg.norm(rasa_vector, axis=0) 
user_vector_norm = np.linalg.norm(user_vector, axis=1)

runs without error. The next step,

rasa_vector_norm_2d = np.reshape(rasa_vector_norm, (300, 1)) 
user_vector_norm_2d = np.reshape(user_vector_norm, (1, 300))

is also fine. But trying to compute

norm_product = rasa_vector_norm_2d * user_vector_norm_2d

raises the error

ValueError: operands could not be broadcast together with shapes (300,211537) (12234,300)

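Those last shapes would be fine for matmul with the operands swapped, but they are wrong for element-wise multiplication.
After the preceding reshape, however, those arrays should be (300,1) and (1,300), and "*" would produce a (300,300) result.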
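
To make the shape reasoning above concrete, here is a minimal sketch of a pairwise cosine similarity that relies on the same (n,1) * (1,m) broadcasting. It assumes both inputs are 2-D stacks of row vectors with the same width (300 here, to match the reshape above). The names cosine_similarity_matrix, result_size_gib, rasa_demo and user_demo are hypothetical and are not from the original post, and the random arrays only stand in for the spaCy vectors.

```python
import numpy as np


def cosine_similarity_matrix(a, b):
    """Cosine similarity between every row of a (n, d) and every row of b (m, d)."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    dots = a @ b.T                                # (n, m) dot products
    a_norm = np.linalg.norm(a, axis=1)[:, None]   # (n, 1) column of row norms
    b_norm = np.linalg.norm(b, axis=1)[None, :]   # (1, m) row of row norms
    # (n, 1) * (1, m) broadcasts to (n, m), matching the shape of dots.
    return dots / (a_norm * b_norm)


def result_size_gib(n, m, itemsize=4):
    """Memory needed for a dense (n, m) result, in GiB (itemsize=4 means float32)."""
    return n * m * itemsize / 2**30


# Toy stand-ins for the real vectors (assumption: 300-dimensional rows).
rng = np.random.default_rng(0)
rasa_demo = rng.normal(size=(4, 300))
user_demo = rng.normal(size=(6, 300))

scores = cosine_similarity_matrix(rasa_demo, user_demo)
print(scores.shape)  # (4, 6)

# If the inputs really had 211537 and 12234 rows, as the error message hints,
# the dense float32 result alone would need several GiB.
print(result_size_gib(211537, 12234))
```

If the two row counts are as large as the shapes in the error message suggest, the full result matrix may not fit in memory at all, which would be one plausible reason for the kernel dying rather than raising an exception. That is a guess from the reported shapes, not something the post confirms.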
