numpy: Found array with dim 3. check_pairwise_arrays expected <= 2

djmepvbi  posted 12 months ago in Other

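I have a text file, 'commands.txt', that contains a list of commands, and two Python files, train.py and main.py. train.py creates a model named commands_model.h5. In main.py, I want to type in a command, have the script use that model, and get back the matching command from commands.txt. But when I run main.py, it shows this error: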

Started
2023-09-06 13:44:05.970618: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
Enter an command: check connection
1/1 [==============================] - 0s 327ms/step
1/1 [==============================] - 0s 26ms/step
1/1 [==============================] - 0s 31ms/step
1/1 [==============================] - 0s 16ms/step
1/1 [==============================] - 0s 31ms/step
1/1 [==============================] - 0s 31ms/step
1/1 [==============================] - 0s 47ms/step
Traceback (most recent call last):
  File "D:\Advanced Robot\commands\test.py", line 32, in <module>
    similarity_scores = cosine_similarity(user_input_embedding, command_embeddings)
  File "C:\Users\hp\AppData\Local\Programs\Python\Python310\lib\site-packages\sklearn\metrics\pairwise.py", line 1393, in cosine_similarity
    X, Y = check_pairwise_arrays(X, Y)
  File "C:\Users\hp\AppData\Local\Programs\Python\Python310\lib\site-packages\sklearn\metrics\pairwise.py", line 163, in check_pairwise_arrays
    Y = check_array(
  File "C:\Users\hp\AppData\Local\Programs\Python\Python310\lib\site-packages\sklearn\utils\validation.py", line 915, in check_array
    raise ValueError(
ValueError: Found array with dim 3. check_pairwise_arrays expected <= 2.

Here is my main.py:

print("Started")
import tensorflow as ten
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
from sklearn.metrics.pairwise import cosine_similarity
import numpy as np

tokenizer = Tokenizer()

loaded_model = ten.keras.models.load_model('commands_model.h5')

max_sequence_length = 100

inp = input("Enter an command: ")
user_input_sequence = tokenizer.texts_to_sequences([inp])
padded_user_input = pad_sequences(user_input_sequence, maxlen=max_sequence_length, padding='post', truncating='post')

user_input_embedding = loaded_model.predict(padded_user_input)

command_embeddings = []

with open('commands.txt', 'r') as file:
    commands = file.read().splitlines()

for command in commands:
    command_sequence = tokenizer.texts_to_sequences([command])
    paded_command = pad_sequences(command_sequence, maxlen=max_sequence_length, padding='post', truncating='post')
    command_embedding = loaded_model.predict(paded_command)
    command_embeddings.append(command_embedding)

command_embeddings = np.array(command_embeddings)
similarity_scores = cosine_similarity(user_input_embedding, command_embeddings)
most_similar_command_index = np.argmax(similarity_scores)
most_similar_command = commands[most_similar_command_index]

print("Most similar: ", most_similar_command)

Here is my train.py:

print("Started")

import tensorflow as tf
import numpy as np
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
from sklearn.preprocessing import LabelEncoder
print('Imported')
# Load the file that contains the list of commands
with open('commands.txt', 'r') as command_file:
    text_data = command_file.read().splitlines()

vocab_size = 1000
embedding_dim = 16
num_epochs = 500
batch_size = 32
labels = []
for word in text_data:
    labels.append(word)

lbl_encoder = LabelEncoder()
lbl_encoder.fit(labels)
labels = lbl_encoder.transform(labels)

# Tokenize the text
tokenizer = Tokenizer()
tokenizer.fit_on_texts(text_data)
sequences = tokenizer.texts_to_sequences(text_data)

# Pad sequences to make them the same length
max_sequence_length = 100
paded_sequences = pad_sequences(sequences, maxlen=max_sequence_length, padding='post', truncating='post')

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(input_dim=vocab_size, output_dim=embedding_dim, input_length=max_sequence_length),
    tf.keras.layers.LSTM(units=64),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
print("Compiling....")
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
from sklearn.model_selection import train_test_split

x_train, x_val, y_train, y_val = train_test_split(paded_sequences, np.array(labels), test_size=0.2, random_state=42)
model.fit(x_train, y_train, epochs=num_epochs, batch_size=batch_size, validation_data=(x_val, y_val))
print("Model saving....")
model.save("commands_model.h5")
print("Model saved")

I can't work out which file contains the error. Is the way I create the model wrong, or is the error in main.py?

cbjzeqam1#

The error says that an array with 3 dimensions was found where at most 2 were expected. It is raised on this line:

similarity_scores = cosine_similarity(user_input_embedding, command_embeddings)

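The traceback shows the check failing on the second argument (Y), which is command_embeddings. Each loaded_model.predict() call returns a 2D array, and np.array() stacks the list of those 2D arrays into a single 3D array, whereas cosine_similarity() expects both inputs to be 2D.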
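The shapes involved can be checked without loading the model. This is a minimal sketch, assuming each predict() call returns a (1, 1) array, as the Dense(1) output layer in train.py implies. The random arrays and the name fake_predictions are stand-ins for illustration, not real model output.

```python
import numpy as np

# Hypothetical stand-ins for loaded_model.predict(...) results:
# one (1, 1) array per command, matching a Dense(1) output layer.
rng = np.random.default_rng(0)
fake_predictions = [rng.random((1, 1)) for _ in range(4)]

# np.array() adds a new leading axis, so four 2D arrays become one 3D array.
stacked = np.array(fake_predictions)

# np.vstack() joins along the existing first axis and keeps the result 2D.
flattened = np.vstack(fake_predictions)

print(stacked.shape)    # (4, 1, 1)
print(flattened.shape)  # (4, 1)
```

The first shape is what main.py passes as the second argument to cosine_similarity(); the second is the kind of shape it accepts.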
You can reshape the user input and each command embedding to a 2D array, then compute the similarities like this:

similarity_scores = []

for command_embedding in command_embeddings:
    similarity = cosine_similarity(user_input_embedding.reshape(1, -1), command_embedding.reshape(1, -1))
    similarity_scores.append(similarity[0][0])

most_similar_command_index = np.argmax(similarity_scores)
most_similar_command = commands[most_similar_command_index]
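The loop above can probably also be replaced by a single call once all command embeddings are stacked into one 2D matrix. The following is a sketch under stated assumptions rather than a drop-in fix: the embeddings are random stand-ins, and embedding_size = 8 is a hypothetical value chosen only so that the scores differ from one another.

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical stand-ins for model output (not real predictions).
rng = np.random.default_rng(0)
embedding_size = 8  # assumption for illustration only
command_embeddings = [rng.random((1, embedding_size)) for _ in range(5)]
user_input_embedding = command_embeddings[2].copy()  # pretend the user typed command 2

# Flatten each embedding to a single row so both arguments are 2D.
command_matrix = np.vstack([e.reshape(1, -1) for e in command_embeddings])  # (5, 8)
query = np.asarray(user_input_embedding).reshape(1, -1)                      # (1, 8)

# One call compares the query against every command at once.
similarity_scores = cosine_similarity(query, command_matrix)  # shape (1, 5)
most_similar_command_index = int(np.argmax(similarity_scores))
print(most_similar_command_index)
```

Note that the model in train.py ends in Dense(1), so each embedding there has a single feature. The cosine similarity of two positive scalars is always 1, so with that model the scores cannot tell the commands apart; a wider output layer would be needed for the comparison to be meaningful.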
