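I have successfully implemented and run an autoencoder for image data (MNIST digits). I use Spyder through Anaconda Navigator, running Python 3.7.1.
I built a simple CNN based on vetted examples. My code runs through model definition and loading of the training data (in this case, CIFAR10). When I call model.fit(), the code crashes with no error message and leaves no variables in the kernel.
1. How can I monitor the execution of this code to better understand why it crashes?
2. Is an error in my code causing the crash, or could this be an environment or memory error?
I copied similar code from CNN examples that presumably work and reproduced the behavior with that published code (although my autoencoder code works in the same environment).
Here is the relevant portion of my code: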
from keras.layers import Input, Dense, Flatten, Conv2D, MaxPooling2D
from keras.models import Model
from keras.utils import to_categorical
from keras.datasets import cifar10
proceedtofit = True
#define input shape
input=Input(shape=(32,32,3))
#define layers
predictions=Conv2D(16,(3,3),activation='relu',padding='same')(input)
predictions=MaxPooling2D(pool_size=(2,2),strides=None,padding='same')(predictions)
predictions=Conv2D(4,(3,3),activation='relu',padding='same')(predictions)
predictions=MaxPooling2D(pool_size=(2,2),strides=None,padding='same')(predictions)
predictions=Flatten()(predictions)
predictions=Dense(32,activation='relu')(predictions)
predictions=Dense(10,activation='sigmoid')(predictions)
#integrate into model
model=Model(inputs=input,outputs=predictions)
#print("Succesfully integrated model.")
model.summary()
#compile (choose optimizer and loss function)
model.compile(loss='categorical_crossentropy',metrics=['accuracy'],optimizer='adam')
#input training and test data
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
# Scale pixel values to the range [0, 1].
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
# Convert class vectors to binary class matrices.
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)
#train model
if proceedtofit:
    model.fit(x_train, y_train, batch_size=10, epochs=50, shuffle=True,
              validation_data=(x_test, y_test))
print("Finished fit.")
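The code executes in the kernel and produces the expected model summary. If proceedtofit is False, the code exits normally. If proceedtofit is True, the code calls model.fit() and crashes. The verbose output, from start to finish, is: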
Python 3.7.0 (default, Jun 28 2018, 07:39:16)
Type "copyright", "credits" or "license" for more information.
IPython 7.2.0 -- An enhanced Interactive Python.
runfile('/Users/Fox/Documents/Python Machine Learning/convclass.py', wdir='/Users/Fox/Documents/Python Machine Learning')
WARNING:tensorflow:From /Applications/anaconda3/lib/python3.7/site-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
Using TensorFlow backend.
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_1 (InputLayer)         (None, 32, 32, 3)         0
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 32, 32, 16)        448
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 16, 16, 16)        0
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 16, 16, 4)         580
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 8, 8, 4)           0
_________________________________________________________________
flatten_1 (Flatten)          (None, 256)               0
_________________________________________________________________
dense_1 (Dense)              (None, 32)                8224
_________________________________________________________________
dense_2 (Dense)              (None, 10)                330
=================================================================
Total params: 9,582
Trainable params: 9,582
Non-trainable params: 0
_________________________________________________________________
(50000, 1)
(50000, 10)
WARNING:tensorflow:From /Applications/anaconda3/lib/python3.7/site-packages/tensorflow/python/ops/math_ops.py:3066: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
Train on 50000 samples, validate on 10000 samples
Epoch 1/50
2019-08-04 16:32:52.400023: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX
2019-08-04 16:32:52.400364: I tensorflow/core/common_runtime/process_util.cc:71] Creating new thread pool with default inter op setting: 8. Tune using inter_op_parallelism_threads for best performance.
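At this point the code exits and returns to the kernel prompt. Training (the fit) was not performed, and no error was returned. The model no longer exists in memory; that is, calling model.summary() at the prompt produces the following error: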
In [1]: model.summary()
Traceback (most recent call last):
  File "<ipython-input-1-5f15418b3570>", line 1, in <module>
    model.summary()
NameError: name 'model' is not defined
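Following a suggestion in the comments, I ran the code in a terminal. I did get more verbose output and an error report. I don't understand it yet, but at least it is a place to start. Thoughts? (See below.)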
OMP: Error #15: Initializing libiomp5.dylib, but found libiomp5.dylib already initialized.
OMP: Hint: This means that multiple copies of the OpenMP runtime have been
linked into the program. That is dangerous, since it can degrade performance or
cause incorrect results. The best thing to do is to ensure that only a single
OpenMP runtime is linked into the process, e.g. by avoiding static linking of the
OpenMP runtime in any library. As an unsafe, unsupported, undocumented workaround
you can set the environment variable KMP_DUPLICATE_LIB_OK=TRUE to allow the program
to continue to execute, but that may cause crashes or silently produce incorrect
results. For more information, please see
http://www.intel.com/software/products/support/.
Abort trap: 6
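The hint above says that more than one copy of the OpenMP runtime ended up in the process. As a rough check, the sketch below lists every file under the active Python environment whose name starts with libiomp5. The function name find_openmp_copies and its parameters are made up for this illustration, and finding several files only suggests a duplicate; it does not prove that both are loaded at run time.

```python
import sys
from pathlib import Path


def find_openmp_copies(root=sys.prefix, pattern="libiomp5*"):
    """Return sorted paths under root whose file name matches pattern.

    root defaults to the active Python environment (sys.prefix);
    both parameters are illustrative choices, not part of any library API.
    """
    return sorted(str(path) for path in Path(root).rglob(pattern))
```

Printing the result of find_openmp_copies() inside the Anaconda environment would show where each libiomp5 file lives, which may help identify the packages that ship their own copy.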
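I found this question: "Error #15: Initializing libiomp5.dylib, but found libiomp5.dylib already initialized". It looks promising. I will explore the suggestions offered there; perhaps this question should then be merged with that discussion?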
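One option named in the OMP message itself is the KMP_DUPLICATE_LIB_OK environment variable. A minimal sketch of setting it from Python follows, assuming the variable has to be in place before keras or tensorflow is imported; the helper name allow_duplicate_openmp is hypothetical. The message describes this workaround as unsafe and unsupported, so it is probably only useful for confirming the diagnosis, not as a fix.

```python
import os


def allow_duplicate_openmp():
    """Set the workaround variable quoted in the OMP Error #15 message.

    Assumption: this must run before importing keras/tensorflow so the
    OpenMP runtime sees the variable when it initializes.
    The OMP message warns this may cause crashes or incorrect results.
    """
    os.environ["KMP_DUPLICATE_LIB_OK"] = "TRUE"
```

If training proceeds after calling allow_duplicate_openmp() at the very top of the script, that would point to the duplicated runtime rather than to the model code.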
1 Answer
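After running the code in a command shell instead of Spyder, I captured the error and identified a related question that had already been answered.
Based on the discussion in "Error #15: Initializing libiomp5.dylib, but found libiomp5.dylib already initialized", I removed tensorflow using
conda remove tensorflow
and then used two further commands (their text is missing here).
I then re-ran the code, and everything worked in both the command shell and Spyder.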
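On the first question (how to monitor the run), the approach that worked here, running outside Spyder, can also be scripted. The sketch below launches a script in a separate Python process with the standard faulthandler module enabled and returns the exit code together with everything written to stderr. run_and_capture is a hypothetical helper name, not something from Keras or Spyder.

```python
import subprocess
import sys


def run_and_capture(script_path):
    """Run script_path in a fresh interpreter; return (exit code, stderr text)."""
    result = subprocess.run(
        # -X faulthandler asks Python to dump a traceback on fatal signals
        # such as SIGABRT or SIGSEGV.
        [sys.executable, "-X", "faulthandler", script_path],
        capture_output=True,  # requires Python 3.7 or later
        text=True,
    )
    # On POSIX systems a negative code means the child was killed by a signal.
    return result.returncode, result.stderr
```

Calling run_and_capture on the script from the question should return the OMP text instead of a silent kernel restart. On macOS or Linux, a return code of -6 would correspond to the abort signal reported as "Abort trap: 6".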