I'm trying to learn Keras by running some of the provided examples (e.g. https://github.com/keras-team/keras/blob/tf-keras-2/examples/mnist_cnn.py in this case) and seeing what happens. However, the baseline result stated at the top of that file, 99.25% test accuracy, is far higher than what I get in Google Colab (with a T4 GPU), namely 85% (0.8503000140190125).
Simply copy-pasting the linked file into Google Colab gives me the following output:
x_train shape: (60000, 28, 28, 1)
60000 train samples
10000 test samples
Epoch 1/12
469/469 [==============================] - 8s 10ms/step - loss: 2.2807 - accuracy: 0.1435 - val_loss: 2.2415 - val_accuracy: 0.3414
Epoch 2/12
469/469 [==============================] - 4s 10ms/step - loss: 2.2157 - accuracy: 0.2814 - val_loss: 2.1615 - val_accuracy: 0.5900
Epoch 3/12
469/469 [==============================] - 4s 9ms/step - loss: 2.1305 - accuracy: 0.4081 - val_loss: 2.0526 - val_accuracy: 0.6552
Epoch 4/12
469/469 [==============================] - 4s 9ms/step - loss: 2.0150 - accuracy: 0.4893 - val_loss: 1.9049 - val_accuracy: 0.6928
Epoch 5/12
469/469 [==============================] - 5s 10ms/step - loss: 1.8653 - accuracy: 0.5421 - val_loss: 1.7169 - val_accuracy: 0.7290
Epoch 6/12
469/469 [==============================] - 4s 9ms/step - loss: 1.6864 - accuracy: 0.5822 - val_loss: 1.4985 - val_accuracy: 0.7573
Epoch 7/12
469/469 [==============================] - 5s 10ms/step - loss: 1.4975 - accuracy: 0.6175 - val_loss: 1.2778 - val_accuracy: 0.7841
Epoch 8/12
469/469 [==============================] - 4s 9ms/step - loss: 1.3218 - accuracy: 0.6478 - val_loss: 1.0859 - val_accuracy: 0.8070
Epoch 9/12
469/469 [==============================] - 5s 10ms/step - loss: 1.1783 - accuracy: 0.6739 - val_loss: 0.9350 - val_accuracy: 0.8256
Epoch 10/12
469/469 [==============================] - 4s 10ms/step - loss: 1.0702 - accuracy: 0.6944 - val_loss: 0.8224 - val_accuracy: 0.8354
Epoch 11/12
469/469 [==============================] - 4s 9ms/step - loss: 0.9836 - accuracy: 0.7120 - val_loss: 0.7383 - val_accuracy: 0.8433
Epoch 12/12
469/469 [==============================] - 4s 9ms/step - loss: 0.9166 - accuracy: 0.7276 - val_loss: 0.6741 - val_accuracy: 0.8503
Test loss: 0.6741476655006409
Test accuracy: 0.8503000140190125
As you can see, Google Colab also takes far less time per epoch than the comments at the top of the example file suggest. I wonder whether I'm missing something. For example, why do they say "There is still a lot of margin for parameter tuning"? Is this meant to be some kind of "tutorial" where I'm supposed to tune those parameters until I reach their "holy grail" of 99.25%?
1 Answer
According to this SO answer, the problem comes from the different default learning rates between the old Keras implementation the example was written against (learning_rate=1.0) and the current tensorflow/Keras implementation of Adadelta (learning_rate=0.001). If you set the learning rate explicitly to 1:
optimizer=tf.keras.optimizers.Adadelta(learning_rate=1)
you will get around 0.99 accuracy within the 12 epochs.
This is also hinted at in the current API reference for Adadelta: "Note that Adadelta tends to benefit from higher initial learning rate values compared to other optimizers. To match the exact form in the original paper, use 1.0."
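For concreteness, here is a minimal sketch (not the original script, but roughly the same layer stack as the linked mnist_cnn.py) showing the compile step with the learning rate set explicitly:

import tensorflow as tf
from tensorflow import keras

# Roughly the CNN from the linked example (Conv-Conv-Pool-Dropout-Dense-Dropout-Dense).
model = keras.Sequential([
    keras.layers.Conv2D(32, (3, 3), activation="relu", input_shape=(28, 28, 1)),
    keras.layers.Conv2D(64, (3, 3), activation="relu"),
    keras.layers.MaxPooling2D(pool_size=(2, 2)),
    keras.layers.Dropout(0.25),
    keras.layers.Flatten(),
    keras.layers.Dense(128, activation="relu"),
    keras.layers.Dropout(0.5),
    keras.layers.Dense(10, activation="softmax"),
])

# The only change versus copy-pasting the example as-is: pass learning_rate=1.0
# explicitly, since the current tf.keras Adadelta defaults to 0.001 while the
# old Keras default was 1.0.
model.compile(
    loss="categorical_crossentropy",
    optimizer=tf.keras.optimizers.Adadelta(learning_rate=1.0),
    metrics=["accuracy"],
)

Everything else (loading and reshaping MNIST, then fit with batch_size=128 and epochs=12) can stay as in the original script; with the explicit learning rate the run should converge to roughly the quoted 99% instead of stalling in the 80s.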