Keras .predict() runs only on the CPU even though a GPU is available

Posted by mwngjboj on 2023-04-30 in Other

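On a machine with GPUs installed and enabled, I used this script to train a model and run predictions, and it appears to use only the CPU during the prediction stage.
The device placement log I see during the .predict() part looks like this: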

2020-09-01 06:08:19.085400: I tensorflow/core/common_runtime/eager/execute.cc:573] Executing op RangeDataset in device /job:localhost/replica:0/task:0/device:CPU:0
2020-09-01 06:08:19.085617: I tensorflow/core/common_runtime/eager/execute.cc:573] Executing op RepeatDataset in device /job:localhost/replica:0/task:0/device:CPU:0
2020-09-01 06:08:19.089558: I tensorflow/core/common_runtime/eager/execute.cc:573] Executing op MapDataset in device /job:localhost/replica:0/task:0/device:CPU:0
2020-09-01 06:08:19.090003: I tensorflow/core/common_runtime/eager/execute.cc:573] Executing op PrefetchDataset in device /job:localhost/replica:0/task:0/device:CPU:0
2020-09-01 06:08:19.097064: I tensorflow/core/common_runtime/eager/execute.cc:573] Executing op FlatMapDataset in device /job:localhost/replica:0/task:0/device:CPU:0
2020-09-01 06:08:19.097647: I tensorflow/core/common_runtime/eager/execute.cc:573] Executing op TensorDataset in device /job:localhost/replica:0/task:0/device:CPU:0
2020-09-01 06:08:19.097802: I tensorflow/core/common_runtime/eager/execute.cc:573] Executing op RepeatDataset in device /job:localhost/replica:0/task:0/device:CPU:0
2020-09-01 06:08:19.097957: I tensorflow/core/common_runtime/eager/execute.cc:573] Executing op ZipDataset in device /job:localhost/replica:0/task:0/device:CPU:0
2020-09-01 06:08:19.101284: I tensorflow/core/common_runtime/eager/execute.cc:573] Executing op ParallelMapDataset in device /job:localhost/replica:0/task:0/device:CPU:0
2020-09-01 06:08:19.101865: I tensorflow/core/common_runtime/eager/execute.cc:573] Executing op ModelDataset in device /job:localhost/replica:0/task:0/device:CPU:0

This happens even though, when I run

print(tf.config.experimental.list_physical_devices('GPU'))

I get:

[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU'), PhysicalDevice(name='/physical_device:GPU:1', device_type='GPU'), PhysicalDevice(name='/physical_device:GPU:2', device_type='GPU')]

The code I am using can be found here; the full output logs can be seen here.
More context:
Python: Python
TensorFlow: 2.1.0
GPU: Nvidia Tesla V100-PCIE-16GB
CPU: Intel Xeon Gold 5218 CPU @ 2.30GHz
Memory: 394851272 KB
OS: Linux

lmyy7pcs1#

Since you already have a GPU, I assume tf.test.is_gpu_available() returns True. You can use this code to force TensorFlow to use a specific device:

import tensorflow as tf

with tf.device('/gpu:0'):
    pass  # replace with the ops that should run on the GPU

The same approach works if you want to force some part of the code to run on the CPU instead.

with tf.device('/cpu:0'):
    pass  # replace with the ops that should run on the CPU

One addition that may be useful when working with tf.device(): you can use this function to list all the devices you have:

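from tensorflow.python.client import device_lib

def get_available_devices():
    # Return the names of all devices TensorFlow can see (CPUs and GPUs).
    local_device_protos = device_lib.list_local_devices()
    return [x.name for x in local_device_protos]

get_available_devices()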

That said, for the use case you describe, I cannot guarantee that inference will be faster on the GPU.
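To check whether a tf.device() block really places an op where requested, one option is to inspect the device attribute of the resulting eager tensor. The following is only a minimal sketch, assuming TensorFlow 2.1 or later with eager execution; pick_device is a hypothetical helper introduced for illustration and is not part of the asker's script.

```python
import tensorflow as tf

def pick_device():
    """Return '/GPU:0' when TensorFlow sees at least one GPU, else '/CPU:0'."""
    gpus = tf.config.list_physical_devices('GPU')
    return '/GPU:0' if gpus else '/CPU:0'

device_name = pick_device()
with tf.device(device_name):
    # Both the constant and the matmul are requested on the chosen device.
    x = tf.constant([[1.0, 2.0], [3.0, 4.0]])
    y = tf.matmul(x, x)

# The full device string of the result shows where the op actually ran.
print(device_name, '->', y.device)
```

If the printed device string ends in GPU:0, the op ran on the first GPU; the same check can be applied to the output of a single model call made inside the with block.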

4szc88ey2#

It sounds like you need to use a distribution strategy, as described in the documentation. Your code would then look something like this:

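import tensorflow as tf
from tensorflow import keras

# train_images, train_labels, test_images and test_labels come from the script in the question.
tf.debugging.set_log_device_placement(True)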
strategy = tf.distribute.MirroredStrategy()

with strategy.scope():
    model = keras.Sequential(
        [
            keras.layers.Flatten(input_shape=(28, 28)),
            keras.layers.Dense(128, activation='relu'),
            keras.layers.Dense(10)
        ]
    )
    model.compile(
        optimizer='adam', 
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True), 
        metrics=['accuracy']
    )
    model.fit(train_images, train_labels, epochs=10)

    test_loss, test_acc = model.evaluate(test_images,  test_labels, verbose=2)
    probability_model = tf.keras.Sequential(
        [model, tf.keras.layers.Softmax()]
    )
    probability_model.predict(test_images)

According to the documentation, the best practice for using multiple GPUs is to use tf.distribute.Strategy.

hvvq6cgz3#

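Your predict function is using the GPU. I also re-timed your code on an NVIDIA 1080 GTX, and inference takes 100 ms.
Restart the system, or check whether the GPU is actually being used.
Here is the log line showing that inference runs on the GPU:
2020-09-01 06:19:15.885778: I tensorflow/core/common_runtime/eager/execute.cc:573] Executing op __inference_distributed_function_58022 in device /job:localhost/replica:0/task:0/device:GPU:0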
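The placement log is long, and the ops listed in the question are all tf.data input-pipeline ops (names ending in Dataset), so it can help to group the log by device and look for where the inference function itself ran. The sketch below is one possible way to do that with the standard library only; ops_by_device and SAMPLE_LOG are hypothetical names, and the sample lines are copied from the logs quoted in this thread.

```python
import re
from collections import defaultdict

# Matches the eager placement lines printed by tf.debugging.set_log_device_placement(True).
PLACEMENT = re.compile(r"Executing op (\S+) in device \S*device:(\w+:\d+)")

def ops_by_device(log_text):
    """Group op names by the device (e.g. 'CPU:0') they were executed on."""
    grouped = defaultdict(list)
    for op_name, device in PLACEMENT.findall(log_text):
        grouped[device].append(op_name)
    return dict(grouped)

SAMPLE_LOG = """\
2020-09-01 06:08:19.085400: I tensorflow/core/common_runtime/eager/execute.cc:573] Executing op RangeDataset in device /job:localhost/replica:0/task:0/device:CPU:0
2020-09-01 06:08:19.101865: I tensorflow/core/common_runtime/eager/execute.cc:573] Executing op ModelDataset in device /job:localhost/replica:0/task:0/device:CPU:0
2020-09-01 06:19:15.885778: I tensorflow/core/common_runtime/eager/execute.cc:573] Executing op __inference_distributed_function_58022 in device /job:localhost/replica:0/task:0/device:GPU:0
"""

for device, ops in ops_by_device(SAMPLE_LOG).items():
    print(device, ops)
```

In this sample the Dataset ops land under CPU:0 while the __inference_distributed_function op lands under GPU:0, which is the pattern this answer describes.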

yhived7q4#

Are you using the correct tensorflow package? Uninstalling tensorflow and installing tensorflow-gpu might help.
For documentation, see: https://www.tensorflow.org/install/gpu

luaexgnf5#

Can you import Keras from tensorflow.python?
Like this:

from tensorflow.python.keras.models import Sequential

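Also, check your CUDA and cuDNN versions. They must be compatible with your TensorFlow version. You can check from here. Since your TensorFlow version is 2.1, the CUDA and cuDNN versions must be 10.1 and 7.6, respectively.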

a2mppw5e6#

Since you have a GPU, tf.test.is_gpu_available() should return True. This code forces TensorFlow to use the specified device:

import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0"
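Note that CUDA_VISIBLE_DEVICES limits which GPUs CUDA-based libraries can see rather than moving any particular op onto a GPU, and it needs to be set before TensorFlow initializes CUDA, which in practice usually means before the first import of tensorflow. A small sketch of that ordering follows; restrict_visible_gpus is a hypothetical helper, not a TensorFlow API.

```python
import os

def restrict_visible_gpus(indices):
    """Expose only the given GPU indices to CUDA-based libraries in this process."""
    value = ",".join(str(i) for i in indices)
    os.environ["CUDA_VISIBLE_DEVICES"] = value
    return value

# Set the variable first, then import TensorFlow.
visible = restrict_visible_gpus([0])
print("CUDA_VISIBLE_DEVICES =", visible)
# import tensorflow as tf  # import only after the variable has been set
```

With three GPUs as in the question, restrict_visible_gpus([0, 2]) would set the variable to "0,2", and TensorFlow would then number the two visible devices GPU:0 and GPU:1.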
