What changes should I make to my code?

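Problem description

Problem

I am using TensorFlow to train a CNN model on my GPU, but it runs out of memory.

What I have tried

I tried changing my batch_size. That helped, but training still eventually ran out of memory.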

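Code

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, Activation, MaxPooling2D, Flatten, Dense

# X holds the training images and Y the binary labels (loaded elsewhere).
model = Sequential()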

model.add(Conv2D(64, (3, 3), input_shape=X.shape[1:]))
model.add(Activation("relu"))
model.add(MaxPooling2D(pool_size=(2,2)))

model.add(Conv2D(64, (3,3)))
model.add(Activation("relu"))
model.add(MaxPooling2D(pool_size=(2,2)))

model.add(Flatten())
model.add(Dense(64))

model.add(Dense(1))
model.add(Activation("sigmoid"))

model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
model.fit(X, Y, batch_size=32, validation_split=0.1)

Output

C:\Anaconda3\envs\tutorial\pythonw.exe "C:/Users/roshaan zafar/PycharmProjects/InternshipRiseTech/main.py"
WARNING: Logging before flag parsing goes to stderr.
W0820 13:05:23.726494 24488 deprecation.py:506] From C:\Anaconda3\envs\tutorial\lib\site-packages\tensorflow\python\ops\init_ops.py:1251: calling VarianceScaling.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
W0820 13:05:23.817250 24488 deprecation.py:323] From C:\Anaconda3\envs\tutorial\lib\site-packages\tensorflow\python\ops\nn_impl.py:180: add_dispatch_support.<locals>.wrapper (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
Train on 360 samples, validate on 40 samples
2019-08-20 13:05:24.028720: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2019-08-20 13:05:24.030744: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library nvcuda.dll
2019-08-20 13:05:24.976333: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties: 
name: GeForce GTX 1070 major: 6 minor: 1 memoryClockRate(GHz): 1.645
pciBusID: 0000:01:00.0
2019-08-20 13:05:24.976601: I tensorflow/stream_executor/platform/default/dlopen_checker_stub.cc:25] GPU libraries are statically linked, skip dlopen check.
2019-08-20 13:05:24.977484: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1763] Adding visible gpu devices: 0
2019-08-20 13:05:25.734584: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-08-20 13:05:25.734785: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1187]      0 
2019-08-20 13:05:25.734905: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 0:   N 
2019-08-20 13:05:25.735694: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6376 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1070, pci bus id: 0000:01:00.0, compute capability: 6.1)
2019-08-20 13:05:26.180767: W tensorflow/core/framework/allocator.cc:107] Allocation of 1240006656 exceeds 10% of system memory.
2019-08-20 13:05:26.834340: W tensorflow/core/framework/allocator.cc:107] Allocation of 1240006656 exceeds 10% of system memory.
2019-08-20 13:05:27.476075: W tensorflow/core/framework/allocator.cc:107] Allocation of 1240006656 exceeds 10% of system memory.
2019-08-20 13:05:28.102630: W tensorflow/core/framework/allocator.cc:107] Allocation of 1240006656 exceeds 10% of system memory.
2019-08-20 13:05:28.715843: W tensorflow/core/framework/allocator.cc:107] Allocation of 1240006656 exceeds 10% of system memory.
2019-08-20 13:05:47.982488: W tensorflow/core/common_runtime/bfc_allocator.cc:314] Allocator (GPU_0_bfc) ran out of memory trying to allocate 9.34GiB (rounded to 10029662208).  Current allocation summary follows.
2019-08-20 13:05:47.983224: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (256):   Total Chunks: 47, Chunks in use: 47. 11.8KiB allocated for chunks. 11.8KiB in use in bin. 1.5KiB client-requested in use in bin.
2019-08-20 13:05:47.983956: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (512):   Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-08-20 13:05:47.984651: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (1024):  Total Chunks: 1, Chunks in use: 1. 1.3KiB allocated for chunks. 1.3KiB in use in bin. 1.0KiB client-requested in use in bin.
2019-08-20 13:05:47.985413: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (2048):  Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-08-20 13:05:47.986243: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (4096):  Total Chunks: 2, Chunks in use: 2. 13.5KiB allocated for chunks. 13.5KiB in use in bin. 13.5KiB client-requested in use in bin.
2019-08-20 13:05:47.988224: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (8192):  Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-08-20 13:05:47.988864: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (16384):     Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-08-20 13:05:47.989820: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (32768):     Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-08-20 13:05:47.990495: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (65536):     Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-08-20 13:05:47.991146: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (131072):    Total Chunks: 2, Chunks in use: 2. 288.0KiB allocated for chunks. 288.0KiB in use in bin. 288.0KiB client-requested in use in bin.
2019-08-20 13:05:47.992567: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (262144):    Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-08-20 13:05:47.993545: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (524288):    Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-08-20 13:05:47.994186: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (1048576):   Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-08-20 13:05:47.994859: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (2097152):   Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-08-20 13:05:47.995569: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (4194304):   Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-08-20 13:05:47.996235: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (8388608):   Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-08-20 13:05:47.996924: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (16777216):  Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-08-20 13:05:47.997650: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (33554432):  Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-08-20 13:05:47.998404: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (67108864):  Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-08-20 13:05:47.999135: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (134217728):     Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-08-20 13:05:47.999876: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (268435456):     Total Chunks: 5, Chunks in use: 3. 6.23GiB allocated for chunks. 2.75GiB in use in bin. 2.75GiB client-requested in use in bin.
2019-08-20 13:05:48.000650: I tensorflow/core/common_runtime/bfc_allocator.cc:780] Bin for 9.34GiB was 256.00MiB, Chunk State: 
2019-08-20 13:05:48.001093: I tensorflow/core/common_runtime/bfc_allocator.cc:786]   Size: 450.00MiB | Requested Size: 450.00MiB | in_use: 0 | bin_num: 20, prev:   Size: 256B | Requested Size: 8B | in_use: 1 | bin_num: -1, next:   Size: 256B | Requested Size: 128B | in_use: 1 | bin_num: -1
2019-08-20 13:05:48.003835: I tensorflow/core/common_runtime/bfc_allocator.cc:786]   Size: 3.04GiB | Requested Size: 0B | in_use: 0 | bin_num: 20, prev:   Size: 256B | Requested Size: 4B | in_use: 1 | bin_num: -1
2019-08-20 13:05:48.004577: I tensorflow/core/common_runtime/bfc_allocator.cc:793] Next region of size 6686052608
2019-08-20 13:05:48.013828: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0000000705400000 next 1 of size 1280
2019-08-20 13:05:48.014294: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0000000705400500 next 2 of size 256
2019-08-20 13:05:48.014708: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0000000705400600 next 3 of size 256
2019-08-20 13:05:48.015131: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0000000705400700 next 4 of size 256
2019-08-20 13:05:48.015622: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0000000705400800 next 5 of size 256
2019-08-20 13:05:48.016053: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0000000705400900 next 6 of size 256
2019-08-20 13:05:48.016492: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0000000705400A00 next 7 of size 256
2019-08-20 13:05:48.016914: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0000000705400B00 next 8 of size 256
2019-08-20 13:05:48.017347: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0000000705400C00 next 9 of size 256
2019-08-20 13:05:48.017774: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0000000705400D00 next 10 of size 256
2019-08-20 13:05:48.018202: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0000000705400E00 next 11 of size 256
2019-08-20 13:05:48.019604: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0000000705400F00 next 12 of size 256
2019-08-20 13:05:48.020000: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0000000705401000 next 13 of size 256
2019-08-20 13:05:48.020407: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0000000705401100 next 14 of size 256
2019-08-20 13:05:48.020801: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0000000705401200 next 15 of size 256
2019-08-20 13:05:48.021203: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0000000705401300 next 16 of size 256
2019-08-20 13:05:48.022177: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0000000705401400 next 17 of size 256
2019-08-20 13:05:48.022845: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0000000705401500 next 18 of size 256
2019-08-20 13:05:48.023458: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0000000705401600 next 19 of size 1240006656
2019-08-20 13:05:48.024110: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 000000074F291600 next 20 of size 256
2019-08-20 13:05:48.024721: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 000000074F291700 next 21 of size 147456
2019-08-20 13:05:48.025371: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 000000074F2B5700 next 22 of size 6912
2019-08-20 13:05:48.026024: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 000000074F2B7200 next 23 of size 256
2019-08-20 13:05:48.026686: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 000000074F2B7300 next 24 of size 256
2019-08-20 13:05:48.027396: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 000000074F2B7400 next 25 of size 1240006656
2019-08-20 13:05:48.027798: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0000000799147400 next 26 of size 147456
2019-08-20 13:05:48.028202: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 000000079916B400 next 27 of size 6912
2019-08-20 13:05:48.028598: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 000000079916CF00 next 28 of size 256
2019-08-20 13:05:48.028990: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 000000079916D000 next 29 of size 256
2019-08-20 13:05:48.029868: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 000000079916D100 next 30 of size 256
2019-08-20 13:05:48.030492: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 000000079916D200 next 31 of size 256
2019-08-20 13:05:48.030887: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 000000079916D300 next 32 of size 256
2019-08-20 13:05:48.031538: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 000000079916D400 next 33 of size 256
2019-08-20 13:05:48.031931: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 000000079916D500 next 34 of size 256
2019-08-20 13:05:48.032327: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 000000079916D600 next 35 of size 256
2019-08-20 13:05:48.032722: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 000000079916D700 next 36 of size 256
2019-08-20 13:05:48.033116: I tensorflow/core/common_runtime/bfc_allocator.cc:800] Free  at 000000079916D800 next 37 of size 471859200
2019-08-20 13:05:48.034291: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 00000007B536D800 next 38 of size 256
2019-08-20 13:05:48.034879: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 00000007B536D900 next 39 of size 256
2019-08-20 13:05:48.035434: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 00000007B536DA00 next 40 of size 256
2019-08-20 13:05:48.035832: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 00000007B536DB00 next 41 of size 471859200
2019-08-20 13:05:48.036554: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 00000007D156DB00 next 42 of size 256
2019-08-20 13:05:48.037253: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 00000007D156DC00 next 43 of size 256
2019-08-20 13:05:48.037949: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 00000007D156DD00 next 44 of size 256
2019-08-20 13:05:48.038697: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 00000007D156DE00 next 45 of size 256
2019-08-20 13:05:48.039204: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 00000007D156DF00 next 46 of size 256
2019-08-20 13:05:48.039676: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 00000007D156E000 next 47 of size 256
2019-08-20 13:05:48.040135: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 00000007D156E100 next 48 of size 256
2019-08-20 13:05:48.041145: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 00000007D156E200 next 49 of size 256
2019-08-20 13:05:48.041535: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 00000007D156E300 next 50 of size 256
2019-08-20 13:05:48.041819: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 00000007D156E400 next 51 of size 256
2019-08-20 13:05:48.042130: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 00000007D156E500 next 52 of size 256
2019-08-20 13:05:48.042426: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 00000007D156E600 next 53 of size 256
2019-08-20 13:05:48.042713: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 00000007D156E700 next 54 of size 256
2019-08-20 13:05:48.043016: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 00000007D156E800 next 55 of size 256
2019-08-20 13:05:48.043276: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 00000007D156E900 next 56 of size 256
2019-08-20 13:05:48.043572: I tensorflow/core/common_runtime/bfc_allocator.cc:800] Free  at 00000007D156EA00 next 18446744073709551615 of size 3261998848
2019-08-20 13:05:48.043902: I tensorflow/core/common_runtime/bfc_allocator.cc:809]      Summary of in-use Chunks by size: 
2019-08-20 13:05:48.044196: I tensorflow/core/common_runtime/bfc_allocator.cc:812] 47 Chunks of size 256 totalling 11.8KiB
2019-08-20 13:05:48.044466: I tensorflow/core/common_runtime/bfc_allocator.cc:812] 1 Chunks of size 1280 totalling 1.3KiB
2019-08-20 13:05:48.044760: I tensorflow/core/common_runtime/bfc_allocator.cc:812] 2 Chunks of size 6912 totalling 13.5KiB
2019-08-20 13:05:48.045032: I tensorflow/core/common_runtime/bfc_allocator.cc:812] 2 Chunks of size 147456 totalling 288.0KiB
2019-08-20 13:05:48.045250: I tensorflow/core/common_runtime/bfc_allocator.cc:812] 1 Chunks of size 471859200 totalling 450.00MiB
2019-08-20 13:05:48.045553: I tensorflow/core/common_runtime/bfc_allocator.cc:812] 2 Chunks of size 1240006656 totalling 2.31GiB
2019-08-20 13:05:48.045830: I tensorflow/core/common_runtime/bfc_allocator.cc:816] Sum Total of in-use chunks: 2.75GiB
2019-08-20 13:05:48.046120: I tensorflow/core/common_runtime/bfc_allocator.cc:818] total_region_allocated_bytes_: 6686052608 memory_limit_: 6686052843 available bytes: 235 curr_region_allocation_bytes_: 13372105728
2019-08-20 13:05:48.046453: I tensorflow/core/common_runtime/bfc_allocator.cc:824] Stats: 
Limit:                  6686052843
InUse:                  2952194560
MaxInUse:               3424053760
NumAllocs:                      64
MaxAllocSize:           1240006656

2019-08-20 13:05:48.046834: W tensorflow/core/common_runtime/bfc_allocator.cc:319] **************************************______********________________________________________________
2019-08-20 13:05:48.052167: W tensorflow/core/framework/op_kernel.cc:1502] OP_REQUIRES failed at conv_ops.cc:486 : Resource exhausted: OOM when allocating tensor with shape[32,64,1278,958] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc


Traceback (most recent call last):
  File "C:/Users/roshaan zafar/PycharmProjects/InternshipRiseTech/main.py", line 109, in <module>
    model.fit(X, Y, batch_size=32, validation_split=0.1)

Tags: tensorflow, out-of-memory, conv-neural-network, tf.keras

Solution


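The allocation that fails is the output of your first Conv2D layer. The log reports an OOM for a tensor of shape [32, 64, 1278, 958], i.e. 32 (batch size) x 64 (filters) x 1278 x 958 float32 values, which is about 9.34 GiB, while TensorFlow has only about 6.2 GiB of GPU memory available. The Dense layer is a second problem: after Flatten the feature vector has 318 x 238 x 64 = 4,843,776 elements, so Dense(64) holds roughly 310 million weights (not counting biases), which is the 1240006656-byte allocation in your log. Both are far too large for your GPU.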
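The sizes reported in the log can be checked with a few lines of arithmetic. This is only a rough sketch: it assumes float32 tensors, 'valid' padding, and a 1280 x 960 input (inferred from the shapes in the log, not stated in the question), and the helper names are made up for illustration.

```python
BYTES_PER_FLOAT32 = 4


def conv_valid(size, kernel=3):
    """Spatial size after a stride-1 convolution with 'valid' padding."""
    return size - kernel + 1


def pool(size, window=2):
    """Spatial size after max pooling with a stride equal to the window."""
    return size // window


# Assumed input size, inferred from the tensor shapes in the log.
height, width, batch, filters = 1280, 960, 32, 64

# Output of the first Conv2D layer for one batch.
h1, w1 = conv_valid(height), conv_valid(width)
conv1_bytes = batch * filters * h1 * w1 * BYTES_PER_FLOAT32

# Follow the remaining layers down to Flatten.
h2, w2 = pool(h1), pool(w1)
h3, w3 = conv_valid(h2), conv_valid(w2)
h4, w4 = pool(h3), pool(w3)
flat_features = h4 * w4 * filters

# Weight matrix of Dense(64), biases not counted.
dense_units = 64
dense_weights = flat_features * dense_units
dense_bytes = dense_weights * BYTES_PER_FLOAT32

print(f"first conv output: {(h1, w1)}, {conv1_bytes / 2**30:.2f} GiB per batch")
print(f"features after Flatten: {flat_features:,}")
print(f"Dense(64) weights: {dense_weights:,} ({dense_bytes / 2**30:.2f} GiB)")
```

Both byte counts come out equal to allocations that appear in the log (10029662208 and 1240006656), which suggests the assumed input size is right.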

Consider resizing your input images to a smaller size, or adding more Conv2D + MaxPooling2D blocks to your network so that fewer features reach the Dense layer. A third option is to replace the Flatten layer with GlobalMaxPooling2D or GlobalAveragePooling2D.
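To get a feel for how much each option helps, the sketch below counts the weights of the first Dense layer under each of them. It assumes the same 1280 x 960 input (inferred from the log) and 3 x 3 'valid' convolutions as in the question. The 320 x 240 target size and the choice of two extra blocks are arbitrary examples, and `dense_weights` is a hypothetical helper, not a Keras function.

```python
def dense_weights(height, width, blocks=2, filters=64, units=64, global_pool=False):
    """Count the weights feeding the first Dense layer (biases ignored).

    Each block is a 3x3 'valid' convolution followed by 2x2 max pooling.
    """
    for _ in range(blocks):
        height = (height - 2) // 2
        width = (width - 2) // 2
    # Global pooling keeps one value per filter; Flatten keeps every position.
    features = filters if global_pool else height * width * filters
    return features * units


options = {
    "original (1280x960, Flatten)": dense_weights(1280, 960),
    "resized to 320x240": dense_weights(320, 240),
    "two extra conv/pool blocks": dense_weights(1280, 960, blocks=4),
    "global pooling instead of Flatten": dense_weights(1280, 960, global_pool=True),
}

for name, count in options.items():
    print(f"{name}: {count:,} weights")
```

In Keras the last option amounts to swapping Flatten() for GlobalAveragePooling2D() or GlobalMaxPooling2D(). Note that only resizing the input also shrinks the output of the first convolution, which is the allocation that actually fails in the log.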

