Can't train on GPU. Model Quick96 error #132

Open
prashantspandey opened this issue Mar 15, 2023 · 1 comment

Choose one or several GPU idxs (separated by comma).

[CPU] : CPU
[0] : METAL

[0] Which GPU indexes to choose? :
0

Metal device set to: Apple M2 Pro

systemMemory: 16.00 GB
maxCacheSize: 5.33 GB

18 devices <core.leras.device.Devices object at 0x120f93dc0>
GPU COUNT 1
gpu id 0
devices /CPU:0
Initializing models: 0%| | 0/5 [00:00<?, ?it/s]

Error: Graph execution error:

Detected at node 'Mean_2' defined at (most recent call last):
    File "/Applications/Xcode.app/Contents/Developer/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/threading.py", line 930, in _bootstrap
      self._bootstrap_inner()
    File "/Applications/Xcode.app/Contents/Developer/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/threading.py", line 973, in _bootstrap_inner
      self.run()
    File "/Applications/Xcode.app/Contents/Developer/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/threading.py", line 910, in run
      self._target(*self._args, **self._kwargs)
    File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/.dfl/DeepFaceLab/mainscripts/Trainer.py", line 46, in trainerThread
      model = models.import_model(model_class_name)(
    File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/.dfl/DeepFaceLab/models/ModelBase.py", line 193, in __init__
      self.on_initialize()
    File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/.dfl/DeepFaceLab/models/Model_Quick96/Model.py", line 143, in on_initialize
      gpu_src_loss =  tf.reduce_mean ( 10*nn.dssim(gpu_target_src_masked_opt, gpu_pred_src_src_masked_opt, max_val=1.0, filter_size=int(resolution/11.6)), axis=[1])
    File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/.dfl/DeepFaceLab/core/leras/ops/__init__.py", line 308, in dssim
      ssim_val = tf.reduce_mean(luminance * cs, axis=nn.conv2d_spatial_axes )
Node: 'Mean_2'
Cannot assign a device for operation Mean_2: Could not satisfy explicit device specification '/device:GPU:0' because no supported kernel for GPU devices is available.
Colocation Debug Info:
Colocation group had the following types and supported devices: 
Root Member(assigned_device_name_index_=-1 requested_device_name_='/device:GPU:0' assigned_device_name_='' resource_device_name_='' supported_device_types_=[CPU] possible_devices_=[]
FloorDiv: GPU CPU 
RealDiv: GPU CPU 
Maximum: GPU CPU 
Cast: GPU CPU 
FloorMod: GPU CPU 
BroadcastTo: GPU CPU 
Shape: GPU CPU 
Range: CPU 
DynamicStitch: CPU 
Reshape: GPU CPU 
Mean: GPU CPU 
Prod: GPU CPU 
AddV2: GPU CPU 
Const: GPU CPU 
Fill: GPU CPU 

Colocation members, user-requested devices, and framework assigned devices, if any:
  Mean_2 (Mean) /device:GPU:0
  gradients/Mean_2_grad/Shape (Shape) /device:GPU:0
  gradients/Mean_2_grad/Size (Const) /device:GPU:0
  gradients/Mean_2_grad/add (AddV2) /device:GPU:0
  gradients/Mean_2_grad/mod (FloorMod) /device:GPU:0
  gradients/Mean_2_grad/Shape_1 (Const) /device:GPU:0
  gradients/Mean_2_grad/range/start (Const) /device:GPU:0
  gradients/Mean_2_grad/range/delta (Const) /device:GPU:0
  gradients/Mean_2_grad/range (Range) /device:GPU:0
  gradients/Mean_2_grad/ones/Const (Const) /device:GPU:0
  gradients/Mean_2_grad/ones (Fill) /device:GPU:0
  gradients/Mean_2_grad/DynamicStitch (DynamicStitch) /device:GPU:0
  gradients/Mean_2_grad/Reshape (Reshape) /device:GPU:0
  gradients/Mean_2_grad/BroadcastTo (BroadcastTo) /device:GPU:0
  gradients/Mean_2_grad/Shape_2 (Shape) /device:GPU:0
  gradients/Mean_2_grad/Shape_3 (Shape) /device:GPU:0
  gradients/Mean_2_grad/Const (Const) /device:GPU:0
  gradients/Mean_2_grad/Prod (Prod) /device:GPU:0
  gradients/Mean_2_grad/Const_1 (Const) /device:GPU:0
  gradients/Mean_2_grad/Prod_1 (Prod) /device:GPU:0
  gradients/Mean_2_grad/Maximum/y (Const) /device:GPU:0
  gradients/Mean_2_grad/Maximum (Maximum) /device:GPU:0
  gradients/Mean_2_grad/floordiv (FloorDiv) /device:GPU:0
  gradients/Mean_2_grad/Cast (Cast) /device:GPU:0
  gradients/Mean_2_grad/truediv (RealDiv) /device:GPU:0

Op: Mean
Node attrs: Tidx=DT_INT32, keep_dims=false, T=DT_FLOAT
Registered kernels:
  device='GPU'; T in [DT_FLOAT]; Tidx in [DT_INT32]
  device='GPU'; T in [DT_FLOAT]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_UINT64]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_UINT64]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_INT64]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_INT64]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_UINT32]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_UINT32]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_UINT16]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_UINT16]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_INT16]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_INT16]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_UINT8]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_UINT8]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_INT8]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_INT8]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_INT32]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_INT32]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_HALF]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_HALF]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_BFLOAT16]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_BFLOAT16]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_FLOAT]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_FLOAT]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_DOUBLE]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_DOUBLE]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_COMPLEX64]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_COMPLEX64]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_COMPLEX128]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_COMPLEX128]; Tidx in [DT_INT64]

	 [[{{node Mean_2}}]]

Original stack trace for 'Mean_2':
  File "/Applications/Xcode.app/Contents/Developer/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/threading.py", line 930, in _bootstrap
    self._bootstrap_inner()
  File "/Applications/Xcode.app/Contents/Developer/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/threading.py", line 973, in _bootstrap_inner
    self.run()
  File "/Applications/Xcode.app/Contents/Developer/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/threading.py", line 910, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/.dfl/DeepFaceLab/mainscripts/Trainer.py", line 46, in trainerThread
    model = models.import_model(model_class_name)(
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/.dfl/DeepFaceLab/models/ModelBase.py", line 193, in __init__
    self.on_initialize()
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/.dfl/DeepFaceLab/models/Model_Quick96/Model.py", line 143, in on_initialize
    gpu_src_loss =  tf.reduce_mean ( 10*nn.dssim(gpu_target_src_masked_opt, gpu_pred_src_src_masked_opt, max_val=1.0, filter_size=int(resolution/11.6)), axis=[1])
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/.dfl/DeepFaceLab/core/leras/ops/__init__.py", line 308, in dssim
    ssim_val = tf.reduce_mean(luminance * cs, axis=nn.conv2d_spatial_axes )
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/lib/python3.9/site-packages/tensorflow/python/util/traceback_utils.py", line 150, in error_handler
    return fn(*args, **kwargs)
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/lib/python3.9/site-packages/tensorflow/python/util/dispatch.py", line 1082, in op_dispatch_handler
    return dispatch_target(*args, **kwargs)
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/lib/python3.9/site-packages/tensorflow/python/ops/math_ops.py", line 2581, in reduce_mean_v1
    return reduce_mean(input_tensor, axis, keepdims, name)
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/lib/python3.9/site-packages/tensorflow/python/util/traceback_utils.py", line 150, in error_handler
    return fn(*args, **kwargs)
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/lib/python3.9/site-packages/tensorflow/python/util/dispatch.py", line 1082, in op_dispatch_handler
    return dispatch_target(*args, **kwargs)
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/lib/python3.9/site-packages/tensorflow/python/ops/math_ops.py", line 2639, in reduce_mean
    gen_math_ops.mean(
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/lib/python3.9/site-packages/tensorflow/python/ops/gen_math_ops.py", line 6286, in mean
    _, _, _op, _outputs = _op_def_library._apply_op_helper(
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/lib/python3.9/site-packages/tensorflow/python/framework/op_def_library.py", line 740, in _apply_op_helper
    op = g._create_op_internal(op_type_name, inputs, dtypes=None,
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/lib/python3.9/site-packages/tensorflow/python/framework/ops.py", line 3776, in _create_op_internal
    ret = Operation(
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/lib/python3.9/site-packages/tensorflow/python/framework/ops.py", line 2175, in __init__
    self._traceback = tf_stack.extract_stack_for_node(self._c_op)

Traceback (most recent call last):
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/lib/python3.9/site-packages/tensorflow/python/client/session.py", line 1377, in _do_call
    return fn(*args)
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/lib/python3.9/site-packages/tensorflow/python/client/session.py", line 1359, in _run_fn
    self._extend_graph()
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/lib/python3.9/site-packages/tensorflow/python/client/session.py", line 1400, in _extend_graph
    tf_session.ExtendSession(self._session)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot assign a device for operation Mean_2: Could not satisfy explicit device specification '/device:GPU:0' because no supported kernel for GPU devices is available.
Colocation Debug Info:
Colocation group had the following types and supported devices: 
Root Member(assigned_device_name_index_=-1 requested_device_name_='/device:GPU:0' assigned_device_name_='' resource_device_name_='' supported_device_types_=[CPU] possible_devices_=[]
FloorDiv: GPU CPU 
RealDiv: GPU CPU 
Maximum: GPU CPU 
Cast: GPU CPU 
FloorMod: GPU CPU 
BroadcastTo: GPU CPU 
Shape: GPU CPU 
Range: CPU 
DynamicStitch: CPU 
Reshape: GPU CPU 
Mean: GPU CPU 
Prod: GPU CPU 
AddV2: GPU CPU 
Const: GPU CPU 
Fill: GPU CPU 

Colocation members, user-requested devices, and framework assigned devices, if any:
  Mean_2 (Mean) /device:GPU:0
  gradients/Mean_2_grad/Shape (Shape) /device:GPU:0
  gradients/Mean_2_grad/Size (Const) /device:GPU:0
  gradients/Mean_2_grad/add (AddV2) /device:GPU:0
  gradients/Mean_2_grad/mod (FloorMod) /device:GPU:0
  gradients/Mean_2_grad/Shape_1 (Const) /device:GPU:0
  gradients/Mean_2_grad/range/start (Const) /device:GPU:0
  gradients/Mean_2_grad/range/delta (Const) /device:GPU:0
  gradients/Mean_2_grad/range (Range) /device:GPU:0
  gradients/Mean_2_grad/ones/Const (Const) /device:GPU:0
  gradients/Mean_2_grad/ones (Fill) /device:GPU:0
  gradients/Mean_2_grad/DynamicStitch (DynamicStitch) /device:GPU:0
  gradients/Mean_2_grad/Reshape (Reshape) /device:GPU:0
  gradients/Mean_2_grad/BroadcastTo (BroadcastTo) /device:GPU:0
  gradients/Mean_2_grad/Shape_2 (Shape) /device:GPU:0
  gradients/Mean_2_grad/Shape_3 (Shape) /device:GPU:0
  gradients/Mean_2_grad/Const (Const) /device:GPU:0
  gradients/Mean_2_grad/Prod (Prod) /device:GPU:0
  gradients/Mean_2_grad/Const_1 (Const) /device:GPU:0
  gradients/Mean_2_grad/Prod_1 (Prod) /device:GPU:0
  gradients/Mean_2_grad/Maximum/y (Const) /device:GPU:0
  gradients/Mean_2_grad/Maximum (Maximum) /device:GPU:0
  gradients/Mean_2_grad/floordiv (FloorDiv) /device:GPU:0
  gradients/Mean_2_grad/Cast (Cast) /device:GPU:0
  gradients/Mean_2_grad/truediv (RealDiv) /device:GPU:0

Op: Mean
Node attrs: Tidx=DT_INT32, keep_dims=false, T=DT_FLOAT
Registered kernels:
  device='GPU'; T in [DT_FLOAT]; Tidx in [DT_INT32]
  device='GPU'; T in [DT_FLOAT]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_UINT64]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_UINT64]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_INT64]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_INT64]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_UINT32]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_UINT32]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_UINT16]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_UINT16]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_INT16]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_INT16]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_UINT8]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_UINT8]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_INT8]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_INT8]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_INT32]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_INT32]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_HALF]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_HALF]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_BFLOAT16]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_BFLOAT16]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_FLOAT]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_FLOAT]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_DOUBLE]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_DOUBLE]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_COMPLEX64]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_COMPLEX64]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_COMPLEX128]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_COMPLEX128]; Tidx in [DT_INT64]

	 [[{{node Mean_2}}]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/.dfl/DeepFaceLab/mainscripts/Trainer.py", line 46, in trainerThread
    model = models.import_model(model_class_name)(
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/.dfl/DeepFaceLab/models/ModelBase.py", line 193, in __init__
    self.on_initialize()
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/.dfl/DeepFaceLab/models/Model_Quick96/Model.py", line 227, in on_initialize
    model.init_weights()
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/.dfl/DeepFaceLab/core/leras/layers/Saveable.py", line 106, in init_weights
    nn.init_weights(self.get_weights())
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/.dfl/DeepFaceLab/core/leras/ops/__init__.py", line 48, in init_weights
    nn.tf_sess.run (ops)
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/lib/python3.9/site-packages/tensorflow/python/client/session.py", line 967, in run
    result = self._run(None, fetches, feed_dict, options_ptr,
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/lib/python3.9/site-packages/tensorflow/python/client/session.py", line 1190, in _run
    results = self._do_run(handle, final_targets, final_fetches,
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/lib/python3.9/site-packages/tensorflow/python/client/session.py", line 1370, in _do_run
    return self._do_call(_run_fn, feeds, fetches, targets, options,
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/lib/python3.9/site-packages/tensorflow/python/client/session.py", line 1396, in _do_call
    raise type(e)(node_def, op, message)  # pylint: disable=no-value-for-parameter
tensorflow.python.framework.errors_impl.InvalidArgumentError: Graph execution error:

Detected at node 'Mean_2' defined at (most recent call last):
    File "/Applications/Xcode.app/Contents/Developer/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/threading.py", line 930, in _bootstrap
      self._bootstrap_inner()
    File "/Applications/Xcode.app/Contents/Developer/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/threading.py", line 973, in _bootstrap_inner
      self.run()
    File "/Applications/Xcode.app/Contents/Developer/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/threading.py", line 910, in run
      self._target(*self._args, **self._kwargs)
    File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/.dfl/DeepFaceLab/mainscripts/Trainer.py", line 46, in trainerThread
      model = models.import_model(model_class_name)(
    File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/.dfl/DeepFaceLab/models/ModelBase.py", line 193, in __init__
      self.on_initialize()
    File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/.dfl/DeepFaceLab/models/Model_Quick96/Model.py", line 143, in on_initialize
      gpu_src_loss =  tf.reduce_mean ( 10*nn.dssim(gpu_target_src_masked_opt, gpu_pred_src_src_masked_opt, max_val=1.0, filter_size=int(resolution/11.6)), axis=[1])
    File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/.dfl/DeepFaceLab/core/leras/ops/__init__.py", line 308, in dssim
      ssim_val = tf.reduce_mean(luminance * cs, axis=nn.conv2d_spatial_axes )
Node: 'Mean_2'
Cannot assign a device for operation Mean_2: Could not satisfy explicit device specification '/device:GPU:0' because no supported kernel for GPU devices is available.
Colocation Debug Info:
Colocation group had the following types and supported devices: 
Root Member(assigned_device_name_index_=-1 requested_device_name_='/device:GPU:0' assigned_device_name_='' resource_device_name_='' supported_device_types_=[CPU] possible_devices_=[]
FloorDiv: GPU CPU 
RealDiv: GPU CPU 
Maximum: GPU CPU 
Cast: GPU CPU 
FloorMod: GPU CPU 
BroadcastTo: GPU CPU 
Shape: GPU CPU 
Range: CPU 
DynamicStitch: CPU 
Reshape: GPU CPU 
Mean: GPU CPU 
Prod: GPU CPU 
AddV2: GPU CPU 
Const: GPU CPU 
Fill: GPU CPU 

Colocation members, user-requested devices, and framework assigned devices, if any:
  Mean_2 (Mean) /device:GPU:0
  gradients/Mean_2_grad/Shape (Shape) /device:GPU:0
  gradients/Mean_2_grad/Size (Const) /device:GPU:0
  gradients/Mean_2_grad/add (AddV2) /device:GPU:0
  gradients/Mean_2_grad/mod (FloorMod) /device:GPU:0
  gradients/Mean_2_grad/Shape_1 (Const) /device:GPU:0
  gradients/Mean_2_grad/range/start (Const) /device:GPU:0
  gradients/Mean_2_grad/range/delta (Const) /device:GPU:0
  gradients/Mean_2_grad/range (Range) /device:GPU:0
  gradients/Mean_2_grad/ones/Const (Const) /device:GPU:0
  gradients/Mean_2_grad/ones (Fill) /device:GPU:0
  gradients/Mean_2_grad/DynamicStitch (DynamicStitch) /device:GPU:0
  gradients/Mean_2_grad/Reshape (Reshape) /device:GPU:0
  gradients/Mean_2_grad/BroadcastTo (BroadcastTo) /device:GPU:0
  gradients/Mean_2_grad/Shape_2 (Shape) /device:GPU:0
  gradients/Mean_2_grad/Shape_3 (Shape) /device:GPU:0
  gradients/Mean_2_grad/Const (Const) /device:GPU:0
  gradients/Mean_2_grad/Prod (Prod) /device:GPU:0
  gradients/Mean_2_grad/Const_1 (Const) /device:GPU:0
  gradients/Mean_2_grad/Prod_1 (Prod) /device:GPU:0
  gradients/Mean_2_grad/Maximum/y (Const) /device:GPU:0
  gradients/Mean_2_grad/Maximum (Maximum) /device:GPU:0
  gradients/Mean_2_grad/floordiv (FloorDiv) /device:GPU:0
  gradients/Mean_2_grad/Cast (Cast) /device:GPU:0
  gradients/Mean_2_grad/truediv (RealDiv) /device:GPU:0

Op: Mean
Node attrs: Tidx=DT_INT32, keep_dims=false, T=DT_FLOAT
Registered kernels:
  device='GPU'; T in [DT_FLOAT]; Tidx in [DT_INT32]
  device='GPU'; T in [DT_FLOAT]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_UINT64]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_UINT64]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_INT64]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_INT64]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_UINT32]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_UINT32]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_UINT16]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_UINT16]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_INT16]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_INT16]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_UINT8]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_UINT8]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_INT8]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_INT8]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_INT32]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_INT32]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_HALF]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_HALF]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_BFLOAT16]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_BFLOAT16]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_FLOAT]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_FLOAT]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_DOUBLE]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_DOUBLE]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_COMPLEX64]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_COMPLEX64]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_COMPLEX128]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_COMPLEX128]; Tidx in [DT_INT64]

	 [[{{node Mean_2}}]]

Original stack trace for 'Mean_2':
  File "/Applications/Xcode.app/Contents/Developer/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/threading.py", line 930, in _bootstrap
    self._bootstrap_inner()
  File "/Applications/Xcode.app/Contents/Developer/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/threading.py", line 973, in _bootstrap_inner
    self.run()
  File "/Applications/Xcode.app/Contents/Developer/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/threading.py", line 910, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/.dfl/DeepFaceLab/mainscripts/Trainer.py", line 46, in trainerThread
    model = models.import_model(model_class_name)(
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/.dfl/DeepFaceLab/models/ModelBase.py", line 193, in __init__
    self.on_initialize()
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/.dfl/DeepFaceLab/models/Model_Quick96/Model.py", line 143, in on_initialize
    gpu_src_loss =  tf.reduce_mean ( 10*nn.dssim(gpu_target_src_masked_opt, gpu_pred_src_src_masked_opt, max_val=1.0, filter_size=int(resolution/11.6)), axis=[1])
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/.dfl/DeepFaceLab/core/leras/ops/__init__.py", line 308, in dssim
    ssim_val = tf.reduce_mean(luminance * cs, axis=nn.conv2d_spatial_axes )
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/lib/python3.9/site-packages/tensorflow/python/util/traceback_utils.py", line 150, in error_handler
    return fn(*args, **kwargs)
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/lib/python3.9/site-packages/tensorflow/python/util/dispatch.py", line 1082, in op_dispatch_handler
    return dispatch_target(*args, **kwargs)
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/lib/python3.9/site-packages/tensorflow/python/ops/math_ops.py", line 2581, in reduce_mean_v1
    return reduce_mean(input_tensor, axis, keepdims, name)
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/lib/python3.9/site-packages/tensorflow/python/util/traceback_utils.py", line 150, in error_handler
    return fn(*args, **kwargs)
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/lib/python3.9/site-packages/tensorflow/python/util/dispatch.py", line 1082, in op_dispatch_handler
    return dispatch_target(*args, **kwargs)
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/lib/python3.9/site-packages/tensorflow/python/ops/math_ops.py", line 2639, in reduce_mean
    gen_math_ops.mean(
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/lib/python3.9/site-packages/tensorflow/python/ops/gen_math_ops.py", line 6286, in mean
    _, _, _op, _outputs = _op_def_library._apply_op_helper(
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/lib/python3.9/site-packages/tensorflow/python/framework/op_def_library.py", line 740, in _apply_op_helper
    op = g._create_op_internal(op_type_name, inputs, dtypes=None,
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/lib/python3.9/site-packages/tensorflow/python/framework/ops.py", line 3776, in _create_op_internal
    ret = Operation(
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/lib/python3.9/site-packages/tensorflow/python/framework/ops.py", line 2175, in __init__
    self._traceback = tf_stack.extract_stack_for_node(self._c_op)

The trace points to tf.reduce_mean as the problem. What can I do to solve this problem?

Please help.
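
A note on reading the log above, offered as a rough sketch rather than a confirmed diagnosis: the "Registered kernels" list does include GPU kernels for Mean with T=DT_FLOAT, so Mean itself may not be the missing piece. The "Colocation Debug Info" lists Range and DynamicStitch as CPU only, and the group as a whole reports supported_device_types_=[CPU]. The Python below parses an excerpt of that section to show which op types lack GPU support and which device types every op in the group shares. The helper names (`parse_supported_devices`, `ops_without`, `common_devices`) are made up for this illustration and are not part of TensorFlow or DeepFaceLab.

```python
import re

# Excerpt copied from the "Colocation Debug Info" section of the log above.
COLOCATION_EXCERPT = """\
FloorDiv: GPU CPU
RealDiv: GPU CPU
Maximum: GPU CPU
Cast: GPU CPU
FloorMod: GPU CPU
BroadcastTo: GPU CPU
Shape: GPU CPU
Range: CPU
DynamicStitch: CPU
Reshape: GPU CPU
Mean: GPU CPU
Prod: GPU CPU
AddV2: GPU CPU
Const: GPU CPU
Fill: GPU CPU
"""


def parse_supported_devices(text):
    """Map each op type to the set of device types listed for it."""
    supported = {}
    for line in text.splitlines():
        # Matches lines such as "Range: CPU" or "Mean: GPU CPU".
        match = re.fullmatch(r"\s*(\w+):((?:\s+(?:GPU|CPU))+)\s*", line)
        if match:
            supported[match.group(1)] = set(match.group(2).split())
    return supported


def ops_without(device, supported):
    """Return the op types that do not list `device`, sorted by name."""
    return sorted(op for op, devs in supported.items() if device not in devs)


def common_devices(supported):
    """Return the device types shared by every op in the group."""
    groups = list(supported.values())
    return set.intersection(*groups) if groups else set()


supported = parse_supported_devices(COLOCATION_EXCERPT)
print(ops_without("GPU", supported))  # ['DynamicStitch', 'Range']
print(common_devices(supported))      # {'CPU'}
```

Read this way, the only device type common to the whole colocation group is CPU, which is consistent with the explicit '/device:GPU:0' request being rejected.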


prashantspandey commented Mar 16, 2023

I solved it by installing matching versions: tensorflow-macos==2.8.0, tensorflow-metal==0.5.0 and numpy==1.23.
Now at least the training starts, but the loss doesn't go down; the values just look random. Even in the preview, columns 2, 4 and 5, which contain the learned outputs, don't show up.

So something is definitely wrong with the model, as the same problem of the loss not going down persists even on CPU.
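
For anyone wanting to confirm that their environment matches the versions mentioned above, here is a minimal Python sketch using the standard-library `importlib.metadata`. The `PINS` table simply restates the versions from this comment. Treating the numpy pin "1.23" as a prefix (so 1.23.5 would match) is an assumption, and the helper names are hypothetical.

```python
from importlib import metadata

# Versions reported as working in the comment above. The numpy pin is written
# there as "1.23", so it is treated as a prefix here (an assumption).
PINS = {
    "tensorflow-macos": "2.8.0",
    "tensorflow-metal": "0.5.0",
    "numpy": "1.23",
}


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def matches_pin(version, pin):
    """True if `version` equals `pin` or extends it (1.23.5 matches 1.23)."""
    return version is not None and (
        version == pin or version.startswith(pin + ".")
    )


def check_pins(pins, lookup=installed_version):
    """Return {package: (installed, wanted)} for each package that differs."""
    mismatches = {}
    for package, wanted in pins.items():
        found = lookup(package)
        if not matches_pin(found, wanted):
            mismatches[package] = (found, wanted)
    return mismatches


if __name__ == "__main__":
    for package, (found, wanted) in check_pins(PINS).items():
        print(f"{package}: installed {found}, expected {wanted}")
```

Run inside the DeepFaceLab_MacOS virtual environment, it prints one line per package that is missing or at a different version, and nothing if all three match.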
