Some strange questions about pytorch copy a tensor

Problem Description

I am a bit confused about pytorch's shared memory mechanism.

a = torch.tensor([[1,0,1,0],
                  [0,1,1,0]])
b = a
b[b == 1] = 0

It's easy to see that a and b simultaneously become tensor([[0,0,0,0],[0,0,0,0]]), because a and b share the same memory. But when I changed the code to

a = torch.tensor([[1,0,1,0],
                  [0,1,1,0]])
b = a
b = b - 1

b became tensor([[0,-1,0,-1],[-1,0,0,-1]]), but a is still tensor([[1,0,1,0],[0,1,1,0]]).
I thought a and b shared the same memory. Why did b change while a didn't?

Tags: python, pytorch

Solution


In your second example, a and b initially share the same reference, but b = b - 1 actually creates a copy: it builds a new tensor and rebinds the name b to it. You are not modifying the underlying data of b (and therefore not of a either, since it is the same data).
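As a minimal sketch of the rebinding itself (this is ordinary Python name binding, not anything PyTorch-specific), you can compare object identities before and after the subtraction:

    >>> import torch
    >>> a = torch.tensor([[1,0,1,0],
                          [0,1,1,0]])
    >>> b = a
    >>> b is a        # both names refer to the same tensor object
    True
    >>> b = b - 1     # builds a new tensor and rebinds the name b
    >>> b is a
    False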

  • You can look at it this way:

    >>> a = torch.tensor([[1,0,1,0],
                          [0,1,1,0]])
    >>> b1 = a
    >>> b2 = b1 - 1
    

Comparing their pointers to the underlying data buffers:

    >>> a.data_ptr() == b1.data_ptr()
    True
    
    >>> b1.data_ptr() == b2.data_ptr()
    False
    
  • If, on the other hand, you operate on b in place, you will of course change a as well:

    >>> a = torch.tensor([[1,0,1,0],
                          [0,1,1,0]])
    >>> b1 = a
    >>> b1.sub_(1)
    

    Then you haven't made a copy:

    >>> a.data_ptr() == b1.data_ptr()
    True
    
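If what you actually want is an independent copy from the start, a common approach is torch.Tensor.clone, which allocates a separate data buffer; a minimal sketch:

    >>> import torch
    >>> a = torch.tensor([[1,0,1,0],
                          [0,1,1,0]])
    >>> b = a.clone()                 # new tensor with its own storage
    >>> a.data_ptr() == b.data_ptr()
    False
    >>> b[b == 1] = 0                 # in-place edit on the copy
    >>> a                             # original is unchanged
    tensor([[1, 0, 1, 0],
            [0, 1, 1, 0]])

Note that clone is recorded by autograd, so for a copy that should also be detached from the computation graph the usual pattern is a.detach().clone().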
