🐍 Magic Python: Mutable vs Immutable and How To Copy Objects
Guys, I want to shed some light on the difference between immutable and mutable objects and the ways to copy both types in Python. Many people think that they fully understand what’s going on when you are copying some object, but there are many pitfalls here.
This article is what I was looking for as a newbie but never managed to find. Hope it will be useful to you!
Introduction
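All objects in Python are either immutable (for example int, float, bool, str, bytes, tuple, frozenset) or mutable (for example list, dict, set, bytearray).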
And the key difference between immutable and mutable objects is:
You CANNOT change an immutable object after it’s created (when you try to, Python creates a new object instead), and you CAN change a mutable object (in place). But there are some exceptions for compound objects.
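To make the difference concrete, here is a minimal sketch (the variable names are made up for illustration). It keeps a second name on each original object, so we can see whether the "change" touched that object or produced a new one:

```python
# Immutable: "changing" an int or a str builds a new object.
n = 1000
old_n = n            # second name bound to the original int
n += 1               # creates a new int and rebinds n to it
print(n, old_n)      # 1001 1000
print(n is old_n)    # False

s = 'roses'
old_s = s
s += '!'             # creates a new str and rebinds s to it
print(s, old_s)      # roses! roses

# Item assignment on an immutable object is rejected outright.
try:
    s[0] = 'R'
    item_assignment_worked = True
except TypeError:
    item_assignment_worked = False
print(item_assignment_worked)  # False

# Mutable: the list is modified in place, so every name sees the change.
items = ['roses']
alias = items
items += ['violets']   # extends the same list object
print(alias)           # ['roses', 'violets']
print(items is alias)  # True
```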
We can identify an object as compound if it contains other objects (we will call those other objects nested). Compound objects are all mutable types plus a pair of immutables (tuple and frozenset, the immutable version of set).
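The "exception" for compound objects is easiest to see with a tuple that holds a list. A small sketch (the name poem is just an illustration): the tuple itself stays the same object and its slots cannot be rebound, yet the nested list can still be changed in place.

```python
poem = ('roses', ['are', 'red'])   # immutable tuple holding a mutable list
tuple_id = id(poem)

# The tuple's own slots cannot be rebound...
try:
    poem[0] = 'violets'
    rebound = True
except TypeError:
    rebound = False
print(rebound)                     # False

# ...but the nested list can still be modified in place.
poem[1][1] = 'blue'
print(poem)                        # ('roses', ['are', 'blue'])
print(id(poem) == tuple_id)        # True: still the same tuple object
```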
Well, that’s all for now. Let’s get to the ways of copying.
Create a copy via assigning (with “=”)
TL;DR
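Assignment never creates a new object; it only binds another name to the existing one. That is a safe substitute for copying only with immutable objects (except compound ones that contain mutable objects).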
Every time you type “target_name = some_object”, Python:
- evaluates the right-hand side to an object with a unique ID,
- and creates a binding between target_name and that object.
Here it is:
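In code, a binding can be checked with the is operator, which compares identity rather than contents. A minimal sketch (the names first, second and lookalike are made up for illustration):

```python
first = ['roses', 'are', 'red']
second = first                        # a second name for the same object
lookalike = ['roses', 'are', 'red']   # a separate object with equal contents

print(first is second)            # True: one object, two names
print(id(first) == id(second))    # True: same ID
print(first == lookalike)         # True: equal contents
print(first is lookalike)         # False: two different objects
```

So “==” answers "do they look the same?", while is (or comparing IDs) answers "are they the same object?".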
Now let’s do some coding. We will create an object and try to copy it with a “=”:
>>> A = 23
>>> B = A                        # Try to create a copy of a simple immutable
                                 # object via assigning method ("=")
>>> A, B                         # Just checked whether they are identical...
(23, 23)                         # ...and we see that they are.
>>> print(f'{id(A)}\n{id(B)}')   # Now check IDs
4323800896
4323800896                       # WOW! They're the same too!
>>> A = 0                        # Now we change A to see if B will change too
>>> B
23                               # Look! B didn't change! Assigning method worked out!
You may say: “Yeah, it worked out, but why do both names have the same ID, and why hasn’t B changed in spite of those identical IDs?”.
The same IDs come from how assignment works: “B = A” does not construct a second object, it binds the name B to the very object that A is already bound to. One object, two names, one ID. This is also memory-friendly: Python has no need to store several identical immutable objects when one object can be bound to many names.
Now, why hasn’t B changed? You already know the answer:
An immutable object CANNOT be changed after it is created.
So, in our case, when you “change” A, the object is NOT actually changing; the name “A” is rebound to a new object, while B stays bound to the old one. It looks like this (look closely at the IDs):
Well, obviously, for immutable objects (except compound ones) the assigning method (using “=”) works fine.
Now let’s see what happens with mutable and compound immutable types:
>>> A = ['roses', 'are', 'red']   # We create compound object
>>> B = A                         # Try to copy it
>>> B                             # Check B
['roses', 'are', 'red']           # Hmm... looks like a copy
>>> print(f'{id(A)}\n{id(B)}')    # Check IDs...
4328085216                        # Same IDs again, but...
4328085216                        # ...now we know why that is.
>>> A[0] = 'violets'              # Change A: 'roses' -> 'violets'
>>> A[2] = 'blue'                 # Change A: 'red' -> 'blue'
>>> A                             # Check A...
['violets', 'are', 'blue']        # Yes, it's changed
>>> B                             # Check B...
['violets', 'are', 'blue']        # ...B has changed too, because
                                  # both A and B are bound to
                                  # the same object
The results are due to something you already know:
Mutable objects CAN be changed in place, and in that case Python does not create a new object.
Since both A and B are bound to the same object, on printing we see identical output. For clarity, see the figure below:
So, in the case of mutable and compound immutable objects, the assigning method (using “=”) DOES NOT work: it gives you a second name, not an independent copy.
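The same thing happens with other mutable types and with compound immutables that contain something mutable. A short sketch with a dict and a tuple (the names stock and same_stock are made up for illustration):

```python
# Mutable dict: both names see the in-place change.
stock = {'roses': 3}
same_stock = stock
stock['roses'] = 0
print(same_stock)    # {'roses': 0}

# Immutable tuple with a nested mutable list: the change leaks through the list.
A = ('roses', ['are', 'red'])
B = A
A[1][1] = 'blue'
print(B)             # ('roses', ['are', 'blue'])
print(A is B)        # True: still one object with two names
```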
Okay, cool, but you may ask “then how can I copy them?”. Well, let’s move to the next part.
Create shallow and deep copies with the copy module
I hope everything is clear up to this point. Now we have come to the main and trickiest part of this article: creating copies of compound objects (mutables: list, dict, set, bytearray + immutables: tuple and frozenset).
Actually, you can create a copy of any compound object manually, for example with a for-loop. That might be a good exercise for learning purposes. But why reinvent the wheel, right? 🙂 Python kindly provides us with a great module called “copy”. Its two main functions are:
- copy.copy(x) — return a shallow copy of x
- copy.deepcopy(x) — return a deep copy of x
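For comparison, here is roughly what the manual for-loop approach looks like next to copy.copy. This is only a sketch for lists (the names original, manual and shallow are made up for illustration), not a description of how the copy module is implemented:

```python
import copy

original = ['roses', 'are', 'red']

# Manual copy: build a new list and append each element's reference.
manual = []
for element in original:
    manual.append(element)

# The same result with the copy module.
shallow = copy.copy(original)

print(manual == shallow == original)   # True: equal contents
print(manual is original)              # False: a new list object
print(shallow is original)             # False: a new list object
```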
How a Shallow Copy works
When we initiate creating a list…
>>> A = ['roses', 'are', 'red']
…Python creates, let’s call it list-object, and then inserts references to every element of our list into that list-object. Every element of the list-object is an object too. That’s why we say list is a compound data type — it is the object and it contains references to other objects. We can illustrate it like this:
Now, if we want to create a shallow copy:
>>> import copy # Don't forget to import a module "copy"
>>> B = copy.copy(A) # Creating a shallow copy of object A
We got this situation:
A few details about how shallow copy works:
- The copy-method constructs a new list-object (id: 4327736432) and binds it with our new name (B),
- after that it sequentially inserts all the references from the original object to the new one.
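These two steps can be verified with identity checks. A minimal sketch: the outer list is a new object, while every element in the copy is the very same object as in the original.

```python
import copy

A = ['roses', 'are', 'red']
B = copy.copy(A)

print(A is B)                               # False: a new list object
print(A == B)                               # True: same contents
print(all(a is b for a, b in zip(A, B)))    # True: same element objects
```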
Did you get that concept? If not, look closely at the figure above again.
If we now change the original list, there will be no changes in the copied object:
>>> A[0] = 'violets' # 'roses' -> 'violets'
>>> B
['roses', 'are', 'red'] # 0-element is still 'roses'
The zero element of B didn’t change because…
… please stop reading for a moment and try to work out for yourself why it didn’t change. Hint: we’ve already talked about it in this article…
… because assigning to A[0] does not modify the string ‘roses’ (strings are immutable); it only rebinds that slot of list A to a new object. Think of A[0] as a name: the old object is the string ‘roses’ and the new one is the string ‘violets’. We untie A[0] from ‘roses’ and bind it to ‘violets’, while B[0] is a separate slot in a separate list and still refers to ‘roses’.
I should also say that I used a list just for clarity; the copy-method works with the other compound data types, such as dicts, sets, bytearrays and tuples, as well.
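A quick sketch with other types (all variable names are made up for illustration). One detail worth knowing, stated here for CPython: for a tuple, copy.copy simply hands back the same object, since an immutable container gains nothing from a shallow copy.

```python
import copy

prices = {'roses': [10, 12]}
colors = {'red', 'blue'}
raw = bytearray(b'red')
pair = ('roses', 'red')

prices_copy = copy.copy(prices)
colors_copy = copy.copy(colors)
raw_copy = copy.copy(raw)
pair_copy = copy.copy(pair)

print(prices_copy is prices, prices_copy == prices)   # False True
print(colors_copy is colors, colors_copy == colors)   # False True
print(raw_copy is raw, raw_copy == raw)               # False True
print(prices_copy['roses'] is prices['roses'])        # True: nested list is shared
print(pair_copy is pair)                              # True in CPython
```

Lists, dicts and sets also offer the built-in shortcuts A.copy() (and A[:] for lists), which produce shallow copies as well.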
How Deep Copy works
Everything looks great, doesn’t it? But what if I give you a situation, in which our compound object contains another compound object (nested compound object), for example, another list? Will the shallow copy work here or do we need something better? So, let’s figure it out.
First of all, we should understand how that complex object looks in terms of structure:
>>> A = ['roses', ['are', 'red']]
Every nested compound object acts like it is an ordinary one — it contains references to its elements (objects), then those elements, if they are compound too, also contain references to their objects and so on until the last object is not compound.
Well, now we’ll try to create a shallow copy of compound object with another nested compound object. Of course, for that purpose we will use a copy-method from the Python module copy:
>>> import copy # Don't forget to import a module "copy"
>>> B = copy.copy(A) # Creating a shallow copy of object A
Look closely at the figure below. In fact, our shallow copy (object B) “shares” nested object (a list [‘are’, ‘red’]) with object A.
If we now change any element inside that nested object…
>>> A[1][1] = 'blue' # 'red' -> 'blue'
>>> A # Check A
['roses', ['are', 'blue']]
…we get changes in object B too:
>>> B # Check B
['roses', ['are', 'blue']] # The last element was ['are', 'red']
We can conclude, that
shallow copy IS our choice when we have some data which are supposed to be shared between objects. And it IS NOT our choice when we need to create a fully independent copy.
And a fully independent copy is where we need something better than a shallow copy. Yes, I’m talking about deep copy.
The deep copy method constructs a new object for every nested compound object.
See how it works on the figure below:
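Notice: A and its deep copy still refer to the very same string objects ‘roses’, ‘are’, and ‘red’, so the references to them are identical. That is harmless: strings are immutable, so deep copy has no reason to duplicate them and simply reuses the existing objects. Only the compound objects (the lists) are created anew.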
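To get a feeling for what "a new object for every nested compound object" means, here is a deliberately simplified sketch. The function naive_deepcopy is hypothetical and written only for illustration: it handles just lists and dicts and ignores things the real copy.deepcopy takes care of (such as self-referencing structures and other container types).

```python
import copy

def naive_deepcopy(obj):
    """Hypothetical helper: recursively rebuild lists and dicts only."""
    if isinstance(obj, list):
        return [naive_deepcopy(item) for item in obj]
    if isinstance(obj, dict):
        return {key: naive_deepcopy(value) for key, value in obj.items()}
    return obj   # anything else is treated as a leaf and reused as-is

A = ['roses', ['are', 'red']]
C = naive_deepcopy(A)      # our simplified version
D = copy.deepcopy(A)       # the real thing

A[1][1] = 'blue'           # change the nested list of the original

print(C)                            # ['roses', ['are', 'red']]
print(D)                            # ['roses', ['are', 'red']]
print(C[1] is A[1], D[1] is A[1])   # False False: nested lists are new objects
print(C[0] is A[0])                 # True: the immutable string is reused
```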
So, let me show how it all works in code:
>>> import copy                     # module for creating shallow and deep copies
                                    # for compound objects
>>> A = ['roses', ['are', 'red']]   # create original object
>>> B = copy.copy(A)                # create its shallow copy
>>> C = copy.deepcopy(A)            # create its deep copy
Compare IDs to see the difference between shallow and deep copying:
# compare IDs of all three objects
>>> print(f'{id(A)}\n{id(B)}\n{id(C)}\n')
4387368704
4387543472
4387653104   # they are all different as it should be
# compare IDs of obj[0] - 'roses'
>>> print(f'{id(A[0])}\n{id(B[0])}\n{id(C[0])}\n')
4385886960
4385886960
4385886960   # they are all the same as it should be
# compare IDs of obj[1] - ['are', 'red']
# Here is the key difference between shallow and deep copying.
>>> print(f'{id(A[1])}\n{id(B[1])}\n{id(C[1])}\n')
4386322784
4386322784
4387653344   # <-- It SHOULD BE and it IS different
             # for a compound object in a deep copy
# compare IDs of obj[1][0] - 'are'
>>> print(f'{id(A[1][0])}\n{id(B[1][0])}\n{id(C[1][0])}\n')
4387300528
4387300528
4387300528   # they are all the same as it should be
# compare IDs of obj[1][1] - 'red'
>>> print(f'{id(A[1][1])}\n{id(B[1][1])}\n{id(C[1][1])}\n')
4387300464
4387300464
4387300464   # they are all the same as it should be
Now, we’ll check how changes in object A affect the objects B (shallow copy) and C (deep copy):
>>> A[0] = 'violets'   # 'roses' -> 'violets'
>>> A                  # Check A (changed)
['violets', ['are', 'red']]
>>> B                  # Check B (NOT changed)
['roses', ['are', 'red']]
>>> C                  # Check C (NOT changed)
['roses', ['are', 'red']]
>>> A[1][1] = 'blue'   # 'red' -> 'blue'
>>> A                  # Check A (changed)
['violets', ['are', 'blue']]
>>> B                  # Check B (changed)
['roses', ['are', 'blue']]
>>> C                  # Check C (NOT changed)
['roses', ['are', 'red']]
As you can see, it works as I illustrated above. The shallow copy (object B) shares the nested list with the original object (A), and the deep copy (object C) does not share any mutable data with it.
Brief summary
Immutable objects (except compound immutables with nested mutable objects):
- Copying via assigning (=) works fine
Mutable objects + compound immutables with nested mutable objects:
- Copying via assigning (=) DOES NOT work;
- Shallow copy: creates a new outer object but does not copy nested mutable objects (their contents are shared between the original and all shallow copies);
- Deep copy: copies entire object (without any exceptions)
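The whole summary fits into one short sketch. It uses a dict of lists instead of a list of lists, and the names garden, alias, shallow and deep are made up for illustration:

```python
import copy

garden = {'flowers': ['roses', 'violets']}

alias = garden                   # assignment: the same object under a second name
shallow = copy.copy(garden)      # new dict, but the nested list is shared
deep = copy.deepcopy(garden)     # new dict and a new nested list

garden['flowers'].append('tulips')   # change the nested list in place
garden['trees'] = ['oak']            # change the outer dict

print(alias)     # {'flowers': ['roses', 'violets', 'tulips'], 'trees': ['oak']}
print(shallow)   # {'flowers': ['roses', 'violets', 'tulips']}
print(deep)      # {'flowers': ['roses', 'violets']}
```

The alias sees every change, the shallow copy sees only changes inside the shared nested list, and the deep copy sees none.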
What are your thoughts about that?
You’re welcome in the comment section below.
Thank you for your time! 🎉
Follow me for more interesting topics 😉👍