Are strings actually immutable?

isaac_r23 · September 19, 2019, 2:46pm

I just want to comment that the lesson starts off by saying that strings are immutable, but then the solution to section 12 review involves changing a string by successively adding characters to it. That doesn’t sound very immutable to me!

patrickd314 · September 19, 2019, 3:09pm

Isaac, the definition of “same” or “different” has to do with the identity of an object. When a Python object is created, it resides in a certain memory address, and that address is its identity. If I claim that the object is mutable, I say that the object at that address can be changed. If immutable, it cannot be changed.

Every time you “change” a string by adding or substituting characters, you are actually creating a new string. It may well be that you assign that new string to the variable name that you were previously using, but it is a new string, i.e., a different object residing in a different memory location, nonetheless.

By contrast, when you append an element to a list, the list is changed in place: it is still the same object at the same memory address.

(The function id() gives you an object’s identity; in CPython, the most common Python implementation, this is the object’s memory address.)

my_str = 'abc'
print(my_str, id(my_str))
my_str += "x"      #  same as my_str = my_str + "x". The variable name is reassigned here.
print(my_str, id(my_str))
print()
my_lst = [1,2,3]
print(my_lst, id(my_lst))
my_lst.append("x")    # "mutating" the list
print(my_lst, id(my_lst))

Output:

abc 59852000
abcx 59853792     # same variable name, different string value, different object id 

[1, 2, 3] 56919744
[1, 2, 3, 'x'] 56919744   # same variable name, different list elements, same object id

isaac_r23 · September 19, 2019, 6:01pm

Patrick, thanks for taking the time to detail how this works behind the scenes.
I hope Codecademy amends this bit of the course to avoid confusion, either by making explicit what you have said or by just removing the existing reference to the immutability of strings.

patrickd314 · September 19, 2019, 7:15pm

I wish they would make these things more explicit, too, Isaac, but Codecademy is apparently committed, in Python at least, to an instruction style that strongly resists looking “beneath the hood” at the way things work.

That said, one must come to terms early on with mutability vs immutability, one way or another.

script4924157352 · January 29, 2020, 10:52am

Everything in Python is an object. You have to understand that Python represents all its data as objects. An object’s mutability is determined by its type. Some of these objects, like lists and dictionaries, are mutable, meaning you can change their content without changing their identity. Other objects, like integers, floats, strings and tuples, are immutable: they cannot be changed.

Strings are Immutable

Strings are immutable in Python, which means you cannot change an existing string. The best you can do is create a new string that is a variation on the original.
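
As a minimal sketch of what “cannot change an existing string” means in practice (the variable names are illustrative, not from the lesson): assigning to a character position raises a TypeError, while building a variation leaves the original untouched. The same kind of assignment on a list succeeds.

```python
text = "abc"

# Trying to replace one character in place fails for a string.
try:
    text[0] = "x"
except TypeError as err:
    error_message = str(err)

# The best we can do is build a new string from pieces of the old one.
variation = "x" + text[1:]

# A list, being mutable, accepts the same kind of assignment.
numbers = [1, 2, 3]
numbers[0] = 99

print(error_message)
print(text, variation, numbers)
```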

xalava · June 24, 2020, 2:26pm

Patrickd314, you’re such a good answerer. Thanks again for your great response and your time!

abhishekk492 · August 12, 2020, 5:15am

I understand that we can use id() to check that a new string is created, but why does the same string give a different id even when we aren’t reassigning it? For example:

a = "abc"
print(id(a))  # This prints a different value/id each time we run the program.

What does it mean that the same string gives a different id on each run? Why is that?

rahmandikatriputra65 · September 15, 2020, 10:07pm

lol, somehow I really like the way you respond

j-hillman · December 26, 2020, 9:23am

@abhishekk492 - Here is my unprofessional opinion:

If you are running the code within the Codecademy interface, it is quite likely that each time you press Run, it is running in a totally separate process on the Codecademy server, and therefore in a different area of memory. You might more easily replicate the examples shown if you use the interactive Python terminal on your own machine.

I’ve done that and have listed it below. In the example shown, when you see me exit the interactive Python shell and re-enter it, that is very similar to what is happening each time we press Run in the Codecademy interface.

$ python3
>>> a = "abc"
>>> print(id(a))
140116549411248
>>> print(id(a))
140116549411248
>>> exit()

$ python3
>>> a = "abc"
>>> print(id(a))
140014491795888

web6515402446 · May 13, 2021, 2:03pm

What did I do wrong?
id(string) shows the same value even after changing it.

my code

def password_generator(user_name):
    pas = ""
    print(pas)
    print(id(pas))
    for i in range(len(user_name)):
        pas += user_name[i-1]
        print(pas)
        print(id(pas))
    return pas

password_generator("many")

my output

140389274016488
y
140389252782824
ym
140389252545872
yma
140389252545872
yman
140389252545872

tgrtim · May 13, 2021, 2:25pm

It’s not something I’ve looked at directly under the hood but I’d assume Python is saving some effort doing memory allocation for an object that only has one reference. So you do create a new object but it directly writes to the existing memory address (because it is safe to do so).

Following that idea you could probably force it to use a different address by assigning additional references to that object so that it cannot be safely altered.

So try your code with the following alteration (we haven’t changed pas at all):

def password_generator (user_name):
    pas="" 
    print(pas)
    print(id(pas))
    for i in range(len(user_name)):
        pas+=user_name[i-1]
        test = pas  # a second reference to the object
        print(pas)
        print(id(pas))
    return pas

password_generator("many")

Like I say I’m fairly confident this would be the explanation but you’d have to check the source to be sure.

mtf · May 14, 2021, 3:08am

An empty string and a character are treated differently. The empty string value exists in only one location in memory, and no matter where we check the id of that value it will be the same. Same applies to number values up to some point (needs to be researched). Python has them fixed.

When we declare the single character string, it is treated as a byte and memory allocated accordingly. But it now has a new id. Once we concatenate to that byte, it gets allocated a block of memory. Again, a new id. After that, in situ operations with += all stay in the same place in memory as long as space allows.

That’s my analysis of this, at any rate.

tgrtim · May 14, 2021, 11:42am

I’d be inclined to agree with your comment about an empty string; that fits with CPython’s practice of pre-defining certain values on startup, such as the integers -5 to 256.

Found a useful link regarding interning values. According to the post “The internals of Python string interning” (though it covers only Python 2), interning seems to extend to all strings of length 0 or 1, which explains why the first two values are different in @web6515402446’s example, with further values being allocated to the same location. Repeatedly calling that function should then show those first two ids repeating.
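
A rough way to see this caching from Python itself, assuming CPython (these are implementation details rather than language guarantees, and other interpreters may behave differently). The variable names are made up for illustration. Each pair below is built at run time, yet CPython is expected to hand back the same cached object for the empty string, for a one-character string such as "y", and for a small integer such as 256.

```python
# CPython-specific sketch: none of this is guaranteed by the language.

# Two empty strings produced in different ways.
empty_a = "".join([])
empty_b = str()
same_empty = empty_a is empty_b

# Two one-character strings produced in different ways.
char_a = chr(121)      # the character "y"
char_b = "many"[3]     # "y" again, taken by indexing
same_char = char_a is char_b

# Two small integers parsed separately from text.
int_a = int("256")
int_b = int("256")
same_small_int = int_a is int_b

print(same_empty, same_char, same_small_int)
```

This would also account for the "y" produced by user_name[i-1] showing the same id on repeated calls.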

But that doesn’t explain my point about why this allocation is forced to change (for lengths greater than 0 or 1) when you make the modification I did (in the previous post) of adding a second reference to that string object. For an example of how this appears with a second reference to that same object:

140247881696880
y
140247879403888
ym
140247840202480
yma
140247840201776
yman
140247840201392

This is in contrast to the typical values I get when testing without that second object reference (the first two id values have not changed which supports the 0 or 1 length interning also being used in Python3):

140247881696880
y
140247879403888
ym
140247839285808
yma
140247839285808
yman
140247839285808

So I’d interpret += to operate in the same place only when that value can be safely altered (a single reference to that object allows it to allocate to the same spot); if that object has other references then it cannot alter the memory at that address (which seems like a very very good idea). I certainly believe you’re right about string length and memory space becoming important for longer strings though.

At a guess, I’d surmise Python re-uses the same memory address if it can… both space permitting and without altering/breaking any existing references.
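
The “without breaking any existing references” part can be sketched in a few lines (names are illustrative). Whatever the interpreter does with memory, a second name bound to the old string must keep seeing the old value, so in that case += has to produce a distinct object.

```python
s = "ym"
alias = s          # a second reference to the same string object

s += "a"           # s is rebound to the result of the concatenation

# The old object is still reachable through alias and is unchanged.
alias_unchanged = (alias == "ym")

# s now holds a different value, so it cannot be the same object.
different_objects = (s is not alias)

print(s, alias, alias_unchanged, different_objects)
```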

mtf · May 14, 2021, 2:57pm

Whew! That is quite a read. Lesson learned.

kawazackie · April 24, 2022, 10:50am

This was an amazing take. Thank you for this, because I learnt something quite tricky about the inner pipes of Python.

debadai · March 2, 2023, 4:57pm

This explanation is very useful. Thank you for taking the time to write this down.

I wonder why Python defines some objects (like strings) as immutable and other objects (like lists) as mutable. Is there a useful reason for this? As far as I have learned up to this point (I am a total beginner in learning how to code), I don’t see the point of creating new objects in order to modify the original one. It seems like a waste of memory. But that’s just my non-qualified point of view.

Thanks again for your clear explanation.

bencrosbie · August 10, 2023, 4:12pm

I wanted to jump in here with a differently worded answer:

1/ A string value is immutable, yes
2/ The variable holding the string value, however, can still be reassigned
3/ For strings, var += var2 is equivalent to var = var + var2, so you are reassigning the variable to a completely new value each time, not modifying the original

Longer explanation, with examples to help explain…

The following code

variable = "dog"
print(variable)
variable = "cat"
print(variable)
variable = 1
print(variable)
variable = "rat"
print(variable)

will output:

dog
cat
1
rat

(note that the data type of variable is not fixed either and can also be changed when it is re-assigned)

And so the following code

var1 = "cat"
var2 = "dog"
var1 += var2

is valid because it is equivalent to

var1 = "cat"
var2 = "dog"
var1 = var1 + var2

var1 is not being appended to; var1 is being re-assigned every time.
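
One caveat, sketched below with illustrative names: the equivalence holds for strings, but a list implements += as an in-place extend, so a second name bound to the same list sees the change. Writing the list version out as x = x + y creates a new list instead.

```python
# Strings: += rebinds the name, the old object is untouched.
text = "cat"
text_alias = text
text += "dog"

# Lists: += extends the existing list in place.
items = ["cat"]
items_alias = items
items += ["dog"]

# Lists: the spelled-out form builds a new list and rebinds the name.
others = ["cat"]
others_alias = others
others = others + ["dog"]

print(text, text_alias)        # the alias still holds the old string
print(items, items_alias)      # both names show the extended list
print(others, others_alias)    # the alias still holds the old list
```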

It is also worth noting that, for this reason, building a string with the += operator is generally less efficient than building a list with the append() method.

The append() method only has to write the new list item to memory.

Every time += is run on a string, in general every character in the string must be read from one memory location and re-written to memory in a different location.
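
A common way to sidestep repeated string copying, shown here as a sketch rather than a benchmark, is to collect the pieces in a list and join them once at the end. The function names are hypothetical. As discussed earlier in the thread, CPython can sometimes extend a string in place, so the real cost of += varies.

```python
def build_with_plus(parts):
    """Concatenate with +=; each step may copy the accumulated string."""
    result = ""
    for part in parts:
        result += part
    return result


def build_with_join(parts):
    """Collect pieces in a list, then build the string in one pass."""
    pieces = []
    for part in parts:
        pieces.append(part)
    return "".join(pieces)


letters = ["m", "a", "n", "y"]
print(build_with_plus(letters), build_with_join(letters))
```

Both functions return the same string; only the way the intermediate data is stored differs.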

miguelito209 · December 18, 2023, 8:03pm

Hello Isaac_r23,

Thank you for your question!

It looks like characters are being added to a string, making it seem to be mutable.

However, the trickery is that every time one adds a character or new string to a current string, Python makes a NEW string!

It might be amazing to think about, but it is true. I am sorry for not being able to describe the situation’s specifics (low-level details); I hope this explanation helps!

juniorchuang · March 16, 2024, 9:51pm

@tgrtim @mtf this is a very interesting discussion and eye opening. Just tested this myself and found the same thing - the string IDs are different when it’s empty vs. one character, but any more appending after that gives the same ID, up to a certain character length (test shows 7) and certain configurations.

Given the definition of immutable, does that still apply? Is the immutability not based on the ID information for the object then?

mtf · March 16, 2024, 10:00pm

While we can by appearance modify a string, it is reassignment that makes it possible to alter ‘in-place’. Beyond that, strings are immutable in terms of modifying letters, and so on.

One conjectures this has to do with allocated space. CPython stores each character of a string in 1, 2, or 4 bytes, depending on the widest character the string contains (this is an internal representation, not UTF-8 or UTF-16). Memory may be handed out in blocks of, say, 16 bytes, which might explain the 0 to 7 characters keeping the same id.

Bottom line, it does us little good to know how Python uses memory or why strings are immutable. “Ours is not to question, why?” It’s easy enough to work with strings, even while given this basic constraint. We have all the tools we need, and have only to learn how to eke out everything they promise to fit our model.
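
The bytes-per-character guess above can be checked directly, at least on CPython 3.3 or later (an assumption; the helper name is made up). The sketch measures how much sys.getsizeof grows when one more copy of a character is appended, for an ASCII character, a character above U+00FF, and a character outside the Basic Multilingual Plane.

```python
import sys


def bytes_per_char(ch):
    """Growth in reported size when one more copy of ch is added (CPython)."""
    return sys.getsizeof(ch * 11) - sys.getsizeof(ch * 10)


ascii_width = bytes_per_char("a")            # plain ASCII
bmp_width = bytes_per_char("\u0100")         # needs more than one byte
astral_width = bytes_per_char("\U00010000")  # outside the BMP

print(ascii_width, bmp_width, astral_width)
```

On CPython this is expected to report 1, 2, and 4 bytes respectively, which reflects the interpreter’s internal storage rather than UTF-8 or UTF-16.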