Is there a way to open a file in both read and write?

Having gone through the whole lesson, I didn’t come across a way of opening a file in both ‘read’ and ‘write’ mode (I might have missed something). Is there such a way?

if you check the documentation:

https://docs.python.org/3/tutorial/inputoutput.html#reading-and-writing-files

you will see that this mode exists. There is read (r), write (w), append (a) and read + write (r+).

Grand, thank you! I was on that page but clearly didn’t read it through…

Thank you, I saw it, but I have this kind of code:

with open("bad_bands.txt", "r+") as bad_bands_doc:
  bad_bands_doc.write("Grupa Tuberculyos")
  print(bad_bands_doc.read())

with open("bad_bands.txt") as bad_bands_doc:
  print(bad_bands_doc.read())

and it outputs:


Grupa Tuberculyos

There is an empty first line. I would expect the first line to be the same as the second. Could you help me understand why it is not?
Thank you.

I ran the code on repl.it and it went fine. Maybe there was already content in the file? You could try removing the file and creating it again.

Is it fine though? What I would expect is two identical lines of output:

Grupa Tuberculyos
Grupa Tuberculyos

Since we opened the file with “open("bad_bands.txt", "r+")”, i.e. with "r+", I would expect that we could write to the file with “bad_bands_doc.write("Grupa Tuberculyos")” and then read it back with “print(bad_bands_doc.read())”, so that the output would be two identical lines.

I think the problem is with how Python handles files.

here:

  bad_bands_doc.write("Grupa Tuberculyos")
  print(bad_bands_doc.read())

The write is kept in memory (to minimize interaction with the disk, which is slow), so when you then call read(), there is nothing in the file yet.

It displays as expected when I used the .close function. According to the course we don’t need to use close() because using “with” automatically closes the file, however, I needed to explicitly close the file before I could read it.

with open('bad_bands.txt', 'r+') as bad_bands_doc:
  bad_bands_doc.write('I like all bands')
  bad_bands_doc.close()

with open('bad_bands.txt', 'r+') as bad_bands_doc:
  print(bad_bands_doc.read())

I’m unable to reproduce your issue on this lesson. It works fine without having to call .close(). What was your indentation here? See “How do I format code in my posts?” for how to include proper formatting. Were those statements nested?

I think you’re saying that the .write() is saved to be executed just before the file is closed, to minimize disk interaction, so the order you write code in doesn’t matter?

So does this mean that if I want to write something to a file and then read it, it’s best to create new file objects for each task?

Running programs exist in RAM (random access memory). What I assume happens is that when we open a file, the file is loaded from the disk into RAM; then, when we close the file, the content gets written to the disk.

I don’t think this represents a real-world scenario. In a real-world scenario you write something to disk because you need the data to persist after the program has finished running (and is “removed” from RAM).

So writing to a file and then reading from it right away kind of feels wrong. RAM is much faster, so simply do something like:

with open("bad_bands.txt", "r+") as bad_bands_doc:
  to_write = "Grupa Tuberculyos"
  print("writing to file: {}".format(to_write))
  bad_bands_doc.write(to_write)

The file object and the file are different. I understand that the .write() isn’t executed to the file itself until the file is closed, to reduce interaction with the disk. But the .read() runs on the file object, not the file (I assume, because why else have a file object except to store it in memory and reduce disk interaction) so why doesn’t it .read() the rewritten file object that is already in memory, rather than the empty file?

I’m not sure how your code solves the problem. If I want to write a user’s input to a file, and then immediately read the contents of that file, how would I do that?

Not sure, you would need to delve into how the file object works in more depth. This is something very specific I do not know without doing research

That is essentially the same, except that you have user input. You already have the input (which you could store in a variable), which means you don’t need to read it from the file.

This is getting back to old Unix-isms. Everything you have written to the “file” is available to read back, whether or not it has been flushed to media yet. There is no wait period; stuff just wouldn’t work if this weren’t honored. However, when you have just written to the file, the file pointer is poised for the next write, so that the next write doesn’t overwrite the data you just wrote. If you execute a read at that point, you read from that position onward, which is past everything you have just written, so nothing comes back.

Closing and reopening the file does position the read pointer back at the beginning of the file, and that is why it appears to do the desired magic. What you really want to accomplish is a subset of that, though: you want to reposition the file pointer back to the start of the file to read it. That is done with a file rewind or, using what’s available in the Python interface, a seek to byte zero:

bad_bands_doc.write("hello")
bad_bands_doc.seek(0) # your next read or write will be at the start of the file
hope_and_pray = bad_bands_doc.read()
And see if you get your data back

If you think about it, when you read, what was it you wanted to read? The last write, the whole file, etc. This specifies the “where”.
If you are on a non-Windows system, run “man fopen” and “man fseek” for more information.
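
For anyone who wants to try this end to end, here is a minimal sketch. It assumes a scratch file in a temporary directory, and the helper name write_then_read is made up for illustration. It reads once right after the write and once more after seeking back to byte zero:

```python
import os
import tempfile


def write_then_read(path, text):
    """Write text through an "r+" handle, then read before and after rewinding."""
    open(path, "w").close()  # "r+" needs an existing file, so create an empty one
    with open(path, "r+", encoding="utf-8") as f:
        f.write(text)
        at_end = f.read()      # position is just past the write, nothing left to read
        f.seek(0)              # rewind to byte zero
        from_start = f.read()  # now the whole file comes back
    return at_end, from_start


with tempfile.TemporaryDirectory() as tmp:
    at_end, from_start = write_then_read(
        os.path.join(tmp, "bad_bands.txt"), "Grupa Tuberculyos"
    )

print(repr(at_end))      # ''
print(repr(from_start))  # 'Grupa Tuberculyos'
```

The first read comes back empty because of where the file position is, not because the data is missing.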

First time posting/formatting here. Be gentle.
(I’m just taking this course this week, but I come from an open systems background.)

I’ve never dug too deep in Python’s file objects but I believe the following gives a rough overview (I’ve tried to keep it vague to avoid inaccuracy but I can’t promise everything below is 100% accurate, higher level means ignoring implementation sometimes).

Python doesn’t read an entire file into memory by default (it could easily be a file larger than the available memory and even if it isn’t, loading the entire thing is rarely necessary). There’s a difference between the file object and a buffer (which would be all or part of the file read into memory). You can read about and change how Python uses buffers with file objects using open(). You’d typically read a file in chunks (there is a parameter for this in the .read() method) so read chunk, process chunk, repeat.
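
As a rough sketch of the read chunk, process chunk, repeat pattern: the function name count_bytes_in_chunks and the 4096-byte chunk size are arbitrary choices for illustration, and the “processing” here is just counting bytes.

```python
import os
import tempfile


def count_bytes_in_chunks(path, chunk_size=4096):
    """Read a file piece by piece instead of loading it all at once."""
    total = 0
    chunks = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)  # at most chunk_size bytes per call
            if not chunk:               # empty bytes object means end of file
                break
            total += len(chunk)         # "process" the chunk
            chunks += 1
    return total, chunks


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.bin")
    with open(path, "wb") as f:
        f.write(b"x" * 10000)
    total, chunks = count_bytes_in_chunks(path)

print(total, chunks)  # 10000 3
```

Only one chunk is held in memory at a time, which is what makes this workable for files larger than the available RAM.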

This is also the case for what you could call the write buffer. It is much more efficient to write large chunks of data to the disk at once, so writes are often buffered, and the size of the buffer determines when the actual writes occur. As you mention, this may be when the file is closed; it may also happen once the buffer reaches a certain size, or you can force a flush of the buffer (thereby writing it).

On the other hand, this has to be balanced against memory efficiency. Say I’m writing a few hundred GB of video: most systems simply cannot support keeping the entire thing in memory. Both reading and writing in moderate chunks is typically a more sensible approach (it’s a bit of a balancing act between memory efficiency and the number of read/write operations). As discussed at the link above, you can control the size of these buffers to some degree, although the defaults are normally decent for simple operations.

Bear in mind that your operating system also uses its own buffers so trying to guess exactly when a file is being written to is difficult; unless you spent a lot of time looking into it I’d always err on the side of caution.
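
A hedged sketch of forcing the two layers by hand: f.flush() empties Python’s own buffer, and os.fsync() asks the operating system to push its buffers to the device. The scratch file and the second handle named other are just for illustration, and what the first peek sees depends on buffering, so no result is claimed for it.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "bands.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write("Talking Heads\n")

        # Peek through a second handle; the small write may still be buffered.
        with open(path, encoding="utf-8") as other:
            before_flush = other.read()

        f.flush()             # hand Python's buffered data to the OS
        os.fsync(f.fileno())  # ask the OS to write its buffers to the device

        with open(path, encoding="utf-8") as other:
            after_flush = other.read()

print(repr(before_flush))
print(repr(after_flush))  # 'Talking Heads\n'
```

None of this is needed to read your own writes back through the same file object; it only matters for other readers of the file or for durability.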

As mentioned above, there are few reasons why you’d want to do something like this if you already have the object; the safest and easiest option has already been mentioned: just store the original input. In theory you could use r+ or similar and move the file pointer with f.seek as @net3912806482 mentions (which I believe flushes the buffer when called anyway), but I haven’t personally tried it. I’d just be cautious working around .flush and .seek unless you spend some time looking into them (do they explicitly state that the system will write to the file?), as you’re dealing with multiple layers of abstraction.

Great, thank you!

Totally agreed!

@net3912806482 I don’t think there’s a real need to be gentle seeing as you’ve gone and posted the only actual answer to @xalava 's question thus far :slight_smile: If I could give you more than one like I would!

I agree wholeheartedly: this behavior has nothing to do with buffers or memory or delayed writes or any of the other conjectures being made, and everything to do with the current position in the file.

Unfortunately the official docs are a bit thin on these finer-grained details (see the “Built-in Functions” page of the Python 3.10.0 documentation), and as you mentioned, it seems this behavior is inherited from Unix/C rather than being intrinsic to Python, but the different modes give you something to chew on:

“r” - since it’s opening for reading, the starting “position” will be the beginning, or 0
“w” - write but truncate; the starting “position” will be 0, but any contents the file had previously are gone
“a” - opens the file for writing, but opens it at the end and keeps the existing content
“r+” - this one’s actually kind of weird I suppose; it starts at position 0 but keeps the old contents. Anything new you write will overwrite what’s at the current position.
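
A small sketch to check those starting positions with .tell(). The helper start_position is hypothetical, and it opens the file in binary mode (mode plus "b") so that .tell() is a plain byte offset:

```python
import os
import tempfile


def start_position(path, mode):
    """Open path in the given mode (binary) and report the initial position."""
    with open(path, mode + "b") as f:
        return f.tell()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "good_bands.txt")
    with open(path, "w", encoding="utf-8", newline="\n") as f:
        f.write("The Beets\nGrateful Dead\n")  # 24 bytes

    size = os.path.getsize(path)
    pos_r = start_position(path, "r")
    pos_r_plus = start_position(path, "r+")
    pos_a = start_position(path, "a")
    size_after_a = os.path.getsize(path)  # "a" keeps the existing content
    pos_w = start_position(path, "w")
    size_after_w = os.path.getsize(path)  # "w" truncates on open

print(pos_r, pos_r_plus, pos_a, pos_w)  # 0 0 24 0
print(size, size_after_a, size_after_w)  # 24 24 0
```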

I think the initial experiment of writing followed by reading might be more interesting if you added some additional lines to your text file.

Also have a look at the .tell() method, which will tell you the current position in the file that the file object is at.

Try doing something like this for example:

# contents of good_bands.txt
The Beets
Grateful Dead
Talking Heads
Mountain Goats

with open("good_bands.txt", "r+") as good_bands:
  good_bands.write("The Beatles\n")
  # Now the Beatles have replaced the Beets and overwritten the "Gr" on line 2
  good_bands.tell()          # returns 12, the current file position
  print(good_bands.read())   # => prints the rest of the file from where you stopped writing

# new contents of good_bands.txt
The Beatles
ateful Dead
Talking Heads
Mountain Goats

That said, as was mentioned above, I don’t think the immediate read after writing is a workflow you’d encounter much in actual coding.

You might instead open the file for appending (“a”), write a few lines to the end, then either .seek() back to the beginning (which needs “a+” rather than “a”, so that the file is readable as well), or more likely just close it and reopen it in read mode (“r”), setting your position back to the top in the process.

I’d love to be corrected on the following or receive more information, as it’s a long time since I really used C and even at the time I’d have considered myself barely proficient.

Mixing reads and writes has been a problem for as long as I remember. The C standards have always required a flush or file positioning in between a write and a following read when a file is opened with + for update. What I had forgotten is that a read followed by a write in the same circumstances requires explicit file positioning; a flush is not enough. So far as I’m aware, Python (CPython, I’ll not try to guess for the alternatives) does not deviate from this behaviour either.

If I modify your example you can (probably) observe some potentially unexpected behaviour if you rely on ‘position’ alone-

with open("good_bands.txt", "r+") as good_bands:
    print(good_bands.read(4))
    print(f"After the first read, position = {good_bands.tell()}")
    print('About to write "The Beatles"')
    good_bands.write("The Beatles\n")
    print(f"After writing, position = {good_bands.tell()}")
    print(f"Now reading...\n {good_bands.read()}")
    print(f"After this read... position = {good_bands.tell()}")

On Windows I get the following output (CPython 3.8):

The 
After the first read, position = 4
About to write "The Beatles"
After writing, position = 69
Now reading...

After this read... position = 69

The actual contents of the file end up like-

The Beets
Grateful Dead
Talking Heads
Mountain Goats
The Beatles

I’m fairly confident it’s based on the fact that C requires an explicit position for writing following reading or you’ll run into undefined behaviour (I dare say we’d all like to avoid that) and to my knowledge nothing in CPython overrides this behaviour.

Granted, for that specific scenario no amount of flushes will change the outcome (it’s position that must be set) but avoiding undefined behaviour should always be the right choice.

To work around this you’d need to explicitly set the position in between the read and the write; the example given is good_bands.seek(good_bands.tell()) before the write.
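
Here is a sketch of that workaround, under some assumptions: the file is created fresh in a temporary directory, newline="\n" is passed so the byte counts are the same on every platform, and the helper name read_then_write is made up. Only the repositioned result is shown as expected output, since the other case is exactly the behaviour that is best not relied on.

```python
import os
import tempfile

ORIGINAL = "The Beets\nGrateful Dead\nTalking Heads\nMountain Goats\n"


def read_then_write(path, reposition):
    """Read 4 characters, optionally reposition, write, and return the file."""
    with open(path, "w", encoding="utf-8", newline="\n") as f:
        f.write(ORIGINAL)
    with open(path, "r+", encoding="utf-8", newline="\n") as f:
        f.read(4)                # reads "The "
        if reposition:
            f.seek(f.tell())     # explicitly set the position before writing
        f.write("The Beatles\n")
    with open(path, encoding="utf-8", newline="\n") as f:
        return f.read()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "good_bands.txt")
    repositioned = read_then_write(path, reposition=True)
    not_repositioned = read_then_write(path, reposition=False)

print(repr(repositioned))
# 'The The Beatles\nul Dead\nTalking Heads\nMountain Goats\n'
print(repr(not_repositioned))
```

With the seek in place, the write lands right after the four characters that were read, overwriting the next twelve.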

That’s even weirder - I see the same behavior locally (Windows, python 3.9).

Just running some experiments (opened with ‘r+’, mind you) it seems that:

  • You can read as many times as you want, and the position will update to the end of your last read, i.e:

    f.read(4)
    f.tell()    # 4
    f.read(4)
    f.tell()    # 8

  • If you write first, then read, the position stays at the end of your write, as in my previous example

  • If you read first, then write, the position jumps to the end of the file, as in your last example

I dare say you’re right and this fits with the C explanation; this is well into undefined behavior territory and probably best avoided.

So far as I can tell from the docs, read then write requires the position to be set, and write then read should always be flushed after the write and before the read (otherwise both risk undefined behaviour). I’d definitely be on your side that, if you can, you should just try to avoid it entirely :grin:; it sounds like a right minefield even if you could test on different platforms.