Python: how to replace text in a file between two tags?


#1

Hello,

I have a long text, and I'm using .replace() to replace text in a file:
if '>PRESS AREA' in filedata:
    filedata = filedata.replace('>PRESS AREA', '>PRESS')

But the block of text that I want to replace is sometimes very long, and I don't want to have this text in my Python file, so I would like to replace the content between two tags.

For example, this is my text:
'text beginning : blablabla blablaba text end.'
Here the two tags would be 'text beginning' and 'text end', and I would like what is inside (blablabla etc.) to get replaced.

Thank you!


#2

I wouldn't even use Python for this. Use sed; it's designed specifically for what you want. Use the tool that fits the job. An example is here:

https://stackoverflow.com/questions/10613643/replace-a-unknown-string-between-two-known-strings-with-sed


#3

Hey,
thanks for the answer, but since I'm learning Python I would like to use that language here.
Thanks for understanding.


#4

Why? There is a really good tool available (sed) for what you need to do, so why use a tool (Python) which is not as good? In programming, you use the tool that fits the job, even if that means learning that tool. Use Python where suitable.

.replace() only works for strings, so you would need to loop over all the lines/sentences in filedata (replace will not work on filedata), and then replace won’t even be good enough (there is no way to get the middle bit with replace). So then you would have to find a way to extract the middle bit, which isn’t easy.

Or you use sed, where you have a single line solution.
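For anyone who does want to stay in Python, here is a rough sketch of one way to get at the middle bit using only string methods. The function name replace_between is made up for this example, and it assumes the tags should stay in the text and only the first matching pair should be handled:

```python
def replace_between(text, start_tag, end_tag, new_middle):
    """Return text with whatever lies between the first start_tag and the
    next end_tag swapped for new_middle. The tags themselves are kept."""
    start = text.find(start_tag)
    if start == -1:
        return text  # start tag not present: leave text unchanged
    middle_start = start + len(start_tag)
    end = text.find(end_tag, middle_start)
    if end == -1:
        return text  # no end tag after the start tag
    return text[:middle_start] + new_middle + text[end:]


sample = "text beginning : blablabla blablaba text end."
result = replace_between(sample, "text beginning", "text end", " : something new ")
print(result)  # text beginning : something new text end.
```

This works on the whole string returned by file.read(), so no loop over lines is needed for this approach.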


#5

Well, I started to learn Python 3 weeks ago, so I don't necessarily want to mix in a new language now.
How can I loop over it?
Thanks


#6

sed is a stream editor for filtering and transforming text; it's not an entirely new language.

as for the loop:

for line in filedata:

how are you even going to extract the middle bit? Are you sure this is the right task to do right now?


#7

I tried this:
for line in filedata:
    filedata = re.sub('\nbeginnign text.*?end texte', 'new texte', filedata, flags=re.DOTALL)

Any tips?


#8

you can’t substitute on filedata, you would have to do it line by line:

https://stackoverflow.com/questions/18935626/regex-re-sub-list-in-a-file

take a minute to reflect on what methods you are using, on what data types they work, and what data types you are using.

This is why I recommended sed: it's pretty similar to what you do with re.sub here, except it will work on a file.


#9

OK, thanks. Can I import sed into my current Python file and use it in this same code?


#10

Hm… someone here uses read() and then runs a sub on the result:

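import re

# Read the whole input file into a single string
with open("input.txt") as infile:
    text = infile.read()

# re.M makes ^ and $ match at the start and end of every line,
# so each 15-character line has its groups reordered
with open("output.txt", "w") as outfile:
    outfile.write(re.sub(r'^(.{4})(.{4})(.{4})(.{3})$',
                         r'\1\4\2\3',
                         text,
                         flags=re.M))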

that seems possible.

Interesting, I didn't know that was possible. But you never showed me how you read the file data, so it's difficult for me to say which of these approaches you should go for.

no, sed is a separate tool you would need to install


#11

this is my code

# Read in the file
import re

with open('/Users/lele/Desktop/ jully/tool /test.php') as file :
  filedata = file.read()


for line in filedata:
  filedata=re.sub('beginnign text.*?end texte','new texte',filedata, flags=re.DOTALL)



with open('/Users/lele/Desktop/gallery jully/tool galleria continua/test.php', 'w') as file:
  file.write(filedata)


file.close()

#12

if you use read, you might be able to do without the for loop.
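As a rough sketch of what that could look like: file.read() returns one string, and re.sub can be applied to that whole string in a single call, with re.DOTALL letting `.` match across newlines. The helper name replace_between_tags and the sample text are made up for illustration, and unlike the attempt above this version keeps the two tags and only swaps what is between them:

```python
import re


def replace_between_tags(text, start_tag, end_tag, new_middle):
    # re.escape makes the tags match literally even if they contain
    # characters such as '.' or '?' that are special in a regex.
    pattern = re.escape(start_tag) + r".*?" + re.escape(end_tag)
    # Using a function as the replacement means backslashes in
    # new_middle are not interpreted as regex escapes.
    return re.sub(pattern,
                  lambda match: start_tag + new_middle + end_tag,
                  text,
                  flags=re.DOTALL)


# Stand-in for the string returned by file.read()
filedata = "header\nbeginnign text\nline one\nline two\nend texte\nfooter\n"
filedata = replace_between_tags(filedata, "beginnign text", "end texte", "\nnew texte\n")
print(filedata)
```

Looping with `for line in filedata` over a string actually walks it one character at a time, so the loop in the posted code just repeats the same substitution many times.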

can you post the content of test.php? Please use markup

Does it even read the file data properly? Have you printed filedata to check? Your path is really confusing. A Users directory sounds like Windows, but you don't use c:\; instead you start from the root directory and use forward slashes (a *nix system), which doesn't add up.


#13

Well, yes, it's working for a simple structure, like replacing a known expression with an 'if'
(it's on a Mac). I tested it, and it changes my file the way I want:

# Read in the file
with open('/Users/lele/Desktop/tool /shilpa-gupta-250.php') as file:
    filedata = file.read()

if '>CONTACTS' in filedata:
    filedata = filedata.replace('>CONTACTS', '>联络我们')

# Write the file out again
with open('/Users/lele/Desktop/shilpa-gupta-250.php', 'w') as file:
    file.write(filedata)

The content to replace between the two tags is full of " and “” characters, so apparently I cannot put it directly in the Python file. That's why I need to work with two tags. An example of the content:
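One possible way around the quoting problem, sketched under assumptions: keep the new block in its own text file and read it in, so its quote characters never have to be written inside a Python string literal. The function name replace_block_from_file and both file arguments are hypothetical, and the sketch handles only the first pair of tags and keeps the tags in place:

```python
from pathlib import Path


def replace_block_from_file(target_path, replacement_path, start_tag, end_tag):
    """Replace what sits between start_tag and end_tag in target_path with
    the full contents of replacement_path. Returns True if a change was made."""
    target = Path(target_path)
    text = target.read_text(encoding="utf-8")
    new_middle = Path(replacement_path).read_text(encoding="utf-8")

    # partition splits at the first occurrence and returns an empty
    # separator string when the tag is not found
    before, found_start, rest = text.partition(start_tag)
    middle, found_end, after = rest.partition(end_tag)
    if not found_start or not found_end:
        return False  # one of the tags is missing: file left untouched

    target.write_text(before + start_tag + new_middle + end_tag + after,
                      encoding="utf-8")
    return True
```

It would be called with the path of the PHP file, the path of the file holding the new block, and the two tag strings. Since the replacement is read as plain data, any mix of ' and " inside it is fine.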


#15

Like I said, you need to use formatting/markup. See here:

How do I format code in my posts?