Need help cleaning and normalizing data in a CSV file using Python

I need help automating a very boring task with Python :sweat_smile:

I work as a digital marketer and often find myself having to clean and normalize data in Excel so I can import it into our company’s CRM.

Basically, I have to check the data in certain cells of the spreadsheet and perform lots of Search & Replace operations, always on the same kind of data.

For example, I’ve got a ‘role’ column with values in French and I have to translate them into Spanish:

  • Enseignant → Profesor
  • Coordinateur → Coordinador
  • Directeur → Director
  • Autre → Otros

It’s quite a boring job and I’m sure that Python could help me automate it, but I’m just a beginner coder and can’t do it on my own yet.

Could anyone give me a hand getting started?

Thanks a lot in advance :blush: :pray:

Welcome to the forums, @icalvo!

Python can definitely help you out here. One easy way would be to:

  • Import the CSV as a Pandas DataFrame
  • Change the values with Pandas
  • Export as a new CSV
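
To make those three steps concrete, here is a minimal sketch rather than a finished solution. It assumes the column is called `role`, as in your example, and uses a small in-memory sample instead of a real file so it runs on its own. The function name `translate_roles`, the sample rows, and the file names mentioned in the comments are all made up for illustration.

```python
import io

import pandas as pd

# Mapping taken from the example in the question (French -> Spanish).
ROLE_TRANSLATIONS = {
    "Enseignant": "Profesor",
    "Coordinateur": "Coordinador",
    "Directeur": "Director",
    "Autre": "Otros",
}


def translate_roles(df, column="role"):
    """Return a copy of df with known values in `column` translated."""
    cleaned = df.copy()
    # Strip stray whitespace, then swap known values.
    # Values missing from the mapping are left unchanged.
    cleaned[column] = cleaned[column].str.strip().replace(ROLE_TRANSLATIONS)
    return cleaned


# Step 1: import the CSV. This in-memory sample stands in for
# pd.read_csv("contacts.csv"); the file name and rows are hypothetical.
sample_csv = io.StringIO(
    "name,role\nAnne,Enseignant\nLuc, Directeur \nZoé,Stagiaire\n"
)
df = pd.read_csv(sample_csv)

# Step 2: change the values.
cleaned = translate_roles(df)

# Step 3: export as a new CSV. Without a path, to_csv returns a string;
# pass a path such as "contacts_clean.csv" to write a file instead.
output_csv = cleaned.to_csv(index=False)
print(output_csv)
```

With a real export you would swap the sample for `pd.read_csv` on your own file and add one mapping per column you need to normalize. Leaving unmapped values untouched is a deliberate choice here, so that unexpected entries stay visible instead of being silently dropped.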

If you are a beginner, you’ll want to make sure you have a grasp on at least lists, strings, loops and maybe functions. Then learning Pandas won’t take too long. There are a few courses on it in the Codecademy Pro curriculum: Learn Data Analysis with Pandas and How to Clean Data With Python.

If you don’t have, don’t want, or can’t afford a Pro subscription then you should be able to Google what I’ve outlined above and find some other resources online as well.

Hope this helps. Happy coding!


I’ve finally managed to do the data cleaning using Pandas :raised_hands:

The suggested resources were very useful. Thanks a lot!
