Codecademy Forums

FAQ: Learn Python: Files - Reading a CSV File

This community-built FAQ covers the “Reading a CSV File” exercise from the lesson “Learn Python: Files”.

Paths and Courses
This exercise can be found in the following Codecademy content:

Learn Python 3

FAQs on the exercise Reading a CSV File

There are currently no frequently asked questions associated with this exercise – that’s where you come in! You can contribute to this section by offering your own questions, answers, or clarifications on this exercise. Ask or answer a question by clicking reply below.

If you’ve had an “aha” moment about the concepts, formatting, syntax, or anything else with this exercise, consider sharing those insights! Teaching others and answering their questions is one of the best ways to learn and stay sharp.

Could somebody please explain to me, in layman’s terms, what the newline='' argument does? The documentation is written in a language that only takes me down a rabbit hole of words I don’t understand yet.

Footnotes

[1] (1, 2) If newline='' is not specified, newlines embedded inside quoted fields will not be interpreted correctly, and on platforms that use \r\n line endings on write an extra \r will be added. It should always be safe to specify newline='', since the csv module does its own (universal) newline handling.

That’s the bit which I do not understand. What is meant by “on platforms that use \r\n line endings on write an extra \r will be added”? Does it mean that the code will accidentally start writing at the beginning of the same line, because of incorrect interpretation? And why do we equate the word “newline” to an empty space as an argument? Am I simply missing some logical connection, or is the answer to that very technical, and I shouldn’t worry about it for now?

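Some platforms, Windows in particular, turn every \n written to a text-mode file into \r\n. The csv module already ends each row with \r\n on its own, so the two mechanisms conflict and each row ends in \r\r\n, which often shows up as a blank line between rows. Passing newline='' (an empty string, not a space) tells open() to leave line endings alone. If that is still too much detail, it’s sufficient for now to just accept that there is good reason for the recommended implementation; in due course you will get more into the technical side of things.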

1 Like

It would appear to switch off open()’s own newline translation so there are not two line endings in a row when the module inserts its own. An empty string is the value that does this; a space is not a legal value for newline and raises a ValueError.

I’ve found an interesting explanation in the documentation.

newline controls how universal newlines mode works (it only applies to text mode). It can be None, '', '\n', '\r', and '\r\n'. It works as follows:

  • When reading input from the stream, if newline is None, universal newlines mode is enabled. Lines in the input can end in '\n', '\r', or '\r\n', and these are translated into '\n' before being returned to the caller. If it is '', universal newlines mode is enabled, but line endings are returned to the caller untranslated. If it has any of the other legal values, input lines are only terminated by the given string, and the line ending is returned to the caller untranslated.
  • When writing output to the stream, if newline is None, any '\n' characters written are translated to the system default line separator, os.linesep. If newline is '' or '\n', no translation takes place. If newline is any of the other legal values, any '\n' characters written are translated to the given string.
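
The writing rule is easier to see in action. Here is a minimal sketch, assuming nothing beyond the standard library: it passes newline='\r\n' explicitly to imitate what a Windows default would do, so the result should be the same on any platform. The helper name write_rows is made up for this illustration.

```python
import csv
import os
import tempfile


def write_rows(newline_value):
    """Write two CSV rows using the given newline setting; return the raw bytes."""
    fd, path = tempfile.mkstemp(suffix=".csv")
    os.close(fd)
    try:
        with open(path, "w", newline=newline_value) as f:
            writer = csv.writer(f)  # csv.writer ends each row with \r\n by default
            writer.writerow(["a", "b"])
            writer.writerow(["c", "d"])
        # Read back in binary mode so no newline translation hides the result.
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)


windows_like = write_rows("\r\n")  # every \n written becomes \r\n
recommended = write_rows("")       # no translation at all

print(windows_like)  # b'a,b\r\r\nc,d\r\r\n'  <- the "extra \r"
print(recommended)   # b'a,b\r\nc,d\r\n'
```

The extra \r is why files written without newline='' can appear to have a blank line after every row when opened on Windows.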
1 Like

Can anyone explain why, in the solution for this exercise, they get rid of the step in which they initially created an empty list?

Could you give an example of an instance where not using newline=’’ would cause an issue?

Not without repeating the information given above and in similar Stack Overflow questions.
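
For what it’s worth, here is one case on the reading side, as a rough sketch rather than anything from the lesson. The file contents are invented for the demonstration: a quoted field that contains a \r\n line break. Without newline='', open() rewrites that line break before the csv module ever sees it, so the data you read back is not quite the data in the file.

```python
import csv
import os
import tempfile

# Invented sample data: the quoted "note" field contains a \r\n line break.
raw = b'name,note\r\n"Ann","line1\r\nline2"\r\n'

fd, path = tempfile.mkstemp(suffix=".csv")
os.close(fd)
try:
    with open(path, "wb") as f:
        f.write(raw)

    # Recommended: csv receives the line endings exactly as stored.
    with open(path, newline="") as f:
        kept = list(csv.reader(f))[1][1]

    # Default (newline=None): open() turns \r\n into \n first.
    with open(path) as f:
        changed = list(csv.reader(f))[1][1]
finally:
    os.remove(path)

print(repr(kept))     # 'line1\r\nline2'
print(repr(changed))  # 'line1\nline2'
```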

This is because the instructions ask for each line to be printed to the screen as a dictionary. They never ask for a list to be created.

This lesson is kinda messy with the instructions. If you follow them, you’ll be stuck at #2 forever, because you’ll get a SyntaxError: unexpected EOF while parsing. Steps #2 and #3 should be together; otherwise the code will raise a SyntaxError.

6 Likes

Yes, actually there are numerous instances in the Python courses, at least, where you are asked to type a line ending in a colon, and then press Run; lacking a second line, this obviously throws an error. Usually it’s a function header, but, like you, I’ve noticed it in this course in the with open() as x: construction.

Anyway, typing pass as the second line will get you past that step.
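
A small sketch of why that works, using the built-in compile() so that no file has to exist. The helper name compiles is made up for this example, and the exact error message differs between Python versions, so it is not shown here.

```python
# The header line on its own, as typed in step #2 of the exercise.
incomplete = "with open('cool_csv.csv') as cool_csv_file:\n"

# The same header followed by a placeholder body.
with_pass = incomplete + "  pass\n"


def compiles(source):
    """Return True if the source parses, False if it raises a SyntaxError."""
    try:
        compile(source, "<exercise>", "exec")  # parses only; nothing is executed
        return True
    except SyntaxError:
        return False


print(compiles(incomplete))  # False
print(compiles(with_pass))   # True
```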

(It’s also a bit annoying that the exercise checker won’t accept an open() expression that spells out the 'r' mode, even though 'r' is the default.)

2 Likes

" Since our CSV’s first line calls the third field in our CSV “ Email “, we can use that as the key in each row of our DictReader."

I’m having trouble understanding what the above quotation means.

How is the first line calling the third field?

In the solution no key arguments are specified for DictReader:

cool_csv_dict = csv.DictReader(cool_csv_file)

Hi, @petercook0108566555 Don’t worry about the word “call.” That sentence could be re-written:

Since our CSV’s first line identifies the third field in our CSV as “Email”, we can use that as the key in each row of our DictReader.

If you print out the original file:

with open('cool_csv.csv') as cool_csv_file:
  for line in cool_csv_file:
    print(line.strip())

You will see:

Cool Name,Cool Birthday,Cool Fact  
Trevor Torres,03-09-08,Has never been out of the country.
Crystal Ellis,17-11-06,Published a small biography on a local legend.
... etc.

That first line, Cool Name,Cool Birthday,Cool Fact, is the header line, used by csv.DictReader to obtain the keys for its output dictionaries.

So “Cool Fact” is actually the header or field name that we’re interested in, and that is what is used in the solution:

  for row in cool_csv_dict:
    print(row['Cool Fact'])
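
If you want to check which keys DictReader picked up, its fieldnames attribute holds them. A quick sketch, assuming the first lines of the file look like the output above; io.StringIO stands in for the real file so the snippet runs on its own.

```python
import csv
import io

# Stand-in for cool_csv.csv, using the lines shown above.
sample = (
    "Cool Name,Cool Birthday,Cool Fact\n"
    "Trevor Torres,03-09-08,Has never been out of the country.\n"
    "Crystal Ellis,17-11-06,Published a small biography on a local legend.\n"
)

reader = csv.DictReader(io.StringIO(sample))

# The header line becomes the list of keys.
header = reader.fieldnames
print(header)  # ['Cool Name', 'Cool Birthday', 'Cool Fact']

# Each remaining line becomes one dictionary keyed by those names.
facts = [row["Cool Fact"] for row in reader]
print(facts)
```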

Entirely optional below this line:


You can further investigate what’s going on:

import csv

with open('cool_csv.csv') as cool_csv_file:
  cool_csv_dict = csv.DictReader(cool_csv_file)
  print(cool_csv_dict)

Output:

<csv.DictReader object at 0x7f941b527748>

Hmm, not too useful. What’s inside that object?

import csv

with open('cool_csv.csv') as cool_csv_file:
  cool_csv_dict = csv.DictReader(cool_csv_file)
  # Each row is one line of the file, keyed by the header names
  for row in cool_csv_dict:
    print(row)

Output:

OrderedDict([('Cool Name', 'Trevor Torres'), ('Cool Birthday', '03-09-08'), ('Cool Fact', 'Has never been out of the country.')])
OrderedDict([('Cool Name', 'Crystal Ellis'), ('Cool Birthday', '17-11-06'), ('Cool Fact', 'Published a small biography on a local legend.')])
... etc.

Well! That’s a bit strange! We haven’t covered the OrderedDict type, but if you look at it a bit, you will see that each row is printed as a list of tuples, and that each tuple looks like a key:value pair from a conventional dictionary, the final key being “Cool Fact” in each case. (This is what Python 3.6 and 3.7 show. From Python 3.8 on, csv.DictReader returns plain dicts, so you may already see the dictionary output below.)

If OrderedDict is too much to deal with right now, you can convert it to a conventional dict by changing that last line from print(row) to print(dict(row)), and then the output looks like this:

{'Cool Name': 'Trevor Torres', 'Cool Birthday': '03-09-08', 'Cool Fact': 'Has never been out of the country.'}
{'Cool Name': 'Crystal Ellis', 'Cool Birthday': '17-11-06', 'Cool Fact': 'Published a small biography on a local legend.'}

So, that is what csv.DictReader does: it takes this:

Cool Name,Cool Birthday,Cool Fact
Trevor Torres,03-09-08,Has never been out of the country.
Crystal Ellis,17-11-06,Published a small biography on a local legend.
... etc

… and turns it into this:

{'Cool Name': 'Trevor Torres', 'Cool Birthday': '03-09-08', 'Cool Fact': 'Has never been out of the country.'}
{'Cool Name': 'Crystal Ellis', 'Cool Birthday': '17-11-06', 'Cool Fact': 'Published a small biography on a local legend.'}
... etc

… or, from the docs:

Create an object that operates like a regular reader but maps the information in each row to an OrderedDict whose keys are given by the optional fieldnames parameter.
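
That “optional fieldnames parameter” is the reason the solution passes no keys: when it is left out, the first line supplies them. Here is a sketch of what happens when you do pass it. The key names name, birthday and fact are made up for this example, and io.StringIO stands in for the file.

```python
import csv
import io

# Stand-in for cool_csv.csv, using the lines shown above.
sample = (
    "Cool Name,Cool Birthday,Cool Fact\n"
    "Trevor Torres,03-09-08,Has never been out of the country.\n"
)

# Hypothetical key names supplied by hand instead of taken from the header.
custom_reader = csv.DictReader(
    io.StringIO(sample), fieldnames=["name", "birthday", "fact"]
)
custom_rows = [dict(row) for row in custom_reader]

# With explicit fieldnames the header line is treated as ordinary data.
print(custom_rows[0])
# {'name': 'Cool Name', 'birthday': 'Cool Birthday', 'fact': 'Cool Fact'}
print(custom_rows[1]["fact"])  # Has never been out of the country.
```

So if you supply your own fieldnames for a file that already has a header line, you would need to skip that first row yourself.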

(And that’s nothing compared to what Pandas can do!!)

1 Like

Jeez. Not sure if I’m excited for Pandas :see_no_evil: Thx again!