I have the same question. I am also curious as to why we need to include “axis = 1” when that was not necessary in the previous exercise when we created a new column with:
When you show the solution, you should be able to go back and look at the code you attempted. View Solution deletes your code and you can’t see what you did wrong. Also, View Solution should only do one step at a time.
I get an invalid syntax error every single time I try to enter my code for the total_error lambda function, even when I just copy the solution. What is going on? It says you don’t need the backslashes but they seem required. This has happened on the last 3 exercises for this lesson.
This is my code so far:
total_earned = lambda row: (row.hourly_wage * 40) + ((row.hourly_wage * 1.5) * (row.hours_worked - 40))
if row.hours_worked > 40
else row.hourly_wage * row.hours_worked
The backslash is an “escape” character. I think that they wanted to put a newline in the code, probably for readability, but they didn’t want the newline to affect the code itself, so they used the backslash to “escape” the effect of the newline.
I think it is because you have whitespace / newlines before “if” and before “else” in your code which is interrupting the logic. I tried your code that you pasted and it gave me a syntax error. I deleted the whitespace/newlines and it worked.
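For reference, here is a minimal sketch of the lambda from the post above with the line breaks joined by backslashes, so the whole thing is one logical line. The DataFrame and its values are made up for illustration; only the column names come from the posted code. Note that nothing, not even a space, may follow a backslash on its line.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the thread.
df = pd.DataFrame({
    "hourly_wage": [10.0, 20.0],
    "hours_worked": [45, 30],
})

# Each trailing backslash joins the next physical line to this one,
# so Python reads the lambda as a single conditional expression.
total_earned = lambda row: (row.hourly_wage * 40) + ((row.hourly_wage * 1.5) * (row.hours_worked - 40)) \
    if row.hours_worked > 40 \
    else row.hourly_wage * row.hours_worked

# axis=1 passes each row to the lambda.
df["total_earned"] = df.apply(total_earned, axis=1)
print(df)
```

The first row gets overtime pay (40 regular hours plus 5 hours at 1.5 times the wage) and the second row gets plain hourly pay.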
df['new_column'] = df.apply(some_function, axis=1) means that the function is applied to each row as a whole.
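To illustrate the difference, here is a small sketch. I am assuming the previous exercise applied a function to a single column, which is why it needed no axis argument; the column names and values below are hypothetical.

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({"price": [5, 12], "quantity": [2, 3]})

# Applying to one column (a Series): the function receives one value
# at a time, so there is no axis to choose.
df["price_plus_10"] = df["price"].apply(lambda p: p + 10)

# Applying to the whole DataFrame with axis=1: the function receives
# one row at a time and can read several columns from it.
df["total"] = df.apply(lambda row: row["price"] * row["quantity"], axis=1)

print(df)
```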
It’s not necessary to name the parameter “row” here: lambda row: row['price'] + 10 is the same as lambda x: x['price'] + 10, or any other parameter name. Whatever it is called, the parameter receives each row of the DataFrame in turn as the argument.
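A quick way to check this yourself, using a made-up one-column DataFrame:

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({"price": [5, 12]})

# Same function body, different parameter names.
with_row = df.apply(lambda row: row["price"] + 10, axis=1)
with_x = df.apply(lambda x: x["price"] + 10, axis=1)

# Series.equals compares the two results element by element.
print(with_row.equals(with_x))
```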
But I found something interesting: row['column'] and row.column are not the same if the column holds strings. The second form is fine for numbers but doesn’t work correctly with strings:
Besides, we can apply a function not only to each row as a whole but also to each column as a whole if we write axis=0:
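A minimal sketch of the two directions side by side, with hypothetical column names and values:

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({"price": [5, 12], "quantity": [2, 3]})

# axis=0: the function receives each column as a whole,
# so the result has one entry per column.
column_sums = df.apply(lambda col: col.sum(), axis=0)

# axis=1: the function receives each row as a whole,
# so the result has one entry per row.
row_sums = df.apply(lambda row: row.sum(), axis=1)

print(column_sums)
print(row_sums)
```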
Now I see that the cause of this error is not the data type but the column’s name, “name”. It seems that pandas treats name in row.name as the row’s index label rather than as the column:
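Here is a small sketch that reproduces the row.name behaviour, assuming a DataFrame with a column literally called "name" (the data is made up). Every pandas Series has its own name attribute, and for a row passed in by apply with axis=1 it holds that row's index label, so dot access never reaches the column.

```python
import pandas as pd

# Hypothetical data with a column called "name".
df = pd.DataFrame({"name": ["Ann", "Bob"], "price": [5, 12]})

# Dot access hits the Series attribute, i.e. the row's index label.
by_attribute = df.apply(lambda row: row.name, axis=1)

# Bracket access always looks up the column.
by_key = df.apply(lambda row: row["name"], axis=1)

print(by_attribute.tolist())  # [0, 1]
print(by_key.tolist())        # ['Ann', 'Bob']
```

Bracket access is the safer habit whenever a column name might clash with an existing attribute or method.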
I don’t recall seeing it in any lesson relating to lambda or if/else before this. No lambda or if/else up until now has required it, but this lesson does. Is it because we’re using Pandas? Is it a Pandas or table exclusive requirement?
The purpose of backslashes is joining lines. See this Python reference about explicit and implicit line joining (2.1.5 and 2.1.6) if you like.
For example, if we remove all backslashes from the first code in the post of @sionchen, we will get a SyntaxError. To avoid this, we need to do some line joining, or make everything into one line without any line breaks.
It’s because of the way Python works: you can’t put a line break just anywhere within a line of code. When a line of code is too long and you want to continue on a new line for better readability, you use \
So as far as I know, it doesn’t matter whether you’re using lambda, pandas, or anything else. You can use \ whenever you want to split one statement across multiple lines.
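To show that this has nothing to do with pandas, here is a sketch in plain Python comparing explicit joining (backslashes) with implicit joining (wrapping the expression in parentheses). The SimpleNamespace object is just a hypothetical stand-in for a row with the two attributes used earlier in the thread.

```python
from types import SimpleNamespace

# Hypothetical stand-in for a row: 45 hours at a wage of 10.
row = SimpleNamespace(hourly_wage=10, hours_worked=45)

# Explicit line joining: a backslash at the end of each continued line.
pay_explicit = lambda r: r.hourly_wage * 40 + r.hourly_wage * 1.5 * (r.hours_worked - 40) \
    if r.hours_worked > 40 \
    else r.hourly_wage * r.hours_worked

# Implicit line joining: line breaks inside parentheses are ignored,
# so no backslashes are needed.
pay_implicit = lambda r: (
    r.hourly_wage * 40 + r.hourly_wage * 1.5 * (r.hours_worked - 40)
    if r.hours_worked > 40
    else r.hourly_wage * r.hours_worked
)

print(pay_explicit(row), pay_implicit(row))
```

Both lambdas compute the same value; the parenthesized form avoids the problem of a stray space after a backslash.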