Yes, passing regular functions is perfectly valid. In Python the main practical difference is that a lambda is anonymous (it has no bound name) and limited to a single expression. You can also call other functions within a lambda if you like, so it can be used with .apply, but consider .apply as more of a last resort: the operations will be significantly slower than any vectorised pandas tools. In fact it’s likely to take roughly the same time as plain iteration, since that’s what it’s doing. Using a loop might even be preferable here to nesting functions inside lambdas and trying to force the arguments into place; I’d choose a clear and readable for loop over a confusing lambda.
I’d suggest a look through the following, which go into a lot of detail about when .apply is useful.
.apply can be useful, just try not to overuse it.
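As a minimal sketch of the point about named functions versus lambdas (the function name double and the sample data are made up for illustration), both forms can be passed to .apply and give the same result, while the vectorised expression avoids .apply altogether:

```python
import pandas as pd

s = pd.Series([1, 2, 3])

def double(x):
    # A regular named function; Series.apply calls it once per element
    return x * 2

by_function = s.apply(double)          # named function
by_lambda = s.apply(lambda x: x * 2)   # anonymous equivalent
vectorised = s * 2                     # no per-element Python call at all

print(by_function.equals(by_lambda))
print(by_function.equals(vectorised))
```

On a series this small the speed difference is invisible, but the vectorised form is usually the one that scales.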
You can make your example work but you’d need to change a few things. At the moment, what are the arguments for shape going to be for a data series? When you call .apply on a series your function receives one element at a time, so you’re only going to be dealing with a single argument.
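To illustrate the single-argument point, here is a small sketch. The function scale_size and its scale parameter are hypothetical, not taken from the question. Series.apply supplies each element as the first argument, and any extra fixed values have to come in through args (or keyword arguments):

```python
import pandas as pd

def scale_size(size, scale):
    # `size` is the single element supplied by Series.apply;
    # `scale` is a hypothetical extra parameter passed through `args`
    return size * scale

sizes = pd.Series([1, 2, 3])
scaled = sizes.apply(scale_size, args=(10,))

print(scaled.tolist())
```

This only helps when the extra argument is the same for every element. If it varies per row, you need the values side by side in a frame.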
It’s a little awkward, but if you use a frame instead of a series you can access the values by column or by row:
import pandas as pd

df = pd.DataFrame(
    [[1, 2], [3, 4], [5, 6]],
    columns=["A", "B"],
)
# axis=1 means the lambda receives each row in turn
df["C"] = df[["A", "B"]].apply(lambda row: row.iloc[0] + row.iloc[1], axis=1)
# equally you could write
df["E"] = df.apply(lambda row: row["A"] + row["B"], axis=1)
# because each row is provided as a Series object (like .iloc does)
This function adds element 0 and element 1 of each row together to make a new element in a new column "C". You could equally pass the indexed values from the frame to your own function:
df["new"] = df.apply(lambda row: func(row["shape"], row["size"]), axis=1)
For anything more complicated than that I’d consider passing a regular function; there’s already quite a lot going on in that line.
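A rough sketch of what passing a regular function row by row might look like. The function describe and the sample frame are placeholders invented for illustration; your real calculation would go in the function body:

```python
import pandas as pd

def describe(row):
    # Hypothetical stand-in for the real calculation; with axis=1
    # each `row` arrives as a Series indexed by column name
    return f"{row['shape']}-{row['size']}"

df = pd.DataFrame({"shape": ["circle", "square"], "size": [2, 3]})
df["new"] = df.apply(describe, axis=1)

print(df["new"].tolist())
```

Giving the logic a name keeps the .apply line short and lets you test the function on its own.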
The frame itself would then look like the following, where "C" is just the sum of the elements before it (only columns "A" to "C" shown):
   A  B   C
0  1  2   3
1  3  4   7
2  5  6  11
I included this as an example, though the vectorised option is obvious in this case: df["A"] + df["B"]. As a compromise, a list comprehension might be an option, e.g. df["D"] = [a + b for a, b in zip(df["A"], df["B"])]
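For reference, a short sketch putting the three approaches side by side on the same small frame (the variable names are just for illustration). All three produce the same values; the difference is readability and speed:

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [3, 4], [5, 6]], columns=["A", "B"])

# Row-wise apply: one Python call per row
via_apply = df.apply(lambda row: row["A"] + row["B"], axis=1)

# Vectorised: operates on whole columns at once
via_vector = df["A"] + df["B"]

# List comprehension: a plain Python loop over the paired columns
via_listcomp = pd.Series(
    [a + b for a, b in zip(df["A"], df["B"])], index=df.index
)

print(via_apply.tolist())
print(via_vector.tolist())
print(via_listcomp.tolist())
```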