FAQ: Working with Multiple DataFrames - Inner Merge II

This exercise is from the lesson "Working with Multiple DataFrames".

Hi everyone,

I’m a total newbie to programming and have started with the Data Science path. Everything has gone (relatively ;) ) smoothly so far; however, I’ve now stumbled upon a hurdle: in this lesson (Inner Merge II, Instruction 3), it seems that some basic Python knowledge is already assumed.

My question: should I have started with the Python programming course first?

Thanks in advance!

Hi, I’m having the same problem. Everything was going fine until here and now it seems to require a higher level of coding knowledge. I’m stuck!

Hi guys, any idea why join doesn’t work here?

import codecademylib
import pandas as pd

sales = pd.read_csv('sales.csv')
targets = pd.read_csv('targets.csv')

sales_vs_targets = pd.merge(sales, targets)
sales_vs_targets_2 = sales.join(targets, on='month')


It showed this error:
columns overlap but no suffix specified: Index([u'month'], dtype='object')

According to the documentation for the join method (see the last example at the bottom), join always uses the index (not a column) of the DataFrame passed as the argument. So it seems that the month column of targets should be set as its index using the set_index method:

sales_vs_targets_2 = sales.join(targets.set_index('month'), on='month')
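
If you want to try this outside the lesson, here is a minimal sketch. The two DataFrames are made-up stand-ins for sales.csv and targets.csv (the numbers are invented, not the lesson's data), and the exact wording of the error may differ depending on your pandas version.

```python
import pandas as pd

# Hypothetical stand-ins for sales.csv and targets.csv (values are invented)
sales = pd.DataFrame({'month': ['January', 'February', 'March'],
                      'revenue': [300, 290, 310]})
targets = pd.DataFrame({'month': ['January', 'February', 'March'],
                        'target': [310, 270, 300]})

# join() looks up sales['month'] in the *index* of targets, while targets
# still carries its own 'month' column, so pandas raises a ValueError
# (the message text depends on the pandas version)
try:
    sales.join(targets, on='month')
    join_error = None
except ValueError as err:
    join_error = str(err)
print(join_error)

# Moving 'month' into the index of targets gives join() something to match on
joined = sales.join(targets.set_index('month'), on='month')
print(joined)

# For comparison: the default merge on the shared 'month' column
merged = pd.merge(sales, targets)
print(merged)
```

With data like this, where every month appears in both tables, the join and the merge should end up with the same month, revenue and target columns.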


Concerning pd.merge(), does the order of inputs matter?
That is, does pd.merge(df1, df2) produce the same result as pd.merge(df2, df1)?

Order matters. You can also use the how= parameter of pd.merge()
Did you try it out? What happened?
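
One quick way to try it out is with how='left', which keeps every row of the first DataFrame. This is only a sketch with invented data (df1 and df2 are hypothetical, not the lesson's files):

```python
import pandas as pd

# Hypothetical frames that share only 'January' (values are invented)
df1 = pd.DataFrame({'month': ['January', 'February'], 'revenue': [300, 290]})
df2 = pd.DataFrame({'month': ['January', 'March'], 'target': [310, 300]})

# A left merge keeps all rows of whichever frame is passed first
left_1 = pd.merge(df1, df2, how='left')
left_2 = pd.merge(df2, df1, how='left')

print(left_1['month'].tolist())  # months taken from df1
print(left_2['month'].tolist())  # months taken from df2
```

Here swapping the inputs changes which months survive, so for a left merge the order clearly matters.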



I found that a similar question was asked here.

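It seems that, for the default (inner) merge, the order of the inputs does not change which rows end up in the result. It affects the order of the columns and, as far as I can tell, can also affect the order of the rows.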
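
A small check of this for the default merge, again with invented data (df1 and df2 are hypothetical): merge in both orders, then bring both results into the same column and row order before comparing.

```python
import pandas as pd

# Hypothetical frames with partly overlapping months (values are invented)
df1 = pd.DataFrame({'month': ['January', 'February', 'March'],
                    'revenue': [300, 290, 310]})
df2 = pd.DataFrame({'month': ['March', 'January', 'April'],
                    'target': [300, 310, 350]})

a = pd.merge(df1, df2)  # default how='inner'
b = pd.merge(df2, df1)

# The column order follows the order of the inputs
print(list(a.columns))
print(list(b.columns))

# Align columns and sort rows so that only the content is compared
a_sorted = a.sort_values('month').reset_index(drop=True)
b_aligned = b[a.columns].sort_values('month').reset_index(drop=True)
same_rows = a_sorted.equals(b_aligned)
print(same_rows)
```

If same_rows comes out True, the two results contain the same data and differ only in layout.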