Comparing 2 dataframes

Hi.

I have two quite similar dataframes, df1 and df2, where df2 has more columns than df1. How can I add to df1 the columns it is missing compared to df2, so that matching columns have the same position in both dataframes?
Thank you in advance.

What do your dataframes look like?

Did you check the Pandas docs?

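You can use pd.concat() or maybe try df1.append(df2) (note that DataFrame.append was deprecated in pandas 1.4 and removed in 2.0, so on current versions only pd.concat() is available).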

https://pandas.pydata.org/pandas-docs/version/1.3/reference/api/pandas.DataFrame.append.html

I need to insert the missing columns into df1 in the right position. The important thing is to add each column with the right label; the values don’t matter.

What have you tried so far? What were the results?
Check the documentation: the two methods take different parameters. Try them out to see which one works better for you.

I’ll read the docs.
Thank you.

Angelo

Hi,
It’s not totally clear what you’re attempting.

If you just want df1 to have the same column structure as df2, with the missing columns inserted in the right place, e.g.
df1 = A, B, E
df2 = A, B, C, D, E
and you want df1 to be A, B, C, D, E too, then the most straightforward way of doing it is to simply add the C and D columns to df1 and then reorder it.
For instance:

df1['C'] = 0
df1['D'] = 0
df1 = df1[['A', 'B', 'C', 'D', 'E']]

This adds columns C and D, filled with zeros, and then reorders the dataframe.
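
The add-then-reorder step can also be written as a single call to DataFrame.reindex, which avoids typing the column list by hand. This is only a minimal sketch: the sample values are made up for illustration, and it assumes 0 is an acceptable fill value for the new columns.

```python
import pandas as pd

# Hypothetical sample data matching the A, B, E / A, B, C, D, E example.
df1 = pd.DataFrame({"A": [1, 2], "B": [3, 4], "E": [5, 6]})
df2 = pd.DataFrame({"A": [0], "B": [0], "C": [0], "D": [0], "E": [0]})

# Reindex df1 on df2's columns: columns missing from df1 are created and
# filled with 0, and the result follows df2's column order.
df1_aligned = df1.reindex(columns=df2.columns, fill_value=0)

print(list(df1_aligned.columns))
```

Be aware that reindex also drops any column of df1 that does not exist in df2, so it only fits when df2's columns are the full set you want.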

If you’re looking to combine the two and the other comment hasn’t worked then you might want to look into ‘join’ or ‘merge’.
They give you various ways to fuse the two together depending on how much data you need to retain from each dataframe.
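
As a rough illustration of how the `how` argument of merge controls what is kept from each dataframe, here is a small sketch. The frames and the key column name "id" are hypothetical and not taken from the question.

```python
import pandas as pd

# Hypothetical frames sharing a key column named "id".
left = pd.DataFrame({"id": [1, 2, 3], "x": [10, 20, 30]})
right = pd.DataFrame({"id": [2, 3, 4], "y": ["b", "c", "d"]})

# how="inner" keeps only the ids present in both frames.
inner = left.merge(right, on="id", how="inner")

# how="left" keeps every row of `left`; ids missing from `right` get NaN in "y".
left_join = left.merge(right, on="id", how="left")

# how="outer" keeps the ids found in either frame.
outer = left.merge(right, on="id", how="outer")

print(len(inner), len(left_join), len(outer))
```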

Hope that helps

Hi.

Thank you for your reply.
I’ve applied one-hot encoding to df_test and df_training and, because their columns contain different values, they end up with a different number of columns after encoding. The problem begins when I call the predict method on df_test after fitting the model with df_training. They have different columns and I don’t know how to solve it.
Thank you for any suggestions.
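
One common way to handle this situation is to encode both sets and then reindex the test columns onto the training columns. The sketch below assumes the encoding was done with pd.get_dummies (the thread does not say which encoder was used), and the column names and values are invented for illustration.

```python
import pandas as pd

# Hypothetical raw data: the test set has no "green" and contains an unseen "blue".
df_training = pd.DataFrame({"color": ["red", "green", "red"], "size": [1, 2, 3]})
df_test = pd.DataFrame({"color": ["red", "blue"], "size": [4, 5]})

# One-hot encode each set separately; the resulting columns differ.
X_train = pd.get_dummies(df_training, columns=["color"])
X_test = pd.get_dummies(df_test, columns=["color"])

# Align the test matrix to the training columns: categories missing from the
# test set become all-zero columns, categories unseen in training are dropped,
# and the column order matches what the model was fitted on.
X_test = X_test.reindex(columns=X_train.columns, fill_value=0)

print(list(X_test.columns))
```

After this step X_test has the same columns, in the same order, as X_train, so it can be passed to predict. An encoder fitted on the training data only, such as scikit-learn's OneHotEncoder with handle_unknown="ignore", is another option that avoids the mismatch in the first place.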