FAQ: Working with Multiple DataFrames - Merge on Specific Columns II

This community-built FAQ covers the “Merge on Specific Columns II” exercise from the lesson “Working with Multiple DataFrames”.

Paths and Courses
This exercise can be found in the following Codecademy content:

Data Science

Data Analysis with Pandas

FAQs on the exercise Merge on Specific Columns II

Join the Discussion. Help a fellow learner on their journey.

Ask or answer a question about this exercise by clicking reply (reply) below!

Agree with a comment or answer? Like (like) to up-vote the contribution!

Need broader help or resources? Head here.

Looking for motivation to keep learning? Join our wider discussions.

Learn more about how to use this guide.

Found a bug? Report it!

Have a question about your account or billing? Reach out to our customer support team!

None of the above? Find out where to ask other questions here!

Hello! So I am a bit confused about why we would ever use left_on, right_on. It adds an extra column which just repeats another column and seems unnecessarily complicated when compared to just renaming the column in the merge() call.

I just don’t understand in what situation we would prefer to do this, since the result seems to always be more clunky.

In what principle the columns are sorted after being merged?

How come it is the ‘id’ columns that get renamed to id_x and id_y?

import codecademylib
import pandas as pd

orders = pd.read_csv('orders.csv')
print(orders)
products = pd.read_csv('products.csv')
print(products)

orders_products = pd.merge(
  orders,
  products,
  left_on = 'product_id',
  right_on = 'id',
  suffixes = ['_orders', '_products']
)
print(orders_products)

Left table we are merging on column ‘product id’ right table we are merging on column ‘id’

But in the merged table the columns that get changed to id_x and id_y (or renamed using suffix) is the ‘id’ column in both tables, not the ‘product id’ column in the left table

1 Like

According to the Pandas document of the merge method, suffixes are applied to overlapping column names in the left and right side, respectively. In our example, only id is overlapping and others are not, so suffixes are added to id.

1 Like