Hello community,
I am hoping someone can point me in the right direction regarding the pandas merge function. I have 2 CSV files I am trying to join. The first CSV will only contain 1 ID, which will change constantly. I need to take that ID, search the 2nd file for it, and return the values in the adjacent cells. I can easily do this if the data in both files are strings. However, I will be working with alphanumeric values, and pandas does not like that. Is there a workaround? This is what I have as an example:
df1 contains the following
External_ID
A158976453
df2 contains the following
External_ID, Sample_ID
1587966543, 025-85-258
1659787846, 068-87-856
A569787522, 568-98-785
A158976453, 485-89-562
My code:
import pandas as pd
df1 = pd.read_csv('BarcodeScan.csv')
df2 = pd.read_csv('SamplePrepIDs.csv')
df3 = pd.merge(df1, df2[['External_ID', 'Sample_ID']], on='External_ID', how='left')
print(df3)
+++++++++
The result I’m looking for is:
External_ID, Sample_ID
A158976453, 485-89-562
+++++++++
Error:
The error I’m receiving is: “You are trying to merge on int64 and object columns. If you wish to proceed you should use pd.concat.”
+++++++++
Concat is not what I need. Is there a workaround, or an entirely different method that I should look at?
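One possible workaround, offered as a sketch rather than a definitive fix: the int64/object mismatch most likely appears when the scanned ID happens to be all digits, so pandas infers int64 for External_ID in one file while the mixed IDs in the other file are read as object. Reading the key column as text in both files avoids that. The example below uses in-memory stand-ins for BarcodeScan.csv and SamplePrepIDs.csv built from the sample data above. The helper name lookup is hypothetical, and skipinitialspace=True is an assumption that the real files have a space after each comma as shown in the post.

```python
import io
import pandas as pd

# In-memory stand-in for SamplePrepIDs.csv (sample data from the post).
PREP_CSV = (
    "External_ID, Sample_ID\n"
    "1587966543, 025-85-258\n"
    "1659787846, 068-87-856\n"
    "A569787522, 568-98-785\n"
    "A158976453, 485-89-562\n"
)


def lookup(scan_csv_text):
    """Hypothetical helper: left-join one scanned ID against the prep table."""
    # dtype=str forces External_ID to be read as text in both frames,
    # so an all-digit ID is not inferred as int64.
    df1 = pd.read_csv(io.StringIO(scan_csv_text), dtype={"External_ID": str})
    df2 = pd.read_csv(
        io.StringIO(PREP_CSV),
        dtype={"External_ID": str},
        skipinitialspace=True,  # assumption: drop the space after each comma
    )
    return pd.merge(
        df1, df2[["External_ID", "Sample_ID"]], on="External_ID", how="left"
    )


# An alphanumeric scan and an all-digit scan both merge without the dtype error.
alpha_result = lookup("External_ID\nA158976453\n")
digit_result = lookup("External_ID\n1587966543\n")
print(alpha_result)
print(digit_result)
```

With real files, the same dtype argument can be passed to pd.read_csv('BarcodeScan.csv', ...) and pd.read_csv('SamplePrepIDs.csv', ...). If the frames are already loaded, converting the key with astype(str) on both sides before merging is an alternative, though leading zeros in numeric-looking IDs would already be lost by that point.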
Thank you for all the help in advance!