Question
Can we add a new column at a specific position in a Pandas dataframe?
Answer
Yes, you can add a new column at a specified position in a dataframe by passing an index to the insert() function. By default, adding a column always places it as the last column of the dataframe.
For example, say we had a dataframe with five columns. If we wanted to insert a new column at the third position (index 2), we could do so like this:
# Third position would be at index 2, because of zero-indexing.
df.insert(2, 'new-col', data)
This will insert the column at index 2, and fill it with the data provided by data. When inserting, the columns from index 2 onward will effectively be shifted over to the right by 1 index each. The column that was previously at index 2 would now be at index 3, and so on for the following columns.
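The shift described above can be seen in a small sketch. The five column names and the values here are made up for illustration; only insert() itself comes from the answer. Note that insert() changes the dataframe in place rather than returning a new one.

```python
import pandas as pd

# Hypothetical five-column dataframe, just to show the positions.
df = pd.DataFrame({
    'a': [1, 2],
    'b': [3, 4],
    'c': [5, 6],
    'd': [7, 8],
    'e': [9, 10],
})

# Values for the new column (one per row).
data = [0, 0]

# Insert at index 2; insert() modifies df in place and returns None.
result = df.insert(2, 'new-col', data)

print(list(df.columns))
# ['a', 'b', 'new-col', 'c', 'd', 'e']
```

The column 'c', which used to sit at index 2, ends up at index 3.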
6 Likes
Hello, I tried adding data using the following method but it didn't seem to work. Any ideas where I went wrong?
import codecademylib
import pandas as pd
df = pd.DataFrame([
[1, '3 inch screw', 0.5, 0.75],
[2, '2 inch nail', 0.10, 0.25],
[3, 'hammer', 3.00, 5.50],
[4, 'screwdriver', 2.50, 3.00]
],
columns=['Product ID', 'Description', 'Cost to Manufacture', 'Price']
)
# Add columns here
sold_in_bulk = pd.dataframe ["Yes","Yes","No", "No"]
df.insert (3, "Sold in Bulk", sold_in_bulk)
print(df)
Also, I'm assuming insert only works on dataframes, and if I formatted it as a normal list, e.g. sold_in_bulk = ["Yes","Yes","No", "No"], it wouldn't work?
2 Likes
You have probably solved this already, but, just for future reference, the assumption was off; simply using a list for sold_in_bulk does the trick.
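For anyone landing here later, a minimal sketch of the list version, assuming the same table as in the question. It also shows the likely cause of the original error: pandas has no lowercase pd.dataframe attribute (the class is pd.DataFrame), and it has to be called with parentheses.

```python
import pandas as pd

df = pd.DataFrame([
    [1, '3 inch screw', 0.5, 0.75],
    [2, '2 inch nail', 0.10, 0.25],
    [3, 'hammer', 3.00, 5.50],
    [4, 'screwdriver', 2.50, 3.00]
],
    columns=['Product ID', 'Description', 'Cost to Manufacture', 'Price']
)

# The lowercase name does not exist in pandas; only pd.DataFrame does.
print(hasattr(pd, 'dataframe'))  # False
print(hasattr(pd, 'DataFrame'))  # True

# A plain list with one value per row is enough for insert().
sold_in_bulk = ['Yes', 'Yes', 'No', 'No']
df.insert(3, 'Sold in Bulk', sold_in_bulk)

print(df)
```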
2 Likes
import pandas as pd
df = pd.DataFrame([
[1, '3 inch screw', 0.5, 0.75],
[2, '2 inch nail', 0.10, 0.25],
[3, 'hammer', 3.00, 5.50],
[4, 'screwdriver', 2.50, 3.00]
],
columns=['Product ID', 'Description', 'Cost to Manufacture', 'Price']
)
# Add columns here
sold_in_bulk = pd.DataFrame(["Yes", "Yes", "No", "No"])
df.insert(3, "Sold in Bulk", sold_in_bulk)
print(df)
5 Likes
I think your sold_in_bulk is a series, not a list by the way.
This is what you needed to do.
df.insert(3, "Sold in Bulk", ["Yes", "Yes", "No", "No"])
7 Likes
Thanks, your example is pretty straightforward >:)
2 Likes
PSA
It wasn't explicitly stated in the exercise, but you can't create a new column using dot-notation, though you can modify one if it already exists.
For example:
This works:
df['new_col'] = [1, 2, 3, 4]
This won't work:
df.new_col = [1, 2, 3, 4]
GitHub Ticket: https://github.com/pandas-dev/pandas/issues/7175
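A quick sketch of that behaviour, assuming a small made-up dataframe (the column names other_col and new_col are hypothetical). Dot-notation on a name that is not yet a column just sets a Python attribute on the object, and pandas normally emits a UserWarning about it, which is silenced here to keep the output clean.

```python
import warnings

import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 4]})

# Bracket notation creates a real column.
df['new_col'] = [1, 2, 3, 4]

# Dot-notation on a new name only sets an attribute, not a column.
with warnings.catch_warnings():
    warnings.simplefilter('ignore')
    df.other_col = [10, 20, 30, 40]

print('new_col' in df.columns)    # True
print('other_col' in df.columns)  # False

# Dot-notation does work for modifying a column that already exists.
df.new_col = [5, 6, 7, 8]
print(df['new_col'].tolist())     # [5, 6, 7, 8]
```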
2 Likes
Hi,
I tried to create the data as a series instead of data frame, and it worked.
Hope this would help and thanks for your inspiration.
Cheers!
df = pd.DataFrame([
[1, '3 inch screw', 0.5, 0.75],
[2, '2 inch nail', 0.10, 0.25],
[3, 'hammer', 3.00, 5.50],
[4, 'screwdriver', 2.50, 3.00]
],
columns=['Product ID', 'Description', 'Cost to Manufacture', 'Price']
)
#Add columns here
data = pd.Series(['Yes','Yes','No','No'])
df.insert(3, 'Sold in Bulk2?', data)
print(df)
1 Like
data = pd.Series(['Yes','Yes','No','No'])
df.insert(3, 'Sold in Bulk2?', data)
print(df)
In the variable data above, does it matter whether we use pd.Series or pd.DataFrame? It seems to work both ways.
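One difference that may matter in practice: a plain list is matched to rows by position, while a Series is matched by its index labels. With the default 0, 1, 2, 3 index on both sides the result looks the same, which is probably why both versions seemed to work. The sketch below uses a cut-down, made-up version of the table to show what happens when the labels do not line up. Whether a one-column DataFrame is accepted by insert() may depend on the pandas version, so it is left out here.

```python
import pandas as pd

# Cut-down table with the default index 0..3 (illustrative values).
df = pd.DataFrame({
    'Product ID': [1, 2, 3, 4],
    'Price': [0.75, 0.25, 5.50, 3.00],
})

values = ['Yes', 'Yes', 'No', 'No']

# A plain list is assigned row by row, by position.
df.insert(1, 'from_list', values)

# A Series with the default index lines up with df's index labels.
df.insert(2, 'from_series', pd.Series(values))

# A Series with different labels is aligned by label, so no row matches
# and the new column is filled with NaN.
shifted = pd.Series(values, index=[10, 11, 12, 13])
df.insert(3, 'from_shifted_series', shifted)

print(df)
```

So for this exercise a list or a default-index Series behaves the same; the index only starts to matter once the dataframe or the Series has non-default labels.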