How is a Pandas series different from a dataframe?

Question

How is a Pandas series different from a dataframe?

Answer

In Pandas, a series is a one-dimensional object that can hold any type of data, similar in some ways to a NumPy array.

Series objects have a single axis, and the labels along that axis make up the index of the series. A series can also have a name, which acts like a column title. A series is essentially a single column.

import pandas as pd

# Creating a series
clinic_east = pd.Series([100, 51, 81, 80, 51, 112])

A dataframe is a two-dimensional object that can hold multiple columns of different types of data. It is similar to a table in SQL.

A single column of a dataframe is a series, and a dataframe is a container of one or more series objects that share the same index.

# Creating a DataFrame
df = pd.DataFrame(
  [
    ['January', 100, 100],
    ['February', 51, 45],
    ['March', 81, 96],
  ],
  columns=["month", "clinic_east", "clinic_north"]
)
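
To check the claim that a single column of a dataframe is a series, here is a minimal sketch that rebuilds the same dataframe and pulls out one column. The variable name `east` is just an illustration, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame(
    [["January", 100, 100], ["February", 51, 45], ["March", 81, 96]],
    columns=["month", "clinic_east", "clinic_north"],
)

# Single brackets with one column label select that column.
east = df["clinic_east"]

print(type(df).__name__)    # class name of the whole table
print(type(east).__name__)  # class name of the single column
print(east.name)            # the column title travels with the series
print(df.shape, east.shape) # two dimensions versus one
```

The dataframe reports two dimensions (rows and columns), while the extracted column reports only one.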

What do we call a single row of a dataframe?

When you select a single row from a dataframe, Pandas also returns it as a ‘Series’.

A single row is like a ‘datapoint’ of your data.
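
As a rough sketch of what that looks like in practice, the snippet below selects the first row by position with `iloc`. The variable name `first_row` is made up for the example.

```python
import pandas as pd

df = pd.DataFrame(
    [["January", 100, 100], ["February", 51, 45], ["March", 81, 96]],
    columns=["month", "clinic_east", "clinic_north"],
)

# Select the first row by integer position.
first_row = df.iloc[0]

print(type(first_row).__name__)  # the row comes back as a one-dimensional object
print(list(first_row.index))     # the column titles become the row's index labels
print(first_row["month"])        # look up one field of the record by its label
```

Because the row mixes text and numbers, the resulting series typically ends up with the generic `object` dtype.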

It seems there is no particular name for a 3-dimensional object (I cannot find one in the Pandas documentation). Am I correct? Just curious.

This might be helpful, though I have not yet tried it out: Pandas: MultiIndex / advanced indexing.

Let us know whether you find it suitable for the task.
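
For anyone who wants to try it, here is a minimal sketch of how a MultiIndex might stand in for a third dimension. The `year` level and all of the visit numbers for the second year are invented for illustration and do not come from the original post.

```python
import pandas as pd

# Two index levels (year, month) plus the columns give three "dimensions".
index = pd.MultiIndex.from_product(
    [[2019, 2020], ["January", "February"]],
    names=["year", "month"],
)

visits = pd.DataFrame(
    {
        "clinic_east": [100, 51, 90, 60],    # made-up numbers
        "clinic_north": [100, 45, 95, 50],   # made-up numbers
    },
    index=index,
)

# Selecting one year returns an ordinary dataframe indexed by month.
year_2020 = visits.loc[2020]
print(year_2020)

# Selecting year, month, and column returns a single value.
print(visits.loc[(2020, "February"), "clinic_east"])
```

The result is still a two-dimensional dataframe; the extra dimension lives in the row labels rather than in a separate object type.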

A series object would be the column. A row is, as someone else stated, like a “datapoint.” It is one record with various values for the included fields (columns).

You can call it a “record”.

I do not understand how the column title relates to the index of the series. Is the index of the series the same as the index of the dataframe the series is located in?

Can I select multiple columns of a dataframe?
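
On the second question, a minimal sketch: passing a list of column titles inside the brackets selects several columns at once. The variable name `subset` is only for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    [["January", 100, 100], ["February", 51, 45], ["March", 81, 96]],
    columns=["month", "clinic_east", "clinic_north"],
)

# Note the double brackets: the inner list holds the column titles to keep.
subset = df[["month", "clinic_north"]]

print(type(subset).__name__)  # still a two-dimensional object
print(list(subset.columns))
```

A list of titles gives back a dataframe, even if the list contains only one title, whereas a single title without the inner list gives back a series.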

That wording confused me as well.

I think the point is that the column title is how you select the series from the dataframe, much like an index label selects a value. The column title is really the name of the series, like a label sitting on top of its index.

But perhaps someone more experienced can clarify this point.
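
One way to see the difference between the two is to build a series that has both a name and an index and inspect each. This is only a sketch; the month labels used as the index are an assumption added for the example, since the original series used the default numeric index.

```python
import pandas as pd

clinic_east = pd.Series(
    [100, 51, 81],
    index=["January", "February", "March"],  # assumed labels, one per value
    name="clinic_east",                       # the would-be column title
)

print(list(clinic_east.index))  # labels for the individual values
print(clinic_east.name)         # a single label for the whole series
print(clinic_east["February"])  # index labels select values inside the series

# Turning the series into a dataframe shows where each label ends up.
df = clinic_east.to_frame()
print(list(df.columns))         # the name becomes the column title
print(list(df.index))           # the index labels become the row labels
```

So the name identifies the series as a whole, while the index identifies the individual entries inside it.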