Reshaping Data

Updated Aug 17, 2021 ·

Wide vs. Long Data

Data can be stored in two formats:

Wide format – One row per subject, with multiple columns for attributes
Long format – One subject appears in multiple rows, each row represents an attribute

Wide format is easier to read, while long format is better for analysis.

We can use the melt() method to transform a wide format data to a long format.

Using `.melt()`

The .melt() method converts wide-format data into a long-format, making it more useful for analysis.

Converts wide data to long format
Moves selected columns into rows
Keeps identifier columns unchanged

Consider a dataset containing financial metrics for two companies:

import pandas as pd

social_fin = pd.DataFrame({
    "Company": ["Facebook", "Twitter"],
    "2016": [27, 3.5],
    "2017": [40, 5.0],
    "2018": [56, 7.2],
    "2019": [70, 8.5]
})

print(social_fin)

Output in wide form:

    Company  2016  2017  2018  2019
0  Facebook  27.0  40.0  56.0  70.0
1   Twitter   3.5   5.0   7.2   8.5

Now, we use .melt() to transform this data.

social_long = social_fin.melt(id_vars=["Company"])
print(social_long)

Output in Long Form:

    Company variable  value
Facebook     2016   27.0
 Twitter     2016    3.5
Facebook     2017   40.0
 Twitter     2017    5.0
Facebook     2018   56.0
 Twitter     2018    7.2
Facebook     2019   70.0
 Twitter     2019    8.5

Controlling Columns

We can choose which columns to unpivot using value_vars. In the example below, we'll onlu unpivot the 2017 and 2018 columns.

social_long = social_fin.melt(id_vars=["Company"], value_vars=["2017", "2018"])

print(social_long)

Output:

    Company variable  value
Facebook     2017   40.0
 Twitter     2017    5.0
Facebook     2018   56.0
 Twitter     2018    7.2

Renaming Columns

We can also rename the columns for better readability.

social_long = social_fin.melt(id_vars=["Company"], 
                              var_name="Year", 
                              value_name="Revenue")

print(social_long)

Output:

    Company  Year  Revenue
Facebook  2016     27.0
 Twitter  2016      3.5
Facebook  2017     40.0
 Twitter  2017      5.0
Facebook  2018     56.0
 Twitter  2018      7.2
Facebook  2019     70.0
 Twitter  2019      8.5

Wide vs. Long Data​

Using .melt()​

Controlling Columns​

Renaming Columns​

Wide vs. Long Data

Using `.melt()`

Controlling Columns

Renaming Columns