Quantcast
Channel: Active questions tagged excel - Stack Overflow
Viewing all articles
Browse latest Browse all 90251

Import from Excel in Python Pandas and rearrange metadata header as column data

$
0
0

I try to import data from Excel to Pandas, but I have issues with rearranging the meatadata.

The Excel sheet is in the format (simplified here): Original data

But I would like to import the Excel sheet to at Pandas dataframe and then re-arrange in this format in order to simplify statistical plotting in Pandas: Final data

I've tried the following method but I can't get to the correct format:

import numpy as np
import pandas as pd
FNAME = 'Original.xlsx'
df = pd.read_excel(FNAME, sheet_name='Sheet1', header = [0,1,2,3])
mi = pd.MultiIndex.from_frame(df)
dfmi = pd.melt(df, id_vars=[mi.names[0]])
# Add column index
col = list(mi.names[0])
col.insert(0,'temp')
col.append('value')
col[-2]='type'
dfmi.columns = col

df
Out[17]: 
                   A     a1                                     
                   B     b1                    b2               
                   C     c1                    c2               
  Unnamed: 0_level_3 Data 1 Data 2  Data 3 Data 1 Data 2  Data 3
0                NaN      1      7      13      4     10      16
1                NaN      2      8      14      5     11      17
2                NaN      3      9      15      6     12      18

dfmi
Out[18]: 
    temp   A   B   C     type  value
0    NaN  a1  b1  c1   Data 1      1
1    NaN  a1  b1  c1   Data 1      2
2    NaN  a1  b1  c1   Data 1      3
3    NaN  a1  b1  c1   Data 2      7
4    NaN  a1  b1  c1   Data 2      8
5    NaN  a1  b1  c1   Data 2      9
6    NaN  a1  b1  c1   Data 3     13
7    NaN  a1  b1  c1   Data 3     14
8    NaN  a1  b1  c1   Data 3     15
9    NaN  a1  b2  c2   Data 1      4
10   NaN  a1  b2  c2   Data 1      5
11   NaN  a1  b2  c2   Data 1      6
12   NaN  a1  b2  c2   Data 2     10
13   NaN  a1  b2  c2   Data 2     11
14   NaN  a1  b2  c2   Data 2     12
15   NaN  a1  b2  c2   Data 3     16
16   NaN  a1  b2  c2   Data 3     17
17   NaN  a1  b2  c2   Data 3     18

My prefered Pandas format would be:

    A   B   C  Data 1  Data 2  Data 3
0  a1  b1  c1       1       7      13
1  a1  b1  c1       2       8      14
2  a1  b1  c1       3       9      15
3  a1  b2  c2       4      10      16
4  a1  b2  c2       5      11      17
5  a1  b2  c2       6      12      18

But the values in dfmi are unstacked to one column only, I would like to keep the three data columns. Are there any other methods in order to get to my prefered data format?

Link to Excel file: Excel file Original data


Viewing all articles
Browse latest Browse all 90251


<script src="https://jsc.adskeeper.com/r/s/rssing.com.1596347.js" async> </script>