I'm trying to read an Excel file into a DataFrame and I want to set the index later, so I don't want pandas to use column 0 for the index values.
By default (index_col=None), it shouldn't use column 0 for the index, but I find that it does if there is no value in cell A1 of the worksheet.
Is there any way to override this behaviour? (I am loading many sheets that have no value in cell A1.)
This works as expected when test1.xlsx has the value "DATE" in cell A1:
In [19]: pd.read_excel('test1.xlsx')
Out[19]:
                 DATE         A         B         C
0 2018-01-01 00:00:00  0.766895  1.142639  0.810603
1 2018-01-01 01:00:00  0.605812  0.890286  0.810603
2 2018-01-01 02:00:00  0.623123  1.053022  0.810603
3 2018-01-01 03:00:00  0.740577  1.505082  0.810603
4 2018-01-01 04:00:00  0.335573 -0.024649  0.810603
But when the worksheet has no value in cell A1, it automatically assigns column 0 values to the index:
In [20]: pd.read_excel('test2.xlsx', index_col=None)
Out[20]:
                            A         B         C
2018-01-01 00:00:00  0.766895  1.142639  0.810603
2018-01-01 01:00:00  0.605812  0.890286  0.810603
2018-01-01 02:00:00  0.623123  1.053022  0.810603
2018-01-01 03:00:00  0.740577  1.505082  0.810603
2018-01-01 04:00:00  0.335573 -0.024649  0.810603
This is not what I want.
Desired result: the same as the first example (but perhaps with 'Unnamed' as the column label).
The documentation says:
index_col : int, list of int, default None.
Column (0-indexed) to use as the row labels of the DataFrame. Pass None if there is no such column.
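For reference, the fallback I could apply after loading is to push the index back into an ordinary column, roughly as sketched below. This is only a post-load workaround, not a way to change what read_excel itself does. The helper name restore_index_column and the 'Unnamed: 0' label are my own choices, not part of pandas, and the small frame is a hand-built stand-in for what read_excel returns for test2.xlsx. It assumes a single-level index.

```python
import pandas as pd


def restore_index_column(df, name="Unnamed: 0"):
    """Return df with its index moved back into the first column.

    Hypothetical helper (not part of pandas). Assumes a single-level index.
    Frames that already have the default RangeIndex are returned unchanged.
    """
    if isinstance(df.index, pd.RangeIndex):
        return df
    out = df.reset_index()
    # reset_index() puts the old index in the first column; relabel that
    # column by position so the result does not depend on its default name.
    out.columns = [name] + list(out.columns[1:])
    return out


# Stand-in for the frame returned by pd.read_excel('test2.xlsx'),
# where the first worksheet column ended up as the index.
loaded = pd.DataFrame(
    {
        "A": [0.766895, 0.605812],
        "B": [1.142639, 0.890286],
        "C": [0.810603, 0.810603],
    },
    index=pd.to_datetime(["2018-01-01 00:00:00", "2018-01-01 01:00:00"]),
)

fixed = restore_index_column(loaded)
print(fixed.columns.tolist())  # ['Unnamed: 0', 'A', 'B', 'C']
```

Because the helper leaves frames with a default RangeIndex alone, it could be applied to every sheet, including ones like test1.xlsx that already load correctly. I would still prefer an option that stops read_excel from choosing the index in the first place.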