So I have a xls file with some weird style but there's nothing I can do about it so I just need to parse it.
As you can see I have some merged cells. What I want to do is fill the empty values for the merged cells ("ffill") but also keep the empty cells like they are.
Something like this
EIM, C,NI1 Enescu_Ioan, EIM, S,NI11,Enescu_Ioan
EIM, C,NI1 Enescu_Ioan, Empty
EIM, C,NI1 Enescu_Ioan EIM, S,NI11,Enescu_Ioan
EIM, C,NI1,Enescu_Ioan Empty
The way I'm loading the file right now is this.
xl = pd.ExcelFile("data/file.xls")
df = xl.parse(0, header=None)
I've also tried to open the file like this and access the merged cells but I get an empty list.
book = xlrd.open_workbook("data/file.xls")
book.sheet_by_index(0).merged_cells # This is empty []
Is there any way I could achieve this? Thanks!
EDIT
There might be some confusions regarding the question so I'll try to explain better. The attached image is a subset of a larger file where the columns may appear in different order. What I'm trying to achieve is a way of differentiating between merged cells NAN values (in a merged cell only the first column has a value, the rest are all nan) and empty cells NAN.