Channel: Active questions tagged excel - Stack Overflow

Capturing Excel data into a Python dictionary


I have an Excel spreadsheet with information on all atoms that are within 4 Angstroms of a chlorophyll b pigment. I would like to capture the data in the spreadsheet to plot how often each amino acid shows up. What I have so far was inspired by "Automate the Boring Stuff with Python" by Al Sweigart. Here is a link showing what the spreadsheet looks like: snapshot of the atom data.
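For the counting step behind such a plot, here is a minimal sketch using only the standard library. It assumes the data has already been collected into a nested dictionary (chlorophyll name, then head atom, then a list of neighbour labels) and that each label is a string such as 'PRO A36' whose first token is the residue name. The function name `count_residues` and the sample data are hypothetical.

```python
from collections import Counter


def count_residues(chl_data):
    """Count how often each residue name appears across all chlorophylls.

    Assumes chl_data maps chlorophyll name -> {head atom -> [labels]},
    where each label looks like 'PRO A36' (residue name first).
    """
    counts = Counter()
    for atoms in chl_data.values():
        for neighbours in atoms.values():
            for label in neighbours:
                parts = label.split()
                if parts:  # ignore empty labels
                    counts[parts[0]] += 1
    return counts


# Hypothetical sample data, not taken from the real spreadsheet.
sample = {
    "CHL A1001": {"MG": ["PRO A36"], "CHA": ["VAL A35", "PRO A36"]},
}
residue_counts = count_residues(sample)
```

Note that a residue close to several head atoms is counted once per atom here; whether that is the right frequency for the plot depends on the question being asked.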

The name of each chlorophyll is in column A. The atoms that are part of the chlorophyll are enclosed in angle brackets "< >", and the atoms that are within 4 Angstroms of these atoms are listed below them.

Please find below the code I have so far:

from collections import OrderedDict as OD
import openpyxl as pyxl, pprint
print('Opening workbook...')
wb = pyxl.load_workbook('six_gix.xlsx')
sheet = wb.get_sheet_by_name('six_gix')
chlData = {}

# fill in chlData with each chlorophyll in the spreadsheet
print('Reading chlorophylls...')

for row in range(1, sheet.max_row + 1, 10):
    chl_id = sheet['A' + str(row)]  # name of Chl looking at
    atom_head = sheet[str(row + 1)]  # atom part of chl head
    pocket_atoms = sheet[str(row + 2)]  # atoms that meet criteria

    # make sure the key for this chlorophyll exists
    chlData.setdefault(chl_id, {})

    # make sure the key for these atoms exists
    chlData[chl_id].setdefault(atom_head, [])
    chlData[chl_id][atom_head].append(pocket_atoms)

# open a new text file and write the contents of chlData to it
print('Writing results...')
resultFile = open('Six_gix.py', 'w')
resultFile.write('allData = ' + pprint.pformat(chlData))
resultFile.close()
print('Done.')

When I run the code and import Six_gix into the interpreter, I get an invalid syntax error that I don't understand, and I can see that the dictionary was not created the way I wanted. This is how the dictionary is currently being created:
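One likely source of the syntax error can be reproduced without the spreadsheet. `pprint.pformat` writes each object's `repr`, and a `repr` of the form `<Cell ...>` is not a Python literal, so the generated file cannot be parsed on import. In the sketch below, `FakeCell` is a hypothetical stand-in for a worksheet cell and `is_valid_python` is a helper invented for illustration.

```python
import pprint


class FakeCell:
    """Hypothetical stand-in for a worksheet cell object."""

    def __init__(self, coordinate, value):
        self.coordinate = coordinate
        self.value = value

    def __repr__(self):
        # Mimics the angle-bracket repr seen in the generated file.
        return "<Cell 'six_gix'.{}>".format(self.coordinate)


def is_valid_python(source):
    """Return True if the text compiles as Python source."""
    try:
        compile(source, "<generated>", "exec")
        return True
    except SyntaxError:
        return False


cell = FakeCell("A1", "CHL A1001")

# Storing the cell object writes its repr, which is not valid syntax.
from_objects = "allData = " + pprint.pformat({cell: []})

# Storing the cell's value writes an ordinary string literal.
from_values = "allData = " + pprint.pformat({cell.value: []})

print(is_valid_python(from_objects), is_valid_python(from_values))
```

This suggests storing each cell's `.value` in the dictionary rather than the cell object itself.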

chlData = {<Cell 'six_gix'.A1>: {(<Cell 'six_gix'.A2>, <Cell 'six_gix'.B2>, <Cell 'six_gix'.C2>...'.AT2>): [(<Cell 'six_gix'.A3>,

What I would like is:

chlData = {CHL A1001: {<Atom MG>: [PRO A36], <Atom CHA> :[VAL A35], <Atom CHB>: [ALA A37]...}
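A structure like this could be built from plain cell values, pairing each head atom with the entries in the same column beneath it. The sketch below is only an illustration under stated assumptions: each chlorophyll occupies a fixed block of rows (name in the first column of the first row, head atoms in the second row, neighbouring atoms in the remaining rows), and empty cells are `None`. The function name `build_chl_data` and the sample rows are hypothetical.

```python
def build_chl_data(rows, block_size=10):
    """Build {chlorophyll: {head atom: [neighbours]}} from rows of values.

    rows is a list of tuples of plain cell values (None for empty cells),
    for example list(sheet.iter_rows(values_only=True)) with openpyxl.
    The block layout is an assumption about the spreadsheet.
    """
    chl_data = {}
    for start in range(0, len(rows), block_size):
        block = rows[start:start + block_size]
        # Skip blocks without a name row, a name, or a head-atom row.
        if len(block) < 2 or not block[0] or block[0][0] is None:
            continue
        chl_id = block[0][0]
        head_atoms = block[1]
        pockets = chl_data.setdefault(chl_id, {})
        for col, atom in enumerate(head_atoms):
            if atom is None:
                continue
            neighbours = pockets.setdefault(atom, [])
            # Collect every non-empty cell below this head atom.
            for row in block[2:]:
                if col < len(row) and row[col] is not None:
                    neighbours.append(row[col])
    return chl_data


# Hypothetical rows standing in for one three-row block.
rows = [
    ("CHL A1001", None, None),
    ("<Atom MG>", "<Atom CHA>", "<Atom CHB>"),
    ("PRO A36", "VAL A35", "ALA A37"),
]
chl_data = build_chl_data(rows, block_size=3)
```

Because the keys and values are plain strings, `pprint.pformat(chl_data)` yields text that can be imported back as valid Python.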

Any help is greatly appreciated. Please let me know if you would like me to clarify anything. Thanks in advance!

