Thursday, 27 November 2014

Counting Frequencies






I am trying to figure out how to sum the frequencies with which the word tags I-GENE and O appear in a file.


Here is an example of the file I'm trying to process:


45 WORDTAG O cortex

2 WORDTAG I-GENE cdc33

4 WORDTAG O PPRE

4 WORDTAG O How

44 WORDTAG O if


I am trying to compute the sum of word[0] (column 1) over all lines in the same category: one total for I-GENE and a separate total for O.


In this example:


The sum for the I-GENE category is:


2


and the sum for the O category is:


97
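To make the expected totals concrete, here is a minimal sketch that reproduces them from the sample lines above. The function name sum_counts_by_tag and the sample list are hypothetical names chosen for illustration; the sketch assumes every line has the form "count WORDTAG tag word", separated by whitespace.

```python
from collections import defaultdict


def sum_counts_by_tag(lines):
    """Sum column 1 (the count) grouped by column 3 (the tag)."""
    totals = defaultdict(int)
    for line in lines:
        fields = line.split()
        # Skip blank or malformed lines that lack a count and a tag.
        if len(fields) < 3:
            continue
        count, _, tag = fields[:3]
        totals[tag] += int(count)
    return dict(totals)


# The five sample lines from the question (hypothetical variable name).
sample = [
    "45 WORDTAG O cortex",
    "2 WORDTAG I-GENE cdc33",
    "4 WORDTAG O PPRE",
    "4 WORDTAG O How",
    "44 WORDTAG O if",
]

totals = sum_counts_by_tag(sample)
print(totals["I-GENE"])  # 2
print(totals["O"])       # 97 (45 + 4 + 4 + 44)
```

Keeping one total per tag in a dictionary avoids mixing both categories into a single running sum.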


MY CODE:


import os


def reading_files(path):
    # One running total per tag, so I-GENE and O are not mixed together.
    totals = {'I-GENE': 0, 'O': 0}

    for root, dirs, files in os.walk(path):
        for file in files:
            if file == "gene.counts":
                # os.path.join adds the path separator that root+file leaves out.
                file_path = os.path.join(root, file)
                with open(file_path, 'r', encoding="ISO-8859-1") as open_file:
                    for line in open_file:
                        # words[0] is the count, words[2] is the tag.
                        words = line.split()
                        # Skip blank or short lines and any other tags.
                        if len(words) >= 3 and words[2] in totals:
                            totals[words[2]] += int(words[0])

    print(totals)


asked 43 secs ago






