I am trying to work out how to total the frequencies of the word tags I-GENE and O in a file.
Here is a sample of the file I'm working with:
45 WORDTAG O cortex
2 WORDTAG I-GENE cdc33
4 WORDTAG O PPRE
4 WORDTAG O How
44 WORDTAG O if
For each category (e.g. I-GENE or O), I want to sum the counts in column 1 (words[0] after splitting the line) over all lines in that category.
In this example:
The sum of the counts for the category I-GENE is:
2
and the sum of the counts for the category O is:
97
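To make the expected result concrete, here is a minimal sketch of the per-category sum applied to the five sample lines above. The function name sum_counts_by_tag is hypothetical, and the sketch assumes every relevant line has exactly four whitespace-separated fields (count, the literal WORDTAG, the tag, the word); anything else is skipped.

```python
from collections import defaultdict


def sum_counts_by_tag(lines):
    """Sum column 1 per tag (column 3) over WORDTAG lines.

    Hypothetical helper; assumes the four-field layout shown in the sample.
    """
    totals = defaultdict(int)
    for line in lines:
        fields = line.split()
        # Skip blank lines and anything that is not a four-field WORDTAG line.
        if len(fields) != 4 or fields[1] != "WORDTAG":
            continue
        count, _, tag, _ = fields
        totals[tag] += int(count)
    return dict(totals)


sample = [
    "45 WORDTAG O cortex",
    "2 WORDTAG I-GENE cdc33",
    "4 WORDTAG O PPRE",
    "4 WORDTAG O How",
    "44 WORDTAG O if",
]

totals = sum_counts_by_tag(sample)
print(totals["I-GENE"])  # 2
print(totals["O"])       # 97
```

Because the function only iterates over lines, an open file object could be passed in place of the list.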
MY CODE:
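import os

def reading_files(path):
    # One running total per tag; avoids shadowing the built-in sum().
    totals = {'I-GENE': 0, 'O': 0}
    for root, dirs, files in os.walk(path):
        for file in files:
            if file == "gene.counts":
                # os.path.join adds the separator that root + file leaves out.
                full_path = os.path.join(root, file)
                with open(full_path, 'r', encoding="ISO-8859-1") as open_file:
                    for line in open_file:
                        # Split once per line: words[0] is the count, words[2] the tag.
                        words = line.split()
                        if len(words) < 3:
                            continue  # skip blank or short lines
                        if words[2] in totals:
                            totals[words[2]] += int(words[0])
                        else:
                            print('Nothing')
    print(totals)
    return totals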
Counting Frequencies