I want to build a spider that retrieves data from the web, focusing on word usage in websites. For each website I need the 10 most used words, plus statistics on how often specific words related to the website's subject occur. My plan is:
1) Create a spider to automatically retrieve some paragraphs from random websites
2) Insert this data into an SQL database
3) Process the data to produce statistics
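To make step 3 concrete, here is a minimal sketch of the counting I have in mind, assuming the paragraphs are already available as plain text. The function names `top_words` and `count_terms` and the simple regex tokenizer are only illustrative assumptions, not a settled design.

```python
import re
from collections import Counter

# Hypothetical tokenizer: lowercase runs of letters and apostrophes.
WORD_RE = re.compile(r"[a-z']+")


def tokenize(text):
    """Split text into lowercase word tokens."""
    return WORD_RE.findall(text.lower())


def top_words(text, n=10):
    """Return the n most frequent words as (word, count) pairs."""
    return Counter(tokenize(text)).most_common(n)


def count_terms(text, terms):
    """Count occurrences of each subject-related term in the text."""
    counts = Counter(tokenize(text))
    return {term: counts[term.lower()] for term in terms}
```

A real version would probably also filter out stop words such as "the" and "and", since they would otherwise dominate the top 10.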
My question is: what is the most efficient way to store this data? Is it better to store the raw text and compute the statistics every time a user visits a statistics page, or to store the precomputed statistics in the database?
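To illustrate the second option, here is a rough sketch that keeps the raw text and also stores per-page word counts computed once at insert time, so the statistics page only runs a cheap query. It uses SQLite purely for illustration; the table names (`pages`, `word_counts`), column names, and helper functions are hypothetical assumptions.

```python
import re
import sqlite3
from collections import Counter


def open_db(path=":memory:"):
    """Open the database and create the (hypothetical) schema if needed."""
    conn = sqlite3.connect(path)
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS pages (
            id INTEGER PRIMARY KEY,
            url TEXT UNIQUE NOT NULL,
            raw_text TEXT NOT NULL
        );
        CREATE TABLE IF NOT EXISTS word_counts (
            page_id INTEGER NOT NULL REFERENCES pages(id),
            word TEXT NOT NULL,
            occurrences INTEGER NOT NULL,
            PRIMARY KEY (page_id, word)
        );
    """)
    return conn


def store_page(conn, url, raw_text):
    """Store the raw text and its word counts in one transaction."""
    counts = Counter(re.findall(r"[a-z']+", raw_text.lower()))
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO pages (url, raw_text) VALUES (?, ?)", (url, raw_text)
        )
        page_id = cur.lastrowid
        conn.executemany(
            "INSERT INTO word_counts (page_id, word, occurrences) VALUES (?, ?, ?)",
            [(page_id, word, n) for word, n in counts.items()],
        )
    return page_id


def top_words_for_page(conn, page_id, n=10):
    """Read the precomputed counts; no text processing at request time."""
    return conn.execute(
        "SELECT word, occurrences FROM word_counts "
        "WHERE page_id = ? ORDER BY occurrences DESC, word ASC LIMIT ?",
        (page_id, n),
    ).fetchall()
```

Keeping `raw_text` alongside the counts would let me recompute the statistics later if the tokenization rules change, at the cost of extra storage.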
I understand that you might need more information, but I have explained it as best I can. If anything is unclear, just ask and I'll gladly answer.
asked 44 secs ago