Quality Detector

In the health information field, quality is a high-priority concern. The growing volume of online health information covers a broad spectrum of health-related subjects, but most of the time the quality of health web content is uneven, and doubtful sources are not identified as such. Existing methods either evaluate a website against a set of criteria or detect quality for one specific theme. To guarantee the quality of all online health information, it is important to evaluate not only the production process of a website but also the content of any web page on any health subject.

In this study we attempt to classify health web pages and determine whether they are of high quality or not. We use machine learning to detect low- and high-quality web pages. The results obtained with the system are presented here.

We chose the SVM (Support Vector Machine) classifier because it provides better results than the Naive Bayes classifier. We then performed a cross-validation. The average of our results is shown in the table below:
We obtained a macro recall of 0.979 and a micro recall of 0.982. This means that the system returns more than 97% of the relevant documents on average per category, and over 98% of the relevant documents when all categories are pooled. For precision, we obtained a value greater than 0.98 (both micro and macro). This means that over 98% of the documents returned by the system are correctly classified. It is important to note that only 1.7% of documents are misclassified by the system.

To conclude, the system provides good results, which suggests integrating it into a user interface for use in production.

Access the complete final report (in French)
Access the annex of the final report (in French)
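
To illustrate how the macro and micro figures reported above differ, here is a minimal sketch in Python. It is not the code used in the study: the function names `micro_macro` and `average_over_folds`, the labels `"high"` and `"low"`, and the toy predictions are all assumptions made for illustration, and the sketch assumes each page receives exactly one label.

```python
from collections import Counter


def micro_macro(y_true, y_pred):
    """Micro and macro precision/recall for single-label predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, guess in zip(y_true, y_pred):
        if truth == guess:
            tp[truth] += 1
        else:
            fp[guess] += 1  # wrongly returned for category `guess`
            fn[truth] += 1  # missed for category `truth`

    def ratio(num, den):
        return num / den if den else 0.0

    n = len(labels)
    pooled_tp = sum(tp.values())
    return {
        # Macro: compute the metric per category, then take the plain mean.
        "macro_precision": sum(ratio(tp[c], tp[c] + fp[c]) for c in labels) / n,
        "macro_recall": sum(ratio(tp[c], tp[c] + fn[c]) for c in labels) / n,
        # Micro: pool the counts of all categories, then compute once.
        "micro_precision": ratio(pooled_tp, pooled_tp + sum(fp.values())),
        "micro_recall": ratio(pooled_tp, pooled_tp + sum(fn.values())),
    }


def average_over_folds(folds):
    """Average each metric over cross-validation folds of (y_true, y_pred)."""
    scores = [micro_macro(truth, guess) for truth, guess in folds]
    return {name: sum(s[name] for s in scores) / len(scores) for name in scores[0]}


# Made-up fold (hypothetical data, not the study's results):
# 8 high-quality pages and 2 low-quality pages, with one high-quality
# page wrongly labelled "low".
y_true = ["high"] * 8 + ["low"] * 2
y_pred = ["high"] * 7 + ["low"] * 3
scores = micro_macro(y_true, y_pred)
```

In this single-label setting every misclassified page counts once as a false positive and once as a false negative, so micro precision and micro recall coincide with the overall share of correctly classified pages. Macro values weight each category equally, which makes them more sensitive to errors in the smaller category.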