The workshop will be held on 19th August 2007 from 9:00am-1:00pm at the Brisbane Convention Centre, Brisbane, Australia, in conjunction with the triennial congress Medinfo.

Medinfo 2007 Workshop
Models of trust for health websites

Organised by:
Célia Boyer (a), Natalia Grabar (a), Arnaud Gaudinat (a), Antoine Geissbühler (a, b)
(a) Health on the Net Foundation, Geneva, Switzerland
(b) Service of Medical Informatics, Geneva University Hospitals, Switzerland

Program and Participants Description
Co-chairing committee:
Henning Muller (SIM/HUG, Geneva, Switzerland)
Stephane Spahni (SIM/HUG, Geneva, Switzerland)
Arnaud Gaudinat (HON, Geneva, Switzerland)

19th August 2007 from 9:00am-1:00pm
9h - 9h30: Workshop welcome
9h30 - 10h00: Introduction
10h00 - 10h30:
Towards higher quality health search results: Automated quality rating of depression websites

     David Hawking (CSIRO ICT Centre, Australia)
     Biography: David Hawking is a researcher in the CSIRO ICT Centre and Science Leader for the Information Retrieval area. Three days a week, he works for the Funnelback enterprise search company in the role of Chief Scientist. Funnelback is a CSIRO subsidiary company. His interests lie in the areas of Information Retrieval and Web Search. He is particularly interested in search evaluation in realistic contexts, distributed search techniques, enterprise/intranet search, improvement of search through exploitation of context, personal search and search efficiency. He is a member of the editorial board for the Information Retrieval journal (INRT). From 1996 to 2003 David was a coordinator of the VLC and Web tracks in the annual Text Retrieval Conference (TREC). The activities of these tracks are summarised in Chapter 9 of the new MIT Press book TREC - Experiment and Evaluation in Information Retrieval (Ellen M. Voorhees and Donna K. Harman, Eds.; search for ISBN 0262220733 to find it on the Web). He was a Program Chair of ACM SIGIR in 2003 and again in 2006. David Hawking holds an honorary doctorate from the University of Neuchatel and was awarded the Chris Wallace prize for Computer Science at the Australasian Computer Science Week 2005.
     Abstract: Consumers are increasingly reliant on the web for health information and advice, and increasingly reliant on search engines to locate health resources. A search engine able to bias its results against sites offering dubious or even harmful health advice would obviously be of value to consumers. We have developed an Automated Quality Assessment (AQA) procedure, which learns complex information retrieval queries from training sets of high and low quality depression websites. Processing these queries on test sets yielded site scores which correlated 0.85 with human expert ratings against evidence-based guidelines. We have subsequently used the AQA technique to guide a depression-focused web crawler and to filter results from a major web search engine, with very encouraging results. Currently, we are investigating whether the AQA technique will generalise to other health domains, starting with obesity.
10h30 - 11h00:
Presentation of various models of trust used at HON Foundation
     Arnaud Gaudinat (HON, Switzerland)
     Biography: Arnaud was responsible for the development of the research tools for the ICT European WRAPIN project (World Reliable Online Advice for Patients and Individuals), whose objective was to help the public judge the quality of online health information by combining state-of-the-art search engine technology with trustworthy health sources. He was involved in both the implementation and the elaboration of new algorithms for the provision of better search results. Currently, Arnaud is working on the ICT European PIPS (Personalized Information Platform for life and health Services) project, where HON is involved in the development of trust mechanisms and, more particularly, in the development of an automatic quality criteria detector and a medical question answering system.
     Abstract: The Internet is an ever-expanding arena with hundreds of new websites being born every day. However, due to the open nature of the Internet, the reliability of this information is not constant. As a consequence, Internet users can be overwhelmed, both by the vast choice available and by the difficulty of filtering reliable information from this pool. This situation becomes more critical in the medical domain, where content proposed by health websites can have a direct impact on the users' health and well-being. In this context, we present various initiatives of the Health On the Net Foundation to ensure the reliability of online health information and to increase public awareness of the importance of reliability of medical content on the web. Over the last ten years, the Health On the Net Foundation (HON) has responded to the risks and dangers represented by the ever-increasing mass of online health and medical information. The main aims of HON are the protection of Internet users by establishing a third-party accreditation program through the application of the HON Code of Conduct, and the development of a medical search engine for more efficient access to medical information. An overview of the main achievements of HON will be given: the HONcode initiative, active seals, the Health Search engine, the Trusted Search Engine, collaboration with Google and the readability categorizer. Finally, the presentation will focus on a recent original development conceived to address the quality problem of Web sites. We will present the design of an automatic system for the categorisation of medical and health documents according to the HONcode ethical principles. Both the limitations and the advantages of all described initiatives will be presented.
Of course, given the huge diversity of individuals' skills, cultures, ideas and other factors, the solutions proposed to address the problem of the quality of online medical information vary as well. By combining manual and automatic expertise and taking into consideration the guidance and empowerment of both user and webmaster, HON addresses this issue in multiple ways, the ultimate aim being to provide a trustworthy information source for the public.
11h00 - 11h30: Coffee and chocolate break
11h30 - 12h00:
WRAPIN: finding trusted health information on the Internet and empowering patients within personal health records
     Michel Joubert (Lertim, France)
     Biography: Michel JOUBERT is Assistant Professor at the Faculty of Medicine and the University Hospital in Marseille, France. He specializes both in Health Information Systems (he has participated in several European projects in this domain) and in indexing and information retrieval (he was actively involved in the European project WRAPIN and other French projects in these domains). He is currently scientific head of a French project that aims to design and implement a health multiterminology server.
     Abstract: In the near future, information technology may make it even easier to give patients a chance to review their records. One may wonder, however, about the practical use of this technology by patients. Understanding their own health record will certainly be one of the main concerns of patients. WRAPIN has been designed to provide patients and citizens with trusted health information. It will help to determine the reliability of documents by checking the ideas they contain against established benchmarks, and enable users to determine the relevance of a given document from a page of search results. We present the original and important patient-centred WRAPIN characteristics and functionalities. The comparison with two main trends in information retrieval (popularity and clustering) shows that, even though patients are tempted to use popular search engines, these are not sufficiently specialized in the medical domain to help them understand their own EHR.
12h00 - 12h30:
Text Categorization Models for Identifying Unproven Cancer Treatments on the Web

     Yin Aphinyanaphongs (Vanderbilt University, USA)
     Biography: Yin Aphinyanaphongs currently holds an MD degree and recently defended his PhD thesis titled “Identifying High Quality MEDLINE Articles and Web Sites Using Machine Learning.” His dissertation focused on using machine learning to build models of quality in treatment, diagnosis, prognosis, etiology, cost, clinical prediction guide, and economics. The models were compared to other state-of-the-art technologies and measures for identifying high quality articles, and were found to outperform, and never underperform, previous methods. In 2006, Yin was awarded a Medical Library Association Donald Lindberg research fellowship grant to expand this work. Today’s talk is the final aim of his dissertation. Yin applied machine learning pattern recognition techniques to identify web pages that make unproven cancer treatment claims.
Currently, Yin is taking a few months off before beginning his clinical rotations at Vanderbilt University.
     Abstract: The nature of the internet as a non-peer-reviewed (and largely unregulated) publication medium has allowed widespread promotion of inaccurate and unproven medical claims on an unprecedented scale. Patients with conditions that are not currently fully treatable are particularly susceptible to unproven and dangerous promises about miracle treatments. In extreme cases, fatal adverse outcomes have been documented. Most commonly, the cost is financial and psychological, along with delayed application of imperfect but proven scientific modalities. To help protect patients, who may be desperately ill and thus prone to exploitation, we explored the use of machine learning techniques to identify web pages that make unproven claims. This feasibility study shows that the resulting models can identify web pages that make unproven claims in a fully automatic manner, and substantially better than previous web tools and state-of-the-art search engine technology.
12h30 - 13h00:
CISMeF: Catalog & Index of Health Resources in French on the Internet
     Stefan Darmoni (CISMeF, France)
     Biography: Stefan Darmoni is the director of the group GCSIS (Gestion de la Connaissance et Systèmes Informations de Santé/Knowledge management and Health Information Systems) within the laboratory LITIS. Since 2001, the GCSIS team has regularly hosted postgraduate students studying for a Masters degree or a Science PhD. Stefan Darmoni's work is primarily devoted to themes related to knowledge management and can be described through the various levels of this research: fundamental aspects, technological research and valorisation; creation of a CISMeF terminology; optimization of information retrieval; automatic indexing (texts & images); quality standards for health information on the Internet; health information systems on the Internet. Major accomplishments include the design, development and evaluation of several knowledge management tools:
- Design, installation and evaluation of Vidal Electronics since 1992.
- Design and development of CISMeF (Catalogue and Index of the French-speaking Medical Sites) since 1995.
- Establishment of quality standards of health information on the Internet, in collaboration with the Central School of Paris since 1997.
- CISMeF and Net Scoring integrated in the French-speaking Virtual Medical University project.
- A cost-effectiveness study of the Medline bibliographic database, carried out in 1992.
- The design, development and evaluation of medical expert systems in hepatology, obstetrics, ophthalmology and toxicology from 1987 to 1995.
- Development of a tool for computer-assisted planning for care personnel (HoroPLAN) from 1993 to 1996.
     Abstract: The Internet has become a major source of health information for the health professional and the Netizen. The objective of Doc'CISMeF (D'C) was to create a powerful generic search tool based on a structured information model which 'encapsulates' the MeSH thesaurus to index and retrieve quality health resources on the Internet. To index resources, D'C uses four sections in its information model: 'meta-term', keyword, subheading, and resource type. Two search options are available: simple and advanced. The simple search requires the end-user to input a single term or expression. If this term belongs to the D'C information structure model, it will be exploded. If not, a full-text search is performed. In the advanced search, complex searches are possible, combining Boolean operators with meta-terms, keywords, subheadings and resource types. D'C uses two standard tools for organising information: the MeSH thesaurus and the Dublin Core metadata format. Resources included in D'C are described according to the following elements: title, author or creator, subject and keywords, description, publisher, date, resource type, format, identifier, and language.

13H30 Conclusion


