Tuesday, 16 April 2013

Tools for keeping up to date

Previously I have discussed how RSS news feeds benefit regulatory professionals by streamlining the process of keeping up to date with new information as it emerges.

Unfortunately, we have also seen that not all regulatory agencies have RSS feeds. This raises the question of how best to keep up to date with agencies, especially those that do not have RSS news feeds. In this article, I will discuss some of the tools that I have found very helpful in achieving this.

Without special tools, many regulatory professionals resort to manually visiting the web site of each regulatory authority, which can be very time-consuming. Fortunately, there are a number of useful tools that can provide at least some level of automated checking for updates.

E-mail newsletters

Many of the regulatory authorities have e-mail newsletters to which one can sign up, for example the MHRA's Email Updates. These vary considerably in frequency and content from one regulatory authority to another. The MHRA's service comprises regular updates of new information as it becomes available, whereas the EMA provides a monthly Human medicines highlights news service.

Browser-based tools

Two free browser-based tools that I have found useful for monitoring changes in web pages are Page Monitor (specific to Google Chrome) and Update Scanner (specific to Firefox). I am not aware of similar tools for Internet Explorer. There are more details of these two extensions, as well as several other solutions, on the Web Data Extraction blog.

Web-based tools

There are two useful web-based tools which can monitor web pages and provide a history of the changes, as well as send e-mail alerts of the changes that occur. These services are WatchThatPage and TrackEngine.

Below are examples from monitoring the CMDh What's new page. The first shows an extract from an e-mail received from WatchThatPage.


The second example shows an extract of the same updates in an e-mail received from TrackEngine.


These two tools provide a way of monitoring changes in key regulatory pages without the need to visit those pages manually. Furthermore, because the documented changes arrive by e-mail, keeping up to date becomes part of day-to-day e-mail management, including on hand-held devices when out of the office.
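The basic idea behind this kind of page monitoring can be sketched in a few lines of Python. This is only an illustration of the general technique (compare a saved snapshot of a page with a fresh one and report what was added or removed); it is not how WatchThatPage or TrackEngine are implemented, and the function names and sample text are my own assumptions.

```python
import difflib
import hashlib


def fingerprint(text: str) -> str:
    """Return a SHA-256 digest of the page text (a cheap 'has it changed?' check)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def describe_changes(old_text: str, new_text: str) -> list[str]:
    """Return the lines added ('+ ') or removed ('- ') between two snapshots."""
    if fingerprint(old_text) == fingerprint(new_text):
        return []  # identical snapshots, nothing to report
    diff = difflib.ndiff(old_text.splitlines(), new_text.splitlines())
    # ndiff also emits unchanged ('  ') and hint ('? ') lines; keep only real changes.
    return [line for line in diff if line.startswith(("+ ", "- "))]


# Hypothetical snapshots standing in for two downloads of the same page.
yesterday = "What's new\nItem A"
today = "What's new\nItem A\nItem B"

for change in describe_changes(yesterday, today):
    print(change)
```

In practice the snapshots would be downloaded on a schedule and the reported changes sent by e-mail, which is the part the web-based services take care of.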

Generating RSS from sites where none is available

As previously discussed, not all regulatory authorities in the EEA have RSS feeds. However, in my recent TOPRA article I discussed how data on the World Wide Web (WWW) conforms to certain structures. The structures within a web page can be used to provide even more structured data, such as RSS news feeds. Two basic pieces of information are needed to access this data. Firstly, the URL of the web page defines where to find the data on the web. Secondly, an XML Path Language (XPath) expression describes where within the structure of the web page the data elements can be found. Using tools such as Yahoo Pipes, the appropriate data elements can be filtered and an RSS feed constructed. In this way it is possible to generate RSS news feeds for regulatory authorities such as the Spanish Agency for Medicines and Health Products. Using the URL for the news page on their web site (http://www.agemed.es/informa/novedades/home.htm) together with the appropriate XPath query, a Yahoo Pipe has been created to provide an RSS feed.
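To make the URL-plus-XPath idea concrete, here is a minimal Python sketch that builds an RSS 2.0 document from the links matched by an XPath expression. It is not the Yahoo Pipe described above. The sample page, the example.org URL, the XPath expression and the function name are all hypothetical, and the sketch assumes the page is well-formed XML; real web pages often are not, in which case a more lenient HTML parser would be needed.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# Hypothetical, well-formed sample standing in for a downloaded news page.
SAMPLE_PAGE = """<html><body>
<ul class="news">
  <li><a href="item1.htm">First example item</a></li>
  <li><a href="item2.htm">Second example item</a></li>
</ul>
</body></html>"""


def page_to_rss(page_xml: str, page_url: str, item_xpath: str, feed_title: str) -> str:
    """Build a minimal RSS 2.0 document from the links matched by item_xpath."""
    page = ET.fromstring(page_xml)
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = feed_title
    ET.SubElement(channel, "link").text = page_url
    ET.SubElement(channel, "description").text = f"Generated from {page_url}"
    for anchor in page.findall(item_xpath):
        item = ET.SubElement(channel, "item")
        # The link text becomes the item title.
        ET.SubElement(item, "title").text = "".join(anchor.itertext()).strip()
        # Relative links are resolved against the URL of the page.
        ET.SubElement(item, "link").text = urljoin(page_url, anchor.get("href", ""))
    return ET.tostring(rss, encoding="unicode")


feed = page_to_rss(
    SAMPLE_PAGE,
    "http://www.example.org/news/home.htm",  # where to find the data
    ".//ul[@class='news']/li/a",             # where within the page the items are
    "Example news",
)
print(feed)
```

Note that Python's built-in ElementTree supports only a subset of XPath, which is enough for simple expressions such as the one used here.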

Other tools that can be used to extract data structures for analysis include Yahoo Query Language (YQL), which is like a web equivalent of SQL, and Google Docs spreadsheets.

RSS news feeds generated in this way are useful on their own to speed up the monitoring of changes, as previously described. Similarly, a collection of RSS feeds can be merged and filtered to create a consolidated data feed.
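As a rough illustration of merging and filtering, the following Python sketch combines the items from several RSS feeds, keeps only those whose title contains a keyword, and sorts them newest first. The two feeds, their contents and the function name are invented for the example, and the sketch assumes every item carries a pubDate in the usual RSS date format.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Hypothetical feeds standing in for two agencies' RSS documents.
FEED_A = """<rss version="2.0"><channel><title>Agency A</title>
<item><title>Guideline on variations updated</title><link>http://a.example/1</link>
<pubDate>Mon, 15 Apr 2013 09:00:00 GMT</pubDate></item>
<item><title>Office closure notice</title><link>http://a.example/2</link>
<pubDate>Tue, 16 Apr 2013 09:00:00 GMT</pubDate></item>
</channel></rss>"""

FEED_B = """<rss version="2.0"><channel><title>Agency B</title>
<item><title>New guideline on labelling</title><link>http://b.example/1</link>
<pubDate>Tue, 16 Apr 2013 08:00:00 GMT</pubDate></item>
</channel></rss>"""


def merge_feeds(feeds: list[str], keyword: str) -> list[dict]:
    """Merge items from several feeds, keep titles containing keyword, newest first."""
    items = []
    for feed_xml in feeds:
        channel = ET.fromstring(feed_xml).find("channel")
        source = channel.findtext("title", default="")
        for item in channel.iter("item"):
            title = item.findtext("title", default="")
            if keyword.lower() not in title.lower():
                continue  # filtered out: keyword not in the title
            items.append({
                "source": source,
                "title": title,
                "link": item.findtext("link", default=""),
                "published": parsedate_to_datetime(item.findtext("pubDate")),
            })
    return sorted(items, key=lambda entry: entry["published"], reverse=True)


for entry in merge_feeds([FEED_A, FEED_B], "guideline"):
    print(entry["published"].date(), entry["source"], entry["title"])
```

Keeping the source title with each item makes it easy to see which authority an entry came from once the feeds have been combined.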


Using a variety of tools such as those above makes it much easier to keep up to date while spending less time doing so. However, it must be borne in mind that, as with everything on the web, these sources are subject to change. For example, regulatory authorities are constantly changing their web sites (and consequently URLs). Also, as pages and web design are changed, so are the data structures. So even with automated or semi-automated processes, it is always necessary to conduct QA assessments of the data from time to time.
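Part of such a QA assessment can itself be automated. The sketch below is one possible check, with a function name and rules of my own choosing rather than any established standard: it reports a generated feed that is no longer well-formed, that has fewer items than expected, or whose items lack a title or link. These are typical symptoms of a page whose structure has changed.

```python
import xml.etree.ElementTree as ET


def check_feed(feed_xml: str, min_items: int = 1) -> list[str]:
    """Return a list of problems found in a generated RSS feed (empty if none)."""
    try:
        root = ET.fromstring(feed_xml)
    except ET.ParseError as error:
        return [f"feed is not well-formed XML: {error}"]
    problems = []
    items = root.findall("./channel/item")
    if len(items) < min_items:
        problems.append(f"expected at least {min_items} item(s), found {len(items)}")
    for position, item in enumerate(items, start=1):
        for field in ("title", "link"):
            if not (item.findtext(field) or "").strip():
                problems.append(f"item {position} has no {field}")
    return problems


# Hypothetical feed whose only item has lost its link.
broken = '<rss version="2.0"><channel><item><title>Example</title></item></channel></rss>'
print(check_feed(broken))
```

A check like this only catches structural problems; whether the content is still the right content needs a person to look at it from time to time.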
