Hundreds of holidaymakers could face losing their flight bookings after Ryanair announced plans to cancel all those made through online “screen scraping” programmes.
Often used by booking and price comparison websites, “screen scraping” lifts details from one website to produce a mirror of the original page on another, allowing bookings to be made with Ryanair without visiting the airline’s official website.
The carrier says it will begin cancelling these bookings from August 11.
The budget airline is currently embroiled in legal battles with several companies over the issue, which it claims infringes copyright law and the terms and conditions of its website.
Irish site BravoFly Ltd has discontinued the practice, after a legal challenge was made by Ryanair, while an injunction has been secured against German-based V-tours.
Other websites highlighted by Ryanair include Opodo.com, Atrapalo.com and OTBeach.com.
“We are determined to take strong action against these sites,” said Daniel de Carvalho, a spokesman for Ryanair. He said that stopping the practice was difficult because the creators of screen-scraping programmes can repeatedly rewrite them to avoid detection.
Mr de Carvalho said that when passengers book through “screen scrapers” it places the airline’s server under pressure, slowing the website down and affecting the experience of its customers.
He added that websites that use screen scraping will often mislead passengers into paying additional service charges and handling fees.
Furthermore, the practice can prevent customers from receiving email updates from Ryanair when flights are changed or cancelled.
“We have received complaints from passengers who have not been informed of changes to their flight, because our notifications are sent to the email of the price comparison company which booked the flight, not the customer,” said Mr de Carvalho.
A spokeswoman for budget airline Flybe said that it had experienced problems in the past with screen scraping, but it now allows third-party access to its pricing and booking engines, eliminating the impact on its own website.
From Wikipedia, the free encyclopedia
Screen scraping is a technique in which a computer program extracts data from the display output of another program. The program doing the scraping is called a screen scraper. The key element that distinguishes screen scraping from regular parsing is that the output being scraped was intended for final display to a human user, rather than as input to another program, and is therefore usually neither documented nor structured for convenient parsing. Screen scraping often involves ignoring binary data (usually images or multimedia data) and formatting elements that obscure the essential, desired text data. Optical character recognition software is a kind of visual scraper.
There are a number of synonyms for screen scraping, including data scraping, page spidering, web crawling, data extraction, web scraping, page scraping, web page wrapping and HTML scraping (the last four being specific to scraping web pages).
Normally, data transfer between programs is accomplished using data structures suited for automated processing by computers, not people. Such interchange formats and protocols are typically rigidly structured, well-documented, easily parsed, compact, and keep ambiguity and duplication to a minimum. Very often, these transmissions are not human-readable at all.
In contrast, output intended to be human-readable is often the antithesis of this, with display formatting, redundant labels, superfluous commentary, and other information which is either irrelevant or inimical to automated processing. However, when the only output available is such a human-oriented display, screen scraping becomes the only automated way of accomplishing a data transfer.
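The contrast can be sketched in a few lines of Python. The record, the display text, and the function name `scrape_display` are all made up for illustration; the point is only that the structured form parses in one call, while the human-oriented form needs hand-written rules to step around the decoration.

```python
import json

# Machine-oriented form (hypothetical record): one call parses it unambiguously.
structured = '{"flight": "XY123", "price_eur": 49.99}'
record = json.loads(structured)

# Human-oriented form of the same facts (hypothetical display text).
display = """\
*** Your fare summary ***
Flight ........ XY123
Price ......... EUR 49.99 (taxes incl.)
Thank you for choosing us!
"""

def scrape_display(text):
    """Pick the two values out of the decorated text, ignoring everything else."""
    result = {}
    for line in text.splitlines():
        words = line.split()
        if line.startswith("Flight"):
            # The flight code is the last word on the line.
            result["flight"] = words[-1]
        elif line.startswith("Price"):
            # The amount is the word that follows the currency code.
            result["price_eur"] = float(words[words.index("EUR") + 1])
    return result

scraped = scrape_display(display)
```

Both routes end with the same dictionary, but only the second depends on the exact wording and layout of the display.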
Originally, screen scraping referred to the practice of reading text data from a computer display terminal’s screen. This was generally done by reading the terminal’s memory through its auxiliary port, or by connecting the terminal output port of one computer system to an input port on another. By analogy, screen scraping has also come to mean computerized parsing of the HTML text in web pages. In all cases, the screen scraper has to be programmed to not only process the text data of interest, but also to recognize and discard unwanted data, images, and display formatting.
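For the HTML case, a minimal sketch using Python's standard `html.parser` module might look like the following. The class name `TextScraper`, the helper `scrape_text`, and the sample page are assumptions for illustration; the scraper keeps the visible text and discards tags, images, and the bodies of script and style elements.

```python
from html.parser import HTMLParser

class TextScraper(HTMLParser):
    """Collects visible text, skipping markup, images, and script/style bodies."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0   # greater than 0 while inside <script> or <style>
        self.chunks = []       # pieces of visible text, in document order

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)

def scrape_text(html):
    parser = TextScraper()
    parser.feed(html)
    parser.close()
    return parser.chunks

# Hypothetical page mixing content with presentation.
page = """<html><head><style>td { color: red; }</style></head>
<body><img src="logo.png"><h1>Departures</h1>
<table><tr><td><b>Dublin</b></td><td>09:40</td></tr></table>
<script>trackVisit();</script></body></html>"""

visible = scrape_text(page)
```

Here `visible` holds only the three text fragments a reader would see; the stylesheet, image, and script are dropped.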
Screen scraping is most often done to either (1) interface to a legacy system which has no other mechanism which is compatible with current hardware, or (2) interface to a third-party system which does not provide a more convenient API. In the second case, the operator of the third-party system may even see screen scraping as unwanted, due to reasons such as increased system load, the loss of advertisement revenue, or the loss of control of the information content.
Screen scraping is generally considered an ad-hoc, inelegant technique, often used only as a “last resort” when no other mechanism is available. Aside from the higher programming and processing overhead, output displays intended for human consumption often change structure frequently. Humans can cope with this easily, but computer programs will often crash or produce incorrect results.
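Both failure modes can be illustrated with a deliberately naive scraper that reads a price from fixed character columns. The column positions, the sample lines, and the name `scrape_price` are assumptions chosen for this sketch.

```python
# Assumed position of the price in the original layout (characters 15 to 19).
PRICE_COLUMNS = slice(15, 20)

def scrape_price(line):
    """Read the price from fixed columns, as a layout-dependent scraper would."""
    return float(line[PRICE_COLUMNS])

old_line = "XY123 DUB-STN  49.99 EUR"          # layout the scraper was written for
widened = "XY123 Dublin-Stansted  49.99 EUR"   # route name spelled out
shifted = "XY123 DUB-STN 149.99 EUR"           # price grew by one digit

print(scrape_price(old_line))   # correct for the original layout
print(scrape_price(shifted))    # silently wrong: the leading digit is cut off

try:
    scrape_price(widened)
except ValueError as error:
    print("crash:", error)      # the columns now contain letters
```

A person reading any of the three lines sees the right price at once. The program either stops with an error or, worse, returns a plausible number that is wrong.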
Screen scraping generally requires intensive text parsing algorithms. Computer languages that have strong support for regular expressions and other text processing are thus a popular choice for writing screen scraping programs.
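As a small illustration of that style, the sketch below uses Python's `re` module to pull rows out of a terminal-like screen dump. The screen contents, the pattern `ROW`, and the function `scrape_rows` are hypothetical; a real display would need its own pattern.

```python
import re

# One data row: flight code, route, price, currency. Headers, rules and
# prompts do not match the pattern and are therefore ignored.
ROW = re.compile(
    r"^(?P<flight>[A-Z]{2}\d+)\s+(?P<route>\S+)\s+(?P<price>\d+\.\d{2})\s+EUR$",
    re.MULTILINE,
)

# Hypothetical screen dump.
screen = """\
DEPARTURES            page 1 of 3
-------------------------------
XY123   DUB-STN     49.99 EUR
XY456   DUB-BVA    112.50 EUR
-------------------------------
Press F3 for next page
"""

def scrape_rows(text):
    """Return one dictionary per row that matches the expected pattern."""
    return [
        {"flight": m["flight"], "route": m["route"], "price": float(m["price"])}
        for m in ROW.finditer(text)
    ]

rows = scrape_rows(screen)
```

Named groups keep the extraction readable, and anchoring the pattern to whole lines is one way to reduce accidental matches on decorative text.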
Web scraping:
Web pages are built using text-based mark-up languages (HTML and XHTML), and frequently contain a wealth of useful data in text form. However, most web pages are designed for human consumption, and frequently mix content with presentation. Thus, screen scrapers were reborn in the web era to extract machine-friendly data from HTML and other markup. Even general-purpose search engines and other web crawlers use many techniques in the same vein as web scraping.
