Screen scraping is when someone runs a piece of software against your site to download large volumes of data in an automated way. The software mimics a user accessing your site, collecting data from the various screens as it goes. It is usually used as an efficient way of getting access to huge amounts of data quickly.
From the perspective of the website owner there are a few negatives:
1. the website is being accessed in a way that was not intended – most websites are developed for individuals browsing through web browsers
2. the people who are doing the scraping are typically collecting the data for purposes that are not in line with what the website owner intends – either to re-present the data to others via their own websites or to gain intelligence about the company’s business
3. because screen-scraping mimics users but typically in large volumes, it can put significant pressure on a web server and either slow down the responsiveness of a website or cause it to crash altogether
While screen-scraping may be a handy way to cost-effectively get access to your competitors’ data, you certainly don’t want anyone doing it to YOUR website.
Many sites that have issues with screen scrapers introduce a CAPTCHA (hard-to-read squiggly text, meant to be readable only by humans, that you are asked to type into a box). This has the negative result of making life a little more difficult for your normal users, who have to try to interpret the often quite difficult text. While there are some other coding techniques that can prevent screen scraping, there is a more fundamental question of whether you can do anything legally to prevent others from running screen scraping bots or scripts against your website.
Since the Internet is so new and activity crosses international boundaries, it is going to be difficult to find a law that deals with this very specific activity, particularly in Ireland. It might be more worthwhile to focus on more generalised laws that are in place to protect individuals and businesses from abusive practices. Examples of such illegal practices might include stealing intellectual property, copyright infringement or the more generalised unfair practices.
Stealing intellectual property would occur where the third party is using screen scraping software to gain access to your business rules and intelligence. By accessing publicly available information in bulk, the third party is essentially in a position to decode how you go about doing your business.
Copyright law protects information and data that you have created, and most websites do expressly state that all of their content is copyrighted. There are international conventions, such as the Universal Copyright Convention, which most countries have agreed to and which can be appealed to where there is an infringement.
Unfair practices can be wide-ranging, including “computer fraud and abuse”, “actions leading to damage and loss”, “unauthorised access”, “trespass”, “interference with business relations”, “misappropriation” and “unjust enrichment”. These were a number of the legal complaints cited in a case between Southwest Airlines and Outtask in the US. Southwest was attempting to prevent Outtask from gathering flight information from its site using screen scraping software. The Dallas court ruling in the Southwest Airlines and Outtask case agreed with Southwest Airlines’ claims on a number of fronts, including:
1. loss or damage - defined as “any reasonable cost to any victim, including the cost of responding to an offense, conducting a damage assessment, and restoring the data, program, system or information to its condition prior to the offense, and any revenue lost, cost incurred, or other consequential damages incurred because of interruption of service.”
2. misappropriation - consisting of the following elements: “(i) the creation of plaintiff’s product through extensive time, labor, skill, and money, (ii) the defendant’s use of that product in competition with the plaintiff, thereby gaining a special advantage in that competition (i.e., a ‘free ride’) because defendant is burdened with little or none of the expense incurred by the plaintiff, and (iii) commercial damage to the plaintiff.”
3. interference with business relations – since “there was a reasonable probability that customers who purchased [their] tickets on defendant’s application would have entered into a contractual relationship directly with Southwest”
4. harmful access by computer – which stipulates that “a person commits an offense if the person knowingly accesses a computer, computer network, or computer system without the effective consent of the owner”
While these laws are specific to the US, and in some instances to the state of Texas, I’m sure that a good Irish solicitor could root out similar grounds for a complaint in an Irish court.
BTW, if you find that your site is being screen scraped, it is usually easy to identify who is doing it via a reverse IP lookup, and then either simply block that IP from accessing your website or pursue the legal route. Waiting for the first Irish test case…
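For anyone who wants to try the identification step, here is a rough sketch in Python. It assumes your web server writes an access log in which each line begins with the client IP address (as the common Apache and Nginx log formats do). It counts requests per IP, flags the ones above a threshold, and can then do a reverse DNS lookup on them. The function names, the threshold and the sample log lines are all made up for illustration; they are not taken from any particular tool.

```python
import socket
from collections import Counter


def heavy_hitters(log_lines, threshold):
    """Return {ip: request_count} for IPs with at least `threshold` requests.

    Assumption: every log line starts with the client IP followed by
    whitespace, as in the common Apache/Nginx access log formats.
    """
    counts = Counter()
    for line in log_lines:
        parts = line.split(maxsplit=1)
        if parts:  # skip blank lines
            counts[parts[0]] += 1
    return {ip: n for ip, n in counts.items() if n >= threshold}


def reverse_lookup(ip):
    """Return the hostname published for `ip`, or None if none can be resolved."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:  # covers socket.herror and socket.gaierror
        return None


# Hypothetical sample data; in practice, read the lines from your own access log.
sample_log = [
    '203.0.113.7 - - [10/Oct/2008:13:55:36 +0000] "GET /flights?page=1 HTTP/1.1" 200 2326',
    '203.0.113.7 - - [10/Oct/2008:13:55:36 +0000] "GET /flights?page=2 HTTP/1.1" 200 2291',
    '198.51.100.23 - - [10/Oct/2008:13:55:37 +0000] "GET /index.html HTTP/1.1" 200 1043',
    '203.0.113.7 - - [10/Oct/2008:13:55:37 +0000] "GET /flights?page=3 HTTP/1.1" 200 2310',
]

if __name__ == "__main__":
    for ip, count in heavy_hitters(sample_log, threshold=3).items():
        # reverse_lookup needs network access and may legitimately return None
        print(ip, count, reverse_lookup(ip))
```

Bear in mind that a reverse lookup only returns a hostname if the owner of the IP has published one, and a determined scraper may spread its requests across many addresses, so treat the result as a lead rather than proof. The blocking itself would then be done in your web server or firewall configuration.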