Using "zone files" from Network Solutions (which list all .com domains in existence), we obtained a list of the first 1,000 active ".com" domains on the Internet, in alphabetical order, as of June 14, 2000. From this sample, we determined how many sites were blocked by AOL Parental Controls (under the "mature teen" setting) and, of those blocked sites, how many were actually pornographic.
Of the first 1,000 working .com domains, only five were blocked by AOL Parental Controls under the "mature teen" setting:
Of these five, only http://a-aji.com was blocked in error -- a site selling vinegar and seasoning sauces (the domain name "a-aji" means "good taste").
This error rate was lower than the error rates for SurfWatch, Cyber Patrol, Bess, and SafeServer, but the tradeoff is that AOL Parental Controls blocked far fewer pornographic sites than any of the other programs.
AOL does not publish its criteria for blocking sites under AOL Parental Controls. The Parental Controls help screen for AOL customers states that AOL uses Cyber Patrol's database of blocked sites, but when we did a similar experiment with Cyber Patrol, the list of blocked sites was completely different.
We obtained these results using AOL 5.0 for Windows 98 with Parental Controls turned on and set to the "mature teen" setting. Under the "mature teen" setting, access to all of the Web is allowed, except for specific sites that have been identified and blocked by AOL (or by Cyber Patrol, which supplies the list of blocked sites that AOL uses).
We started with zone files from Network Solutions listing all .com domains in alphabetical order. Michael Sims supplied the first 10,000 domains in alphabetical order from that list, after eliminating sites at the top whose names were made up mostly of "-" dashes. (A disproportionate number of these were pornographic sites that chose their domain name solely in order to show up at the top of an alphabetical listing, so a sample that included these sites would not be a representative cross-section.)
Jamie McCarthy supplied the Perl script that isolated the first 1,000 domains that were actually "up":
gunzip -c com.20000614.entire.sorted.gz | grep '^a' | grep -v '.*--\|-.*-.*-' | perl -ne 'chomp; $a = system("ping -c 1 -q www.$_ >/dev/null 2>&1"); print "$_\n" if !$a;' | head -1000
We used this script to narrow down the list to the first 1,000 pingable domains sorted alphabetically by domain name.
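The selection logic of that pipeline can be restated in Python as a rough sketch. It is not the script actually used for the study. The names candidate_domains, first_working, and is_up are hypothetical, and is_up stands in for the ping check so that the filtering can be tried without touching the network. The regular expression is intended to mirror the grep -v pattern above, which drops any name containing a doubled dash or three or more dashes.

```python
import re
from itertools import islice

# Intended to mirror: grep -v '.*--\|-.*-.*-'
# A name is dropped if it contains "--" or at least three dashes anywhere.
DASH_HEAVY = re.compile(r"--|-.*-.*-")


def candidate_domains(lines):
    """Yield names that start with 'a' and are not dash-heavy (the two grep steps)."""
    for line in lines:
        name = line.strip()
        if name.startswith("a") and not DASH_HEAVY.search(name):
            yield name


def first_working(lines, is_up, limit=1000):
    """Return the first `limit` candidates for which is_up(name) is true.

    is_up is a hypothetical callable supplied by the caller; in the original
    pipeline this role is played by a single ping to www.<name>.
    """
    working = (name for name in candidate_domains(lines) if is_up(name))
    return list(islice(working, limit))
```

Because the input is already sorted, taking the first matches with islice plays the same role as head -1000: the sample is fixed by the zone file and the reachability check, not by any choice made by the person running it.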
We used the first 1,000 working domain names in our sample in order to make our sample "provably random". A truly random sample chosen from the entire list of domain names would have been better, but it would be impossible to prove that such a sample had really been chosen randomly; a third party could easily claim that we had "stacked the deck" by choosing a disproportionate number of sites blocked incorrectly by AOL.
The sample of blocked sites used here was too small to draw any precise conclusions about the error rate. If one out of five blocked sites was blocked incorrectly, the actual error rate might still be anywhere from 5% to 75%. (In similar experiments that we did with SurfWatch, Cyber Patrol, Bess, and SafeServer, the number of blocked sites was much higher.)
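To illustrate how wide the uncertainty is with so few blocked sites, here is a minimal sketch that computes an exact binomial (Clopper-Pearson) confidence interval. This is one standard method and an assumption on our part; it is not necessarily how the range quoted above was derived, and the function names and the 95% confidence level are chosen only for illustration.

```python
from math import comb


def binom_cdf(k, n, p):
    """Return P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def solve_increasing(f, target, iters=100):
    """Bisect on [0, 1] for the p at which an increasing function f(p) equals target."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_interval(errors, blocked, confidence=0.95):
    """Clopper-Pearson interval for the true error rate, given `errors` out of `blocked`."""
    alpha = 1 - confidence
    # Lower bound: the rate at which P(X >= errors) equals alpha / 2.
    if errors == 0:
        lower = 0.0
    else:
        lower = solve_increasing(
            lambda p: 1 - binom_cdf(errors - 1, blocked, p), alpha / 2
        )
    # Upper bound: the rate at which P(X <= errors) equals alpha / 2,
    # i.e. P(X > errors) equals 1 - alpha / 2.
    if errors == blocked:
        upper = 1.0
    else:
        upper = solve_increasing(
            lambda p: 1 - binom_cdf(errors, blocked, p), 1 - alpha / 2
        )
    return lower, upper


low, high = exact_interval(1, 5)
print(f"{low:.1%} to {high:.1%}")
```

For one error among five blocked sites at 95% confidence, this particular method gives a range from below 1% to a little above 70%. Other methods or confidence levels move the endpoints, but the range stays very wide, which is the point being made here.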
Even though AOL Parental Controls generated fewer wrongly blocked sites than any other program, it also blocked fewer pornographic sites than any of the others. This suggests a tradeoff between the two goals: if a program blocks fewer sites, it is easier to maintain a low error rate by ensuring that all blocked sites really are pornographic, but there are also many more pornographic sites that won't be blocked.