Bess error rate for 1,000 .com domains

Bennett Haselton


Using "zone files" from Network Solutions (which list all .com domains in existence), we obtained a list of the first 1,000 active ".com" domains on the Internet as of June 14, 2000. We used that list in August 2000 to measure the error rate of SurfWatch. In September, N2H2, the makers of Bess, reviewed the first 1,000 ".com" domains to correct any errors made by their product in that set of sites, apparently expecting us to test their product using the same list. Since results from the first 1,000 domains would no longer have been valid, we tested Bess using the second 1,000 .com domains instead. From this sample, we determined how many of the sites were blocked by Bess, and of those blocked sites, how many were actually pornographic.


Of the second 1,000 working .com domains, 176 were blocked by Bess. Of these 176 blocked sites, we eliminated 150 sites that were "Under construction" pages (the list of 150 non-functioning sites is here).

Of the remaining 26 sites, 7 were errors (i.e., sites that did not meet the criteria for any of Bess's blocking categories) and 19 were non-errors (i.e., sexually explicit sites). This is an error rate of 7/26 = 27%, or roughly one domain incorrectly blocked for every three blocked domains that do meet Bess's blocking criteria.

Error rate for domains: 27% (about one domain incorrectly blocked for every three blocked domains that meet Bess's blocking criteria).
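The arithmetic above can be double-checked with a few lines of Python (the counts come straight from this report; nothing else is assumed):

```python
blocked = 176                # blocked domains among the second 1,000 working .com domains
under_construction = 150     # "Under construction" pages, excluded from the sample
functioning = blocked - under_construction        # 26 real blocked sites
errors, non_errors = 7, 19
assert errors + non_errors == functioning
print(f"error rate: {errors / functioning:.0%}")  # -> error rate: 27%
```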

N2H2's criteria for their blocking categories are published at (follow the link and click on "Filtering Methods" -- the site uses frames and JavaScript to prevent users from going to that frame directly). We tested a Bess proxy server with the following categories enabled: "Adults Only", "Hate/Discrimination", "Illegal", "Porn Site", "Sex", "Violence", "Alcohol", "Chat", "Drugs", "Free Pages", "Gambling", "Tasteless / Gross", "Profanity", "Lingerie", "Nudity", "Personal Information", "School Cheating Info", "Suicide / Murder", "Tobacco", "Weapons", and "Personals". Two "special categories" were also enabled -- "Block search engine results based on key words" and "Block urls based on key words" -- but none of the sites that we found to be blocked had any offensive words in their URLs, so we concluded that this did not affect testing.

There were no blocked sites that we considered to be "borderline" cases, e.g. non-pornographic black-and-white nude photography sites. The sites in this sample were clear-cut: either pornographic sites, sites that met the criteria for one of Bess's other categories, or completely innocuous sites.

We considered the following blocked sites to be "errors":

We considered the following blocked sites to be "non-errors":


We obtained these results with a Bess proxy server being used by a high school (the school is not named in this report, to protect the identity of the student who helped run the tests with Bess). The server was using a blocked-site list that was current as of October 2000. Only the categories listed above (under "Results") were enabled.

How the list of 1,000 domains was constructed

We started with zone files from Network Solutions listing all .com domains in alphabetical order. Michael Sims supplied the first 10,000 domains in alphabetical order from that list, after eliminating sites at the top of the list whose names consisted mostly of "-" dashes. (A disproportionate number of these were pornographic sites that chose their domain names solely in order to show up at the top of an alphabetical listing, so a sample that included these sites would not be a representative cross-section.)

Jamie McCarthy supplied the Perl script that isolated the first 1,000 domains that were actually "up":

gunzip -c com.20000614.entire.sorted.gz | grep '^a' \
  | grep -v '.*--\|-.*-.*-' \
  | perl -ne 'chomp; $a = system("ping -c 1 -q www.$_ >/dev/null 2>&1"); print "$_\n" if !$a;' \
  | head -1000

We used this script to narrow down the list to the first 1,000 pingable domains sorted alphabetically by domain name. However, after we used this list of 1,000 domains to test SurfWatch in August, N2H2 apparently reviewed the first 1,000 .com domains to un-block any sites in that list that were incorrectly blocked -- anticipating that we would eventually do the same study for their product. Since the results were no longer valid for the first 1,000 .com domains, we used the second 1,000 pingable domains from the list instead.
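For readers who do not follow the shell pipeline, the same filtering logic can be sketched in Python. (This is an illustration only, not the script we actually used; the function names are ours.)

```python
import subprocess

def keep(domain):
    """Same test as the grep stages: starts with 'a', contains no '--',
    and has fewer than three hyphens."""
    return (domain.startswith("a")
            and "--" not in domain
            and domain.count("-") < 3)

def is_up(domain):
    """One quiet ping of www.<domain>, as in the Perl stage."""
    return subprocess.run(
        ["ping", "-c", "1", "-q", f"www.{domain}"],
        stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL,
    ).returncode == 0

def first_pingable(domains, n=1000):
    """Return the first n domains that pass the filter and answer a ping."""
    kept = []
    for d in domains:
        if keep(d) and is_up(d):
            kept.append(d)
            if len(kept) == n:
                break
    return kept
```

Note that "up" here means only that the host answered a ping; a domain can answer pings while serving nothing but an "Under construction" page, which is why those pages had to be weeded out by hand later.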

A truly random sample chosen from the entire list of domain names would have been better, but it would be impossible to prove that such a sample had really been chosen randomly; a third party could easily claim that we had "stacked the deck" by choosing a disproportionate number of sites blocked incorrectly by Bess.

Potential sources of error

A sample of 26 "real" sites blocked by Bess is a small one from which to draw precise conclusions. The problem is that we had to start with 1,000 randomly chosen Web domains just to obtain a sample of 26 blocked domains. The 27% figure should not be taken as accurate to even two significant figures; across all .com domains in existence, the error rate for Bess could be as low as 15%. However, the test does establish that the likelihood of Bess having an error rate of, say, less than 1% across all domains is virtually zero.
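To put a rough number on that uncertainty, here is a normal-approximation 95% confidence interval for the 7-of-26 result (this calculation is our own illustration of the point, not part of the original test):

```python
import math

errors, sample = 7, 26
p = errors / sample                      # observed error rate, ~0.27
se = math.sqrt(p * (1 - p) / sample)     # standard error of a proportion
lo, hi = p - 1.96 * se, p + 1.96 * se    # rough 95% confidence interval
print(f"95% CI: {lo:.0%} to {hi:.0%}")   # -> 95% CI: 10% to 44%
```

The normal approximation is coarse at this sample size, but it illustrates the point: the true rate across all .com domains could plausibly be well below 27%, yet an error rate under 1% is effectively ruled out.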

A note on interpreting these results: the results are not weighted by Web site traffic, so some of the sites in this experiment may cause more "Access Denied" messages than others. The 27% error rate should also not be interpreted to apply across all domains, since we only used .com domains in our experiment, which are more likely to contain commercial pornography than, say, .org domains. (In other words, we should expect the error rate to be even higher for .org sites that are blocked.)


Peter Nickerson, the CEO of N2H2, stated in Congressional testimony given in 1998:

All sites that are blocked are reviewed by N2H2 staff before being added to the block lists.

Given the high error rate for sites blocked by N2H2 under default school categories, we believe that N2H2's claim of "100% human review" is false.