Softpanorama

May the source be with you, but remember the KISS principle ;-)
Home Switchboard Unix Administration Red Hat TCP/IP Networks Neoliberalism Toxic Managers
(slightly skeptical) Educational society promoting "Back to basics" movement against IT overcomplexity and  bastardization of classic Unix

Web sites filtering with Squid


Basic filtering capabilities, which are enough to block "obnoxious snoopers" like Facebook, are present in Squid. Here is a description of these mechanisms, borrowed from Web site blocking techniques:

In my experience, Squid's built-in blocking mechanism (access control) is the easiest way to implement a web site blocking policy. All you need to do is modify the Squid configuration file.

Before you can implement a web site blocking policy, you have to make sure that you have already installed Squid and that it works. You can consult the Squid web site to get the latest version of Squid and a guide for installing it.

To deploy the web-site blocking mechanism in Squid, add the following entries to your Squid configuration file (in my system, it’s called squid.conf and it’s located in the /etc/squid directory):

acl bad url_regex "/etc/squid/squid-block.acl"
http_access deny bad

The file /etc/squid/squid-block.acl contains the web sites or words you want to block. You can name the file whatever you like. If a site's URL matches an entry in the squid-block.acl file, it won't be accessible to your users. The entries below are from the squid-block.acl file used by my clients:

.oracle.com
.playboy.com.br
sex
...
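Note that url_regex entries are regular expressions, not plain strings: an unescaped dot matches any single character. The list usually behaves as intended even unescaped, but escaping the dots makes the patterns more precise, and the -i flag makes matching case-insensitive. A sketch of the tightened entries (same files as above):

```
# More precise form of the /etc/squid/squid-block.acl entries:
\.oracle\.com
\.playboy\.com\.br
sex

# In squid.conf, -i makes the regex list case-insensitive:
acl bad url_regex -i "/etc/squid/squid-block.acl"
```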

With the squid-block.acl file in action, internet users cannot access any site matching those entries.

Beware that by blocking sites containing the word "sex", you will also block sites such as Middlesex University, Sussex University, etc. To resolve this problem, you can put those sites in a special file called squid-noblock.acl:

^http://www.middlesex.ac.uk
^http://www.sussex.ac.uk 

You must also put the “no-block” rule before the “block” rule in the Squid configuration file:

...
acl special_urls url_regex "/etc/squid/squid-noblock.acl"
http_access allow admin_ips special_urls

acl bad url_regex "/etc/squid/squid-block.acl"
http_access deny bad
...
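The excerpt above references an admin_ips ACL that is never defined in the quoted lines; presumably it restricts the exception list to certain client addresses. A complete, ordered sketch (the admin_ips subnet is a placeholder) might look like this:

```
# Placeholder: clients allowed to use the exception list
acl admin_ips src 192.168.1.0/24

acl special_urls url_regex "/etc/squid/squid-noblock.acl"
acl bad url_regex "/etc/squid/squid-block.acl"

# Order matters: Squid applies the first http_access rule that matches
http_access allow admin_ips special_urls
http_access deny bad
```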

A no-block file like this is often needed simply to restore access to useful sites that happen to match a block pattern.
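The combined effect of the two files can be illustrated outside Squid with plain regular expressions (a Python sketch of the matching logic, not Squid's actual implementation; dots are escaped here for precision):

```python
import re

# Patterns from squid-block.acl (url_regex entries are regexes)
block = [r"\.oracle\.com", r"\.playboy\.com\.br", r"sex"]
# Anchored exceptions from squid-noblock.acl
noblock = [r"^http://www\.middlesex\.ac\.uk", r"^http://www\.sussex\.ac\.uk"]

def blocked(url):
    # Squid checks http_access rules in order: the allow (no-block) rule
    # comes first, so a matching exception wins over the block list.
    if any(re.search(p, url) for p in noblock):
        return False
    return any(re.search(p, url) for p in block)

print(blocked("http://www.oracle.com/index.html"))  # True: matches .oracle.com
print(blocked("http://www.middlesex.ac.uk/"))       # False: whitelisted
print(blocked("http://www.essex.ac.uk/"))           # True: contains "sex"
```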

After editing the ACL files (squid-block.acl and squid-noblock.acl), you need to tell Squid to re-read its configuration. If you installed the RPM version, there is usually a script in the /etc/rc.d/init.d directory to help you manage Squid:

# /etc/rc.d/init.d/squid reload

To test whether your blocking mechanism works, use your browser: enter the address of a site listed in the squid-block.acl file.

In the example above, I block .oracle.com, and when I try to access oracle.com, the browser returns an error page.

You can also use additional software to filter web sites with Squid. One option is Content Security for Squid from QuintoLabs (qlproxy), though I am not sure it is worth the trouble. It is an ICAP daemon/URL rewriter that integrates with an existing Squid proxy server and provides rich content filtering to sanitize web traffic entering an internal home or enterprise network. It may be used to block illegal or potentially malicious file downloads, remove annoying advertisements, prevent access to various categories of web sites, and block resources with explicit content (i.e. prohibit explicit and adult content).

NOTE: there are other tools besides qlproxy with almost the same functionality; the best known are SquidGuard (SG) and DansGuardian (DG). While these tools are fine in theory, you need to install both to match qlproxy's functionality: SG runs as a URL rewriter, while DG is a separate proxy in its own right. DG also does not support SMP, relying on a resource-inefficient process-per-connection server model that inflates resource requirements for, e.g., the URL block database. Tying SG and DG together is also a problem: they have different configuration directives and are largely independent of each other, forcing the admin to look in two different places to adjust a single filtering policy.

The most prominent feature of qlproxy is policy-based web filtering, in which proxy users are organized into several groups with different levels of strictness.

By default qlproxy comes with three policies preinstalled. The strict policy sets the web filter to the maximum level and is meant to protect minors and K-12 students from inappropriate content on the Internet. The relaxed policy blocks only excessive advertisements and is meant for network administrators, teachers, and all those who do not need filtered web access but would like to avoid most ads. The last group, default, contains less restrictive filtering settings suitable for normal web browsing without explicit adult content.




NEWS CONTENTS

Old News ;-)

[Apr 13, 2012] Web site blocking techniques

(This article is quoted in full at the top of this page.)

[Apr 13, 2012] How to Use Squid to Block Websites eHow.com

Instructions

1 Ensure that Squid is properly installed. Consult the documentation for your particular distribution and operating system if needed.

2 Configure the Access Control List (ACL) in the Squid configuration file "squid.conf". This file should be found in "/etc/squid/". Add the following lines to the configuration file:

acl bad url_regex "/etc/squid/squid-block.acl"
http_access deny bad

Save changes to the squid.conf file.

3 Enter the URLs and keywords you want to block in the newly created /etc/squid/squid-block.acl file. You can add partial or whole URLs, as well as keywords that will block any page where the keyword is found. Choose the URLs you plan to block carefully, as you could inadvertently block desirable sites that contain the blocked URL fragment: entering ".com" into the file will block every site ending in ".com". Likewise, choose keywords carefully so that Squid does not block desirable sites that contain the keyword.

Example:

badwords.com
bored

These entries will cause the filter to block any URL that contains "badwords.com." The filter will also block any site that contains the word "bored."

4 Save changes to the squid-block.acl file.

5 Create a squid-noblock.acl file to allow certain exceptions to pass through the filtering rules. Websites whose URLs contain a blocked keyword as a substring would otherwise be blocked by the policy. Enter the URLs and keywords of sites not to block in this file.

Example:

www.stopbadwords.com
boredom

These entries will cause the URL "www.stopbadwords.com" to be allowed through the filter, even though it contains the blocked URL "badwords.com."

6 Save changes to the squid-noblock.acl file.

7 Add the "no-block" rule to the squid.conf configuration file before the "block" rule in order to put the "no-blocking" rules into action:

acl special_urls url_regex "/etc/squid/squid-noblock.acl"
http_access allow admin_ips special_urls

Save changes to the squid.conf file.
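The ordering requirement in step 7 exists because Squid applies the first http_access rule whose ACLs all match. A small Python sketch of this first-match-wins evaluation (a simplification; real Squid defaults to the opposite of the last rule when nothing matches):

```python
# Each rule pairs an action with a predicate; the first match decides.
rules = [
    ("allow", lambda url: "stopbadwords.com" in url),  # squid-noblock.acl rule
    ("deny",  lambda url: "badwords.com" in url),      # squid-block.acl rule
]

def decide(url):
    for action, matches in rules:
        if matches(url):
            return action
    return "allow"  # simplified default

print(decide("http://www.stopbadwords.com/"))  # allow: the exception fires first
print(decide("http://badwords.com/page"))      # deny
```

If the deny rule came first, "www.stopbadwords.com" would match "badwords.com" as a substring and be blocked before the exception was ever consulted.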

8 Restart Squid in order for the new filtering policies to take effect.

[Apr 13, 2012] Web Filtering On Squid Proxy

I am not sure that installing version 2.0 of qlproxy is worth the trouble...
Linux Howtos and Tutorials

Our goal is to set up a free Linux-based server running Squid and deploy a web filtering application on it, saving bandwidth, speeding up web access, and blocking objectionable and potentially illegal or malicious web files.

... ... ...

Next step is to install Content Security for Squid from QuintoLabs (qlproxy). We will use version 2.0 of qlproxy, which was released this month. (The description of qlproxy and its comparison with SquidGuard and DansGuardian is quoted near the top of this page.)


[SOLVED] Block https Facebook in Squid proxy server ~ ServerComputing

Q: How can I block Facebook (or any HTTPS site) in a Squid proxy?

This post shows how to block both HTTP and HTTPS access to Facebook during office hours on your Squid proxy server: create an ACL for the Facebook domain (dstdomain) and deny both HTTP and HTTPS access.


Add the Configurations to squid.conf
[root@server ~#]vi /etc/squid/squid.conf

#1: Create an acl for proxy clients.
acl accountant src 192.168.10.50/32


#2: Create an acl for facebook domain (any required sites)
acl fb dstdomain .facebook.com

#3: Create an acl for office time: Mon-Sat, 10:00 to 17:00 (24-hour clock)

acl officetime time MTWHFA 10:00-17:00

#4: Deny access to "http" facebook to accountant only in office times

http_reply_access deny fb accountant officetime

#5: The below line will deny access to "https" secured facebook to the proxy user "accountant" in office times. Squid proxy will deny access to "https" facebook to accountant only in office times.
http_access deny CONNECT fb accountant officetime

#(save the squid.conf configuration file)

#6: And finally, reload the squid service for the changes to take effect

[root@server ~#]service squid reload

Tips: The way to include multiple sites in one ACL
acl badsites dstdomain .facebook.com .twitter.com .blogger.com

Note: Tested in squid-3.1 (tested using squid-3.1.16-1.fc15.x86_64 in CentOS 6)
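A dstdomain ACL value with a leading dot matches the domain itself and every subdomain; without the dot it matches only that exact hostname. A Python sketch of this matching rule (a simplified model, not Squid's code):

```python
def dstdomain_match(acl_value, host):
    # A leading dot means "this domain and any subdomain";
    # without the dot, only an exact hostname match counts.
    if acl_value.startswith("."):
        return host == acl_value[1:] or host.endswith(acl_value)
    return host == acl_value

print(dstdomain_match(".facebook.com", "www.facebook.com"))  # True
print(dstdomain_match(".facebook.com", "facebook.com"))      # True
print(dstdomain_match(".facebook.com", "notfacebook.com"))   # False
```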

Author said...

@rizaal,
always place the facebook deny ACLs above all other browse-allowing ACLs, like below:
http_reply_access deny fb accountant
http_reply_access allow lan

Author said...

@Unknown, as mentioned in this post, add the following 3 lines to squid.conf (the simplest solution):
acl fb dstdomain .facebook.com
http_access deny CONNECT fb
http_reply_access deny fb

Done, now restart the squid daemon:
#service squid restart

rajasekaran said...

service squid reload
2012/04/24 05:45:20| aclParseAclList: ACL name 'CONNECT' not found.
FATAL: Bungled squid.conf line 65: http_access deny CONNECT fb accountant officetime
Squid Cache (Version 3.1.4): Terminated abnormally.
CPU Usage: 0.013 seconds = 0.004 user + 0.009 sys
Maximum Resident Size: 21968 KB
Page faults with physical i/o: 0

April 24, 2012 at 5:44 PM
Author said...
@rajasekaran, thanks for feedback.
Add the below "CONNECT" acl to your squid.conf file
acl CONNECT method CONNECT

then try restarting squid

April 10, 2013 at 6:25 PM
Auronrev said...
Thanks, very nice info!! ;)

But I have a little problem: when Squid blocks an HTTPS site, the Squid error page isn't shown; the browser shows its own error instead. Blocked HTTP pages work fine and show the Squid error page. Any idea how to solve this?

Enhancing your privacy using Squid and Privoxy Christian Schenk

January 27th, 2007 | Tech

If you would like to surf the internet anonymously I'll show you how to use Squid and Privoxy for this purpose. First we'll configure Squid to filter some HTTP header fields. After this, web servers will most likely think that we aren't requesting content through a proxy but rather directly with our browser. We will see that we can't manipulate all HTTP header fields without running into problems: Privoxy will help us here.

You can test your setup with ProxyJudge or SamAir; there are a lot of other tools which provide this functionality. While SamAir just checks some HTTP header fields, ProxyJudge will do a more comprehensive check. It will calculate your level of anonymity: it ranges from 1 to 5 where level 1 is excellent and 5 bad. If you're already using a proxy, your level of anonymity might be bad: go check it right now so you can compare the results later.

Configuring Squid

If you don't want to use Privoxy you can still set some options in your squid.conf, which will get you up on level 1 or 2 at ProxyJudge. Here they are:

via off
forwarded_for off

header_access From deny all
header_access Server deny all
header_access WWW-Authenticate deny all
header_access Link deny all
header_access Cache-Control deny all
header_access Proxy-Connection deny all
header_access X-Cache deny all
header_access X-Cache-Lookup deny all
header_access Via deny all
header_access Forwarded-For deny all
header_access X-Forwarded-For deny all
header_access Pragma deny all
header_access Keep-Alive deny all

These directives control HTTP header fields that are set by Squid, or by another proxy if your Squid is part of a proxy hierarchy. The Via and Forwarded-For fields indicate that the request was forwarded by a proxy, which is exactly the information we don't want to leak. For the same reason, the remaining header_access lines deny several other fields as well.

After you've done this you should have a rating of 1 or 2; you only get a 1 if reverse DNS is not enabled for your IP, which is usually controlled by your ISP rather than by you. If you don't want every web server to know your current IP, you can set up Squid to use another proxy as a parent, e.g. a proxy provided by your ISP. Be aware that this might result in a worse rating, because the parent proxy might set the HTTP header fields mentioned above, and obviously you can't change that.
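Note that header_access is a Squid 2.x directive; in Squid 3.x it was split into request_header_access (for request headers) and reply_header_access (for reply headers), and forwarded_for gained a delete option that removes the header entirely instead of sending "unknown". A rough Squid 3.x equivalent of the settings above (a sketch only; verify against your version's documentation):

```
via off
forwarded_for delete
request_header_access From deny all
request_header_access Via deny all
request_header_access X-Forwarded-For deny all
request_header_access Proxy-Connection deny all
request_header_access Pragma deny all
reply_header_access Server deny all
reply_header_access X-Cache deny all
reply_header_access X-Cache-Lookup deny all
```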

squid_redirect – Freecode

Written in Perl

squid_redirect uses a list of patterns to zap annoying ad banners from Web pages, inserting a placeholder image. It lives in a Web proxy and so requires no special browser facilities. It's readily customizable, small, fast, and easy to install.

Recommended Links




Quick HOWTO Ch32 Controlling Web Access with Squid - Linux Home Networking

How to install squid proxy on centos 6

Web Filtering On Squid Proxy by sichent




The Last but not Least: Technology is dominated by two types of people: those who understand what they do not manage and those who manage what they do not understand. ~ Archibald Putt, Ph.D.


Copyright © 1996-2021 by Softpanorama Society. www.softpanorama.org was initially created as a service to the (now defunct) UN Sustainable Development Networking Programme (SDNP) without any remuneration. This document is an industrial compilation designed and created exclusively for educational use and is distributed under the Softpanorama Content License. Original materials copyright belong to respective owners. Quotes are made for educational purposes only in compliance with the fair use doctrine.

FAIR USE NOTICE This site contains copyrighted material the use of which has not always been specifically authorized by the copyright owner. We are making such material available to advance understanding of computer science, IT technology, economic, scientific, and social issues. We believe this constitutes a 'fair use' of any such copyrighted material as provided by section 107 of the US Copyright Law according to which such material can be distributed without profit exclusively for research and educational purposes.

This is a Spartan WHYFF (We Help You For Free) site written by people for whom English is not a native language. Grammar and spelling errors should be expected. The site contains some broken links, as it develops like a living tree...

You can use PayPal to buy a cup of coffee for the authors of this site.

Disclaimer:

The statements, views and opinions presented on this web page are those of the author (or referenced source) and are not endorsed by, nor do they necessarily reflect, the opinions of the Softpanorama society. We do not warrant the correctness of the information provided or its fitness for any purpose. The site uses AdSense, so you need to be aware of Google's privacy policy. If you do not want to be tracked by Google, please disable Javascript for this site. This site is perfectly usable without Javascript.

Last modified: January 09, 2020