Softpanorama

May the source be with you, but remember the KISS principle ;-)
(slightly skeptical) Educational society promoting "Back to basics" movement against IT overcomplexity and  bastardization of classic Unix

Lower case requests

This is a very widespread type of Web site abuse, and it is very difficult to eliminate, so trying to block these requests is a futile exercise. It is better to detect that a request is a lower-case variant of a real URL and redirect it to some page.
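
One way to sketch that detection in Python, assuming you can list the real, case-sensitive paths of the site. The `real_paths` list, the sample paths and both function names are hypothetical and not part of any server API; this is an illustration of the idea, not the site's actual setup.

```python
def build_case_map(real_paths):
    """Map each lower-cased path to its real, case-sensitive form."""
    # If two real paths differ only by case, the later one wins.
    return {p.lower(): p for p in real_paths}


def canonical_redirect(request_path, case_map):
    """Return the real path to redirect to, or None if no redirect applies."""
    real = case_map.get(request_path.lower())
    if real is None or real == request_path:
        return None  # unknown path, or already correctly cased
    return real


# Hypothetical paths, for illustration only
case_map = build_case_map(["/Tools/cat.shtml", "/Lang/pl1.shtml"])
print(canonical_redirect("/tools/cat.shtml", case_map))  # /Tools/cat.shtml
print(canonical_redirect("/Tools/cat.shtml", case_map))  # None
```

A request that matches a known path only after lower-casing is a candidate for a redirect; anything not in the table falls through to the normal 404 handling.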

Robots that submit requests with all directory names converted to lower case are probably, in total, the most frequent visitors to your Web site. I wonder whether this is a case of authors of brain-dead Windows-based robots trying to debug something on this site.

For Windows this abomination is not a big problem. For example, IIS 7 has an "Enforce lowercase URLs" option (see "iis 7 - IIS7 and Enforce lowercase URLs" on Stack Overflow).

Unix is, and always was, case sensitive for both directories and files. You can get a general impression of this activity using something like

cut -d ' ' -f 7 http_logs120903_0530.log | grep -E "/[a-z]+/" | sort | uniq
For example:
/utilities/teraterm.shtml
/authentication/kerberos.shtml
/links/russian/culture/music/female_singers/valentina_tolkunova.shtml
/links/russian/culture/music/romances/yesenin_romances.shtml
/solaris/security/rbac.shtml
/history/multix.shtml
/solaris/processes_and_memory/swap_space_management.shtml
/social/toxic_managers/communication/negative_politeness.shtml
/links/russian/culture/music/russian_duets.shtml
/links/russian/culture/music/female_singers/lyudmila_gurchenko.shtml
/people/gurtyak/programs/keyrus/keyrus73/keyrus.txt
/tools/cat.shtml
/lang/pl1.shtml
/algorithms/des.shtml
/tools/uniq.shtml
Sometimes they refer to nonexistent directories:
/upc/share/jdk1.2/docs/api/java/security/package-summary.html
/upc/share/jdk1.2/docs/api/java/sql/package-summary.html
/upc/share/jdk1.2/docs/api/java/util/package-summary.html
/upc/share/jdk1.2/docs/api/javax/accessibility/package-summary.html
/upc/share/jdk1.2/docs/api/overview-summary.html
/cdrom/jrl/index.htm
/upc/share/jdk1.2/docs/index.html
/Afs/rpi.edu/home/34/floydb1/html/composit.html
Most commonly the entire request is in lower case. For example:
/admin/tivoli/tec
/bookshelf/classic.shtml
/tools/tail.shtml
/office/open_office.shtml
/social/toxic_managers/bullies.shtml
/scripting/perlorama/modules/expect.shtml
/links/russian/ice_skating/ice_age/navka_basharov.shtml
/links/russian/culture/music/russian_waltzs.shtml
/links/russian/culture/music/male_singers/vyacheslav_dobryinin.shtml
/links/russian/culture/music/female_singers/tamara_miansarova.shtml
Sometimes only one directory (this is typical for PHP probes) or a couple of directories are in lower case:
/Access_control/admin/categories.php/login.php?cPath=&action=new_product_preview
/Admin/tips.shtml//wp-content/themes/MyApp/timthumb.php?src=http://wordpress.com.suppaddleboard.com/eva.php
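
The three patterns above (everything in lower case, only some directories in lower case, correctly cased) can be told apart with a rough Python counterpart of the `cut` pipeline shown earlier. This is a sketch that assumes the request path is the seventh space-separated field of each log line, as in that command; the function names and the sample log line are made up for illustration.

```python
def request_path(log_line):
    """Return the 7th space-separated field of a log line, or None if absent."""
    fields = log_line.split(" ")
    return fields[6] if len(fields) > 6 else None


def classify(path):
    """Label a request path by the case of its directory components."""
    # Drop the query string, then keep only the directory components.
    dirs = [d for d in path.split("?")[0].split("/")[1:-1] if d]
    if not dirs:
        return "no-dirs"
    # A component without letters (e.g. "34") counts as lower case here.
    lower = [d for d in dirs if d == d.lower()]
    if len(lower) == len(dirs):
        return "all-lower"
    return "partly-lower" if lower else "cased"


# Made-up log line in the usual access-log layout
line = '1.2.3.4 - - [03/Sep/2012:05:30:00 -0400] "GET /tools/cat.shtml HTTP/1.1" 404 512'
print(classify(request_path(line)))  # all-lower
print(classify("/Access_control/admin/categories.php/login.php?cPath="))  # partly-lower
```

Note that "all-lower" only means the request contains no capitals; whether the real directory is spelled differently still has to be checked against the site's actual paths.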

Old News ;-)

[Nov 03, 2015] Are upper- and lower-case URLs the same page

WebmasterWorld

tedster

The mess is created by Windows servers. In their default configuration they are not case sensitive, but most of the other operating systems are, including those used by Google and other search engines.

There is some spidering evidence of Google trying to discover which sites are on non-case-sensitive servers, but that's a crazy job and I would not depend on Google or any other Search Engine getting it accurately sorted.

Help them out - if you can make all urls lower case, that is the best practice. If you can configure your server to be case sensitive, that's another best practice. If you have a URL that is already well ranked and it uses some uppercase, then know that changing those letters to lowercase does create alternative urls.

It is a rare thing to acquire a duplicate "penalty", but when the same content appears on technically different urls, then that kind of duplication has negative effects. Backlink influence gets split up, one or more of the url versions gets filtered out of search results and so on.

This is not a true penalty, as in a black mark against your domain. However, the ranking and traffic problems that are generated can feel like one.

Onders

From personal experience we have been trying to move over mixed case URL's to lower case ones for a while..
Although from a ranking perspective both can do equally well (on our site at least!) - the thinking is that from a user perspective, having easy URL's all of one type is much more memorable.. (we also have some hyphens and some underscores..)

With the mixed case URL's which have backlinks to them we've been more reluctant. With the ones where there is no external influence we've just changed the URL and put a 301 on the old URL and no problem. If there are backlinks I'd think about trying to get them redirected and then putting a 301 on the old page...

But as was mentioned - if it aint broke.... Are you really sure you want to be tampering with it?!

[Nov 03, 2015] iis 7 - IIS7 and Enforce lowercase URLs

Jan 20, 2011 | Stack Overflow
I was just reading an article on writing rules, from Scott Gu

Tips/Trick: Fix Common SEO Problems Using the URL Rewrite Extension

He talks about the issue of excluding static files (.jpeg, .jpg, .gif, etc.) from the lowercase rewrite, and shows how you can add conditions to exclude files. Another article is where I found the condition for excluding more than just Scott's example

Mike's Umbraco blog - URL Rewriting and SEO

He adds the condition:

<add input="{URL}" pattern="^.*\.(axd|css|js|jpg|jpeg|png|gif)$" negate="true" ignoreCase="true" />

I hope this helps you in future rewrites.

[Nov 03, 2015] 10 URL Rewriting Tips and Tricks RuslanY Blog

Some webmasters convert all requests to lower case. See "Enforce Lower Case URLs" below.

This post describes some of the tips and tricks that one may find useful when solving URL-based problems for their web server or web site. Each tip/trick has a description of a problem and then an example of how it can be solved with IIS 7 URL Rewrite Module.

  1. Add or Remove Trailing Slash
  2. Enforce Lower Case URLs
  3. Canonical Hostnames
  4. Redirect to HTTPS
  5. Return HTTP 503 Status Code in Response
  6. Prevent Image Hotlinking
  7. Reverse Proxy to Another Site/Server
  8. Preserve Protocol Prefix in Reverse Proxy
  9. Rewrite/Redirect Based on Query String Parameter
  10. Avoid Rewriting of Requests for ASP.NET Web Resources
1. Add or Remove Trailing Slash

Many web applications use "virtual URLs" – that is the URLs that do not directly map to the file and directory layout on web server's file system. An example of such application may be an ASP.NET MVC application with URL format similar to this: http://stackoverflow.com/questions/60857/modrewrite-equivalent-for-iis-7-0 or a PHP application with URL format that looks like this: http://ruslany.net/2008/11/url-rewrite-module-release-to-web/. If you try to request these URLs with or without trailing slash you will still get the same page. That is OK for human visitors, but may be a problem for search engine crawlers as well as for web analytics services. Different URLs for the same page may cause crawlers to treat the same page as different pages, thus affecting the page ranking. They will also cause Web Analytics statistics for this page to be split up.

This problem is very easy to fix with a rewrite rule. Having or not having a trailing slash in the URL is a matter of taste, but once you've made a choice you can enforce the canonical URL format by using one of these rewrite rules:

To always remove trailing slash from the URL:

<rule name="Remove trailing slash" stopProcessing="true">
  <match url="(.*)/$" />
  <conditions>
    <add input="{REQUEST_FILENAME}" matchType="IsFile" negate="true" />
    <add input="{REQUEST_FILENAME}" matchType="IsDirectory" negate="true" />
  </conditions>
  <action type="Redirect" redirectType="Permanent" url="{R:1}" />
</rule>

To always add trailing slash to the URL:

<rule name="Add trailing slash" stopProcessing="true">
  <match url="(.*[^/])$" />
  <conditions>
    <add input="{REQUEST_FILENAME}" matchType="IsFile" negate="true" />
    <add input="{REQUEST_FILENAME}" matchType="IsDirectory" negate="true" />
  </conditions>
  <action type="Redirect" redirectType="Permanent" url="{R:1}/" />
</rule>
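
To see which URLs the two match patterns select, here is a small Python sketch that applies the same regular expressions. It is only an approximation: the IsFile and IsDirectory conditions are left out, and the function names are invented for this example.

```python
import re


def remove_trailing_slash(url):
    """Apply the pattern (.*)/$ and return the redirect target, or None if no match."""
    m = re.search(r"(.*)/$", url)
    return m.group(1) if m else None


def add_trailing_slash(url):
    """Apply the pattern (.*[^/])$ and return the redirect target, or None if no match."""
    m = re.search(r"(.*[^/])$", url)
    return m.group(1) + "/" if m else None


print(remove_trailing_slash("2008/11/url-rewrite-module-release-to-web/"))
# 2008/11/url-rewrite-module-release-to-web
print(add_trailing_slash("2008/11/url-rewrite-module-release-to-web"))
# 2008/11/url-rewrite-module-release-to-web/
print(add_trailing_slash("2008/11/url-rewrite-module-release-to-web/"))
# None
```

Each function returns None for a URL that is already in the chosen canonical form, which is what keeps the corresponding redirect from looping.
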
2. Enforce Lower Case URLs

A problem similar to the trailing slash problem may happen when somebody links to your web page by using different casing, e.g. http://ruslany.net/2008/07/IISNET-Uses-Url-Rewrite-Module/ vs. http://ruslany.net/2008/07/iisnet-uses-url-rewrite-module/. In this case again the search crawlers will treat the same page as two different pages and two different statistics sets will show up in Web Analytics reports.

What you want to do is to ensure that if somebody comes to your web site by using a non-canonical link, then you redirect them to the canonical URL that uses only lowercase characters:

<rule name="Convert to lower case" stopProcessing="true">
  <match url=".*[A-Z].*" ignoreCase="false" />
  <action type="Redirect" url="{ToLower:{R:0}}" redirectType="Permanent" />
</rule>
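
The decision this rule makes can be written out in a few lines of Python. This is a sketch of the logic only, with an invented function name, and it is not how IIS implements the rule.

```python
import re


def lowercase_redirect(url):
    """Return (301, lower-cased url) if url contains an ASCII capital, else None."""
    if re.search(r"[A-Z]", url):
        return 301, url.lower()
    return None  # already canonical, serve the request as is


print(lowercase_redirect("2008/07/IISNET-Uses-Url-Rewrite-Module/"))
# (301, '2008/07/iisnet-uses-url-rewrite-module/')
print(lowercase_redirect("2008/07/iisnet-uses-url-rewrite-module/"))
# None
```

On a case-sensitive Unix file system whose real directory names contain capitals, blindly lower-casing would point to paths that do not exist, so this logic only suits sites whose canonical URLs are already all lower case.
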
3. Canonical Hostnames

Very often you may have one IIS web site that uses several different host names. The most common example is when a site can be accessed via http://www.yoursitename.com and via http://yoursitename.com. Or, perhaps, you have recently changed your domain name from oldsitename.com to newsitename.com and you want your visitors to use the new domain name when bookmarking links to your site. A very simple redirect rule will take care of that:

<rule name="Canonical Host Name" stopProcessing="true">
  <match url="(.*)" />
  <conditions>
    <add input="{HTTP_HOST}" negate="true" pattern="^ruslany\.net$" />
  </conditions>
  <action type="Redirect" url="http://ruslany.net/{R:1}" redirectType="Permanent" />
</rule>

To see an example of how that works try browsing to http://www.ruslany.net/2008/10/aspnet-postbacks-and-url-rewriting/. You will see in the browser's address bar that "www" is removed from the domain name.

4. Redirect to HTTPS

When a site that requires SSL is accessed via non-secure HTTP connection, IIS responds with HTTP 403 (Forbidden) status code. This may be fine if you always expect that your site visitors will be typing "https://…" in the browser's address bar. But if you want your site to be easily discoverable and more user friendly, you probably would not want to return 403 response to visitors who came over a non-secure HTTP connection. Instead you would want to redirect them to the secure equivalent of the URL they have requested. A typical example is this URL: http://www.paypal.com. If you follow it you will see that browser gets redirected to https://www.paypal.com.

With URL Rewrite Module you can perform this kind of redirection by using the following rule:

<rule name="Redirect to HTTPS" stopProcessing="true">
  <match url="(.*)" />
  <conditions>
    <add input="{HTTPS}" pattern="^OFF$" />
  </conditions>
  <action type="Redirect" url="https://{HTTP_HOST}/{R:1}" redirectType="Permanent" />
</rule>

Note that for this rule to work within the same web site you will need to disable "Require SSL" checkbox for the web site. If you do not want to do that, then you can create two web sites in IIS – one with http binding and another with https binding – and then add this rule to the web.config file of the site with http binding.

5. Return HTTP 503 Status Code in Response

HTTP status code 503 (Service Unavailable) means that the server is currently unable to handle the request, typically because of temporary overload or maintenance. This status code implies that the outage is temporary, so when a search engine crawler gets an HTTP 503 response from your site, it will know not to index this response, but instead to come back later.

When you stop the IIS application pool for your web site, IIS will return HTTP 503 for all requests to that site. But what if you are doing maintenance to a certain location of the web site and you do not want to shut down the entire site because of that? With URL Rewrite Module you can return 503 response only when HTTP requests are made to a specific URL path:

<rule name="Return 503" stopProcessing="true">
  <match url="^products/sale/.*" />
  <action type="CustomResponse" statusCode="503"
          subStatusCode="0"
          statusReason="Site is unavailable"
          statusDescription="Site is down for maintenance" />
</rule>

6. Prevent Image Hotlinking

Image Hotlinking is the use of an image from one site into a web page belonging to a second site. Unauthorized image hotlinking from your site increases bandwidth use, even though the site is not being viewed as intended. There are other concerns with image hotlinking, for example copyrights or usage of images in an inappropriate context.

With URL Rewrite Module, it is very easy to prevent image hotlinking. For example the following rewrite rule prevents hotlinking to all images on a web site http://ruslany.net:

<rule name="Prevent image hotlinking">
  <match url=".*\.(gif|jpg|png)$"/>
  <conditions>
    <add input="{HTTP_REFERER}" pattern="^$" negate="true" />
    <add input="{HTTP_REFERER}" pattern="^http://ruslany\.net/.*$" negate="true" />
  </conditions>
  <action type="Rewrite" url="/images/say_no_to_hotlinking.jpg" />
</rule>

This rule will rewrite a request for any image file to /images/say_no_to_hotlinking.jpg only if the HTTP Referer header on the request is not empty and is not equal to the site's domain.
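
The same decision can be traced step by step in Python. This is a sketch under the assumption that patterns are matched case-insensitively; the function name `resolve_image` is invented for the example and a missing Referer header is represented as an empty string.

```python
import re

IMAGE_RE = re.compile(r".*\.(gif|jpg|png)$", re.IGNORECASE)
OWN_SITE_RE = re.compile(r"^http://ruslany\.net/.*$", re.IGNORECASE)
PLACEHOLDER = "/images/say_no_to_hotlinking.jpg"


def resolve_image(path, referer=""):
    """Return the path to serve: the placeholder for hotlinked images, else path."""
    if not IMAGE_RE.match(path):
        return path  # not an image, the rule does not apply
    if referer == "":
        return path  # empty Referer: direct request, allowed
    if OWN_SITE_RE.match(referer):
        return path  # the page embedding the image is on the site itself
    return PLACEHOLDER


print(resolve_image("photos/cat.jpg", "http://example.com/page.html"))
# /images/say_no_to_hotlinking.jpg
print(resolve_image("photos/cat.jpg", "http://ruslany.net/2008/"))
# photos/cat.jpg
```

With the pattern exactly as written, a Referer that starts with https:// or with http://www.ruslany.net/ does not match the second condition either, so such requests would also get the placeholder image.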

7. Reverse Proxy To Another Site/Server

By using URL Rewrite Module together with Application Request Routing module you can have IIS 7 act as a reverse proxy. For example, you have an intranet web server and you want to expose its content over internet. To enable that you will need to perform the following configuration steps on the server that will act as a proxy:

Step 1: Check the "Enable proxy" checkbox located in the Application Request Routing feature view in IIS Manager.

Step 2: Add the following rule to the web site that will be used to proxy HTTP requests:

<rule name="Proxy">
  <match url="(.*)" />
  <action type="Rewrite" url="http://internalserver/{R:1}" />
</rule>

Note the http:// prefix in the rewrite rule action. That is what indicates that this request must be proxied instead of being rewritten. When a rule has a "Rewrite" action with a URL that contains the protocol prefix, the URL Rewrite Module will not perform its standard URL rewriting logic. Instead it will pass the request to the Application Request Routing module, which will proxy that request to the URL specified in the rule.

8. Preserve Protocol Prefix in Reverse Proxy

The rule in previous tip always uses non-secure connection to the internal content server. Even if the request came to the proxy server over HTTPS, the proxy server will pass that request to the content server over HTTP. In many cases this may be exactly what you want to do. But sometimes it may be necessary to preserve the secure connection all the way to the content server. In other words, if client connects to the server over HTTPS, then the proxy should use "https://" prefix when making requests to content server. Similarly, if client connected over HTTP, then proxy should use "http://" connection to content server.

This logic can be easily expressed by this rewrite rule:

<rule name="Proxy">
  <match url="(.*)" />
  <conditions>
    <add input="{CACHE_URL}" pattern="^(https?)://" />
  </conditions>
  <action type="Rewrite" url="{C:1}://internalserver/{R:1}" />
</rule>

9. Rewrite/Redirect Based on Query String Parameters

When rewriting/redirection decisions are being made by using values extracted from the query string, very often one cannot rely on having the query string parameters always listed in exact same order. So the rewrite rule must be written in such a way so that it can extract the query string parameters independently of their relative order in the query string.

The following rule shows an example of how two different query string parameters are extracted from the query string and then used in the rewritten URL:

<rule name="Query String Rewrite">
  <match url="page\.asp$" />
  <conditions>
    <add input="{QUERY_STRING}" pattern="p1=(\d+)" />
    <add input="##{C:1}##_{QUERY_STRING}" pattern="##([^#]+)##_.*p2=(\d+)" />
  </conditions>
  <action type="Rewrite" url="newpage.aspx?param1={C:1}&amp;param2={C:2}" appendQueryString="false"/>
</rule>

With this rule, when a request is made to page.asp?p2=321&p1=123, it will be rewritten to newpage.aspx?param1=123&param2=321. Parameters p1 and p2 can be in any order in the original query string.
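
For comparison, the order-independent extraction is easy to express with Python's standard `urllib.parse` module. This is a sketch of the same idea and not a translation of the rule: the function name is invented, and the check is stricter than the rule's unanchored patterns because each value must consist of ASCII digits only.

```python
import re
from urllib.parse import parse_qs, urlencode


def rewrite_query(query):
    """Build the rewritten URL from p1 and p2, whatever their order; None if unusable."""
    params = parse_qs(query)
    p1 = params.get("p1", [None])[0]
    p2 = params.get("p2", [None])[0]
    for value in (p1, p2):
        # Both parameters must be present and purely numeric.
        if value is None or not re.fullmatch(r"[0-9]+", value):
            return None
    return "newpage.aspx?" + urlencode({"param1": p1, "param2": p2})


print(rewrite_query("p2=321&p1=123"))  # newpage.aspx?param1=123&param2=321
print(rewrite_query("p1=123&p2=321"))  # newpage.aspx?param1=123&param2=321
print(rewrite_query("p1=123"))         # None
```

Parsing the query string into a dictionary removes the ordering problem altogether, which is why no equivalent of the `##{C:1}##_` back-reference trick is needed here.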

10. Avoid Rewriting of Requests for ASP.NET Web Resources

ASP.NET-based web applications very often make requests to the WebResource.axd handler to retrieve assembly resources and serve them to the Web browser. No such file exists on the server, because ASP.NET generates the content dynamically when WebResource.axd is requested. So if you have a URL rewrite rule that rewrites or redirects only when the requested URL does not correspond to a file or a folder on the web server's file system, that rule may accidentally rewrite requests made to WebResource.axd and thus break your application.

This problem can be easily prevented if you add one extra condition to the rewrite rule:

<rule name="RewriteUserFriendlyURL1" stopProcessing="true">
  <match url="^([^/]+)/?$" />
  <conditions>
    <add input="{REQUEST_FILENAME}" matchType="IsFile" negate="true" />
    <add input="{REQUEST_FILENAME}" matchType="IsDirectory" negate="true" />
    <!-- The following condition prevents rule from rewriting requests to .axd files -->
    <add input="{URL}" negate="true" pattern="\.axd$" />
  </conditions>
  <action type="Rewrite" url="article.aspx?p={R:1}" />
</rule>

Recommended Links

Enforce lowercase URLs using URL- Rewrite - How to configure on IIS 8 in Windows Server 2012 - YouTube

Copyright © 1996-2021 by Softpanorama Society. www.softpanorama.org was initially created as a service to the (now defunct) UN Sustainable Development Networking Programme (SDNP) without any remuneration. This document is an industrial compilation designed and created exclusively for educational use and is distributed under the Softpanorama Content License. Original materials copyright belong to respective owners. Quotes are made for educational purposes only in compliance with the fair use doctrine.

FAIR USE NOTICE This site contains copyrighted material the use of which has not always been specifically authorized by the copyright owner. We are making such material available to advance understanding of computer science, IT technology, economic, scientific, and social issues. We believe this constitutes a 'fair use' of any such copyrighted material as provided by section 107 of the US Copyright Law according to which such material can be distributed without profit exclusively for research and educational purposes.

This is a Spartan WHYFF (We Help You For Free) site written by people for whom English is not a native language. Grammar and spelling errors should be expected. The site contains some broken links as it develops like a living tree...

You can use PayPal to buy a cup of coffee for authors of this site

Disclaimer:

The statements, views and opinions presented on this web page are those of the author (or referenced source) and are not endorsed by, nor do they necessarily reflect, the opinions of the Softpanorama society. We do not warrant the correctness of the information provided or its fitness for any purpose. The site uses AdSense so you need to be aware of Google privacy policy. If you do not want to be tracked by Google please disable Javascript for this site. This site is perfectly usable without Javascript.

Last modified: March, 29, 2020