袁来如此's Work Notes

A Detailed Guide to the .htaccess File and Setting Up a 404 Page



Open Notepad and enter the following code:

ErrorDocument 404 /404.html

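Save it as a file named .htaccess and upload it to the root directory of your website.


/404.html is the path (directory and file name) of the error page; change it to match your own file.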

QUOTE:


The .htaccess file is extremely useful


• Part 1 – Introduction

• Part 2 - .htaccess Commands

• Part 3 - Password Protection


Part 1 – Introduction


Introduction


In this tutorial you will find out about the .htaccess file and the power it has to improve your website. Although .htaccess is only a file, it can change settings on the server and allow you to do many different things, the most popular being custom 404 error pages. .htaccess isn't difficult to use and is really just made up of a few simple instructions in a text file.


Will My Host Support It?


This is probably the hardest question to give a simple answer to. Many hosts support .htaccess but don't actually publicise it, and many other hosts have the capability but do not allow their users to have a .htaccess file. As a general rule, .htaccess is a feature of the Apache web server, so if your host runs Apache (usually on Unix or Linux) the server can support it, although your host may not allow you to use it.


A good sign of whether your host allows .htaccess files is whether they support password protection of folders. To do this they will need to offer .htaccess (although in a few cases they will offer password protection but not let you use .htaccess). If you are unsure, the best thing to do is either to upload your own .htaccess file and see if it works, or to e-mail your web host and ask them.


What Can I Do?


You may be wondering what .htaccess can do, or you may have read about some of its uses but not realise how many things you can actually do with it.


There is a huge range of things .htaccess can do, including: password protecting folders, redirecting users automatically, custom error pages, changing your file extensions, banning users with certain IP addresses, only allowing users with certain IP addresses, stopping directory listings and using a different file as the index file.


Creating A .htaccess File


Creating a .htaccess file may cause you a few problems. Writing the file is easy: you just need to enter the appropriate code into a text editor (like Notepad). You may, however, run into problems when saving the file. Because .htaccess is a strange file name (the file actually has no name, only an 8-letter file extension), it may not be accepted on certain systems (e.g. Windows 3.1). With most operating systems, though, all you need to do is to save the file by entering the name as:


".htaccess"

(including the quotes). If this doesn't work, you will need to name it something else (e.g. htaccess.txt) and then upload it to the server. Once you have uploaded the file you can rename it using an FTP program.


Warning


Before you begin using .htaccess, I should give you one warning. Using .htaccess on your server is unlikely to cause you lasting problems, but a mistake in the file usually makes the server answer with a 500 Internal Server Error for that directory until you fix or remove the file. You should also be wary if you are using the Microsoft FrontPage Extensions. The FrontPage extensions use the .htaccess file, so you should not really edit it to add your own information. If you do want to (this is not recommended, but possible), you should download the .htaccess file from your server first (if it exists) and then add your code to the beginning.


Custom Error Pages


The first use of the .htaccess file which I will cover is custom error pages. These allow you to have your own, personal error pages (for example when a file is not found) instead of using your host's error pages or having no page at all. This will make your site seem much more professional in the unlikely event of an error. It will also allow you to create scripts to notify you if there is an error (for example, I use a PHP script on Free Webmaster Help to automatically e-mail me when a page is not found).


You can use custom error pages for any error as long as you know its number (like 404 for page not found) by adding the following to your .htaccess file:


ErrorDocument errornumber /file.html

For example, if I had the file notfound.html in the root directory of my site and I wanted to use it for a 404 error, I would use:


ErrorDocument 404 /notfound.html

If the file is not in the root directory of your site, you just need to put the path to it:


ErrorDocument 500 /errorpages/500.html

These are some of the most common errors:


400 - Bad Request

401 - Authorization Required

403 - Forbidden

404 - Not Found

500 - Internal Server Error

Then, all you need to do is to create a file to display when the error happens and upload it together with the .htaccess file.



Part 2 - .htaccess Commands


Introduction


In the last part I introduced you to .htaccess and some of its useful features. In this part I will show you how to use the .htaccess file to implement some of these.


Stop A Directory Index From Being Shown


Sometimes, for one reason or another, you will have no index file in your directory. This will, of course, mean that if someone types the directory name into their browser, a full listing of all the files in that directory will be shown. This could be a security risk for your site.


To prevent this (without creating lots of new 'index' files), you can enter a command into your .htaccess file to stop the directory list from being shown:


Options -Indexes

Deny/Allow Certain IP Addresses


In some situations, you may want to only allow people with specific IP addresses to access your site (for example, only allowing people using a particular ISP to get into a certain directory), or you may want to ban certain IP addresses (for example, keeping disruptive members out of your message boards). Of course, this will only work if you know the IP addresses you want to ban and, as most people on the internet now have a dynamic IP address, this is not always the best way to limit usage.


You can block an IP address by using:


deny from 000.000.000.000

where 000.000.000.000 is the IP address. If you only specify 1 or 2 of the groups of numbers, you will block a whole range.


You can allow an IP address by using:


allow from 000.000.000.000

where 000.000.000.000 is the IP address. If you only specify 1 or 2 of the groups of numbers, you will allow a whole range.
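To make the "whole range" behaviour concrete, here is a small Python sketch that imitates how a partial address such as 192.0 covers every address beginning with those groups. It is only a model of the matching rule described above, not Apache's own code; the function names and the sample addresses (taken from the documentation-only ranges) are assumptions made for illustration.

```python
def matches_partial_ip(client_ip: str, rule: str) -> bool:
    """Return True if client_ip is covered by a full or partial dotted IPv4 rule."""
    client_groups = client_ip.split(".")
    rule_groups = rule.rstrip(".").split(".")
    # A rule with fewer than four groups is compared with the leading groups only,
    # so "192.0" covers 192.0.x.x but "19" does not cover 192.x.x.x.
    return client_groups[: len(rule_groups)] == rule_groups


def is_denied(client_ip: str, deny_rules: list[str]) -> bool:
    """Return True if any (hypothetical) deny rule covers client_ip."""
    return any(matches_partial_ip(client_ip, rule) for rule in deny_rules)
```

With these definitions, is_denied("192.0.2.15", ["192.0"]) is true while is_denied("198.51.100.7", ["192.0"]) is false. Groups are compared whole, which is why a short rule never matches by accident on a longer number.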


If you want to deny everyone from accessing a directory, you can use:


deny from all

but this only blocks requests made over the web; scripts running on the server can still use the files in the directory.


Alternative Index Files


You may not always want to use index.htm or index.html as the index file for a directory. For example, if you are using PHP files in your site, you may want index.php to be the index file for a directory. You are not limited to 'index' files though. Using .htaccess you can set foofoo.blah to be your index file if you want to!


Alternate index files are entered in a list. The server works from left to right, checking whether each file exists; if none of them exists, it will display a directory listing (unless, of course, you have turned this off).


DirectoryIndex index.php index.php3 messagebrd.pl index.html index.htm
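The left-to-right lookup can be sketched in a few lines of Python. This is only an illustration of the order of checks described above, not how the server itself is implemented; the function name resolve_directory_index is made up for the example.

```python
from pathlib import Path
from typing import Optional


def resolve_directory_index(directory: Path, candidates: list[str]) -> Optional[str]:
    """Return the first candidate that exists as a file in directory, else None."""
    for name in candidates:  # same order as on the DirectoryIndex line
        if (directory / name).is_file():
            return name
    # No candidate exists: the server would fall back to a directory listing
    # (or refuse the request if listings are turned off).
    return None
```

Called with the list from the line above, a directory that contains both index.html and index.htm resolves to index.html, because it appears earlier in the list.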

Redirection


One of the most useful functions of the .htaccess file is to redirect requests to different files, either on the same server or on a completely different web site. It can be extremely useful if you change the name of one of your files but want users to still be able to find it. Another use (which I find very useful) is to redirect to a longer URL; for example, in my newsletters I can use a very short URL for my affiliate links. The following can be done to redirect a specific file:


Redirect /location/from/root/file.ext http://www.othersite.com/new/file/location.xyz

In the example above, a file called oldfile.html in the root directory would be entered as:


/oldfile.html

and a file in the old subdirectory would be entered as:


/old/oldfile.html

You can also redirect whole directories of your site using the .htaccess file. For example, if you had a directory called olddirectory on your site and you had set up the same files on a new site at http://www.newsite.com/newdirectory/, you could redirect all the files in that directory without having to specify each one:


Redirect /olddirectory http://www.newsite.com/newdirectory

Then, any request to your site below /olddirectory will be redirected to the new site, with the extra information in the URL added on. For example, if someone typed in:



They would be redirected to:



This can prove to be extremely powerful if used correctly.
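The way the rest of the URL is carried over can be modelled with a short Python function. This is a simplified sketch that assumes neither the old prefix nor the new base ends in a slash; apply_redirect and the sample path oldfiles/images/image.gif are hypothetical and used only for illustration.

```python
from typing import Optional


def apply_redirect(request_path: str, old_prefix: str, new_base: str) -> Optional[str]:
    """Map request_path to its redirect target, or None if the rule does not apply."""
    # The rule covers the prefix itself and everything below it,
    # but not a sibling such as /olddirectory2.
    if request_path == old_prefix or request_path.startswith(old_prefix + "/"):
        # Whatever follows the prefix is appended to the new location unchanged.
        return new_base + request_path[len(old_prefix):]
    return None
```

Under this model, a request for /olddirectory/oldfiles/images/image.gif with the rule above maps to http://www.newsite.com/newdirectory/oldfiles/images/image.gif. Note that Redirect without a status argument answers with a temporary (302) redirect; writing Redirect permanent or Redirect 301 makes it a permanent one.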


Part 3 – Password Protection


Introduction


Although there are many uses of the .htaccess file, by far the most popular, and probably the most useful, is being able to reliably password protect directories on websites. Although JavaScript etc. can also be used to do this, such checks run in the visitor's browser and can be bypassed; .htaccess protection is enforced by the server, so someone must know the password to get into the directory and there are no 'back doors'.


The .htaccess File


Adding password protection to a directory using .htaccess takes two stages. The first is to add the appropriate lines to the .htaccess file in the directory you would like to protect. Everything below this directory will be password protected:


AuthName "Section Name"

AuthType Basic

AuthUserFile /full/path/to/.htpasswd

Require valid-user

There are a few parts of this which you will need to change for your site. You should replace "Section Name" with the name of the part of the site you are protecting, e.g. "Members Area".


The /full/path/to/.htpasswd should be changed to reflect the full server path to the .htpasswd file (more on this later). If you do not know what the full path to your webspace is, contact your system administrator for details.


The .htpasswd File


Password protecting a directory takes a little more work than any of the other .htaccess functions because you must also create a file to contain the usernames and passwords which are allowed to access the site. These should be placed in a file which (by default) should be called .htpasswd. Like the .htaccess file, this is a file with no name and an 8-letter extension. It can be placed anywhere within your website (as the passwords are encrypted), but it is advisable to store it outside the web root so that it is impossible to access it from the web.


Entering Usernames And Passwords


Once you have created your .htpasswd file (you can do this in a standard text editor), you must enter the usernames and passwords to access the site. They should be entered as follows:


username:password

where the password is the encrypted format of the password. To encrypt the password you will need to either use one of the premade scripts available on the web or write your own. There is a good username/password service at the KxS site which lets you enter the user name and password and outputs them in the correct format.


For multiple users, just add extra lines to your .htpasswd file in the same format as the first. There are even scripts available for free which will manage the .htpasswd file and allow automatic adding/removing of users etc.
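If you would rather write your own script, the following Python sketch builds one username:password line using the {SHA} scheme that Apache accepts. This scheme is chosen here only because it needs nothing beyond the standard library; it is an unsalted SHA-1 hash and therefore weak, so the htpasswd utility shipped with Apache (for example with its bcrypt option, -B) is the better choice where available. The function name htpasswd_sha1_line is made up for the example.

```python
import base64
import hashlib


def htpasswd_sha1_line(username: str, password: str) -> str:
    """Build one .htpasswd entry in Apache's {SHA} format (unsalted SHA-1, weak)."""
    if ":" in username:
        # The colon separates the two fields, so it cannot be part of the name.
        raise ValueError("usernames in .htpasswd must not contain ':'")
    digest = hashlib.sha1(password.encode("utf-8")).digest()
    encoded = base64.b64encode(digest).decode("ascii")
    return f"{username}:{{SHA}}{encoded}"
```

Each line returned by the function can be written to the .htpasswd file as is, one user per line.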


Accessing The Site


When you try to access a site which has been protected by .htaccess, your browser will pop up a standard username/password dialog box. If you don't like this, there are certain scripts available which allow you to embed a username/password box in a web page to do the authentication. You can also send the username and password (unencrypted) in the URL as follows:


http://username:password@www.website.com/directory/
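Whichever way the credentials are entered, with AuthType Basic the browser sends them in an Authorization header that is merely Base64-encoded, not encrypted. The Python sketch below shows the encoding and how easily it is reversed; the function names are made up for the example.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Return the Authorization header value a client sends for Basic authentication."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def decode_basic_auth(header_value: str) -> tuple[str, str]:
    """Recover (username, password) from a Basic Authorization header value."""
    scheme, _, token = header_value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic authorization header")
    # Only the first colon separates the fields; the password may contain colons.
    username, _, password = base64.b64decode(token).decode("utf-8").partition(":")
    return username, password
```

For the URL above, basic_auth_header("username", "password") gives "Basic dXNlcm5hbWU6cGFzc3dvcmQ=", and decode_basic_auth turns it straight back into the original pair. Anyone who can read the traffic can do the same, so it is sensible to serve the protected directory over HTTPS.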

Summary


.htaccess is one of the most useful files a webmaster can use. It has a wide variety of uses which can save time and increase security on your website.

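SEO and How to Handle 404 Errors

In the article "Using HTTP Status Codes to See How Search Engine Spiders Crawl Your Site", I introduced some frequently encountered HTTP status codes and their meanings. The ones most often discussed, and most relevant to this article, are: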


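404: the server cannot find the requested resource; the requested page does not exist (for example, the page was deleted or moved, although the link may become valid again later);

410: the requested page does not exist (note: 410 signals that the removal is permanent, whereas 404 leaves open the possibility that it is temporary);

200: the server successfully returned the requested page;

301: the URL has been permanently redirected;

302: the URL has been temporarily redirected.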


Note: most search engines, Google among them, treat the 404 and 410 statuses the same way. (See Matt Cutts's explanation.)


Understanding the HTTP 404 status code


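An HTTP 404 error means that the page a link points to does not exist, that is, the original URL is no longer valid. This happens often and is hard to avoid: the rules for generating page URLs change, a page file is renamed or moved, an inbound link is misspelled, and so on, all of which leave the original URL unreachable. When the web server receives such a request, it returns a 404 status code to tell the browser that the requested resource does not exist. However, the default 404 error pages of web servers, whether Apache or IIS, are crude, rigid and unfriendly to users. They give visitors no information to help them find what they were looking for, which inevitably drives users away.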


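For this reason, many sites use a custom 404 error page to improve the user experience and avoid losing visitors. The usual approach is to put quick navigation links, a search box and the site's featured services on the page, which helps users find their way around the site and get the information they need.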


The impact of HTTP 404 on SEO


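A custom 404 error page is a good way to improve the user experience, but its effect on search engines is often overlooked in practice. For example, a faulty server-side configuration may return a 200 status code, or a custom 404 page that relies on Meta Refresh may end up returning a 302 status code. A correctly configured custom 404 error page should not only display properly but also return the 404 status code rather than 200 or 302. To a visiting user it makes no difference whether the HTTP status code is 404 or 200, but to a search engine it matters a great deal.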


(1) The custom 404 error page returns a 200 status code


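When a search engine spider requests a URL and gets a 404 response, it knows that the URL is no longer valid, stops indexing the page, and reports back to the data center so that the page is removed from the index database (the removal itself may take a long time). When the search engine gets a 200 response, however, it considers the URL valid, indexes it and adds it to the index database. The result is that different URLs end up with exactly the same content, namely the content of the custom 404 error page, which creates a duplicate-content problem. With search engines, and Google in particular, such a site not only finds it hard to earn TrustRank but also sees Google's assessment of its quality drop considerably. (Why would a 200 status code be returned at all? See "Basic principles for custom 404 error pages" below.)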


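I have been using Google Sitemap for some time. When we submit a sitemap file in XML format, Google verifies our identity to make sure we are the legitimate administrator of the site. There are two ways to verify: upload an HTML page with a specified name to the site's root directory, or add an identifying meta tag to the head of a page. I usually upload the HTML page, but Google told me it could not find the page in the site's root directory (even though I had uploaded it and could open it in a browser). This is an alarming problem (see figure):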


(2) The custom 404 error page uses Meta Refresh and returns a 302 status code


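Many sites' custom 404 error pages take a form like this: they first display an error message and then use Meta Refresh to send the visitor to the home page, the sitemap or a similar page. Depending on how it is implemented, such a 404 page may return a 200 status code or a 302, but from an SEO standpoint neither is an appropriate choice.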


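We have already covered the 200 case. So how do search engines treat a 404 page that returns 302? In theory, a search engine takes a 302 status to mean that the page exists but has temporarily changed its address, so it still indexes the page, and the same duplicate-content problem arises as with a 200 status code. In addition, the major search engines, led by Google, are becoming ever stricter about where 302 redirects are appropriate, so misusing 302 redirects in this way carries considerable risk.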


Make sure the custom 404 error page returns a 404 status code


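Once the custom 404 error page is set up, be sure to check that it really returns a 404 status code. You can use a server header checking tool: enter the URL of a page that does not exist and look at the HTTP headers that come back, making sure the response is "404 Not Found".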
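If no header checking tool is at hand, the same check can be sketched in Python with the standard library. The sketch assumes you pass it a URL on your own site that you know does not exist; fetch_status and judge_missing_page_status are names made up for this example, and the verdict strings simply restate the cases discussed above.

```python
import http.client
from urllib.parse import urlsplit


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the raw HTTP status for url; http.client does not follow redirects."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("GET", path)
        return conn.getresponse().status
    finally:
        conn.close()


def judge_missing_page_status(status: int) -> str:
    """Classify the status a server returned for a page known not to exist."""
    if status in (404, 410):
        return "ok: search engines will treat the URL as gone"
    if status == 200:
        return "problem: the error page is served as a normal page (duplicate content)"
    if status in (301, 302, 303, 307, 308):
        return "problem: the error page is reached through a redirect"
    return "unexpected: check the server configuration"
```

A typical use would be judge_missing_page_status(fetch_status(url)), where url points at a made-up page name on the site being checked.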



How to handle 404 errors

(1) Basic principles for custom 404 error pages


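First of all, 404 errors should be handled at the server level, not the page level. When a dynamic page such as a PHP script is used as the custom 404 page, you must make sure that the server has already sent the 404 status code before PHP runs; otherwise, once execution reaches the ISAPI level, the status code returned can only be 200 or a redirect status such as 302.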


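Second, when customizing the site's 404 error page, the URL of the error page should be given as a site-local path (such as /notfound.php) rather than a full absolute URL, and the custom 404 page should be placed in the site's root directory. Invalid links can come in many forms, but whenever a 404 error occurs the web server automatically serves the custom 404 error page, regardless of what the URL looks like.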


(2) Setting up a 404 error page under Apache


Setting up a 404 error page for Apache Server is simple: just add the following to the .htaccess file:


ErrorDocument 404 /notfound.php


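Note:

1. Never redirect 404 errors to the site's home page; doing so may cause the home page to disappear from search engines.

2. Never use an absolute URL (for example, of the form http://www.bloghuman.com/nofound.php); with an absolute URL the status codes returned are 302 followed by 200 (tested).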
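The second point can be checked mechanically. The Python sketch below scans the text of an .htaccess file and reports every ErrorDocument line whose target is a full URL, since Apache answers those with a redirect instead of the original error status. The function name and the sample domain www.example.com are assumptions made for illustration.

```python
import re

# Matches lines such as: ErrorDocument 404 /notfound.php
ERROR_DOCUMENT = re.compile(r"^\s*ErrorDocument\s+(\d{3})\s+(\S.*?)\s*$", re.IGNORECASE)


def find_absolute_error_documents(htaccess_text: str) -> list[tuple[int, str]]:
    """Return (status, target) pairs whose target is a full URL, not a local path."""
    flagged = []
    for line in htaccess_text.splitlines():
        match = ERROR_DOCUMENT.match(line)  # commented-out lines do not match
        if match and re.match(r"https?://", match.group(2), re.IGNORECASE):
            flagged.append((int(match.group(1)), match.group(2)))
    return flagged
```

An empty result means every error page is referenced by a local path, which is what the note above recommends.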



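(3) Setting up a 404 error page under IIS/ASP.net


First, change the settings in the application's root directory: open the "web.config" file for editing and add the following: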

<configuration>

<system.web>

<customErrors mode="On" defaultRedirect="error.asp">

<error statusCode="404" redirect="notfound.asp" />

</customErrors>

</system.web>

</configuration>


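Note: in the example above, "error.asp" is the default error page, used for errors that have no entry of their own, and "notfound.asp" is the custom 404 page; replace the file names with your own.


Then add the following to the custom 404 page "notfound.asp":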


<%

Response.Status = "404 Not Found"

%>


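This ensures that IIS returns the 404 status code correctly.


(4) Setting up a static 404 page under IIS/ASP.net


Setting up a static 404 error page is simpler: in IIS Manager, right-click the site you want to manage, open the "Custom Errors" tab under "Properties", and assign the appropriate error page to "404".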

