.htaccess: How to block Yandex

Posted by pinkon5k on 2022-11-16 in Other


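I am trying to block Yandex from my website. I have tried the solutions posted in other threads, but they do not work, so I am wondering whether I am doing something wrong.
The user agent string is: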

Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)

I have tried the following (one at a time). RewriteEngine is on.

SetEnvIfNoCase User-Agent "^yandex.com$" bad_bot_block
Order Allow,Deny
Deny from env=bad_bot_block
Allow from ALL

SetEnvIfNoCase User-Agent "^yandex.com$" bad_bot_block
<RequireAll>
    Require all granted
    Require not env bad_bot_block
</RequireAll>

Can anyone see why the above does not work, or suggest something else?

.htaccess

Source: https://stackoverflow.com/questions/73125671/how-to-block-yandex

2 Answers


#1 polhcujo

In case anyone else has this problem, the following worked for me:

RewriteCond %{HTTP_USER_AGENT} ^.*(yandex).*$ [NC]
RewriteRule .* - [F,L]
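For reference, a slightly tighter form of the same rule, offered only as a sketch. The condition pattern is unanchored anyway, so the leading and trailing .* and the capture group can be dropped, and the F flag already implies L. Matching on "YandexBot" instead of "yandex" is an assumption based on the User-Agent string quoted in the question, and the block assumes mod_rewrite is enabled.

```apache
RewriteEngine On

# Respond with 403 Forbidden to any request whose User-Agent
# contains "YandexBot" (case-insensitive because of [NC]).
RewriteCond %{HTTP_USER_AGENT} YandexBot [NC]
RewriteRule ^ - [F]
```

The RewriteEngine On line is harmless if the engine is already enabled elsewhere in the same file.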

2022-11-16

#2 yc0p9oo0

SetEnvIfNoCase User-Agent "^yandex.com$" bad_bot_block

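Because of the start-of-string and end-of-string anchors in the regex, you are essentially checking that the User-Agent string is exactly equal to "yandex.com" (except that the unescaped . matches any character), which clearly does not match the stated User-Agent string.
You need to check that the User-Agent header contains "YandexBot" (or "yandex.com"). You can also use a case-sensitive match here, since the real Yandex bot does not vary the case.
For example, try the following instead: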

SetEnvIf User-Agent "YandexBot" bad_bot_block

Consider using the BrowserMatch directive instead, which is a shortcut for SetEnvIf User-Agent.
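As a rough sketch, the BrowserMatch equivalent of the SetEnvIf line above would look something like this. BrowserMatch is provided by mod_setenvif, and bad_bot_block is simply the variable name used in the question.

```apache
# Same effect as: SetEnvIf User-Agent "YandexBot" bad_bot_block
BrowserMatch "YandexBot" bad_bot_block
```

BrowserMatchNoCase is the case-insensitive counterpart, should the match ever need to ignore case.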
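If you are on Apache 2.4, you should use the Require variant (the second of your two code blocks). Order, Deny and Allow are Apache 2.2 directives; they are deprecated in Apache 2.4, where they remain available only for compatibility through mod_access_compat.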
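Putting the pieces together, a minimal sketch of the Apache 2.4 variant with the corrected pattern might look like the following. It assumes mod_setenvif and mod_authz_core are loaded and that the server's AllowOverride setting permits these directives in .htaccess.

```apache
# Flag any request whose User-Agent header contains "YandexBot".
SetEnvIf User-Agent "YandexBot" bad_bot_block

# Grant access to everyone except requests carrying the flag.
<RequireAll>
    Require all granted
    Require not env bad_bot_block
</RequireAll>
```

Flagged requests receive a 403 Forbidden response; all other requests are unaffected.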
However, consider using robots.txt instead, to block crawling in the first place. Yandex supposedly supports robots.txt.

2022-11-16
