Magento: How do I determine why an object was removed from the Varnish cache?

Posted by wydwbb8l on 2024-01-08 in Other

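We have a Varnish server running in front of our Magento website. We have found Magento to be rather slow at serving pages, so we want Varnish to serve all category and product pages from the cache.
We have changed the VCL file so that TTL and grace are both set to 365d, to keep pages in memory for as long as possible. We run a cache warmer daily; it reads the sitemap and requests every URL in order to warm the cache. The problem we are seeing is that some pages that are warm when the cache warmer runs (according to the HIT/MISS header) no longer get cache hits when viewed later.
I know there are a few ways an object can be invalidated from the cache: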

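  • PURGE requests (these are disabled in our VCL)
  • Ban requests (we use these to ban category and product pages when stock changes)
  • TTL (our TTL is set to 365d and a full rewarm is done, so objects should not be aging out)
  • Nuked (confirmed via varnishstat that we do not have any n_lru_nuked objects)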

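Is there a way to determine, using the URL of a problematic page, which of the above caused the object to be invalidated? That would help us track down what is invalidating the objects and prevent it from happening.
VCL (if relevant):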

# VCL version 5.0 is not supported so it should be 4.0 even though actually used Varnish version is 5
vcl 4.0;

import std;
# The minimal Varnish version is 5.0
# For SSL offloading, pass the following header in your proxy server or load balancer: 'X-Forwarded-Proto: https'

backend default {
    .host = "127.0.0.1";
    .port = "8181";
    .first_byte_timeout = 600s;
    .probe = {
        .url = "/pub/health_check.php";
        .timeout = 2s;
        .interval = 5s;
        .window = 10;
        .threshold = 5;
   }
}

#acl purge {
#    "127.0.0.1";
#}

sub vcl_recv {
    if (req.method == "PURGE") {
        #if (client.ip !~ purge) {
            return (synth(405, "Method not allowed"));
        #}
        # To use the X-Pool header for purging varnish during automated deployments, make sure the X-Pool header
        # has been added to the response in your backend server config. This is used, for example, by the
        # capistrano-magento2 gem for purging old content from varnish during it's deploy routine.
        #if (!req.http.X-Magento-Tags-Pattern && !req.http.X-Pool) {
        #    return (synth(400, "X-Magento-Tags-Pattern or X-Pool header required"));
        #}
        #if (req.http.X-Magento-Tags-Pattern) {
        #  ban("obj.http.X-Magento-Tags ~ " + req.http.X-Magento-Tags-Pattern);
        #}
        #if (req.http.X-Pool) {
        #  ban("obj.http.X-Pool ~ " + req.http.X-Pool);
        #}
        #return (synth(200, "Purged"));
    }

    if (req.method != "GET" &&
        req.method != "HEAD" &&
        req.method != "PUT" &&
        req.method != "POST" &&
        req.method != "TRACE" &&
        req.method != "OPTIONS" &&
        req.method != "DELETE") {
          /* Non-RFC2616 or CONNECT which is weird. */
          return (pipe);
    }

    # We only deal with GET and HEAD by default
    if (req.method != "GET" && req.method != "HEAD") {
        return (pass);
    }

    # Bypass shopping cart, checkout and search requests
    if (req.url ~ "/checkout" || req.url ~ "/catalogsearch") {
        return (pass);
    }

    # Bypass health check requests
    if (req.url ~ "/pub/health_check.php") {
        return (pass);
    }

    # Set initial grace period usage status
    set req.http.grace = "none";

    # normalize url in case of leading HTTP scheme and domain
    set req.url = regsub(req.url, "^http[s]?://", "");

    # collect all cookies
    std.collect(req.http.Cookie);

    # Compression filter. See https://www.varnish-cache.org/trac/wiki/FAQ/Compression
    if (req.http.Accept-Encoding) {
        if (req.url ~ "\.(jpg|jpeg|png|gif|gz|tgz|bz2|tbz|mp3|ogg|swf|flv)$") {
            # No point in compressing these
            unset req.http.Accept-Encoding;
        } elsif (req.http.Accept-Encoding ~ "gzip") {
            set req.http.Accept-Encoding = "gzip";
        } elsif (req.http.Accept-Encoding ~ "deflate" && req.http.user-agent !~ "MSIE") {
            set req.http.Accept-Encoding = "deflate";
        } else {
            # unknown algorithm
            unset req.http.Accept-Encoding;
        }
    }

    # Remove Google gclid parameters to minimize the cache objects
    set req.url = regsuball(req.url,"\?gclid=[^&]+$",""); # strips when QS = "?gclid=AAA"
    set req.url = regsuball(req.url,"\?gclid=[^&]+&","?"); # strips when QS = "?gclid=AAA&foo=bar"
    set req.url = regsuball(req.url,"&gclid=[^&]+",""); # strips when QS = "?foo=bar&gclid=AAA" or QS = "?foo=bar&gclid=AAA&bar=baz"

    # Static files caching
    if (req.url ~ "^/(pub/)?(media|static)/") {
        # Static files should not be cached by default
        #return (pass);

        # But if you use a few locales and don't use CDN you can enable caching static files by commenting previous line (#return (pass);) and uncommenting next 3 lines
        unset req.http.Https;
        unset req.http.X-Forwarded-Proto;
        unset req.http.Cookie;
    }

    #bypass for elasticsuite trackers
    if(req.url ~ "elasticsuite/tracker"){
        return (pass);
    }

    #bypass api requests
    if(req.url ~ "/rest/"){
        return (pass);
    }

    #bypass sale nav
    if(req.url ~ "saleNavMarkup.php"){
    return (pass);
    }

    return (hash);
}

sub vcl_hash {
#    if (req.http.cookie ~ "X-Magento-Vary=") {
#        hash_data(regsub(req.http.cookie, "^.*?X-Magento-Vary=([^;]+);*.*$", "\1"));
#    }

    # For multi site configurations to not cache each other's content
    if (req.http.host) {
        hash_data(req.http.host);
    } else {
        hash_data(server.ip);
    }

    # To make sure http users don't see ssl warning
    if (req.http.X-Forwarded-Proto) {
        hash_data(req.http.X-Forwarded-Proto);
    }
    
}

sub vcl_backend_response {
    set beresp.ttl = 365d;
    set beresp.grace = 365d;

    if (beresp.http.content-type ~ "text") {
        set beresp.do_esi = true;
    }

    if (bereq.url ~ "\.js$" || beresp.http.content-type ~ "text") {
        set beresp.do_gzip = true;
    }

    if (beresp.http.X-Magento-Debug) {
        set beresp.http.X-Magento-Cache-Control = beresp.http.Cache-Control;
    }

    # cache only successfully responses and 404s
    if (beresp.status != 200 && beresp.status != 404) {
        set beresp.ttl = 0s;
        set beresp.uncacheable = true;
        return (deliver);
    } elsif (beresp.http.Cache-Control ~ "private") {
        set beresp.uncacheable = true;
        set beresp.ttl = 120s;
        return (deliver);
    }

    # validate if we need to cache it and prevent from setting cookie
    if (beresp.ttl > 0s && (bereq.method == "GET" || bereq.method == "HEAD")) {
        unset beresp.http.set-cookie;
    }

   # If page is not cacheable then bypass varnish for 2 minutes as Hit-For-Pass
   if (beresp.ttl <= 0s ||
       beresp.http.Surrogate-control ~ "no-store" ||
       (!beresp.http.Surrogate-Control &&
       beresp.http.Cache-Control ~ "no-cache|no-store") ||
       beresp.http.Vary == "*") {
        # Mark as Hit-For-Pass for the next 2 minutes
        set beresp.ttl = 120s;
        set beresp.uncacheable = true;
    }

    return (deliver);
}

sub vcl_deliver {
    #if (resp.http.X-Magento-Debug) {
        if (resp.http.x-varnish ~ " ") {
            set resp.http.X-Magento-Cache-Debug = "HIT";
            set resp.http.Grace = req.http.grace;
        } else {
            set resp.http.X-Magento-Cache-Debug = "MISS";
        }
    #} else {
    #    unset resp.http.Age;
    #}

    # Not letting browser to cache non-static files.
    if (resp.http.Cache-Control !~ "private" && req.url !~ "^/(pub/)?(media|static)/") {
        set resp.http.Pragma = "no-cache";
        set resp.http.Expires = "-1";
        set resp.http.Cache-Control = "no-store, no-cache, must-revalidate, max-age=0";
    }

    unset resp.http.X-Magento-Debug;
    unset resp.http.X-Magento-Tags;
    unset resp.http.X-Powered-By;
    unset resp.http.Server;
    unset resp.http.X-Varnish;
    unset resp.http.Via;
    unset resp.http.Link;
}

sub vcl_hit {
    if (obj.ttl >= 0s) {
        # Hit within TTL period
        return (deliver);
    }
    if (std.healthy(req.backend_hint)) {
        if (obj.ttl + 300s > 0s) {
            # Hit after TTL expiration, but within grace period
            set req.http.grace = "normal (healthy server)";
            return (deliver);
        } else {
            # Hit after TTL and grace expiration
            return (miss);
        }
    } else {
        # server is not healthy, retrieve from cache
        set req.http.grace = "unlimited (unhealthy server)";
        return (deliver);
    }
}


Answer #1, by z9zf31ra

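Logs are your best friend: varnishlog can answer this question in great detail. The challenge (as always) is reproducing the issue while you are watching the varnishlog output.
Here are some strategies to make that easier.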

Tracking the vxid

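Every transaction has a vxid, a unique transaction ID that is exposed through the X-Varnish response header.
Log entries are identified by the same transaction ID. If you know the value of the X-Varnish header for the miss, you can look it up in the logs.
Suppose you are trying to trace a transaction whose response contained the header X-Varnish: 5. That results in the following varnishlog command: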

sudo varnishlog -d -g request -q "vxid == 5"

As you can see, the -q option filters the logs on the vxid. The -d option dumps what is already in Varnish's shared memory instead of waiting for new input.
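
One practical detail: the VCL in the question unsets resp.http.X-Varnish in vcl_deliver, so that line would have to be commented out temporarily before the header shows up in responses. On a hit, the header carries two IDs separated by a space (the same property the question's vcl_deliver uses to detect a HIT): the vxid of the current request, followed by the vxid of the request that stored the object. The sketch below splits a header value and builds the -q expression. The function names are hypothetical helpers, not part of Varnish.

```shell
#!/bin/sh
# Hypothetical helpers: split an X-Varnish header value into its vxids.
#   "5"           -> miss: only the vxid of this request
#   "32773 32771" -> hit: this request's vxid, then the vxid that stored the object

request_vxid() {
    # The first field is always the vxid of the current request.
    printf '%s\n' "$1" | awk '{ print $1 }'
}

stored_by_vxid() {
    # The second field only exists on a hit; prints nothing for a miss.
    printf '%s\n' "$1" | awk 'NF > 1 { print $2 }'
}

vxid_query() {
    # Build the expression passed to varnishlog -q.
    printf 'vxid == %s' "$(request_vxid "$1")"
}

# Usage (needs a running Varnish, so it is left commented out):
# sudo varnishlog -d -g request -q "$(vxid_query "5")"
```

Comparing the second ID across two hits on the same URL is a cheap way to tell whether the object was re-fetched in between: if it changed, something invalidated the earlier copy.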

Logs are stored in memory

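However, because the Varnish Shared Memory Log (VSL) is kept in memory to avoid a performance hit, you have to be lucky that the transaction you are looking for has not been overwritten yet.
There are two ways to work around this potential limitation:
1. Increase the size of the in-memory VSL buffer
2. Log to disk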

Increasing the VSL buffer

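The VSL buffer can be enlarged through varnishadm. By default, the vsl_space runtime parameter is set to 80 MB. To increase it temporarily to 500 MB, run the following commands on the Varnish server: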

sudo varnishadm param.set vsl_space 500M
sudo varnishadm stop
sudo varnishadm start


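The child process has to be restarted for the new vsl_space value to take effect. Restarting varnishd itself will undo this setting.
The stop and start commands will empty the cache. Keep that in mind.
Run the following command to verify the change: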

sudo varnishadm param.show vsl_space
vsl_space
        Value is: 500M [bytes]
        Default is: 80M
        Minimum is: 1M
        Maximum is: 4G

        The amount of space to allocate for the VSL fifo buffer in the
        VSM memory segment.  If you make this too small,
        varnish{ncsa|log} etc will not be able to keep up.  Making it
        too large just costs memory resources.

        NB: This parameter will not take any effect until the child
        process has been restarted.

Logging to disk

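The other option is to store the logs on disk. This captures more of the relevant transactions, but it comes with risks:

  • You can easily run out of disk space
  • Writing the logs to disk may degrade performance

If you do want to write the logs to disk, I recommend storing them in binary format so that you can replay them with varnishlog later.
Here is the command you need: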

sudo varnishlog -g session -a -w /var/log/vsl.log


If you already know which URL you want to filter on, it helps to include a URL filter in the varnishlog command. Here is an example for the homepage:

sudo varnishlog -g session -q "ReqUrl eq '/'" -a -w /var/log/vsl.log


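The URL filter greatly reduces the load on the server and makes the whole exercise much more lightweight.
The logs can then be replayed in varnishlog with the following command: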

sudo varnishlog -g request -r /var/log/vsl.log

Understanding what caused the cache miss

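A number of VSL tags can expose the problem you are facing. The full list is available at https://varnish-cache.org/docs/trunk/reference/vsl.html.
In your case, I suggest paying close attention to the following: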

  • ExpBan
  • ExpKill
  • Hit
  • HitMiss
  • HitPass
  • TTL
  • VCL_call
  • VCL_return

Next steps

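Once you have captured the full log of a transaction that ends in a miss, don't hesitate to add the complete varnishlog output for that transaction to your question.
When you have that information, I will help you debug it.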
