{"id":378,"date":"2010-05-11T13:03:42","date_gmt":"2010-05-11T11:03:42","guid":{"rendered":"http:\/\/www.shukko.com\/x3\/?p=378"},"modified":"2010-05-11T13:03:42","modified_gmt":"2010-05-11T11:03:42","slug":"nginx-caching-proxy","status":"publish","type":"post","link":"https:\/\/www.shukko.com\/x3\/2010\/05\/11\/nginx-caching-proxy\/","title":{"rendered":"Nginx: Caching Proxy"},"content":{"rendered":"<h1>TAKEN FROM: http:\/\/www.rfxn.com\/nginx-caching-proxy\/<\/h1>\n<h1>Nginx: Caching Proxy<\/h1>\n<div>\n<p>Recently I started to tackle a load problem on one of my  personal sites, the issue was that of a poorly written but exceedingly  MySQL heavy application and the load it would induce on the SQL server  when 400-500 people were hammering the site at once. Further compounding  this was Apache\u2019s horrible ability to gracefully handle excessive  requests on object heavy pages (i.e: images). This left me with a site  that was almost unusable during peak hours \u2014 or worse \u2014 would crash the  MySQL server and take Apache with it by frenzied F5ing from users.<\/p>\n<p>I went through all the usual rituals in an effort to better the  situation, from PHP APC then Eaccelerator, to mod_proxy+mod_cache, to  tuning Apache timeouts\/prefork settings and adjusting MySQL cache\/buffer  options. The extreme was setting up a MySQL replication cluster with  MySQL-Proxy doing RW splitting\/load balancing across the cluster and  memcached, but this quickly turned into a beast to manage and memcached  was eating memory at phenomenal rates.<\/p>\n<p>Although I did improve things a bit, I had done so at the expense of  vastly increased hardware demand and complexity. However, the site was  still choking during peak hours and in a situation where switching  applications and\/or getting it reprogrammed is not at all an option, I  had to start thinking outside the box or more to the point, outside  Apache.<\/p>\n<p>I have experience with lighttpd and pound reverse proxy, they are  both phenomenal applications but neither directly handles caching in a  graceful fashion (in pounds case not at all). This is when I took a look  a nginx which to date I had never tried but heard many great things  about. I fired up a new Xen guest running CentOS 5.4, 2GB RAM &amp; 2  CPU cores\u2026.. an hour later I had nginx installed, configured and  proxy-caching traffic for the site in question.<\/p>\n<p><strong>The impact was immediate and significant<\/strong> \u2014 the SQL  server loads dropped from an average of 4-5 down to 0.5-1.0 and the web  server loads were near non-existent from previously being on the brink  of crashing every afternoon.<\/p>\n<p><strong>Enough with my ramblings, lets get into nginx<\/strong>. 
You can download the latest release from <a href="http://nginx.org/">http://nginx.org</a>, and although I could not find a binary version of it, compiling was straightforward with no real issues.</p>
<p>First up, we need to satisfy some requirements for the configure options we will be using. I encourage you to look at the './configure --help' list of available options, as there are some nice features at your disposal.</p>
<pre>yum install -y zlib zlib-devel openssl-devel gd gd-devel pcre pcre-devel</pre>
<p>Once the above packages are installed, we are good to go with downloading and compiling the latest version of nginx:</p>
<pre>wget http://nginx.org/download/nginx-0.8.36.tar.gz
tar xvfz nginx-0.8.36.tar.gz
cd nginx-0.8.36/
./configure --with-http_ssl_module --with-http_realip_module --with-http_addition_module --with-http_image_filter_module --with-http_gzip_static_module
make &amp;&amp; make install</pre>
<p>This will install nginx into '/usr/local/nginx'; if you would like to relocate it, you can use '--prefix=/path' in the configure options. The path layout for nginx is very straightforward; for the purpose of this post we are assuming the defaults:</p>
<pre>[root@atlas ~]# ls /usr/local/nginx
conf  fastcgi_temp  html  logs  sbin

[root@atlas nginx]# cd /usr/local/nginx

[root@atlas nginx]# ls conf/
fastcgi.conf  fastcgi.conf.default  fastcgi_params  fastcgi_params.default  koi-utf  koi-win  mime.types  mime.types.default  nginx.conf  nginx.conf.default  win-utf
</pre>
<p>The layout will be very familiar to anyone that has worked with Apache, and true to that, nginx breaks the configuration down into a global set of options and then the individual web site virtual host options.</p>
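<p>Before touching the configuration, it is worth confirming the binary was actually built with the modules you asked for and that the shipped config parses cleanly; a quick sanity check using the default install path:</p>
<pre># show the version and the configure arguments the binary was built with
/usr/local/nginx/sbin/nginx -V

# test the active configuration file for syntax errors without starting the server
/usr/local/nginx/sbin/nginx -t</pre>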
<p>The 'conf/' folder might look a little intimidating, but you only need to be concerned with the nginx.conf file, which we are going to go ahead and overwrite; a copy of the defaults is already saved for you as nginx.conf.default.</p>
<p>My nginx configuration file is available at <a href="http://www.rfxn.com/downloads/nginx.conf.atlas">http://www.rfxn.com/downloads/nginx.conf.atlas</a>; be sure to rename it to nginx.conf, or copy the contents listed below into 'conf/nginx.conf':</p>
<pre>user  nobody nobody;

worker_processes     4;
worker_rlimit_nofile 8192;

pid /var/run/nginx.pid;

events {
  worker_connections 2048;
}

http {
    include       mime.types;
    default_type  application/octet-stream;

    log_format main '$remote_addr - $remote_user [$time_local] '
                    '"$request" $status  $body_bytes_sent "$http_referer" '
                    '"$http_user_agent" "$http_x_forwarded_for"';

    access_log  logs/nginx_access.log  main;
    error_log  logs/nginx_error.log debug;

    server_names_hash_bucket_size 64;
    sendfile on;
    tcp_nopush     on;
    tcp_nodelay    off;
    keepalive_timeout  30;

    gzip  on;
    gzip_comp_level 9;
    gzip_proxied any;

    proxy_buffering on;
    proxy_cache_path /usr/local/nginx/proxy levels=1:2 keys_zone=one:15m inactive=7d max_size=1000m;
    proxy_buffer_size 4k;
    proxy_buffers 100 8k;
    proxy_connect_timeout      60;
    proxy_send_timeout         60;
    proxy_read_timeout         60;

    include /usr/local/nginx/vhosts/*.conf;
}
</pre>
<p>Let's take a moment to review some of the more important options in nginx.conf before we move along...</p>
<p><strong>user nobody nobody;</strong><br />
If you are running this on a server with an Apache install or other software using the user 'nobody', it might be wise to create a user specifically for nginx (i.e. <strong>useradd nginx -d /usr/local/nginx -s /bin/false</strong>).</p>
<p><strong>worker_processes     4;</strong><br />
This should reflect the number of CPU cores, which you can find out by running '<strong>cat /proc/cpuinfo | grep processor</strong>'. I recommend a setting of at least 2 but no more than 6; nginx is VERY efficient.</p>
<p><strong>proxy_cache_path /usr/local/nginx/proxy ... inactive=7d max_size=1000m;</strong><br />
The 'inactive' option is the maximum age of content in the cache path, and 'max_size' is the maximum on-disk size of the cache path. If you are serving up lots of object-heavy content such as images, you are going to want to increase these.</p>
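<p>As an aside on the 'levels=1:2' part of proxy_cache_path: nginx names each cached file after an MD5 hash of its cache key and spreads the files across two directory levels (one character, then two) so no single directory grows huge. A rough illustration of what the populated cache tree looks like, using a made-up hash:</p>
<pre># a cached response, filed under the trailing characters of its MD5 hash
ls /usr/local/nginx/proxy/c/29/
b7f54b2df7773722d382f4809d65029c

# current on-disk size of the cache, bounded by max_size=1000m
du -sh /usr/local/nginx/proxy</pre>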
<p><strong>proxy_send|read_timeout     60;</strong><br />
These timeout values are important: if you run any scripts through admin interfaces or other maintenance URLs, these values will cause the proxy to time them out. That said, increase them to sane values as appropriate; anything more than 300 is probably excessive, and you should consider running such tasks from cronjobs instead.</p>
<p><strong>Apache-style MaxClients</strong><br />
Finally, the maximum number of connections, or MaxClients, that nginx can accept is determined by <strong>worker_processes * worker_connections / 2</strong> (2 fd per proxied session), which works out to <strong>4 * 2048 / 2 = 4096 MaxClients</strong> in our configuration.</p>
<p>Moving along, we need to create the paths we defined in our configuration: the content caching folder, the folder where we will create our vhosts, and nginx's temporary working directories.</p>
<pre>mkdir /usr/local/nginx/proxy /usr/local/nginx/vhosts /usr/local/nginx/client_body_temp /usr/local/nginx/fastcgi_temp /usr/local/nginx/proxy_temp

chown nobody.nobody /usr/local/nginx/proxy /usr/local/nginx/vhosts /usr/local/nginx/client_body_temp /usr/local/nginx/fastcgi_temp /usr/local/nginx/proxy_temp
</pre>
<p>Let's go ahead and get our initial vhosts file created. My template is available from <a href="http://www.rfxn.com/downloads/nginx.vhost.conf">http://www.rfxn.com/downloads/nginx.vhost.conf</a> and should be saved to '/usr/local/nginx/vhosts/myforums.com.conf', the contents of which are as follows:</p>
<pre>server {
    listen 80;
    server_name myforums.com www.myforums.com;

    access_log  logs/myforums.com_access.log  main;
    error_log  logs/myforums.com_error.log debug;

    location / {
        proxy_pass http://10.10.6.230;
        proxy_redirect     off;
        proxy_set_header   Host             $host;
        proxy_set_header   X-Real-IP        $remote_addr;
        proxy_set_header   X-Forwarded-For  $proxy_add_x_forwarded_for;

        proxy_cache             one;
        proxy_cache_key         backend$request_uri;
        proxy_cache_valid       200 301 302 20m;
        proxy_cache_valid       404 1m;
        proxy_cache_valid       any 15m;
        proxy_cache_use_stale   error timeout invalid_header updating;
    }

    location /admin {
        proxy_pass http://10.10.6.230;
        proxy_set_header   Host             $host;
        proxy_set_header   X-Real-IP        $remote_addr;
        proxy_set_header   X-Forwarded-For  $proxy_add_x_forwarded_for;
    }
}
</pre>
<p>The obvious change you want to make is 'myforums.com' to whatever domain you are serving. Note that unlike Apache's ServerAlias, nginx takes aliases as a plain space-separated list, so you can append as many names to the server_name string as you need, such as '<strong>server_name domain.com www.domain.com sub.domain.com;</strong>'.</p>
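<p>If the site answers on many subdomains, nginx also accepts wildcard names in server_name, which saves listing every alias by hand; a minimal sketch with placeholder domains and the same example backend IP:</p>
<pre>server {
    listen 80;
    # matches domain.com plus any subdomain (www.domain.com, forum.domain.com, ...)
    server_name domain.com *.domain.com;

    location / {
        proxy_pass http://10.10.6.230;
        proxy_set_header Host $host;
    }
}</pre>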
<p>Now, let's take a look at some of the important options in the vhosts configuration:</p>
<p><strong>listen 80;</strong><br />
This is the port which nginx will listen on for this vhost. By default, unless you specify an IP address with it, nginx will bind port 80 on all local IPs; you can limit this by setting the value as '<strong>listen 10.10.3.5:80;</strong>'.</p>
<p><strong>proxy_pass http://10.10.6.230;</strong><br />
Here we are telling nginx where to find our content, aka the backend server. This should be an IP, and it is also important not to forget the 'proxy_set_header Host' option so that the backend server knows which vhost to serve.</p>
<p><strong>proxy_cache_valid</strong><br />
This allows us to define cache times based on the HTTP status codes of our content; 99% of traffic will fall under the '200 301 302 20m' value. If you are running a lot of dynamic content, you may want to lower this from 20m to 10m or 5m; any lower defeats the purpose of caching. The '404 1m' value ensures that not-found pages are not stored for long, in case you are updating the site or have a temporary error, but it also prevents 404s from choking up the backend server. Then the 'any 15m' value grabs all other content and caches it for 15m; again, if you are running a very dynamic site you may want to lower this.</p>
<p><strong>proxy_cache_use_stale</strong><br />
When the cache has stale content, that is, content which has expired but not yet been updated, nginx can serve it in the event errors are encountered. Here we are telling nginx to serve stale cache data if there is an error, timeout, or invalid header while talking to the backend servers, or if another nginx worker process is busy updating the cache. This is really useful in the event your web server crashes, as clients will still receive data from the cache.</p>
<p><strong>location /admin</strong><br />
With this location statement we are telling nginx to take all requests to 'http://myforums.com/admin' and pass them directly to our backend server with no further interaction, and no caching.</p>
<p><strong>That's it!</strong> You can start nginx by running '/usr/local/nginx/sbin/nginx'; it should not generate any errors if you did everything right! To start nginx on boot, you can append the command to '/etc/rc.local'. All you have to do now is point the respective domain DNS records to the IP of the server running nginx, and it will start proxy-caching for you. If you wanted to run nginx on the same host as your Apache server, you could set Apache to listen on port 8080 and then adjust the 'proxy_pass' options accordingly as 'proxy_pass http://127.0.0.1:8080;'.</p>
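<p>For day-to-day operation, a few commands cover most of what you will need; a short sketch, assuming the default install prefix used throughout this post:</p>
<pre># check the configuration for syntax errors before (re)starting
/usr/local/nginx/sbin/nginx -t

# start nginx
/usr/local/nginx/sbin/nginx

# reload the configuration without dropping active connections
/usr/local/nginx/sbin/nginx -s reload

# start on boot by appending the binary to rc.local
echo '/usr/local/nginx/sbin/nginx' &gt;&gt; /etc/rc.local</pre>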
<p><strong>Extended Usage:</strong><br />
If you wanted to have nginx serve static content instead of Apache, since Apache is so horrible at it, we need to declare a new location option in our vhosts/*.conf file. We have two options here: we can either point nginx to a local path with our static content, or have nginx cache our static content and retain it for longer periods of time; the latter is far simpler.</p>
<p><strong>Serve static content from a local path:</strong></p>
<pre>        location ~* ^.+\.(jpg|jpeg|gif|png|ico|css|zip|tgz|gz|rar|bz2|doc|xls|exe|pdf|ppt|txt|tar|mid|midi|wav|bmp|rtf|js)$ {
            root   /home/myuser/public_html;
            expires 1d;
        }</pre>
<p>In the above, we are telling nginx that our static content is located at '/home/myuser/public_html'; the request URI is appended to the root path, so when a user requests 'http://www.mydomain.com/img/flyingpigs.jpg', nginx will look for it at '/home/myuser/public_html/img/flyingpigs.jpg'. The expires option can take values in seconds, minutes, hours, or days; if you have a lot of dynamic images on your site, then you might consider an option like 2h or 30m, as anything lower defeats the purpose. Using this method has a slight performance benefit over the cache option below.</p>
<p><strong>Serve static content from cache:</strong></p>
<pre>        location ~* ^.+\.(jpg|jpeg|gif|png|ico|css|zip|tgz|gz|rar|bz2|doc|xls|exe|pdf|ppt|txt|tar|mid|midi|wav|bmp|rtf|js)$ {
             proxy_cache_valid 200 301 302 120m;
             expires 2d;
             proxy_pass http://10.10.6.230;
             proxy_cache one;
        }</pre>
<p>With this setup we are telling nginx to cache our static content just like we did with the parent site itself, except that we are defining an extended time period for which the content is valid/cached. The time values work out as follows: content is valid in the proxy cache for 2h (after which nginx refreshes it from the backend), and client browsers expire their cached copies and request again every 2 days. Using this method is simple and does not require copying static content to a dedicated nginx host.</p>
<p>We can also do load balancing very easily with nginx. This is done by setting an alias for a group of servers, which we then use in place of addresses in our 'proxy_pass' settings. In the 'upstream' option shown below, we list all of the web servers that load should be distributed across:</p>
<pre>  upstream my_server_group {
    server 10.10.6.230:8000 weight=1;
    server 10.10.6.231:8000 weight=2 max_fails=3 fail_timeout=30s;
    server 10.10.6.15:8080 weight=2;
    server 10.10.6.17:8081;
  }
</pre>
<p>This must be placed in the 'http { }' section of the 'conf/nginx.conf' file; the server group can then be used in any vhost. To do this, we would replace 'proxy_pass http://10.10.6.230;' with 'proxy_pass http://my_server_group;'. Requests will be distributed across the server group in a round-robin fashion, with respect to the weighted values, if any. If a request to one of the servers fails, nginx will try the next server until it finds a working one. In the event no working servers can be found, nginx will fall back to stale cache data, and ultimately an error if that is not available.</p>
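<p>Tying the two pieces together, here is a minimal sketch of a vhost once an upstream group exists; the 'backup' parameter on the last server is an optional extra that keeps a spare out of rotation until the others are unavailable (addresses are the placeholder ones used above):</p>
<pre>  upstream my_server_group {
    server 10.10.6.230:8000 weight=1;
    server 10.10.6.231:8000 weight=2;
    # only receives traffic when the servers above are down
    server 10.10.6.15:8080 backup;
  }

  server {
    listen 80;
    server_name myforums.com www.myforums.com;

    location / {
        # the group alias replaces the single backend IP
        proxy_pass http://my_server_group;
        proxy_set_header Host             $host;
        proxy_set_header X-Real-IP        $remote_addr;
        proxy_set_header X-Forwarded-For  $proxy_add_x_forwarded_for;
    }
  }</pre>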
<p><strong>Conclusion:</strong><br />
This has turned into a longer post than I had planned, but oh well; I hope it proves to be useful. If you need any help with the configuration options, please check out <a href="http://wiki.nginx.org/NginxModules#Nginx_Core_Modules">http://wiki.nginx.org</a>; it covers just about everything one could need.</p>
<p>Although I noted this nginx setup is deployed on a Xen guest (CentOS 5.4, 2GB RAM and 2 CPU cores), it proved to be so efficient that these specs were overkill. You could easily run nginx on a 1GB guest with a single core, a recycled server, or locally on the Apache server. I should also mention that I have since taken apart the MySQL replication cluster and am now running a single MySQL server without issue, down from 4.</p>
</div>