How to Keep WordPress MU + BuddyPress Performing Well

November 14, 2018, by Amon

《How to scale WordPress to half a million blogs and 8,000,000 page views a month》


We figured it was about time we shared some of the lessons we’ve learned scaling Edublogs to nearly half a million blogs and a place in the Quantcast top 5000 sites! So if you have grand plans for your site (or want to improve your existing setup / performance) read on and feel free to ask any questions :)

Scaling a large WordPress installation follows the same basic principles as scaling any large site.

The key is to truly understand your application, its architecture and the potential areas of contention. For WordPress specifically, the two key points of contention and work are the page-generation time and the time spent in the database.

Database Layer:

Given the flexibility of WordPress, the database is the storage point not only for the “larger” items, such as users, posts, comments, but also for many little options and details. The nature of WordPress is that it may make many round trip calls to the database to load many of these options — each requiring database and network resources.

The first level of “defense” against overloading the database is the MySQL Query Cache.

The Query Cache is a nifty little feature in MySQL: it stores, in a dedicated area of main memory, the results of queries against tables that have not recently changed.

That is, if a request comes in to retrieve a specific row in a table, the table has not been modified since the result was cached, and the cache has not filled up and been purged, the query can be satisfied from this cache. The major benefit is that the database does not need to go to the disk (which is generally the slowest part of the system) and can answer the request immediately.
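The invalidation rule described above can be modeled in a few lines of Python. This is only a toy sketch, not MySQL's implementation: the class name, the `run_query` callable and the `disk_reads` counter are all made up for illustration.

```python
class QueryCacheSketch:
    """Toy model of a result cache that is invalidated per table on writes."""

    def __init__(self):
        self._results = {}   # (table, sql) -> cached rows
        self.disk_reads = 0  # how many times we had to run the real query

    def select(self, table, sql, run_query):
        key = (table, sql)
        if key in self._results:
            return self._results[key]  # answered from memory
        self.disk_reads += 1
        rows = run_query()             # hypothetical callable that hits storage
        self._results[key] = rows
        return rows

    def write(self, table):
        # Any modification to a table discards every cached result for it.
        for key in [k for k in self._results if k[0] == table]:
            del self._results[key]
```

Repeated identical reads of an unchanged table cost one real query; a single write to that table forces the next read back to storage.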


The other major boost for the database would be to keep the working set in memory. The working set is loosely defined as the current set of data which will be aggressively referenced in a period of time. Your database can have 500GB worth of data — but the working set — the data actually needed NOW [and in the next N amount of time] is only 5GB.

Keeping that 5GB in memory (either with generous key caches and system I/O buffers for MyISAM or a large Buffer Pool for InnoDB) will of course reduce the number of round trips to the disk. If the contention in the database is write related, consider changing the storage engine for the WordPress tables to InnoDB. Depending on the number of tables, this can lead to memory starvation, so approach with caution.


The last point on databases is disks. In the event the working set doesn’t fit in memory (which is usually the case), make the disk subsystem as quick as possible. Trade in those “ultra-fast 3.0Gb/s SATA” disks for high-speed SCSI disks. Consider a striped array (RAID-0), but for safety’s sake make it a RAID-10. Spread the workload over multiple disks: for 150GB of disk space, consider getting several 50GB disks so that a large throughput can be obtained. If you will be doing heavy writes to this disk subsystem, add a battery-backed write-back cache; the throughput will be a lot higher.

The really nice “defense mechanism” for the database is to avoid the database altogether. As mentioned earlier, WordPress tends to make a great many database calls per page. If these calls can be drastically reduced or eliminated, both the database time and the page-generation time go down. This is usually done by using memcached.

There are two types of cache: an object cache (loosely defined as holding things like options, settings, counts, etc.) and a full-page cache. A full-page cache holds a fully generated page (HTML output and all). This type of cache of course virtually eliminates page-generation time altogether.
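As an illustration of the object-cache idea, here is a minimal cache-aside sketch in Python. It is not WordPress's object cache: `DictCache` is an in-process stand-in for a memcached client, and `get_option` and `load_from_db` are hypothetical names.

```python
class DictCache:
    """In-process stand-in for a memcached client (assumption for the demo)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value


def get_option(cache, name, load_from_db):
    """Return an option, touching the database only on a cache miss."""
    key = "option:" + name
    value = cache.get(key)
    if value is None:               # miss (None values are never cached here)
        value = load_from_db(name)  # hypothetical database lookup
        cache.set(key, value)
    return value
```

After the first lookup of a given option, later lookups are served from the cache and never reach the database.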

We should not forget to mention MySQL slave replication. If your single database server cannot keep up — consider using MySQL replication and using a plugin like MultiDB or HyperDB to split the reads and the writes. Keep in mind that you will always have to write to a single database — but should be able to read from many/any.
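The read/write split can be pictured with a small router. This sketch is not how MultiDB or HyperDB work internally; the class, the server names and the rule of classifying a statement by its first keyword are all assumptions for illustration.

```python
import itertools


class ReadWriteRouter:
    """Send writes to the single primary and spread reads over replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Fall back to the primary when no replicas are configured.
        self._replicas = itertools.cycle(replicas or [primary])

    def server_for(self, sql):
        parts = sql.split()
        verb = parts[0].upper() if parts else ""
        if verb in ("SELECT", "SHOW"):
            return next(self._replicas)  # round-robin over read replicas
        return self.primary              # everything else is a write
```

For example, `ReadWriteRouter("db-primary", ["db-replica-1", "db-replica-2"])` returns `db-primary` for every `INSERT` and alternates between the two replicas for `SELECT` statements. A real setup also has to handle replication lag, which this sketch ignores.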

Page-Generation Time

WordPress spends a considerable amount of time compiling and generating the resultant HTML page ultimately served to the client. For many, the typical choice is using a server like Apache — which with its benefits also brings some limitations. By default, in Apache the PHP processes are built into the processes serving all pages on the site — regardless if they are PHP or not.

By using an alternate web server (e.g. nginx, lighttpd, etc.) you essentially “box-in” all PHP requests — and send them directly to a PHP worker pool which can work on the page-generation part of the request. This leaves the web server free to continue serving static files — or anything else it needs to. Unlike Apache, the PHP worker pool does not even need to reside on the same physical server as the web server. The most widely used implementation is using PHP as a FastCGI process (with the php-fpm patches applied).

File Storage

When using multiple web-tier servers to compile and generate WordPress pages, one of the issues encountered is uploaded multi-media. In a single-server install, the files get placed into the wp-content/blogs.dir folder and we forget about it. If we introduce more than one server — we need to be careful that we no longer store these data files locally as they will not be accessible from the other servers.

To work around this issue, consider having a dedicated or semi-dedicated file server running a distributed file system (NFS, AFS, etc.). When a user uploads a file, write it to the shared storage, which makes it accessible to all connected web servers. Alternatively, you may opt to upload it to Amazon S3, Rackspace CloudFiles or some other Content Delivery Network. Either way, the key is to make sure the files are not local to a single web server, because if they are, they will not be known to the other servers.

On a distributed file system, never serve files off the system directly. Place a web server or some other caching service (varnish, squid) in front of it that is responsible for reading the data off the shared storage device and returning it to the web server for sending back to the client. One advantage of using something like varnish is that you can create a fairly large and efficient cache in front of the shared file system. This lets the file system focus on serving new files and leaves the highly requested files to the cache.

Semi-static requests

Treat requests that can be viewed as semi-static accordingly. RSS feeds, for example, are technically updated and available immediately after a post or comment is published, but consider caching them for a period of time (5 minutes or so) in a caching proxy such as varnish or squid. This way a high number of requests for things like RSS feeds can be satisfied almost for “free”: the feed only needs to be generated once and is then served from the cache hundreds or thousands of times.
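The effect of a short-lived cache in front of such requests can be sketched in Python. This is an in-process illustration, not varnish or squid configuration; `TTLCache`, `fetch` and the `generate` callable are hypothetical names, and the 300-second default mirrors the “5 minutes or so” above.

```python
import time


class TTLCache:
    """Serve a cached body until it is older than ttl_seconds."""

    def __init__(self, ttl_seconds=300, clock=time.monotonic):
        self.ttl = ttl_seconds
        self._clock = clock   # injectable so the behavior can be tested
        self._entries = {}    # url -> (expires_at, body)

    def fetch(self, url, generate):
        now = self._clock()
        entry = self._entries.get(url)
        if entry is not None and entry[0] > now:
            return entry[1]       # still fresh: no regeneration
        body = generate(url)      # hypothetical expensive page/feed build
        self._entries[url] = (now + self.ttl, body)
        return body
```

However many clients request the feed within the five-minute window, `generate` runs only once.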

What we use at Edublogs:

3x web-tier servers
2x database servers
1x file server

Each web-tier server runs nginx, a php-fcgi pool and a memcached instance. The Edublogs.org name resolves to three IP addresses, each fronted by one of the nginx servers. Each nginx is configured to distribute PHP requests to one of the three servers (itself or the other two in the pool).

The database servers in this case function as a split setup. The heavier traffic (e.g. blog content) is stored on one set of servers and the global data is stored on a separate set. “Global” data can be thought of as options, settings, etc.

The file server is fronted by a varnish pool and connected via NFS to all three web servers. Each web server has a local copy of the PHP files which comprise the site (no reading off of NFS). When a user uploads a multimedia file, it gets copied over to the NFS mounts. On subsequent requests, the data is served by varnish (which also caches it for future requests).

Global Tables, InnoDB & Memcache

The global tables are InnoDB: there are not that many of them, so InnoDB performs well there. One of the primary reasons the individual blog tables are not InnoDB is the InnoDB data dictionary. With a large number of tables, the dictionary can become too large and exhaust all memory on the system. Though there are patches available to change this behavior, the individual tables are still mostly read-only, which MyISAM handles quite well.

As for caching: We use the memcached-backed object cache and on top of that we also use Batcache (which utilizes the memcached-backed object cache).

We hope that helps… and special shout out to our SysAdmin Michael who pretty much wrote this guide :)


November 14, 2018, by Amon

Full name: Useful Link Collections
Description: This plugin lets you create collections of useful links or favorite bookmarks and share the link lists with visitors.


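November 9, 2018, by Amon

Background: a forum holds tens of millions of records and keeps growing in real time. Searching it with MySQL's built-in full-text search took more than 40 s, so the existing search mechanism could no longer meet the site's needs.

Solution: crawl the site's data into a database and build an index. A multi-process crawler was built with PHP + Python, with Redis coordinating distributed, concurrent crawling across multiple servers. Once the data was stored, Sphinx built the full-text index, and the site went live on the Bootstrap framework + PHP. Search time dropped below 0.01 s.

Full-text search: Sphinx/Coreseek
Version control: Git/SVN
Data visualization: Gephi/SPSS

Error: curl: (60) SSL certificate problem: unable to get local issuer certificate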

November 8, 2018, by Amon


1. Download the certificate


Place the downloaded certificate in extras/ssl/ under the directory that contains php.ini.

cd /usr/local/php/etc/ && mkdir -p extras && cd extras && wget https://github.com/bagder/ca-bundle/archive/e9175fec5d0c4d42de24ed6d84a06d504d5e5a09.zip && unzip e9175fec5d0c4d42de24ed6d84a06d504d5e5a09.zip && rm -f e9175fec5d0c4d42de24ed6d84a06d504d5e5a09.zip && mv ca-bundle-e9175fec5d0c4d42de24ed6d84a06d504d5e5a09 ssl

2. Edit php.ini

Open /usr/local/php/etc/php.ini and add:



November 6, 2018, by Amon



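This article briefly introduces the 3-2-1 rule of backups, along with best practices for cloud and local backups.


  • Operations staff at GitLab.com, for example, once deleted data by mistake: on February 1, 2017, an engineer logged in to the primary server with the root account by mistake and deleted core service data. Worse, five of the backup mechanisms turned out to have failed. Fortunately, a backup from six hours earlier had survived. The site was unreachable for several hours and much of the data created after that backup was lost, but the vast majority of the data was eventually recovered.
  • Data loss caused by WannaCry: WannaCry is ransomware that exploits a vulnerability in Windows to encrypt the user's files, making them inaccessible unless $300 to $600 (several thousand RMB) is transferred to a Bitcoin account. More than 300,000 computers were reportedly infected. Many organizations, including banks, hospitals and railway systems, could not operate normally because of it. In China, the computers of many university students on the education network were attacked, and files were lost.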


Backup best practice: the 3-2-1 rule

When making backups, follow the 3-2-1 rule; it is what makes the backups reliable and effective.

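  • Three copies of the data: besides the original, keep two additional backups. If each copy is lost independently with a probability of 1%, the probability of losing all three at once is only 0.0001%, lower than the 0.01% for two copies.
  • Two storage media: data on the same type of storage medium is more likely to be lost together. For example, if you keep three copies on your computer's internal drive and the disk fails completely, is formatted by mistake, or the computer is lost, all of the copies are gone at once. In that case, the second type of medium could be an external hard drive, SD card, USB stick, CD, DVD and so on.
  • One off-site backup: physical separation between backups matters. If all the backups sit in one room, a single fire is enough to destroy them all. If circumstances allow, storing backups in different cities (more than 100 km apart) is already very safe. Keeping one backup at home and one at the office also counts as off-site.

Also pay attention to how fresh the backups are, and shorten the backup interval as far as you can. Backing up every minute is better than backing up every hour: when data is lost, the former loses only the last minute of work, while the latter loses the last hour.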
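The arithmetic behind the "three copies" point can be checked with a short Python sketch. It assumes, as the text does, that copies fail independently with the same probability; the function name is made up for illustration.

```python
def loss_probability(per_copy_loss, copies):
    """Probability that every copy is lost, assuming independent failures."""
    return per_copy_loss ** copies


two_copies = loss_probability(0.01, 2)    # 0.01 * 0.01
three_copies = loss_probability(0.01, 3)  # 0.01 * 0.01 * 0.01

print(f"two copies lost together:   {two_copies:.4%}")
print(f"three copies lost together: {three_copies:.4%}")
```

In practice failures are rarely fully independent, which is exactly why the rule also asks for two media types and an off-site copy.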


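Cloud backups are generally very reliable. The cloud provider takes care of the 3-2-1 rule for you; you only need to choose a cloud storage provider and upload the files you want to back up.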


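A typical example of cloud backup is the iCloud backup feature in iOS. Once it is enabled, an iOS device automatically uploads personal data such as photos, contacts, documents, chat history and app data to the cloud. After you buy a new iOS device, this data can be restored to it automatically from the cloud.

Simple backups with object storage

Periodically packaging the important files on a server and uploading them to object storage is enough for a simple backup. You can use the object storage of Amazon S3, Google Cloud Storage, Alibaba Cloud OSS or Tencent Cloud COS directly. These services all offer 99.999999999% durability, meaning that once a file has finished uploading it is almost impossible to lose it by accident. Object storage in the cloud usually stores data in multiple Availability Zones (typically at least three) within one region, and each Availability Zone also holds multiple replicas of the file. The Availability Zones are some distance apart, so this provides off-site backup. AWS's documentation on regions and Availability Zones covers these concepts in detail.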

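Object storage services generally let you choose a region. The geographically closest region is usually chosen for the lowest latency. These services are typically billed by usage, mainly the storage space occupied over a period of time plus the traffic used to transfer data. If you back up 1 GB of data, for example, the monthly cost may be only a few yuan or a few jiao, or even nothing. (Pricing differs between providers.)

Much server software already integrates backup to object storage: after enabling object storage with the provider, you only need to configure the access key in the software to start backing up to it. If the software has no such feature, a simple backup can also be done by hand. For example, use mysqldump to export the database, and use the gzip and tar commands to compress and package the files to be backed up. Object storage providers usually offer command-line tools that make it easy to upload files: AWS has aws, which supports S3 operations; Google Cloud Storage has gsutil; Alibaba Cloud OSS has ossutil; Tencent Cloud has tccli, which supports COS operations.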
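The packaging step of such a manual backup might look like the following Python sketch, which uses the standard-library `tarfile` module in place of the gzip and tar commands. The function name and the `backup-<timestamp>.tar.gz` naming scheme are assumptions; uploading the resulting archive is left to whichever provider tool you use.

```python
import tarfile
from datetime import datetime, timezone
from pathlib import Path


def make_backup_archive(sources, dest_dir, now=None):
    """Pack the given files/directories into a timestamped .tar.gz archive."""
    now = now or datetime.now(timezone.utc)
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    # Hypothetical naming scheme: a sortable UTC timestamp in the file name.
    archive = dest_dir / f"backup-{now:%Y%m%dT%H%M%SZ}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        for src in sources:
            src = Path(src)
            tar.add(src, arcname=src.name)  # store without the absolute path
    return archive
```

A scheduled job could call this with the mysqldump output and the upload directory, then hand the returned path to the upload tool.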






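With local backups you have to implement the 3-2-1 rule yourself. Back the data up to two drives (over the local network or a wired connection) and keep one of them off-site. Many desktop operating systems have built-in backup support: on recent versions of Windows you can find the backup feature in the Control Panel, and on macOS you can use Time Machine. Setting up automatic backups is recommended.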




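Older versions can be kept at longer intervals to save space. Time Machine on macOS, for example, keeps hourly backups for the past 24 hours, daily backups for the past month, and as many weekly backups as possible for anything older than a month, until the disk is full.

Some network storage services keep historical versions automatically, such as Dropbox, Google Drive and iCloud. Some software also keeps historical versions on the local disk; Git, for example, keeps the history of every commit.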
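A thinning policy of this shape can be sketched in Python. This is not Time Machine's actual algorithm: the function name, the use of 30 days for "a month" and the choice to keep the newest snapshot in each hour, day or ISO week are all assumptions for illustration.

```python
from datetime import datetime, timedelta


def snapshots_to_keep(snapshots, now):
    """Keep hourly snapshots for 24 h, daily for 30 days, weekly beyond that."""
    keep = {}
    for snap in sorted(snapshots):
        age = now - snap
        if age <= timedelta(hours=24):
            bucket = ("hour", snap.strftime("%Y-%m-%d %H"))
        elif age <= timedelta(days=30):
            bucket = ("day", snap.date())
        else:
            bucket = ("week", tuple(snap.isocalendar())[:2])  # (ISO year, week)
        keep[bucket] = snap  # newer snapshots overwrite older ones in a bucket
    return sorted(keep.values())
```

Everything not returned by the function is a candidate for deletion; freeing space when the disk fills would then mean dropping the oldest weekly snapshots first.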



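Object storage classes (Storage Class)


Main Amazon S3 storage classes

Listed from highest to lowest storage price; all offer 99.999999999% durability and store data across multiple Availability Zones.

  • STANDARD: the default, suited to frequently accessed files.
  • STANDARD_IA: a lower storage price (54% of the default), but with an additional retrieval fee. This class also has a minimum storage duration of 30 days and a minimum billable object size of 128 KB.
  • GLACIER: the lowest storage price (17% of the default); objects cannot be accessed in real time, and there is also an additional retrieval fee.

Backups larger than 128 KB that are not accessed often are best stored in STANDARD_IA; early historical versions that will almost never be accessed again can go to GLACIER.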
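The recommendation above can be written down as a small decision helper. It is only a sketch of the rule of thumb in this article: the function name, its parameters and the "less than once a month" threshold for infrequent access are assumptions, not AWS guidance.

```python
MIN_IA_OBJECT_BYTES = 128 * 1024  # minimum billable size quoted above


def suggest_s3_storage_class(size_bytes, accesses_per_month, is_old_version=False):
    """Suggest a storage class for a backup object (rule-of-thumb sketch)."""
    if is_old_version and accesses_per_month == 0:
        return "GLACIER"       # early versions that are almost never read
    if size_bytes > MIN_IA_OBJECT_BYTES and accesses_per_month < 1:
        return "STANDARD_IA"   # large enough and rarely accessed
    return "STANDARD"          # small or frequently accessed objects
```

For example, a 50 MB weekly database dump that is read only during a restore would map to STANDARD_IA, while a 10 KB configuration file stays in STANDARD.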

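Main Google Cloud Storage storage classes

Listed from highest to lowest storage price; all offer 99.999999999% durability and store data across multiple Availability Zones.

  • Multi-Regional: multi-region storage (stronger than multi-zone). This class stores files in multiple cities or countries within one continent. According to the official site, uploaded files are stored in at least two data centers that are at least 160 km apart. Suited to files that are accessed frequently around the world.
  • Regional: corresponds to S3 STANDARD.
  • Nearline: corresponds to S3 STANDARD_IA; 50% of the Regional price, with no minimum object size.
  • Coldline: corresponds to S3 GLACIER, with a minimum storage duration of 90 days, but it supports real-time access; 35% of the Regional price, with a higher retrieval fee than Nearline.

Likewise, backups that are not accessed often are best stored in Nearline, and early historical versions that will almost never be accessed again can go to Coldline.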

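Main Alibaba Cloud OSS storage classes

Listed from highest to lowest storage price; all offer 99.999999999% durability and store data across multiple Availability Zones.

  • Standard: corresponds to S3 STANDARD.
  • Infrequent Access: corresponds to S3 STANDARD_IA; 67% of the Standard price, with a minimum storage duration of 30 days and a minimum object size of 64 KB.
  • Archive: corresponds to S3 GLACIER; 28% of the Standard price, with a minimum object size of 64 KB and a minimum storage duration of 60 days. Its retrieval fee is higher than that of Infrequent Access.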