mod_log_spread is designed as a patch/enhancement to mod_log_config to
allow logs to be multicast directly to client communication groups via
spread. This project was motivated by work at Community Connect, Inc.,
which was faced with the need to collect centralized logs (preferably in
one central file) for a dynamically growing web farm. Performance impact
had to be minimal, the transportation reliable, and a fully customizable
log format needed to be available. mod_log_config supplied the third
requirement, as well as laying out a solid framework into which a new
transport could be added.

PREREQUISITES:

Spread: spread is a powerful group communication toolkit developed by
the Johns Hopkins University Center for Networking and Distributed
Systems (http://www.cnds.jhu.edu). Since mod_log_spread logs to spread
groups, it is a necessity. In addition to the spread binary, you also
need the libraries and headers (libsp.a and sp.h, available in the
standard spread distro).

INSTALLING:

1) Install Spread (available from http://www.spread.org).

Compile mod_log_spread in one of three ways:

Statically:

2a) Drop mod_log_spread.c into /src/modules/extras and configure with
    configure --enable-module=/src/modules/extras/mod_log_spread.c.
    Add -lsp and -L -I to CFLAGS prior to compile.

2b) Apply the mod_log_spread patch to mod_log_config.c, add -lsp to the
    top-level Makefile, as above, and compile as normal.

or as a DSO:

2c) Compile apache with --enable-module=so, then build mod_log_spread
    with apxs by doing (assuming your spread libraries and headers are
    in /usr/local):

        apxs -c -I/usr/local/include -L/usr/local/lib -lsp \
            src/modules/extra/mod_log_spread.c
        apxs -i -a -n log_spread mod_log_spread.so

3) If you compile mod_log_spread as a DSO or as an optional module (i.e.
not as just a patch to mod_log_config), then, since mod_log_spread uses
the same configuration directives as mod_log_config, you need to take
care with its placement in the AddModule directives.
The best solution is to simply comment out mod_log_config (since this
module is simply an enhancement and fully supports all the
mod_log_config directives). If you choose to leave them both activated,
mod_log_spread should be listed AFTER mod_log_config.

4) To collect logs, build the log client (detailed in COLLECTING LOGS,
below) on another spread-enabled machine.

CONFIGURATION:

mod_log_spread includes one new directive, SpreadDaemon, used to specify
the spread daemon the apache server will contact to multicast logs. It
is recommended that, on machines with reasonable load, a spread daemon
be run on each webserver and that one just uses

    SpreadDaemon [id]

This allows apache to contact Spread via a unix domain socket, which is
much faster than a tcp connection. If you need to contact a Spread
daemon on a remote host, use the syntax:

    SpreadDaemon [id]

id specifies a numeric id associated with that daemon, to allow for
different CustomLog lines to log to different daemons.

To specify what group to log to and what log format to use,
mod_log_config supports a new CustomLog directive:

    CustomLog $[#id]

For example, to log common log format logs to the group 'www_cluster'
on the default local spread daemon listening on port 4308, the following
directives should be added:

    SpreadDaemon 4308
    CustomLog $www_cluster common

To specify a daemon by id, postfix a '#' and the id number. So, to send
to the second daemon specified below, you would add:

    SpreadDaemon 3333 0
    SpreadDaemon 4308 1
    CustomLog $test1#0 common    //sends to daemon 0
    CustomLog $test2#1 common    //sends to daemon 1

VIRTUAL HOST LOGGING:

For easy per-host virtual host logging, mod_log_spread provides two
'magic' group names.

$#hostname causes each request to be logged to the group with name equal
to the Host: request header, e.g. if the Host: www.my.server header is
present, then the request will be logged to the spread group
'www.my.server'.
This is effective for small virtual host setups (under Spread 3.12, 1000
is the maximum, though it is likely that spread convergence-related
performance problems occur after ~100).

For larger vhost setups, there is the magic name $#vhost. This causes
each request to be logged to a group which is an integer hash of the
Host: request header. It also prepends the Host: name to the beginning
of each request log line. Here, a request for www.my.server will be
deterministically hashed to some integer (depending on the hash size as
set in VhostLogHashSize). To write logs, a spreadlogd is brought up for
each hash value (default VhostLogHashSize is 32). With a stock linux
maximum file descriptors per process set at 1024, this allows for ~30k
vhosts. With some kernel patching, 200k+ vhosts should be possible with
32 log writers.

FAILOVER LOGGING:

If a second group name is listed before the format, like
'$groupname,$groupname2', then the second group name is considered a
failover group, and the log messages will be logged to the second group
only if the first one fails. This can be used either for reliability
purposes or to ease migration to a newer version of the spread daemon.

COLLECTING LOGS:

mod_log_spread is only half the picture. A logging client is necessary
as well. The client does not need to be run on a webserver (though it
could be), simply on a machine running spread. The log client currently
takes two arguments - a spread group to join and listen for messages,
and a file to write the messages to.

The slick part about using spread for this is that you can have an
entire cluster multicast their logs to a single group, then have a
logging-only host (or multiple ones) write a single log for the cluster.
The Spread protocol is a reliable protocol, so you never have to worry
about dropped logs (as you would if, say, you logged to syslog which
logged to a remote host).
Further, webservers and logging hosts can be brought up or brought down
without having to touch any of the other group members. If you want to
take a look at logs, just bring up a daemon on any machine and join the
group - the webservers don't need to be reconfigured or restarted.

The original log_writer is no longer supported and has been replaced by
the far superior spreadlogd, written by Theo Schlossnagle. spreadlogd
allows for listening to multiple spread rings and regex-matching for
logging. To build spreadlogd, cd to the spreadlogd directory, do a
configure and a make.

ERROR LOGGING:

Due to certain limitations in the apache API, direct error logging to
spread is only possible through patching the core apache tree. If
significant interest exists, I'll produce this. In the meantime, this
functionality can be attained by setting ErrorLog to pipe to the perl
script error_log_spread.pl.

PERFORMANCE TUNING:

The README.spread file in this distribution now contains
spread-specific tunings.
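
A note on the $#vhost figures in VIRTUAL HOST LOGGING: the bucketing and
the ~30k estimate can be sketched roughly as below. This is an
illustration only, not the module's code. CRC32 stands in for the actual
hash function, which this README does not specify; the single-space
separator and the allowance of 64 reserved descriptors per writer are
likewise assumptions; and vhost_bucket, tag_line and max_vhosts are
hypothetical names.

```python
import zlib

VHOST_LOG_HASH_SIZE = 32  # default VhostLogHashSize, per this README
FD_LIMIT = 1024           # stock per-process descriptor limit, per this README


def vhost_bucket(host: str, hash_size: int = VHOST_LOG_HASH_SIZE) -> int:
    """Map a Host: header value to a bucket in [0, hash_size).

    CRC32 is an assumed stand-in; the only property relied on here is that
    the same host always lands in the same bucket.
    """
    return zlib.crc32(host.encode("utf-8")) % hash_size


def tag_line(host: str, log_line: str) -> str:
    """Prepend the Host: name to a log line (separator is an assumption)."""
    return f"{host} {log_line}"


def max_vhosts(hash_size: int = VHOST_LOG_HASH_SIZE,
               fd_limit: int = FD_LIMIT,
               reserved_fds: int = 64) -> int:
    """Rough capacity: one log file descriptor per vhost, spread over
    hash_size writers, each keeping reserved_fds (assumed) for other use."""
    return hash_size * (fd_limit - reserved_fds)


if __name__ == "__main__":
    host = "www.my.server"
    print("bucket:", vhost_bucket(host))
    print(tag_line(host, '"GET / HTTP/1.0" 200 1024'))
    print("approx. vhost capacity:", max_vhosts())
```

With the defaults, the capacity works out to 32 * (1024 - 64) = 30720,
in line with the ~30k figure quoted above. Raising the descriptor limit
raises the estimate proportionally.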