There are times when you get an urgent ping from your customer or from your hosting provider's operations team. They complain about a significant drop in application performance, or that something is eating all the available server bandwidth. The reasons for such problems vary. If the application logs and metrics show normal behavior, it is a good idea to rule out external factors first, before digging deeper. GoAccess comes in very handy in these situations for analyzing web server logs on the fly.
What is GoAccess?
GoAccess is an open source real-time web log analyzer and interactive viewer that runs in a terminal in *nix systems or through your browser.
GoAccess was designed to be a fast, terminal-based log analyzer. Its core idea is to quickly analyze and view web server statistics in real time without needing to use your browser (great if you want to do a quick analysis of your access log via SSH, or if you simply love working in the terminal).
While terminal output is the default, GoAccess can also generate a complete real-time HTML report, as well as JSON and CSV reports.
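As a minimal sketch of those report outputs: the snippet below writes one report per format with the -o option, which picks the output type from the file extension. The file name access.log and the choice of the COMBINED format are assumptions for illustration, so adjust both to your own log. The guard simply skips the run when GoAccess or the log file is missing.

```shell
# Assumption: access.log is a hypothetical log file in Combined Log Format.
LOG_FILE="access.log"

if command -v goaccess >/dev/null 2>&1 && [ -r "$LOG_FILE" ]; then
  # -o infers the report type (HTML, JSON or CSV) from the file extension.
  for ext in html json csv; do
    goaccess "$LOG_FILE" --log-format=COMBINED -o "report.$ext"
  done
  status="generated"
else
  status="skipped"
fi
echo "reports: $status"
```

For an HTML report that keeps updating while the log grows, GoAccess also has a --real-time-html flag that can be combined with -o report.html.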
GoAccess allows any custom log format string. Predefined options include, but are not limited to:
- Amazon CloudFront (Download Distribution)
- Amazon Simple Storage Service (S3)
- AWS Elastic Load Balancing
- Combined Log Format (XLF/ELF) Apache | Nginx
- Common Log Format (CLF) Apache
- Google Cloud Storage
- Apache virtual hosts
- Squid Native Format
- W3C format (IIS)
GoAccess can be installed and run on *nix systems. The easiest way is to install it with your Linux distribution's package manager (though that will probably not be the latest version). You can also compile the project yourself; the source is on GitHub.
There is also a Docker image you can use directly if you work with containers.
You run the tool from the command line. Here is a simple example that analyzes an Apache HTTP Server log file:
goaccess -f apache.log
When you run this command, the next screen asks you to select the format of your log file. After that, you get a text-based dashboard. You can scroll down to see the different statistics. When you press a specific number, you get some options for that statistic, such as changing its sort order.
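If you already know the format, the selection screen can be skipped by naming it on the command line. The sketch below assumes that apache.log from the example above is in the Combined Log Format; that is a guess about your setup, not something GoAccess detects here. The guard only avoids launching the dashboard when GoAccess, the file, or an interactive terminal is missing.

```shell
# Assumptions: apache.log is the file from the example above and it uses
# the Combined Log Format; use another predefined name if your log differs.
LOG_FILE="apache.log"
LOG_FORMAT="COMBINED"

# Open the dashboard only when goaccess is installed, the log is readable,
# and stdout is an interactive terminal.
if command -v goaccess >/dev/null 2>&1 && [ -r "$LOG_FILE" ] && [ -t 1 ]; then
  goaccess -f "$LOG_FILE" --log-format="$LOG_FORMAT"
else
  echo "skipped: need goaccess, a readable $LOG_FILE, and a terminal" >&2
fi
```

For a log that matches none of the predefined names, --log-format also accepts a custom format string, usually together with --date-format and --time-format.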
I use this tool a lot to check how much server load specific IPs generate. Sometimes you can see that the Google crawler and other bots are a bit more active than usual and consume some of your system's resources.
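To make the per-IP idea concrete, here is a rough cross-check written in plain awk rather than GoAccess itself. It counts requests and sums response bytes per client IP, which is roughly what you read off GoAccess's per-host statistics. The sample log lines, the IP addresses, and the assumption that the log is in Combined Log Format (client IP in field 1, response size in field 10) are all made up for illustration.

```shell
# Hypothetical sample in Combined Log Format; in practice, point LOG_FILE
# at your real access log instead of a temporary file.
LOG_FILE="$(mktemp)"
cat > "$LOG_FILE" <<'EOF'
203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] "GET / HTTP/1.1" 200 2326 "-" "Mozilla/5.0"
203.0.113.7 - - [10/Oct/2023:13:55:37 +0000] "GET /about HTTP/1.1" 200 1024 "-" "Mozilla/5.0"
198.51.100.23 - - [10/Oct/2023:13:55:38 +0000] "GET /robots.txt HTTP/1.1" 200 68 "-" "Googlebot/2.1"
203.0.113.7 - - [10/Oct/2023:13:55:39 +0000] "GET /img/logo.png HTTP/1.1" 200 5120 "-" "Mozilla/5.0"
EOF

# Field 1 is the client IP, field 10 the response size in bytes.
# Output columns: request count, total bytes, IP; busiest client first.
per_ip="$(awk '{ hits[$1]++; bytes[$1] += $10 }
               END { for (ip in hits) print hits[ip], bytes[ip], ip }' "$LOG_FILE" | sort -rn)"
echo "$per_ip"

rm -f "$LOG_FILE"
```

The first line of the output is the client with the most requests, which is usually the first address worth looking up when bandwidth suddenly disappears.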