Simple Apache Log Parser
Posted by Jeron_Baffom@reddit | linuxadmin | 4 comments
I was trying to find a simple CLI tool (for Linux) to parse Apache logs, compute some stats and write plain text output with some simple aggregate data (e.g. a view counter). This plain text output would then be submitted to MySQL via a cron job.
The advantage I see in doing it this way is that the database would be hit outside the page request, and in batches.
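As a rough sketch of that batch step, assuming the plain text output is one `path<TAB>count` line per page and that the counters live in a hypothetical table `page_views(path, views)` with `path` as its primary key, the cron job could read the file and send everything in one call. The `%s` placeholders follow the parameter style used by common Python MySQL drivers such as PyMySQL and mysql-connector-python; opening the connection and committing are left out.

```python
# Hypothetical table: page_views(path VARCHAR(...) PRIMARY KEY, views INT)
UPSERT_SQL = (
    "INSERT INTO page_views (path, views) VALUES (%s, %s) "
    "ON DUPLICATE KEY UPDATE views = views + VALUES(views)"
)


def read_batch(lines):
    """Turn 'path<TAB>count' lines into (path, count) tuples, skipping blanks."""
    rows = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        # Split on the last tab so the count is always the final field.
        path, count = line.rsplit("\t", 1)
        rows.append((path, int(count)))
    return rows


def submit(cursor, rows):
    """Send the whole batch through a single DB-API executemany call."""
    if rows:
        cursor.executemany(UPSERT_SQL, rows)
```

Here `cursor` is any DB-API cursor; the database sees one batch per cron run instead of one write per page request.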
I could find several tools to plot graphs and do some realtime monitoring (e.g. GoAccess, AWStats, ApacheTop ...), but none that would create simple plain text output. Hence, I was left with only bad alternatives:
- Write a parser script myself using 'awk', 'grep', 'cut', 'sed', 'tail -f' ...
- Or use Logstash, which is overkill for me.
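For reference, the first alternative does not have to be a chain of shell tools. A minimal sketch in Python, assuming the log uses Apache's common or combined format and that a "view" means a GET request answered with status 200 (both are assumptions, as are the function names):

```python
import re
from collections import Counter

# Matches the prefix shared by the "common" and "combined" formats:
# host ident user [time] "METHOD path PROTOCOL" status size
LINE_RE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>[A-Z]+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)


def count_views(lines):
    """Count successful GET requests per path; lines that do not match are skipped."""
    views = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m and m.group("method") == "GET" and m.group("status") == "200":
            views[m.group("path")] += 1
    return views


def format_report(views):
    """Render one 'path<TAB>count' line per path, most viewed first."""
    rows = sorted(views.items(), key=lambda kv: (-kv[1], kv[0]))
    return "".join(f"{path}\t{count}\n" for path, count in rows)
```

Passing `sys.stdin` to `count_views` and writing `format_report(...)` to stdout would turn this into a filter that a cron job can run over the access log.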
Question
Any recommendations for a simple CLI tool to parse Apache logs into plain text?
Complex-Internal-833@reddit
This reply might be too late for you, but this does everything you require and more. I just finished and released it this week. Here's a complete open-source Apache Log Parser & Data Normalization Solution. A Python module imports Apache2 access logs (LogFormats = vhost_combined, combined, common, extended) and error logs into a MySQL schema of tables, views and functions designed to normalize the data. Client and server components can consolidate logs from multiple web servers and sites, with a complete audit trail and error logging! https://github.com/WillTheFarmer/ApacheLogs2MySQL
Jeron_Baffom@reddit (OP)
It seems you've been working hard for a while ...
Did you do all this by yourself?
Before hitting the database, is it possible to:
Complex-Internal-833@reddit
Have you run it yet? All of that can be done once the data is in MySQL. MySQL is doing all the data manipulation. I initially started doing it in Python, but SQL is way better at it.
A pre-import stored procedure could be executed on the LOAD DATA tables prior to executing the import stored procedure. The import process is where the data normalization occurs. Once the normalization is done, it becomes very clear which data is good and which is bad. It could easily be implemented in a post-import process as well.
Yes, I designed and developed every bit of this application.
I've been designing databases and data processes professionally since 1993.
https://farmfreshsoftware.com
Jeron_Baffom@reddit (OP)
No, not yet. But it is on the radar for a next development iteration.
Impressive.
Are you in any way connected with Linus Torvalds' or Richard Stallman's open source projects?