Coping With Large Log Files
Log files can easily become very, very large, taking hours to download at normal modem speeds. In this document, we detail a few strategies for dealing with large log files.
FastStats can access your log files three ways: (1) it can read them from your hard drive, (2) it can download them from an FTP server, and (3) it can download them from a web server. FastStats will intelligently download the files that you tell it to -- that is, it will not download a log file unless it has changed since its last download.
Most of the time, it is perfectly acceptable to give FastStats your FTP server name and a wildcard that covers all of your log files, and let it manage the downloading process. However, you can get extra performance by manually downloading and deleting your log files. The boost in performance depends on how your ISP accumulates log files:
A new, uniquely named log file each day
A "rotating name" log file system
A "one file" system
You should probably check with your web hosting provider before deleting any log files. If you have any other good tips for managing log files, e-mail us at analyzer -at- mach5.com and we'll post them here.
You should check your log files; the time data may be written in European (DD-MM-YY) format. FastStats, by default, reads log files in United States (MM-DD-YY) date format.
To change the date format, go to the Global Options. Make sure "All log files are in European date format" is checked. The change will take effect when the report is regenerated.
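To see why the date format setting matters, here is a minimal Python sketch, not FastStats code, that reads the same date string both ways. The function name `parse_log_date` and the hyphen-separated, two-digit-year layout are assumptions for illustration; your log files may use a different layout.

```python
from datetime import datetime

def parse_log_date(text, european=False):
    """Parse a two-digit-year date string.

    european=True reads DD-MM-YY; otherwise MM-DD-YY is assumed.
    (Hypothetical helper; the separator and layout are assumptions.)
    """
    fmt = "%d-%m-%y" if european else "%m-%d-%y"
    return datetime.strptime(text, fmt).date()

# The same text gives two different dates depending on the setting.
print(parse_log_date("03-04-01").isoformat())                 # 2001-03-04
print(parse_log_date("03-04-01", european=True).isoformat())  # 2001-04-03
```

A date such as 25-12-00 cannot be read in the United States format at all, since there is no month 25. Dates on or before the 12th of a month are the risky ones, because they parse without error in either format.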
This problem is caused by a very old version of the COMCTL32.DLL file being installed on your system. You can download a new version of the COMCTL32.DLL file from this URL:
The COMCTL32.DLL file is used by a wide variety of applications, and upgrading will most likely slightly improve user interface performance of those applications and eliminate some errors, in addition to enabling FastStats to work.
This help topic tells you about common problems in configuring log file analysis that can cause FastStats to mess up.
1. If your log files are stored locally and you tell FastStats to parse an entire directory, beware! FastStats will parse not only every file in the directory you specified, but also every file in every one of its subdirectories. People have run into problems where a directory containing .EXE files is a subdirectory of the log file directory. FastStats does its best to ignore .EXE files, but in some situations it may try to parse them and crash.
2. Never let FastStats parse your error_log file (if you have one). If you have multiple log files, only let FastStats parse your access_log file. A common problem is to have a directory with access_log, agent_log, referer_log, and possibly error_log. If you tell FastStats to analyze this entire directory, it will give you an error and may even crash on the error_log file. You should tell FastStats to only analyze access_log.
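Before pointing FastStats at a whole directory, you can check what the directory actually contains. The following Python sketch is a stand-alone pre-check, not part of FastStats. The function name `safe_log_candidates` and the skip list are assumptions based on the file names mentioned above.

```python
from pathlib import Path

# File names from the list above that should not be analyzed.
SKIP_NAMES = {"error_log", "agent_log", "referer_log"}

def safe_log_candidates(directory):
    """Return the names of top-level files that look safe to analyze.

    Subdirectories are not searched, and .exe files are left out.
    (Hypothetical helper for illustration only.)
    """
    result = []
    for entry in sorted(Path(directory).iterdir()):
        if not entry.is_file():
            continue  # skip subdirectories entirely
        if entry.name.lower() in SKIP_NAMES:
            continue  # skip error, agent, and referer logs
        if entry.suffix.lower() == ".exe":
            continue  # skip executables
        result.append(entry.name)
    return result
```

If the list that comes back contains anything other than your access logs, it is safer to give FastStats a single file or a wildcard instead of the whole directory.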
Why is the "Geographical Location" report missing? (Analyzer 3 and earlier only)
The Geographical Location report is generated by analyzing the domain name suffixes of users accessing your web site. For example, a hit from "user12.isp.us" is considered to be a hit from a US address, a hit from "user13.isp.de" is a hit from Germany, etc.
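The suffix-based idea can be sketched in a few lines of Python. This is an illustration only, not the FastStats implementation. The table `COUNTRY_BY_SUFFIX` holds just the two suffixes from the example above, and the function name `country_for_host` is hypothetical.

```python
# Only the two suffixes from the example; a real table would be much longer.
COUNTRY_BY_SUFFIX = {"us": "United States", "de": "Germany"}

def country_for_host(hostname):
    """Return a country name for a host name, or None if it cannot be determined."""
    suffix = hostname.rstrip(".").rsplit(".", 1)[-1].lower()
    if suffix.isdigit():
        return None  # looks like an IP address, so there is no suffix to read
    return COUNTRY_BY_SUFFIX.get(suffix)

print(country_for_host("user12.isp.us"))  # United States
print(country_for_host("user13.isp.de"))  # Germany
print(country_for_host("192.0.2.10"))     # None
```

The last call shows why an unresolved IP address contributes nothing to this report: it has no domain suffix to look up.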
The Geographical Location report is enabled if FastStats comes across any domain names in the log file. The majority of log files store IP addresses, numbers that look like "22.214.171.124". Unless your web server automatically stores domain names instead of IP addresses in your log file, it is necessary for FastStats to perform a "Reverse DNS lookup" and translate IP addresses into domain names. Doing a Reverse DNS lookup dramatically slows down the web server log file analysis process.
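For readers curious about what a reverse DNS lookup involves, here is a minimal Python sketch that uses the standard library call `socket.gethostbyaddr`. It is not how FastStats does it. The function name `reverse_lookup`, the replaceable `resolver` parameter, and the optional `cache` dictionary are assumptions added for illustration.

```python
import socket

def reverse_lookup(ip, resolver=socket.gethostbyaddr, cache=None):
    """Return the host name for `ip`, or the IP itself if no name is found.

    Each lookup is a network round trip, so results are remembered in
    `cache` (a plain dict) when one is supplied.
    """
    if cache is not None and ip in cache:
        return cache[ip]
    try:
        name = resolver(ip)[0]  # first item of the result is the primary host name
    except OSError:             # socket.herror and socket.gaierror are OSError subclasses
        name = ip
    if cache is not None:
        cache[ip] = name
    return name
```

Because every address not yet seen costs a network request, a log file with thousands of distinct visitors means thousands of lookups, which is where the slowdown comes from.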
In summary, the Geographical Location report is enabled if one or more of the following is true:
It is common for the time data in your log files to be recorded in an inappropriate time zone. To correct this, FastStats allows you to add or subtract an offset to the hour recorded in the log file. This problem is common if a third party hosts your web site; the third party’s servers may be located on the opposite side of the country, in a completely different time zone. You can adjust the offset to any value between -12 and 12. The best way to determine the offset for your log files is to examine the times recorded near the beginning of your log file. If you know what time your web-hosting provider generates the log files, then you can just subtract to determine the offset. It may take some experimentation. Note that some servers record their log files in GMT (which is 5-8 hours ahead of times in the continental US).
To change the offset, go to the Report menu and choose Options. The option to change the Time Offset is down at the bottom of the page.
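The arithmetic behind the offset can be tried out with a short Python sketch. It is an illustration, not FastStats code. The function name `apply_offset` is hypothetical, and the sketch shifts the whole timestamp, so the date rolls over when the hour crosses midnight; whether FastStats also adjusts the date is not stated here.

```python
from datetime import datetime, timedelta

def apply_offset(timestamp, offset_hours):
    """Shift a logged timestamp by a whole number of hours between -12 and 12."""
    if not -12 <= offset_hours <= 12:
        raise ValueError("offset must be between -12 and 12")
    return timestamp + timedelta(hours=offset_hours)

# A hit logged at 02:30 GMT, viewed with an offset of -5 hours.
logged = datetime(2000, 9, 15, 2, 30)
print(apply_offset(logged, -5))  # 2000-09-14 21:30:00
```

To find your own offset, take a hit whose real local time you know, compare it with the time in the log file, and use the difference in hours.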
Most web servers, especially Apache, buffer the log file data. That is, they store the hit data internally, in memory, and only write it to the log file after it reaches a certain threshold -- generally when the log file data occupies more than a few kilobytes in memory. A 2-3 kilobyte buffer can hold several hundred hits.
You most likely ran into this problem when testing out FastStats. You visited your web site, pressed 'Reload' to ensure that a few hits were registered, and then re-downloaded your log files. Unless a few hundred people have "hit" the server between the time you visited your site and the time you downloaded the log file, your visits are most likely not in the log file (and therefore will not show up in the 'Recent Accesses' report).
Note: this information only applies if your web hosting provider (or system administrator) allows you to access the current day's log files. Most web hosts only let you access yesterday's log files (for a lot of good reasons).
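The buffering effect described above is easy to reproduce. The Python sketch below is a demonstration of buffered file writing in general, not of any particular web server. The function name `size_before_and_after_flush`, the 4096-byte buffer, and the sample log line are assumptions for illustration.

```python
import os
import tempfile

def size_before_and_after_flush(record, buffer_size=4096):
    """Write `record` through a buffered file and return the size of the
    file on disk before and after the buffer is flushed."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "wb", buffering=buffer_size) as log:
            log.write(record)                # held in memory: smaller than the buffer
            before = os.path.getsize(path)
            log.flush()                      # force the buffered data out to the file
            after = os.path.getsize(path)
        return before, after
    finally:
        os.remove(path)

hit = b'192.0.2.1 - - [15/Sep/2000:02:30:00 -0500] "GET / HTTP/1.0" 200 1024\n'
print(size_before_and_after_flush(hit))
```

The first number is 0 and the second is the length of the record: until the buffer fills up or is flushed, the hit exists only in memory, so a log file downloaded in that window will not contain it.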
A great way to debug wildcards is to try them from DOS. Click Start and then Run. Then type "command" and press Enter. You should now have a DOS window on your screen. Now type dir "c:\logfiles\ex0009*.log". If this returns some files, then there's definitely a FastStats problem. If there are no files, then your wildcard is incorrect.
If it's a FastStats problem, we suggest your next step for debugging should be to make sure that you have the "Parse all logs that match this wildcard" option selected, and not "...one file" or "...entire directory".
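If you would rather test a wildcard against a list of file names without opening a DOS window, the same check can be sketched in Python with the standard `fnmatch` module. The function name `matching_logs` and the sample file names are hypothetical, and the matching rules of `fnmatch` are close to, but not guaranteed identical to, those of DOS or FastStats.

```python
from fnmatch import fnmatch

def matching_logs(filenames, pattern):
    """Return the names that match `pattern`, ignoring case as DOS does."""
    return [name for name in filenames if fnmatch(name.lower(), pattern.lower())]

names = ["ex000901.log", "ex000902.log", "ex001001.log", "notes.txt"]
print(matching_logs(names, "ex0009*.log"))  # ['ex000901.log', 'ex000902.log']
```

An empty result here points to the wildcard itself, just as an empty dir listing does.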
If your issue isn't answered by these topics, please contact us for prompt assistance.