Log File Rotation

Discusses decisions associated with log file size and rotation.

With web traffic analysis that relies on web server logs, the first decision to make is how long to keep the raw, unaggregated log files. You may need to access old log files to reanalyze them. For example, you might want to reanalyze raw data based on new configuration settings. Or a server in a cluster might not have been available at the original time of analysis, so you need to reanalyze that entire day’s logs with that server’s log file included.

In a log file, a typical hit ranges from roughly 250 to 750 bytes. If your site experiences an average of 10,000 hits per day, your daily log file is therefore between 2.5 MB and 7.5 MB. If your site experiences up to 5,000,000 hits per day (not unusual for enterprise-level organizations), a daily log file at the same per-hit sizes is roughly 1.25 GB to 3.75 GB. For large organizations with extremely active web sites, generating terabytes of data per year is common.
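The arithmetic above can be reproduced with a short script. This is a minimal sketch, not part of Webtrends: the function names are hypothetical, the 250 to 750 bytes-per-hit range comes from the paragraph above, and sizes use decimal units (1 MB = 1,000,000 bytes).

```python
def estimate_log_size_bytes(hits_per_day, bytes_per_hit=(250, 750)):
    """Return the (low, high) estimated daily log size in bytes."""
    low, high = bytes_per_hit
    return hits_per_day * low, hits_per_day * high


def format_size(num_bytes):
    """Format a byte count using decimal units (1 MB = 1,000,000 bytes)."""
    units = (("TB", 10**12), ("GB", 10**9), ("MB", 10**6), ("KB", 10**3))
    for unit, factor in units:
        if num_bytes >= factor:
            return f"{num_bytes / factor:.2f} {unit}"
    return f"{num_bytes} bytes"


# The two traffic levels discussed above.
for hits in (10_000, 5_000_000):
    low, high = estimate_log_size_bytes(hits)
    print(f"{hits:>9,} hits/day: {format_size(low)} to {format_size(high)} per day")
```

Multiplying the daily figures by 365 gives a rough yearly volume. At 5,000,000 hits per day, that is about 0.46 TB to 1.37 TB per year before compression.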

Because even a single day’s web data activity file can require gigabytes of storage space, most organizations implement a log file rotation scheme that keeps computing resources available for processing tasks. Depending on the volume of traffic that your site experiences, you may wish to rotate (roll over) log files daily, weekly, or monthly.

Note: When an IIS server rolls over daily, it closes one log file and starts a new file at 12:00 A.M. Greenwich Mean Time, not at midnight local time.

Log file rotation involves archiving of data activity files for specified time periods into folders. The folders are then compressed and copied to the analysis server.
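The archive step described above might be scripted along these lines. This is only a sketch under stated assumptions: the function name, the folder layout, and the period label are hypothetical, and it assumes the log files passed in have already been rotated and closed. It uses only the Python standard library.

```python
import shutil
from pathlib import Path


def archive_period(closed_logs, staging_root, period_label):
    """Copy closed log files into a per-period folder, then zip that folder.

    closed_logs  -- paths of log files that are no longer being written to
    staging_root -- hypothetical working directory for archive folders
    period_label -- folder name for the period, for example "2024-01-15"
    Returns the path of the resulting .zip archive.
    """
    period_dir = Path(staging_root) / period_label
    period_dir.mkdir(parents=True, exist_ok=True)

    # Collect the period's log files in one folder.
    for log in closed_logs:
        shutil.copy2(log, period_dir / Path(log).name)

    # Compress the folder; the archive is written next to it as <label>.zip.
    archive = shutil.make_archive(str(period_dir), "zip", root_dir=str(period_dir))
    return Path(archive)
```

The returned archive can then be copied to the analysis server with whatever transfer method your environment uses, for example `shutil.copy2(archive, destination)` when the destination is a mounted network path.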

Rotation schedules can also depend on how you access your log files, and how often you intend to report on those log files. Webtrends should always be configured to analyze log files that have been closed; that is, log files that have been rotated and will no longer be written to with new traffic. If the log files are accessed using FTP, the entire log file must be transferred to the analysis engine before analysis can begin.

Note: Creating a network share on the folder where the logs are located on the remote machine will improve performance, as the log files can then be analyzed directly from the remote machine without the need to first be copied to the analysis server.

Typically, organizations rotate their log files daily; however, you can rotate log files more frequently if needed. After you rotate the log files and analyze them, determine how long to archive them. How long you archive log files depends on your reasons for keeping the data. Some organizations never intend to reanalyze their data, so they discard data shortly after analysis. Other organizations keep their data forever. Most organizations archive data for a period between one quarter and one year.
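Once you have chosen a retention period, old archives can be identified and removed by script. The following is a minimal sketch rather than a Webtrends feature: the function names and the archive folder are hypothetical, the age of an archive is judged by its last-modified time, and the default of 365 days reflects a one-year retention period.

```python
import time
from pathlib import Path

SECONDS_PER_DAY = 86_400


def find_expired_archives(archive_root, retention_days=365, now=None):
    """Return files under archive_root last modified before the retention window."""
    now = time.time() if now is None else now
    cutoff = now - retention_days * SECONDS_PER_DAY
    return sorted(
        path
        for path in Path(archive_root).rglob("*")
        if path.is_file() and path.stat().st_mtime < cutoff
    )


def prune_archives(archive_root, retention_days=365, dry_run=True):
    """List expired archives; delete them only when dry_run is False."""
    expired = find_expired_archives(archive_root, retention_days)
    if not dry_run:
        for path in expired:
            path.unlink()
    return expired
```

Running with `dry_run=True` first shows which files would be deleted, which is a sensible precaution before discarding raw data that cannot be regenerated.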

Recommendations: Log Files

  • Rotate log files daily. Consider rotating log files hourly if you access your log files using FTP, and if your site has a large amount of traffic.
  • Archive analyzed log files for one year.
