This node reads Apache log files.
You can select one or more sources to read from. A source can be one of the following:
- Local directory; all files inside the directory that match the pattern (see below) are read.
- URL denoting a file; all supported protocols are possible, including SSH-based ones if you have installed the SSH extension.
- URL denoting a directory; this is only supported for FTP, and the URL must end with ;type=d. Recursive reading of subdirectories is not supported. You must make sure that either the directory does not contain any subdirectories or that you exclude them using the directory contents pattern. Otherwise you may get an error while reading.
When reading all files in a directory, you can specify a regular expression (not a wildcard expression!) that
the names of the files in the directory must match.
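The difference between a regular expression and a wildcard expression can be sketched as follows. This is only an illustration in Python, not the node's own code: the file names and the helper functions `matching_files` and `is_valid_regex` are hypothetical, and the sketch assumes that the whole file name has to match the expression.

```python
import re

# Hypothetical directory listing, used only for this illustration.
file_names = ["access.log", "access.log.1", "error.log", "notes.txt"]


def matching_files(pattern, names):
    """Return the names that match the regular expression as a whole."""
    compiled = re.compile(pattern)
    return [name for name in names if compiled.fullmatch(name)]


def is_valid_regex(pattern):
    """Report whether the pattern compiles as a regular expression."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


# The wildcard "*.log" becomes ".*\.log" as a regular expression:
# ".*" stands for "any characters" and "\." for a literal dot.
log_files = matching_files(r".*\.log", file_names)

# Rotated access logs: "access.log", optionally followed by ".<number>".
access_files = matching_files(r"access\.log(\.\d+)?", file_names)

print(log_files)
print(access_files)
print(is_valid_regex("*.log"))
```

A wildcard such as `*.log` is not even a valid regular expression here, because the leading `*` has nothing to repeat.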
Next, select the format of the log files. First, specify which locale is used on the
server. This is necessary for parsing dates because, for example, month names differ between locales.
Since log files are usually created with an English locale, the default is an English locale.
You only need to change this if your
web server uses a different locale when writing the log files. The next
step is to specify the date format used
in the log file. Again, you only need to change this if you are using
a non-standard date format. The format specification is not
identical to the one in the Apache configuration but instead uses the Java syntax. Take a look at the Java date format documentation for details.
The last piece is the actual format specification of a complete log line. This format is
identical to the one in the Apache configuration, i.e. you can simply copy it from there. The full syntax is
given in the Apache documentation. The most commonly used fields are:
- %b - the size of the response in bytes
- %h - the client's IP address or name
- %{NAME}i - the value of the request header NAME
- %l - the remote logname, if ident is used
- %r - the request itself
- %s - the HTTP status code
- %t - the request's timestamp
- %u - the remote user, if authentication is used
- %v - the virtual host this request was sent to
- %0 - this special field can be used to process unknown fields
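To show how such a format specification relates to a log line, here is a minimal Python sketch that turns a format string into a regular expression and applies it to one line. It is not the node's parser: the helper `format_to_regex`, the column names in `FIELD_PATTERNS`, and the sample line are assumptions made for this illustration, and only a handful of fields are covered.

```python
import re

# Regular expression fragment for each supported field (illustrative subset;
# the group names are made up for this sketch).
FIELD_PATTERNS = {
    "%h": r"(?P<host>\S+)",
    "%l": r"(?P<logname>\S+)",
    "%u": r"(?P<user>\S+)",
    "%t": r"\[(?P<timestamp>[^\]]+)\]",  # Apache writes %t in square brackets
    "%r": r"(?P<request>.*?)",
    "%s": r"(?P<status>\d{3})",
    "%b": r"(?P<size>\d+|-)",            # "-" is logged when no bytes were sent
}


def format_to_regex(log_format):
    """Build a regular expression from a simple Apache-style format string."""
    parts = []
    # Splitting with a capturing group keeps the field tokens in the result.
    for token in re.split(r"(%[a-z])", log_format):
        if token in FIELD_PATTERNS:
            parts.append(FIELD_PATTERNS[token])
        elif re.fullmatch(r"%[a-z]", token):
            raise ValueError(f"field {token} is not covered by this sketch")
        else:
            parts.append(re.escape(token))  # literal text such as spaces, quotes
    return re.compile("".join(parts))


common_format = '%h %l %u %t "%r" %s %b'
line = ('127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
        '"GET /apache_pb.gif HTTP/1.0" 200 2326')

match = format_to_regex(common_format).fullmatch(line)
fields = match.groupdict() if match else {}
print(fields)
```

If the line does not fit the format, `fullmatch` returns `None` and `fields` stays empty, which roughly corresponds to a format that does not match during analysis.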
The input field offers the two most commonly used formats.
If you click on Analyze log,
the first line of the first file is read and analyzed with the given
format. If the format matches, you will get a preview of the columns and types in the table below. The
types and column names are hard-coded and cannot be changed.
In the second tab you can specify time ranges for the requests you want to include in the output. All requests
outside the specified range are filtered out. The start date is inclusive whereas the end date is exclusive.
For example, to select all requests from March 2013, you would specify
midnight on 1 March 2013 as start date and
midnight on 1 April 2013 as end date. If you are using flow variables to specify dates, you must use the date format as it is used
in the log file and specified in the dialog.
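The inclusive start and exclusive end can be illustrated with a short Python sketch. It assumes the usual Apache timestamp layout and uses Python's `strptime` pattern for it, whereas the node itself expects the Java syntax; the helpers `parse_log_date` and `in_range` and the sample timestamps are hypothetical.

```python
from datetime import datetime

# Python counterpart of the usual Apache timestamp layout. "%b" reads the
# abbreviated month name according to the process locale, which yields
# English names unless the program switches the locale.
LOG_DATE_FORMAT = "%d/%b/%Y:%H:%M:%S %z"


def parse_log_date(text):
    """Parse a timestamp written in the log file's own date format."""
    return datetime.strptime(text, LOG_DATE_FORMAT)


def in_range(timestamp, start, end):
    """Start is inclusive, end is exclusive."""
    return start <= timestamp < end


# Range covering all of March 2013.
start = parse_log_date("01/Mar/2013:00:00:00 +0000")
end = parse_log_date("01/Apr/2013:00:00:00 +0000")

requests = [
    "28/Feb/2013:23:59:59 +0000",  # before the range
    "01/Mar/2013:00:00:00 +0000",  # equal to the start date: kept
    "31/Mar/2013:23:59:59 +0000",  # last second of March: kept
    "01/Apr/2013:00:00:00 +0000",  # equal to the end date: dropped
]
kept = [r for r in requests if in_range(parse_log_date(r), start, end)]
print(kept)
```

Using an exclusive end date means that consecutive ranges, such as March followed by April, never overlap and never leave a gap.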