Another log parsing/data munging tool.

Input jPOS, csv, log4j or JVM garbage collection logs.

Output csv, gnuplot charts, HdrHistograms.

Download the latest build

Examples

These examples are all based on some sample data kindly provided by @apr.

If you want to try them out:

  1. git clone https://github.com/binaryfoo/lagotto.git

  2. cd lagotto

  3. mvn package -DskipTests

  4. ./lago <… example …>

Filtering

Only output entries with an MTI (field 0) of 2100 using --field.

lago log_samples/q2.log --field mti=2100 -n 1
<log realm="channel/127.0.0.1:9001" at="Thu Dec 18 15:12:40 UYST 2014.330" lifespan="1ms">
  <send>
    <isomsg direction="outgoing">
      <!-- org.jpos.iso.packager.GenericPackager[cfg/jcard.xml] -->
      <field id="0" value="2100"/>
      <field id="2" value="6009330000000021"/>
      <field id="3" value="300000"/>
      <field id="4" value="8582000000000000"/>
      <field id="7" value="1218151240"/>
      <field id="11" value="000000000002"/>
      <field id="12" value="20141218151240"/>
      <field id="14" value="1001"/>
      <field id="15" value="20141218"/>
      <field id="17" value="1218"/>
      <field id="22" value="10000000080000000004000001000000" type="binary"/>
      <field id="32" value="000001"/>
      <field id="37" value="180312760317"/>
      <field id="41" value="29110001"/>
      <field id="42" value="001001"/>
      <isomsg id="43">
        <field id="2" value="jCard Selftest system"/>
        <field id="4" value="Montevideo"/>
        <field id="5" value="MV"/>
        <field id="7" value="UY"/>
      </isomsg>
      <isomsg id="113">
        <field id="2" value="106"/>
      </isomsg>
    </isomsg>
  </send>
</log>

The --field option takes an argument of the form path=value. The path can be dot delimited, like 48.1.1. The -n option limits the number of entries output, like head and tail on *nix.
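
Since paths work in filters too, you can match on a nested subfield. An untested sketch using field 43.4 (the city) from the sample entry above:

# untested: should match entries whose field 43.4 is Montevideo
lago log_samples/q2.log --field 43.4=Montevideo -n 1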

To make the remaining examples’ output shorter, let’s introduce the idea of tabular output (--table): list out just the fields you’re interested in.

lago log_samples/q2.log --table time,2,11 -f mti=2100 -n 1
time,2,11
15:12:40.330,6009330000000021,000000000002

Whilst comma and tab separated data might be good for import into Excel/Google Docs/Libre Office/MySQL, an ASCII table like the one mysql’s command line client prints is a bit easier for humans to parse on the command line.

lago log_samples/q2.log -o ascii --table time,2,11 -f mti=2100 -n 1
==================================================
| time         | 2                | 11           |
==================================================
| 15:12:40.330 | 6009330000000021 | 000000000002 |
==================================================

Back to filtering: you can filter using somewhat familiar operators: <, >, !=, ~, and ~/regex/. The last two are contains and regex match.
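
For example, ~/regex/ lets you match a set of MTIs in one go. An untested sketch (21[01]0 covers the 2100 and 2110 messages in the sample data):

# untested sketch
lago log_samples/q2.log -o ascii --table time,mti -f 'mti~/21[01]0/' -n 3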

To show log events that took more than 300ms to receive or send, use >. The two results below aren’t really interesting. A send with a non zero lifespan usually means a GC pause or TCP back pressure.

lago log_samples/q2.log -o ascii --table time,msgType,lifespan,src -f 'lifespan>300'
===========================================================
| time         | msgType         | lifespan | src         |
===========================================================
| 15:12:41.281 | peer-disconnect | 362      | q2.log:5544 |
| 15:12:41.281 | receive         | 356      | q2.log:5549 |
===========================================================

You can also use --grep to just search the log entries as text.

lago log_samples/q2.log --grep suspicious -o ascii -t time,63,src
================================================
| time         | 63              | src         |
================================================
| 15:12:40.537 | Card suspicious | q2.log:1020 |
| 15:12:40.538 | Card suspicious | q2.log:1056 |
| 15:12:40.539 | Card suspicious | q2.log:1092 |
| 15:12:40.760 | Card suspicious | q2.log:1626 |
| 15:12:40.761 | Card suspicious | q2.log:1661 |
| 15:12:40.763 | Card suspicious | q2.log:1696 |
| 15:12:40.767 | Card suspicious | q2.log:1812 |
================================================

Aggregation

Some rather basic SQL-like group by and aggregation operators are supported.

Show a table of message type and count.

lago log_samples/q2.log -o ascii -t mti,count
================
| mti  | count |
================
|      | 92    |
| 2800 | 2     |
| 2810 | 2     |
| 2100 | 35    |
| 2110 | 33    |
| 2420 | 3     |
| 2430 | 3     |
================

Count rows matching a condition, like approved (field 39 = 0000).

lago log_samples/q2.log -o ascii -t 'mti,count,count(39=0000)'
=================================
| mti  | count | count(39=0000) |
=================================
|      | 92    | 0              |
| 2800 | 2     | 0              |
| 2810 | 2     | 2              |
| 2100 | 35    | 0              |
| 2110 | 33    | 0              |
| 2420 | 3     | 0              |
| 2430 | 3     | 0              |
=================================

Combine with filtering to get the spread of response codes for auths.

lago log_samples/q2.log -o ascii -t '39,count' --field '39!=0000' --field mti=2110
================
| 39   | count |
================
| 1011 | 2     |
| 1015 | 3     |
| 1002 | 5     |
| 1014 | 2     |
| 1803 | 21    |
================

Results can be bucketed by time. You can use a Joda-Time format, or a few extras to aggregate into 10 minute (HH:m0) or 10 second (HH:mm:s0) buckets.

lago log_samples/q2.log -o ascii -t 'time(HH:mm:s0),count'
==========================
| time(HH:mm:s0) | count |
==========================
| 15:12:30       | 18    |
| 15:12:40       | 152   |
==========================

The results aren’t that interesting with such a small data set.
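
The same extras cover 10 minute buckets. An untested sketch (with this sample everything would land in a single bucket):

# untested sketch
lago log_samples/q2.log -o ascii -t 'time(HH:m0),count'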

TODO group_concat

TODO group_sample

Pairing Requests & Responses

Match responses to requests using --pair. Show one line per pair, including the round trip time (rtt): response time minus request time.

lago log_samples/q2.log -o ascii -t time,mti,11,rtt --pair
============================================
| time         | mti  | 11           | rtt |
============================================
| 15:12:40.288 | 2800 | 000000000001 | 4   |
| 15:12:40.287 | 2800 | 000000000001 | 6   |
| 15:12:40.331 | 2100 | 000000000002 | 141 |
| 15:12:40.330 | 2100 | 000000000002 | 143 |
| 15:12:40.496 | 2100 | 000000000003 | 6   |
| 15:12:40.496 | 2100 | 000000000003 | 7   |
| 15:12:40.515 | 2100 | 000000000004 | 22  |
| 15:12:40.515 | 2100 | 000000000004 | 23  |
| 15:12:40.548 | 2100 | 000000000005 | 149 |
| 15:12:40.547 | 2100 | 000000000005 | 150 |
| 15:12:40.711 | 2100 | 000000000006 | 49  |
| 15:12:40.710 | 2100 | 000000000006 | 51  |
| 15:12:40.775 | 2100 | 000000000007 | 6   |
| 15:12:40.774 | 2100 | 000000000007 | 8   |
| 15:12:40.794 | 2100 | 000000000008 | 10  |
| 15:12:40.793 | 2100 | 000000000008 | 12  |
| 15:12:40.817 | 2100 | 000000000009 | 8   |
| 15:12:40.817 | 2100 | 000000000009 | 8   |
| 15:12:40.837 | 2100 | 000000000010 | 9   |
| 15:12:40.836 | 2100 | 000000000010 | 11  |
| 15:12:40.862 | 2100 | 000000000011 | 5   |
| 15:12:40.861 | 2100 | 000000000011 | 7   |
| 15:12:40.878 | 2100 | 000000000012 | 6   |
| 15:12:40.877 | 2100 | 000000000012 | 8   |
| 15:12:40.896 | 2100 | 000000000013 | 6   |
| 15:12:40.896 | 2100 | 000000000013 | 7   |
| 15:12:40.918 | 2420 | 000000000014 | 6   |
| 15:12:40.917 | 2420 | 000000000014 | 8   |
============================================

Cleaning Data

Tidy up a field a little using a regex replacement. This feature is probably somewhat gratuitous.

lago log_samples/q2.log -o ascii -t 'time(HH:mm),mti,11(/^0+//),rtt' --pair
=========================================
| time(HH:mm) | mti  | 11(/^0+//) | rtt |
=========================================
| 15:12       | 2800 | 1          | 4   |
| 15:12       | 2800 | 1          | 6   |
| 15:12       | 2100 | 2          | 141 |
| 15:12       | 2100 | 2          | 143 |
| 15:12       | 2100 | 3          | 6   |
| 15:12       | 2100 | 3          | 7   |
| 15:12       | 2100 | 4          | 22  |
| 15:12       | 2100 | 4          | 23  |
| 15:12       | 2100 | 5          | 149 |
| 15:12       | 2100 | 5          | 150 |
| 15:12       | 2100 | 6          | 49  |
| 15:12       | 2100 | 6          | 51  |
| 15:12       | 2100 | 7          | 6   |
| 15:12       | 2100 | 7          | 8   |
| 15:12       | 2100 | 8          | 10  |
| 15:12       | 2100 | 8          | 12  |
| 15:12       | 2100 | 9          | 8   |
| 15:12       | 2100 | 9          | 8   |
| 15:12       | 2100 | 10         | 9   |
| 15:12       | 2100 | 10         | 11  |
| 15:12       | 2100 | 11         | 5   |
| 15:12       | 2100 | 11         | 7   |
| 15:12       | 2100 | 12         | 6   |
| 15:12       | 2100 | 12         | 8   |
| 15:12       | 2100 | 13         | 6   |
| 15:12       | 2100 | 13         | 7   |
| 15:12       | 2420 | 14         | 6   |
| 15:12       | 2420 | 14         | 8   |
=========================================

Sorting

Sort using a field with --sort rtt

lago log_samples/q2.log -o ascii -t 'time,mti,11(/^0+//),rtt' --pair --sort rtt
==========================================
| time         | mti  | 11(/^0+//) | rtt |
==========================================
| 15:12:40.288 | 2800 | 1          | 4   |
| 15:12:40.862 | 2100 | 11         | 5   |
| 15:12:40.287 | 2800 | 1          | 6   |
| 15:12:40.496 | 2100 | 3          | 6   |
| 15:12:40.775 | 2100 | 7          | 6   |
| 15:12:40.878 | 2100 | 12         | 6   |
| 15:12:40.896 | 2100 | 13         | 6   |
| 15:12:40.918 | 2420 | 14         | 6   |
| 15:12:40.496 | 2100 | 3          | 7   |
| 15:12:40.861 | 2100 | 11         | 7   |
| 15:12:40.896 | 2100 | 13         | 7   |
| 15:12:40.774 | 2100 | 7          | 8   |
| 15:12:40.817 | 2100 | 9          | 8   |
| 15:12:40.817 | 2100 | 9          | 8   |
| 15:12:40.877 | 2100 | 12         | 8   |
| 15:12:40.917 | 2420 | 14         | 8   |
| 15:12:40.837 | 2100 | 10         | 9   |
| 15:12:40.794 | 2100 | 8          | 10  |
| 15:12:40.836 | 2100 | 10         | 11  |
| 15:12:40.793 | 2100 | 8          | 12  |
| 15:12:40.515 | 2100 | 4          | 22  |
| 15:12:40.515 | 2100 | 4          | 23  |
| 15:12:40.711 | 2100 | 6          | 49  |
| 15:12:40.710 | 2100 | 6          | 51  |
| 15:12:40.331 | 2100 | 2          | 141 |
| 15:12:40.330 | 2100 | 2          | 143 |
| 15:12:40.548 | 2100 | 5          | 149 |
| 15:12:40.547 | 2100 | 5          | 150 |
==========================================

Or sort descending using --sort rtt desc

lago log_samples/q2.log -o ascii -t 'time,mti,11(/^0+//),rtt' --pair --sort 'rtt desc'
==========================================
| time         | mti  | 11(/^0+//) | rtt |
==========================================
| 15:12:40.547 | 2100 | 5          | 150 |
| 15:12:40.548 | 2100 | 5          | 149 |
| 15:12:40.330 | 2100 | 2          | 143 |
| 15:12:40.331 | 2100 | 2          | 141 |
| 15:12:40.710 | 2100 | 6          | 51  |
| 15:12:40.711 | 2100 | 6          | 49  |
| 15:12:40.515 | 2100 | 4          | 23  |
| 15:12:40.515 | 2100 | 4          | 22  |
| 15:12:40.793 | 2100 | 8          | 12  |
| 15:12:40.836 | 2100 | 10         | 11  |
| 15:12:40.794 | 2100 | 8          | 10  |
| 15:12:40.837 | 2100 | 10         | 9   |
| 15:12:40.774 | 2100 | 7          | 8   |
| 15:12:40.817 | 2100 | 9          | 8   |
| 15:12:40.817 | 2100 | 9          | 8   |
| 15:12:40.877 | 2100 | 12         | 8   |
| 15:12:40.917 | 2420 | 14         | 8   |
| 15:12:40.496 | 2100 | 3          | 7   |
| 15:12:40.861 | 2100 | 11         | 7   |
| 15:12:40.896 | 2100 | 13         | 7   |
| 15:12:40.287 | 2800 | 1          | 6   |
| 15:12:40.496 | 2100 | 3          | 6   |
| 15:12:40.775 | 2100 | 7          | 6   |
| 15:12:40.878 | 2100 | 12         | 6   |
| 15:12:40.896 | 2100 | 13         | 6   |
| 15:12:40.918 | 2420 | 14         | 6   |
| 15:12:40.862 | 2100 | 11         | 5   |
| 15:12:40.288 | 2800 | 1          | 4   |
==========================================

More Filtering

Show only pairs where the response took more than 100ms using --field

lago log_samples/q2.log -o ascii -t time,mti,11,rtt --pair --field rtt>100
============================================
| time         | mti  | 11           | rtt |
============================================
| 15:12:40.331 | 2100 | 000000000002 | 141 |
| 15:12:40.330 | 2100 | 000000000002 | 143 |
| 15:12:40.548 | 2100 | 000000000005 | 149 |
| 15:12:40.547 | 2100 | 000000000005 | 150 |
============================================

Remember you’ll need to use quotes 'rtt>100' or a backslash rtt\>100 to escape the '>' when using bash.

Add the file name and line number to get more detail (or to check that lago is not lying)

lago log_samples/q2.log -o ascii -t time,mti,11,rtt,src --pair --field rtt>100
==========================================================
| time         | mti  | 11           | rtt | src         |
==========================================================
| 15:12:40.331 | 2100 | 000000000002 | 141 | q2.log:378  |
| 15:12:40.330 | 2100 | 000000000002 | 143 | q2.log:347  |
| 15:12:40.548 | 2100 | 000000000005 | 149 | q2.log:1285 |
| 15:12:40.547 | 2100 | 000000000005 | 150 | q2.log:1254 |
==========================================================

Use the aggregate operators min(), max() and avg() to get basic stats about the round trip times

lago log_samples/q2.log -o ascii -t mti,min(rtt),max(rtt),avg(rtt),count --pair
=================================================
| mti  | min(rtt) | max(rtt) | avg(rtt) | count |
=================================================
| 2800 | 4        | 6        | 5        | 2     |
| 2100 | 5        | 150      | 35       | 24    |
| 2420 | 6        | 8        | 7        | 2     |
=================================================

Find delays of more than 100ms in a message sequence

lago log_samples/q2.log -o ascii -t time,delay,src --field delay>100
======================================
| time         | delay | src         |
======================================
| 15:12:37.626 | 113   | q2.log:55   |
| 15:12:37.792 | 126   | q2.log:105  |
| 15:12:40.014 | 2218  | q2.log:120  |
| 15:12:40.272 | 227   | q2.log:268  |
| 15:12:40.472 | 141   | q2.log:411  |
| 15:12:40.697 | 149   | q2.log:1318 |
| 15:12:41.279 | 342   | q2.log:5519 |
======================================

Show a table of exceptions. Not very interesting with only one exception but useful when there are many

lago log_samples/q2.log -o ascii -t exception,src --field exception!=
==================================================================================
| exception                                                        | src         |
==================================================================================
| Sourced file: inline evaluation of: ``DATE=new Date();      MTI= | q2.log:5479 |
==================================================================================

Show a breakdown of message types

lago log_samples/q2.log -o ascii -t msgType,count
=============================
| msgType           | count |
=============================
| info              | 39    |
| jce-provider      | 1     |
| local-master-keys | 1     |
| iso-server        | 2     |
| connect           | 1     |
| session-start     | 1     |
| trace             | 30    |
| send              | 28    |
| receive           | 29    |
| abort             | 12    |
| debug             | 13    |
| commit            | 1     |
| error             | 10    |
| peer-disconnect   | 1     |
| session-end       | 1     |
=============================

Dictionaries

TODO Convert a snippet of jPOS' CMF into a dictionary to show how translation works

lago log_samples/q2.log -o ascii -t 'translate(39),39,count' --field mti=2110
================================
| translate(39) | 39   | count |
================================
|               | 1011 | 2     |
|               | 1015 | 3     |
|               | 1002 | 5     |
|               | 1014 | 2     |
|               | 1803 | 21    |
================================

Histograms

You can output an HDR Histogram using --histogram. This has some use for plotting response times (rtt).

lago log_samples/q2.log --pair --histogram rtt
Value     Percentile TotalCount 1/(1-Percentile)
       4.000 0.000000000000          1           1.00
       6.000 0.100000000000          8           1.11
       6.000 0.200000000000          8           1.25
       7.000 0.300000000000         11           1.43
       8.000 0.400000000000         16           1.67
       8.000 0.500000000000         16           2.00
       8.000 0.550000000000         16           2.22
       9.000 0.600000000000         17           2.50
      11.000 0.650000000000         19           2.86
      12.000 0.700000000000         20           3.33
      22.000 0.750000000000         21           4.00
      23.000 0.775000000000         22           4.44
      49.000 0.800000000000         23           5.00
      51.000 0.825000000000         24           5.71
      51.000 0.850000000000         24           6.67
     141.000 0.875000000000         25           8.00
     141.000 0.887500000000         25           8.89
     143.000 0.900000000000         26          10.00
     143.000 0.912500000000         26          11.43
     143.000 0.925000000000         26          13.33
     149.000 0.937500000000         27          16.00
     149.000 0.943750000000         27          17.78
     149.000 0.950000000000         27          20.00
     149.000 0.956250000000         27          22.86
     149.000 0.962500000000         27          26.67
     150.000 0.968750000000         28          32.00
     150.000 1.000000000000         28
#[Mean    =       31.286, StdDeviation   =       48.111]
#[Max     =      150.000, Total count    =           28]
#[Buckets =           32, SubBuckets     =         2048]

JSON export

Export each log entry as a single line of JSON. The driving idea here was to allow import into Apache Spark for querying with Spark SQL.

lago log_samples/q2.log --field mti=2100 -n 1 --json
{"at":"2014-12-18T15:12:40.330-0200","lifespan":1,"realm":"channel/127.0.0.1:9001","mti":"2100","pan":"6009330000000021","processing_code":"300000","transaction_amount":8582000000000000,"transmission_date_and_time":"1218151240","stan":2,"local_transaction_time":"20141218151240","expiration_date":"1001","settlement_date":"20141218","capture_date":"1218","point_of_service_entry_mode":"10000000080000000004000001000000","acquiring_institution_identification_code":"000001","rrn":"180312760317","card_acceptor_terminal_identification":"29110001","card_acceptor_identification_code":"001001","city":"jCard Selftest system","country":"Montevideo","f_43_5":"MV","f_43_7":"UY","f_113_2":"106","msgType":"send"}

Digest format

Output each entry in a slightly less wordy format. Supports translation of field paths to names using a dictionary (TODO explain…​).

lago log_samples/q2.log --field mti=2100 -n 1 --digest
<log realm="channel/127.0.0.1:9001" at="2014-12-18 15:12:40.330" type="send" lifespan="1">
  0: 2100 [mti]
  2: 6009330000000021 [Primary account number]
  3: 300000 [Processing code]
  4: 8582000000000000 [Transaction amount]
  7: 1218151240 [Transmission date and time]
  11: 000000000002 [System trace audit number]
  12: 20141218151240 [Local transaction time]
  14: 1001 [Expiration date]
  15: 20141218 [Settlement date]
  17: 1218 [Capture date]
  22: 10000000080000000004000001000000 [Point of service entry mode]
  32: 000001 [Acquiring institution identification code]
  37: 180312760317 [Retrieval reference number]
  41: 29110001 [Card acceptor terminal identification]
  42: 001001 [Card acceptor identification code]
  43.2: jCard Selftest system [City]
  43.4: Montevideo [Country]
  43.5: MV
  43.7: UY
  113.2: 106
</log>

Gnuplot script generation

lago log_samples/q2.log --pair --plot 'time vs (rtt)' --gnuplot rtts
Wrote rtts.csv
Wrote rtts.gp

Produces a .csv file to be plotted

time,rtt
15:12:40.288,4
15:12:40.287,6
15:12:40.331,141
15:12:40.330,143
15:12:40.496,6
15:12:40.496,7
15:12:40.515,22
15:12:40.515,23
15:12:40.548,149
15:12:40.547,150
15:12:40.711,49
15:12:40.710,51
15:12:40.775,6
15:12:40.774,8
15:12:40.794,10
15:12:40.793,12
15:12:40.817,8
15:12:40.817,8
15:12:40.837,9
15:12:40.836,11
15:12:40.862,5
15:12:40.861,7
15:12:40.878,6
15:12:40.877,8
15:12:40.896,6
15:12:40.896,7
15:12:40.918,6
15:12:40.917,8

A gnuplot script to plot the .csv data

#!/usr/bin/env gnuplot
name = 'rtts'
csv = name.'.csv'
set datafile separator ','
set terminal svg enhanced mouse standalone size 1280,960; set output name.'.svg'
#set terminal pngcairo size 1280,960; set output name.'.png'
set xdata time
set timefmt '%H:%M:%S'
set format x '%H:%M:%S'
set format y '%.1s %c'
set lmargin 10
set xrange ['15:12:40.288':'15:12:40.917']
set multiplot layout 1,1 title 'rtts'

plot csv using 1:2 w lines lt 2 t column(2)

Executes the gnuplot script to produce a chart

Example plot
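
To tweak and re-render the chart yourself, run the generated script directly (assumes gnuplot is installed):

gnuplot rtts.gp
# writes rtts.svg, as configured by the 'set terminal svg ... set output' line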

When you need more

See examples or the sbt shell script examples.

log4j

Given a set of log4j entries that look like:

[01 Jan 1970 10:00:57,135] ERROR [a.ClassName]: amphigenesis receival sulphurlike manlikely worricow postflexion
lago log_samples/log4j.txt.gz --in-format log4j --table 'time(HH),level,count' -n 3
time(HH),level,count
10,ERROR,21
10,INFO,23
11,INFO,18

Garbage Collection logs

lago log_samples/gc.log.gz --in-format gc --plot 'datetime vs (pause) as points labelled by (type), (heapBefore,heapAfter,heapMax) as points, (PSYoungGenBefore,PSYoungGenAfter,PSYoungGenMax) as points, (ParOldGenBefore,ParOldGenAfter,ParOldGenCapacityBefore)' --sort datetime --gnuplot gc-example-chart
Wrote gc-example-chart.csv
Wrote gc-example-chart.gp

Produces a .csv to be plotted, a .gp to do the plotting and a chart

Example garbage collection chart

The output for all examples is generated by RunExamples.scala.

Reference

Filters

Use these with the filters: -f and --field.

Example          Description
mti=0200         Field 0 (MTI) is 0200
mti!=0200        Field 0 (MTI) is anything other than 0200
39!=             Field 39 is not empty
rtt>500          Round trip time (available with --pair) is greater than or equal to 500ms
11<123456        STAN is less than or equal to 123456
48~magic         Field 48 contains magic (case sensitive)
mti~/08[0-3]0/   Field 0 (MTI) matches the regular expression 08[0-3]0

If you specify multiple filters, only records that match all of them will be returned.

To match one thing or another you can use a regex like 48~/(one thing)|(or another)/. A little contrived but it works for now.
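
For example, combining two filters narrows the output to approved network management responses. An untested sketch (per the earlier count table, only the 2810s have 39=0000 in this sample):

# untested sketch
lago log_samples/q2.log -o ascii -t time,mti,39 --field mti=2810 --field 39=0000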

Field Expressions

Use these with the table output format --table (with --out-format csv, tsv, ascii, html or jira).

Example             Description
0                   Field 0 (the MTI) from a jPOS <log> entry. Aliased by mti.
48.3.2.1            A quite deeply nested subfield
field(/regex/$1/)   Apply the regular expression to the value of field and output the replacement. A bit like sed but with Java’s regex flavour (example below).
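
For example, $1 can extract a capture group. An untested sketch that keeps, hypothetically, just the first two digits of the STAN:

# untested sketch; the (..) group is arbitrary
lago log_samples/q2.log -o ascii -t 'time,11(/^(..).*/$1/)' -n 2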

jPOS

In a jPOS log entry a bunch of fields are exposed. For example:

<log realm="some.channel/10.0.0.1:4321" at="Mon Nov 24 01:31:59 EST 2014.292" lifespan="10005ms">
  <receive>
    <isomsg direction="incoming">
      <!-- org.jpos.iso.packager.XMLPackager -->
      <field id="0" value="0800"/>
      <field id="7" value="1124000003"/>
      <field id="11" value="28928"/>
      <isomsg id="48">
        <field id="1" value="nested subfield"/>
      </isomsg>
    </isomsg>
  </receive>
</log>

Name        Value in example
realm       some.channel/10.0.0.1:4321
ipAddress   10.0.0.1
socket      10.0.0.1:4321
link        some.channel
timestamp   2014-11-24 01:31:59.292
lifespan    10005
0           0800
mti         0800
48.1        nested subfield
msgType     receive

Some extra derived fields are available too.

Name           Value in example
icon           ←
summary        ← 0800 (Network Management Request)
length(48.1)   15

log4j

In a log record like [08 Nov 2014 13:54:59,001] INFO [a.ClassName]: Did something useful the following fields are available.

Name        Value in example
timestamp   2014-11-08 13:54:59.001
level       INFO
category    a.ClassName
message     Did something useful

If the log4j record contains a jPOS message, like the following example, all the usual jPOS fields are available. If there is a name collision, the value from the jPOS entry is available with the jpos. prefix: for example timestamp and jpos.timestamp below.

[08 Nov 2014 01:00:20,529] ERROR [some.package]: <log realm="some.channel/172.0.1.7:4779" at="Sat Nov 08 01:00:21 EST 2014.001" lifespan="290ms">
    <receive>
      <exception name="Oops">
      Oops
      </exception>
    </receive>
  </log>

Name             Value in example
timestamp        2014-11-08 01:00:20.529
jpos.timestamp   2014-11-08 01:00:21.001
exception        Oops
realm            some.channel/172.0.1.7:4779

More fields available on all log types

Name                            Description
delay                           Milliseconds elapsed between two consecutive messages
src                             File name and line number the output row is derived from
file                            File name the output row is derived from
line                            Line number the output row is derived from
field(/regex/replacement/)      Regex replacement
elapsed                         Time since the first log entry
elapsedSince(filter)            Time since the last log entry matching filter
if(filter,trueExpr,falseExpr)   If filter is true, show trueExpr, otherwise falseExpr. Either expression can be blank (see the example after this table).
distinct(field)                 Only one row per unique value of field
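
For example, if() can flag slow round trips inline. An untested sketch: the SLOW label is just an arbitrary string, and the blank falseExpr leaves fast rows empty.

# untested sketch
lago log_samples/q2.log --pair -o ascii -t 'time,mti,rtt,if(rtt>100,SLOW,)'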

Aggregation

Use count, sum, min, max, avg, group_concat, group_sample and count(condition).

You can also use distinct(), eg group_concat(distinct(2)) (see the example below).

group_sample(src 3) picks 3 random src values from the N values in a group.
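
For example, listing the distinct response codes seen for each MTI. An untested sketch:

# untested sketch
lago log_samples/q2.log -o ascii -t 'mti,group_concat(distinct(39))'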

Pairing

The --pair option matches responses to requests based on MTI, STAN and realm. It only works for jPOS entries.

The following additional fields will be available to filter, aggregate and display.

Name               Description
request.<field>    Value of <field> on the request. Eg request.timestamp or request.2
req.<field>        Same as above
response.<field>   Value of <field> on the response. Eg response.timestamp or resp.39
resp.<field>       Same as above
rtt                Round trip time in milliseconds: response.timestamp - request.timestamp

By default fields are looked up on the request and, if not found, on the response. Eg timestamp is the same as req.timestamp, whilst 39 would typically fall through to resp.39 since 39 is usually a response field.
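
An untested sketch that names both sides of the pair explicitly instead of relying on the fall-through:

# untested sketch
lago log_samples/q2.log --pair -o ascii -t 'request.timestamp,response.timestamp,resp.39,rtt' -n 3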

Input Formats

By default the input type is sniffed (automagically detected) by looking at the start of each line. jPOS and log4j entries are detected.

Other formats can be explicitly specified with --in-format.

Format   Description
csv      Comma separated values. The field names are taken from the first row: the header.
icsv     Comma separated values without a header row. Fields are numbered starting at zero rather than named.
tsv      Tab separated values. The field names are taken from the first row: the header.
gc       Java garbage collection log output. Quite brittle: depends on the JVM and version.
apache   Apache access logs.

Output Formats

By default the output is the same text that was input: lago just acts as a filter, not an aggregator or formatter.

Flag                 Example                                    Format
--table              --table mti,nmic,count                     Output just the specified field values. Comma separated by default. Change the format using --out-format.
--out-format tsv     --table mti,nmic,count --out-format tsv    Tab separated field values. Abbreviated form -o tsv.
--out-format html    --table time,summary,src -o html           HTML table
--out-format ascii   -t exception,count -o ascii                ASCII 'art' table. Like the mysql client.
--plain              -t time,summary,src --plain                Avoid unicode characters that might not print on your terminal.
--out-format jira    -o jira --table exception,count            Table that can be pasted into jira
--digest             No value required                          Slightly shorter. Nesting compacted to one line.
--json               No value required                          A single line of JSON per input record.
--table -o sqlIn     --table distinct(ipAddress) -o sqlIn       A list of values that can be used in a SQL IN clause.

You can use --live with --table to spit out the results incrementally (as they are discovered) rather than waiting until all field widths are known.

Others

--names to add a name attribute to each <field> element

--ui to open the results in a browser

--merge to remove duplicates when reading partially overlapping files

--highlight for ANSI colour codes