Why I care about Logging

  • Linux Kernel
  • Rackspace:
    • I work on Cloud Monitoring
    • API Driven, Monitoring as a Service.
    • Distributed across all Rackspace data centers.
    • Async events trigger support issues.
    • When things break, we can be called in.

Application vs System Logging

  • Application and Service Logging, not OS level.
  • Don't abandon syslog, supplement it.

Logging is an Event Emitter

logging.error('Something is wrong');
emit('log', {level: 'error', msg: 'Something is wrong'});

There is no difference here.

Structured Logging

Structured Logging

  • Not a new idea!
  • Most logs already have some structure
  • 1995: Common Log Format
  • 127.0.0.1 - - [10/Oct/2000:13:55:36 -0700] \
        "GET /apache_pb.gif HTTP/1.0" 200 2326
    

Why Structured Logging?

  • Many producers
  • Many consumers
  • Many programming languages
  • Services developed with only loose coupling.

Structured Logging: JSON

  • Accessible in all Programming Languages
  • Newline terminated.
  • grep works.

Structured Logging: JSON

{
    "timestamp": 1324830675.076,
    "status": "404",
    "short_message": "File does not exist: /var/www/no-such-file",
    "host": "ord1.product.api0",
    "facility": "httpd",
    "errno": "ENOENT",
    "remote_host": "50.57.61.4",
    "remote_port": "40100",
    "path": "/var/www/no-such-file",
    "uri": "/no-such-file",
    "level": 4,
    "headers": {
        "user-agent": "BadAgent/1.0",
        "connection": "close",
        "accept": "*/*"
    },
    "method": "GET",
    "unique_id": ".rh-g2Tm.h-ord1.product.api0.r-axAIO3bO.c-9210.ts-1324830675.v-24e946e"
}

Combined Log Format vs JSON

Perl Cookbook says:

my ($client, $identuser, $authuser, $date, $time, $tz, $method,
    $url, $protocol, $status, $bytes) =
/^(\S+) (\S+) (\S+) \[([^:]+):(\d+:\d+:\d+) ([^\]]+)\] "(\S+) (.*?) (\S+)"
    (\S+) (\S+)$/;
  $msg = JSON->decode($json_text);

Now a developer adds a new field...

Elements of Every Log Message

Timestamps, Host, Messages, Levels...

Message Tags / IDs

  • Easy to Search for strings.
  • Important for both Internal and Open Source.
  • Added in Apache HTTPD 2.4: "AH02182".
[Wed Jun 27 00:00:17.242251 2012] [allowmethods:error]
[pid 70941:tid 4446611456] [client 203.0.113.3:4172]
AH01623: client method denied by server configuration:
'PROPFIND' to /x1/www/ooo-site.apache.org/content/3

Trace IDs

  • Unique ID assigned to actions at the edge of your network
  • Propagated to all your backend services.

Trace IDs

.rh-el1U.h-ord1-maas-prod-api0.\
r-tvTCwfS3.c-33110.ts-1340775011499.v-c76a8c29
  • rh: random on process startup
  • h: hostname
  • c: counter for this process
  • r: random per ID
  • ts: timestamp
  • v: Git version hash

Trace IDs: Twitter Zipkin

  • Open Sourced June 7th
  • Uses 64bit Integer as Trace ID
  • Aggregates via Scribe, stores in Cassandra, Ruby Web application

Shipping Logs

Shipping Logs: Goals

  • If the network fails....
  • Always write to local disk.
  • Always write to local disk.
  • Always write to local disk.
  • Real Time-ish.

svlogd & Scribe

Description

Shipping Logs: Goals

  • Scribe
  • Apache Flume
  • syslog

Scribe

  • Open Sourced by Facebook
  • Bulk Log mover
  • Can buffer, retry, hash distribution, etc

Scribe Setup

  • Services always send to a localhost
  • Local Scribe sends to higher level router Scribes.
  • Router Scribes forward to another Scribe which over localhost sends to Graylog2.

Graylog2

  • Open Source Web Application
  • Many Inputs (Syslog, Scribe, RabbitMQ)
  • Indexes into ElasticSearch

Graylog2 Streams

Graylog2 Tracing

Graylog2 Message Permalink

Detour: Audit Logs of HTTP APIs

Audit Logs

  • Not great for all types of APIs
  • Every mutation (POST/PUT/DELETE) is recorded.
  • Includes Headers and Request bodies.
  • Available to user in public API.

Audit Logs

"values": [
{
  "id": "541615e0-bee4-11e1-a3c9-69984867fc3c",
  "timestamp": 1340642367038,
  "method": "POST",
  "url": "/v1.0/626873/entities/entMvVW47r/test-check",
  "app": "checks",
  "query": {},
  "txnId": ".rh-TL0q.h-ord1-maas-prod-api0.r-8NQS2mx5.c-280514.ts-1340642366874.v-1a7152ba29615a722a2713bef4d4fe2b5c6ee7ae",
  "payload": "{\"target_hostname\":\"www.example.com\",\"type\":\"remote.http\",\"details\":{},\"monitoring_zones_poll\":[\"mzord\",\"mzdfw\",\"mzlon\"]}",
  "account_id": "626873",
  "headers": {
      "host": "monitoring.api.rackspacecloud.com",
      "accept-encoding": "gzip,deflate",
      "content-type": "application/json; charset=UTF-8",
      "accept": "application/json",
      "user-agent": "libcloud/0.10.1 (Rackspace Monitoring)",
      "content-length": "140"
  },
  "statusCode": 400
},

<Thank You!>

http://ifup.org/slides