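# Access Log Format
# Log Sample
Each line of the access log is one record, and each record represents one request. Fields are separated by spaces. The example below contains two log records.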
SINA000000ABCDEFGHIJ my-bucket [29/Aug/2014:20:23:12 +0800] 111.161.68.74 GRPS000000ANONYMOUSE 000a2709-1408-2920-2312-782bcb67c7d1 REST.GET.OBJECT /path/to/file "GET /my-bucket/path/to/file HTTP/1.1" 304 - 0 4510 28 27999 "http://edu.sina.com.cn/a/" "Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/31.0.1650.63 Safari/537.36" -
SINA000000ABCDEFGHIJ my-bucket [29/Aug/2014:20:23:12 +0800] 111.161.68.74 GRPS000000ANONYMOUSE 000a2709-1408-2920-2312-782bcb67c7d9 REST.GET.OBJECT /path/to/file/xx "GET /path/to/file/xx HTTP/1.1" 404 NoSuchBucket 0 4510 28 27999 "http://foo.com/a" "Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/31.0.1650.63 Safari/537.36" -
Any field may be set to `-`, which indicates that the data is unknown or has no value, or that the field does not apply to this request.
# Field Descriptions
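Field | Example | Description |
---|---|---|
Bucket Owner | SINA000000ABCDEFGHIJ | User ID of the bucket owner |
Bucket | my-bucket | Name of the bucket the request targets |
Time | [29/Aug/2014:20:23:12 +0800] | Time at which the server received the request, in the format [%d/%b/%Y:%H:%M:%S %z] |
Remote IP | 111.161.68.74 | IP address of the client |
Requester | GRPS000000ANONYMOUSE | Identity of the requester (user_id or group_id) |
Request ID | 000a2709-1408-2920-2312-782bcb67c7d1 | Unique identifier generated by the cloud storage server for each request |
Operation | REST.PUT.OBJECT | Operation identifier, in the form REST.HTTP_method.resource_type |
Key | /path/to/file | Key of the object |
Request-URI | "GET /my-bucket/path/to/file HTTP/1.1" | The HTTP request line |
HTTP status | 200 | HTTP response status code |
Error Code | NoSuchBucket | Error code; `-` is used as a placeholder when there is no error |
Bytes Sent | 2662992 | Size of the response data sent by the server (downstream), in bytes |
Object Size | 3462992 | Size of the object, in bytes |
Total Time | 70 | Total time from the server receiving the request to the end of the response, in milliseconds |
Turn-Around Time | 10 | Time from the server receiving the request until it starts responding (excluding download time), in milliseconds |
Referrer | "http://edu.sina.com.cn/a/" | Source of the request, as sent by the browser in the HTTP Referer header |
User-Agent | "Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/31.0.1650.63 Safari/537.36" | User-Agent of the client that sent the request |
Version Id | 3HL4kqtJvjVBH40Nrjfkd | Currently has no specific meaning; normally `-` is used as a placeholder |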
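
The Time and Operation fields have a documented internal structure, so they can be broken down further once a record has been split into fields. The following is a minimal sketch rather than an official client: the helper names `parse_log_time` and `split_operation` and the array keys `prefix`, `method`, and `resource` are hypothetical and chosen only for illustration. It assumes PHP 7.1 or later.

```php
<?php
// Hypothetical helpers; the names are illustrative and not part of the service.

// Convert the bracketed Time field into a DateTimeImmutable, or null on failure.
function parse_log_time(string $field): ?DateTimeImmutable
{
    // [%d/%b/%Y:%H:%M:%S %z] corresponds to PHP's "d/M/Y:H:i:s O"
    // once the surrounding brackets are removed.
    $time = DateTimeImmutable::createFromFormat('d/M/Y:H:i:s O', trim($field, '[]'));
    return $time === false ? null : $time;
}

// Split an Operation value of the form REST.HTTP_method.resource_type.
function split_operation(string $operation): ?array
{
    $parts = explode('.', $operation);
    if (count($parts) !== 3) {
        return null; // for example "-" or an unexpected shape
    }
    return ['prefix' => $parts[0], 'method' => $parts[1], 'resource' => $parts[2]];
}

$time = parse_log_time('[29/Aug/2014:20:23:12 +0800]');
echo $time->format(DATE_ATOM), "\n"; // 2014-08-29T20:23:12+08:00
print_r(split_operation('REST.GET.OBJECT'));
```

Both helpers return `null` for the `-` placeholder, so callers can tell a missing value from a parsed one.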
# Log Parsing Example (PHP)
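function parse_log($log) {
    // One named group per field, in the order listed in the field table.
    // The last group uses \S+ so that a multi-character Version Id is captured in full.
    $pattern = '/(?P<owner>\S+) (?P<bucket>\S+) (?P<time>\[[^]]*\]) (?P<ip>\S+) (?P<requester>\S+) (?P<reqid>\S+) (?P<operation>\S+) (?P<key>\S+) (?P<request>"[^"]*") (?P<status>\S+) (?P<error>\S+) (?P<bytes>\S+) (?P<size>\S+) (?P<totaltime>\S+) (?P<turnaround>\S+) (?P<referrer>"[^"]*") (?P<useragent>"[^"]*") (?P<version>\S+)/';
    $match = preg_match($pattern, $log, $matches);
    if ($match && is_array($matches) && count($matches) > 0) {
        foreach ($matches as $key => $value) {
            // Drop the numeric duplicates that preg_match adds alongside named groups.
            if (is_numeric($key)) {
                unset($matches[$key]);
                continue;
            }
            // "-" marks an unknown or inapplicable value; normalize it to an empty string.
            if ($value === '-' || $value === '"-"') $matches[$key] = '';
        }
        return $matches;
    }
    // The line did not match the expected format.
    return false;
}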
$log = 'SINA000000ABCDEFGHIJ my-bucket [29/Aug/2014:20:23:12 +0800] 111.161.68.74 GRPS000000ANONYMOUSE 000a2709-1408-2920-2312-782bcb67c7d1 REST.GET.OBJECT /path/to/file "GET /my-bucket/path/to/file HTTP/1.1" 304 - 0 4510 28 27999 "http://edu.sina.com.cn/a/" "Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/31.0.1650.63 Safari/537.36" -';
$parse = parse_log($log);
print_r($parse);
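
`parse_log` returns every field as a string, with the `-` placeholder turned into an empty string. A possible follow-up step is to cast the numeric fields to integers. The sketch below assumes the array keys produced by the named groups in the pattern above; `cast_numeric_fields` is a hypothetical helper, and the input array is a hand-written subset of a parsed record so that the block runs on its own.

```php
<?php
// Hypothetical helper: cast the numeric fields of a parsed record to int.
// Assumes the array keys produced by the named groups in parse_log().
function cast_numeric_fields(array $record): array
{
    $numeric = ['status', 'bytes', 'size', 'totaltime', 'turnaround'];
    foreach ($numeric as $name) {
        $value = (string) ($record[$name] ?? '');
        // An empty string means the log held "-" (unknown or not applicable).
        $record[$name] = ctype_digit($value) ? (int) $value : null;
    }
    return $record;
}

// Hand-written subset of the first sample record, as parse_log() would return it.
$record = cast_numeric_fields([
    'status' => '304', 'error' => '', 'bytes' => '0',
    'size' => '4510', 'totaltime' => '28', 'turnaround' => '27999',
]);
var_dump($record['status']); // int(304)
```

Using `null` for missing values keeps a real `0` (for example zero bytes sent) distinct from a field that was not recorded.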