Strip pformat() newlines (\n) from a dict in jsonlogger.JsonFormatter



I am using jsonlogger.JsonFormatter to write JSON log files, as shown below (only the important parts of the code are extracted). For the console I use a plain StreamHandler.

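During program execution I collect some statistics in a dict and log them when the program finishes. I use pformat to get more human-readable output, which works fine with the console StreamHandler.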

from pprint import pformat
logger.info(pformat(stats_dict, indent=2, sort_dicts=False))

Console output of the dict with pformat:

2022-01-29 13:53:43 [INFO   ] extractor    
{ 'search_count': 240,
  'no_response': 0,
  'no_result': 8}

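But this also means the JSON formatter no longer recognises the key/value pairs of the dictionary. Instead, the whole dict ends up as a single string value under the message key, including the \n newlines from pformat: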

{"asctime": "2022-01-29 13:17:41,430", "levelname": "INFO", "levelno": 20, "module": "extractor", "message": "{ 'search_count': 240,\n  'no_response': 0,\n  'no_result': 8,\n }", "lineno": 463}

If I don't use pformat, the JSON file handler correctly recognises all key/value pairs and logs them as separate fields.

{"asctime": "2022-01-29 13:53:43,487", "levelname": "INFO", "levelno": 20, "module": "extractor", "message": null, "lineno": 464, "search_count": 240, "no_response": 0, "no_result": 8}

Here is the part that sets up my handlers:

import logging
import logging.handlers
import os

from pythonjsonlogger import jsonlogger


def get_json_handler(file_base, path, level):
    # make sure the directory path ends with a separator
    path = os.path.join(path, '')
    logfilename = file_base + "_" + level.lower() + ".json"
    # rotate the JSON log file at midnight and keep three backups
    json_handler = logging.handlers.TimedRotatingFileHandler(f"{path}{logfilename}", when="midnight", interval=1, backupCount=3)
    formatter = jsonlogger.JsonFormatter("%(asctime)s %(levelname)s %(levelno)s %(module)s %(message)s %(lineno)s")
    json_handler.setLevel(level)
    json_handler.setFormatter(formatter)
    return json_handler


def get_stream_handler(level):
    stream_handler = logging.StreamHandler()
    formatter = logging.Formatter("%(asctime)s [%(levelname)-7s] %(module)-12s %(message)s", "%Y-%m-%d %H:%M:%S")
    stream_handler.setLevel(level)
    stream_handler.setFormatter(formatter)
    return stream_handler

Is there a simple way to strip the \n newlines that pformat inserts, inside the JSON formatter itself?

Don't pre-format the dictionary. You are now passing a string rather than a dict, which is evident in the JSON log output.

Instead, add a custom formatter to the console handler that detects when you pass a dictionary and applies pformat to it. For example:

import logging
import pprint
from copy import copy


class PFormatter(logging.Formatter):
    def __init__(self, *args, pformat_args=None, **kwargs):
        super().__init__(*args, **kwargs)
        self.pformat_args = pformat_args or {}

    def format(self, record: logging.LogRecord):
        if isinstance(record.msg, dict):
            # work on a copy so other handlers still see the original dict
            new_record = copy(record)
            new_record.msg = pprint.pformat(record.msg, **self.pformat_args)
            return super().format(new_record)
        else:
            return super().format(record)
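As a rough sketch of why this plays well with a second handler: the formatter pretty-prints a copy of the record, so a handler that runs afterwards should still receive the original dict. This uses only the standard library. KeepRecords and the logger name pformatter_demo are made up for illustration, with KeepRecords standing in for the JSON file handler and a StringIO standing in for the console.

```python
import io
import logging
import pprint
from copy import copy


class PFormatter(logging.Formatter):
    """Condensed version of the formatter above: pretty-print dict messages."""

    def __init__(self, *args, pformat_args=None, **kwargs):
        super().__init__(*args, **kwargs)
        self.pformat_args = pformat_args or {}

    def format(self, record):
        if isinstance(record.msg, dict):
            record = copy(record)  # shallow copy, the caller's record is untouched
            record.msg = pprint.pformat(record.msg, **self.pformat_args)
        return super().format(record)


class KeepRecords(logging.Handler):
    """Hypothetical stand-in for the JSON file handler: stores raw records."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


console_stream = io.StringIO()  # stands in for the console
console = logging.StreamHandler(console_stream)
console.setFormatter(PFormatter("%(levelname)s %(message)s", pformat_args={"width": 20}))
second = KeepRecords()

demo_log = logging.getLogger("pformatter_demo")
demo_log.propagate = False
demo_log.setLevel(logging.INFO)
demo_log.addHandler(console)  # called first, pretty-prints a copy
demo_log.addHandler(second)   # called second, receives the same record object

demo_log.info({"search_count": 240, "no_response": 0, "no_result": 8})

console_text = console_stream.getvalue()
print(console_text)                  # multi-line, human-friendly text
print(type(second.records[0].msg))   # the message is still a dict here
```

Because the second handler still gets a dict, a JSON formatter attached to it can treat the keys as separate fields, as in the question's second JSON sample.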

To control selectively when pformat is applied, you can use extra. For example:

import logging
import pprint
from copy import copy


class PFormatter(logging.Formatter):
    def __init__(self, *args, pformat_args=None, **kwargs):
        super().__init__(*args, **kwargs)
        self.pformat_args = pformat_args or {}

    def format(self, record: logging.LogRecord):
        # NB: extra is passed a dictionary, but its keys become
        # attributes of the log record
        pformat = getattr(record, 'pformat', False)
        if pformat:
            if isinstance(pformat, dict):
                # Allow temporary overriding of default args if
                # pformat is a dict
                pformat_args = dict(self.pformat_args, **pformat)
            else:
                pformat_args = self.pformat_args
            new_record = copy(record)
            new_record.msg = pprint.pformat(record.msg, **pformat_args)
            return super().format(new_record)
        else:
            return super().format(record)

You might use it like this:

# setup logger
formatter = PFormatter(
    '%(asctime)s %(name)s: %(msg)s',
    pformat_args={'indent': 2, 'width': 2},
)
handler = logging.StreamHandler()
handler.setFormatter(formatter)
log = logging.getLogger(__name__)
log.addHandler(handler)
log.setLevel('DEBUG')

# actual logging
some_dict = {
    'search_count': 240,
    'no_response': 0,
    'no_result': 8,
}
PFORMAT = {'pformat': True}

# just a normal log call, no pformat
log.info(some_dict)
# now pretty print
log.info(some_dict, extra=PFORMAT)
# change indent for one call
log.info(some_dict, extra={'pformat': {'indent': 8}})
# back to default indent
log.info(some_dict, extra=PFORMAT)
# apply pformat to anything, not just dicts!
log.info(list(range(20)), extra=PFORMAT)
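If the extra mechanism is unfamiliar, here is a small standard-library check of what the formatter relies on: each key of extra becomes an attribute on the LogRecord, and getattr with a default covers calls that pass no extra at all. The Capture handler and the logger name extra_demo are made up for this sketch.

```python
import logging


class Capture(logging.Handler):
    """Hypothetical helper that keeps the records it receives."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


capture = Capture()
extra_log = logging.getLogger("extra_demo")
extra_log.propagate = False
extra_log.setLevel(logging.INFO)
extra_log.addHandler(capture)

extra_log.info("plain call")
extra_log.info("flagged call", extra={"pformat": True})
extra_log.info("call with overrides", extra={"pformat": {"indent": 8}})

# Read the flag back the same way the formatter does
flags = [getattr(r, "pformat", False) for r in capture.records]
print(flags)
```

One thing to keep in mind: keys in extra must not collide with attributes the logging module already sets on a record (message and asctime, for example), otherwise the log call raises a KeyError. A name like pformat is safe.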

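I have not tested this, because I don't have your complete program, loggers and so on. But according to the documentation you can override JsonFormatter's add_fields method. So, since you know what the message will contain, you can remove the problematic characters and convert it back to a dict (I am not sure whether that is needed, or whether keeping the string is enough).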

import ast

from pythonjsonlogger import jsonlogger


class CustomJsonFormatter(jsonlogger.JsonFormatter):
    def add_fields(self, log_record, record, message_dict):
        super().add_fields(log_record, record, message_dict)
        message = log_record.get('message')
        if message:
            # remove '\n'
            message_str = message.replace("\n", '')
            try:
                # transform to dict (required?); pformat() emits a Python
                # literal with single quotes, which json.loads() rejects
                log_record['message'] = ast.literal_eval(message_str)
            except (ValueError, SyntaxError):
                # not a Python literal: keep the string without newlines
                log_record['message'] = message_str
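One caveat when converting the string back to a dict: pformat() produces Python literal syntax with single-quoted keys, which json.loads() does not accept, whereas ast.literal_eval() parses it, newlines included. A small standard-library check, using the statistics from the question as sample data:

```python
import ast
import json
from pprint import pformat

stats = {"search_count": 240, "no_response": 0, "no_result": 8}

# width=20 forces one key per line, so the text contains newlines
text = pformat(stats, indent=2, width=20, sort_dicts=False)

try:
    json.loads(text)
    json_ok = True
except json.JSONDecodeError:
    # single-quoted keys are not valid JSON
    json_ok = False

# literal_eval accepts the Python literal, line breaks and all
restored = ast.literal_eval(text)

print(json_ok)
print(restored == stats)
```

This only round-trips values that are plain Python literals (numbers, strings, lists, dicts and so on). If the dict holds other objects, pformat() falls back to their repr and literal_eval will raise.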
