I have a tuple containing string data (UTF-8), boolean data (True/False/1/0) and integer data that I want to write to an output file as a single line. Part of my code is:
### Python 2.7.3
import fileinput
import re
import time
import codecs

uIDfile = 'PythonFav Testppl.ttxt'
InFile = open(uIDfile)
OutFile = codecs.open('C:PythonFav TestS2.ttxt', encoding='utf-8', mode='w')
for user in InFile:
    user = user[:-1]
    # user = unicode(user, 'utf-8').encode('utf-8')
    if 'NNNN' in user:
        break
    else:
        if '@N' in user:
            try:
                Grp = people_getGroups(user_id=user)
                g = 0
                if GetAll:
                    for group in Grp.find('groups').findall('group'):
                        g += 1
                        fErr = ''
                        uID = user
                        gID = group.get('ID')
                        gName = group.get('name')
                        tup = '"{0}"\t"{2}"\t"{1}"\t''\t{3}\t{4}\t{5}\t{6}\n'.format(uNSID, gNSID, gName, bin1, bin2, int1, int2)
                        OutFile.write(tup.encode('utf-8'))
I tried several different versions of the OutFile.write() statement. Each one is listed below with the error it produced.
OutFile.write(codecs.utf_8_decode(tup.encode('utf-8')))
TypeError: coercing to Unicode: need string or buffer, tuple found

OutFile.write('\t'.join(codecs.utf_8_decode(tup.encode('utf-8'))))
TypeError: sequence item 1: expected string or Unicode, int found

OutFile.write('\t'.join(map(str, codecs.utf_8_decode(tup.encode('utf-8')))))
tup = '"{0}"\t"{2}"\t"{1}"\t""\t"{3}"\t"{4}"\t"{5}"\t"{6}"\n'.format(uNSID, gNSID, gName, str(bin1), str(bin2), str(int1), str(int2))
UnicodeEncodeError: "'ascii' codec can't encode characters in position 0-2: ordinal not in range(128)"

OutFile.write('\t'.join(map(str, codecs.utf_8_decode(tup.encode('utf-8')))))
tup = '"{0}"\t"{2}"\t"{1}"\t""\t"{3}"\t"{4}"\t"{5}"\t"{6}"\n'.format(uNSID, gNSID, gName, bin1, bin2, int1, int2)
UnicodeEncodeError: "'ascii' codec can't encode characters in position 0-2: ordinal not in range(128)"
Any help is sincerely appreciated!
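A note on the first two errors, as a minimal sketch: codecs.utf_8_decode returns a (text, bytes_consumed) tuple rather than a string, so passing its result straight to write() hands the file a tuple, and joining that tuple with a tab fails on its integer element. The sample value below is made up for illustration, and the snippet is written so it runs on Python 3; the shape of the return value is the same on Python 2.7.

```python
import codecs

# Hypothetical sample line; the real data in the question comes from an API call.
line = u'"Te\u00d7t"\t42\n'
encoded = line.encode('utf-8')

# utf_8_decode returns a 2-tuple: (decoded text, number of bytes consumed).
result = codecs.utf_8_decode(encoded)
text, consumed = result

print(type(result).__name__)       # the container type handed to write()
print(consumed == len(encoded))    # the second element is an int, not a string
```

Only the first element of that tuple is the text you would want to write.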
If you want to write rows to a file, I'd suggest using the csv module. Here is an example of how to use it:
# -*- coding: utf-8 -*-
import csv
# Use tempfile instead of a hard-coded path, to be cross-platform :)
import tempfile

_, tmppath = tempfile.mkstemp()
# In Python 2 the csv module expects a file opened in binary mode
out = open(tmppath, 'wb')
writer = csv.writer(out)
# Renamed from `input` to avoid shadowing the built-in
text = "Te×t Ðåtå".decode('utf-8')
# Python 2's csv writer works on byte strings, so encode before writing
tup = (text.encode('utf-8'), 42, False)
tup
# OUT: ('Te\xc3\x97t \xc3\x90\xc3\xa5t\xc3\xa5', 42, False)
writer.writerow(tup)
out.close()
print(u"Look at me : {}".format(tmppath))
You can use the dialect and formatting parameters to define the format of the output file precisely.
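As a rough sketch of what those formatting parameters can look like for the tab-separated layout in the question: the row values below are invented, the flags are written as 1/0 integers, and the snippet is written for Python 3, where csv accepts Unicode text directly. Under Python 2.7 the strings would need to be encoded to UTF-8 before being handed to the writer, as the example above does.

```python
import csv
import io

# Hypothetical row: two IDs, a name with non-ASCII characters,
# two 1/0 flags and two integers.
row = (u'12345@N00', u'67890@N00', u'Te\u00d7t \u00d0\u00e5t\u00e5', 1, 0, 7, 42)

buf = io.StringIO()
writer = csv.writer(
    buf,
    delimiter='\t',                # tab-separated instead of comma-separated
    quoting=csv.QUOTE_NONNUMERIC,  # quote strings, leave numbers bare
    lineterminator='\n',           # plain newline at the end of each row
)
writer.writerow(row)

line = buf.getvalue()
print(line)
```

Letting the writer handle quoting and separators avoids building the line by hand with format(), which is where the mixed string and integer fields caused trouble.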
To avoid trouble with UTF-8, good practice is:
- Decode early
- Unicode everywhere
- Encode late
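The three rules above can be sketched roughly as follows. The byte string and field values are made up, and io.open is used only because it behaves the same on Python 2.7 and Python 3; this is one possible arrangement, not the only one. Note that a file opened with an encoding already encodes on write, so the row is passed as text and never encoded by hand.

```python
import io
import os
import tempfile

# Hypothetical raw input, as bytes read from a file or an API response.
raw_name = b'Te\xc3\x97t \xc3\x90\xc3\xa5t\xc3\xa5'

# 1. Decode early: turn bytes into text as soon as they enter the program.
name = raw_name.decode('utf-8')

# 2. Unicode everywhere: build the row from text only, converting numbers explicitly.
fields = [u'"{0}"'.format(name), u'{0}'.format(1), u'{0}'.format(42)]
line = u'\t'.join(fields) + u'\n'

# 3. Encode late: the file object encodes to UTF-8 at the moment of writing.
fd, path = tempfile.mkstemp(suffix='.txt')
os.close(fd)
with io.open(path, 'w', encoding='utf-8', newline='') as out:
    out.write(line)

# Read the bytes back to confirm what landed on disk, then clean up.
with io.open(path, 'rb') as check:
    written = check.read()
os.remove(path)
```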