In my Django application, I need to extract some data from about 500 files, all of which are in .doc and/or .docx format.
import os
import pythoncom
import win32com.client
from docx import opendocx, getdocumenttext  # assumed: legacy python-docx module

my_list = []
filenames = os.listdir(fpath)
for file1 in filenames:  # iterate over each file in the folder
    if file1.endswith('.doc'):  # legacy Word format, read through COM
        pythoncom.CoInitializeEx(pythoncom.COINIT_MULTITHREADED)
        wordapp = win32com.client.gencache.EnsureDispatch("Word.Application")
        x = wordapp.Documents.Open(file1)
        my_list.append(x.Content.Text)
        wordapp.ActiveWindow.Close()
        wordapp.Quit()
        # Do some pattern matching on my_list to extract data and store it in the database
    elif file1.endswith('.docx'):
        file1 = file1.encode('UTF8')
        file1 = fpath + "\\" + file1
        document = opendocx(file1)
        body = getdocumenttext(document)
        # Do some pattern matching and store in the database
    else:
        print("File is not in the required format")
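My problem is that the Django server hangs after processing the first file in the folder. If I run the same code as a standalone Python file, it works. Why does this happen, and how can I fix it? Any help is appreciated. Thanks.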
To make debugging your Django application easier, you can set up basic logging like this (if you haven't already):
# settings.py
import os

# BASE_DIR is assumed to be defined earlier in settings.py
LOGGING = {
    'version': 1,
    'disable_existing_loggers': False,
    'formatters': {
        'verbose': {
            'format': '%(levelname)s %(asctime)s %(name)-12s %(module)-20s %(funcName)-15s %(message)s'
        },
        'simple': {
            'format': '%(levelname)s %(message)s'
        },
    },
    'handlers': {
        'null': {
            'level': 'DEBUG',
            'class': 'logging.NullHandler',
        },
        'console': {
            'level': 'DEBUG',
            'class': 'logging.StreamHandler',
            'formatter': 'simple'
        },
        'log_file': {
            'level': 'DEBUG',
            'class': 'logging.handlers.RotatingFileHandler',
            'filename': os.path.join(BASE_DIR, 'myapp.log'),
            'maxBytes': 16777216,  # 16 MB; must be an integer, not a string
            'backupCount': 1,  # rollover only happens when backupCount >= 1
            'formatter': 'verbose'
        },
        'mail_admins': {
            'level': 'ERROR',
            'class': 'django.utils.log.AdminEmailHandler',
            'formatter': 'verbose',
        }
    },
    'loggers': {
        # A dict key can appear only once, so both handlers share one entry
        'django.request': {
            'handlers': ['mail_admins', 'log_file'],
            'level': 'ERROR',
            'propagate': True,
        },
        'myapp': {  # this will catch any log calls inside your app 'myapp'
            'handlers': ['log_file'],
            'level': 'DEBUG',
            'propagate': True,
        },
    }
}
# somefile_to_debug.py
# ... use pformat to output rather complex data structures as pretty
# strings (perfect for debugging)
from pprint import pformat
import logging

# Create an instance of a logger which will include the name of this module
logger = logging.getLogger(__name__)


def my_function(bla, somedict):
    logger.debug(pformat({'bla': bla, 'somedict': somedict}))
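Applied to the file loop from the question, the same logger can show which file the server stalls on. The following is only a minimal sketch: process_folder and the handlers mapping are hypothetical names introduced for illustration, and the real .doc/.docx extraction code would be plugged in as the handler callables.

```python
import logging
import os

logger = logging.getLogger(__name__)


def process_folder(fpath, handlers):
    """Run the matching handler for each file in fpath and log progress.

    `handlers` maps a lowercase extension (e.g. '.docx') to a callable that
    takes a full path and returns the extracted text. Both this function and
    the mapping are hypothetical, not part of the original code.
    """
    results = {}
    for name in sorted(os.listdir(fpath)):
        full_path = os.path.join(fpath, name)
        ext = os.path.splitext(name)[1].lower()
        handler = handlers.get(ext)
        if handler is None:
            logger.warning("Skipping unsupported file: %s", full_path)
            continue
        logger.debug("Starting %s", full_path)
        try:
            results[name] = handler(full_path)
        except Exception:
            # logger.exception logs at ERROR level and appends the traceback
            logger.exception("Failed to process %s", full_path)
            continue
        logger.debug("Finished %s", full_path)
    return results
```

If the last entry in the log is a "Starting ..." line with no matching "Finished ...", the hang is inside the handler for that file.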
Restart your application, and you can watch the log file output in a terminal with tail (sorry, I don't know how to do this on Windows, but on Linux you would do):
tail -f myapp.log
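On Windows, where tail may not be available, a small Python script can serve a similar purpose. This is a rough sketch, not a full replacement: the follow function and its parameters are made up for illustration, it does not handle log rotation, and it may yield a partial line if it reads while the logger is still writing.

```python
import time


def follow(path, poll_interval=0.5, max_idle_polls=None):
    """Yield lines appended to `path`, roughly like `tail -f`.

    Stops after `max_idle_polls` consecutive polls with no new data;
    None means keep going until interrupted (Ctrl+C).
    """
    idle = 0
    with open(path, "r") as handle:
        handle.seek(0, 2)  # start at the current end of the file
        while max_idle_polls is None or idle < max_idle_polls:
            line = handle.readline()
            if line:
                idle = 0
                yield line.rstrip("\n")
            else:
                idle += 1
                time.sleep(poll_interval)


# Example usage (runs until interrupted):
# for entry in follow("myapp.log"):
#     print(entry)
```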