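Downloading ML annotations in IBM Watson Knowledge Studio

I am working on an NLP application using WKS, and after training I got fairly low performance.

I would like to know whether there is a way to download the annotated documents with their entity classifications, for both the training set and the test set, so that I can automatically pinpoint where the key differences are and fix them.

The human-annotated documents can be downloaded under "Assets"/"Documents" -> Download Document Sets (the button on the right).

The following Python code lets you inspect the data inside the download: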

import json
import zipfile

import pandas as pd

# Path to the ZIP file downloaded from WKS
zip_path = "<YOUR DOWNLOADED FILE>"

# Read the document list and the document-set list from the archive
with zipfile.ZipFile(zip_path, "r") as archive:
    with archive.open("documents.json") as arch:
        documents = json.loads(arch.read())
    with archive.open("sets.json") as arch:
        sets = json.loads(arch.read())

# Pretty-print the raw JSON to see its structure
print(json.dumps(documents, indent=2, separators=(",", ":")))
print(json.dumps(sets, indent=2, separators=(",", ":")))

# One row per document
df_documentos = pd.DataFrame([
    {
        "name": documento["name"],
        "text": documento["text"],
        "status": documento["status"],
        "id": documento["id"],
        "createdDate": "{:14.0f}".format(documento["createdDate"]),
        "modifiedDate": "{:14.0f}".format(documento["modifiedDate"]),
    }
    for documento in documents
])
print(df_documentos)

# One row per document set
df_sets = pd.DataFrame([
    {
        "type": doc_set["type"],
        "name": doc_set["name"],
        "count": "{:6.0f}".format(doc_set["count"]),
        "id": doc_set["id"],
        "createdDate": "{:14.0f}".format(doc_set["createdDate"]),
        "modifiedDate": "{:14.0f}".format(doc_set["modifiedDate"]),
    }
    for doc_set in sets
])
print(df_sets)

You can then iterate over every JSON file in the "gt" folder of the ZIP file to get the detailed sentence splitting, tokenization, and annotations.
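The iteration over the "gt" folder could look roughly like the sketch below. It assumes only that the folder holds ordinary .json files; the function name load_ground_truth is made up for illustration, and the internal layout of each file is not assumed, so the parsed JSON is returned as-is for you to inspect.

```python
import json
import zipfile
from pathlib import PurePosixPath


def load_ground_truth(zip_path, folder="gt"):
    """Return {archive member name: parsed JSON} for each .json file under `folder`."""
    annotations = {}
    with zipfile.ZipFile(zip_path, "r") as archive:
        for member in archive.namelist():
            path = PurePosixPath(member)
            # Keep only JSON files that sit somewhere below the given folder
            if path.suffix.lower() == ".json" and folder in path.parts[:-1]:
                with archive.open(member) as handle:
                    annotations[member] = json.loads(handle.read())
    return annotations
```

Calling load_ground_truth on the downloaded ZIP and printing the keys of one returned entry is a quick way to find out which fields carry the sentences, tokens, and mentions before building any comparison on top of them.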

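What I need is a way to download the annotations produced by the machine learning model on the TEST documents, which are visible under "Machine Learning Model"/"Performance"/"View Decoding Results".

With that, I would be able to identify the specific deviations that might call for changes to the type dictionaries and the annotation guidelines.

Sorry, this feature is currently not available.

You can submit a feature request at the following URL: https://ibm-data-and-ai.ideas.aha.io/?project=WKS

Thanks.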
