Searching for multiple keys in a HashMap in Java



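I am trying to figure out how to search some user input for multiple keywords. The keywords come from a HashMap of synonyms. Basically, I enter a sentence, and if the sentence contains one or more keywords or keyword synonyms, I want to call a parseFile method. So far I can only search for a single keyword. I have been trying to take user input, which may be a long sentence or just a single word containing a keyword, and search the HashMap keys for matching words. For example, if the HashMap is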

responses.put("textbook name", new String[] { "name of textbook", "text", "portfolio" });

responses.put("current assignment", new String[] { "homework","current work" });

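and the user enters "what is the name of textbook that has the homework", I want to search the text file for textbook current assignment. Suppose the text file contains the sentence "current assignment is in the second textbook name ralphy". I have done most of the implementation; the problem is handling multiple keywords. Can anyone help me with this?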

Here is my code:

private static HashMap<String, String[]> responses = new HashMap<String, String[]>(); // this

public static void parseFile(String s) throws FileNotFoundException {
File file = new File("data.txt");
Scanner scanner = new Scanner(file);
while (scanner.hasNextLine()) {
final String lineFromFile = scanner.nextLine();
if (lineFromFile.contains(s)) {
// a match!
System.out.println(lineFromFile);
// break;
}
}
}

private static HashMap<String, String[]> populateSynonymMap() {

responses.put("test", new String[] { "test load", "quantity of test","amount of test" });
responses.put("textbook name", new String[] { "name of textbook", "text", "portfolio" });
responses.put("professor office", new String[] { "room", "post", "place" });
responses.put("day", new String[] { "time", "date" });
responses.put("current assignment", new String[] { "homework","current work" });

return responses;
}

public static void main(String args[]) throws ParseException, IOException {
/* Initialization */
HashMap<String, String[]> synonymMap = new HashMap<String, String[]>();
synonymMap = populateSynonymMap(); // populate the map

Scanner scanner = new Scanner(System.in);
String input = null;
/*End Initialization*/
System.out.println("Welcome To DataBase ");
System.out.println("What would you like to know?");
System.out.print("> ");
input = scanner.nextLine().toLowerCase();
String[] inputs = input.split(" ");
for (String ing : inputs) { // iterate over each word of the sentence.
boolean found = false;
for (Map.Entry<String, String[]> entry : synonymMap.entrySet()) {
String key = entry.getKey();
String[] value = entry.getValue();
if (input.contains(key) || key.contains(input)|| Arrays.asList(value).contains(input)) {
found = true;
parseFile(entry.getKey());

}
}
}
}

Any help would be appreciated.

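I have already answered a very similar question, "Using a hash map to understand two or more keys", but I will make my point clearer here. With the data structures you are currently using, consider the following:

1) Input list --> split the sentence into words (possibly keeping their order) and store them in a list, e.g. [what, is, the, name, of, textbook, that, have, the, homework]

2) Keyword list --> all the keys in your HashMap database, e.g. [test, textbook name, professor office] in the example you used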

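Now you have to set a criterion, for example that a phrase of at most 3 consecutive words from the sentence (e.g. "textbook name") can be a keyword. Why this criterion? To limit processing; otherwise you end up checking a very large number of input combinations.

Once you have this, you can check the input list against the keyword list using the criterion you set. If you do not set a criterion, you can try all combinations against the key set. Once you find one or more matches, output the synonym list, and so on.

For example, check [textbook name] against all the keys in the map.

If you want to check in the reverse direction, you can create a list of synonyms and check against that.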
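The steps above can be sketched roughly as follows. This is only a minimal illustration, not the answerer's own code: the class name `PhraseMatcher`, the method `matchKeys`, and the constant `MAX_PHRASE_WORDS` are hypothetical names, and the limit of 3 words is the example criterion mentioned above.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

class PhraseMatcher {
    /** Upper bound on the number of words in a candidate phrase (the chosen criterion). */
    static final int MAX_PHRASE_WORDS = 3;

    /** Returns every key that appears in the input as a run of 1..MAX_PHRASE_WORDS consecutive words. */
    static Set<String> matchKeys(String input, Set<String> keys) {
        String[] words = input.toLowerCase().trim().split("\\s+");
        Set<String> matched = new LinkedHashSet<>(); // keeps first-seen order, drops duplicates
        for (int start = 0; start < words.length; start++) {
            StringBuilder phrase = new StringBuilder();
            // Grow the phrase one word at a time, up to the configured limit.
            for (int len = 1; len <= MAX_PHRASE_WORDS && start + len <= words.length; len++) {
                if (len > 1) {
                    phrase.append(' ');
                }
                phrase.append(words[start + len - 1]);
                if (keys.contains(phrase.toString())) {
                    matched.add(phrase.toString());
                }
            }
        }
        return matched;
    }

    public static void main(String[] args) {
        Set<String> keys = new LinkedHashSet<>(Arrays.asList(
                "test", "textbook name", "professor office", "day", "current assignment"));
        // Prints [textbook name, current assignment]
        System.out.println(matchKeys("what is the textbook name for the current assignment", keys));
    }
}
```

Because each candidate phrase is a single hash lookup, the work grows with the number of input words times the phrase limit rather than with the size of the key set.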

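My two tips for solving this:

1) Define a set of keywords and do not check against the value lists; the HashMap structure is not suited to that. Be prepared for some redundant data as a result.

2) Set the number of consecutive words to search for in this keyword set. It is best to keep only distinct words.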
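One possible reading of the first tip is to flatten the map so that every phrase, whether key or synonym, points directly at its key. The sketch below assumes that interpretation; `SynonymIndex` and `buildIndex` are hypothetical names, and the redundancy is that each key is also stored as its own entry.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

class SynonymIndex {
    /** Flattens key -> synonyms into phrase -> key, so each phrase is a single hash lookup. */
    static Map<String, String> buildIndex(Map<String, String[]> synonymMap) {
        Map<String, String> index = new HashMap<>();
        for (Map.Entry<String, String[]> entry : synonymMap.entrySet()) {
            index.put(entry.getKey(), entry.getKey()); // a key maps to itself
            for (String synonym : entry.getValue()) {
                // If two keys share a synonym, the entry processed last wins.
                index.put(synonym, entry.getKey());
            }
        }
        return index;
    }

    public static void main(String[] args) {
        Map<String, String[]> synonymMap = new LinkedHashMap<>();
        synonymMap.put("textbook name", new String[] { "name of textbook", "text", "portfolio" });
        synonymMap.put("current assignment", new String[] { "homework", "current work" });
        Map<String, String> index = buildIndex(synonymMap);
        System.out.println(index.get("homework"));         // current assignment
        System.out.println(index.get("name of textbook")); // textbook name
    }
}
```

With such an index, the key set to test candidate phrases against is simply `index.keySet()`, and a hit resolves to its key without scanning any arrays.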

Hope this helps!

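You can use one regex pattern per "dictionary entry" and test each pattern against your input. Depending on your performance requirements and on the size of the dictionary and the input, this may be a good solution.

If you are using Java 8, try this: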

public static class DicEntry {
String key;
String[] syns;
Pattern pattern;
public DicEntry(String key, String... syns) {
this.key = key;
this.syns = syns;
pattern = Pattern.compile(".*(?:" + Stream.concat(Stream.of(key), Stream.of(syns))
.map(x -> "\\b" + Pattern.quote(x) + "\\b")
.collect(Collectors.joining("|")) + ").*");
}
}
public static void main(String args[]) throws ParseException, IOException {
// Initialization
List<DicEntry> synonymMap = populateSynonymMap();
Scanner scanner = new Scanner(System.in);
// End Initialization
System.out.println("Welcome To DataBase ");
System.out.println("What would you like to know?");
System.out.print("> ");
String input = scanner.nextLine().toLowerCase();
boolean found;
for (DicEntry entry : synonymMap) {
if (entry.pattern.matcher(input).matches()) {
found = true;
System.out.println(entry.key);
parseFile(entry.key);
}
}
}
private static List<DicEntry> populateSynonymMap() {
List<DicEntry> responses = new ArrayList<>();
responses.add(new DicEntry("test", "test load", "quantity of test", "amount of test"));
responses.add(new DicEntry("textbook name", "name of textbook", "text", "portfolio"));
responses.add(new DicEntry("professor office", "room", "post", "place"));
responses.add(new DicEntry("day", "time", "date"));
responses.add(new DicEntry("current assignment", "homework", "current work"));
return responses;
}

Sample output:

Welcome To DataBase 
What would you like to know?
> what is the name of textbook that has the homework
textbook name
current assignment
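The word-boundary anchors are what keep a short synonym such as "text" from matching inside "textbook". The following is a small standalone sketch of that behavior, assuming a hypothetical helper `containsPhrase`; it uses `find()` instead of wrapping the pattern in `.*`, which is a swapped-in simplification rather than the code above.

```java
import java.util.regex.Pattern;

class WordBoundaryDemo {
    /** True if the phrase occurs in the input as whole words. */
    static boolean containsPhrase(String input, String phrase) {
        // In Java source, "\\b" is the regex word boundary \b;
        // a single-backslash "\b" would be a backspace character instead.
        Pattern p = Pattern.compile("\\b" + Pattern.quote(phrase) + "\\b");
        return p.matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(containsPhrase("which textbook do i need", "text")); // false
        System.out.println(containsPhrase("which text do i need", "text"));     // true
        System.out.println(containsPhrase("what is the name of textbook", "name of textbook")); // true
    }
}
```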

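List (append) the keys that match. For the given example, when the keyword "textbook" matches, store it in a "temp" variable. Then continue the loop; when the keyword "current" matches, append it to temp, so that temp now contains "textbook current". Similarly, continue and append the next keyword, "assignment", to temp.

Now temp contains "textbook current assignment".

Finally, call parseFile(temp) at the end.

This should work for single or multiple matches.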

// The only limitation is that the keys must be given in an ordered sequence. If you want
// to evaluate all possible combinations, it is better to add all the keys to a list
// and append them in the required combination.
// There may be corner cases I haven't thought of, but this might help or point to a better solution.
String temp = "";
//flag - used to indicate whether any word was found in the dictionary or not?
int flag = 0;
for (String ing : inputs) { // iterate over each word of the sentence.
boolean found = false;
for (Map.Entry<String, String[]> entry : synonymMap.entrySet()) {
String key = entry.getKey();
String[] value = entry.getValue();
if (input.contains(key)) {
flag = 1;
found = true;
temp = temp +" "+ key;
}
else if (key.contains(input)) {
flag = 1;
found = true;
temp = temp +" "+ input;
}
else if (Arrays.asList(value).contains(input)) {
flag = 1;
found = true;
temp = temp +" "+ input;
}
}
}   
if (flag == 1){
parseFile(temp);
}
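A variation on the same idea, offered only as a hedged sketch: instead of concatenating matches into one string, collect each matched key once in a set, then keep the lines that contain every matched key. `MultiKeySearch`, `matchedKeys`, and `linesWithAllKeys` are hypothetical names, and the lines are passed in as a list here (an assumption made so the example runs without data.txt).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class MultiKeySearch {
    /** Collects each key whose text, or one of whose synonyms, occurs in the input (once per key). */
    static Set<String> matchedKeys(String input, Map<String, String[]> synonymMap) {
        Set<String> keys = new LinkedHashSet<>();
        for (Map.Entry<String, String[]> entry : synonymMap.entrySet()) {
            // Plain substring matching, as in the code above.
            boolean hit = input.contains(entry.getKey());
            for (String synonym : entry.getValue()) {
                hit = hit || input.contains(synonym);
            }
            if (hit) {
                keys.add(entry.getKey());
            }
        }
        return keys;
    }

    /** Keeps the lines that contain every matched key; returns nothing if no key matched. */
    static List<String> linesWithAllKeys(List<String> lines, Set<String> keys) {
        List<String> result = new ArrayList<>();
        if (keys.isEmpty()) {
            return result;
        }
        for (String line : lines) {
            boolean all = true;
            for (String key : keys) {
                all = all && line.contains(key);
            }
            if (all) {
                result.add(line);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String[]> synonymMap = new LinkedHashMap<>();
        synonymMap.put("textbook name", new String[] { "name of textbook", "text", "portfolio" });
        synonymMap.put("current assignment", new String[] { "homework", "current work" });
        Set<String> keys = matchedKeys("what is the name of textbook that has the homework", synonymMap);
        List<String> lines = Arrays.asList(
                "current assignment is in the second textbook name ralphy",
                "textbook name is unknown");
        // Prints [current assignment is in the second textbook name ralphy]
        System.out.println(linesWithAllKeys(lines, keys));
    }
}
```

Checking each key separately means the order of the keys in the line does not matter, which sidesteps the ordered-sequence limitation noted in the comments above.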
