Is there a way to read a file faster?


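I'm trying to write a program that goes through all the permutations of a string and prints every one that is a valid word. I can generate all the permutations, but checking whether a single word is in the dictionary text file takes about 3 seconds. When I tried it with 7 letters, it took 47:19. Is there a way to read the file faster?
Any help would be appreciated.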

#include <iostream>
#include <fstream>
#include <string>
#include <vector>

bool in(std::string arr, char element)
{
    for (int i = 0; i < arr.size(); i++)
    {
        if (arr[i] == element)
        {
            return true;
        }
    }
    return false;
}

bool inDictionary(std::string str)
{
    std::ifstream dictionary;
    dictionary.open("words.txt");
    if (dictionary.is_open())
    {
        std::string word;
        while (std::getline(dictionary, word))
        {
            if (word == str)
            {
                return true;
            }
        }
    }
    return false;
}

std::string remainingCharacters(std::string orginal, std::string newString)
{
    std::string characters = "";
    for (int i = 0; i < orginal.size(); i++)
    {
        if (!in(newString, orginal[i]))
        {
            characters += orginal[i];
        }
    }
    return characters;
}

void combinations(std::string cur, std::string original, std::vector<std::string>& permutations)
{
    for (int i = 0; i < remainingCharacters(original, cur).size(); i++)
    {
        permutations.push_back(cur + remainingCharacters(original, cur)[i]);
        combinations(cur + remainingCharacters(original, cur)[i], original, permutations);
    }
}

int main()
{
    std::vector<std::string> permutations;
    combinations("", "wdsrock", permutations);
    for (int i = 0; i < permutations.size(); i++)
    {
        if (inDictionary(permutations[i]))
        {
            std::cout << permutations[i] << ", ";
        }
    }
}

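There are a few ways to reduce both the number of read operations and the number of permutations tested:

  • Store the dictionary in memory
  • Map each word's sorted letters to all of the words that can be formed from them (its anagrams)
  • Iterate over combinations instead of permutations

So: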

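#include <algorithm>
#include <fstream>
#include <iostream>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <vector>

// Read the dictionary once and group the words by their sorted letters,
// so that all anagrams of a word share the same key.
std::unordered_map<std::string, std::vector<std::string>> read_dictionary()
{
    std::ifstream dictionary;
    dictionary.open("words.txt");
    if (!dictionary.is_open()) { throw std::runtime_error("No dictionary"); }
    std::unordered_map<std::string, std::vector<std::string>> res;
    std::string word;
    while (std::getline(dictionary, word))
    {
        auto anagram = word;
        std::sort(anagram.begin(), anagram.end());
        res[anagram].push_back(word);
    }
    return res;
}
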
template <typename Iterator>
bool next_combination(const Iterator first, Iterator k, const Iterator last)
{
    /* Credits: Thomas Draper */
    if ((first == last) || (first == k) || (last == k))
        return false;
    Iterator itr1 = first;
    Iterator itr2 = last;
    ++itr1;
    if (last == itr1)
        return false;
    itr1 = last;
    --itr1;
    itr1 = k;
    --itr2;
    while (first != itr1)
    {
        if (*--itr1 < *itr2)
        {
            Iterator j = k;
            while (!(*itr1 < *j)) ++j;
            std::iter_swap(itr1, j);
            ++itr1;
            ++j;
            itr2 = k;
            std::rotate(itr1, j, last);
            while (last != j)
            {
                ++j;
                ++itr2;
            }
            std::rotate(k, itr2, last);
            return true;
        }
    }
    std::rotate(first, k, last);
    return false;
}

int main()
{
    const auto dictionary = read_dictionary();
    std::string letters = "dsrock";
    std::sort(letters.begin(), letters.end());
    for (std::size_t i = 1; i != letters.length() + 1; ++i) {
        do {
            auto it = dictionary.find(letters.substr(0, i));
            if (it != dictionary.end()) {
                for (const auto& word : it->second) {
                    std::cout << word << std::endl;
                }
            }
        } while (next_combination(letters.begin(), letters.begin() + i, letters.end()));
    }
}

Demo
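
For comparison, the first point on its own (keeping the dictionary in memory) already removes the repeated file reads, even while still enumerating permutations. The following is only a minimal sketch of that idea, not the code from the answer above: the function names `load_words` and `find_words` are hypothetical, and it assumes a dictionary with one word per line.

```cpp
#include <algorithm>
#include <istream>
#include <set>
#include <sstream>  // not needed by the functions; handy for trying them without a file
#include <string>
#include <unordered_set>

// Read one word per line into a hash set. This happens once,
// instead of re-reading the file for every lookup.
std::unordered_set<std::string> load_words(std::istream& in)
{
    std::unordered_set<std::string> words;
    std::string word;
    while (std::getline(in, word)) {
        if (!word.empty() && word.back() == '\r') word.pop_back();  // tolerate CRLF files
        if (!word.empty()) words.insert(word);
    }
    return words;
}

// Return every dictionary word that can be spelled from some of `letters`.
std::set<std::string> find_words(std::string letters,
                                 const std::unordered_set<std::string>& dictionary)
{
    std::set<std::string> found;  // a set, so repeated prefixes are reported once
    std::sort(letters.begin(), letters.end());  // next_permutation must start from sorted order
    do {
        // Each prefix of a permutation is an arrangement of a subset of the letters.
        for (std::size_t len = 1; len <= letters.size(); ++len) {
            const std::string candidate = letters.substr(0, len);
            if (dictionary.count(candidate) != 0) found.insert(candidate);
        }
    } while (std::next_permutation(letters.begin(), letters.end()));
    return found;
}
```

To use it with the file from the question, open `words.txt` with a `std::ifstream`, pass the stream to `load_words`, and call `find_words("wdsrock", dictionary)`. Each lookup is then an average constant-time hash probe rather than a scan of the file.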

  1. The best approach is to map the file into memory and read it from there. The Boost library provides an API for reading memory-mapped files:
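#include <iostream>
#include <boost/iostreams/device/mapped_file.hpp>

int main()
{
    boost::iostreams::mapped_file_params arg;
    arg.path = "yourfile.txt";
    // Map the existing file read-only. new_file_size is left unset on purpose:
    // setting it asks Boost to create (or truncate) the file.
    arg.flags = boost::iostreams::mapped_file::mapmode::readonly;
    boost::iostreams::mapped_file mf;
    mf.open(arg);
    // The mapped bytes are not null-terminated, so write exactly mf.size() bytes.
    std::cout.write(mf.const_data(), static_cast<std::streamsize>(mf.size()));
    std::cout << std::endl;
    mf.close();
    return 0;
}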
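  2. If you don't want a memory-mapped file, another option is to read the data in blocks. Reading a block at a time is faster than reading character by character: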
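// filename, readmode (an open mode such as std::ios::binary) and length
// (the number of bytes to read) are supplied by the caller.
std::ifstream is(filename, readmode);
if (is) {
    char* buffer = new char[length + 1];
    is.read(buffer, length);
    buffer[length] = '\0';  // null-terminate so the buffer can be used as a C string
    is.close();
    // ... use buffer ...
    delete[] buffer;
}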
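
A self-contained version of the block-read idea might look like the sketch below. It assumes the file fits in memory and uses a hypothetical helper name, `read_whole_file`. It returns a `std::string` instead of a raw `new[]` buffer, so there is nothing to free by hand.

```cpp
#include <fstream>
#include <stdexcept>
#include <string>

// Read a whole file into memory with a single read() call.
std::string read_whole_file(const std::string& path)
{
    // std::ios::ate opens the stream positioned at the end of the file.
    std::ifstream in(path, std::ios::binary | std::ios::ate);
    if (!in) throw std::runtime_error("cannot open " + path);

    const std::streamsize size = in.tellg();  // position at the end == size in bytes
    in.seekg(0, std::ios::beg);

    std::string buffer(static_cast<std::size_t>(size), '\0');
    if (size > 0 && !in.read(&buffer[0], size))
        throw std::runtime_error("cannot read " + path);
    return buffer;
}
```

The returned string can then be split on newlines in memory, for example with `std::istringstream` and `std::getline`, which avoids touching the disk again for each lookup.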
