Testing a method with Spark and Hadoop

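I have a method that reads from HDFS, and I am trying to test it.

I first tried the HDFS MiniCluster, without any success. Can this kind of method be tested at all? If so, what dependencies are needed to test it, and how can I mock the HDFS file system locally without installing Hadoop? The tests must not depend on a Hadoop installation: I cannot ask everyone who wants to run them to install Hadoop.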

import java.io.{BufferedReader, InputStreamReader}
import scala.collection.mutable
// Assumption: the single-String constructor suggests Hadoop's InvalidPathException
import org.apache.hadoop.fs.{FileSystem, InvalidPathException, Path}

def readFiles(fs: FileSystem, path: Path): String = {
  // Validate the path before opening any stream
  if (!fs.exists(path))
    throw new InvalidPathException(s"${path.toString} is an invalid file path")
  if (!fs.isFile(path))
    throw new InvalidPathException(s"${path.toString} is a directory, please provide the full path")

  val sb = new mutable.StringBuilder()
  val br = new BufferedReader(new InputStreamReader(fs.open(path)))
  try {
    // In Scala an assignment evaluates to Unit, so `(line = br.readLine()) != null`
    // is always true; read the line explicitly and test it instead.
    var line = br.readLine()
    while (line != null) {
      sb.append(line.trim)
      line = br.readLine()
    }
  } finally {
    br.close()
  }
  sb.toString
}

When working with org.apache.hadoop.fs.FileSystem (the same applies to Spark), I usually store test data files in:

src/test/resources

for example:

src/test/resources/test.txt

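The local org.apache.hadoop.fs.FileSystem can access it using a path relative to the project root, i.e. "src/test/resources/test.txt". With a default Configuration and no core-site.xml on the classpath, fs.defaultFS is file:///, so FileSystem.get returns the local file system and no Hadoop installation is needed: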

test("Some test") {
  val fileSystem = FileSystem.get(new Configuration())
  val fileToRead = new Path("src/test/resources/test.txt")
  val computedContent = readFiles(fileSystem, fileToRead)
  // Placeholder: replace with the trimmed, concatenated lines of test.txt
  val expectedContent = "todo"
  assert(computedContent === expectedContent)
}
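
As a variation on the test above, here is a minimal sketch that does not depend on the working directory or on a checked-in resource file: it writes a temporary file and reads it back through the local Hadoop file system. It assumes hadoop-common and its dependencies are on the test classpath. The names LocalFsFixture and writeTempFile are hypothetical, and the readFiles shown here is only a compact stand-in with the same signature as the method under test.

```scala
import java.io.{BufferedReader, InputStreamReader}
import java.nio.charset.StandardCharsets
import java.nio.file.Files

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

object LocalFsFixture {
  // Compact stand-in for the method under test (same signature as readFiles):
  // reads every line, trims it, and concatenates the results.
  def readFiles(fs: FileSystem, path: Path): String = {
    val br = new BufferedReader(new InputStreamReader(fs.open(path), StandardCharsets.UTF_8))
    try Iterator.continually(br.readLine()).takeWhile(_ != null).map(_.trim).mkString
    finally br.close()
  }

  // Hypothetical helper: writes `content` to a fresh temp file and returns its Hadoop Path.
  def writeTempFile(content: String): Path = {
    val tmp = Files.createTempFile("readFiles-test", ".txt")
    tmp.toFile.deleteOnExit()
    Files.write(tmp, content.getBytes(StandardCharsets.UTF_8))
    new Path(tmp.toUri)
  }

  def main(args: Array[String]): Unit = {
    // getLocal returns the local file system regardless of fs.defaultFS.
    val fs = FileSystem.getLocal(new Configuration())
    val path = writeTempFile("  first line  \nsecond line\n")
    val content = readFiles(fs, path)
    assert(content == "first linesecond line")
  }
}
```

Because the method takes a FileSystem as a parameter, the same test body could later be pointed at a real HDFS instance without changing the method itself.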
