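I have the methods below working, but I can't figure out how to extrapolate the percentage complete of a job in Hive (something like an eventListener!). Please help! Edit: I would think the client could be told "I've finished the map phase, so I'm 50% done" (if I submitted a command such as OVERWRITE EXTERNAL TABLE). OpsCenter with Brisk (from Datastax) does exactly this.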
import java.util.List;
import org.apache.hadoop.hive.metastore.api.MetaException;
import org.apache.hadoop.hive.service.HiveServerException;
import org.apache.hadoop.hive.service.ThriftHive;
import org.apache.hadoop.hive.service.ThriftHive.Client;
import org.apache.thrift.TException;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.transport.TSocket;
public class Hive {
    static Client client;
    static TSocket transport;

    public static void main(String[] args) throws HiveServerException,
            TException, MetaException {
        transport = new TSocket("hiveserver", 10000);
        // Very long socket timeout (milliseconds) so the blocking query is not cut off
        transport.setTimeout(999999999);
        TBinaryProtocol protocol = new TBinaryProtocol(transport);
        client = new ThriftHive.Client(protocol);
        transport.open();
        System.out.println("Starting map job...");
        // Run the query on its own thread so main() is not blocked
        Thread mapReduceThread = new Thread(new HiveQuery(
                "SELECT COUNT(*) FROM myHiveTable"));
        mapReduceThread.start();
        System.out.println("Waiting on map...");
    }

    private static class HiveQuery implements Runnable {
        private String hql;

        public HiveQuery(String hql) {
            this.setHql(hql);
        }

        public void run() {
            long start = System.currentTimeMillis();
            List<String> responseList = null;
            try {
                // Blocking: execute() returns only once the query has finished
                client.execute(this.getHql());
                responseList = client.fetchAll();
            } catch (HiveServerException e) {
                e.printStackTrace();
            } catch (TException e) {
                e.printStackTrace();
            }
            long elapsedTimeMillis = System.currentTimeMillis() - start;
            float elapsedTime = elapsedTimeMillis / 1000F;
            System.out.println("Job took: " + elapsedTime + " seconds");
            // responseList stays null if the query failed, so guard the loop
            if (responseList != null) {
                for (String response : responseList) {
                    System.out.println("Response: " + response);
                }
            }
            transport.close();
            System.out.println("Closed transport");
            System.exit(0);
        }

        public void setHql(String hql) {
            this.hql = hql;
        }

        public String getHql() {
            return hql;
        }
    }
}
What do you mean by extrapolate?
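I don't think you can extrapolate anything from the percentage complete in Hadoop, other than that if it is going up, more work is getting done. You can't say "oh, that means I have 20 seconds left" or anything like that.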
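To make the "map finished, so 50% done" idea from the question concrete, here is a minimal sketch of folding a map fraction and a reduce fraction into one figure. The class `JobProgress` and its methods are hypothetical names, and the equal weighting of the two phases is an assumption. The inputs are assumed to be fractions between 0 and 1, such as the values returned by `mapProgress()` and `reduceProgress()` on Hadoop's `org.apache.hadoop.mapred.RunningJob`; how to get hold of that job handle from the Hive Thrift client is not covered here.

```java
public class JobProgress {
    // Hypothetical helper: treats the map and reduce phases as half of the
    // job each (an assumption) and returns a whole-number percentage.
    static int overallPercent(float mapFraction, float reduceFraction) {
        float map = clamp(mapFraction);
        float reduce = clamp(reduceFraction);
        return Math.round((map + reduce) * 50f);
    }

    // Keeps a fraction inside the range [0, 1].
    static float clamp(float value) {
        return Math.max(0f, Math.min(1f, value));
    }

    public static void main(String[] args) {
        // Map phase finished, reduce phase not started
        System.out.println(overallPercent(1.0f, 0.0f) + "%");
        // Map phase finished, reduce phase half way
        System.out.println(overallPercent(1.0f, 0.5f) + "%");
    }
}
```

As the answer says, a number like this only shows that work is moving forward; it says nothing about how many seconds are left. A map-only job would also sit at 50% under this weighting.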