Selenium Java: extract text from a site without changing its structure



I am trying to extract data from the link below:

https://bible.usccb.org/bible/readings/090120.cfm

Code:

String quote2 = driver.findElement(By.xpath("//*[@id='block-usccb-readings-content']/div/div[6]/div/div/div/div/div[2]")).getText();

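This captures the text, but my requirement is to write it to a text file. When I do that, it comes out as a single paragraph. Below are the expected and the actual output.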

Expected:

R. (17) The Lord is just in all his ways.
The LORD is gracious and merciful,
slow to anger and of great kindness.
The LORD is good to all
and compassionate toward all his works.
R. The Lord is just in all his ways.
Let all your works give you thanks, O LORD,
and let your faithful ones bless you.
Let them discourse of the glory of your Kingdom
and speak of your might.

Actual:

R. (17) The Lord is just in all his ways. The LORD is gracious and merciful, slow to anger and of great kindness. The LORD is good to all and compassionate toward all his works. R. The Lord is just in all his ways. Let all your works give you thanks, O LORD, and let your faithful ones bless you. Let them discourse of the glory of your Kingdom and speak of your might. R. The Lord is just in all his ways. Making known to men your might and the glorious splendor of your Kingdom. Your Kingdom is a Kingdom for all ages, and your dominion endures through all generations. R. The Lord is just in all his ways. The LORD is faithful in all his words and holy in all his works. The LORD lifts up all who are falling and raises up all who are bowed down. R. The Lord is just in all his ways.   

Is this possible?

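You can use the following code to get the content with its exact formatting. It reads the element's innerHTML, which keeps the line-break markup, and writes it to an HTML file.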

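import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.support.ui.ExpectedConditions;
import org.openqa.selenium.support.ui.WebDriverWait;

// Backslashes in Java string literals must be escaped as "\\".
String baseDir = System.getProperty("user.dir") + "\\src\\test\\resources\\executables\\";
System.setProperty("webdriver.chrome.driver", baseDir + "chromedriver.exe");
WebDriver driver = new ChromeDriver();
try {
    driver.get("https://bible.usccb.org/bible/readings/090120.cfm");
    By reading = By.xpath("//*[@id='block-usccb-readings-content']/div/div[6]/div/div/div/div/div[2]");
    // Selenium 3 signature; with Selenium 4 pass Duration.ofSeconds(20) instead of 20.
    WebDriverWait wait = new WebDriverWait(driver, 20);
    WebElement el = wait.until(ExpectedConditions.visibilityOfElementLocated(reading));
    // innerHTML keeps the markup (including line breaks) that getText() flattens.
    String str = el.getAttribute("innerHTML");
    // try-with-resources closes the writer even if write() throws.
    try (BufferedWriter writer = new BufferedWriter(new FileWriter(baseDir + "Download.html"))) {
        writer.write(str);
    } catch (IOException e) {
        e.printStackTrace();
    }
} finally {
    driver.quit(); // always release the browser
}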
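
Since the question asks for a plain text file rather than an HTML file, the innerHTML string can also be turned back into separate lines before writing. The following is only a rough sketch under assumptions: it assumes the verse lines on the page are separated by <br> tags (the thread does not confirm the exact markup), and HtmlLineBreaks and toLines are hypothetical names made up for this illustration. The regex-based tag stripping handles only simple fragments and two entities; an HTML parser such as jsoup would be more robust.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Selenium.
final class HtmlLineBreaks {

    /**
     * Splits an HTML fragment into text lines, treating <br> and closing
     * </p> or </div> tags as line breaks (an assumption about the markup).
     */
    static List<String> toLines(String innerHtml) {
        String text = innerHtml
                .replaceAll("(?i)<br\\s*/?>", "\n")   // <br>, <br/>, <br />
                .replaceAll("(?i)</(p|div)>", "\n")   // block ends become breaks
                .replaceAll("<[^>]+>", "")            // drop any remaining tags
                .replace("&nbsp;", " ")
                .replace("&amp;", "&");               // decoded last to avoid double decoding
        List<String> lines = new ArrayList<>();
        for (String line : text.split("\n")) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                lines.add(trimmed);                   // skip blank lines
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for el.getAttribute("innerHTML") from the answer above.
        String sample = "R. (17) The Lord is just in all his ways.<br>\n"
                + "The LORD is gracious and merciful,<br />"
                + "slow to anger and of great kindness.";
        List<String> lines = toLines(sample);
        // Files.write ends each line with the platform line separator,
        // so the file also displays correctly in Windows editors.
        Path out = Files.createTempFile("reading", ".txt");
        Files.write(out, lines, StandardCharsets.UTF_8);
        System.out.println("Wrote " + lines.size() + " lines to " + out);
    }
}
```

In the Selenium code above, the value of str would be passed to toLines in place of the sample string, and the temporary file would be replaced by the desired target path.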
