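Extracting bold text from a div using Simple HTML DOM

I'm working on a scripting project and have spent the past 4 hours researching everything I could. My head has literally stopped working, and I really need your help.

I have a PHP cURL script that fetches data from a website. I can grab divs that have an ID and so on. But how do I get a specific piece of text out of a div when that text has no ID, class, or anything else to identify it, other than being the only bold item in the div?

Here is the HTML from the website: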

<div class="firststyle"><label for="calculator" class="class-coll-1">
                <p class="sr-only">Welcome to the calculator:</p> <b>What is one plus two?</b> </label></div>

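All I want to parse/extract from this HTML is the text "What is one plus two?". How do I target just that part?

The only thing I can do right now is parse the whole div with the following script: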

$html = str_get_html($response);
$the_question = $html->find('div[class=firststyle]');

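However, this returns all of the text, including the "Welcome to the calculator:" label that I don't need.

Is it possible to save the parsed data into a variable and then run a second pass over that variable to extract the data?

Or can I do something like:

Find the div with this ID -> find the bold text inside it

Or maybe:

Find the div with this ID -> strip out the text "Welcome to the calculator:"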

echo $html->find('.firststyle b', 0)->innertext;
#=> What is one plus two?

If you have the site's HTML, you can parse it with the DOMDocument class.

$html = file_get_contents('http://www.example.com');
$dom = new DOMDocument();
$dom->loadHTML($html);
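Real pages are often not valid HTML, and loadHTML may emit warnings while parsing them. The following is a minimal sketch of collecting those parser messages instead of printing them; the markup is a hypothetical stand-in for a fetched page, and the variable names ($previous, $errors, $text) are illustrative only.

```php
<?php
// Hypothetical markup standing in for a fetched page. <section> is an HTML5
// tag that older libxml2 versions may report as invalid.
$html = '<div id="test"><section><b>I want to be found!</b></section></div>';

// Collect parser messages internally instead of emitting PHP warnings.
$previous = libxml_use_internal_errors(true);

$dom = new DOMDocument();
$dom->loadHTML($html);

// Keep the collected messages for inspection, then restore the old setting.
$errors = libxml_get_errors();
libxml_clear_errors();
libxml_use_internal_errors($previous);

// The document is still usable even if the parser complained.
$text = $dom->getElementsByTagName('b')->item(0)->nodeValue;
echo $text, "\n";
echo count($errors), " parser message(s) collected\n";
```

Whether any messages are collected depends on the libxml2 version PHP was built against, so the count is not something to rely on.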

The DOMDocument class comes with many methods. The two you need here are getElementById and getElementsByTagName.

Something like this:

$html = '<div id="test"><b>I want to be found!</b></div><div id="poep"><b>Im not selected</b></div>';
$dom = new DOMDocument();
$dom->loadHTML($html);
$div = $dom->getElementById('test');
$text = $div->getElementsByTagName('b')->item(0)->nodeValue;
echo $text;

This will output:

I want to be found!
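The div in the question has a class rather than an ID, so getElementById would not find it. One possible way around that, sketched here with DOMXPath (not used in the answer above), is to select the div by class and then take the <b> inside it. The variable names $response and $question are illustrative, and the markup is copied from the question in place of the real cURL response.

```php
<?php
// The markup from the question, standing in for the cURL response.
$response = '<div class="firststyle"><label for="calculator" class="class-coll-1">
                <p class="sr-only">Welcome to the calculator:</p> <b>What is one plus two?</b> </label></div>';

// Keep parser warnings quiet in case the markup is not strictly valid.
libxml_use_internal_errors(true);

$dom = new DOMDocument();
$dom->loadHTML($response);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// Match a div whose class list contains "firststyle", then any <b> below it.
$query = '//div[contains(concat(" ", normalize-space(@class), " "), " firststyle ")]//b';
$nodes = $xpath->query($query);

// null means no bold element was found inside the div.
$question = $nodes->length > 0 ? trim($nodes->item(0)->textContent) : null;
echo $question, "\n";
```

For the sample markup this prints "What is one plus two?". The concat/normalize-space wrapper is there so the match still works if the div later carries more than one class.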
