How to copy all the code of a URL with Python



I want to copy all the code of a URL (http://modelseed.org/biochem/reactions/rxn00001) using Python 3.6, but I only get part of it and I don't know why.

So far I have tried the "requests" module

import requests
page = requests.get("http://modelseed.org/biochem/reactions/rxn00001")
print(page.content)

and "urllib"

import urllib.request
site = urllib.request.urlopen("http://modelseed.org/biochem/reactions/rxn00001")
print(site.read())

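The "Reaction details" information, such as "name", "id" and "abbreviation", is missing from the result, although it is visible when I inspect the page in Chrome's developer tools.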

The code I get with either of the two snippets above is:

<!DOCTYPE html>
<html lang="en" ng-app="ModelSEED">
 <head>
  <base href="/"/>
  <meta content="text/html; charset=utf-8" http-equiv="Content-Type"/>
  <meta content="IE=edge" http-equiv="X-UA-Compatible"/>
  <meta content="initial-scale=1, maximum-scale=1, user-scalable=no" name="viewport">
   <meta content="The ModelSEED is a resource for the reconstruction, exploration, comparison, and analysis of metabolic models." name="description"/>
   <link href="/img/ModelSEED-favicon.png?v=2.0" rel="shortcut icon"/>
   <meta content="nconrad" name="author"/>
   <title>
    ModelSEED
   </title>
   <link href="components/angular-material/angular-material.css" rel="stylesheet"/>
   <link href="components/bootstrap/dist/css/bootstrap.min.css" rel="stylesheet"/>
   <!-- to be removed -->
   <link href="components/font-awesome/css/font-awesome.min.css" rel="stylesheet"/>
   <link href="icomoon/style.css" rel="stylesheet"/>
   <link href="https://fonts.googleapis.com/icon?family=Material+Icons" rel="stylesheet"/>
   <link href="http://fonts.googleapis.com/css?family=Montserrat:400,700" rel="stylesheet" type="text/css"/>
   <link href="build/style.css" rel="stylesheet"/>
   <!--<script src="https://cdn.socket.io/socket.io-1.3.7.js"></script>-->
   <script src="build/site.js">
   </script>
   <!-- HTML5 Shim and Respond.js IE8 support of HTML5 elements and media queries -->
   <!-- WARNING: Respond.js doesn't work if you view the page via file:// -->
   <!--[if lt IE 9]>
        <script src="https://oss.maxcdn.com/libs/html5shiv/3.7.0/html5shiv.js"></script>
        <script src="https://oss.maxcdn.com/libs/respond.js/1.4.2/respond.min.js"></script>
    <![endif]-->
  </meta>
 </head>
 <body>
  <div style="height: 100%;" ui-view="">
  </div>
  <script>
   (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
      (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
      m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
      })(window,document,'script','//www.google-analytics.com/analytics.js','ga');
      ga('create', 'UA-67412611-1', 'auto');
      ga('send', 'pageview');
  </script>
 </body>
</html>

Does anyone have a hint as to why the code between <div style="height: 100%;" ui-view=""> and </div> (right after <body> and before <script>) is not downloaded?

Thanks.

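That content is inserted by a JavaScript script after the page loads, so neither requests nor urllib will find it: they only download the initial HTML and never run scripts. You need a browser for this. You should try Selenium or PhantomJS.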

Try something like:

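from selenium import webdriver

url = "http://modelseed.org/biochem/reactions/rxn00001"
# Selenium 3 style; newer Selenium versions take the driver path via a Service object
driver = webdriver.Chrome('./chromedriver')
driver.get(url)  # load the page in a real browser so its JavaScript runs
print(driver.page_source)  # HTML after the scripts have modified the page
driver.quit()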
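
To confirm that a page is rendered client-side before reaching for a browser, you can check whether the ui-view container in the downloaded HTML is empty. Below is a minimal sketch using only the standard library. The names UiViewProbe and ui_view_is_empty are made up for illustration, and the check assumes the page marks its container with a ui-view attribute, as the HTML in the question does.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class UiViewProbe(HTMLParser):
    """Collects the text inside the first element that carries a ui-view attribute."""

    def __init__(self):
        super().__init__()
        self.depth = 0      # nesting depth inside the ui-view element (0 = outside)
        self.found = False  # True once a ui-view element has been seen
        self.text = []      # text fragments found inside it

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.depth:
            self.depth += 1
        elif not self.found and any(name == "ui-view" for name, _ in attrs):
            self.found = True
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth and tag not in VOID_TAGS:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.text.append(data)


def ui_view_is_empty(html):
    """Return True if the HTML has a ui-view container with no text in it."""
    probe = UiViewProbe()
    probe.feed(html)
    return probe.found and not "".join(probe.text).strip()

# Hypothetical usage with the download from the question:
#   import requests
#   page = requests.get("http://modelseed.org/biochem/reactions/rxn00001")
#   print(ui_view_is_empty(page.text))
```

If this returns True for the requests download, the server sent only the empty shell and the details are filled in later by JavaScript, which is why a browser-driven tool is needed.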

Try getting this URL: https://www.patricbrc.org/api/model_reaction/?http_accept=application/json&eq(id,rxn00001)
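
A rough sketch of how that could look with urllib, assuming the API URL has the form https://www.patricbrc.org/api/model_reaction/?http_accept=application/json&eq(id,rxn00001) and returns JSON. The function names build_reaction_url and fetch_reaction are made up for illustration, and since the structure of the response is not described here, the sketch only decodes it.

```python
import json
import urllib.request

# Assumed endpoint, taken from the URL suggested in this answer.
API_BASE = "https://www.patricbrc.org/api/model_reaction/"


def build_reaction_url(reaction_id):
    """Build the query URL for one reaction id, e.g. "rxn00001"."""
    return "{}?http_accept=application/json&eq(id,{})".format(API_BASE, reaction_id)


def fetch_reaction(reaction_id, timeout=30):
    """Download and decode the JSON for a reaction (needs network access)."""
    url = build_reaction_url(reaction_id)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))

# Hypothetical usage:
#   data = fetch_reaction("rxn00001")
#   print(json.dumps(data, indent=2))
```

Asking the data API directly avoids running a browser at all, because the page's JavaScript presumably gets the reaction details from a service like this one.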
