Python if/else not behaving correctly (or I'm just being dumb)



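I'm still a beginner, so I'm sure the problem comes from something silly I did.

Basically, I'm trying to find out which websites have only one or both versions of Google Analytics (UA -> Universal Analytics and GA4 -> Google Analytics 4).

As I see it, the best way is to capture the network requests and use the URLs to tell the two apart (see the difference between the variables "ga4check" and "uacheck").

Capturing the network requests and parsing them works fine, but when I use an if/else statement to check for their presence, it doesn't work. It basically evaluates to False at the first if, because the output is "Something isn't right...".

Here is my code: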

from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
import time
import json

ga4check = 'google-analytics.com/g/collect?v=2&tid=G-'
uacheck = 'google-analytics.com/collect?v=1&_v='
collectlist = []

if __name__ == "__main__":
    desired_capabilities = DesiredCapabilities.CHROME
    desired_capabilities["goog:loggingPrefs"] = {"performance": "ALL"}
    options = webdriver.ChromeOptions()
    options.add_argument('headless')
    options.add_argument("--ignore-certificate-errors")
    driver = webdriver.Chrome(executable_path=r'C:\Users\dgayg\Desktop\Scripts\GA4 finder\chromedriver.exe',
                              chrome_options=options,
                              desired_capabilities=desired_capabilities)
    driver.get("https://www.measureschool.com/")
    time.sleep(10)
    logs = driver.get_log("performance")
    with open("network_log.json", "w", encoding="utf-8") as f:
        f.write("[")
        for log in logs:
            network_log = json.loads(log["message"])["message"]
            if ("Network.response" in network_log["method"]
                    or "Network.request" in network_log["method"]
                    or "Network.webSocket" in network_log["method"]):
                f.write(json.dumps(network_log) + ",")
        f.write("{}]")
    print("Quitting Selenium WebDriver")
    driver.quit()
    json_file_path = "network_log.json"
    with open(json_file_path, "r", encoding="utf-8") as f:
        logs = json.loads(f.read())
    for log in logs:
        try:
            url = log["params"]["request"]["url"]
            if "collect?v=" in url:
                collectlist.append(url)
        except Exception as e:
            pass
    if any(uacheck in i for i in collectlist):
        if any(ga4check in i for i in collectlist):
            print("There's UA and GA4 on this website")
        elif any(ga4check not in i for i in collectlist):
            print('Only UA is present on this website')
    else:
        print("Something isn't right...")

Output:

C:\Users\dgayg\Desktop\Scripts\GA4 finder> & C:/Users/dgayg/AppData/Local/Programs/Python/Python39/python.exe "c:/Users/dgayg/Desktop/Scripts/GA4 finder/main.py"
c:\Users\dgayg\Desktop\Scripts\GA4 finder\main.py:18: DeprecationWarning: executable_path has been deprecated, please pass in a Service object
  driver = webdriver.Chrome(executable_path=r'C:\Users\dgayg\Desktop\Scripts\GA4 finder\chromedriver.exe',
c:\Users\dgayg\Desktop\Scripts\GA4 finder\main.py:18: DeprecationWarning: use options instead of chrome_options
  driver = webdriver.Chrome(executable_path=r'C:\Users\dgayg\Desktop\Scripts\GA4 finder\chromedriver.exe',
DevTools listening on ws://127.0.0.1:14224/devtools/browser/ea27a598-5b1d-48e2-bffa-1bf849b826b8
[0731/193526.701:INFO:CONSOLE(0)] "Failed to set referrer policy: The value '' is not one of 'no-referrer', 'no-referrer-when-downgrade', 'origin', 'origin-when-cross-origin', 'same-origin', 'strict-origin', 'strict-origin-when-cross-origin', or 'unsafe-url'. The referrer policy has been left unchanged.", source:  (0)
[0731/193526.880:INFO:CONSOLE(2)] "JQMIGRATE: Migrate is installed, version 3.3.2", source: https://measureschool.com/wp-includes/js/jquery/jquery-migrate.min.js?ver=3.3.2 (2)
Quitting Selenium WebDriver
Something isn't right...

Here is the output of collectlist:

['https://region1.google-analytics.com/g/collect?v=2&tid=G-QG5JR71SF7&gtm=2oe7r0&_p=877231823&_z=ccd.v9B&cid=879701179.1659290205&ul=en-us&sr=800x600&_s=1&sid=1659290205&sct=1&seg=0&dl=https%3A%2F%2Fmeasureschool.com%2F&dt=MeasureSchool%20-%20The%20Data-Driven%20Way%20of%20Digital%20Marketing&en=page_view&_fv=1&_nsi=1&_ss=1', 'https://px.ads.linkedin.com/collect?v=2&fmt=js&pid=1024658&time=1659290205477&url=https%3A%2F%2Fmeasureschool.com%2F', 'https://www.google-analytics.com/j/collect?v=1&_v=j96&a=877231823&t=pageview&_s=1&dl=https%3A%2F%2Fmeasureschool.com%2F&dp=%2F&ul=en-us&de=UTF-8&dt=MeasureSchool%20-%20The%20Data-Driven%20Way%20of%20Digital%20Marketing&sd=24-bit&sr=800x600&vp=774x600&je=0&_u=4CDACEABBAAAAC~&jid=253819846&gjid=797933957&cid=879701179.1659290205&tid=UA-58541733-2&_gid=1754033578.1659290206&_r=1&gtm=2wg7r0593KN2&z=952191590', 'https://px.ads.linkedin.com/collect?v=2&fmt=js&pid=1024658&time=1659290205477&url=https%3A%2F%2Fmeasureschool.com%2F&liSync=true', 'https://px4.ads.linkedin.com/collect?v=2&fmt=js&pid=1024658&time=1659290205477&url=https%3A%2F%2Fmeasureschool.com%2F&liSync=true&e_ipv6=AQJQvwc9MAD7QAAAAYJVZz8nCTxegeWWl3Feqs04Ry8lLAYe4tRStgs5YUf0ek2yseMWT3wlT2oSrFcxugGX91BzO2PCy9w']

I hope what I'm trying to achieve is clear enough.

Thanks in advance!!

Looking at your collectlist:

collectlist = [
'https://region1.google-analytics.com/g/collect?v=2&tid=G-QG5JR71SF7&gtm=2oe7r0&_p=877231823&_z=ccd.v9B&cid=879701179.1659290205&ul=en-us&sr=800x600&_s=1&sid=1659290205&sct=1&seg=0&dl=https%3A%2F%2Fmeasureschool.com%2F&dt=MeasureSchool%20-%20The%20Data-Driven%20Way%20of%20Digital%20Marketing&en=page_view&_fv=1&_nsi=1&_ss=1',
'https://px.ads.linkedin.com/collect?v=2&fmt=js&pid=1024658&time=1659290205477&url=https%3A%2F%2Fmeasureschool.com%2F',
'https://www.google-analytics.com/j/collect?v=1&_v=j96&a=877231823&t=pageview&_s=1&dl=https%3A%2F%2Fmeasureschool.com%2F&dp=%2F&ul=en-us&de=UTF-8&dt=MeasureSchool%20-%20The%20Data-Driven%20Way%20of%20Digital%20Marketing&sd=24-bit&sr=800x600&vp=774x600&je=0&_u=4CDACEABBAAAAC~&jid=253819846&gjid=797933957&cid=879701179.1659290205&tid=UA-58541733-2&_gid=1754033578.1659290206&_r=1&gtm=2wg7r0593KN2&z=952191590',
'https://px.ads.linkedin.com/collect?v=2&fmt=js&pid=1024658&time=1659290205477&url=https%3A%2F%2Fmeasureschool.com%2F&liSync=true',
'https://px4.ads.linkedin.com/collect?v=2&fmt=js&pid=1024658&time=1659290205477&url=https%3A%2F%2Fmeasureschool.com%2F&liSync=true&e_ipv6=AQJQvwc9MAD7QAAAAYJVZz8nCTxegeWWl3Feqs04Ry8lLAYe4tRStgs5YUf0ek2yseMWT3wlT2oSrFcxugGX91BzO2PCy9w'
]

And looking at your value for uacheck:

uacheck = 'google-analytics.com/collect?v=1&_v='

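There are no entries in collectlist that contain uacheck. The Universal Analytics request in your list goes to google-analytics.com/j/collect, so the substring google-analytics.com/collect never matches. You do have a ga4check URL, but because your code doesn't first find at least one uacheck, it never gets as far as looking for ga4check.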
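A minimal sketch to confirm the mismatch, plus one possible alternative that does not depend on the exact path. The two URLs are shortened copies of entries from your collectlist, and ga_kind is a hypothetical helper, not part of your code. It assumes the tid query parameter is present in the URL and starts with UA- or G-, as it does in the requests shown here; that may not hold for every site.

```python
from urllib.parse import urlparse, parse_qs

uacheck = 'google-analytics.com/collect?v=1&_v='
ga4check = 'google-analytics.com/g/collect?v=2&tid=G-'

# Shortened copies of two URLs from collectlist (query strings trimmed for readability).
ua_url = 'https://www.google-analytics.com/j/collect?v=1&_v=j96&t=pageview&tid=UA-58541733-2'
ga4_url = 'https://region1.google-analytics.com/g/collect?v=2&tid=G-QG5JR71SF7&en=page_view'

print(uacheck in ua_url)    # False: the path is "/j/collect", not "/collect"
print(ga4check in ga4_url)  # True


def ga_kind(url):
    """Return 'UA', 'GA4', or None based on the host and the tid parameter.

    Hypothetical helper; assumes tid appears in the query string.
    """
    parts = urlparse(url)
    host = parts.hostname or ''
    if not (host == 'google-analytics.com' or host.endswith('.google-analytics.com')):
        return None
    tid = parse_qs(parts.query).get('tid', [''])[0]
    if tid.startswith('UA-'):
        return 'UA'
    if tid.startswith('G-'):
        return 'GA4'
    return None


print(ga_kind(ua_url))   # UA
print(ga_kind(ga4_url))  # GA4
```

With a helper like this, the two flags could be computed as any(ga_kind(i) == 'UA' for i in collectlist) and any(ga_kind(i) == 'GA4' for i in collectlist).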

I believe you probably want to structure your checks more like this:

any_ua = any(uacheck in i for i in collectlist)
any_ga4 = any(ga4check in i for i in collectlist)
if any_ua and any_ga4:
    print("There's UA and GA4 on this website")
elif any_ua:
    print('Only UA is present on this website')
elif any_ga4:
    print('Only GA4 is present on this website')
else:
    print('Neither is present on this website.')

Since the if/elif chain effectively checks every possible combination of two booleans, you could also express it as a 2x2 truth table, like so:

any_ua = any(uacheck in i for i in collectlist)
any_ga4 = any(ga4check in i for i in collectlist)
print([
    # no GA4               # some GA4
    ["Neither is present", "Only GA4 is present"],  # no UA
    ["Only UA is present", "There's UA and GA4"],   # some UA
][any_ua][any_ga4], "on this website")
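The table lookup works because bool is a subclass of int in Python, so False and True index a list as 0 and 1. A small sketch that wraps the same table in a function (the name describe is hypothetical) makes each combination easy to check by hand:

```python
def describe(any_ua, any_ga4):
    """Look up the message for a pair of booleans (hypothetical helper)."""
    table = [
        # no GA4               # some GA4
        ["Neither is present", "Only GA4 is present"],  # no UA
        ["Only UA is present", "There's UA and GA4"],   # some UA
    ]
    # bool is a subclass of int: False selects index 0, True selects index 1.
    return table[any_ua][any_ga4] + " on this website"


print(describe(False, True))  # Only GA4 is present on this website
print(describe(True, True))   # There's UA and GA4 on this website
```

The if/elif version is probably easier to read for most people; the table is mainly a compact way to see that all four cases are covered.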
