GitPython and git diff



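I only want the diff of the files that changed in a git repo. Right now I am using GitPython to get the commit objects and the files changed by a commit, but I want to run a dependency analysis on only the changed parts of each file. Is there a way to get the git diff from GitPython, or do I have to compare each file myself by reading it line by line?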

If you want to access the contents of the diff, try this:

import git

# repo_root is assumed to be a pathlib.Path pointing at the repository
repo = git.Repo(repo_root.as_posix())
commit_dev = repo.commit("dev")
commit_origin_dev = repo.commit("origin/dev")
diff_index = commit_origin_dev.diff(commit_dev)
# 'M' selects only the files that were modified
for diff_item in diff_index.iter_change_type('M'):
    print("A blob:\n{}".format(diff_item.a_blob.data_stream.read().decode('utf-8')))
    print("B blob:\n{}".format(diff_item.b_blob.data_stream.read().decode('utf-8')))

This prints the full contents of each modified file, before (A) and after (B) the change.

You can use GitPython to run the git "diff" command. You only need the "tree" object of the commit, or the branch, you want to see the diff for. For example:

from git import Repo

repo = Repo('/git/repository')
t = repo.head.commit.tree
print(repo.git.diff(t))

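This prints the diffs of "all" files at once, that is, every file in your working tree that differs from that tree, so if you want them per file you have to iterate over them.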
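As a rough illustration of iterating per file, the text returned by `repo.git.diff(...)` can be split at the `diff --git` header that git writes before each file's section. This is only a sketch: `split_diff_by_file` is a hypothetical helper name, not part of GitPython, and the simple pattern assumes paths that do not themselves contain " b/".

```python
import re
from typing import Dict, List

# Matches the header git writes before each file's section, e.g.
# "diff --git a/src/app.py b/src/app.py".
# Assumption: paths do not contain " b/" (not handled by this pattern).
_HEADER = re.compile(r"^diff --git a/(.+?) b/(.+)$")


def split_diff_by_file(diff_text: str) -> Dict[str, str]:
    """Split the text of a multi-file `git diff` into one chunk per file."""
    chunks: Dict[str, List[str]] = {}
    current = None
    for line in diff_text.splitlines():
        match = _HEADER.match(line)
        if match:
            current = match.group(2)  # path on the "b" (new) side
            chunks[current] = []
        if current is not None:
            chunks[current].append(line)
    return {path: "\n".join(lines) for path, lines in chunks.items()}
```

With a repository at hand, something like `split_diff_by_file(repo.git.diff('HEAD~1'))` should give a dictionary keyed by file path.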

For the current branch, it is:

repo.git.diff('HEAD~1')

Hope this helps. Regards.

As you noticed, Git does not store diffs. Given two blobs (before and after a change), you can use Python's difflib module to compare the data.
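A minimal sketch of the difflib approach, assuming both versions of the file have already been decoded to text. The function name `unified_blob_diff` and the `path` label are illustrative only.

```python
import difflib


def unified_blob_diff(old_text: str, new_text: str, path: str = "file") -> str:
    """Return a unified diff between two versions of a file's text."""
    diff_lines = difflib.unified_diff(
        old_text.splitlines(keepends=True),
        new_text.splitlines(keepends=True),
        fromfile=f"a/{path}",  # label for the "before" side
        tofile=f"b/{path}",    # label for the "after" side
    )
    # unified_diff yields lines that already end in "\n"
    return "".join(diff_lines)
```

With GitPython, the two texts could come from `diff_item.a_blob.data_stream.read().decode('utf-8')` and the matching `b_blob` call. An empty string means the two versions are identical.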

If you want to recreate something close to the standard git diff output, try:

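# cloned_repo = git.Repo.clone_from(
#     url=ssh_url,
#     to_path=repo_dir,
#     env={"GIT_SSH_COMMAND": "ssh -i " + SSH_KEY},
# )
repo_diff = ""
# index.diff(None) compares the index with the working tree
for diff_item in cloned_repo.index.diff(None, create_patch=True):
    # a_path/b_path are used because a_blob or b_blob is None for added or deleted files
    repo_diff += (
        f"--- a/{diff_item.a_path}\n+++ b/{diff_item.b_path}\n"
        f"{diff_item.diff.decode('utf-8')}\n\n"
    )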
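If the goal is to analyse only the changed parts of a file, the hunk headers in patch text like the one assembled above already say where the changes are. The sketch below is one possible way to read them; `changed_line_ranges` is a hypothetical helper, and the ranges it returns include the context lines git puts around each change.

```python
import re
from typing import List, Tuple

# Hunk headers look like "@@ -12,3 +14,5 @@"; an omitted count means 1.
_HUNK = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,(\d+))? @@")


def changed_line_ranges(patch_text: str) -> List[Tuple[int, int]]:
    """Return (first_line, last_line) in the new file for each hunk."""
    ranges = []
    for line in patch_text.splitlines():
        match = _HUNK.match(line)
        if not match:
            continue
        start = int(match.group(1))
        count = int(match.group(2)) if match.group(2) is not None else 1
        if count > 0:  # a count of 0 means the hunk only deletes lines
            ranges.append((start, start + count - 1))
    return ranges
```

The input would be the decoded patch of a single file, for example `diff_item.diff.decode('utf-8')` when the diff was created with `create_patch=True`.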

I suggest you use PyDriller instead (it uses GitPython internally). It is much easier to use:

from pydriller import Repository

for commit in Repository("path_to_repo").traverse_commits():
    for modified_file in commit.modified_files:  # the files changed by this commit
        print(modified_file.diff)
        # etc...

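You can also analyze a single commit by passing its hash (in PyDriller 2.x the class is called Repository; it was RepositoryMining in older releases):

for commit in Repository("path_to_repo", single="123213").traverse_commits():
    print(commit.msg)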

If you want to run git diff on one file between two commits, this is one way to do it:

import git

repo = git.Repo()
path_to_a_file = "diff_this_file_across_commits.txt"

# iter_commits yields the newest commit first
commits_touching_path = list(repo.iter_commits(paths=path_to_a_file))

print(repo.git.diff(commits_touching_path[0], commits_touching_path[1], path_to_a_file))

This shows the differences between the two most recent commits that touched the specified file.

repo.git.diff("main", "HEAD~5")

+1 for PyDriller:

pip install pydriller

but with the new API (a breaking change):

from pydriller import Repository
for commit in Repository('https://github.com/ishepard/pydriller').traverse_commits():
    print(commit.hash)
    print(commit.msg)
    print(commit.author.name)
    for file in commit.modified_files:
        print(file.filename, ' has changed')

Here is how to do it:

import git

repo = git.Repo("path/of/repo/")
# the below gives us all commits, newest first
commits = list(repo.iter_commits())
# take the two most recent commits
a_commit = commits[0]
b_commit = commits[1]
# now get the diff
print(repo.git.diff(a_commit, b_commit))
