# Root directory where repo-old lives
# and a temporary location for git filter-branch
root="$PWD"
temp='/dev/shm/tmp'
# The old repository and the subdirectory we'd like to extract
repo_old="$root/repo-old"
repo_old_directory='sub'
# The new submodule repository, its url
# and a hash map folder which will be populated
# and later used in the filter script below
repo_sub="$root/repo-sub"
repo_sub_url='https://github.com/somewhere/repo-sub.git'
repo_sub_hashmap="$root/repo-sub.map"
# The new modified repository, its url
# and a filter script which is created as heredoc below
repo_new="$root/repo-new"
repo_new_url='https://github.com/somewhere/repo-new.git'
repo_new_filter="$root/repo-new.sh"
Filter script
# The index filter script which converts our subdirectory into a submodule
cat << EOF > "$repo_new_filter"
#!/bin/bash
# Submodule hash map function
sub ()
{
local old_commit=\$(git rev-list -1 \$1 -- '$repo_old_directory')
if [ ! -z "\$old_commit" ]
then
echo \$(cat "$repo_sub_hashmap/\$old_commit")
fi
}
# Submodule config
SUB_COMMIT=\$(sub \$GIT_COMMIT)
SUB_DIR='$repo_old_directory'
SUB_URL='$repo_sub_url'
# Submodule replacement
if [ ! -z "\$SUB_COMMIT" ]
then
touch '.gitmodules'
git config --file='.gitmodules' "submodule.\$SUB_DIR.path" "\$SUB_DIR"
git config --file='.gitmodules' "submodule.\$SUB_DIR.url" "\$SUB_URL"
git config --file='.gitmodules' "submodule.\$SUB_DIR.branch" 'master'
git add '.gitmodules'
git rm --cached -qrf "\$SUB_DIR"
git update-index --add --cacheinfo 160000 \$SUB_COMMIT "\$SUB_DIR"
fi
EOF
chmod +x "$repo_new_filter"
1. Subdirectory extraction
cd "$root"
# Create a new clone for our new submodule repo
git clone "$repo_old" "$repo_sub"
# Enter the new submodule repo
cd "$repo_sub"
# Remove the old origin remote
git remote remove origin
# Loop over all commits and create temporary tags
for commit in $(git rev-list --all)
do
git tag "temp_$commit" $commit
done
# Extract the subdirectory and slice commits
mkdir -p "$temp"
git filter-branch --subdirectory-filter "$repo_old_directory" \
--tag-name-filter 'cat' \
--prune-empty --force -d "$temp" -- --all
# Populate hash map folder from our previously created tag names
mkdir -p "$repo_sub_hashmap"
for tag in $(git tag | grep "^temp_")
do
old_commit=${tag#'temp_'}
sub_commit=$(git rev-list -1 $tag)
echo $sub_commit > "$repo_sub_hashmap/$old_commit"
done
# Delete the temporary tags again (silence both stdout and stderr)
git tag | grep "^temp_" | xargs -d '\n' git tag -d > /dev/null 2>&1
# Add the new url for this repository (and e.g. push)
git remote add origin "$repo_sub_url"
# git push -u origin master
2. Subdirectory replacement
cd "$root"
# Create a clone for our modified repo
git clone "$repo_old" "$repo_new"
# Enter the new modified repo
cd "$repo_new"
# Remove the old origin remote
git remote remove origin
# Replace the subdirectory and map all sliced submodule commits using
# the filter script from above
mkdir -p "$temp"
git filter-branch --index-filter "$repo_new_filter" \
--tag-name-filter 'cat' --force -d "$temp" -- --all
# Add the new url for this repository (and e.g. push)
git remote add origin "$repo_new_url"
# git push -u origin master
# Cleanup (commented for safety reasons)
# rm -rf "$repo_sub_hashmap"
# rm -f "$repo_new_filter"
cd "$root"
# Clone the new modified repo recursively
git clone --recursive "$repo_new" "$repo_new-tmp"
# Now use the newly cloned one
mv "$repo_new" "$repo_new-bak"
mv "$repo_new-tmp" "$repo_new"
# Cleanup (commented for safety reasons)
# rm -rf "$repo_new-bak"
WARNING: git-filter-branch has a glut of gotchas generating mangled history
rewrites. Hit Ctrl-C before proceeding to abort, then use an
alternative filtering tool such as 'git filter-repo'
(https://github.com/newren/git-filter-repo/) instead. See the
filter-branch manual page for more details; to squelch this warning,
set FILTER_BRANCH_SQUELCH_WARNING=1.
#!/bin/bash
# put this or the commented version below in e.g. ~/bin/git-split-submodule
${GIT_COMMIT-exec git filter-branch --index-filter "subdir=$subdir; ${debug+debug=$debug;} $(sed 1,/SNIP/d "$0")" "$@"}
${debug+set -x}
fam=(`git rev-list --no-walk --parents $GIT_COMMIT`)
pathcheck=(`printf "%s:$subdir\\n" ${fam[@]} \
| git cat-file --batch-check='%(objectname)' | uniq`)
[[ $pathcheck = *:* ]] || {
subfam=($( set -- ${fam[@]}; shift;
for par; do tpar=`map $par`; [[ $tpar != $par ]] &&
git rev-parse -q --verify $tpar:"$subdir"
done
))
git rm -rq --cached --ignore-unmatch "$subdir"
if (( ${#pathcheck[@]} == 1 && ${#fam[@]} > 1 && ${#subfam[@]} > 0)); then
git update-index --add --cacheinfo 160000,$subfam,"$subdir"
else
subnew=`git cat-file -p $GIT_COMMIT | sed 1,/^$/d \
| git commit-tree $GIT_COMMIT:"$subdir" $(
${subfam:+printf ' -p %s' ${subfam[@]}}) 2>&-
` &&
git update-index --add --cacheinfo 160000,$subnew,"$subdir"
fi
}
${debug+set +x}
#!/bin/bash
# Git filter-branch to split a subdirectory into a submodule history.
# In each commit, the subdirectory tree is replaced in the index with an
# appropriate submodule commit.
# * If the subdirectory tree has changed from any parent, or there are
# no parents, a new submodule commit is made for the subdirectory (with
# the current commit's message, which should presumably say something
# about the change). The new submodule commit's parents are the
# submodule commits in any rewrites of the current commit's parents.
# * Otherwise, the submodule commit is copied from a parent.
# Since the new history includes references to the new submodule
# history, the new submodule history isn't dangling, it's incorporated.
# Branches for any part of it can be made casually and pushed into any
# other repo as desired, so hooking up the `git submodule` helper
# command's conveniences is easy, e.g.
# subdir=utils git split-submodule master
# git branch utils $(git rev-parse master:utils)
# git clone -sb utils . ../utilsrepo
# and you can then submodule add from there in other repos, but really,
# for small utility libraries and such, just fetching the submodule
# histories into your own repo is easiest. Setup on cloning a
# project using "incorporated" submodules like this is:
# setup: utils/.git
#
# utils/.git:
# @if _=`git rev-parse -q --verify utils`; then \
# git config submodule.utils.active true \
# && git config submodule.utils.url "`pwd -P`" \
# && git clone -s . utils -nb utils \
# && git submodule absorbgitdirs utils \
# && git -C utils checkout $$(git rev-parse :utils); \
# fi
# with `git config -f .gitmodules submodule.utils.path utils` and
# `git config -f .gitmodules submodule.utils.url ./`; cloners don't
# have to do anything but `make setup`, and `setup` should be a prereq
# on most things anyway.
# You can test that a commit and its rewrite put the same tree in the
# same place with this function:
# testit ()
# {
# tree=($(git rev-parse `git rev-parse $1`: refs/original/refs/heads/$1));
# echo $tree `test $tree != ${tree[1]} && echo ${tree[1]}`
# }
# so e.g. `testit make~95^2:t` will print the `t` tree there and if
# the `t` tree at ~95^2 from the original differs it'll print that too.
# To run it, say `subdir=path/to/it git split-submodule` with whatever
# filter-branch args you want.
# $GIT_COMMIT is set if we're already in filter-branch, if not, get there:
${GIT_COMMIT-exec git filter-branch --index-filter "subdir=$subdir; ${debug+debug=$debug;} $(sed 1,/SNIP/d "$0")" "$@"}
${debug+set -x}
fam=(`git rev-list --no-walk --parents $GIT_COMMIT`)
pathcheck=(`printf "%s:$subdir\\n" ${fam[@]} \
| git cat-file --batch-check='%(objectname)' | uniq`)
[[ $pathcheck = *:* ]] || {
subfam=($( set -- ${fam[@]}; shift;
for par; do tpar=`map $par`; [[ $tpar != $par ]] &&
git rev-parse -q --verify $tpar:"$subdir"
done
))
git rm -rq --cached --ignore-unmatch "$subdir"
if (( ${#pathcheck[@]} == 1 && ${#fam[@]} > 1 && ${#subfam[@]} > 0)); then
# one id same for all entries, copy mapped mom's submod commit
git update-index --add --cacheinfo 160000,$subfam,"$subdir"
else
# no mapped parents or something changed somewhere, make new
# submod commit for current subdir content. The new submod
# commit has all mapped parents' submodule commits as parents:
subnew=`git cat-file -p $GIT_COMMIT | sed 1,/^$/d \
| git commit-tree $GIT_COMMIT:"$subdir" $(
${subfam:+printf ' -p %s' ${subfam[@]}}) 2>&-
` &&
git update-index --add --cacheinfo 160000,$subnew,"$subdir"
fi
}
${debug+set +x}
# install git-filter-repo, see [1] for install via pip, or other OS's.
sudo apt-get install git-filter-repo
# copy your repo; everything EXCEPT the subdir will be deleted, and the subdir will become root.
# --no-local is required to prevent git from hard linking to files in the original, and is checked by `filter-repo`
git clone working-dir/.git working-dir-copy --no-local
cd working-dir-copy
# extract the desired subdirectory and its history.
git filter-repo --subdirectory-filter foodir
# foodir is now its own directory. Push it to github/gitlab etc
git remote add origin user@hosting/project.git
git push -u origin --all
git push -u origin --tags
# The original checkout needs the full LFS history of all images
git lfs fetch --all
# The clone needs to copy that history
git lfs install --skip-smudge
git lfs pull working-dir --all
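9 answers
yjghlzjz1#
To isolate a subdirectory into its own repository, use `filter-branch` on a clone of the original repository:
Then just delete the original directory and add the submodule to your parent project.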
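The extraction step described above can be sketched roughly as follows. This is only an illustration on a throwaway repository created under `mktemp -d`; the directory name `ABC`, the file names and the demo identity are assumptions made for the example, not part of the original answer.

```bash
#!/usr/bin/env bash
# Sketch: extract the (hypothetical) subdirectory ABC/ from a clone with filter-branch.
set -euo pipefail
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip filter-branch's deprecation notice

work=$(mktemp -d)
cd "$work"

# Hypothetical original repository with a subdirectory ABC/
git init -q original
cd original
mkdir ABC
echo 'lib' > ABC/lib.txt
echo 'app' > app.txt
git add .
git commit -q -m 'initial'
echo 'more' >> ABC/lib.txt
git commit -q -am 'change ABC'
cd ..

# Work on a clone so the original repository stays untouched
git clone -q original ABC-repo
cd ABC-repo
git remote remove origin

# Keep only the history of ABC/ and make it the new repository root
git filter-branch --subdirectory-filter ABC -- --all > /dev/null 2>&1

extracted_files=$(git ls-tree -r --name-only HEAD)
extracted_commits=$(git rev-list --count HEAD)
echo "files: $extracted_files, commits: $extracted_commits"
```

Running it on a scratch copy first makes it easy to check that the rewritten history contains only the files you expect before touching the real repository.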
sshcrbum2#
First change directory to the folder that will become the submodule. Then:
km0tfn4u3#
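I know this is an old thread, but the answers here squash any related commits in other branches.
A simple way to clone and keep all those extra branches and commits:
1 - Make sure you have this git alias
2 - Clone the remote, pull all branches, change the remote, filter your directory, push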
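A rough sketch of step 2, under some assumptions: the alias from the original answer is not reproduced here, so a plain loop creates the local branches instead; the repository names, the branch names `main` and `feature`, the directory `sub` and the new remote URL are all hypothetical.

```bash
#!/usr/bin/env bash
# Sketch: keep every branch while filtering a subdirectory out of a fresh clone.
set -euo pipefail
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
export FILTER_BRANCH_SQUELCH_WARNING=1

work=$(mktemp -d)
cd "$work"

# Hypothetical upstream repository with two branches touching sub/
git init -q upstream
cd upstream
git symbolic-ref HEAD refs/heads/main
mkdir sub
echo 'a' > sub/a.txt
echo 'top' > top.txt
git add .
git commit -q -m 'first'
git checkout -q -b feature
echo 'b' > sub/b.txt
git add .
git commit -q -m 'feature work in sub'
git checkout -q main
cd ..

# Clone, then create a local branch for every remote-tracking branch
git clone -q --no-local upstream split
cd split
for ref in $(git for-each-ref --format='%(refname)' refs/remotes/origin); do
    branch=${ref#refs/remotes/origin/}
    if [ "$branch" = HEAD ]; then continue; fi
    git show-ref --verify --quiet "refs/heads/$branch" ||
        git branch -q --track "$branch" "origin/$branch"
done

# Change the remote, filter the directory on all branches, then push
git remote remove origin
git filter-branch --subdirectory-filter sub -- --all > /dev/null 2>&1
git remote add origin 'https://example.com/somewhere/sub.git'   # hypothetical URL
# git push -u origin --all

branches=$(git for-each-ref --format='%(refname:short)' refs/heads | xargs)
feature_commits=$(git rev-list --count feature)
echo "branches: $branches, commits on feature: $feature_commits"
```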
oprakyz74#
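Status quo
Let's assume we have a repository called `repo-old` that contains a subdirectory `sub`, which we would like to convert into a submodule with its own repository `repo-sub`. The original repository `repo-old` should also be converted into a modified repository `repo-new`, in which all commits touching the previously existing subdirectory `sub` now point to the corresponding commits of our extracted submodule repository `repo-sub`.
Let's change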
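With the help of `git filter-branch`, this can be achieved in a two-step process:
1. Subdirectory extraction from `repo-old` to `repo-sub` (already mentioned in the accepted answer)
2. Subdirectory replacement from `repo-old` to `repo-new` (with proper commit mapping)
Remarks: I know this question is old, and it has already been mentioned that `git filter-branch` is somewhat deprecated and can be dangerous. On the other hand, it may help others with personal repositories that are easy to validate after the conversion. So be warned! And please let me know if there is another tool that does the same thing without being deprecated and that is safe to use.
Below I explain how I realized both steps on Linux with git version 2.26.2. Older versions might work to some extent, but that needs to be tested.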
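For the sake of simplicity, I restrict myself to the case where the original repository `repo-old` has just a `master` branch and an `origin` remote. Also note that I rely on temporary git tags with the prefix `temp_`, which are removed in the process, so if similarly named tags already exist you may need to adjust the prefix below. Finally, be aware that I have not tested this extensively, and there may be corner cases where the recipe fails. So please back up everything before proceeding!
The following bash snippets can be concatenated into one big script, which should then be executed in the folder where the repository `repo-old` lives. Copying and pasting everything directly into a command window is not recommended (even though I have tested this successfully)!
0. Preparation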
Variables
Filter script
1. Subdirectory extraction
2. Subdirectory replacement
**Remark:** If the newly created repository `repo-new` hangs during `git submodule update --init`, try re-cloning the repository recursively once:
gk7wooem5#
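It can be done, but it's not simple. If you search for `git filter-branch`, `subdirectory` and `submodule`, there are some decent write-ups on the process. It essentially entails creating two clones of your project and using `git filter-branch` in one of them to remove everything except the one subdirectory, then removing only that subdirectory in the other. After that you can establish the second repository as a submodule of the first.
lmyy7pcs6#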
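The current answer by @knittl using `filter-branch` gets us very close to the desired effect, but when I tried it, Git threw a warning at me:
Nine years after this question was asked and answered, `filter-branch` has been deprecated in favor of `git filter-repo`. Indeed, when I looked at my git history with `git log --all --oneline --graph`, it was full of irrelevant commits.
So how do you use `git filter-repo`? GitHub has a good article outlining the process. (Note that you need to install it independently of git. I used the Python version via `pip3 install git-filter-repo`.)
In case they decide to move or delete that article, I summarize and generalize their procedure below:
From there, you just need to register the new repository as a submodule where you want it to be: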
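A minimal sketch of that registration step, run against throwaway repositories. The local repository `foodir-repo` stands in for the remote you pushed the extracted history to, and `working-dir` stands in for the parent project; both are assumptions for the demo. The `-c protocol.file.allow=always` option is only there because the demo uses a local path as the submodule URL, which recent git versions refuse by default; with a normal remote URL it should not be needed.

```bash
#!/usr/bin/env bash
# Sketch: replace a plain directory with a submodule pointing at the extracted repo.
set -euo pipefail
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# Stand-in for the extracted repository (in practice: your hosted remote)
git init -q foodir-repo
( cd foodir-repo && echo 'lib' > lib.txt && git add . && git commit -q -m 'extracted history' )

# Stand-in for the parent project that still carries foodir/ as a plain directory
git init -q working-dir
cd working-dir
mkdir foodir
echo 'lib' > foodir/lib.txt
echo 'app' > app.txt
git add .
git commit -q -m 'initial'

# 1. Drop the plain directory from the parent project
git rm -r -q foodir
git commit -q -m 'Remove foodir in preparation for the submodule'

# 2. Register the extracted repository at the same path
git -c protocol.file.allow=always submodule --quiet add "$work/foodir-repo" foodir
git commit -q -m 'Add foodir as a submodule'

# The parent now records foodir as a gitlink (mode 160000) instead of a tree
entry_mode=$(git ls-tree HEAD foodir | cut -d' ' -f1)
echo "foodir mode: $entry_mode"
```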
pkln4tw67#
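This does the conversion in place; you can back it out as you would any filter-branch (I use `git fetch . +refs/original/*:*`).
I have a project with a `utils` library that has started to be useful in other projects, and I wanted to split its history off into a submodule. I didn't think to look on SO first, so I wrote my own. It builds the history locally, so it's a bit faster; afterwards, if you want, you can set up the helper command's `.gitmodules` file and so on, and push the submodule history anywhere you like.
The stripped command is here; the documentation is in the comments of the unstripped version that follows. Run it as its own command with `subdir` set, for example `subdir=utils git split-submodule` if you're splitting the `utils` directory. It's hacky because it's a one-off, but I tested it on the Documentation subdirectory in the Git history.
jgzswidk8#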
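The official git project now recommends using git-filter-repo.
Thanks also to this gist.
Edit: For LFS users (poor souls), git clone does not pull the entire LFS history of an image, which causes git push to fail.
[1] https://github.com/newren/git-filter-repo/blob/main/INSTALL.md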
7eumitmz9#
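If it is acceptable to keep the previous history in the parent folder only, a simple solution is to remove the subfolder from the index and start a new repository or submodule at the same path.
1. Add `subdir` to `.gitignore`
2. `git rm -r --cached subdir`
3. `git add .gitignore && git commit`
4. `cd subdir && git init && git add .`
5. Commit the initial files in the new `subdir` repository
From `git help rm`:
--cached: Use this option to unstage and remove paths only from the index. Working tree files, whether modified or not, will be left alone.
Having used submodules in production code, I can say it is a fine solution, especially because it records the project's dependencies.
For a simple project, or if there are no other developers, or if there is no strong dependency, a plain folder structure is more convenient and submodules may be a bit too much. However, if you choose to go that route, skip step 1 and proceed accordingly.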
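The five steps above could look roughly like this when run end to end. It is a sketch on a throwaway repository; the names `parent`, `subdir`, the file names and the commit messages are assumptions for the demo.

```bash
#!/usr/bin/env bash
# Sketch: stop tracking subdir/ in the parent and start a fresh repository inside it.
set -euo pipefail
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# Hypothetical parent project that tracks subdir/ as ordinary files
git init -q parent
cd parent
mkdir subdir
echo 'lib' > subdir/lib.txt
echo 'app' > app.txt
git add .
git commit -q -m 'initial'

# 1. Ignore subdir in the parent
echo 'subdir/' >> .gitignore
# 2. Remove it from the index only; the files stay on disk
git rm -r -q --cached subdir
# 3. Record both changes in the parent
git add .gitignore
git commit -q -m 'Stop tracking subdir'
# 4. Start a new repository inside subdir
cd subdir
git init -q
git add .
# 5. Commit the initial files in the new repository
git commit -q -m 'Initial commit of subdir'
cd ..

parent_tracked=$(git ls-files | xargs)
nested_tracked=$(git -C subdir ls-files)
echo "parent tracks: $parent_tracked; subdir tracks: $nested_tracked"
```

The parent keeps the old history of `subdir`, while the nested repository starts from a single initial commit.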