Git Repository Spring Cleaning Using “BFG Repo-Cleaner”

Background

I will be the first to admit it: a lot of what I write about is for nothing's sake; it is just to write.

As mum reminded me this last weekend, “you talk just for the sake of talking”.

To which I laughed, knowing she said it in the most endearing way.

 

This Time

This time it is different; I have an actual problem.

And I have had it since yesterday or the day before.

I wanted to upload files to GitHub, but “no luck”.

 

Error Message

I issued “git push”, but got the message posted below:

Textual

Counting objects: 13, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (10/10), done.
Writing objects: 100% (13/13), 39.75 MiB | 2.78 MiB/s, done.
Total 13 (delta 2), reused 1 (delta 0)
remote: Resolving deltas: 100% (2/2), done.
remote: error: GH001: Large files detected. You may want to try Git Large File Storage - https://git-lfs.github.com.
remote: error: Trace: 6a5c5cc6fc04a6cf086623cfa9f44576
remote: error: See http://git.io/iEPt8g for more information.
remote: error: File WideWorldImportersDW.sql is 1003.32 MB; this exceeds GitHub's file size limit of 100.00 MB
To https://github.com/DanielAdeniji/wideWorldImportersV2014.git
 ! [remote rejected] master -> master (pre-receive hook declined)
error: failed to push some refs to 'https://github.com/DanielAdeniji/wideWorldImportersV2014.git'

Explanation

  1. The key error message reads “File WideWorldImportersDW.sql is 1003.32 MB; this exceeds GitHub’s file size limit of 100.00 MB”
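
When the push output is long, the rejected file can be picked out programmatically. The sketch below is only an illustration: the function name find_rejected_files and the regular expression are my own, modeled on the wording of the error line shown above, and they assume GitHub keeps that wording.

```python
import re

# Pattern modeled on the "remote: error: File ... is ... MB; this exceeds" line.
# Assumption: GitHub keeps this exact wording; adjust the pattern if it changes.
LARGE_FILE_LINE = re.compile(
    r"remote: error: File (?P<name>.+) is (?P<size>[\d.]+) MB; "
    r"this exceeds GitHub's file size limit of (?P<limit>[\d.]+) MB"
)


def find_rejected_files(push_output):
    """Return (file name, size in MB, limit in MB) for each rejected file."""
    rejected = []
    for line in push_output.splitlines():
        match = LARGE_FILE_LINE.search(line)
        if match:
            rejected.append(
                (match["name"], float(match["size"]), float(match["limit"]))
            )
    return rejected


if __name__ == "__main__":
    # Two lines copied from the push output shown in this post.
    sample = (
        "remote: error: See http://git.io/iEPt8g for more information.\n"
        "remote: error: File WideWorldImportersDW.sql is 1003.32 MB; "
        "this exceeds GitHub's file size limit of 100.00 MB\n"
    )
    for name, size, limit in find_rejected_files(sample):
        print(f"{name}: {size} MB (limit {limit} MB)")
```

Lines that do not match the pattern are simply skipped, so the whole push output can be passed in as one string.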

 

Remediation

Git Commands

Remove File ( ** Does not work ** )

Remove File ( Syntax )


git rm --cached --ignore-unmatch [filename]

Remove File ( Sample )


git rm --cached --ignore-unmatch WideWorldImportersDW.sql

Result

  1. Commands
    • Committed Changes ( git commit )
    • Pushed ( git push )
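  2. Same result ( not able to push changes ); the new commit no longer tracks the file, but the earlier commit that added it is still part of the history being pushed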

 

Create & Add files to .gitignore ( ** Does not work ** )

Steps

  1. Using a text editor, create a new file and name it .gitignore
  2. Add the files to skip to .gitignore

.gitignore

Here is what the contents of the .gitignore file look like:

Result

  1. Commands
    • Committed Changes ( git commit )
    • Pushed ( git push )
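  2. Same result ( not able to push changes ); .gitignore only keeps untracked files from being added, and does nothing for a file that is already committed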
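
A .gitignore entry helps most when it is in place before the first commit. As a rough aid, the sketch below walks a working folder and lists files over a size limit, so they can be added to .gitignore ahead of time. The names find_large_files and LIMIT_BYTES are hypothetical, and I am assuming GitHub counts its 100.00 MB limit in units of 1024 * 1024 bytes.

```python
import os

# Assumption: GitHub's "100.00 MB" limit is counted in units of 1024 * 1024 bytes.
LIMIT_BYTES = 100 * 1024 * 1024


def find_large_files(root, limit_bytes=LIMIT_BYTES):
    """Return (relative path, size in bytes) for files under root above the limit."""
    large = []
    for folder, subfolders, files in os.walk(root):
        # Skip Git's own object store; it is not part of the working tree.
        if ".git" in subfolders:
            subfolders.remove(".git")
        for name in files:
            path = os.path.join(folder, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                # Unreadable entries (for example broken links) are ignored.
                continue
            if size > limit_bytes:
                large.append((os.path.relpath(path, root), size))
    return sorted(large)
```

Calling find_large_files(".") from the top of the working folder returns the candidates; passing a smaller limit_bytes leaves some headroom below the hard limit.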

 

Rewrite History ( ** Does not work ** )

Syntax


git rm --cached --ignore-unmatch [filename]

git commit --amend -CHEAD

git push

Sample


rem https://stackoverflow.com/questions/21168846/cant-remove-file-from-git-commit
rem git rm --WideWorldImportersDW.sql

git rm --cached --ignore-unmatch WideWorldImportersDW.sql

git commit --amend -CHEAD

git push

Result

  1. Same result ( not able to push changes ); git commit --amend rewrites only the most recent commit, so a file added by an earlier commit remains in the history

BFG

Download

Please download bfg from here.

 

BFG – Prune specific file ( ** Does not work ** )

Script


Rem Find java.exe
rem set "_folder=C:\Program Files (x86)\Java\jre7\bin"
set "_folder=C:\Windows\SysWOW64"

rem set folder of jar files
set "_jarFolder=%CD%"

rem current file name of bfg file
set "_bfg=bfg-1.12.15.jar"

rem set name of file to remove
set "_file=WideWorldImportersDW.sql"

rem invoke bfg and pass along file to prune off
%_folder%\java -jar "%_jarFolder%\%_bfg%"  --delete-files id_{%_file%}

Output

Output – Textual

>rem set "_folder=C:\Program Files (x86)\Java\jre7\bin"

>set "_folder=C:\Windows\SysWOW64"

>set "_jarFolder=C:\Personal\dadeniji\Blog\Microsoft\SQLServer\SampleDB\v2016\WideWorldImportersDW\provisioning\v2014\v2014.20170823\SSMS\GenerateAndPublishScripts\script"

>set "_bfg=bfg-1.12.15.jar"

>set "_file=WideWorldImportersDW.sql"

>C:\Windows\SysWOW64\java -jar "C:\Personal\dadeniji\Blog\Microsoft\SQLServer\SampleDB\v2016\WideWorldImportersDW\provisioning\v2014\v2014.20170823\SSMS\GenerateAndPublishScripts\script\bfg-1.12.15.jar"  --delete-files id_{WideWorldImportersDW.sql}

Using repo : C:\Personal\dadeniji\Blog\Microsoft\SQLServer\SampleDB\v2016\WideWorldImportersDW\provisioning\v2014\v2014.20170823\SSMS\GenerateAndPublishScripts\script\.git

Found 8 objects to protect
Found 3 commit-pointing refs : HEAD, refs/heads/master, refs/remotes/origin/master

Protected commits
-----------------

These are your protected commits, and so their contents will NOT be altered:

 * commit 77ddf1c0 (protected by 'HEAD')

Cleaning
--------

Found 4 commits
Cleaning commits:       100% (4/4)
Cleaning commits completed in 46 ms.

BFG aborting: No refs to update - no dirty commits found??

Explanation

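  1. BFG aborting: No refs to update - no dirty commits found??
    • BFG went through all 4 local commits, but found no file matching the pattern in any of them
    • The pattern passed was id_{WideWorldImportersDW.sql}; it resembles the id_{dsa,rsa} sample in the BFG documentation, where id_ is part of the file names being matched ( id_dsa, id_rsa )
    • And, thus the pattern most likely matches id_WideWorldImportersDW.sql and not WideWorldImportersDW.sql; passing the bare file name to --delete-files is probably what was needed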

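
A possible contributing factor is the pattern itself, since --delete-files takes a glob and the braces list alternatives. The sketch below is a small Python approximation of that kind of brace expansion; it is not BFG's own matcher, and the helper names expand_braces and glob_matches are made up for illustration.

```python
import fnmatch
import re


def expand_braces(pattern):
    """Expand {a,b} alternatives into plain patterns (no nesting support)."""
    match = re.search(r"\{([^{}]*)\}", pattern)
    if match is None:
        return [pattern]
    head, tail = pattern[:match.start()], pattern[match.end():]
    expanded = []
    for alternative in match.group(1).split(","):
        expanded.extend(expand_braces(head + alternative + tail))
    return expanded


def glob_matches(pattern, file_name):
    """True when file_name matches any expansion of pattern."""
    return any(fnmatch.fnmatchcase(file_name, p) for p in expand_braces(pattern))


if __name__ == "__main__":
    target = "WideWorldImportersDW.sql"
    # The pattern used in the script above keeps the literal "id_" prefix.
    print(glob_matches("id_{WideWorldImportersDW.sql}", target))
    # The bare file name matches itself.
    print(glob_matches("WideWorldImportersDW.sql", target))
```

Under this approximation the first pattern expands to id_WideWorldImportersDW.sql only, which would explain why no commit was flagged as dirty.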
 

BFG – Prune files bigger than [N] MB ( ** Works ** )

Script


rem set "_folder=C:\Program Files (x86)\Java\jre7\bin"

set "_folder=C:\Windows\SysWOW64"
set "_jarFolder=%CD%"
set "_bfg=bfg-1.12.15.jar"

rem ------------------------------------------------------------
rem $ java -jar bfg.jar --strip-blobs-bigger-than 100M
rem ------------------------------------------------------------
%_folder%\java -jar "%_jarFolder%\%_bfg%" --strip-blobs-bigger-than 50M

Output

Output – Textual


>invokeBFGBigFiles

>rem $ java -jar bfg.jar --strip-blobs-bigger-than 100M

>rem set "_folder=C:\Program Files (x86)\Java\jre7\bin"

>set "_folder=C:\Windows\SysWOW64"

>set "_jarFolder=C:\Personal\dadeniji\Blog\Microsoft\SQLServer\SampleDB\v2016\WideWorldImportersDW\provisioning\v2014\v2014.20170823\SSMS\GenerateAndPublishScripts\script"

>set "_bfg=bfg-1.12.15.jar"

>C:\Windows\SysWOW64\java -jar "C:\Personal\dadeniji\Blog\Microsoft\SQLServer\SampleDB\v2016\WideWorldImportersDW\provisioning\v2014\v2014.20170823\SSMS\GenerateAndPublishScripts\script\bfg-1.12.15.jar" --strip-blobs-bigger-than 50M

Using repo : C:\Personal\dadeniji\Blog\Microsoft\SQLServer\SampleDB\v2016\WideWorldImportersDW\provisioning\v2014\v2014.20170823\SSMS\GenerateAndPublishScripts\script\.git

Scanning packfile for large blobs: 1
Scanning packfile for large blobs completed in 14 ms.
Found 1 blob ids for large blobs - biggest=1052056478 smallest=1052056478
Total size (unpacked)=1052056478
Found 8 objects to protect
Found 3 commit-pointing refs : HEAD, refs/heads/master, refs/remotes/origin/master

Protected commits
-----------------

These are your protected commits, and so their contents will NOT be altered:

 * commit 77ddf1c0 (protected by 'HEAD')

Cleaning
--------

Found 4 commits
Cleaning commits:       100% (4/4)
Cleaning commits completed in 343 ms.

Updating 1 Ref
--------------

        Ref                 Before     After
        ---------------------------------------
        refs/heads/master | 77ddf1c0 | 333b7745

Updating references:    100% (1/1)
...Ref update completed in 23 ms.

Commit Tree-Dirt History
------------------------

        Earliest      Latest
        |                  |
          .    D    m    m

        D = dirty commits (file tree fixed)
        m = modified commits (commit message or parents changed)
        . = clean commits (no changes to file tree)

                                Before     After
        -------------------------------------------
        First modified commit | c0bd8d32 | be390435
        Last dirty commit     | c0bd8d32 | be390435

Deleted files
-------------

        Filename                   Git id
        -----------------------------------------------
        WideWorldImportersDW.sql | 20342ad4 (1003.3 MB)

In total, 4 object ids were changed. Full details are logged here:

        C:\Personal\dadeniji\Blog\Microsoft\SQLServer\SampleDB\v2016\WideWorldImportersDW\provisioning\v2014\v2014.20170823\SSMS\GenerateAndPublishScripts\script.bfg-report\2017-08-25\14-00-00

BFG run is complete! When ready, run: git reflog expire --expire=now --all && git gc --prune=now --aggressive

--
You can rewrite history in Git - don't let Trump do it for real!
Trump's administration has lied consistently, to make people give up on ever
being told the truth. Don't give up: https://www.rescue.org/topic/refugees-america

Explanation

  1. Scan & Found
    • Scanning packfile for large blobs: 1
    • Scanning packfile for large blobs completed in 14 ms.
    • Found 1 blob ids for large blobs - biggest=1052056478 smallest=1052056478
  2. Deleted files
    • WideWorldImportersDW.sql | 20342ad4 (1003.3 MB)
  3. Next to do
    • BFG run is complete! When ready, run:
      • git reflog expire --expire=now --all && git gc --prune=now --aggressive
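
The two follow-up commands can also be scripted. Here is a minimal sketch, assuming git is on the PATH; the names CLEANUP_COMMANDS and run_cleanup are hypothetical, and the runner parameter exists only so the function can be tried out without touching a real repository.

```python
import subprocess

# The two commands BFG prints at the end of its run, as argument lists.
CLEANUP_COMMANDS = [
    ["git", "reflog", "expire", "--expire=now", "--all"],
    ["git", "gc", "--prune=now", "--aggressive"],
]


def run_cleanup(repo_folder, runner=subprocess.run):
    """Run each cleanup command in repo_folder, stopping at the first failure."""
    for command in CLEANUP_COMMANDS:
        # check=True raises CalledProcessError on a non-zero exit code,
        # which mirrors the "&&" chaining in BFG's suggested one-liner.
        runner(command, cwd=repo_folder, check=True)
```

Calling run_cleanup with the folder that holds the repository runs both commands in order, and the second runs only if the first succeeds.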

MVP!

Having others lean back, like Eli Manning, the day belongs to Roberto Tyley!

His tool, BFG Repo-Cleaner, sits on top of the castle.

It is Java and Scala, and so works on any platform that has a Java runtime!

Here is the link

 

