Hello, my name is: Amy

Using .gitignore Too Late

TL;DR

  1. List the files you're tracking, if you're curious: git ls-tree --name-only --full-tree -r HEAD
  2. Either: Remove all cached instances of these files: git rm -r --cached cache/
  3. Or: Mark the directory as 'unchanged': cd to the directory you want to ignore, then: git ls-files -z | xargs -0 git update-index --assume-unchanged

You're not listening, git!

So, you've created a git repository and find out along the way there's all kinds of unnecessary garbage that has turned up. Oops. You ignore it for a while, hoping it will go away. Of course it doesn't.

This cache file isn't supposed to be here...

No problem, you say. I'll just add a .gitignore file, list the directory cache/ in it, and stage and commit the .gitignore to the repository. Now any new files added to that directory won't show up (Great!), but the files that were already committed will still be tracked if changes are made (Ummm... close??).

In this simple case, my .gitignore file would only contain this one directory:

cache/
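If you'd like to see this for yourself before touching a real project, here is a minimal sketch that plays the whole scenario out in a throwaway repository. The file names (page.html, other.html), the scratch directory, and the demo identity are all made up for illustration.

```shell
#!/bin/sh
# Demo: a .gitignore rule added later does not untrack files that are already committed.
set -e
repo=$(mktemp -d)                          # scratch repo, safe to delete afterwards
cd "$repo"
git init -q .
git config user.email "demo@example.com"   # placeholder identity for the demo commits
git config user.name "Demo"

# Accidentally commit a cache file.
mkdir cache
echo "old" > cache/page.html
git add cache/page.html
git commit -q -m "Accidentally commit a cache file"

# Add the ignore rule after the fact.
echo "cache/" > .gitignore
git add .gitignore
git commit -q -m "Ignore cache/"

echo "changed" > cache/page.html   # already tracked: still reported as modified
echo "new" > cache/other.html      # never tracked: hidden by .gitignore

status=$(git status --porcelain)
echo "$status"                     # prints " M cache/page.html"
```

Only the file that was committed before the rule existed shows up; the brand new one stays hidden.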

Turns out, once you've added a file into a repository, git likes to hold onto it forever, even if you say 'no thanks'. I get it, git: you've got trust issues. But sometimes mistakes happen, people change. After spending more months than I'd care to admit just accepting all the extra files, I finally dug in and found a solution. Actually, make that two solutions!

What files am I actually tracking, anyway?

Run this command to see all the files you're tracking in your git repository: git ls-tree --name-only --full-tree -r HEAD

  • ls-tree lists the contents of a tree object. The tree object you're asking git to list here is HEAD, which points at the commit you currently have checked out (usually the tip of the current branch).
  • --name-only prints just the file paths, without the mode, type, and object ID.
  • --full-tree lists every tracked file from the root of the repository, no matter which subdirectory you run the command from.
  • -r recurses into subfolders, so the files inside them are listed too.

All of the files that are tracked.
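To get a feel for what --full-tree changes, here is a small sketch in a scratch repository that runs the command from inside a subfolder, with and without the flag. The layout (README.txt, cache/, src/) is invented for the example.

```shell
#!/bin/sh
# Demo: git ls-tree run from a subdirectory, with and without --full-tree.
set -e
repo=$(mktemp -d)                          # scratch repo, safe to delete afterwards
cd "$repo"
git init -q .
git config user.email "demo@example.com"   # placeholder identity for the demo commit
git config user.name "Demo"

mkdir cache src
echo "a" > README.txt
echo "b" > cache/page.html
echo "c" > src/app.txt
git add .
git commit -q -m "Initial commit"

cd src
# Without --full-tree the listing is limited to the directory you are standing in.
here=$(git ls-tree --name-only -r HEAD)
# With --full-tree every tracked path is listed, starting from the repository root.
all=$(git ls-tree --name-only --full-tree -r HEAD)

echo "From src/ without --full-tree:"
echo "$here"
echo "From src/ with --full-tree:"
echo "$all"
```

The second listing is the one you want when hunting for files that shouldn't be in the repository, since it doesn't depend on where you happen to be.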

I see the skeletons in the closet... but now what?

Solution 1: You never needed them anyway, this was all just a big mistake!

Remove all cached instances of these files:
git rm -r --cached cache/

Remove cache folder from index.

With the above code you are telling git to remove (rm) the files in that folder and everything below it (-r) from the index (--cached) while leaving them in your folder structure (also known as the working tree). The index is git's staging area: the snapshot that will go into your next commit.

After you do this, you will need to commit the deletion of those files. git rm --cached stages the removal for you, so a git commit is enough (add the .gitignore first if you haven't committed it yet).

Commit the change: git add/git commit.

After you've removed them, viewing the files you're tracking with git ls-tree --name-only --full-tree -r HEAD will not include them, and any further changes to those files will not be tracked (so long as you have that directory in the .gitignore file).

Files left that are tracked (no cache folder).
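Here is Solution 1 from start to finish as a rough sketch in a scratch repository, so you can check the outcome before trying it on a real project. The file names and commit messages are placeholders.

```shell
#!/bin/sh
# Demo: stop tracking cache/ while keeping the files on disk.
set -e
repo=$(mktemp -d)                          # scratch repo, safe to delete afterwards
cd "$repo"
git init -q .
git config user.email "demo@example.com"   # placeholder identity for the demo commits
git config user.name "Demo"

mkdir cache
echo "readme" > README.txt
echo "old" > cache/page.html
git add .
git commit -q -m "Initial commit, cache included by mistake"

# Ignore the directory, drop it from the index, and commit both changes.
echo "cache/" > .gitignore
git add .gitignore
git rm -r -q --cached cache/
git commit -q -m "Stop tracking cache/"

tracked=$(git ls-tree --name-only --full-tree -r HEAD)
echo "$tracked"                    # .gitignore and README.txt, no cache/ entries

ls cache                           # page.html is still there in the working tree

echo "changed" > cache/page.html   # further edits are no longer noticed
status=$(git status --porcelain)
echo "status: [$status]"           # prints "status: []"
```

One thing to keep in mind: the files still exist in earlier commits, so this stops tracking them going forward rather than erasing them from history.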

Solution 2: The files should probably stick around in the repo, but don't constantly monitor them, OK?

Mark the directory as 'unchanged': cd to the directory you want to ignore, then: git ls-files -z | xargs -0 git update-index --assume-unchanged

Run above code.

Here we are telling git, "I'm going to leave these files here, but I promise I'm not going to change them." Of course, it doesn't matter if you have your fingers crossed when you make this promise to git; you can change them if you want. But for as long as the files are flagged with --assume-unchanged, git will trust that you don't care to monitor them.

Note: Because part of this is linuxy, if you're on Windows be sure to use git bash.

What's happening is that:

  • git ls-files -z outputs all the tracked files in your current directory and its subfolders (the directory you want to ignore), separating the names with a null character (\0) instead of line breaks (that's what the -z does).
  • xargs is a Unix tool that takes a list from its input and passes the items as arguments to another command, so you can run that command over a long list of files. The -0 tells xargs to split the list on null characters, which takes care of file names containing blank spaces.
  • update-index accepts file names (the ones we just got from git ls-files -z), and with --assume-unchanged sets a flag on those files to tell git that we aren't going to change them any more, so it shouldn't bother checking.
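The pipeline above can be tried out safely in a scratch repository. This sketch deliberately includes a file name with a space in it (home page.html, made up for the example) to show why the -z and -0 pair matters, then uses git ls-files -v to check which files got flagged.

```shell
#!/bin/sh
# Demo: flag every tracked file under cache/ as assume-unchanged.
set -e
repo=$(mktemp -d)                          # scratch repo, safe to delete afterwards
cd "$repo"
git init -q .
git config user.email "demo@example.com"   # placeholder identity for the demo commit
git config user.name "Demo"

mkdir cache
echo "readme" > README.txt
echo "about" > cache/about.html
echo "home" > "cache/home page.html"       # note the space in the file name
git add .
git commit -q -m "Initial commit"

# The null-separated list keeps "home page.html" together as one argument.
cd cache
git ls-files -z | xargs -0 git update-index --assume-unchanged
cd ..

echo "changed" > "cache/home page.html"
status=$(git status --porcelain)
echo "status: [$status]"                   # prints "status: []"

# ls-files -v marks assume-unchanged files with a lowercase letter (h instead of H).
flags=$(git ls-files -v)
echo "$flags"
```

In the last listing, README.txt keeps its uppercase H while both files under cache/ show a lowercase h.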

So, this way, the cache folder sticks around in the tracked files, but any new changes you make won't be tracked until you tell git to look at them again by running git update-index with --no-assume-unchanged on those files.

Old cache files are still listed.
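If you later change your mind, here is a rough sketch of how finding the flagged files and clearing the flag might look, again in a scratch repository with a made-up file name.

```shell
#!/bin/sh
# Demo: find assume-unchanged files and turn monitoring back on.
set -e
repo=$(mktemp -d)                          # scratch repo, safe to delete afterwards
cd "$repo"
git init -q .
git config user.email "demo@example.com"   # placeholder identity for the demo commit
git config user.name "Demo"

mkdir cache
echo "old" > cache/page.html
git add .
git commit -q -m "Initial commit"

git update-index --assume-unchanged cache/page.html
echo "changed" > cache/page.html

before=$(git status --porcelain)
echo "before: [$before]"                   # prints "before: []"

# Lowercase letters in ls-files -v mark the assume-unchanged files.
flagged=$(git ls-files -v | grep '^h')
echo "$flagged"                            # prints "h cache/page.html"

# Clear the flag and the pending change shows up again.
git update-index --no-assume-unchanged cache/page.html
after=$(git status --porcelain)
echo "after: [$after]"                     # prints "after: [ M cache/page.html]"
```

The same ls-files and xargs pipeline from earlier works here too if you swap in --no-assume-unchanged and want to clear a whole directory at once.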
