Hide Files From Git With `.gitignore`
15 Feb 2018
- A code repo should generally include a `.gitignore` file.
- For a Python project, this should contain at least the line `*.pyc`.
- Or, use gitignore.io to generate a `.gitignore`.
- The cost of ignoring this advice is (1) an unprofessional-looking repo littered with `*.pyc` files; and (2) merge conflicts that didn’t have to be.
Keeping a clean `git status`
Here are a couple of useful invariants for your workflow:
- `git status` reports only files that should be committed.
- When you have no pending changes in your working directory, `git status` is “clean”:
    $ git status
    On branch master
    Your branch is up to date with 'origin/master'.
    nothing to commit, working tree clean
With these invariants, you can use commands such as `git add --all`, `git commit -a`, or their GUI equivalents, without adding a file to the repo that isn’t supposed to be there. This also plays well with tooling that tells you whether you’ve got pending changes. Finally, any time I’ve had files that shouldn’t go in the repo but show up in `git status`, at some point I’ve accidentally committed them anyway.
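As a rough illustration of the kind of tooling that relies on a clean status, here is a minimal Python sketch. It assumes `git` is on your `PATH`; the function names `is_clean` and `working_tree_is_clean` are made up for this example. It uses `git status --porcelain`, which prints one line per pending change and nothing at all when the working tree is clean.

```python
import subprocess


def is_clean(porcelain_output: str) -> bool:
    """Return True when `git status --porcelain` printed nothing."""
    return porcelain_output.strip() == ""


def working_tree_is_clean(repo_dir: str = ".") -> bool:
    """Run `git status --porcelain` in repo_dir and report whether it is clean."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,  # raise if git fails, e.g. repo_dir is not a repository
    )
    return is_clean(result.stdout)
```

A check like this is only useful if ignored files stay out of the output, which is the point of the invariants above: an untracked file that isn't ignored shows up as a `??` line and makes the tree look dirty.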
In order to maintain these invariants, you need to either: (1) never put a file in your working directory that shouldn’t go in the repo (but sometimes this isn’t practical); or (2) tell git that some files in your directory should be ignored. The latter is configured via `.gitignore`.
`.gitignore` is just a text file with a special meaning to git. It’s a list of blank lines and comments (lines starting with `#`) that git, um, ignores; and file globs like:
- `TODO`, which matches the single file `TODO`
- `build/`, which matches the directory named `build`
- `*.pyc`, which matches any file whose name ends in `.pyc`
`.gitignore` itself is just a text file that can be added to the repo. You should add it to the repo, so that you and your collaborators have git configured to ignore the same files.
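To make the file format concrete, here is a small Python sketch that reads a sample `.gitignore` and applies it to a few paths. This is a deliberately simplified approximation, not git's actual matching algorithm: it handles only the three kinds of pattern described above, and ignores features such as negation (`!`), leading-slash anchoring, and `**`. The sample contents and the function names are assumptions made for the example.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

# A sample .gitignore: a comment, a blank line, and three patterns.
SAMPLE_GITIGNORE = """\
# Compiled Python files
*.pyc

build/
TODO
"""


def parse_patterns(text: str) -> list:
    """Drop blank lines and comments; keep the patterns."""
    patterns = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        patterns.append(stripped)
    return patterns


def is_ignored(path: str, patterns: list) -> bool:
    """Approximate check covering only simple name, directory, and glob patterns."""
    parts = PurePosixPath(path).parts
    for pattern in patterns:
        if pattern.endswith("/"):
            # Directory pattern: ignore anything inside a directory of that name.
            if pattern.rstrip("/") in parts[:-1]:
                return True
        elif fnmatch(parts[-1], pattern):
            # File pattern: compare against the file name only.
            return True
    return False
```

For the real answer on a real repo, `git check-ignore -v <path>` reports which pattern, if any, causes a path to be ignored.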
Things that don’t belong in a repo
Some files that end up in your working directory should never go in your repo.
Secrets shouldn’t go in the repo. These are discussed in Keeping secrets.
Secrets shouldn’t go in the repo because once you push to a public GitHub repo, those secrets are accessible to the whole world. Even if your project sources aren’t published to a public repo, they’re better protected by defense in depth.1 Also, you eventually want different secrets in your development environment from your staging, production, and possibly other environments, which makes this a special case of:
Environment-specific configuration values shouldn’t go in the repo. Aside from secrets, some configuration variables in our example projects hold the paths to components such as the database (the value of `DATABASE_URL`) and the message broker (the value of `MQTT_URL`). These values don’t go in the repo, because everything in the repo is deployed to multiple environments.2
An exception is that it’s common to wire in a default value that works for the development environment, and override this in other environments. For example, you might use a local database for development. In that case, default the database configuration to a database running on localhost.
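Here is a minimal sketch of that default-for-development pattern in Python. The variable names `DATABASE_URL` and `MQTT_URL` come from the text above; the default URLs and the `load_config` function are hypothetical placeholders, so substitute whatever your own development setup uses.

```python
import os

# Hypothetical development defaults; both assume services running on localhost.
DEFAULT_DATABASE_URL = "postgresql://localhost/myapp_dev"
DEFAULT_MQTT_URL = "mqtt://localhost:1883"


def load_config(environ=os.environ) -> dict:
    """Read settings from the environment, falling back to development defaults."""
    return {
        "DATABASE_URL": environ.get("DATABASE_URL", DEFAULT_DATABASE_URL),
        "MQTT_URL": environ.get("MQTT_URL", DEFAULT_MQTT_URL),
    }
```

In staging or production, the deployment sets `DATABASE_URL` and `MQTT_URL` in the process environment, so the real values never need to appear in the repo.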
Absolute paths don’t go in the repo. Your source shouldn’t contain any strings such as `C:\users\osteele\src\my-project` (Windows), or `/usr/osteele/src/my-project` (one of several possibilities for Unix), or even `~/src/my-project`. The first two of these won’t work for any user who doesn’t share your login name. The last won’t work for users who arrange their projects differently, which probably includes CI servers and deployment environments.
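One common alternative, sketched here in Python, is to compute paths relative to a file that is already in the project, so the code works wherever the repo is checked out. The helper names `project_root` and `data_file`, and the `data/example.csv` layout in the usage note, are assumptions for illustration.

```python
from pathlib import Path


def project_root(anchor_file: str) -> Path:
    """Return the directory containing anchor_file (typically the caller's __file__)."""
    return Path(anchor_file).resolve().parent


def data_file(anchor_file: str, *parts: str) -> Path:
    """Build a path below the anchor file's directory, with no hard-coded prefix."""
    return project_root(anchor_file).joinpath(*parts)
```

A module in the project would call `data_file(__file__, "data", "example.csv")` instead of spelling out a path that starts with someone's home directory.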
Object files generally don’t go in the repo. An object file, for our purpose, is a (generally not editable) file that’s programmatically created from some other file (a source file — although not necessarily a program source file). Note that, by this definition, a bitmap image file whose source (say, a Photoshop, Inkscape, or PlantUML file) is in the repo is an object file, while the same image file, if its source is not in the repo, is not.
Compiler artifacts are a special case of an object file (and indeed, in many contexts are what is meant by “object file”). In Python, these files end in `.pyc` (or `.pyo` in older Python versions). (In C, they include `*.o` files. In Java, they include `*.class`. Some other languages, especially dynamic languages, don’t create them.)
The reason to keep object files out of the repo is that they introduce unnecessary merge conflicts. If two authors edit different sections of `example.py`, `git pull` can resolve this (at least the syntactic aspects; usually the semantics work out too). If two authors edit different sections of `example.py`, each generate their own `example.pyc`, and commit that, then one of them is going to see a merge conflict that they need to manually resolve.
Log files don’t go in the repo. These are another special case of object files. I’m calling them out because they’re the other kind of file (aside from compilation artifacts) that often sneaks in. Just add `*.log` to your `.gitignore`, or use one of the `.gitignore` templates below.
Editor backup files don’t go in the repo. These typically have names like
Operating system metadata files don’t go in the repo. On macOS, this includes
Icon\r\r. On Windows, this includes
The patterns for editor backup files and operating system metadata are a ridiculous amount to know. Just start from a template, or grow your `.gitignore` file as you see `git status` listing files that you didn’t write.
- gitignore.io generates a `.gitignore`, based on your technology stack and operating system.
- gitignoretemplates is the community repository of templates behind gitignore.io.
There’s a different approach, pioneered(?) in Rails and used by frameworks that copy it, which is that the source repo includes a configuration file with a different section for each environment. This makes it easier to audit the configurations, and update all the configurations at once, say when a new resource comes in.