My Short Explanation of Git
Git is very confusing, and it doesn't help that its CLI is inconsistent. Despite how awful learning git is, I'm sure it's possible to give enough insight into git that a person can feel more comfortable learning it. That's the goal of this article. I'm not going to teach git to you, but I will attempt to give you a better mental model.
Git High-Level Goals
The Linux project once had its source control server hacked, with two lines inserted into the code. Someone noticed their copy didn't match the server's and figured out it was a backdoor attempt. Linus Torvalds had a few reasons to create git, but this kind of backdoor attempt was clearly something he wanted to prevent. Some goals of git are:
- File integrity, so backdoor attempts like the one above (and problems like a failing disk) are easily noticed
- Decentralized: git works offline and can be used by sending patch files over email, which is what the Linux kernel did. In practice, though, people use one or more servers
- Easy to branch and merge, which was a pain point in previous source control solutions
A commit uses a hash for file integrity. Decentralization may explain why there are both local branches and remote branches. I won't try to explain merging, but it's clear that a significant amount of git revolves around branch heads and commits.
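To make the hash point a bit more concrete, here is a minimal sketch using `git hash-object`, which prints the ID git would give a piece of content. The scratch directory and the sample strings are made up for illustration. The ID is derived from the content itself, so the same bytes always get the same ID and a one-character change gets a different one.

```shell
set -eu
# Scratch repo in a temporary directory (the location is arbitrary).
work=$(mktemp -d)
cd "$work"
git init -q .

# The same content always yields the same object ID...
id_a=$(printf 'hello\n' | git hash-object --stdin)
id_b=$(printf 'hello\n' | git hash-object --stdin)
# ...and changing a single character yields a different one.
id_c=$(printf 'hellp\n' | git hash-object --stdin)

echo "original: $id_a"
echo "repeat:   $id_b"
echo "tampered: $id_c"
```

A commit ID covers the file contents and the parent commit's ID in the same way, which is why a quiet edit on a server shows up as a mismatch for everyone else.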
Git Repo Summary
A git repo is essentially a list of branches along with the objects (commits, plus the file and directory snapshots they reference) needed to create them. The wild thing is, you can have hundreds of unrelated projects in a single git repo. By default you can't merge them (you'll get the error "fatal: refusing to merge unrelated histories" unless you pass `--allow-unrelated-histories`), but you can definitely have multiple projects in a single repo if you choose to.
There's also metadata in git, such as the author name, where remote servers can be found (you may have many), and what branch you're on. But that's there to enable things like git push/pull, commit, etc. You can certainly use git without a remote server, and without an author name if you don't want to commit.
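Here is a rough sketch of the "unrelated projects in one repo" idea. The branch names, file names, and identity below are placeholders I picked for the example. It uses `git checkout --orphan` to start a second history with no parent, then shows that a plain merge between the two is refused.

```shell
set -eu
work=$(mktemp -d)
cd "$work"
git init -q .
git symbolic-ref HEAD refs/heads/main      # name the first branch "main"
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

# Project one lives on "main".
echo "project one" > one.txt
git add one.txt
git commit -q -m "Start project one"

# Project two starts from an empty history on a new branch.
git checkout -q --orphan other
git rm -q -rf .
echo "project two" > two.txt
git add two.txt
git commit -q -m "Start project two"

# The branches share no ancestor, so a plain merge is refused.
if merge_msg=$(git merge --no-edit main 2>&1); then
  merge_result="merged"
else
  merge_result="refused"
fi
echo "merge: $merge_result"
echo "$merge_msg"
```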
Branches and Commits
Branches are named commits. It might sound silly to simplify it like that, but 90% of the time that's how I think of it, and 8% of the time I'm thinking about whether I need the origin branch instead of the local one. You might see 'HEAD' mentioned while using git. When you're on a branch, HEAD points at that branch, so it resolves to the commit the branch is set to. When you commit, the branch is updated to the new commit and HEAD follows along. If you're not on a branch (you checked out a specific commit, which git calls a 'detached HEAD'), you can still commit, but you'll get a warning about leaving commit(s) behind when you switch away.
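A small sketch of the branch and HEAD behavior described above, in a throwaway repo with empty commits (the names and identity are placeholders). It checks what HEAD refers to while on a branch, shows the branch moving on commit, and then detaches HEAD by checking out a commit directly.

```shell
set -eu
work=$(mktemp -d)
cd "$work"
git init -q .
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

git commit -q --allow-empty -m "first"
first=$(git rev-parse main)

# While on a branch, HEAD refers to the branch name.
head_target=$(git symbolic-ref HEAD)

# Committing moves the branch to the new commit.
git commit -q --allow-empty -m "second"
second=$(git rev-parse main)

# Checking out a commit directly leaves HEAD on no branch at all.
git checkout -q --detach "$first"
if git symbolic-ref -q HEAD >/dev/null; then
  head_state="on a branch"
else
  head_state="detached"
fi
echo "HEAD was $head_target; main moved from $first to $second; HEAD is now $head_state"
```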
Don't overthink branches. They're really just named commits. You can even set them (`git reset`) to unrelated projects. Making up extra meaning for branches and commits will only confuse you. They're really simple, and you don't need to think hard about them.
A handy thing I do is create a new folder + branch called poke, using `git worktree`. Git places a `.git` file in the new folder that points back to the original repo. Since it's the same repo, there's no need to push and pull; both folders can see everything the other can. I use the 'poke' folder and branch to look around, compare diffs, etc. I constantly run `git reset BRANCHNAME`, which does not check out a different branch but moves poke (and with it HEAD) to the branch head I want to look at. Then I can do things like a hard reset, a diff, etc. It's very handy, and after all of it, the branch is still named poke.
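The poke setup might look roughly like this. The repo layout, the `feature` branch, and `notes.txt` are invented for the example; only `poke` comes from the description above.

```shell
set -eu
work=$(mktemp -d)
cd "$work"
git init -q repo
cd repo
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"
git commit -q --allow-empty -m "base"

# Some branch worth looking at (hypothetical).
git checkout -q -b feature
echo "work in progress" > notes.txt
git add notes.txt
git commit -q -m "feature work"
git checkout -q main

# A second folder backed by the same repo, on a new branch called "poke".
git worktree add -b poke ../poke >/dev/null 2>&1
cd ../poke

# Move "poke" to the branch head I want to inspect, then sync the files.
git reset -q feature
git reset -q --hard
current=$(git symbolic-ref --short HEAD)
echo "still on: $current, now at: $(git rev-parse --short HEAD)"
```

Nothing is pushed or fetched between the two folders; `feature` is visible from `poke` because both folders share one object store.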
Other Git Stuff
Git has other things to help, such as commit hooks (a script that runs every time you try to commit, and may reject it), bisect (helps you find the commit that broke something), reflog (shows recent changes to HEAD and helps you find a commit you may have accidentally detached from a branch), rebase (modifies your history), etc. They're worth learning, but the important part is understanding that most things are a commit, and a branch is simply a name that points to a commit (and may change if you use reset, rebase, or commit).
Now that I've hammered that in, take a moment to think about git push. It updates the remote server with the commit your branch is pointing to + the commits necessary to use it. Git pull is the reverse: it fetches, which updates your list of origin branches, and then brings your current branch up to date with its origin counterpart. Rebase moves the start of your branch to a new base, or, if you're working interactively, lets you squash commits, change a message, sign a commit, etc. Rebase will change your commit IDs and update the branch you're on (if you're on one). git bisect looks over your commit history, but it can't look at commits you squashed out of existence.
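As one illustration of a branch name being moved to a different commit, here is a sketch of a plain (non-interactive) rebase. The branch and file names are placeholders. After the rebase, `feature` names a brand new commit with the same message, sitting on top of `main`.

```shell
set -eu
work=$(mktemp -d)
cd "$work"
git init -q .
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "base"

# A feature branch with one commit.
git checkout -q -b feature
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "feature work"
before=$(git rev-parse feature)

# Meanwhile main gets a commit of its own.
git checkout -q main
echo "more" > main.txt
git add main.txt
git commit -q -m "main moves on"

# Rebase feature onto the new main: same change, new commit ID.
git checkout -q feature
git rebase -q main
after=$(git rev-parse feature)
echo "feature before: $before"
echo "feature after:  $after"
```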
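Push and pull can be tried without a real server. In this sketch a bare repo on disk stands in for the remote, and `alice` and `bob` are two hypothetical clones. It shows that a fetch moves `origin/main` while the local `main` stays put until it is merged (here a fast-forward), which is the second half of what pull does.

```shell
set -eu
work=$(mktemp -d)
cd "$work"

# A bare repo on disk stands in for the remote server.
git init -q --bare server.git
git --git-dir=server.git symbolic-ref HEAD refs/heads/main

git init -q alice
cd alice
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"
git remote add origin "$work/server.git"
git commit -q --allow-empty -m "first"
git push -q origin main

# A second clone starts out with the first commit.
cd "$work"
git clone -q server.git bob

# Alice commits and pushes: the server's "main" now names the new commit.
cd "$work/alice"
git commit -q --allow-empty -m "second"
git push -q origin main
pushed=$(git rev-parse main)

# Bob fetches: origin/main moves, his local main does not.
cd "$work/bob"
git fetch -q origin
bob_remote=$(git rev-parse origin/main)
bob_local=$(git rev-parse main)

# The other half of a pull: bring the local branch up to date.
git merge -q --ff-only origin/main
bob_after=$(git rev-parse main)
echo "pushed $pushed; bob had $bob_local, now has $bob_after"
```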
One 'danger' I want to point out is that even though git stash creates a commit (you can find the ID in '.git/logs/refs/stash'), it's not put into the regular HEAD reflog. If you pop a stash, then realize you want to reapply it, it's no longer listed anywhere handy. The commit object does stick around until git garbage-collects it, so you can get it back if you still have the ID that pop printed (or dig for it with `git fsck --unreachable`), but it's easy to lose.
Branches may work reasonably well in git, but if you think branches are linked to one another, where a change in one affects another, you'll be confused. Knowing a branch is just a name for a commit makes it easier to understand why you may need to rebase something onto the same branch again (I'm sure you've rebased on main many times).
I hope this helped. If it didn't... well... it's an article on git, what did you expect?
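Going back to the stash point: as far as I can tell, a popped stash is dropped from the stash list, but the commit object itself lingers until git garbage-collects it. The sketch below (file name and contents made up) saves the stash commit's ID before popping and reapplies it afterwards with `git stash apply`. Treat it as a safety net that only works if you still have the ID, not as something to rely on.

```shell
set -eu
work=$(mktemp -d)
cd "$work"
git init -q .
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "v1" > file.txt
git add file.txt
git commit -q -m "first"

# Stash an uncommitted edit and remember the commit git created for it.
echo "v2" > file.txt
git stash -q
stash_id=$(git rev-parse refs/stash)
stash_type=$(git cat-file -t "$stash_id")

# Popping the only entry removes it from the stash list.
git stash pop -q
if git rev-parse -q --verify refs/stash >/dev/null; then
  stash_ref="present"
else
  stash_ref="gone"
fi

# Throw the edit away, then bring it back using the saved ID.
git checkout -q -- file.txt
git stash apply -q "$stash_id"
restored=$(cat file.txt)
echo "stash was a $stash_type, ref is $stash_ref, file is back to: $restored"
```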