Introduction to Git: The stupid file tracker

I have included a written copy of all the information I talk about in the video to accommodate different learning styles.

Getting Started

If you haven't already, you'll want to install git. You can do that by downloading the installer or package for your platform from git-scm.com. There are plenty of other starter guides out there for installing on Windows, macOS and Linux, so I'm going to assume you are capable enough to find that on your own. Once it's installed, you should be able to open a command line prompt, enter the command git --version, and see output something like this:


$ git --version
git version 2.45.2

Once you have git installed, you are ready to rock for the rest of this tutorial! Also, please note: I will be using bash(1) for all of these shell operations. For the most part, the git commands themselves should be the same in PowerShell and cmd, but you can always use WSL to get access to Linux commands on Windows.

Definitions (yes, there will be a quiz 😝)

  • repository: Bucket of changes. A complete copy of all the code, history, annotations, changes, timestamps and user information related to any given project.
  • commit: A snapshot of changes, identified by a checksum.
  • branch: A reference to a sequence of changes.
  • tag: A named pointer to a specific commit, optionally annotated, reserved for releases and special occasions.
  • git: The stupid file tracker.
  • GitHub: A Microsoft-owned product for managing repositories.
  • release: A snapshot of changes marked by features, bugs, and other changes included in this publication of software. Can be denoted by "versions".
  • version: A release of software. Denoted by numbers. To most, these mean nothing. If you care, there's semver.
  • production: The environment in which the money is made and the software is live and used by the target audience.
  • SDLC: Software Development Life Cycle. Process by which software gets completed. Sometimes used to represent the process as a whole.

These should help make the rest of the techie-speak below make more sense.

Your First Commit

When you first start a project, you will want to create a repository. This is where all the code will live. You will want to commit your first changes to the master branch. I usually start my projects with an empty first commit to start off the project with a clean slate. Every file that gets added after that will be a part of the project. Some people and automated systems can initialize your repository with some starter files like a .gitignore and a README.md file. I'm going to show you how you can create a new git repository from scratch! Open a terminal and let's get started!

git init


git init my-project
cd my-project
git commit --allow-empty -m 'First commit.'

You may see some output like this:


$ git init my-project
Initialized empty Git repository in /home/markizano/my-project/.git/

$ git commit --allow-empty -m 'First commit.'
[master (root-commit) 0d974cf] First commit.
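
If you want to double check the result, here is a minimal sketch that repeats the same two steps in a throwaway directory and then inspects the history. The temporary directory and the "Example User" identity are assumptions made only so the script can run anywhere without touching your real settings.

```shell
set -eu

# Work in a throwaway directory so nothing here touches a real project.
workdir=$(mktemp -d)
cd "$workdir"

git init -q my-project
cd my-project

# Hypothetical identity, stored in this repository only, so git can record an author.
git config user.name "Example User"
git config user.email "user@example.com"

git commit -q --allow-empty -m 'First commit.'

# There is exactly one commit in the history...
commit_count=$(git rev-list --count HEAD)
echo "commits: $commit_count"

# ...and that commit tracks no files at all.
tracked_files=$(git ls-tree -r --name-only HEAD)
echo "tracked files: ${tracked_files:-none}"
```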

What just happened? We just created a repository and made an empty commit as the root commit that all later changes will follow from. From here, you add files to your project and place them under git's tracking. Let's create our first file: README.md

git status


echo "This is my first project." > README.md
git status

These commands create a README.md file with the contents "This is my first project." in it. The git status command then outputs some text so we can see what's happening with the state of git.


$ git status
On branch master
Untracked files:
  (use "git add <file>..." to include in what will be committed)
	README.md

nothing added to commit but untracked files present (use "git add" to track)

gitconfig: colors

Depending on your settings and version of git, you may have master or main as your branch name, but the gist is the same: README.md shows up as untracked. Git can see the file, but it is not staged yet and will not be part of a commit until you add it.
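
When you want the same information in a form that is easier to read at a glance (or to use in a script), git status has a --porcelain option. The sketch below is only an illustration in a throwaway directory; it is not part of the project we are building.

```shell
set -eu

# Throwaway repository so the example does not touch a real project.
repo=$(mktemp -d)
cd "$repo"
git init -q

echo "This is my first project." > README.md

# --porcelain prints one stable, script-friendly line per path.
# "??" means git can see the file but is not tracking it yet.
untracked_state=$(git status --porcelain)
echo "$untracked_state"
```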

Side note: if you want different colors, you can use this in your ~/.gitconfig:


[color "status"]
	added = yellow
	changed = green
	untracked = cyan

Alternatively, you can run these commands and they will do the same thing so you don't have to worry about file syntax errors:


$ git config --global color.status.added yellow
$ git config --global color.status.changed green
$ git config --global color.status.untracked cyan
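
To see what a setting resolved to, you can read it back with git config. This is a small sketch that, as an assumption for safety, leaves out --global so the values land in a throwaway repository's own .git/config instead of your ~/.gitconfig.

```shell
set -eu

# Throwaway repository; without --global the settings stay local to it.
repo=$(mktemp -d)
cd "$repo"
git init -q

git config color.status.added yellow
git config color.status.changed green
git config color.status.untracked cyan

# Read a single value back.
added_color=$(git config --get color.status.added)
echo "added = $added_color"

# List every color.status.* key stored in this repository's config.
git config --local --get-regexp '^color\.status\.'
```

Dropping --global like this is also handy when you want different settings for one particular project.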

More details on .gitconfig.

git add

The command git add adds the file to git's staging area so git can see what the changes are and take note of them. When you do this, git will take note of the state of the file as it is at the time of staging. If you change things after that, you will have to stage/add them again after making those changes before committing.
Also, a little history: git add is the original command, and git stage was added later as a synonym because of some confusion around the terminology. They both do the same thing: add files to the staging area so git can track them.


$ git add README.md
$ git status
On branch master
Changes to be committed:
  (use "git restore --staged <file>..." to unstage)
	new file:   README.md
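
The point about staging capturing the file "as it is at the time of staging" is easier to believe when you see it. Here is a rough sketch, run in a throwaway directory, that edits the file after staging it and then compares what git has staged against what is on disk. The extra line of text is made up for the demonstration.

```shell
set -eu

# Throwaway repository so the example does not touch a real project.
repo=$(mktemp -d)
cd "$repo"
git init -q

echo "This is my first project." > README.md
git add README.md                      # the first version goes into the staging area

echo "A line added after staging." >> README.md

# "AM": Added in the staging area, Modified again in the working directory.
state=$(git status --porcelain)
echo "$state"

# ":README.md" names the staged copy, which still has only the first line.
staged=$(git show :README.md)
echo "$staged"

# Staging again picks up the second line.
git add README.md
restaged_lines=$(git show :README.md | wc -l | tr -d ' ')
echo "staged lines after re-adding: $restaged_lines"
```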

git commit

Now that we have our code staged and added to git tracking, we need to commit that change. It isn't officially part of git control until you've made a commit and a checksum is produced as a result of that change. Every commit requires a message. This is your opportunity to annotate the reason behind the change or some summary that describes the change in more detail. Most people ignore this and just put their name or one or two words. This annoys me in proper projects because it completely ignores the inner workings of the mind that put the software together. When I explore the history and see comments about why someone made a particular decision, it helps me make the best decision on what needs to change and why going forward, rather than guessing because the commit message was "fixed bug" or "added feature". The checksum is like a digital signature that's unique to this set of changes (or "changeset" in the field). Let's make that commit now so git knows what changed.


$ git commit -m 'Adding README.md file.'
[master 440cf4b] Adding README.md file.
 1 file changed, 1 insertion(+)

You may see some output like this. It echoes the commit message, which is a short description of what changes were made, followed by a summary of those changes. In this case, we added a file and made one insertion. The checksum is derived from the changes that went into the commit along with metadata such as the parent commit, author, and timestamp, so the checksums in your repository will differ from the ones shown here. Remember this because in git, checksums/commits are never deleted from the repository unless they are rebased out or discarded.
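
If you are curious what is behind a checksum, git can print both the full hash and the raw commit it refers to. This is a sketch in a throwaway directory with a hypothetical "Example User" identity; the hashes you get will not match anyone else's.

```shell
set -eu

# Throwaway repository with a hypothetical identity stored locally.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "This is my first project." > README.md
git add README.md
git commit -q -m 'Adding README.md file.'

# The short hash in the commit output is a prefix of the full checksum.
full_hash=$(git rev-parse HEAD)
short_hash=$(git rev-parse --short HEAD)
echo "full:  $full_hash"
echo "short: $short_hash"

# The raw commit: a tree (the files), the author, the committer, and the message.
git cat-file -p HEAD

object_type=$(git cat-file -t HEAD)
echo "object type: $object_type"
```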

Now if you run git status, you'll see all changes have been synchronized and we are now what I like to call "head of git".


$ git status
On branch master
nothing to commit, working tree clean

This is the basic workflow of git. You make changes, stage them, commit them, and repeat. There are many other commands and features of git that you can use to make your life easier. I will cover some of them in the next section.

Congratulations! You have made your first commit! 🎉

Branches of a Tree

If you are the only developer in the project contributing to the codebase, then committing directly to the master branch may work just fine for you. If you collaborate with one to three other developers, you will want to break your changes into different branches and leverage a peer-review process.

Branches are a way to separate changes from the main codebase so you can work on them without affecting the main codebase. This is useful for testing new features, fixing bugs, and other changes that may not be ready for production.

Let's create a branch together now. We will create a branch called staging.

git branch


$ git branch staging
$ git branch
* master
  staging

The git branch command lists all the branches in the repository. The * denotes the current branch you are on. You can see that we have two branches: master and staging. Notice that we are still on master: git branch creates a new branch, but it does not switch to it.

The git branch ${branch_name} command creates a new branch with the name you specified, starting from the commit you currently have checked out. If you provide one more argument after the branch name, you can create the branch from another branch instead. For example,


$ git branch feature-1 staging
$ git branch
  feature-1
* master
  staging

This will create a new branch called feature-1 from the staging branch. Again, the current branch does not change.
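
Here is a small sketch you can run in a throwaway directory to check both behaviors of git branch: it does not move you off your current branch, and a branch created from another branch starts at the same commit. The "Example User" identity is a placeholder.

```shell
set -eu

# Throwaway repository with a hypothetical identity stored locally.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
git commit -q --allow-empty -m 'First commit.'

# master or main, depending on your git settings.
start_branch=$(git symbolic-ref --short HEAD)

git branch staging
git branch feature-1 staging

# Creating branches did not switch us anywhere.
current_branch=$(git symbolic-ref --short HEAD)
echo "still on: $current_branch"

# feature-1 was created from staging, so both point at the same commit.
staging_commit=$(git rev-parse staging)
feature_commit=$(git rev-parse feature-1)
echo "staging:   $staging_commit"
echo "feature-1: $feature_commit"
```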

Pro Tip: I have run into issues using "/" in branch names in conjunction with some automation tools, so you will see me avoid creating "folders" in branch names. Instead, I use a dash "-" or an underscore "_" as a separator.

This is also not the only way a branch can be created. You can also use the git checkout -b ${branch_name} command to create a branch and switch to it in one command. For example,


$ git checkout -b feature-2
Switched to a new branch 'feature-2'

This will create a new branch called feature-2 and switch to it in one command. This is what I do personally, because in my mind it makes more sense to check out the branch I want to start from and then check out a new branch from there, rather than fabricating one in space and checking it out; but that's just how my brain works :)
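
For comparison, here is a sketch of the one-step and two-step approaches side by side in a throwaway directory. The feature-3 branch name is made up for the example; both routes end up on a new branch that starts at the same commit.

```shell
set -eu

# Throwaway repository with a hypothetical identity stored locally.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
git commit -q --allow-empty -m 'First commit.'

# One step: create feature-2 and switch to it.
git checkout -q -b feature-2
one_step_branch=$(git symbolic-ref --short HEAD)

# Two steps: return to the previous branch, create feature-3, then switch to it.
git checkout -q -
git branch feature-3
git checkout -q feature-3
two_step_branch=$(git symbolic-ref --short HEAD)

echo "one step landed on:  $one_step_branch"
echo "two steps landed on: $two_step_branch"

feature_2_commit=$(git rev-parse feature-2)
feature_3_commit=$(git rev-parse feature-3)
```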

Congratulations! You've made your first branch! 🎉

Merging Changes

Now that we have two branches, we can make changes to the staging branch and merge them back into the master branch. This is a common workflow for developers to work on features and bug fixes without affecting the main codebase.

Let's make a change to the staging branch. We will add a new file called staging.md and commit it to the staging branch.

git checkout

First, we need to switch to the staging branch.


$ git checkout staging
Switched to branch 'staging'

Now we are on the staging branch. Let's create a new file called staging.md and commit it.


$ echo "This is the staging branch." > staging.md
$ git add staging.md
$ git commit -m 'Adding staging.md file.'
[staging 531de12] Adding staging.md file.
 1 file changed, 1 insertion(+)
 create mode 100644 staging.md

Now we have made a change to the staging branch. Let's switch back to the master branch and merge the changes from the staging branch. I'll run git branch -v to show you the branches and the commit hashes.


$ git checkout master
Switched to branch 'master'

$ git branch -v
* master  440cf4b Adding README.md file.
  staging 531de12 Adding staging.md file.

$ git merge staging
Updating 440cf4b..531de12
Fast-forward
 staging.md | 1 +
 1 file changed, 1 insertion(+)
 create mode 100644 staging.md

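You'll observe that git reported a fast-forward. Because master had no new commits of its own since staging branched off, git did not need to create a merge commit; it simply moved master forward to point at the same commit as staging. If both branches had new commits, git would instead create a merge commit with an automated message to tie the two histories together. Either way, the changes from the staging branch are now in the master branch.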
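
To see the difference between a fast-forward and a real merge commit, here is a sketch that does one of each in a throwaway directory. The feature.md and main.md files and the "Example User" identity are invented for the demonstration.

```shell
set -eu

# Throwaway repository with a hypothetical identity stored locally.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
git commit -q --allow-empty -m 'First commit.'
main_branch=$(git symbolic-ref --short HEAD)   # master or main

# Case 1: only staging has new work, so the merge is a fast-forward.
git checkout -q -b staging
echo "This is the staging branch." > staging.md
git add staging.md
git commit -q -m 'Adding staging.md file.'
git checkout -q "$main_branch"
git merge -q staging
ff_head=$(git rev-parse HEAD)
ff_staging=$(git rev-parse staging)

# Case 2: both branches get new work, so git has to create a merge commit.
git checkout -q -b feature-1
echo "Feature work." > feature.md
git add feature.md
git commit -q -m 'Adding feature.md file.'
git checkout -q "$main_branch"
echo "Main branch work." > main.md
git add main.md
git commit -q -m 'Adding main.md file.'
git merge -q --no-edit feature-1

# A merge commit records two parents, one for each side of the merge.
parent_count=$(git cat-file -p HEAD | grep -c '^parent ')
echo "parents of the merge commit: $parent_count"
```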

If changes are made in branches on the same files on the same lines and the result is merged, git will attempt to do its best to merge the results, but if it cannot find a sane solution, it will ask you to resolve the conflict. This is called a merge conflict and is a common occurrence in software development. You can resolve these conflicts by editing the files in question and committing the changes. I will cover this in a future video in more detail. The gist of what I keep in mind when resolving conflicts is the desired end-state of the program.
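
Until that video exists, here is a rough sketch of what a conflict looks like and one way to resolve it. It runs in a throwaway directory; the notes.md file, its contents, and the "Example User" identity are all made up, and the resolution simply writes the end state I want for the file.

```shell
set -eu

# Throwaway repository with a hypothetical identity stored locally.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "original line" > notes.md
git add notes.md
git commit -q -m 'Adding notes.md file.'
main_branch=$(git symbolic-ref --short HEAD)   # master or main

# Change the same line on two different branches.
git checkout -q -b staging
echo "staging version" > notes.md
git commit -q -am 'Edit notes.md on staging.'
git checkout -q "$main_branch"
echo "main version" > notes.md
git commit -q -am 'Edit notes.md on the main branch.'

# git merge exits non-zero when it cannot reconcile the changes by itself.
if git merge staging >/dev/null 2>&1; then
  merge_result=clean
else
  merge_result=conflict
fi
echo "merge result: $merge_result"

# "UU" means both sides modified the file and it is still unmerged.
conflict_state=$(git status --porcelain)
echo "$conflict_state"

# Resolve: write the desired end state, stage it, and commit to finish the merge.
echo "main and staging version" > notes.md
git add notes.md
git commit -q -m 'Merge staging, keeping both edits in notes.md.'

remaining=$(git status --porcelain)
parent_count=$(git cat-file -p HEAD | grep -c '^parent ')
```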

Congratulations! You've made your first merge! 🎉

Tagging Releases

Tags are a way to mark a specific commit in the repository. This is useful for marking releases, versions, and other important points in the history of the project. You can also digitally sign the release using GPG keys. I will cover GPG/PGP in a dedicated post.

Let's tag the current commit as a release. We will create a tag called v1.0.0.


$ git tag v1.0.0
$ git tag
v1.0.0

The git tag command lists all the tags in the repository. You can see that we have a tag called v1.0.0. This is the tag we just created.

The tag above is a lightweight tag: just a name for a commit. You can also create an annotated tag, which carries its own message, by using the -a flag. For example,


$ git tag -a v1.0.1

This will open up an editor for you to add a message to the tag. You can also add a message inline by using the -m flag. For example,


$ git tag -a v1.0.2 -m 'This is a release tag.'

This will create an annotated tag called v1.0.2 with the message "This is a release tag."
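
To see how a plain tag and an annotated tag differ under the hood, here is a sketch in a throwaway directory with a placeholder identity. It asks git what kind of object each tag name refers to.

```shell
set -eu

# Throwaway repository with a hypothetical identity stored locally.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
git commit -q --allow-empty -m 'First commit.'

git tag v1.0.0
git tag -a v1.0.2 -m 'This is a release tag.'

# A plain tag is only a name for the commit itself...
lightweight_type=$(git cat-file -t v1.0.0)
# ...while an annotated tag is its own object that carries the message.
annotated_type=$(git cat-file -t v1.0.2)
echo "v1.0.0 -> $lightweight_type"
echo "v1.0.2 -> $annotated_type"

# Both still lead to the same commit.
lightweight_commit=$(git rev-parse 'v1.0.0^{commit}')
annotated_commit=$(git rev-parse 'v1.0.2^{commit}')

# Show the message stored in the annotated tag.
git cat-file -p v1.0.2
```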

Congratulations! You've made your first tag! 🎉

Origin and Remote

If you are working on a project with multiple developers, you will want to use a remote repository to share changes with other developers. This is where origin comes in. origin is the default name for the remote repository. You can have multiple remotes, each with its own name, but origin is the one you will see most often.

git remote

Let's add a remote repository to our project. We will use origin set to a bare local git repository created in another directory.


$ git init --bare ~/my-project.git
$ git remote add origin ~/my-project.git
$ git remote -v

git push

Now that we have a remote repository set up, we can push our changes to the remote repository. This will allow other developers to pull the changes from the remote repository and work on them if they have access to the same folder.


$ git push origin --all
Enumerating objects: 11, done.
Counting objects: 100% (11/11), done.
Delta compression using up to 12 threads
Compressing objects: 100% (5/5), done.
Writing objects: 100% (11/11), 3.47 KiB | 1.16 MiB/s, done.
Total 11 (delta 0), reused 0 (delta 0), pack-reused 0
To /tmp/markizano/my-project.git
 * [new branch]      master -> master
 * [new branch]      staging -> staging

This will push all the branches to the remote repository. Note that --all covers branches only; tags are pushed separately with git push origin --tags. Now other developers can pull the changes from the remote repository and work on them.
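
Here is a sketch of the whole round trip in a throwaway directory: create a bare repository, push branches and tags to it, and then clone it the way a colleague would. The colleague-copy directory name and the "Example User" identity are invented for the example.

```shell
set -eu

# Everything lives under one throwaway directory.
workdir=$(mktemp -d)
cd "$workdir"

# The bare repository plays the role of the shared remote.
git init -q --bare my-project.git

git init -q my-project
cd my-project
git config user.name "Example User"
git config user.email "user@example.com"
git commit -q --allow-empty -m 'First commit.'
git branch staging
git tag v1.0.0
original_commit=$(git rev-parse HEAD)

git remote add origin "$workdir/my-project.git"
git push -q origin --all     # branches only
git push -q origin --tags    # tags are pushed separately

remote_branches=$(git ls-remote --heads origin | wc -l | tr -d ' ')
remote_tags=$(git ls-remote --tags origin | wc -l | tr -d ' ')
echo "branches on origin: $remote_branches"
echo "tags on origin:     $remote_tags"

# Another developer with access to the same folder can now clone it.
cd "$workdir"
git clone -q "$workdir/my-project.git" colleague-copy
cloned_commit=$(git -C colleague-copy rev-parse HEAD)
echo "clone is at: $cloned_commit"
```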

Please note: I used a bare repository in a folder in this example as the remote repository. You can set this to a URL of various resource endpoints and git will respond to them in kind. It's an incredibly versatile tool in tracking files across time and branching changes. I'll show you how you can run your own git server in a dedicated post. Hint: Gitolite is a great tool for this.

Congratulations! You've made your first remote push! 🎉

Pull Requests

If you are working on a project with multiple developers, you will want to use pull requests to review changes before merging them into the main codebase. This is a common workflow for developers to work on features and bug fixes without affecting the main codebase.

Creating a pull request is a UI-driven process. Please refer to the documentation for whatever tool you happen to be using to collaborate on your changes with a centralized UI.

Video

In this video, we talk about the Git process and take a deep dive into the fundamentals of how work gets done. This is a lead into the Scrum process. Captions were generated thanks to the Whisper library by OpenAI with some work done by an open source contributor, @guillaumekln

Please enjoy the video!!
