This comparison table is taken from the book “Architecture and Design of the Linux Storage Stack”, which I find useful for understanding the differences between the two.
| | Journaling | Copy-On-Write |
|---|---|---|
| Write handling | Changes are recorded in a journal before applying them to the actual file system | A separate copy of the data is created to make modifications |
| Original data | Original data gets overwritten | Original data remains intact |
| Data consistency | Ensures consistency by recording metadata changes and replaying them if needed | Ensures consistency by never modifying the original data |
| Performance | Minimal overhead, depending on the type of journaling mode | Some performance gains because of faster writes |
| Space utilisation | Journal size is typically in MB, so no additional space is required | More space is required due to separate copies of data |
| Recovery times | Fast recovery times, as the journal can be replayed instantly | Slower recovery times, as data needs to be reconstructed using recent copies |
| Features | No built-in support for features such as compression or deduplication | Built-in support for compression and deduplication |
Taken from “Architecture and Design of the Linux Storage Stack”
A Copy-on-Write filesystem does not overwrite the data in place. Here is how it is done. Suppose there is a file that will be modified:

1. The old data is copied to a newly allocated location on the disk.
2. The new data is written to the allocated location on the disk, hence the name Copy-on-Write.
3. The references for the new data are updated.
4. However, the old data and its snapshots are still there.
As described in Architecture and Design of the Linux Storage Stack by Muhammad Umer, page 59.
As the old data is preserved in the process, filesystem recovery is greatly simplified: since the previous state of the data is saved at another allocated location on disk, the system can easily revert to its former state after an outage. This makes maintaining a journal obsolete. It also allows snapshots to be implemented at the filesystem level.
As the old data is still there, space utilisation may be higher than what the user expects.
Some of the filesystems that use the CoW-based approach include the Zettabyte File System (ZFS) and the B-tree Filesystem (Btrfs).
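On a CoW-capable filesystem such as Btrfs (or XFS with reflink support), this behaviour can be observed from userspace with GNU `cp --reflink`, which asks the kernel to share the file's blocks instead of duplicating them. The sketch below uses `--reflink=auto`, which falls back to an ordinary copy on filesystems without CoW support; the paths are hypothetical:

```shell
# Create a source file (hypothetical path)
echo "original contents" > /tmp/cow_demo_src.txt

# On Btrfs/XFS this shares blocks (CoW); elsewhere it falls back to a plain copy
cp --reflink=auto /tmp/cow_demo_src.txt /tmp/cow_demo_copy.txt

# Modifying the copy does not touch the original's blocks;
# on a CoW filesystem, new blocks are allocated for the changed data
echo "modified contents" > /tmp/cow_demo_copy.txt

cat /tmp/cow_demo_src.txt   # the original is intact
```

With `--reflink=always`, `cp` refuses to copy at all on filesystems that cannot share blocks, which is a quick way to check whether your filesystem actually supports CoW cloning.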
The Journaled File System (JFS) is a file system developed by IBM in 1990. It keeps track of changes that are not yet committed to the file system’s main part by recording the intent of such changes in a data structure known as a “journal”. Usually, the journal is a circular log.
In the event of a system crash or power failure, a journaling file system can be brought back online more quickly, with a lower chance of becoming corrupted. Depending on the actual implementation, the journaling file system may only keep track of stored metadata, which results in improved performance at the expense of an increased possibility of data corruption.
Here is a diagram taken from Architecture and Design of the Linux Storage Stack by Muhammad Umer, page 57.
According to Chapter 3 of the book:
From the diagram, any changes made to the filesystem are written sequentially to a journal; each batch of changes is called a transaction. Once a transaction is written to the journal, it is written to the appropriate location on disk. In the case of a system crash, the filesystem replays the journal to see whether any transaction is incomplete. Once a transaction has been written to its on-disk location, it is removed from the journal.
It is interesting to note that, depending on the mode, either the metadata alone or the actual data as well is first written to the journal. Either way, once written to the filesystem, the transaction is removed from the journal. The size of the journal is typically only a few megabytes.
Benefits of Journaling File Systems and Their Impact on Performance
Besides making the filesystem more reliable and preserving its structure across system crashes and hardware failures, the burning question is: does journaling impact performance?
Generally, journaling improves performance by requiring fewer seeks on the physical disks, since data is only flushed when a journal transaction is committed or when the journal fills up. For example, in metadata-intensive workloads such as recursive operations on a directory and its contents, journaling improves performance by reducing frequent trips to disk and performing multiple updates as a single unit of work.
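In ext4, for instance, this trade-off between safety and performance is exposed as a mount option that selects the journaling mode. A sketch of an `/etc/fstab` entry; the device and mountpoint are assumptions:

```shell
# /etc/fstab — choosing an ext4 journaling mode (device/mountpoint are examples)
#
#   data=journal   : both data and metadata are journaled (safest, slowest)
#   data=ordered   : the default; data is flushed to disk before its metadata
#                    is committed to the journal
#   data=writeback : only metadata is journaled (fastest, weakest guarantees)
#
/dev/sda2   /data   ext4   defaults,data=ordered   0 2
```

The `data=ordered` default is usually a sensible middle ground; `data=journal` doubles the write traffic for data blocks, since everything passes through the journal first.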
The Linux kernel supports many filesystems that are native to Linux, but there are other filesystems that Linux supports via Filesystem in Userspace (FUSE). You can probably guess one prominent example of a non-native filesystem that Linux users face often – Windows NTFS.
I have written an entire blog post on how to get your NTFS portable drive working. For more information, do take a look at Mounting NTFS on Rocky Linux 8.
Another popular use of a FUSE driver is sshfs. It uses SSH to mount a remote file system, avoiding the need to set up NFS or Samba while enjoying the benefits of SSH encryption.
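A minimal sshfs session looks like the sketch below. The host, user, and remote path are hypothetical, so the network-dependent commands are shown commented out:

```shell
# Create a local mountpoint
mkdir -p "$HOME/remote-home"

# Mount the remote home directory over SSH (hypothetical host and user):
# sshfs user1@remote.example.com:/home/user1 "$HOME/remote-home"

# ... work on the files as if they were local ...

# Unmount when done:
# fusermount -u "$HOME/remote-home"
```

Because sshfs authenticates exactly like ssh, any keys or config in `~/.ssh` work unchanged, which is what makes it so convenient compared with setting up NFS or Samba.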
If you are having SSH issues, and when you turn on high verbosity the following output is generated:
# ssh -vvv XXX.XXX.XXX.XXX
.....
.....
debug1: Offering public key:
debug3: send packet: type 50
debug2: we sent a publickey packet, wait for reply
debug3: receive packet: type 51
.....
.....
debug2: we did not send a packet, disable method
debug1: No more authentication methods to try.
user1@192.168.0.1: Permission denied (publickey,gssapi-with-mic,password)
According to the SSH authentication protocol (RFC 4252), these are the general authentication message codes: type 50 is SSH_MSG_USERAUTH_REQUEST, type 51 is SSH_MSG_USERAUTH_FAILURE, and type 52 is SSH_MSG_USERAUTH_SUCCESS. In the output above, the server answered the public-key request (type 50) with a failure (type 51).
Type 1: Incorrect configuration settings in /etc/ssh/sshd_config (assuming you are using password authentication). Inside /etc/ssh/sshd_config, you should have something like:
PermitRootLogin no
.....
PasswordAuthentication yes
.....
ChallengeResponseAuthentication no
.....
GSSAPIAuthentication yes
GSSAPICleanupCredentials no
.....
UsePAM yes
Type 2: Incorrect configuration settings in /etc/ssh/ssh_config. In Rocky Linux 8, everything should be commented out except the last line, “Include /etc/ssh/ssh_config.d/*.conf”.
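A quick way to confirm that the directives are actually in effect is to grep the uncommented lines out of the config file. The sketch below runs against a temporary stand-in file so it is safe to try anywhere; on the real host, point the grep at /etc/ssh/sshd_config instead:

```shell
# Stand-in for /etc/ssh/sshd_config so the sketch is safe to run anywhere
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
PermitRootLogin no
PasswordAuthentication yes
UsePAM yes
EOF

# Print only the effective (uncommented) authentication-related directives
grep -E '^(PermitRootLogin|PasswordAuthentication|UsePAM)' "$cfg"

# Fail loudly if password authentication is not enabled
grep -q '^PasswordAuthentication yes' "$cfg" && echo "password auth enabled"
```

On the real host, `sshd -T` (run as root) prints the fully resolved effective configuration, which also accounts for any `Include`d fragments and `Match` blocks.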
If you have compiled a new application with an updated GCC 12.3 and are still wondering why errors like the one below keep surfacing when you run the binary that was compiled with GCC 12.3 and not the system GCC:
We checked the GLIBCXX versions available to Wildfly using “strings /usr/lib64/libstdc++.so.6 | grep GLIBCXX” and found that the latest version on the cluster is only GLIBCXX_3.4.25, which cannot support C++17.
The first step in troubleshooting is to find out what the binary uses for its shared object dependencies.
$ ldd main_fcc
./main_fcc: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.29' not found (required by ./main_fcc)
./main_fcc: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.26' not found (required by ./main_fcc)
        linux-vdso.so.1 (0x00007fffffbd1000)
        libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00007f4827ea1000)
        libm.so.6 => /lib64/libm.so.6 (0x00007f4827b1f000)
        libgomp.so.1 => /lib64/libgomp.so.1 (0x00007f48278e7000)
        libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f48276cf000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f48274af000)
        libc.so.6 => /lib64/libc.so.6 (0x00007f48270ea000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f4828236000)
        libdl.so.2 => /lib64/libdl.so.2 (0x00007f4826ee6000)
For the above example, “libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00007f4827ea1000)” shows that the application is pointing back to the system libraries rather than the libraries you intend to use. You may want to “module load …….” if you are using Environment Modules, or configure the PATH, LD_LIBRARY_PATH, MANPATH and CPLUS_INCLUDE_PATH of the GCC you built the binary with. In the end, the shared object dependencies should point to the libraries of that GCC.
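If you are not using Environment Modules, the same effect can be achieved by exporting the paths by hand. The install prefix below is an assumption; substitute the actual location of your GCC 12.3:

```shell
# Hypothetical install prefix of the newer toolchain
GCC_HOME=/usr/local/gcc-12.3

# Put the newer compiler and its runtime libraries first on the search paths
export PATH="$GCC_HOME/bin:$PATH"
export LD_LIBRARY_PATH="$GCC_HOME/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"

# Re-check where the binary now resolves libstdc++ from:
# ldd ./main_fcc | grep libstdc++
```

After the export, the `ldd` line for libstdc++ should show the path under the new prefix instead of /lib64.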
We have seen, in the VS Code environment, that the shared object dependencies were observed to point back to the system libraries even when using Environment Modules. In that case, it is best to put something like the snippet below in .bashrc.
# .bashrc
# Source global definitions
if [ -f /etc/bashrc ]; then
    . /etc/bashrc
fi

# User specific aliases and functions
alias rm='rm -i'
alias cp='cp -i'
alias mv='mv -i'
alias ls='ls --color=auto'
alias egrep='egrep --color=auto'
alias fgrep='fgrep --color=auto'
alias grep='grep --color=auto'

# Uncomment the following line if you don't like systemctl's auto-paging feature:
# export SYSTEMD_PAGER=
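Appending something like the following to the end of that .bashrc makes every shell, including the non-login terminals VS Code spawns, pick up the newer toolchain. The module name is an assumption for your site:

```shell
# Load the newer toolchain in every shell, including VS Code terminals.
# (module name gcc/12.3 is an assumption; "|| true" keeps login working
#  on machines where the module or the module command is absent)
module load gcc/12.3 2>/dev/null || true
```

The `|| true` guard matters because a failing command in .bashrc can break non-interactive tools such as scp.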
I had a casual read of the book “Bash Idioms” by Carl Albing and scribbled down the lessons that struck me the most. There is so much more; please read the book itself.
Lesson 1: In “A && B”, the whole expression is true only if both A and B are true, so B runs only when A succeeds.
Example 1: If the cd command succeeds, then execute the “rm -Rv *.tmp” command
cd tmp && rm -Rv *.tmp
Lesson 2: In “A || B”, if A succeeds, B is not executed; B runs only when A fails.
Example 2: Change directory; if it fails, print a message that the change of directory failed and exit.
cd /tmp || { echo "cd to /tmp failed" ; exit ; }
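The two idioms can be seen side by side in a runnable sketch (the paths are arbitrary):

```shell
# && : the echo runs only because mkdir -p succeeded
mkdir -p /tmp/bash_idioms_demo && echo "mkdir ok"

# || : the block runs only because the cd failed
cd /nonexistent/dir 2>/dev/null || { echo "cd failed"; }
```

The `{ ...; }` group on the `||` side is the same pattern as Example 2: several commands are treated as a single unit, so both the message and any follow-up action belong to the failure branch.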
Lesson 3: When do we use [ ] versus [[ ]]?
I learned that the author advises Bash users to use [[ ]] unless unavoidable. The double bracket avoids confusing edge-case behaviours that the single bracket may exhibit. If, however, the main goal is portability to non-bash shells, the single bracket may be advisable.
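One such edge case is word splitting. With [ ], an unquoted variable containing spaces is split into several words and breaks the test; [[ ]] does not word-split its operands. A small bash sketch (the filename is arbitrary):

```shell
name="file with spaces.txt"
touch "/tmp/$name"

# [[ ]] does not word-split the unquoted expansion, so this works:
if [[ -e /tmp/$name ]]; then echo "found"; fi

# The single-bracket equivalent, [ -e /tmp/$name ], would expand to
# several words and fail with "binary operator expected".
```

Quoting the variable makes [ ] behave too, but [[ ]] removes the whole class of mistakes.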
LAMMPS is a classical molecular dynamics code with a focus on materials modeling. It’s an acronym for Large-scale Atomic/Molecular Massively Parallel Simulator. For more information on the software, take a look at https://www.lammps.org/