
/boot/vmlinuz
Path to the kernel

Modern processor architectures typically allow the CPU to operate in at least two different modes: user mode and kernel mode (sometimes also referred to as supervisor mode). Hardware instructions allow switching from one mode to the other. Correspondingly, areas of virtual memory can be marked as being part of user space or kernel space. When running in user mode, the CPU can access only memory that is marked as being in user space; attempts to access memory in kernel space result in a hardware exception. When running in kernel mode, the CPU can access both user and kernel memory space.
Certain operations can be performed only while the processor is operating in kernel mode. Examples include executing the halt instruction to stop the system, accessing the memory-management hardware, and initiating device I/O operations. By taking advantage of this hardware design to place the operating system in kernel space, operating system implementers can ensure that user processes are not able to access the instructions and data structures of the kernel, or to perform operations that would adversely affect the operation of the system.
A process is started by the kernel, and it is also the kernel that can end it; all input into a process comes through the kernel, and all output from a process passes through the kernel.
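As a small illustration of this, a program can ask the kernel (via the standard library) to start a child process and collect its output; every byte shown here passes through the kernel on its way between the two processes. This is just a sketch using `echo` as the child:

```rust
use std::process::Command;

fn main() {
    // Ask the kernel to start a child process (fork/exec under the hood).
    let output = Command::new("echo")
        .arg("hello from a child process")
        .output()
        .expect("failed to start child process");

    // The child's stdout travelled through the kernel back to us.
    print!("{}", String::from_utf8_lossy(&output.stdout));

    // The kernel also reports to us how the child ended.
    assert!(output.status.success());
}
```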
ps -e
Show all processes
If A and B are processes, they can't talk to each other directly; all communication must pass through the kernel.
In a sense this is like a client-server API, if the client (process) wants to access any resources it asks the server (kernel).
cat /proc/1/limits | grep processes
Get the maximum number of processes on your system
Users are just another type of client to the server (kernel).
cat /etc/passwd | grep <USERNAME>
Show a user's profile
The above outputs something like:

username:password:UID:GID:comment:home:shell

The first field is the username, followed by the password field (an x means the encrypted password is stored in /etc/shadow), then the user ID (UID), then the group ID (GID) of the user, followed by a comment that describes the user account, the user's home directory, and finally the shell that is launched on user login.
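To make the field layout concrete, here is a sketch that splits a passwd-style line on `:` into its seven fields (the sample line and names are made up for illustration; real lookups should go through the system's NSS facilities, not hand-parsing):

```rust
fn main() {
    // A sample line in the /etc/passwd format described above.
    let line = "alice:x:1000:1000:Alice Example:/home/alice:/bin/bash";

    // The seven fields are separated by colons.
    let fields: Vec<&str> = line.split(':').collect();
    assert_eq!(fields.len(), 7);

    let (username, password, uid, gid, _comment, home, shell) =
        (fields[0], fields[1], fields[2], fields[3], fields[4], fields[5], fields[6]);

    // An "x" in the password field means the real hash lives in /etc/shadow.
    assert_eq!(password, "x");

    println!("user {} (UID {}, GID {}) logs into {} at {}", username, uid, gid, shell, home);
}
```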
cat /etc/passwd | grep root
Show the superuser profile
Users can be grouped together for administrative purposes; imagine some files that should only be accessible to users that are part of a specific group.
cat /etc/group | grep <GROUP_NAME>
Show a group's entry
The above outputs something like:

group_name:group_password:group_id:group_members

/ is the base of the hierarchy, also known as the root directory. File names may contain any characters except slashes (/) and null characters (\0).

ls /
List the files and directories in the root directory
Each file has an associated user ID (UID) and group ID (GID) that define the owner of the file and the group it belongs to. These properties are also the building blocks of file permissions.
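On Unix-like systems, Rust exposes these IDs through the std::os::unix::fs::MetadataExt trait. A quick sketch (the path is just an example):

```rust
use std::fs::{self, File};
use std::os::unix::fs::MetadataExt;

fn main() -> std::io::Result<()> {
    // A file we create is owned by the user and group of this process.
    let path = "/tmp/ownership_demo.txt";
    File::create(path)?;

    // uid()/gid() return the numeric owner and group of the file.
    let meta = fs::metadata(path)?;
    println!("owner UID: {}, owner GID: {}", meta.uid(), meta.gid());
    Ok(())
}
```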
In the context of permissions there are 3 types of entities within the system; an entity can interact with a file depending on the file's permissions.
Here are the entities:
- the owner (user) of the file
- the group the file belongs to
- everyone else (other)
There are 3 types of permissions:
read allows an entity to read the file > (for directories, read allows an entity to list the contents of the directory)
write allows an entity to modify the file > (for directories, write allows the contents of the directory to be changed)
execute allows an entity to execute the file > (for directories, execute allows access to files within the directory)
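These permissions are stored as mode bits on the file, commonly written in octal: for example 0o640 means read/write for the owner, read for the group, and nothing for others. A sketch using std::os::unix::fs::PermissionsExt (the path is an example):

```rust
use std::fs::{self, File, Permissions};
use std::os::unix::fs::PermissionsExt;

fn main() -> std::io::Result<()> {
    let path = "/tmp/perm_demo.txt";
    File::create(path)?;

    // rw- for the owner, r-- for the group, --- for others.
    fs::set_permissions(path, Permissions::from_mode(0o640))?;

    // Mask off the file-type bits to see only the permission bits.
    let mode = fs::metadata(path)?.permissions().mode();
    println!("mode: {:o}", mode & 0o777);
    assert_eq!(mode & 0o777, 0o640);
    Ok(())
}
```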
+-----------------+ | ...
| Program/Process | | .-----> [ ]
+-||----------||--+ | | [ ]
\ libc / | .----' [ ]
'-------------' | | [ ]
| | | [ ]
| | | [ ]
'---------------' ...
|
[User Space] | [Kernel Space]
|
|

This section has been adapted from: https://era.co/blog/unbuffered-io-slows-rust-programs
Programming languages have access to OS syscalls; these are used for things such as I/O. Syscalls are slow to call, so when designing high-performance code every syscall should be analyzed.
No buffering, slow:
use std::fs;
use std::io::{self, Write};

fn main() -> io::Result<()> {
    let mut f = fs::File::create("/tmp/unbuffered.txt")?;
    f.write(b"foo")?;
    f.write(b"\n")?;
    f.write(b"bar\nbaz\n")?;
    Ok(())
}

We can use the strace program to see the syscalls used in a program:
$ strace --trace=write ./target/release/01_unbuffered
write(3, "foo", 3) = 3
write(3, "\n", 1) = 1
write(3, "bar\nbaz\n", 8) = 8

We should rather use buffered I/O like this:
use std::fs;
use std::io::{self, BufWriter, Write};

fn main() -> io::Result<()> {
    let mut f = BufWriter::new(fs::File::create("x.txt")?);
    f.write(b"foo")?;
    f.write(b"\n")?;
    f.write(b"bar\nbaz\n")?;
    Ok(())
}

$ strace --trace=write ./target/release/02_buffered
write(3, "foo\nbar\nbaz\n", 12) = 12

This section has been adapted from: https://bonsaidb.io/blog/durable-writes/#What%20are%20%27durable%20writes%27%3F
When writing data to a file, the data is first cached in RAM by the OS; it is not immediately written to the file system. Writing to the disk is slow, which is why buffered I/O is used. But if power is suddenly cut, any data still sitting in RAM that was meant to be written to the file is lost forever.
To prevent the loss of buffered/cached data, we need to flush, or sync, the data to disk.
- On Linux, it’s fsync(), fdatasync(), and sync_file_range().
- On Windows, it’s FlushFileBuffers.
- On Mac/iOS, fsync() is available but does not provide the same guarantees as Linux. Instead, a call to fcntl with the F_FULLFSYNC option must be used to trigger a write to physical media.
Rust uses the correct APIs for each platform when calling File::sync_all or File::sync_data to provide durable writes. The standard library does not provide APIs to invoke the underlying APIs mentioned above. Thankfully, the libc crate makes it easy to call the APIs we are interested in for this post.
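Putting the two layers together, a durable write in Rust looks roughly like this sketch: first flush the userspace buffer into the file, then ask the OS to push its own caches to the physical disk with sync_all (the path is an example):

```rust
use std::fs::File;
use std::io::{BufWriter, Write};

fn main() -> std::io::Result<()> {
    let file = File::create("/tmp/durable.txt")?;
    let mut writer = BufWriter::new(file);
    writer.write_all(b"important data\n")?;

    // First flush BufWriter's userspace buffer into the file...
    writer.flush()?;
    // ...then ask the kernel to push its caches down to the disk
    // (fsync on Linux, FlushFileBuffers on Windows, F_FULLFSYNC on macOS).
    writer.get_ref().sync_all()?;
    Ok(())
}
```

Note that flush() alone only empties the userspace buffer; without sync_all() the data can still be lost in the OS cache on power failure.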
In programs that use SQLite, various actions may need to be performed on the database. These actions can be grouped together in what's called a transaction.
A transaction is a sequence of actions on data items.
Transactions help prevent problems that could arise, such as loss of data durability when a program crashes or power fails unexpectedly, or inconsistencies during complex concurrent operations. (These guarantees are basically ACID; more info below.)
Programs can start a transaction and execute operations as part of it. But for the transaction's changes to take effect, the transaction must be committed. To commit simply means to instruct the database to permanently update its state according to the operations contained within the transaction.
Transactions can be considered logical units of work for a database system. If a transaction fails, the database must remove its effects and revert back to the state it was in before the transaction occurred.
Not only are transactions units of work that move the database state forward, they are also a database abstraction with the following guarantees (aka ACID):
Atomicity: all operations within a transaction must succeed; if even a single operation fails, none of the other operations take effect and the transaction fails to commit.
Consistency: a transaction takes the database from one consistent state to another, and a transaction must be deterministic.
Isolation: all operations of each transaction appear to happen 'together', instantaneously.
Durability: the effects of successful transactions must become a permanent part of the database.
To get a greater view of Transactions let us see them at work, using Rust and the Rusqlite crate:
use rusqlite::{params, Connection, Result};

/// A helper function for connecting to the database
fn connect_db() -> Result<Connection> {
    let conn = Connection::open("/tmp/TEST_DB.db")?;
    conn.execute(
        "CREATE TABLE IF NOT EXISTS vals(
            v INTEGER NOT NULL
        )",
        [],
    )?;
    Ok(conn)
}

/// A slow way to insert rows
fn slow_insert(conn: &Connection) -> Result<()> {
    for count in 1..=1000 {
        conn.execute("INSERT INTO vals (v) VALUES (?1)", params![count])?;
    }
    Ok(())
}

/// A fast way to insert rows
fn fast_insert(conn: &mut Connection) -> Result<()> {
    let tx = conn.transaction()?;
    for count in 0..1000 {
        tx.execute("INSERT INTO vals (v) VALUES (?1)", params![count])?;
    }
    tx.commit()?;
    Ok(())
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_slow_insert() {
        let conn = connect_db().unwrap();
        slow_insert(&conn).unwrap();
    }

    // #[test]
    // fn test_fast_insert() {
    //     let mut conn = connect_db().unwrap();
    //     fast_insert(&mut conn).unwrap();
    // }
}

In the above code we try out two ways to insert 1000 rows into an SQLite database. The code has three functions:
- fn connect_db() -> Result<Connection>: a helper function for connecting to the database
- fn slow_insert(conn: &Connection) -> Result<()>: a slow way to insert rows
- fn fast_insert(conn: &mut Connection) -> Result<()>: a fast way to insert rows

The code also has two test functions, test_slow_insert and test_fast_insert; the latter is commented out because we only want to test the slow one first, by running:
cargo test
We see that it is quite slow; on my machine the test output:
running 1 test
test tests::test_slow_insert has been running for over 60 seconds
test tests::test_slow_insert ... ok
test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 165.22s
Doc-tests st
running 0 tests
test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

Let us try out the fast version by commenting out the
test_slow_insert unit test function and uncommenting the
test_fast_insert unit test function. Then after we run
cargo test we get
running 1 test
test tests::test_fast_insert ... ok
test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.18s
Doc-tests st
running 0 tests
test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

This time the test completes almost instantly.
So why is the first one slow? Specifically, why is this slow:
// A slow way to insert rows
fn slow_insert(conn: &Connection) -> Result<()> {
    for count in 1..=1000 {
        conn.execute("INSERT INTO vals (v) VALUES (?1)", params![count])?;
    }
    Ok(())
}
It is slow because it uses the connection's execute method, which results in a new transaction being created to insert each and every row. This might be acceptable if the database were held in memory, but in this case the database lives on the filesystem, on a spinning disk drive. Interactions with the filesystem are slow; usually several syscalls have to be made. For example, for durability reasons (a key requirement of ACID), databases often make use of the fsync system call. All this means that creating 1000 transactions and committing them one by one is very slow; it is much better to batch the database operations into a single transaction and commit it only once, like this:
// A fast way to insert rows
fn fast_insert(conn: &mut Connection) -> Result<()> {
    let tx = conn.transaction()?;
    for count in 0..1000 {
        tx.execute("INSERT INTO vals (v) VALUES (?1)", params![count])?;
    }
    tx.commit()?;
    Ok(())
}
