archive provided by lainchan.jp

lainchan archive - /λ/ - 20034



No.20034

Hey lainons, long time lurker here.

While I've truly enjoyed discussing random crazy ideas in most of /tech and /lam, it came to my attention that some lainons don't really understand how unix works, both abstraction-wise and implementation-wise.

There's nothing wrong with that, considering most newer devs are more experienced with web development frameworks or high-level languages that provide strong abstractions and hide most machine-specific stuff and unix wizardry.

And even some oldfarts might be lisp hackers who don't give much of a soyfak about io side effects and let the system programmers do the dirty work (and do it correctly).

And yet most of us work on a unix-variant os, and it never hurts to take a look at the tools at your disposal, as it leads to insights about the things you can build with them, their inherent limitations and whatnot.

Even if you are not a toolsmith kind of person, how hardware works and how unix abstracts those devices is a never-ending source of inspiration (believe me, some people have worked on one project for longer than some young lainons' lifetimes and they refuse to get tired of hacking on it).

So without further ado, I introduce this (meta?)thread, dealing with the inner workings of the unix operating system.

Instead of calling someone newfag b4 they finish design and implementation of freebsd operating system and POSIX specification, I propose the magic trick that works for me all the time : read and write (code or documentation about) actual program.

Now, creating one's own unix variant is a life-consumingly interesting project, but as programmers we are obliged to optimize our learning process whenever it makes sense and serves the purpose.

Instead, I'm thinking of posting the general design of each abstraction layer and providing pointers to real operating systems' implementations (source code), with explanations of why they are written that way (what FILTHY hardware design imposed on the implementation, and that kind of reason).

I can explain how openbsd (i386, amd64, arm, sh4) and linux (a limited part of it...) are implemented in detail, as I was one of the developers (I still contribute random bug fixes, but much less frequently), so my implementation examples will mainly come from those two projects. Other lainons can help me out if they can provide counterparts for other oses.

If this thread gets traction, maybe we can start a lain magazine series too.

any suggestions and ideas?
where should we start? I was thinking about the file system in general (from file to vnode and how the hard drive controller works).

tl;dr
ever-evolving state-of-the-art design and implementation of operating system (3rd edition)'s unix subset, without spoon feeding.

more like a general os internals wiki, but it's hard to teach|learn something solely from a wiki if you cannot explain what you don't know.

  No.20035

Will you consider writing things on the wiki ?

  No.20036

>>20035
last time i visited it was closed.

under what category should I contribute?

  No.20043

>>20034
This would be awesome OP! I'm looking forward to anything you have to offer regarding the *nix kernel. Wouldn't the best place to start be something to do with the process scheduler? The file system is something that I believe is external to the kernel. Also there are some books out that explain the Linux kernel, maybe you could use one of those as a reference point. The old classic on the Unix kernel is Lions' Commentary on UNIX 6th Edition, which is freely available on the internet.

  No.20052

I'm probably a good fit for the audience you describe and have recently become more interested in systems programming, so I'm very excited for this thread.

Thank you for sharing your knowledge and experience, OP.

  No.20053

>>20043

while a filesystem itself is meant to be kernel agnostic, letting user
programs use files involves a good amount of kernel-side work,
i.e. mapping files into an address space, managing vnodes

and this thread's purpose is to explain unix as an operating system
(including some parts of userland), not just as a kernel, so explaining
some filesystem might be fun.

thanks for the books! i'll definitely check out those books from time to time
as linux is not exactly my specialty.

>>20052

good to hear you are interested lainon! never shy away from asking
questions in this thread plz.




Right now I'm kind of busy but I'll start writing something this weekend
or sometime soon.

As for where to start, I want to try a more story-oriented narrative. That is,
I'll start from the question "what happens in your computer when
you execute the hello world program (from the c prog lang book) in your shell?"

By describing how the shell forks your program, I'll be able to explain how
the kernel reads the program, parses its elf information and loads it into a process
structure, manages threads, reads from the file system and writes to the tty
device in a general sense.
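
The first step of that story (the shell forking and running your program) can be sketched roughly as below. This is only a minimal illustration, not how any particular shell is written; `run_program` is a hypothetical helper name, and the sketch assumes a POSIX system.

```c
/* Sketch: how a shell-like parent runs a program (fork + execve + waitpid). */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* run_program is a hypothetical helper, not taken from any real shell.
 * Returns the child's exit status, or -1 on failure. */
int run_program(const char *path, char *const argv[])
{
    pid_t pid = fork();             /* duplicate the calling process */
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: ask the kernel to replace this process image.
         * This is where the kernel opens the file, parses the ELF
         * headers and maps the program into the address space. */
        execve(path, argv, environ);
        _exit(127);                 /* only reached if execve failed */
    }

    /* Parent: block until the child terminates, like a foreground job. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A caller would pass a NULL-terminated argument vector, e.g. `{ "sh", "-c", "exit 7", NULL }` with `/bin/sh` as the path, and get the child's exit status back.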

If this method works out as effective, we can extend our discussion to
the boot process, network demons, userland utilities and so on.

  No.20054

>>20053
daemon not demon ofc

but debugging tcp/ip stack is more about exorcism than engineering

  No.20056

>>20034
>Even if you are not a toolsmith kind of person, how hardware works and how unix abstracts those devices is a never-ending source of inspiration (believe me, some people have worked on one project for longer than some young lainons' lifetimes and they refuse to get tired of hacking on it).
UNIX doesn't provide robust abstractions for most hardware.
Take a terminal device as an example. Every program must know how to communicate with any terminal and this is largely delegated to the ncurses library.
A video screen is another example. The X programs provide the interface to these.

>If this thread gets traction, maybe we can start lain magazine series too.

The Lainzine already exists. I may have misunderstood you and you really mean a series of Lainzine articles.

>more like a general os internals wiki, but it's hard to teach|learn something solely from a wiki if you cannot explain what you don't know.

I'll start my own thread if I want to discuss operating systems, I think. This thread works for UNIX, but UNIX isn't a robust operating system one should take cues from.
Regardless, I don't want to start an argument, so I won't be posting in this thread aside from this. I simply wanted to get you thinking about several of the qualities you wrote about.

  No.20057

>>20056
I'm not arguing back, just clarifying things.

>robust abstraction

not saying it provides one to userland programs; that should be the corresponding userland library's job. I was talking about how device drivers are written, as I've spent most of my time writing them.

>lain magazine


yes, I meant a series of lainzine articles, not a whole new branch of articles just for the sake of one thread.

>is unix operating system


maybe I should have named the thread "how openbsd and linux implement posix" rather than using the general term unix, to avoid triggering people, but I'm here to discuss, not to define terms like academics.

  No.20069

This is great OP.
I too think that the process scheduler is a very interesting subject.
Also paging and virtual memory.

I've been meaning to dive into the OpenBSD kernel internals, but learning the ins and outs of a kernel is a daunting endeavor, so I'd appreciate any suggestions as to where I could start looking. I've been reading the 4.4BSD book, but looking at the sources I don't even know where to start.
One thing that I would like to do for learning is to write a simple module (are they still called modules in OpenBSD?), you know, a new system call or something.
Thanks for your help senpai

  No.20075

>>20036
It isn't. Use the tech namespace. Thanks !

  No.20084

>>20069

What part of openbsd implementation do you want to learn?

loadable kernel modules were never much of a thing in openbsd, unlike in linux or freebsd, and support for lkm was dropped some time ago.

if you want to implement your own system call, read the manual page:
http://man.openbsd.org/OpenBSD-5.1/syscall.9

tl;dr edit the syscall master file and implement the call wherever you think it most logically belongs.

  No.20091

I'm learning about systems programming so I'll be monitoring this thread.

Some big questions in my mind at the moment:

>Where are Linux system calls documented?


I mean the actual system calls, not C wrapper functions. The man pages only seem to care about glibc functions. There's a general system calls man page but it just explains the calling conventions and error handling and mentions that sometimes glibc preprocesses arguments and stuff. Why? When is that necessary? I'm interested in that kind of detail.

>Are system calls a stable interface or am I expected to always use glibc?


I don't want to use C to begin with. I want to build a runtime for myself, not use C's if I can avoid it.

>Why is Windows the only OS with I/O Completion Ports?


Nobody seems to care about POSIX AIO. Linux has a bunch of io_* system calls but the userspace libaio man pages say the kernel's implementation is crap and it's apparently been crap for years. What the hell? No wonder programs like libtorrent seem to have chosen the simpler thread pool implementation.

>Linux non-blocking I/O vs. kqueue


Not interested in flamewars or anything. I just want a thorough comparison of both systems' capabilities. For example, kqueue seems to be a rather unified mechanism, directly supporting signals and file system events, while Linux provides separate mechanisms such as signalfd and inotify.

>What exactly is eventfd for? How does one use it?


Documentation says you can use it to create an events channel between processes or the kernel and that it's cheaper than a pipe. I can't figure out how to use it. Apparently, you can send a fixed size integer, and sending multiple numbers apparently adds them together. How is that supposed to work?

>What's the one true way to handle signals?


The most sound approach I found was masking away all signals and then querying them using signalfd, but even that doesn't seem like a perfect approach. In particular, this results in child processes starting off with all their signals masked as well, likely breaking something.

>In particular, how does one properly handle program error signals such as SIG{SEGV,BUS,ILL,FPE}?


I'm writing a JIT compiler and I need to properly recover from those.

Apparently, I can't return normally from a SIGSEGV handler since it'd go right back to the instruction that caused the fault. I reason that the same restriction applies to the other three signals.

So what's the proper way to do it?

>How to use evdev?


I want to retrieve metadata about and capture input from human interface devices such as keyboard and mouse but also game controllers, joysticks and whatever other hardware might be plugged in. I don't want to depend on things like a running X server, I want the info straight from the kernel.

My research points me towards the evdev interface but documentation is rather obtuse. Also, what about keyboard layouts? Does evdev handle that somehow?

>How are libraries such as libdl implemented?

>What does dlsym do?

You said you'd discuss how the OS loads program images. Is that functionality exposed by the kernel somehow, or am I expected to load programs myself?

  No.20092

>>20091
Forgot one question:

>How do I secure a region of memory?


Suppose my program needs to handle data such as passwords or encryption keys. At the very least, I'd like to ensure this data stays in memory and doesn't end up in memory images and core dumps and such.

How can I tell the kernel to protect the memory? What other security features does the kernel provide?

  No.20093

>>20069
>I've been meaning to dive in the OpenBSD kernel internals

If you can't read it, you're biting off far more than you can chew in terms of OS design 101. Also start with more "straightforward" kernels, like linux 2.2 or so, or maybe even the numerous toy kernels people spam github with every day.

NetBSD code style is quite spartan and does far less hand holding compared to linux.

  No.20098

how about some easier stuff

first of all, a list of links would be great:


understanding the kernel; (kernel general)
what do i have to expect from a kernel; (kernel specific)
how to compile a kernel; (kernel in use basic)
apply known useful practices to kernel; (kernel in use advanced)
what are the implications of these practices; (kernels in use advanced general)
init system and the kernel; (kernel and process basic)
etc...

you see i don't even know what i don't know.. mostly in this section you get tutorials and little to no documentation/explanation..

cool thread though

  No.20099

Thanks for taking your time to answer:
>>20093
Actually, it's not that the source code is hard to read, I just don't know where to look. >>20084 made me realize that maybe I don't want to look into the kernel.
>>20084
>What part of openbsd implementation do you want to learn?
Yeah, that got me thinking.
I think I'm more interested in the stuff that can be seen from (and that directly affects) userland.
I don't know if this would be the thread for this, but low-level system maintenance, so to speak. Also building a custom tailored kernel.
I'm just going to look into the official documentation for this, but thanks for answering.

  No.20100

Could you tell me if UNIX provides a way to address inodes directly, rather than through a link? (Like xdg-open [inode number])

I find both hardlinks and softlinks lacking in terms of capability. If the target of a softlink is moved, then the softlink breaks, while the way many text editors, including emacs, write files breaks hardlinks: the saved file becomes a new file instead. (This is because, rather than writing to the file directly, they work in a temporary file and replace the original name with that file when you save, so the other link keeps pointing at the old inode.)
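
The hardlink behaviour described above can be reproduced with a short sketch. It assumes the editor saves by writing a temporary file and calling rename(2) over the original name, as the post says; the helper names and the `/tmp/hardlink-demo-XXXXXX` path are made up for illustration.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: create or truncate a file with the given text. */
static int write_file(const char *path, const char *text)
{
    FILE *f = fopen(path, "w");
    if (f == NULL)
        return -1;
    fputs(text, f);
    return fclose(f);
}

/* Hypothetical demo: returns 1 if the hard link still shares an inode
 * with the original name after an editor-style save, 0 if it does not,
 * and -1 on error. */
int link_survives_editor_save(void)
{
    char dir[] = "/tmp/hardlink-demo-XXXXXX";
    char orig[64], alias[64], tmp[64];
    struct stat a, b;
    int result = -1;

    if (mkdtemp(dir) == NULL)
        return -1;
    snprintf(orig, sizeof orig, "%s/notes.txt", dir);
    snprintf(alias, sizeof alias, "%s/alias.txt", dir);
    snprintf(tmp, sizeof tmp, "%s/notes.tmp", dir);

    if (write_file(orig, "v1\n") == 0 &&
        link(orig, alias) == 0 &&        /* second name for the same inode */
        write_file(tmp, "v2\n") == 0 &&  /* the editor writes a temp file... */
        rename(tmp, orig) == 0 &&        /* ...and renames it over the name */
        stat(orig, &a) == 0 && stat(alias, &b) == 0)
        result = (a.st_ino == b.st_ino);

    /* Clean up; errors are ignored in this sketch. */
    unlink(orig);
    unlink(alias);
    unlink(tmp);
    rmdir(dir);
    return result;
}
```

After the rename, `notes.txt` names a new inode while `alias.txt` still names the old one, so the two names no longer refer to the same file. An editor that instead truncates and rewrites the existing file in place would keep the link intact.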

Windows and OS X both have shortcuts that do not break when the original file is moved. Does Unix really not have an equivalent? And if so: how come they're behind in that area? It seems like quite a fundamental thing to me.

  No.20102

>>20099
>Actually, it's not that the source code is hard to read

If you can't navigate and understand the codebase, it's said you "can't read" the code.

Knowing english doesn't mean you'll make sense of a difficult book.

>>20092
>prevent sensitive data from being paged out
man mlock(2), and man setrlimit(2) (RLIMIT_CORE set to 0) to disable coredumps
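
A minimal sketch of those two calls, assuming a POSIX system; the helper names are hypothetical. Note that mlock() can fail if RLIMIT_MEMLOCK is too low, so real code should check its return value.

```c
#include <stddef.h>
#include <sys/mman.h>
#include <sys/resource.h>

/* Hypothetical helper: forbid core dumps for this process.
 * Returns 0 on success, -1 on failure. */
int disable_core_dumps(void)
{
    struct rlimit no_core = { 0, 0 };   /* soft and hard limit of 0 bytes */
    return setrlimit(RLIMIT_CORE, &no_core);
}

/* Hypothetical helper: pin the pages holding buf in RAM so they are
 * not written to swap. Returns 0 on success, -1 on failure. */
int lock_secret(void *buf, size_t len)
{
    return mlock(buf, len);
}

/* Hypothetical helper: wipe the secret, then allow paging again.
 * The volatile pointer keeps the compiler from dropping the wipe. */
void release_secret(void *buf, size_t len)
{
    volatile unsigned char *p = buf;
    for (size_t i = 0; i < len; i++)
        p[i] = 0;
    munlock(buf, len);
}
```

Typical use would be to call `disable_core_dumps()` early at startup, `lock_secret()` right after allocating the key buffer, and `release_secret()` as soon as the key is no longer needed.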

>>20100
> a way to address the inodes directly
No, it is filesystem specific and there's not much you can do from userspace. As a kernel driver, however, it's often possible to address a filesystem by inode (for example nfs servers do that, so they don't need to keep per-file client state).

>>20091

>Where are Linux system calls documented?


Syscalls are documented in manpages in section 2 (section 3 covers library functions). These typically translate directly to kernel syscalls 1:1, but not always.

>Are system calls a stable interface or am I expected to always use glibc?


Yes, they are guaranteed to stay stable between kernels (linus: "don't break userspace ever"), but they are not the "same" between architectures. For example the pipe() interface is architecture specific, as well as the handling of varargs. Some archs have the socketcall sub-api, some don't ...
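
To make the wrapper/syscall distinction concrete, here is a small sketch for Linux with glibc. It uses the generic syscall(2) entry point with numbers from `<sys/syscall.h>`; `raw_write` and `raw_getpid` are hypothetical names. syscall() is itself a libc function that performs the architecture-specific trap, so a runtime with no libc at all would need its own inline assembly for that step.

```c
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write(2) without going through the glibc
 * wrapper of the same name. SYS_write is the per-architecture
 * system call number. */
long raw_write(int fd, const void *buf, unsigned long len)
{
    return syscall(SYS_write, fd, buf, len);
}

/* Hypothetical helper: getpid(2) by number. */
long raw_getpid(void)
{
    return syscall(SYS_getpid);
}
```

On failure syscall() returns -1 and sets errno, which is a libc convention; the kernel itself reports errors as negative return values.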

>Why is Windows the only OS with I/O Completion Ports?

Linux IO is process/thread oriented deep inside kernel design (a block io request is ultimately always tied to a thread). On windows, all IO is a callback, which is why you can have APC for IOCP. That said, windows IO still massively underperforms compared to modern linux.

>Linux non-blocking I/O vs. kqueue

epoll and kqueue are architecturally comparable. on linux you can of course also get signals and inotify events via epoll (through signalfd and an inotify fd) - the difference is that linux kernel apis provide just basic building blocks, instead of opaque tools. this is most visible with iptables vs ipfw/pf.

bsd kqueue is unified interface, but internally it of course treats signals differently from block io or directory notify.

>eventfd

Closest relative would be CreateEvent. Both eventfd and createevent can be used to implement IPC semaphores. If you don't know how IPC events are useful, first ask about that.
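
For the "numbers get added together" part of the question, a minimal Linux-only sketch of the eventfd(2) counter semantics follows; the function name is hypothetical.

```c
#include <stdint.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical demo: post two increments, then drain the counter once.
 * Returns the value read, or 0 on error. */
uint64_t eventfd_sum_demo(void)
{
    uint64_t value = 0;
    uint64_t three = 3, four = 4;
    int efd = eventfd(0, 0);            /* kernel-side counter starts at 0 */
    if (efd < 0)
        return 0;

    /* Each write adds its 8-byte integer to the counter. */
    if (write(efd, &three, sizeof three) != (ssize_t)sizeof three ||
        write(efd, &four, sizeof four) != (ssize_t)sizeof four) {
        close(efd);
        return 0;
    }

    /* One read returns the accumulated total and resets the counter.
     * With the counter at 0, a blocking read would sleep until the
     * next write, which is what makes it usable as a wait primitive. */
    if (read(efd, &value, sizeof value) != (ssize_t)sizeof value)
        value = 0;
    close(efd);
    return value;
}
```

So it is a counter rather than a message channel: the reader learns how much was posted in total, not what each individual write contained. Passing the EFD_SEMAPHORE flag changes read() to return 1 and decrement the counter by one instead.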

>What's the one true way to handle signals?

I'm inclined toward windows/plan9 approach (queued packet of data). Unix bitmap model (and limited depth rtio) is unfortunate legacy we have to live with.

>Apparently, I can't return normally from a SIGSEGV handler since it'd go right back to the instruction that caused the fault. I reason that the same restriction applies to the other three signals.


Note that the only acceptable use for that is handling of null dereferences and COW GC. If you use it for something different, your design is seriously broken.

If you want to continue after handling fatal fault, you need an instruction decoder. You need it to figure out what exactly has happened in the first place. Note that this is usually not the case with things like null - you simply longjmp to upper stack frame exception handler (as null deref is simply always exception).

>evdev

Use libevdev, raw evdev ioctls are not guaranteed to be stable.

>How are libraries such as libdl implemented?

Read their code. Basically just mmap() and ton of elf format bookkeeping.

>You said you'd discuss how the OS loads program images. Is that functionality exposed by the kernel somehow, or am I expected to load programs myself?


Kernel is not involved in pure userspace things such as dlsym/libdl aside from providing mmap() and execve(). In particular, it does not deal with ELF symbols at all (unless we're talking about kernel modules).
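
As a taste of the "mmap() and ELF bookkeeping" that a userspace loader does, here is a very small sketch that maps a file and inspects only the first bytes of its ELF header. It assumes Linux/glibc's `<elf.h>`; `elf_class_of` is a hypothetical helper, and a real loader goes much further (program headers, relocations, symbol tables).

```c
#include <elf.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: map a file and report its ELF class
 * (ELFCLASS32 or ELFCLASS64), or -1 if it is not an ELF file. */
int elf_class_of(const char *path)
{
    int result = -1;
    struct stat st;
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    if (fstat(fd, &st) == 0 && st.st_size >= (off_t)EI_NIDENT) {
        /* Map the file read-only, as a loader does before parsing. */
        unsigned char *image = mmap(NULL, (size_t)st.st_size, PROT_READ,
                                    MAP_PRIVATE, fd, 0);
        if (image != MAP_FAILED) {
            /* e_ident starts every ELF file: magic, class, endianness... */
            if (memcmp(image, ELFMAG, SELFMAG) == 0)
                result = image[EI_CLASS];
            munmap(image, (size_t)st.st_size);
        }
    }
    close(fd);
    return result;
}
```

On Linux, calling it with `/proc/self/exe` inspects the running program's own image.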

  No.20104

ITT: we talk about "unix-like operating systems" as operating systems, not just kernels. As OP, my intention is to talk about unix programming in general (including the runtime), with a special focus on actual implementations as published by the vendors of open source operating system projects.

>>20098

observe that this is not just a "kernel hacking for gentoo" thread. I'll eventually provide additional learning material for fiddling with the kernel, but to be sure all lainons are on the same page, we'll write about generic "programming on unix" first and delve into the kernel whenever it surfaces.

>>20092
pretty much what >>20102 answered. I'll just mention that in the case of swap space, openbsd already encrypts it and the linux kernel can do it too (not sure which distros do it by default). I'm not sure what kind of project you are working on, but I'd read how libressl handles memory for more pointers on this subject.

>>20010
just things not covered in >>20102
>is system call stable?
On openbsd, nothing is guaranteed to be stable. The C library is supposed to mitigate this instability, but even those C APIs are subject to change (the library APIs themselves are quite stable these days tho). It's the program maintainer's choice to write whatever the fuaark they want, but don't advertise your project as portable when you invoke system calls directly instead of using posix APIs.

>loading program

observe that this is a mixture of machine-dependent and machine-independent steps. consult the following code to understand how the openbsd kernel loads a program based on its ELF info (i.e. how to load a shared library into the address space, how to compose a core file)

https://github.com/openbsd/src/blob/master/sys/kern/exec_elf.c

I can't come up with practical use cases for doing it the hand-crafted way unless you are bootstrapping.

nevertheless, if you want a general intro to the ELF file format and how executable files work, I recommend this page about creating a minimal elf program for linux.

http://www.muppetlabs.com/~breadbox/software/tiny/home.html

my overall impression is that you don't want to work on a unix variant at all. "library operating system" might be the keyword you are looking for.

  No.20105

File: 1478896683120.png (1.45 MB, 200x200, tutorial.pdf)


  No.20106

>>20104
*>>20100 not >>20010

  No.20108

File: 1478901759257-0.png (33.95 KB, 200x200, Advanced.Programming.in.the.UNIX.Environment.3rd.Edition.0321637739.pdf)

File: 1478901759257-1.png (6.22 MB, 67x118, DIFO.chm.gz)

>>20105
let's provide basic info of what reading material we are contributing.

Design and implementation of freebsd operating system : will introduce you to how bsd subsystems are designed. must-have reference material, used more like a dictionary than a textbook.

Advanced Programming in the Unix Environment : tbqh, with man pages, the c programming language book and this book, you should be able to write almost anything you want out of unix.

  No.20109

>>20108
Whenever I try to download the first pdf I just get a gray page, is anyone else unable to download it?

  No.20111

File: 1478908163288.png (19.65 MB, 200x200, Advanced.Programming.in.the.UNIX.Environment.3rd.Edition.0321637739.pdf)

>>20108
>>20109

sorry about that

  No.20112

>>20104
> I'll eventually provide additional learning material for fiddling with the kernel, but to be sure all lainons are on the same page, we'll write about generic "programming on unix" first and delve into the kernel whenever it surfaces.
What is your opinion of C being the de facto systems programming language and the API through which all generic unix programming must interface? Even Ken Thompson, one of the main inventors of unix, co-created the Go language to get rid of the problems that C causes programmers. Do you think C deserves to be the one and only language systems programmers use or should it be replaced with something like Rust?

  No.20119

>>20112
Hi, I am not OP but I just want to say
I think C will remain the de facto language for systems programming in the foreseeable future.
C is pretty much the standard language of UNIX. In principle, were one to adhere to ANSI C (which is rarely the case), there would be little to no modification needed to make a program work on all (unix) platforms.
Another reason is that you are citing two potential languages that can take the place of C. A single language would be desirable to maintain consistency. Having to know two languages and constantly discriminate between the two for maintenance (at roughly the same level) would just be cumbersome and likely cause confusion.
The documentation is entirely aimed at C programming. Having two languages also means twice as much documentation, plus having to rewrite it all and go through a slow transition, leaving us in a sort of python2 vs python3 situation. Badly.
Finally, I don't know either Rust or Go, but the interaction between C and the unix kernel (or any of its incarnations) is a very direct one, due to the "chicken and egg" situation in which they were conceived. This makes many, many things really easy for C programmers using external libraries, thanks to the API. Using a different language means extra work in the form of language bindings and whatnot, and there is also the problem of issuing system calls in a way that the kernel can understand without undue labor.
Even more, C programmers often want to roll their own version of some language features; a common example is making your own version of malloc(). Do Rust/Go provide for such things?
In sum, C is a very low-level language that integrates tightly with a very low-level operating system such as unix.

  No.20120

>>20102
>I'm inclined toward windows/plan9 approach (queued packet of data)

I meant to ask how to properly handle signals from the application side, not the kernel side. There are several approaches: signalfd, starting a thread dedicated to signal handling, etc. I don't understand which one is best.

I agree that a data packet queue would be much better. I wish I could simply opt out of this annoying system, but I can't just ignore them.

>Both eventfd and createevent can be used to implement IPC semaphores.


I see. So it is not a general "pass a small block of data to another process" mechanism?

>If you don't know how IPC events are useful, first ask about that.


Indeed, I don't know. Please clarify on that point.

>Note that the only acceptable use for that is handling of null dereferences and COW GC.


That's the idea. I also thought about handling arithmetic errors. If code emitted by my JIT generates a SIGILL, it probably means the compiler is buggy. I think it'd be useful to handle that condition just in case. Since it's asynchronous it shouldn't incur any performance penalties.

So the only way to handle these signals properly is to longjmp out of the signal handler?

>libevdev


Is there a detailed tutorial explaining how to use it? I've found a lot of example code but I don't really know what I'm doing.

>Basically just mmap() and ton of elf format bookkeeping.


Cool, I'll study that and see if I can recreate that functionality.

>>20104
>I'd read how libressl handles memory for more pointers on this subject

Will do.

>>20104
>don't advertise your project as portable when you directly invoke system call instead of posix APIs.

The thing is I want to make a specific implementation tailored to each platform. This is so I can learn about and use each platform's unique strengths. I don't plan to target POSIX

>I recommend this page about creating minimal elf program for linux


This is really good! Thanks!

>library operating system


I think exokernels are a cool idea but it'd be too much work for me to figure hardware out at this time. I'm just not smart enough.

For now I just want to talk to the kernel itself as directly as possible using as few dependencies/libraries/programming languages as possible. I think this is a good learning experience

  No.20121

>>20112
If your language can make system calls, then it is a systems programming language and just as powerful as C.

  No.20123

>>20112
Well, there's this
http://repo.cat-v.org/goblin/
>Goblin is a recreation from scratch of the traditional Unix and Plan 9 command line tools but this time built using the Go programming language.

  No.20135

There is a small os called xv6, implemented by MIT guys and based on UNIX; it's meant to be a teaching tool.
The entire source code, along with a book explaining how it works, is available for free if anyone wants to look..

https://pdos.csail.mit.edu/6.828/2014/xv6.html>>20043

  No.20136

>>20135
https://pdos.csail.mit.edu/6.828/2014/xv6.html

Fixed link

Also look into Lions' Commentary on unix v6

  No.20137

>>20120
>signalfd
Only if you can afford to "pull" signals - that works for things like SIGUSR and SIGRTXX, but it won't do for stuff like SIGSEGV, as those can't really be deferred (the execution context can't continue towards your event handler until you remove the fatal fault state).

Generally speaking, signal masking is an incredible mess - don't do it if you can avoid it. There's a lot out there written on the topic.

https://evbergen.home.xs4all.nl/unix-signals.html
https://lasr.cs.ucla.edu/vahab/resources/signals.html

>I see. So it is not a general "pass a small block of data to another process" mechanism?

>Indeed, I don't know. Please clarify on that point.

A semaphore is just that. Typically it's used as a conditional wait between processes (in-process, with threads, you can just use a pthread mutex).

Note that in traditional unix/posix, pipe() is used for the same purpose - waiting for a byte on a pipe can be a semaphore too - however that one is not counted.

Advanced semaphores can be counters which are used to signal the state of a task - how many sub-tasks the workers have processed, how many bytes are pending in a ring buffer and such...
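
The traditional pipe() variant of that wait/post pattern can be sketched as follows. It is a minimal illustration with a hypothetical function name: the child blocks in read() until the parent posts one byte.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: returns 0 if the child was woken by the parent's
 * byte, -1 otherwise. */
int pipe_wakeup_demo(void)
{
    int fds[2];
    if (pipe(fds) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        char token;
        close(fds[1]);
        /* Child sleeps in read() until a byte arrives: the "wait" half. */
        ssize_t n = read(fds[0], &token, 1);
        _exit(n == 1 ? 0 : 1);
    }

    close(fds[0]);
    /* ... the parent would do the work the child depends on here ... */
    ssize_t w = write(fds[1], "x", 1);  /* the "post" half */
    close(fds[1]);

    int status;
    if (w != 1 || waitpid(pid, &status, 0) < 0)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

If the parent closed the pipe without writing, the child's read() would return 0 (end of file) instead, so the child can tell "posted" from "the other side went away".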

>That's the idea. I also thought about handling arithmetic errors.

>So the only way to handle these signals properly is to longjmp out of the signal handler?

For unmaskable signals (SEGV/BUS/ILL/FPE...), the current state is corrupt and you have to uncorrupt it somehow. Typically you simply longjmp to the exception frame - this works because you simply unwind the whole stack up to the previous setjmp point (i.e. where your JIT compiled the try/catch{} statement). The signal frame is always on the current stack, don't use sigaltstack.

Or, you can simply remove the fatal condition - in case of moving/cow GC, mprotect the pages and/or redirect registers with the read/write to the proper gc zone; in case of SIGFPE or SIGILL you can emulate the instruction (e.g. if the CPU does not support it) and move the PC past it. In either case, you don't longjmp, but merely ensure that restarting at the same instruction will not fault again.

>Is there a detailed tutorial explaining how to use it? I've found a lot of example code but I don't really know what I'm doing.


Well, there are at least API docs https://www.freedesktop.org/software/libevdev/doc/latest/ ... I'm afraid you'll have to simply dissect the examples.

  No.20139

>>20137
>>20120

More about signals - it's generally a bad idea to use those for "normal" program operations; they should always be for unexpected things (null derefs). Signal handler invocations tend to be rather slow and often prone to race conditions between threads.

For this reason, queue mechanisms such as userfaultfd(2) emerged, which simply wake up a thread (and put the faulting one to sleep) - thread switching is cheaper than sighandler/sigreturn, and is free of races too.

  No.20140

>>20137
That's very interesting. The Commentary is also written by the same guy as >>20105 this book.
Thanks.
Which reminds me too of this which I found in some dark corner of the internet:
http://www.nordier.com/v7x86/
>port of UNIX* Version 7 to the x86 (IA-32) based PC
Mostly for the Tr00 UNIX Experience™

  No.20251

How would i go about creating a thumbnail video player, /lam/? For example, i have a porn folder with WEBMs and gifs, but if i were to sort through them i'd have to open them in VLC, find what i'm looking for, copy the name, search for the name in thunar and only then have my file.

I know there are alternatives to this, but i'm looking for a personal fun project, however i don't know how to approach this problem.

TL;DR
Instead of image thumbnail, a video thumbnail for webm files in linux.

  No.20772

Bumping a good thread.

>>20251
This is not the thread to ask, this is about unix internals. If you want video thumbnails in your file manager GUI (in this case, Thunar), you'd have to either:
* see if the program is configurable via a scripting language and if it is scoped to do what you want
* dive into the internals of the program, find the code that displays the thumbnails, extend it to do just what you want.
The beginner's general is better for this kind of question.
Have fun