archive provided by lainchan.jp

lainchan archive - /sec/ - 2563



File: 1479784872418-0.png (312.41 KB, 213x300, 23.png)

File: 1479784872418-1.png (259.7 KB, 213x300, 180.png)

File: 1479784872418-2.png (310.65 KB, 213x300, 188.png)

No.2563

This is the marked thread for machine code programming and examination. This necessarily encompasses reverse engineering and security to a degree.

Feel free to discuss the nuances and advantages of your preferred architectures and assemblers, along with all else relevant.

In particular, the writing and dissemination of minimal machine code routines is encouraged. Obfuscation and protection techniques are also, of course, relevant.

  No.2565

File: 1479789430238-0.png (131.02 KB, 200x200, armref.pdf)

File: 1479789430238-1.png (226.46 KB, 200x200, QRC0001_UAL.pdf)

a) this definitely belongs on >>>/λ/

b) here, have the most useful two documents I've ever read on the ARM architecture.

It seems like most modern compilers are barely able to make the best use of the really interesting parts of registers... if C implementations on ARM did all memory operations in terms of words not bytes, for instance, you could probably get a 2-3x speedup.
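To make the bytes-versus-words idea concrete, here is a small C sketch rather than a benchmark. It computes the same XOR checksum twice, once with a load per byte and once with a 32-bit load per four bytes. The function names `xor8_bytes` and `xor8_words` are made up for illustration, and nothing here says how large the difference would be on any particular ARM core.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Baseline: one load per byte. */
uint8_t xor8_bytes(const uint8_t *p, size_t n)
{
    assert(p != NULL || n == 0);
    uint8_t x = 0;
    for (size_t i = 0; i < n; i++)
        x ^= p[i];
    return x;
}

/* Same result, but one 32-bit load per four bytes. */
uint8_t xor8_words(const uint8_t *p, size_t n)
{
    assert(p != NULL || n == 0);
    uint32_t acc = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        uint32_t w;
        memcpy(&w, p + i, sizeof w);   /* alignment-safe word load */
        acc ^= w;                      /* four byte lanes XORed at once */
    }
    /* Fold the four lanes into one byte. XOR does not care about
       lane order, so the result is the same on either endianness. */
    uint8_t x = (uint8_t)(acc ^ (acc >> 8) ^ (acc >> 16) ^ (acc >> 24));
    for (; i < n; i++)                 /* leftover tail bytes */
        x ^= p[i];
    return x;
}
```

Both functions return the same value for any buffer; the word version simply issues roughly a quarter as many loads in the main loop.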

  No.2566

>>2565
s/registers/ISAs

  No.2579

>>2565
>a) this definitely belongs on >>>/λ/
Treat it like a reverse engineering thread then.

>b) here, have the most useful two documents I've ever read on the ARM architecture.

Thanks. The ARM website wants me to become a member before I can access documentation. One would think they would be more welcoming of developers.

>It seems like most modern compilers are barely able to make the best use of the really interesting parts of ISAs... if C implementations on ARM did all memory operations in terms of words not bytes, for instance, you could probably get a 2-3x speedup.

I believe it was SPARC that was specifically designed for dumb C compilers to take advantage of, but this is why much of RISC exists.
Interesting, though not surprising, that it's the higher-level languages that are able to take better advantage of the hardware.

  No.2582

File: 1479863706098.png (5.36 MB, 200x200, DUI0801G_armasm_user_guide.pdf)

>>2579
ARM's official documentation is okay too... it has info on NEON and FPU instructions, which is nice. Don't use their syntax when using gas.

  No.3008

So, to get us started on messing with efficient machine code fragments, I give you this arbitrary length integer addition routine written in MIPS III (R4000):

nadd:   lw    $6, 0($3)
        lw    $7, 0($4)
        addu  $5, $6, $7
        sw    $5, 0($4)
        srl   $5, $5, 32
        addiu $3, $3, 4
        addiu $4, $4, 4
        addiu $2, $2, -1
        bne   $0, $2, nadd
        jr    $31

Arbitrary-length numbers are represented as vectors of thirty-two-bit words, least significant word first; the number of words is the length. A pointer to one number is passed in register three; the second summand, which is also the destination, is pointed to by register four. The length of the shorter number is in register two and must not be zero. Registers six and seven are clobbered. Register five is clobbered and holds a single bit if overflow occurred. Register thirty-one holds the return address that the routine jumps to once it is finished. All but register thirty-one can easily be changed.

I read in a thirty-two-bit word from each number, add them using bit thirty-three as a carry flag, store the result in the destination location, then shift the carry flag down into bit one and repeat until the addition is done.
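For checking an assembly version against known answers, the scheme can be modelled in C. This is only a sketch: the name `nadd_model` and the C signature are made up here, and it assumes the carry out of each word is meant to be added into the next word.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Adds the n-word number at src into dst, least significant word first.
   Returns the carry out of the top word (0 or 1). */
uint32_t nadd_model(const uint32_t *src, uint32_t *dst, size_t n)
{
    assert(n > 0);                 /* the routine requires a nonzero length */
    uint64_t acc = 0;              /* plays the role of register five */
    for (size_t i = 0; i < n; i++) {
        /* carry-in plus both words; bit 32 of acc is the new carry */
        acc += (uint64_t)src[i] + dst[i];
        dst[i] = (uint32_t)acc;    /* store the low thirty-two bits */
        acc >>= 32;                /* move the carry down to bit 0 */
    }
    return (uint32_t)acc;
}
```

Adding 1 to a two-word number whose words are both 0xFFFFFFFF exercises the carry chain end to end: both result words become zero and the returned carry is 1.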

I have actual MIPS hardware that I could test this with, but testing machine code routines is so frustrating on UNIX that I've merely looked this over a few times. I don't think there are any flaws, but I could be wrong.

Regardless, feel free to improve on this and point out any mistakes I may have made.

  No.3074

>>3008
Well, I made a few mistakes I've corrected now, but that's an excuse for more discussion, at least.
I also targeted MIPS64 this time:
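nadd:   or    $5,$0,$0       # start with no carry
loop:   lwu   $6,0($3)       # zero-extended word of the first number
        lwu   $7,0($4)       # zero-extended word of the second number
        daddu $6,$6,$7       # 64-bit sum of the two words
        daddu $5,$5,$6       # add the carry from the previous word
        sw    $5,0($4)       # store the low thirty-two bits
        addiu $3,$3, 4
        dsrl  $5,$5,31
        addiu $4,$4, 4
        dsrl  $5,$5, 1       # two shifts total thirty-two: carry now in bit 0
        addiu $2,$2,-1
        bnezc $2, loop       # compact branch, no delay slot
        add   $0,$0,$0       # keeps jr out of the forbidden slot
        jr    $31
        add   $0,$0,$0       # delay slot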
I should really give myself a decent testing environment for these routines.