RSA Encryption Program with Assembly!!
by SirCalcsALot, Jul 16, 2020, 5:54 PM
Hello everyone!
I thought I would share the RSA encryption program I mentioned in a previous post. It is written in ARM assembly and can process RSA encryption of up to 400 bits, with a maximum key length of 31 bits. Because the key is so short, it is very insecure and can easily be broken with something like Wolfram Alpha. I designed this code as a final project for my engineering class on microprocessors and assembly language.
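As a rough illustration of why a key this short is weak, the Python sketch below recovers the private exponent from nothing but the public values `n = 1009199683` and `e = 17` that appear in the code later in this post. It is not part of the project. The function names are hypothetical, it assumes `n` is a product of two distinct primes, and plain trial division is just one simple way to factor a number of this size.

```python
from math import isqrt

def smallest_factor(n: int) -> int:
    """Return the smallest prime factor of n (n itself if n is prime)."""
    if n % 2 == 0:
        return 2
    for p in range(3, isqrt(n) + 1, 2):
        if n % p == 0:
            return p
    return n

def recover_private_exponent(n: int, e: int) -> int:
    """Factor n = p * q, then invert e modulo (p - 1) * (q - 1)."""
    p = smallest_factor(n)
    q = n // p
    phi = (p - 1) * (q - 1)
    return pow(e, -1, phi)  # modular inverse, needs Python 3.8+

n, e = 1009199683, 17       # public values from the listing in this post
d = recover_private_exponent(n, e)
print(d)                    # 415526641, the private exponent in the listing
```

The loop needs only about fifteen thousand trial divisions here, which is why a 31-bit modulus offers no real protection.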
Here are a few things about the code:
- Assembly uses something called registers to manipulate data. We have 12 registers we can use, which act as our twelve possible variables. Each register can hold a maximum of 32 bits, so in some places I needed two registers to work with 64-bit numbers.
- Since we only have 12 variables, sometimes we have to push data into the stack to save it for later when we need to access it again.
- The version of ARM assembly I was using doesn't have a modulo operator, so I had to implement my own.
- To perform the operation $a^b \bmod c$, I implemented a semi-fast algorithm. Initially I used a recursive approach, but I had to switch to an iterative approach due to memory constraints.
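The last two points can be sketched in Python. This is a reference model under some assumptions, not a line-by-line translation of the assembly: `umull` mimics the 32 x 32 -> 64-bit multiply that returns its result in two registers, `modulo` reduces that 64-bit value one bit at a time with shifts and subtractions, and `exp_mod` is an iterative square-and-multiply loop. The names mirror the listing, but the Python structure is only illustrative.

```python
MASK32 = 0xFFFFFFFF  # a register holds 32 bits

def umull(a: int, b: int) -> tuple:
    """Model a 32x32 -> 64-bit multiply; returns (low word, high word)."""
    product = (a & MASK32) * (b & MASK32)
    return product & MASK32, product >> 32

def modulo(lo: int, hi: int, c: int) -> int:
    """Reduce the 64-bit value hi:lo modulo c, bit by bit, MSB first."""
    remainder = 0
    for bit in range(63, -1, -1):
        # Pick the register that holds this bit, then extract the bit.
        word, shift = (hi, bit - 32) if bit >= 32 else (lo, bit)
        remainder = (remainder << 1) | ((word >> shift) & 1)
        if remainder >= c:      # remainder < 2c here, so one subtraction is enough
            remainder -= c
    return remainder

def exp_mod(a: int, b: int, c: int) -> int:
    """Iterative square-and-multiply: a**b mod c with no recursion."""
    if b == 0:
        return 1 % c
    x, y = a, 1
    while b > 1:
        if b & 1:               # odd exponent: fold one factor of x into y
            y = modulo(*umull(x, y), c)
            b -= 1
        x = modulo(*umull(x, x), c)
        b >>= 1
    return modulo(*umull(x, y), c)

# Cross-check against Python's built-in modular exponentiation.
assert exp_mod(65, 17, 3233) == pow(65, 17, 3233)
```

On real 32-bit registers the shift in `modulo` only stays in range when the modulus is below 2^31, which fits the 31-bit key limit mentioned above. The iterative loop needs a constant number of variables, whereas a recursive version would push a new stack frame for every halving of the exponent.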
I've attached the report I wrote up to submit for my project (I had to split it into pieces because of the max file limit on AoPS). The report talks a bit more about the algorithms used and shows some test cases. If you look carefully, you will notice that I cited Wikipedia in this report, since that is where I got one of my algorithms!

Since the code is over 100 lines, I've included it in a hide tag. Feel free to ask me any questions about the code or assembly in general. Assembly really takes you down to the lower levels of programming, and you need to think about things like:
- How functions work and how you return something from a function.
- How lists work (we actually need an address as well as the number of items in the list).
- How variables work. Some variables refer to actual values in memory. Others hold the address in memory where the data is stored.
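One way to picture the last two points is to treat memory as a flat array of bytes. The Python sketch below is a toy model: the addresses and helper names are made up, and it assumes little-endian byte order. In it, a "list" is nothing more than a base address plus an item count, and a variable can hold either a value or the address of one.

```python
memory = bytearray(64)              # toy RAM: 64 bytes at addresses 0..63

def store_bytes(addr: int, data: bytes) -> None:
    """Copy data into memory starting at addr."""
    memory[addr:addr + len(data)] = data

def load_halfword(addr: int) -> int:
    """Read 2 bytes starting at addr as a little-endian number."""
    return int.from_bytes(memory[addr:addr + 2], "little")

msg_addr = 16                       # hypothetical address: this variable holds an ADDRESS
msg_size = 10                       # this variable holds a VALUE (size in bytes)
store_bytes(msg_addr, b"ECE 118L\x00\x00")

# Nothing in memory marks where the list ends, so the count must travel with
# the address. Item i lives at base + (i << 1) because each item is 2 bytes.
halfwords = [load_halfword(msg_addr + (i << 1)) for i in range(msg_size >> 1)]
print([hex(h) for h in halfwords])  # ['0x4345', '0x2045', '0x3131', '0x4c38', '0x0']
```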
Click to see code
          AREA    RESET, DATA, READONLY
          EXPORT  __Vectors
__Vectors
          DCD     0x20001000            ; stack pointer value when stack is empty
          DCD     Reset_Handler         ; reset vector
          ALIGN

encrypted EQU     0x20000000            ; location of encrypted array
decrypted EQU     0x20000500            ; location of decrypted array

          AREA    MYDATA, DATA, READONLY
m         DCB     "ECE 118L", 0, 0      ; message with padding
mSize     DCD     10                    ; message size (bytes)
n         DCD     1009199683            ; modulus
e         DCD     17                    ; public exponent
d         DCD     415526641             ; private exponent
          ALIGN

          AREA    MYCODE, CODE, READONLY
          ENTRY
          EXPORT  Reset_Handler

Reset_Handler
_main     PROC                          ; start of main
          LDR     R4, =m                ; address of message array
          LDR     R2, =e                ; public exponent address
          LDR     R2, [R2]              ; public exponent value
          LDR     R3, =n                ; modulus address
          LDR     R3, [R3]              ; modulus value
          LDR     R6, =encrypted        ; address of encrypted array
          LDR     R7, =decrypted        ; address of decrypted array
          LDR     R8, =mSize            ; address containing message size
          LDR     R8, [R8]              ; value of message size (bytes)
          LSR     R8, #1                ; value of message size (half words)

          MOV     R5, #0                ; set encryption loop counter to 0
LoopE     LDRH    R1, [R4, R5, LSL #1]  ; get two characters to encrypt
          PUSH    {R1, R2, R3}          ; save values of R1, R2, R3
          BL      exp_mod               ; compute m^e mod n (encryption)
          POP     {R1, R2, R3}          ; restore values of R1, R2, R3
          STR     R0, [R6, R5, LSL #2]  ; store encrypted data as a word
          ADD     R5, #1                ; update loop counter
          CMP     R5, R8                ; check if loop counter < message size (half words)
          BLT     LoopE                 ; continue LoopE if true

          MOV     R5, #0                ; set decryption loop counter to 0
          LDR     R2, =d                ; private exponent address
          LDR     R2, [R2]              ; private exponent value
LoopD     LDR     R1, [R6, R5, LSL #2]  ; get encrypted word
          PUSH    {R1, R2, R3}          ; save values of R1, R2, R3
          BL      exp_mod               ; compute c^d mod n (decryption)
          POP     {R1, R2, R3}          ; restore values of R1, R2, R3
          STRH    R0, [R7, R5, LSL #1]  ; store decrypted data as half word
          ADD     R5, #1                ; update loop counter
          CMP     R5, R8                ; check if loop counter < message size (half words)
          BLT     LoopD                 ; continue LoopD if true

Done      B       Done                  ; dead loop
          ENDP                          ; end of main

; subroutine to compute a^b mod c
; inputs: R1 -> a, R2 -> b, R3 -> c
; output: R0 = R1^R2 mod R3
exp_mod   PROC                          ; begin process
          PUSH    {R4, R5, LR}          ; save R4, R5, and LR
          CMP     R2, #0                ; check if exponent is 0
          BNE     elseEM                ; if not, continue
          MOV     R0, #1                ; if true, return 1
          POP     {R4, R5, LR}          ; restore R4, R5, and LR
          BX      lr                    ; branch back to caller
elseEM    MOV     R4, #1                ; value of y
LoopEM    ADD     R5, R4, #0            ; move R4 into R5 (overwritten by the next line)
          AND     R5, R2, #1            ; get last bit of the exponent (R2)
          CMP     R5, #0                ; check if the exponent is even
          BNE     nxtEM                 ; if it is odd, go to nxtEM
          PUSH    {R1, R2}              ; save R1 and R2
          UMULL   R1, R2, R1, R1        ; compute (R1)^2 and store in R2:R1
          BL      Modulo                ; find (R2:R1) mod R3
          POP     {R1, R2}              ; restore R1 and R2
          ADD     R1, R0, #0            ; move R0 into R1
          LSR     R2, R2, #1            ; R2 = R2/2
          CMP     R2, #1                ; if R2 is 1 (or less)
          BLS     endEM                 ; end loop
          BL      LoopEM                ; otherwise, continue loop
nxtEM     PUSH    {R1, R2}              ; save R1 and R2
          UMULL   R1, R2, R1, R4        ; compute (x * y) and store in R2:R1
          BL      Modulo                ; find (R2:R1) mod R3
          POP     {R1, R2}              ; restore R1 and R2
          ADD     R4, R0, #0            ; y = (x * y) mod R3
          PUSH    {R1, R2}              ; save R1 and R2
          UMULL   R1, R2, R1, R1        ; compute (R1)^2 and store in R2:R1
          BL      Modulo                ; find (R2:R1) mod R3
          POP     {R1, R2}              ; restore R1 and R2
          ADD     R1, R0, #0            ; set x = x^2 mod R3
          SUB     R2, R2, #1            ; set loop counter to n-1
          LSR     R2, R2, #1            ; set loop counter to (n-1)/2
          CMP     R2, #1                ; if R2 is greater than 1
          BHI     LoopEM                ; continue LoopEM
endEM     PUSH    {R1, R2}              ; save R1 and R2
          UMULL   R1, R2, R1, R4        ; compute R1 * R4
          BL      Modulo                ; compute (R2:R1) mod R3
          POP     {R1, R2, R4, R5, LR}  ; restore R1, R2, R4, R5, and LR
          BX      lr                    ; branch back to caller
          ENDP                          ; end process

; subroutine to take R2:R1 mod R3
; inputs: R2 -> top half of number
;         R1 -> bottom half of number
;         R3 -> modulus
; output: R0 -> the remainder
Modulo    PROC                          ; begin process
          PUSH    {R4, R5, LR}          ; save R4, R5, LR
          MOV     R0, #0                ; initialize remainder as 0
          MOV     R4, #63               ; loop counter
LoopM     LSL     R0, #1                ; shift the remainder 1 bit to the left
          PUSH    {R0, R1, R2, R3}      ; save the number and divisor
          ADD     R3, R4, #0            ; move R4 into R3 (Nth bit)
          BL      get_bit               ; find Nth bit and store in R0
          ADD     R5, R0, #0            ; move Nth bit into R5
          POP     {R0, R1, R2, R3}      ; restore the number and divisor
          ADD     R0, R0, R5            ; add Nth bit to remainder
          CMP     R0, R3                ; if remainder >= divisor
          SUBHS   R0, R0, R3            ; subtract the divisor from the remainder
          SUBS    R4, R4, #1            ; subtract 1 from loop counter
          BGE     LoopM                 ; if loop counter >= 0, continue loop
          POP     {R4, R5, LR}          ; restore R4, R5, LR
          BX      lr                    ; branch back to caller
          ENDP                          ; end of process

; subroutine to find Nth bit of R2:R1
; inputs: R2 -> top half of number
;         R1 -> bottom half of number
;         R3 -> value of N
; output: R0 -> the Nth bit
get_bit   PROC                          ; start process
          PUSH    {R4, LR}              ; save R4, LR
          CMP     R3, #32               ; check if shift is >= 32 bits
          LSRLO   R1, R3                ; if not, shift the lower register by R3
          ANDLO   R0, R1, #1            ; and return last bit after shift
          SUBHS   R4, R3, #32           ; if so, subtract 32 from R3
          LSRHS   R2, R4                ; and shift the upper half by R4
          ANDHS   R0, R2, #1            ; return last bit after shift
          POP     {R4, LR}              ; restore R4, LR
          BX      LR                    ; branch back to caller
          ENDP                          ; end of process

          END                           ; end of program
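For anyone who wants to check the program's results without a simulator, the two loops in main can be modelled in a few lines of Python. This is a sketch rather than a translation: it uses Python's built-in `pow` in place of `exp_mod`, assumes little-endian half-words, and the function names are hypothetical. The constants are the ones from the data section of the listing.

```python
N, E, D = 1009199683, 17, 415526641     # n, e, d from the listing
MESSAGE = b"ECE 118L\x00\x00"           # message with padding, 10 bytes

def encrypt(message: bytes, e: int, n: int) -> list:
    """Encrypt two characters (one half-word) at a time into one word each."""
    if len(message) % 2:
        raise ValueError("pad the message to an even number of bytes")
    halfwords = (int.from_bytes(message[i:i + 2], "little")
                 for i in range(0, len(message), 2))
    return [pow(h, e, n) for h in halfwords]

def decrypt(words: list, d: int, n: int) -> bytes:
    """Decrypt each word back into a half-word and rebuild the message."""
    return b"".join(pow(w, d, n).to_bytes(2, "little") for w in words)

cipher = encrypt(MESSAGE, E, N)
assert all(w < 2**32 for w in cipher)   # every ciphertext fits in one 32-bit word
assert decrypt(cipher, D, N) == MESSAGE # round trip recovers the original bytes
```

Each 16-bit block grows into a 32-bit word because a ciphertext can be any value below the modulus, which is why the encrypted array is indexed with `LSL #2` while the message and decrypted arrays use `LSL #1`.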
I'll try to answer any questions you all might have on assembly, so feel free to ask if you want to know about something. Enjoy the code!

(Since I could only attach three files, the first appendix with the code is here and the second appendix with a longer test is here. Let me know if you have issues with these links.)
This post has been edited 3 times. Last edited by SirCalcsALot, Jul 16, 2020, 6:25 PM