Cairo is a practically-efficient Turing-complete STARK-friendly CPU architecture. In this article, we introduce the CPU architecture of Cairo in terms of instruction structure and state transition, and provide some examples of instruction. Instruction Structure The word natively supported by Cairo CPU is a field element, where the field is some fixed finite field of characteristic . P>2^63 Each instruction will occupy 1 or 2 words. If an immediate value([ap] = “12345678”)follows the instruction, the instruction will occupy 2 words, and the value will be stored in the second word. The first word of each instruction consists of the following elements: _dst[ bit0…15]: destination address offset, representative value off -2^{15} + ∑_{i = 0}^{15} b_i . 2^i ∈ [-2^{15}, 2^{15}); [ bit16…31]: op0 address offset, representative value off_op0 -2^{15} + ∑_{i = 0}^{15} b_i . 2^i ∈ [-2^{15}, 2^{15}); [ bit32…47]: op1 address offset, representative value off_op1 -2^{15} + ∑_{i = 0}^{15} b_i . 2^i ∈ [-2^{15}, 2^{15}); [ bit48]: Base register of destination address offset, ap or fp; dst reg [ bit49]: Base register of op0 address offset, ap or fp; op0 reg [ bit50…52]: address of op1, op1_src Case: 000 op1=m(op0+off_op1) Case: 001 op1=m(pc+off_op1) Case: 010 op1=m(fp+off_op1) Case: 100 op1=m(ap+off_op1) bit53…54]: computational logic res_logic Case: 00 res = op1 Case: 01 res = op0 + op1 Case: 10 res = op0 ∗ op1 [ bit55…57]: the update logic of pc pc_update Case: 000 // common next_pc = pc + instruction_size Case: 001 // absolute jump next_pc = res Case: 010 // relative jump next_pc = pc + res Case: 100 // conditional relative jump next_pc = pc + op1 (or instruction_size if dst = 0) [ bit58…59]: the update logic of ap ap_update Case: 00 next_ap = ap (or ap+2 if opcode = 1) Case: 01 next_ap = ap + res Case: 10 next_ap = ap + 1 [ bit60…62]: opcode types opcode Case: 000 // jmp Case: 001 // call Case: 010 // ret Case: 100 // assert State Transition Comment The state transition function represents a general state transition unit (because it contains the processing logic of all instruction types), and a calculation is usually decomposed into multiple continuously executed instructions. Therefore, we need to: ensure the content of the instruction and the validity of the states before and after the instruction is executed (e.g., passing a certain range check and state consistency check) and ensure that the executed instruction is valid. Transition Logic Comment If the state before and after the instruction execution is consistent, the update of the state must be executed according to the following logic: #Context: m(.). #Input state: (pc, ap, and fp). #Output state: (next_pc, next_ap, and next_fp). #Compute op0. If op0_reg == 0: // Judge the basic register of op0 according to the flag value, 0 is ap,1 is fp, and obtain the value of op0. op0 = m(ap + off_{op0}) else: op0 = m(fp + off_{op0}) #Compute op1 and instruction_size. switch op1_src: // Obtain the value of op1 case 0: instruction_size = 1 // op1 = m[[ap/fp + off_op0] +off_op1], with two-level imbedding at most. op1 = m(op0 + off_{op1}) case 1: instruction_size = 2 // There exist(s) immediate number(s). The instruction size is 2 words. op1 = m(pc + off_{op1})// #If off_{op1} = 1, we have op1 = immediate_value. // For example, [ap] = "12345678", op1 = "12345678" case 2: instruction_size = 1 // op1 = [fp + off_op1] op1 = m(fp + off_{op1}) case 4: instruction_size = 1 // op1 = [ap +off_op1] op1 = m(ap + off_{op1}) default: Undefined Behavior #Compute res. if pc_update == 4: // jnz if res_logic == 0 && opcode == 0 && ap_update != 1: // Assign && jump && advanced ap res = Unused // Under conditional jump, the values of res_logic, opcode and ap_update flags can only be as shown above.The res variable will not be used on this occasion. else: Undefined Behavior else if pc_update = 0, 1 or 2: switch res_logic: // The computational logic of res is: case 0: res = op1 case 1: res = op0 + op1 case 2: res = op0 * op1 default: Undefined Behavior else: Undefined Behavior # Compute dst. if dst_reg == 0: dst = m(ap + offdst) else: dst = m(fp + offdst) # Compute the new value of pc. switch pc_update: case 0: # The common case: [ap] = 1 next_pc = pc + instruction_size case 1: # Absolute jump: jmp abs 123 next_pc = res case 2: # Relative jump: jmp rel [ap - 1] next_pc = pc + res case 4: # Conditional relative jump (jnz): jmp rel [ap - 1] if [fp - 7] = 0 next_pc = if dst == 0: pc + instruction_size // If dst = 0, then pc conducts ordinary updates; otherwise, it is similar to Case 2. else: pc + op1 // Under this circumstance, res is Unused, so pc directly conducts + op1, instead of + res. default: Undefined Behavior # Compute new value of ap and fp based on the opcode. if opcode == 1: # "Call" instruction. assert op0 == pc + instruction_size assert dst == fp # Update fp. next_fp = ap + 2 // [ap] saves the current fp value, [ap + 1] saves the pc after the call instruction; when the ret instruction is executed, it is convenient to reset fp to [ap], and then continue to execute subsequent instructions. # Update ap. switch ap_update: case 0: next_ap = ap + 2 // [ap] saves the value of fp, and [ap+1] saves the instruction address after the call instruction; thus, ap increases by 2. default: Undefined Behavior else if opcode is one of 0, 2, 4: # Update ap. switch ap_update: case 0: next_ap = ap // [fp + 1] = 5 case 1: next_ap = ap + res // advanced ap [ap] += 123 case 2: next_ap = ap + 1 // normal [fp + 1] = [ap]; ap++ default: Undefined Behavior switch opcode: // Within the same function, the fp value remains unchanged. case 0: next_fp = fp case 2: # "ret" instruction. next_fp = dst // The ret instruction goes with the call instruction, and assert dst == fp. case 4: # "assert equal" instruction. assert res = dst next_fp = fp else: Undefined Behavior Instruction Verification As shown in Figure 1, one instruction consists of the following elements: structure instruction := (off_dst : bitvec 16) (off_op0 : bitvec 16) (off_op1 : bitvec 16) (flags : bitvec 15) Where: [ bit0…15], [ bit16…31], [ bit32…47] are in range off_dst off_op0 off_op1 -2^{15} + ∑_{i = 0}^{15} b_i . 2^i ∈ [-2^{15}, 2^{15}) But in AIR, we convert them into the following form: ˜off∗ = off∗ + 2^15 (Where * represents one of dst, op0 and op1) So the value range of should be [0,2^16) ˜off∗ Therefore, for the coding of an instruction, we have: Instead of allocating 15 virtual columns with a length of NN to the 15-bit flags of the instruction, instead, we use one virtual column. Note: =∑_{j=i}^{14} 2j−i⋅fj ˜f_i With a length of 16N, which meets the following requirements: ˜f_0 = ∑_{j=0}^{14} 2^{j-0}⋅f_j is a15-bit value ˜f_15 = 0 ˜f_i −2˜f_{i+1} = ∑_{j=i}^{14} (2^{j−i}⋅f_j) − 2⋅∑_{j=i+1}^{14}(2^{j−i−1}⋅fj) =∑_{j=1}^{14}(2j−i⋅fj) − ∑{j=i+1}^{14}(2j−i⋅fj)=fi Therefore, equation (1) can be written as: (2) inst = ˜off_dst + 2^16⋅˜off_op0 + 2^32⋅˜off_op1 + 2^48⋅˜f0 ∈ [0,263) Because the finite field’s characteristic , one instruction can only correspond to one valid field element. P > 2^63 Therefore, for the instruction itself, the following constraints should be met: Instruction: inst= ˜off_dst + 2^16⋅˜off_op0 + 2^32⋅˜off_op1 + 2^48⋅ f0 ˜ Bit: (˜f_i−2˜f_i +1)(˜f_i−2˜f_i+1 −1)=0 for all i ∈ [0,15) Last value is zero: ˜f_15=0 virtual column ∈ [0,2^16), so then ∈ [2^−15, 2^15) Offset are in range: ˜off∗ off∗ Instruction Examples Assert Equal The assert equal instruction can be expressed in the following syntax: <left_handle_op> = <right_handle_op> It ensures that both sides of the formula are equal; otherwise the execution of the program will be rejected. The left side of the equation often comes from [ ] or [ ], and the right side has some possible forms(reg_0 and reg_1 can be or , ∘ can be addition or multiplication, and imm can be any fixed field elements): fp+off_dst ap+off_dst fp ap imm [reg1+offop1][reg1+offop1] [reg0+offop0]∘[reg1+offop1][reg0+offop0]∘[reg1+offop1] [reg0+offop0]∘imm[reg0+offop0]∘imm [[reg0+offop0+offop1][[reg0+offop0+offop1] Division and subtraction can be expressed as multiplication and addition with different operand orders, respectively. Note2: An instruction can be considered as an assignment instruction, in which the value of one side is known, and the other side is unknown. assert For example, [ap]=4[ap]=4 can be considered as an assertion that the value of [ap] is 4, or as an assignment setting [ap] to 4, according to the context. Figure 4 shows some examples of “assert equal” instructions and the flag values for each instruction: instruction [fp + 1] = 5: Interpret assert instruction => opcode = 4 next_ap = ap => ap_update = 00 = 0 next_pc = pc + instruction_size => pc_update = 000 = 0 For op0 and op1, there is no add or mul => res_logic(res) = 00 = 0 There exists an immediate value => op1_src(op1) = 001 = 1 The immediate value address instructs that the addresses are adjacent => off_op1 = 1 The left side of equation [fp + 1] => dst_reg(dst) = 1 The left side of equation [fp + 1] => off_dst = 1 op0_reg/ off_op0 => inital value(1/-1) //Because these flags are not used in this instruction, the default value is filled in. Conditional and Unconditional Jump Comment The instruction allows changing the value of the program counter . jmp pc Cairo supports relative jump (its operand represents the offset from the current pcpc)and absolute jump - represented by keywords and respectively. rel abs The instruction may be conditional. For example, when the value of a memory unit is not 0, the instruction will be triggered. jmp jmp The syntax of the instruction is as follows: #Unconditional jumps. jmp abs <address> jmp rel <offset> #Conditional jumps. jmp rel <offset> if <op> ! Figure 5 shows some examples of the instructions and the corresponding flag values of each instruction: jmp instruction jmp rel [ap +1] + [fp - 7]: Interpret jmp instruction => opcode = 0 next_ap = ap => ap_update = b00 = 0 next_pc = pc + res=> pc_update = b010 = 2 res = op0 + op1 => res_logic(res) = b01 = 1 op1: [fp - 7] => op1_src(op1) = b010 = 2 op1: [fp - 7] => off_op1 = -7 op0: [ap + 1] => op0_src(op0) = 0 op0: [ap + 1] => off_op0 = 1 dst_reg/ off_dst => inital value(1/-1) ///Because these flags are not used in this instruction, the default value is filled in. and call ret The and instructions allow the implementation of the function stack. The instruction updates the program counter( )and frame pointer( )registers. call ret call pc fp The update of the program counter is similar to the instruction. The previous value of is written to [ ], to allow the instruction to reset the value of to the value before the call; similarly, the returned pcpc (the address of the instruction after the call instruction) is written to [ ], to allow the instruction to jump back and continue the execution of the code after the call instruction. jmp fp ap ret fp ap+1 ret As two memory units were written, advanced by 2 and is set to the new . ap fp ap The syntax of the instructions are as follows: call abs <address> call rel <offset> ret Figure 6 shows some examples of and instructions and flag values corresponding to each instruction: call ret instruction call abs [fp + 4]: Interpret call instruction => opcode = 0 next_ap = ap => ap_update = b00 = 0 next_pc = res => pc_update = b001 = 1 res = op1 => res_logic(res) = b00 = 0 op1: [fp + 4] => op1_src(op1) = b010 = 2 op1: [fp + 4] => off_op1 = 4 op0_reg/ off_op0 => inital value(0/1) ///Because these flags are not used in this instruction, the default value is filled in. dst_reg/ off_dst => inital value(0/0) ///Because these flags are not used in this instruction, the default value is filled in. Advancing ap The instruction increments the value of by the given operand. ap += <op> ap Comment Figure 7 shows some examples of the advancing instructions and the corresponding flag values of each instruction: ap instruction ap += 123: Interpret advancing ap instruction => opcode = 0 next_ap = ap + res => ap_update = b01 = 1 next_pc = pc + instruction_size => pc_update = b000 = 0 res = op1 => res_logic(res) = b00 = 0 op1 = 123 => op1_src(op1) = b001 = 1 op1 = 123 => off_op1 = 1 op0_reg/ off_op0 => inital value(1/-1) ///Because these flags are not used in this instruction, the default value is filled in. dst_reg/ off_dst => inital value(1/-1) ///Because these flags are not used in this instruction, the default value is filled in.