project 1/520F21Project1-unlocked.pdf 1 of 5 CS 520 - Fall XXXXXXXXXXGhose Programming Project 1: Simulator for APEX with in-order issue DUE: Wednesday, October 20 by noon via Brightspace All demos to...

PFA


project 1/520F21Project1-unlocked.pdf

CS 520 - Fall 2021 - Ghose
Programming Project 1: Simulator for APEX with in-order issue
DUE: Wednesday, October 20 by noon via Brightspace
All demos to be completed by Wednesday, Oct. 27
**EARLY SUBMISSION IS ENCOURAGED**

This is a project that has to be done INDIVIDUALLY. DO YOUR OWN WORK. Soft copies of all documents are to be submitted via Brightspace by noon on Wednesday, Oct. 20. You also need to demonstrate your simulator to one of the TAs. Instructions for scheduling the demos will be posted on the course page later. Start working on this project as early as you can.

PROJECT DESCRIPTION

The project has two parts:

PART A: This project requires you to implement a cycle-by-cycle simulator for an in-order APEX pipeline with 5 pipeline stages, each with a delay of one cycle, as shown in the figure below.

Arithmetic and logical operations, as well as memory address calculations, are all executed in the EX stage. Likewise, memory accesses by load and store instructions are performed in MEM. The arithmetic operations supported are integer operations on 32-bit operands and include addition (ADD, ADDL instructions), subtraction (SUB, SUBL instructions) and multiplication (MUL instruction). Assume for simplicity that the two values to be multiplied have a product that fits into a single register.

Instructions issue in program order, and simple scoreboarding logic in the D/RF stage handles dependencies for consumer instructions. For PART A of this project, this pipeline has simple scoreboarding logic that stalls an instruction in the D/RF stage until the source registers of the instruction are available. Further, the processor has no forwarding mechanism.

For PART B, a forwarding mechanism is added to the simulator code developed for PART A.

Registers: Assume that there are 16 architectural registers, R0 through R15.
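The PART A stall rule described above can be sketched in a few lines of C. This is only an illustration under stated assumptions, not the code provided with the project: the `Scoreboard` type and the functions `sb_init`, `sb_sources_ready`, `sb_claim_dest` and `sb_release_dest` are hypothetical names, and the sketch assumes one valid bit per architectural register that is cleared when a producer issues and set again when the producer writes back.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_REGS 16 /* R0 through R15, as stated in the project text */

/* Hypothetical scoreboard: one valid bit per architectural register. */
typedef struct
{
    bool valid[NUM_REGS];
} Scoreboard;

/* Start with every register valid (no producer in flight). */
static void
sb_init(Scoreboard *sb)
{
    for (int i = 0; i < NUM_REGS; ++i)
    {
        sb->valid[i] = true;
    }
}

/* D/RF check: true when every used source register is valid.
 * Pass -1 for an operand the instruction does not use. */
static bool
sb_sources_ready(const Scoreboard *sb, int rs1, int rs2)
{
    assert(rs1 < NUM_REGS && rs2 < NUM_REGS);
    bool rs1_ok = (rs1 < 0) || sb->valid[rs1];
    bool rs2_ok = (rs2 < 0) || sb->valid[rs2];
    return rs1_ok && rs2_ok;
}

/* Called when a producer leaves D/RF: its destination becomes invalid. */
static void
sb_claim_dest(Scoreboard *sb, int rd)
{
    assert(rd >= 0 && rd < NUM_REGS);
    sb->valid[rd] = false;
}

/* Called from WB: the destination holds its new value and is valid again. */
static void
sb_release_dest(Scoreboard *sb, int rd)
{
    assert(rd >= 0 && rd < NUM_REGS);
    sb->valid[rd] = true;
}
```

In a decode stage built this way, a false result from `sb_sources_ready` would correspond to holding the instruction in D/RF for another cycle. The answer's listing appears to use its `reg_values` array for a similar purpose.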
The code to be simulated is stored in a text file with one ASCII string representing an instruction (in symbolic form, such as ADD R1, R4, R6 or ADDL R2, R1, #10) on each line of the file. Registers are 4 Bytes wide.

[Figure: the five pipeline stages in order: Fetch, D/RF, EX, MEM, WB]

Instruction Set: The instructions supported are:

• Register-to-register instructions: ADD, ADDL (add register with a literal), SUB, SUBL (subtract literal from a register), MOVC (move constant or literal value into a register), AND, OR, EX-OR and MUL. As stated earlier, you can assume that the result of multiplying two registers will fit into a single register.

• The MOVC instruction moves a signed literal value into the specified register. MOVC uses the EX stage to add 0 to the signed literal and updates the destination register from the WB stage.

• Memory instructions include the LOAD and the STORE: both include a literal value that is added to the contents of a register to compute the memory address.

• Two additional memory instructions are supported. These are as follows:
  o LDI
  o STI
  These instructions are like the LOAD and the STORE but, additionally, both increment the register used to compute the memory address by 4, after the original value of that register is read out in the D/RF stage. The actual update to the register takes place from the WB stage, like any other register update. A second write port is assumed to exist so that two register values can be updated simultaneously from the WB stage. A separate adder exists in the EX stage to perform the increment in parallel with the computation of the memory address by the ALU.

• The processor supports two flag values, the Z(ero) and P(ositive) flags, that are set by instructions that perform register-to-register arithmetic operations (specifically, by the ADD, ADDL, SUB, SUBL, MUL instructions).
  The values of the two flags are set only by a register-to-register arithmetic instruction when the instruction is in the EX stage, towards the end of the clock cycle. In other words, the flags are actually updated within the EX stage. This is quite different from register updates, which take place from the WB stage. The flags are set as follows:
  o The Z flag is set when the result of the arithmetic operation has the value zero.
  o The P flag is set when the result of the arithmetic instruction is greater than zero.

• Two pairs of conditional control flow instructions (that is, branch instructions) are supported, one pair dedicated to each flag. These instructions perform conditional branching based on the values of the Z and P flags and are as follows:
  o BZ: branch if the zero flag is set; BNZ: branch if the zero flag is not set.
  o BP: branch if the positive flag is set; BNP: branch if the positive flag is not set.
  All four branch instructions use PC-relative addressing and specify a signed literal. The address of the location to branch to (called the “branch target”) is computed in the EX stage by adding the signed literal to the address (PC-value) of the branch instruction. All target instructions begin at a 4-Byte boundary in memory. The decision whether or not to take the branch is made when the branch instruction is in the EX stage itself, towards the end of the clock cycle. If the branching condition is satisfied, all following instructions in the pipeline are dismissed (these instructions will be in the D/RF stage and the F stage). In the following cycle, the instruction to be fetched comes from the address computed in the previous cycle for the branch target. If the branching condition is tested and found not to hold, instruction execution continues as usual (and no instructions are dismissed from the earlier stages in the pipeline).
  “Dismissed” simply means that no further processing is done for these instructions, which pass through the remaining pipeline stages without invoking any operations within the stages. In effect, each dismissed instruction is a single-cycle bubble.

  As an example, consider the branch instruction BNP #-64. Assume that this instruction has the address 4096 (this is its PC value). When this BNP instruction is in the EX stage, the address of the target is computed by adding (-64) to 4096, giving the target address 4032. At the same time, the P flag is tested. If the P flag is NOT set, the instructions in the D/RF and F stages are flushed and the next instruction fetched by the F stage comes from the address 4032. If the P flag is set when the BNP is in the EX stage, the instruction physically following the one in the F stage is fetched. The dependency that the branch instruction has on the immediately prior arithmetic or CMP instruction (see below), and the flag the branch instruction has to check, need to be implemented correctly.

• A compare instruction, CMP, is added to compare the contents of two registers and set the Z and P flags based on the result of the comparison. Specifically, the Zero flag is set if the values stored in the two source registers are equal, and the Positive flag is set when the contents of the first source register are greater than the contents of the second.

• A HALT instruction is added to the ISA. As soon as a HALT instruction enters the D/RF stage, further instruction fetching is suspended and all prior instructions are processed. When the HALT instruction enters the WB stage, the simulator returns to the command prompt (see later).

• A NOP instruction, which does nothing as it proceeds through the pipeline.

• The HALT instruction stops instruction fetching as soon as it is decoded but allows all prior instructions in the pipeline to complete before returning to the command line for the simulator.
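The flag rules and the branch decision described in the list above can be sketched as follows. This is a minimal illustration, not the provided simulator code: the `Flags` and `BranchKind` types and the functions `flags_from_result`, `flags_from_cmp`, `branch_taken` and `branch_target` are hypothetical names introduced here, and the CMP helper assumes the comparison is "first source greater than second source".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical container for the two condition flags. */
typedef struct
{
    bool z; /* Z(ero) flag */
    bool p; /* P(ositive) flag */
} Flags;

/* Flags after an arithmetic instruction finishes in EX. */
static Flags
flags_from_result(int result)
{
    Flags f = { result == 0, result > 0 };
    return f;
}

/* Flags after CMP (assumption: P means first source > second source). */
static Flags
flags_from_cmp(int first, int second)
{
    Flags f = { first == second, first > second };
    return f;
}

typedef enum { BR_BZ, BR_BNZ, BR_BP, BR_BNP } BranchKind;

/* Branch decision made in EX from the current flag values. */
static bool
branch_taken(BranchKind kind, Flags f)
{
    switch (kind)
    {
        case BR_BZ:  return f.z;
        case BR_BNZ: return !f.z;
        case BR_BP:  return f.p;
        case BR_BNP: return !f.p;
    }
    return false;
}

/* PC-relative target: the branch's own PC plus the signed literal. */
static int
branch_target(int branch_pc, int literal)
{
    int target = branch_pc + literal;
    assert(target % 4 == 0); /* targets sit on a 4-Byte boundary */
    return target;
}
```

With the BNP example from the text, `branch_target(4096, -64)` gives 4032, and `branch_taken(BR_BNP, f)` is true whenever the P flag in `f` is not set. When the branch is taken, a simulator would additionally turn the instructions in F and D/RF into bubbles, which this sketch leaves out.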
Implementations of ADD, BZ, BNZ, HALT and LOAD are already provided. You are to implement the code for all other instructions described above.

The Simulated Instructions, Instructions and Data Memory:

The instruction memory starts at Byte address 4000. You need to handle target addresses of JUMP correctly - what these instructions compute is a memory address. However, all your instructions are stored as ASCII strings, one instruction per line in a SINGLE text file, and there is no concept of an instruction memory that can be directly accessed using a computed address. To get the instruction at the target of a BZ, BNZ or JUMP, a fixed mapping is defined between an instruction address and a line number in the text file containing ALL instructions:

• Physical Line 1 (the very first line) in the text file contains a 4 Byte instruction that is addressed with the Byte address 4000 and occupies Bytes 4000, 4001, 4002, 4003.
• Physical Line 2 in the text file contains a 4 Byte instruction that is addressed with the Byte address 4004 and occupies Bytes 4004, 4005, 4006, 4007.
• Physical Line 3 in the text file contains a 4 Byte instruction that is addressed with the Byte address 4008 and occupies Bytes 4008, 4009, 4010, 4011, etc.

The targets of all control flow instructions thus have to fall on a 4-Byte boundary. So when you simulate a BZ instruction whose computed target has the address 4012, you are jumping to the instruction at physical line 4 in the text file for the code to be simulated. Register contents and literals used for computing the target of a branch should therefore target one of the lines in the text file. Your text input file should also be designed so that the instructions at each target start on the appropriate line in the text file.

Instructions are stored in the following format in the text file, one per line:
Answered 4 days after Oct 09, 2021


Karthi answered on Oct 14 2021
apex_cpu_pipeline_simulator/apex_cpu.c
/*
* apex_cpu.c
* Contains APEX cpu pipeline implementation
*
* Author:
* Copyright (c) 2020, Gaurav Kothari
* State University of New York at Binghamton
*/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "apex_cpu.h"
#include "apex_macros.h"
extern int isForwarding;
extern int isDisplay;
extern int numOfCycles;
/* Converts the PC(4000 series) into array index for code memory
*
* Note: You are not supposed to edit this function
*/
static int
get_code_memory_index_from_pc(const int pc)
{
    return (pc - 4000) / 4;
}

/* Prints one instruction in symbolic form, based on its opcode */
static void
print_instruction(const CPU_Stage *stage)
{
    switch (stage->opcode)
    {
        /* Three-register formats: opcode, rd, rs1, rs2 */
        case OPCODE_ADD:
        case OPCODE_SUB:
        case OPCODE_MUL:
        case OPCODE_DIV:
        case OPCODE_AND:
        case OPCODE_OR:
        case OPCODE_XOR:
        case OPCODE_STR:
        case OPCODE_LDR: /* rs2 is a register, so it is printed as R<n>, not #<n> */
        {
            printf("%s,R%d,R%d,R%d ", stage->opcode_str, stage->rd, stage->rs1,
                   stage->rs2);
            break;
        }
        case OPCODE_MOVC:
        {
            printf("%s,R%d,#%d ", stage->opcode_str, stage->rd, stage->imm);
            break;
        }
        /* Register, register, literal formats: opcode, rd, rs1, #imm */
        case OPCODE_LOAD:
        case OPCODE_ADDL:
        case OPCODE_SUBL:
        {
            printf("%s,R%d,R%d,#%d ", stage->opcode_str, stage->rd, stage->rs1,
                   stage->imm);
            break;
        }
        case OPCODE_STORE:
        {
            printf("%s,R%d,R%d,#%d ", stage->opcode_str, stage->rs1, stage->rs2,
                   stage->imm);
            break;
        }
        case OPCODE_BZ:
        case OPCODE_BNZ:
        {
            printf("%s,#%d ", stage->opcode_str, stage->imm);
            break;
        }
        case OPCODE_CMP:
        {
            printf("%s,R%d,R%d", stage->opcode_str, stage->rs1, stage->rs2);
            break;
        }
        /* No operands */
        case OPCODE_HALT:
        case OPCODE_NOP:
        {
            printf("%s", stage->opcode_str);
            break;
        }
    }
}

/* Debug function which prints the CPU stage content
*
* Note: You can edit this function to print in more detail
*/
static void
print_stage_content(const char *name, const CPU_Stage *stage)
{
    if (isDisplay)
    {
        printf("Instruction at ");
        printf("%-35s (I%d: %d) ", name,
               get_code_memory_index_from_pc(stage->pc), stage->pc);
        print_instruction(stage);
        printf("\n");
    }
}

/* Prints a placeholder (such as "EMPTY") for a stage that holds no instruction */
static void
print_stage_content_empty(const char *name, const char *name2)
{
    if (isDisplay)
    {
        printf("Instruction at ");
        printf("%-35s %s", name, name2);
        printf("\n");
    }
}

/* Debug function which prints the register file
*
* Note: You are not supposed to edit this function
*/
static void
print_reg_file(const APEX_CPU *cpu)
{
    printf("\n");
    printf("\n");
    printf("--------------------------------");
    printf(" STATE OF ARCHITECTURAL REGISTER FILE");
    printf(" ----------------------------------------\n");
    printf("\n");
    printf("\n");
    for (int i = 0; i < REG_FILE_SIZE; ++i)
    {
        printf("REG[%-3d%s ", i, "]");
        printf(" value = %-4d", cpu->regs[i]);
        /* reg_values[i] == 0 means the register is waiting for a pending write */
        if (cpu->reg_values[i] == 0)
        {
            printf(" Status = %-4s", "NOT VALID");
        }
        else
        {
            printf(" Status = %-4s", "VALID");
        }
        printf("\n");
    }
    printf("\n");
    printf("\n");
    printf("\n");
    printf("--------------------------------");
    printf("STATE OF DATA MEMORY");
    printf(" ----------------------------------------\n");
    /* Only the first 100 data memory locations are displayed */
    for (int i = 0; i < 100; ++i)
    {
        printf("MEM[%-3d%s ", i, "]");
        printf(" DATA VALUE = %-4d", cpu->data_memory[i]);
        printf("\n");
    }
    printf("\n");
}

/*
* Fetch Stage of APEX Pipeline
*
* Note: You are free to edit this function according to your implementation
*/
static void
APEX_fetch(APEX_CPU *cpu)
{
    APEX_Instruction *current_ins;
    if (cpu->fetch.has_insn)
    {
        /* On a taken branch or a D/RF stall, load the fetch latch from the
         * current PC but do not advance the PC or pass anything to decode */
        if (cpu->fetch_from_next_cycle == TRUE || cpu->fetch_stall == TRUE)
        {
            cpu->fetch_from_next_cycle = FALSE;
            cpu->fetch.pc = cpu->pc;
            current_ins = &cpu->code_memory[get_code_memory_index_from_pc(cpu->pc)];
            strcpy(cpu->fetch.opcode_str, current_ins->opcode_str);
            cpu->fetch.opcode = current_ins->opcode;
            cpu->fetch.rd = current_ins->rd;
            cpu->fetch.rs1 = current_ins->rs1;
            cpu->fetch.rs2 = current_ins->rs2;
            cpu->fetch.imm = current_ins->imm;
            if (ENABLE_DEBUG_MESSAGES)
            {
                print_stage_content("Fetch", &cpu->fetch);
            }
            /* Skip this cycle */
            return;
        }

        /* Store current PC in fetch latch */
        cpu->fetch.pc = cpu->pc;

        /* Index into code memory using this pc and copy all instruction fields
         * into fetch latch */
        current_ins = &cpu->code_memory[get_code_memory_index_from_pc(cpu->pc)];
        strcpy(cpu->fetch.opcode_str, current_ins->opcode_str);
        cpu->fetch.opcode = current_ins->opcode;
        cpu->fetch.rd = current_ins->rd;
        cpu->fetch.rs1 = current_ins->rs1;
        cpu->fetch.rs2 = current_ins->rs2;
        cpu->fetch.imm = current_ins->imm;

        /* Update PC for next instruction */
        cpu->pc += 4;

        /* Copy data from fetch latch to decode latch */
        cpu->decode = cpu->fetch;
        if (ENABLE_DEBUG_MESSAGES)
        {
            print_stage_content("Fetch", &cpu->fetch);
        }

        /* Stop fetching new instructions if HALT is fetched */
        if (cpu->fetch.opcode == OPCODE_HALT)
        {
            cpu->fetch.has_insn = FALSE;
        }
    }
    else
    {
        print_stage_content_empty("Fetch", "EMPTY");
    }
}

/*
* Decode Stage of APEX Pipeline
*
* Note: You are free to edit this function according to your implementation
*/
static void
APEX_decode(APEX_CPU *cpu)
{
    if (cpu->decode.has_insn)
    {
        /* Read operands from register file based on the instruction type.
         * Cases with identical handling share one body. */
        switch (cpu->decode.opcode)
        {
            /* Two source registers: stall in D/RF until both are valid */
            case OPCODE_ADD:
            case OPCODE_SUB:
            case OPCODE_MUL:
            case OPCODE_DIV:
            case OPCODE_AND:
            case OPCODE_OR:
            case OPCODE_XOR:
            case OPCODE_CMP:
            case OPCODE_LDR:
            case OPCODE_STORE:
            {
                if (cpu->reg_values[cpu->decode.rs1] == 0 ||
                    cpu->reg_values[cpu->decode.rs2] == 0)
                {
                    cpu->fetch_stall = TRUE;
                    if (ENABLE_DEBUG_MESSAGES)
                    {
                        print_stage_content("Decode/RF", &cpu->decode);
                    }
                    return;
                }
                cpu->fetch_stall = FALSE;
                cpu->decode.rs1_value = cpu->regs[cpu->decode.rs1];
                cpu->decode.rs2_value = cpu->regs[cpu->decode.rs2];
                break;
            }
            /* One source register: stall in D/RF until it is valid */
            case OPCODE_ADDL:
            case OPCODE_SUBL:
            case OPCODE_LOAD:
            {
                if (cpu->reg_values[cpu->decode.rs1] == 0)
                {
                    cpu->fetch_stall = TRUE;
                    if (ENABLE_DEBUG_MESSAGES)
                    {
                        print_stage_content("Decode/RF", &cpu->decode);
                    }
                    return;
                }
                cpu->fetch_stall = FALSE;
                cpu->decode.rs1_value = cpu->regs[cpu->decode.rs1];
                break;
            }
            case OPCODE_MOVC:
            {
                /* MOVC doesn't have register operands */
                break;
            }
            /* Three registers are read: rs1, rs2 and rd */
            case OPCODE_STR:
            {
                if (cpu->reg_values[cpu->decode.rs1] == 0 ||
                    cpu->reg_values[cpu->decode.rs2] == 0 ||
                    cpu->reg_values[cpu->decode.rd] == 0)
                {
                    cpu->fetch_stall = TRUE;
                    if (ENABLE_DEBUG_MESSAGES)
                    {
                        print_stage_content("Decode/RF", &cpu->decode);
                    }
                    return;
                }
                cpu->decode.rs2_value = cpu->regs[cpu->decode.rs2];
                cpu->decode.rs1_value = cpu->regs[cpu->decode.rs1];
                cpu->decode.result_buffer = cpu->regs[cpu->decode.rd];
                cpu->fetch_stall = FALSE;
                break;
            }
        }

        /* Copy data from decode latch to execute latch */
        cpu->execute = cpu->decode;
        cpu->decode.has_insn = FALSE;
        if (ENABLE_DEBUG_MESSAGES)
...