
1. a) (3) The following instruction executes on an IA-32 single threaded scalar uniprocessor to input data from an I/O device:




IN EAX,60H



What kind of problem, if any, can this instruction cause if the processor uses a data cache?




b) (3) The following instruction executes on a MIPS single threaded scalar uniprocessor to output data to an I/O device:




sw $2,0x60($0)



What kind of problem, if any, can this instruction cause if the processor uses a data cache?





2. a) (2) If an exception occurs on a MIPS processor system, which register indicates the source or type of exception?



b) (2) If a trap occurs on a SparcV8 processor system, which register indicates the source or type of trap?








4. Suppose that the scalar pipelined MIPS processor had the ability (similar to that of the SparcV8) to annul the instruction in the branch delay slot. If the branch is not taken, what effect would this have (if any) on the pipeline control bits for the instruction in the branch delay slot?






5. Shown below is a RAID-DP (i.e., RAID6 with dual parity) disk system. D1, D2, D3 and D4 are the data disks. RP is the horizontal or row parity disk and DP is the diagonal parity disk.



For the purposes of this problem, each strip or block on a disk is a 4-bit pattern. The horizontal parity block for each row is the exclusive-OR of the four 4-bit data disk blocks within the row.


The coloring scheme indicates which data blocks comprise each diagonal. Data blocks that are the same color are exclusive-ORed together to obtain the corresponding diagonal parity block on the DP disk. Blocks can be identified by the combination of the disk name (D1, D2, D3, D4, RP or DP) on which the block resides and the block color ( blue, purple, white, red or orange). The purple block on D1 (D1_purple) is to be overwritten with the new 4-bit pattern shown in the diagram.




a) (3) What is the minimum number of blocks that must be read as a result of writing D1_purple (name each block that must be read)?



b) (4) What is the minimum number of blocks that must be written and what is the name and new 4-bit contents of each block that must be written?




6. (5) Show a sequence of MIPS instructions that have the same effect on the MIPS processor as the ARM instruction LDMIA R10!, {R1, R2, R5, R7} has on the ARM processor.
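The effect of LDMIA with base write-back can be modelled in a few lines. The following is a minimal sketch, not a definitive answer: it assumes memory is a dictionary keyed by word address, that ARM registers R1, R2, R5, R7 and R10 correspond to the MIPS registers with the same numbers, and that the base register is not in the load list. The function names are hypothetical, and the MIPS mnemonics in the comments are one possible sequence.

```python
def ldmia_writeback(regs, mem, base, reglist):
    """Model ARM 'LDMIA base!, {reglist}'.

    Registers are loaded in ascending register order from ascending
    word addresses, then the base register is advanced past the data.
    """
    regs = list(regs)
    addr = regs[base]
    for r in sorted(reglist):
        regs[r] = mem[addr]
        addr += 4
    regs[base] = addr  # write-back requested by the '!'
    return regs


def mips_style_sequence(regs, mem):
    """Model four word loads followed by a base update (assumed mapping)."""
    regs = list(regs)
    base = regs[10]
    regs[1] = mem[base + 0]    # lw    $1, 0($10)
    regs[2] = mem[base + 4]    # lw    $2, 4($10)
    regs[5] = mem[base + 8]    # lw    $5, 8($10)
    regs[7] = mem[base + 12]   # lw    $7, 12($10)
    regs[10] = base + 16       # addiu $10, $10, 16
    return regs


# Hypothetical memory contents and starting base address.
memory = {0x1000: 11, 0x1004: 22, 0x1008: 55, 0x100C: 77}
start = [0] * 16
start[10] = 0x1000

arm_result = ldmia_writeback(start, memory, base=10, reglist={1, 2, 5, 7})
mips_result = mips_style_sequence(start, memory)
```

Under these assumptions both models leave the same register contents, with register 10 advanced by 16 bytes.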






7. A certain system has 2GB (2147483648 bytes) of physical memory and employs a 128 KB (131072-byte) 4-way set associative cache with a line size of 256 bytes.



a) (3) how many sets are there in the cache?


Line size = 256 bytes; cache size = 128 KB (131072 bytes), 4-way set associative

Number of lines = 131072/256 = 512

Number of sets = 512/4 = 128


b) (3) how many memory blocks are there in the system?



c) (3) how many different memory blocks map to set 5 within the cache?
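The arithmetic for question 7 can be checked with a short script. This is a sketch under two assumptions that the question does not state outright: a memory block is the same size as a cache line, and a block maps to the set given by its block number modulo the number of sets, so every set receives the same number of blocks.

```python
PHYSICAL_MEMORY_BYTES = 2 * 1024**3   # 2 GB
CACHE_BYTES = 128 * 1024              # 128 KB
LINE_BYTES = 256
WAYS = 4

cache_lines = CACHE_BYTES // LINE_BYTES               # lines in the cache
num_sets = cache_lines // WAYS                        # part a)
memory_blocks = PHYSICAL_MEMORY_BYTES // LINE_BYTES   # part b), assumed block = line size
blocks_per_set = memory_blocks // num_sets            # part c), same for set 5 as any set

print(num_sets, memory_blocks, blocks_per_set)
```

With these figures the script gives 128 sets, 8388608 memory blocks, and 65536 blocks mapping to each set.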




8. The ST(0) register on an IA-32 processor contains the 80-bit internal extended precision floating point representation of the negative value -8.75. The IA-32 register EDI contains 0x403809B0 and the following IA-32 instruction is executed:




FSTP DWORD PTR [EDI + 4]



a) (4) List the hex contents of the ST(0) register prior to executing this FSTP instruction.



b) (3) List the hex address of each individual memory byte that is written by this FSTP instruction.



c) (4) List the hex contents of each individual memory byte that is written by the FSTP instruction.
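The encodings in question 8 can be worked out programmatically. The sketch below assumes a normal, nonzero value and a little-endian store, as on IA-32. The helper `x87_extended_hex` is a hypothetical name, not a library function. It builds the 80-bit pattern from a sign bit, a 15-bit exponent with bias 16383, and a 64-bit significand with an explicit integer bit.

```python
import math
import struct


def x87_extended_hex(value: float) -> str:
    """Return the 80-bit extended-precision encoding of a normal float as hex."""
    sign = 1 if value < 0 else 0
    # frexp gives abs(value) = mantissa * 2**exponent with 0.5 <= mantissa < 1.
    mantissa, exponent = math.frexp(abs(value))
    # Scale so that the leading 1 lands in bit 63 (the explicit integer bit).
    significand = int(mantissa * 2**64)
    biased_exponent = (exponent - 1) + 16383
    return f"{(sign << 15) | biased_exponent:04X}{significand:016X}"


EDI = 0x403809B0
target = EDI + 4  # effective address of [EDI + 4]

st0_hex = x87_extended_hex(-8.75)
stored = struct.pack("<f", -8.75)  # DWORD PTR means a 32-bit single-precision store
byte_map = {target + i: b for i, b in enumerate(stored)}

print(st0_hex)
for address, byte in byte_map.items():
    print(f"{address:08X}: {byte:02X}")
```

For -8.75 this gives C0028C00000000000000 for ST(0), and bytes 00, 00, 0C, C1 at addresses 0x403809B4 through 0x403809B7.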



9. A byte-addressable memory system contains four memory modules each of which is 32 bits wide by 2^28 cells deep. The system employs a 1 MB 2-way set associative cache with 128-byte cache lines. It also uses a 32-bit CPU-to-memory data bus as well as 32-bit physical addresses. Four clock cycles are required to complete all of the activity required for a read from a single memory module (one cycle to select the module and send the address, one cycle to decode the address, one cycle to perform the access, and one cycle to transmit the data.)



a) (3) Assuming that the memory system employs low order interleaving of cells, show a diagram of the proper 32-bit physical address format, including the width and position of each field. Also describe how each field is used.



b) (3) Assuming that the memory system employs high order interleaving of cells, show a diagram for the proper 32-bit physical address format, including the width and position of each field. Also describe how each field is used.



c) (3) What is the minimum number of clock cycles required to fill a cache line if the memory system uses low order interleaving?





d) (3) What is the minimum number of clock cycles required to fill a cache line if the memory system uses high order interleaving?
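The two address formats in question 9 can be illustrated by splitting an address into its fields. This is a sketch under stated assumptions, not a definitive answer: the low 2 bits select the byte within a 32-bit cell in both schemes, 2 bits select one of the four modules, and 28 bits select the cell within a module. The cycle counts further assume that different modules can overlap their accesses, that the bus delivers one word per cycle, and that a single module cannot start a new read until the previous one finishes. The function names are hypothetical.

```python
CELL_BITS = 28      # 2^28 cells per module
MODULE_BITS = 2     # four modules
BYTE_BITS = 2       # four bytes per 32-bit cell


def decode_low_order(addr):
    """Low-order interleaving: | cell (28) | module (2) | byte (2) |."""
    byte = addr & 0x3
    module = (addr >> BYTE_BITS) & 0x3
    cell = addr >> (BYTE_BITS + MODULE_BITS)
    return cell, module, byte


def decode_high_order(addr):
    """High-order interleaving: | module (2) | cell (28) | byte (2) |."""
    byte = addr & 0x3
    cell = (addr >> BYTE_BITS) & ((1 << CELL_BITS) - 1)
    module = addr >> (BYTE_BITS + CELL_BITS)
    return module, cell, byte


LINE_BYTES = 128
WORD_BYTES = 4
READ_CYCLES = 4
words_per_line = LINE_BYTES // WORD_BYTES

# Low order: consecutive words sit in different modules, so reads overlap
# and one word arrives per cycle after the first 4-cycle read.
low_order_fill = READ_CYCLES + (words_per_line - 1)

# High order: all words of a line sit in one module, so reads are serialized.
high_order_fill = words_per_line * READ_CYCLES

print(decode_low_order(0x14), decode_high_order(0x14))
print(low_order_fill, high_order_fill)
```

Under these assumptions the fill takes 35 cycles with low order interleaving and 128 cycles with high order interleaving.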






14. (5) Suppose that each of the 4 processors in a shared memory multi-processor system is rated at 400 MIPS. A program contains a purely sequential part that accounts for 22% of the program’s execution time on a single processor. The remaining code can be partitioned into three independent parts (A, B, and C). Running on a single processor, part A accounts for 30% of the program’s execution time, part B accounts for 18%, and part C accounts for 30%. What is the apparent MIPS rating for the program if it is run on the 4-processor system and the sequential part must be completed before any of the remaining independent parts (A, B or C) can run in parallel?
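The calculation for question 14 can be laid out as follows. This is a sketch that assumes each of the parts A, B and C runs on one processor and cannot be split further, so three of the four processors are busy during the parallel phase and its length is set by the longest part.

```python
PROCESSOR_MIPS = 400
sequential = 0.22
parallel_parts = {"A": 0.30, "B": 0.18, "C": 0.30}

# Fractions are relative to the single-processor execution time.
assert abs(sequential + sum(parallel_parts.values()) - 1.0) < 1e-9

# The sequential part runs first, then A, B and C run side by side.
parallel_time = sequential + max(parallel_parts.values())
speedup = 1.0 / parallel_time
apparent_mips = PROCESSOR_MIPS * speedup

print(f"time fraction = {parallel_time:.2f}, speedup = {speedup:.3f}, "
      f"apparent rating = {apparent_mips:.1f} MIPS")
```

With these numbers the program takes 52% of its single-processor time, for a speedup of about 1.92 and an apparent rating of about 769 MIPS.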



15. (5) Show the characteristic table for the following sequential circuit. Assume that at the beginning of each cycle the clock pulses from 0 to 1 and back to 0 quickly enough that the outputs only have time to change once.







Answered Same Day Dec 07, 2021


Aditi answered on Dec 08 2021
1.
a) IN EAX,60H
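On IA-32, IN reads from the separate I/O port address space rather than from memory, so the access bypasses the data cache. A data cache therefore causes no problem for this instruction.
b) sw $2,0x60($0)
MIPS uses memory-mapped I/O, so this store is an ordinary memory write to address 0x60. If that address is cacheable, a write-back data cache can hold the value in the cache instead of sending it to the device, so the output may be delayed or may never reach the device. The device's address range must be uncached (or the cache must write through) for the store to take effect.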
2.
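a) If an exception occurs on a MIPS processor system, the Cause register (coprocessor 0, register 13) indicates the source or type of exception through its exception code field. Registers $26 and $27 ($k0 and $k1) are only reserved for use by the exception handler.
b) On a SPARC V8 system, the trap type (tt) field of the Trap Base Register (TBR) indicates the source or type of trap. When the trap occurs, the program counters PC and nPC are also copied into registers r[17] and r[18] (local registers 1 and 2) of the trap's new register window.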
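4. The branch delay slot is a side effect of pipelined architectures due to the branch hazard. The branch is not resolved until it has worked its way through the pipeline, so a simple design would insert stalls after a branch instruction until the branch target address is computed and loaded into the program counter. With SPARC V8 style annulment, a conditional branch that is not taken annuls the instruction in its delay slot. The pipeline control bits for that instruction would therefore be forced to zero, as for a nop, so that it cannot write a register or memory.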
5.
a) A minimum of 3 blocks must be read: the old value of D1_purple, the row parity block on RP in the same row, and DP_purple.
Each parity block can then be updated as X_P' = X_P (XOR) X1 (XOR) X1', where X_P is the old value of either the row parity block or the diagonal parity block.
Here X1 is the old value of D1_purple and X1' is its new value.
b) The minimum number of blocks that must be written is also 3: D1_purple, DP_purple, and the row parity block on RP.
D1_purple will be 1001
DP_purple =...
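The parity update rule used in this answer can be sketched in code. The diagram is not reproduced here, so every old block value below is hypothetical and chosen only for illustration. Only the new D1_purple pattern 1001 comes from the answer above, and the function name `update_parity` is an assumption.

```python
def update_parity(old_parity, old_data, new_data):
    """Read-modify-write parity update: P' = P xor D_old xor D_new."""
    return old_parity ^ old_data ^ new_data


# Hypothetical 4-bit blocks in the same row as D1_purple (not from the diagram).
old_d1, d2, d3, d4 = 0b0110, 0b0011, 0b1100, 0b0101
old_row_parity = old_d1 ^ d2 ^ d3 ^ d4

new_d1 = 0b1001  # new D1_purple pattern given in the answer

# Update using only the three blocks that were read.
new_row_parity = update_parity(old_row_parity, old_d1, new_d1)

# Cross-check against recomputing the parity from the whole row.
recomputed = new_d1 ^ d2 ^ d3 ^ d4

print(f"{new_row_parity:04b} {recomputed:04b}")
```

The same function applies to the diagonal parity block, since it is also a plain exclusive-OR of the blocks on its diagonal. This is why only the old data block and the two old parity blocks need to be read.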