ACA QB

10CS74: Advanced Computer Architecture QUESTION BANK

FUNDAMENTALS OF COMPUTER DESIGN

1. Explain the evolution of computer architecture. (10 marks)
2. Explain computer development milestones. (10 marks)
3. Explain the 7 dimensions of an Instruction Set Architecture (ISA) in detail. (10 marks)
4. Discuss Amdahl’s law. (5 marks)
5. Assume a disk subsystem with the following components and MTTF:
   o 10 disks, each rated at 1,000,000-hour MTTF
   o 1 SCSI controller, 500,000-hour MTTF
   o 1 power supply, 200,000-hour MTTF
   o 1 fan, 200,000-hour MTTF
   o 1 SCSI cable, 1,000,000-hour MTTF
   Using the assumptions that the lifetimes are exponentially distributed and that failures are independent, compute the MTTF of the system as a whole. (5 marks)
6. Find the number of dies per 300 mm (30 cm) wafer for a die that is 1.5 cm on a side, where the die area is 2.25 cm^2. (5 marks)
7. Find the die yield for dies that are 1.5 cm on a side and 1.0 cm on a side, assuming a defect density of 0.4 per cm^2 and α is 4. (5 marks)
8. Suppose we want to enhance the processor used for Web serving. The new processor is 10 times faster on computation in the Web serving application than the original processor. Assuming that the original processor is busy with computation 40% of the time and is waiting for I/O 60% of the time, what is the overall speedup gained by incorporating the enhancement? (5 marks)
9. Explain the processor performance equation. (10 marks)
10. Given an unpipelined processor with a 10 ns cycle time and pipeline latches with 0.5 ns latency, what are the cycle times of pipelined versions of the processor with 2, 4, 8 and 16 stages if the datapath logic is evenly divided among the pipeline stages? Also, what is the latency of each of the pipelined versions of the processor? (6 marks)
11. Suppose we have made the following measurements:
   o Frequency of FP operations = 25%
   o Average CPI of FP operations = 4.0
   o Average CPI of other instructions = 1.33
   o Frequency of FPSQR = 2%
   o CPI of FPSQR = 20
   Assume that the two design alternatives are to decrease the CPI of FPSQR to 2 or to decrease the average CPI of all FP operations to 2.5. Compare these two design alternatives using the processor performance equation. (6* marks)
12. Define computer architecture. Illustrate the seven dimensions of an ISA. (8* marks)
13. What is dependability? Explain the two main measures of dependability. (6* marks)
14. List and explain four important technologies that have led to improvements in computer systems. (7* marks)

15. The given data presents the power consumption of several computer system components: (7* marks)

   Component    Product               Performance   Power
   Processor    Sun Niagara 8-core    1.2 GHz       72-79 W
   DRAM         Kingston 1 GB         184-pin       3.7 W
   Hard drive   DiamondMax            7200 rpm      7.9 W read, 4.0 W idle

   i) Assuming the maximum load for each component and a power supply efficiency of 70%, what wattage must the server’s power supply deliver to a system with a Sun Niagara 8-core chip, 2 GB of 184-pin Kingston DRAM and 7200 rpm hard drives?
   ii) How much power will the 7200 rpm disk drive consume if it is idle roughly 40% of the time?
   iii) Assume that for the same set of requests, a 5400 rpm disk will require twice as much time to read data as a 10800 rpm disk. What percentage of the time would the 5400 rpm disk drive be idle to perform the same transaction as in part (ii)?
16. We will run two applications on a dual Pentium processor, but the resource requirements are not the same. The first application needs 80% of the resources, and the other only 20% of the resources. (6* marks)
   i) Given that 40% of the first application is parallelizable, how much speedup will we achieve with that application if it is run in isolation?
   ii) Given that 99% of the second application is parallelizable, how much speedup will this application observe if it is run in isolation?
   iii) Given that 40% of the first application is parallelizable, how much overall system speedup would you observe if we parallelized it?
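The Amdahl's law questions in this list (the Web-serving enhancement and parts i and ii of question 16) reduce to the same arithmetic, which can be sketched in Python. The function name `amdahl_speedup` and its parameter names are illustrative, not part of the question bank, and treating a dual processor as a 2x speedup of the parallelizable fraction is an assumption.

```python
def amdahl_speedup(fraction_enhanced, speedup_enhanced):
    """Overall speedup when only part of the execution time is improved."""
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)


# Web serving: computation is 40% of the time and becomes 10 times faster.
web_serving = amdahl_speedup(0.40, 10)

# Question 16 (i) and (ii), each run in isolation on two processors.
# Assumption: the parallelizable part speeds up by exactly 2.
first_app = amdahl_speedup(0.40, 2)
second_app = amdahl_speedup(0.99, 2)

print(f"web serving: {web_serving:.2f}")
print(f"first application: {first_app:.2f}")
print(f"second application: {second_app:.2f}")
```

The unenhanced fraction dominates in each case: even a 10x faster computation phase yields an overall speedup of only about 1.56 when 60% of the time is spent waiting for I/O.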

1. Give 2 significant changes in the computer marketplace that made it easier to be commercially successful with a new architecture.
2. Give the 2 effects of the dramatic growth rate in the 20th century.
3. Explain the different classes of computers.
4. Differentiate the desktop, server & embedded computing classes based on their 3 system characteristics.
5. Define computer architecture.
6. Explain Instruction Set Architecture with its 7 dimensions, using MIPS & 80x86 as illustrations.
7. Give the basic instruction & floating-point instruction formats for MIPS.
8. Explain the different instruction types or opcodes for MIPS64.
9. Discuss the terms organization & hardware with respect to the design of a computer architecture.
10. What functional requirements must be considered in architecture design?
11. List the goals, apart from functional requirements, that can be considered in architecture design.
12. Give the different trends in technology.
14. Explain the performance trends.

15. Define: i) computer architecture ii) hardware iii) organization iv) bandwidth v) latency/response time vi) feature size vii) dynamic power viii) static power ix) dynamic energy x) learning curve xi) change in yield
16. Explain the scaling of transistor performance.
17. Give the trends in power in ICs.
18. Which metric is used for devices that care more about battery life?
19. Some microprocessors are designed to have adjustable voltage, so that a 15% reduction in voltage may result in a 15% reduction in frequency. What would be the impact on dynamic power?
20. Why is static power also important for CMOS when dynamic power is the primary source of power dissipation?
21. Give the trends in cost.
22. Explain the impact of time, volume & commoditization on the cost of a manufactured component.
23. To understand the cost of current computers, one must understand the cost of chips. Justify this.
24. Give the cost of an IC and of a die, & show how to get the number of dies per wafer.
25. Find the number of dies per 300 mm wafer for a die that is 1.5 cm on a side.
26. What are die yield & wafer yield?
27. Find the die yield for dies that are 1.5 cm on a side & 1.0 cm on a side, assuming a defect density of 0.4/sq. cm & α is 4.
28. Give a short note on dependability.
29. Define the terms SLA & SLO.
30. Explain the measures of dependability.
31. Assume a disk subsystem with the following components & MTTF:
   • 10 disks, each rated at 1,000,000-hr MTTF
   • 1 SCSI controller, 500,000-hr MTTF
   • 1 power supply, 200,000-hr MTTF
   • 1 fan, 200,000-hr MTTF
   • 1 SCSI cable, 1,000,000-hr MTTF
   Using the simplifying assumptions that the lifetimes are exponentially distributed & that failures are independent, compute the MTTF of the system as a whole.
32. Disk subsystems often have redundant power supplies to improve dependability. Using the components & MTTFs from above, calculate the reliability of a redundant power supply. Assume one power supply is sufficient to run the disk subsystem & that we are adding one redundant power supply.
33. How can you measure the performance of a computer system?
34. Explain the different benchmarks used to measure computer performance.
35. How can you report & summarize performance results?
36. Define SPEC ratio. Show that the ratio of the geometric means is equal to the geometric mean of the performance ratios, & that the choice of reference computer for the SPEC ratio does not matter.
37. Explain the quantitative principles of computer design.
38. State Amdahl’s law & give the speedup in terms of performance & execution time.
39. What are the 2 factors on which Amdahl’s law depends?
40. Give the new execution time obtained by using the enhanced execution mode & also give the overall speedup.
41. Suppose that we want to enhance the processor used for Web serving. The new processor runs 10 times faster on computation in the Web serving application than the original processor. Assuming that the original processor is busy with computation 40% of the time & is waiting for I/O 60% of the time, what is the overall speedup gained by incorporating the enhancement?
42. State the law of diminishing returns as it follows from Amdahl’s law.
43. Explain the processor performance equation.
44. Define the terms: i) CPI ii) CPU time iii) clock cycle time iv) instruction count v) overall CPI vi) CPU clock cycles.
45. Suppose we have made the following measurements:
   • Frequency of FP operations = 25%
   • Average CPI of FP operations = 4.0
   • Average CPI of other operations = 1.33
   • Frequency of FPSQR = 2%
   • CPI of FPSQR = 20
   Assume that the 2 design alternatives are to decrease the CPI of FPSQR to 2 or to decrease the average CPI of all FP operations to 2.5. Compare these 2 design alternatives using the processor performance equation.
46. Give different pitfalls & fallacies in computer architecture design.
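The FPSQR comparison above applies the processor performance equation (CPU time = instruction count x CPI x clock cycle time) with instruction count and clock rate held fixed, so only CPI changes. A minimal Python sketch of that comparison follows; the helper name `overall_cpi` and the variable names are illustrative assumptions, and no rounding of intermediate values is applied.

```python
def overall_cpi(mix):
    """Overall CPI as the frequency-weighted sum of per-class CPIs."""
    return sum(frequency * cpi for frequency, cpi in mix)


# Measurements from the question: (frequency, CPI) per instruction class.
cpi_original = overall_cpi([(0.25, 4.0), (0.75, 1.33)])

# Alternative 1: FPSQR CPI drops from 20 to 2; FPSQR is 2% of instructions.
cpi_new_fpsqr = cpi_original - 0.02 * (20 - 2)

# Alternative 2: the average CPI of all FP operations drops to 2.5.
cpi_new_fp = overall_cpi([(0.25, 2.5), (0.75, 1.33)])

# With instruction count and clock rate unchanged, speedup is the CPI ratio.
speedup_fpsqr = cpi_original / cpi_new_fpsqr
speedup_fp = cpi_original / cpi_new_fp

print(f"original CPI: {cpi_original:.4f}")
print(f"speedup, faster FPSQR: {speedup_fpsqr:.2f}")
print(f"speedup, faster FP overall: {speedup_fp:.2f}")
```

Under these numbers, lowering the average CPI of all FP operations comes out slightly ahead of improving FPSQR alone, because FPSQR accounts for only 2% of the instructions.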

PIPELINING

1. Define pipelining with an example. Why is pipelining needed? (5 marks)
2. Differentiate RISC and CISC architectures. (5 marks)
3. Explain the implementation of a RISC instruction set in a pipeline. (5 marks)
4. Explain the implementation of a basic pipeline for MIPS with a neat diagram showing the datapath. (10 marks)
5. Give all the events, with the associated registers, on every pipe stage of the MIPS pipeline. (10 marks)
6. Derive the equation for finding the actual speedup from pipelining. (5 marks)
7. Assume that an unpipelined processor has a 1 ns clock cycle and that it uses 4 cycles for ALU operations and branches and 5 cycles for memory operations. Assume that the relative frequencies of these operations are 40%, 20% and 40% respectively. Suppose that, due to clock skew and setup, pipelining the processor adds 0.2 ns of overhead to the clock. Ignoring any latency impact, how much speedup in the instruction execution rate will we gain from a pipeline? (5 marks)
8. Explain the delayed branch scheme for reducing pipeline branch penalties. (5 marks)
9. Explain the different classes of pipeline hazards with examples. (10 marks)
10. How do exceptions make pipelining hard to implement? Discuss. (10 marks)
11. List the pipeline hazards. Explain any one in detail. (7* marks)
12. List and explain five different ways of classifying exceptions in a computer system. (7* marks)
13. An unpipelined machine has a 10 ns clock cycle and uses four cycles for ALU operations and branches, and five cycles for memory operations. Assume that the relative frequencies of these operations are 40%, 20% and 40% respectively. Suppose that, due to clock skew and setup, pipelining the machine adds 1 ns of overhead to the clock. Find the speedup from pipelining. (6* marks)
14. With a neat diagram, explain the classic five-stage pipeline for a RISC processor. (10* marks)
15. What are the major hurdles of pipelining? Illustrate branch hazards in detail. (10* marks)
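The two pipelining speedup problems in this list (1 ns clock with 0.2 ns overhead, and 10 ns clock with 1 ns overhead in question 13) follow the same calculation, sketched here in Python. The function `pipelining_speedup` is an illustrative name, and the sketch assumes an ideal pipeline that completes one instruction per cycle with no stalls.

```python
def pipelining_speedup(clock_ns, overhead_ns, mix):
    """Average unpipelined instruction time divided by pipelined cycle time.

    Assumes an ideal pipeline: CPI of 1 and no stall cycles.
    """
    avg_cycles = sum(frequency * cycles for frequency, cycles in mix)
    unpipelined_ns = clock_ns * avg_cycles
    pipelined_ns = clock_ns + overhead_ns
    return unpipelined_ns / pipelined_ns


# (frequency, cycles): ALU 40% x 4, branches 20% x 4, memory 40% x 5.
mix = [(0.4, 4), (0.2, 4), (0.4, 5)]

speedup_1ns = pipelining_speedup(1.0, 0.2, mix)    # 1 ns clock, 0.2 ns overhead
speedup_10ns = pipelining_speedup(10.0, 1.0, mix)  # 10 ns clock, 1 ns overhead

print(f"{speedup_1ns:.2f} {speedup_10ns:.2f}")
```

The average unpipelined instruction takes 4.4 cycles in both cases, so the difference between the two answers comes entirely from the relative size of the added clock overhead.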

1. Explain the term pipelining. What is the goal of a pipeline designer?
2. Pipelining yields a reduction in the average execution time per instruction. Justify this.
3. Give the characteristic properties of a basic RISC instruction set.
4. Give the instruction classes of a RISC architecture.
5. How is RISC implemented without pipelining?
6. Explain the classic 5-stage pipeline for a RISC processor, with a neat diagram.
7. Draw a neat datapath of a simplified RISC architecture in pipelined fashion & explain it.
8. What is the role of pipeline registers in a RISC architecture? Explain with a neat diagram.
9. Explain in short the basic performance issues in pipelining.
10. Consider the unpipelined processor. Assume that it has a 1 ns clock cycle and that it uses 4 cycles for ALU operations & branches and 5 cycles for memory operations. Assume that the relative frequencies of these operations are 40%, 20% and 40%, respectively. Suppose that, due to clock skew & setup, pipelining the processor adds 0.2 ns of overhead to the clock. Ignoring any other impact, how much speedup in the instruction execution rate will we gain from a pipeline?
11. What is a hazard? How will you classify it?
12. Discuss all the factors related to the performance of pipelines with stalls.
13. What are structural hazards? Show the conflict with a neat datapath diagram & a diagram of a pipeline stalled for a structural hazard.
14. Suppose that data references constitute 40% of the mix, and that the ideal CPI of the pipelined processor, ignoring the structural hazard, is 1. Assume that the processor with the structural hazard has a clock rate that is 1.05 times higher than the clock rate of the processor without the hazard. Ignoring any other performance losses, is the pipeline with or without the structural hazard faster, and by how much?
15. What are data hazards? Show with a neat diagram & an example.
16. Explain how forwarding (bypassing or short-circuiting) minimizes data hazard stalls, with a diagram.
17. Give an example, with a diagram, of a data hazard that cannot be handled by bypassing.
18. What do you mean by a pipeline interlock?
19. What are control/branch hazards? How many stall cycles do they cause in a 5-stage pipeline?
20. Explain the schemes used to reduce pipeline branch penalties.
21. Explain the performance of branch schemes.
22. MIPS is based on a 5-stage RISC pipeline scheme. Give its implementation for all 5 clock cycles.
23. With the help of a neat diagram, show the implementation of a MIPS datapath that allows every instruction to be executed in 4 or 5 clock cycles.
24. Explain, with a neat diagram, the MIPS pipeline with pipeline registers.
25. Give the events that take place at every stage of the MIPS pipeline.
26. Explain the implementation of control for the MIPS pipeline.
27. What is instruction issue? Also define load interlock.
28. How can you deal with branches in the MIPS pipeline? Explain with the help of a diagram.
29. Explain the implementation of forwarding logic with the help of a diagram.
30. Why is it difficult to implement a pipeline?
31. How can you deal with exceptions? Explain the different types of exceptions.
32. Classify the requirements on exceptions. / Give the characteristics of exceptions.
33. What steps does pipeline control take on an exception in order to save the pipeline state?
34. Explain stopping & restarting the execution of a pipelined instruction on an exception.
35. What are precise exceptions?
36. Explain exceptions in MIPS & how they can be handled.
37. Explain the instruction set complications for MIPS.
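For the structural hazard comparison in question 14 of this list, the average instruction time of each design is CPI x clock cycle time, and the two can be compared as a ratio. The Python sketch below assumes that every data reference costs exactly one stall cycle, which the question does not state explicitly; the function name and parameters are illustrative.

```python
def time_ratio_with_hazard(data_ref_fraction, clock_rate_ratio, stall_cycles=1):
    """Average instruction time with the hazard, relative to the ideal pipeline.

    data_ref_fraction: fraction of instructions that are data references.
    clock_rate_ratio:  clock rate with the hazard / clock rate without it.
    stall_cycles:      stall per data reference (assumed to be 1 here).
    """
    cpi_with_hazard = 1.0 + data_ref_fraction * stall_cycles
    # A higher clock rate shortens the cycle time by the same factor.
    return cpi_with_hazard / clock_rate_ratio


ratio = time_ratio_with_hazard(0.40, 1.05)
print(f"the pipeline with the hazard takes {ratio:.2f}x the time per instruction")
```

A ratio above 1 means the pipeline without the structural hazard is faster; with these inputs the 5% clock advantage does not make up for a CPI of 1.4.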

INSTRUCTION LEVEL PARALLELISM-1

1. Explain the concept of Instruction Level Parallelism. What challenges do we face in exploiting ILP?
2. What is data dependence? Explain with an example. What hazards can occur in a pipelined system because of data dependence?
3. What is name dependence? Explain with an example. What hazards can occur in a pipelined system because of name dependence?
4. Explain the concept of control dependence with an example. What problem may a pipeline suffer from if a control dependence is present?
5. When can we apply loop unrolling in order to exploit loop-level parallelism? Give a proper example.
6. With an example, show how loop unrolling combined with scheduling can minimize the CPI & hence has an advantage over loop unrolling or scheduling alone.
7. Explain in brief the static branch prediction scheme.
8. What are the different dynamic branch prediction schemes? Explain each with an example.
9. Differentiate between 1-bit & 2-bit dynamic branch prediction schemes.
10. What are correlating branch predictors, & what are their advantages over prediction schemes that use a branch prediction buffer?
11. Give a short note on tournament branch predictors.
12. Give an idea of dynamic scheduling.
13. How does dynamic scheduling take care of the different dependences, that is, data, name & control dependences?
14. What is the concept of register renaming? Explain with an example.
15. Explain the significance of reservation stations.
16. Justify the statement that “Hazard detection & execution control are distributed in dynamic scheduling.”
17. With a neat diagram, give the basic structure of a MIPS floating-point unit using Tomasulo’s algorithm & explain the steps an instruction goes through in this approach.
18. Explain the data structures associated with the reservation stations & the register file.
19. Explain in detail the steps of Tomasulo’s algorithm, assuming proper data structures.
20. Explain a loop-based example using Tomasulo’s algorithm.
21. What are the advantages & limitations of dynamic scheduling?
22. Explain the concept of the hardware-based speculation technique, with a neat diagram.
23. How are control dependences taken care of in hardware-based speculation?
24. What is the ROB & what is its significance?
25. How is a store operation implemented in hardware-based speculation?
26. Give the steps of the algorithm for hardware-based speculation.
27. What would be the baseline performance (in cycles, per loop iteration) of the code sequence in Figure 2.35, if no new instruction’s execution could be initiated until the previous instruction’s execution had completed? Assume that execution does not stall for lack of the next instruction, but only 1 instruction/cycle can be issued. Assume that the branch is taken & there is a 1-cycle branch delay slot.

   Loop: LD    F2,0(Rx)
         MULTD F2,F0,F2
         DIVD  F8,F2,F0
         LD    F4,0(Ry)
         ADDD  F4,F0,F4
         ADDD  F10,F8,F2
         SD    F4,0(Ry)
         ADDI  Rx,Rx,#8
         ADDI  Ry,Ry,#8
         SUB   R20,R4,Rx
         BNZ   R20,Loop

   Latencies beyond single cycle
   LD             +3
   SD             +1
   Int Add, Sub   +0
   Branches       +1
   ADDD           +2
   MULTD          +4
   DIVD           +10

28. In the above code, how many cycles would the loop body require if the pipeline detected true data dependences & only stalled on those? Show the code with stalls inserted where necessary to accommodate the stated latencies.
29. The following C code computes the sum of the entries in a 100-entry vector A.

   double arraySum = 0;
   for (int i = 0; i < 100; i++) {
       arraySum += A[i];
   }

   For the above code, give the equivalent MIPS code & unroll the loop by an unrolling factor of 4. Assume A[i] is in FP register F10, arraySum in FP register F8, the base address of A at 0(R1) and the loop count initialized to 100 in register R2. Take the load-to-FP-operation latency as 1 clock cycle. Find how many cycles an iteration takes after unrolling, & how many cycles it takes before unrolling.
30. For the following code sequence, show the reservation station status & register file status when the first instruction is in the write result state.

   ADD R2, R4, R0
   SUB R3, R6, R2
   ADD R5, R3, R2

31. What is Instruction Level Parallelism? (5 marks)
32. What are the possible data hazards? Explain them briefly. (5 marks)
33. Name the different types of data dependences. Explain each briefly with an example. (5 marks)
34. Discuss basic pipeline scheduling and loop unrolling with the help of an example. (10 marks)
35. Consider the MIPS code given below, which adds a scalar to a vector: (10 marks)

   Loop: L.D    F0, 0(R1)
         ADD.D  F4, F0, F2
         S.D    F4, 0(R1)
         DADDUI R1, R1, #-8
         BNE    R1, R2, Loop

   Show how the loop would execute on MIPS, computing the number of clock cycles for the following cases: a) with the loop scheduled, b) with the loop unrolled, c) with the unrolled loop after it has been scheduled. Assume an integer load latency of 1 and an integer ALU operation latency of 0, and the latencies of FP operations as follows:

   Instruction producing result   Instruction using result   Latency in clock cycles
   FP ALU op                      Another FP ALU op          3
   FP ALU op                      Store double               2
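For the question in this section that contrasts 1-bit and 2-bit dynamic branch prediction, a small simulation makes the difference concrete. The Python sketch below models a single saturating counter for one branch; the function name, the choice to start in the strongest "taken" state, and the loop pattern (taken 9 times, then not taken, repeated 3 times) are all illustrative assumptions rather than part of the question bank.

```python
def count_mispredictions(outcomes, bits):
    """Mispredictions of one n-bit saturating counter; True means taken."""
    max_state = (1 << bits) - 1
    threshold = 1 << (bits - 1)  # predict taken when state >= threshold
    state = max_state            # assumption: start in the strongest taken state
    misses = 0
    for taken in outcomes:
        if (state >= threshold) != taken:
            misses += 1
        # Move toward taken or not taken, saturating at both ends.
        state = min(state + 1, max_state) if taken else max(state - 1, 0)
    return misses


# A loop branch taken 9 times and then not taken, with the loop run 3 times.
outcomes = ([True] * 9 + [False]) * 3

one_bit = count_mispredictions(outcomes, bits=1)
two_bit = count_mispredictions(outcomes, bits=2)
print(one_bit, two_bit)
```

With this pattern the 1-bit scheme mispredicts both the loop exit and the first iteration of the next run, while the 2-bit scheme mispredicts only the exit, since a single not-taken outcome does not flip its prediction.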