What is pipelining and how can we increase throughput using
pipelining?




Answer / ankit

Pipelining is a technique used to improve the execution throughput of a CPU by using the processor resources in a more efficient manner.

The basic idea is to split the processing of each instruction into a series of small, independent stages. Each stage performs one part of the instruction. At a very basic level, these stages can be broken down into:

Fetch Unit: Fetch an instruction from memory
Decode Unit: Decode the instruction to be executed
Execute Unit: Execute the instruction
Write Unit: Write the result back to a register or memory
[Figure: CPU pipelining diagram, http://static.digitalinternals.com/wp-content/uploads/2009/02/pipelining.png]

There will be a dedicated CPU module for each of the stages mentioned above.

On a non-pipelined CPU, while an instruction is being processed at one stage, the other stages sit idle, which is very inefficient. If you look at the diagram, while the 1st instruction is being decoded, the Fetch, Execute and Write Units of the CPU are not being used, and it takes 8 clock cycles to execute the 2 instructions.

On the other hand, on a pipelined CPU, all the stages work in parallel. When the 1st instruction is being decoded by the Decoder Unit, the 2nd instruction is being fetched by the Fetch Unit. It only takes 5 clock cycles to execute 2 instructions on a pipelined CPU.
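The two cases can be traced cycle by cycle. The following is a minimal Python sketch, assuming an idealized 4-stage CPU in which every stage takes exactly one clock cycle and nothing ever stalls. The function names are made up for this illustration and do not come from any library.

```python
STAGES = ["Fetch", "Decode", "Execute", "Write"]


def non_pipelined_schedule(num_instructions):
    """One row per clock cycle; only one stage is busy in any cycle."""
    rows = []
    for instr in range(1, num_instructions + 1):
        for stage_index in range(len(STAGES)):
            row = [None] * len(STAGES)
            row[stage_index] = instr
            rows.append(row)
    return rows


def pipelined_schedule(num_instructions):
    """One row per clock cycle; a new instruction enters Fetch every cycle."""
    if num_instructions == 0:
        return []
    total_cycles = len(STAGES) + num_instructions - 1
    rows = []
    for cycle in range(total_cycles):
        row = []
        for stage_index in range(len(STAGES)):
            # Instruction i (1-based) is in stage s during cycle (i - 1) + s.
            instr = cycle - stage_index + 1
            row.append(instr if 1 <= instr <= num_instructions else None)
        rows.append(row)
    return rows


def print_schedule(title, rows):
    """Print which instruction occupies each stage in each cycle."""
    print(title)
    print("cycle  " + "  ".join(f"{name:<8}" for name in STAGES))
    for cycle, row in enumerate(rows, start=1):
        labels = [f"I{instr}" if instr else "-" for instr in row]
        print(f"{cycle:<5}  " + "  ".join(f"{label:<8}" for label in labels))


if __name__ == "__main__":
    print_schedule("Non-pipelined, 2 instructions", non_pipelined_schedule(2))
    print_schedule("Pipelined, 2 instructions", pipelined_schedule(2))
```

Under these assumptions the non-pipelined schedule has 8 rows and the pipelined one has 5, matching the cycle counts in the answer.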

Note that increasing the number of stages in the pipeline will not always increase the execution throughput. On a non-pipelined CPU an instruction might take only 3 cycles, while on a pipelined CPU it could take 4 cycles because of the separate stages involved. A single instruction may therefore need more clock cycles on a pipelined CPU, even though a sequence of many instructions completes sooner. So there needs to be a balance between the two.
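The trade-off between single-instruction latency and overall throughput can be put into numbers. This is a rough sketch using the 3-cycle and 4-cycle figures from the paragraph above, and assuming both CPUs run at the same clock speed with no stalls. The function names are illustrative only.

```python
def total_cycles_non_pipelined(num_instructions, cycles_per_instruction=3):
    """Instructions run strictly one after another."""
    return num_instructions * cycles_per_instruction


def total_cycles_pipelined(num_instructions, num_stages=4):
    """The first instruction takes num_stages cycles; each later one adds 1 cycle."""
    if num_instructions == 0:
        return 0
    return num_stages + (num_instructions - 1)


if __name__ == "__main__":
    for n in (1, 2, 100):
        plain = total_cycles_non_pipelined(n)
        piped = total_cycles_pipelined(n)
        print(f"{n} instruction(s): non-pipelined {plain} cycles, pipelined {piped} cycles")
```

With these numbers a single instruction is slower on the pipelined CPU (4 cycles against 3), but from the second instruction onward the pipelined CPU is ahead, and the gap widens as the instruction count grows.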

One of the major complications with deep pipelining (e.g., the 31-stage pipeline used in some Intel Pentium 4 processors) arises when a conditional branch instruction is executed. The processor cannot determine the location of the next instruction until the branch is resolved, so it has to wait for the branch instruction to finish, and the whole pipeline may need to be flushed as a result. If a program has many conditional branch instructions, pipelining could have a negative effect on the overall performance. To alleviate this problem, branch prediction can be used, but this too can hurt performance when branches are predicted wrongly.
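The cost of branches can be estimated with a simple average cycles-per-instruction (CPI) model. The sketch below is a simplification: it assumes an otherwise ideal pipeline that completes one instruction per cycle, and it charges a fixed flush penalty for every mispredicted branch. The branch fraction, misprediction rates and penalties are invented for illustration and are not measurements of any real CPU.

```python
def effective_cpi(branch_fraction, mispredict_rate, flush_penalty_cycles, base_cpi=1.0):
    """Average cycles per instruction once branch flushes are included.

    branch_fraction: share of instructions that are conditional branches (0..1)
    mispredict_rate: share of those branches that are predicted wrongly (0..1)
    flush_penalty_cycles: cycles lost each time the pipeline is flushed
    """
    return base_cpi + branch_fraction * mispredict_rate * flush_penalty_cycles


if __name__ == "__main__":
    # Hypothetical workload: 20% of instructions are conditional branches.
    for depth_penalty in (4, 30):
        for rate in (1.0, 0.1):
            cpi = effective_cpi(0.2, rate, depth_penalty)
            print(f"penalty {depth_penalty} cycles, mispredict rate {rate:.0%}: CPI {cpi:.2f}")
```

In this model the penalty term grows with the flush cost, which is why a deeper pipeline depends more heavily on accurate branch prediction than a shallow one.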

Due to the different ways AMD and Intel implement pipelining in their CPUs, comparing their CPUs purely on the basis of clock speed is not accurate.


More VLSI Interview Questions

For a NMOS transistor acting as a pass transistor, say the gate is connected to VDD, give the output for a square pulse input going from 0 to VDD

0 Answers   Infosys,


How many bit combinations are there in a byte?

6 Answers   Intel,


What is the difference between = and == in C?

5 Answers   Intel,


What are the Advantages and disadvantages of Mealy and Moore?

0 Answers   Intel,


Need to convert this VHDL code into VLSI verilog code? LIBRARY IEEE; USE IEEE.STD_LOGIC_1164.ALL; ----using all functions of specific package--- ENTITY tollbooth2 IS PORT (Clock,car_s,RE : IN STD_LOGIC; coin_s : IN STD_LOGIC_VECTOR(1 DOWNTO 0); r_light,g_light,alarm : OUT STD_LOGIC); END tollbooth2; ARCHITECTURE Behav OF tollbooth2 IS TYPE state_type IS (NO_CAR,GOTZERO,GOTFIV,GOTTEN,GOTFIF,GOTTWEN,CAR_PAID,CHEATE D); ------GOTZERO = PAID $0.00--------- ------GOTFIV = PAID $0.05---------- ------GOTTEN = PAID $0.10---------- ------GOTFIF = PAID $0.15---------- ------GOTTWEN = PAID $0.20--------- SIGNAL present_state,next_state : state_type; BEGIN -----Next state is identified using present state,car & coin sensors------ PROCESS(present_state,car_s,coin_s) BEGIN CASE present_state IS WHEN NO_CAR => IF (car_s = '1') THEN next_state <= GOTZERO; ELSE next_state <= NO_CAR; END IF; WHEN GOTZERO => IF (car_s ='0') THEN next_state <= CHEATED; ELSIF (coin_s = "00") THEN next_state <= GOTZERO; ELSIF (coin_s = "01") THEN next_state <= GOTFIV; ELSIF (coin_s ="10") THEN next_state <= GOTTEN; END IF; WHEN GOTFIV=> IF (car_s ='0') THEN next_state <= CHEATED; ELSIF (coin_s = "00") THEN next_state <= GOTFIV; ELSIF (coin_s = "01") THEN next_state <= GOTTEN; ELSIF (coin_s <= "10") THEN next_state <= GOTFIV; END IF; WHEN GOTTEN => IF (car_s ='0') THEN next_state <= CHEATED; ELSIF (coin_s ="00") THEN next_state <= GOTTEN; ELSIF (coin_s="01") THEN next_state <= GOTFIV; ELSIF (coin_s="10") THEN next_state <= GOTTWEN; END IF; WHEN GOTFIF => IF (car_s ='0') THEN next_state <= CHEATED; ELSIF (coin_s = "00") THEN next_state <= GOTFIF; ELSIF (coin_s ="01") THEN next_state <= GOTTWEN; ELSIF (coin_s = "10") THEN next_state <= GOTTWEN; END IF; WHEN GOTTWEN => next_state <= CAR_PAID; WHEN CAR_PAID => IF (car_s = '0') THEN next_state <= NO_CAR; ELSE next_state<= CAR_PAID; END IF; WHEN CHEATED => IF (car_s = '1') THEN next_state <= GOTZERO; ELSE next_state <= CHEATED; END IF; END CASE; END 
PROCESS;-----End of Process 1 -------PROCESS 2 for STATE REGISTER CLOCKING-------- PROCESS(Clock,RE) BEGIN IF RE = '1' THEN present_state <= GOTZERO; ----When the clock changes from low to high,the state of the system ----stored in next_state becomes the present state----- ELSIF Clock'EVENT AND Clock ='1' THEN present_state <= next_state; END IF; END PROCESS;-----End of Process 2------- --------------------------------------------------------- -----Conditional signal assignment statements---------- r_light <= '0' WHEN present_state = CAR_PAID ELSE '1'; g_light <= '1' WHEN present_state = CAR_PAID ELSE '0'; alarm <= '1' WHEN present_state = CHEATED ELSE '0'; END Behav;

0 Answers  






Implement D flip-flop with a couple of latches? Write a VHDL Code for a D flip-flop?

6 Answers   Intel,


Explain why is the number of gate inputs to cmos gates usually limited to four?

0 Answers  


Explain what is scr (silicon controlled rectifier)?

0 Answers  


How to improve these parameters? (Cascode topology, use long channel transistors)

0 Answers  


What is the critical path in a SRAM?

0 Answers  


6-T XOR gate?

0 Answers   Intel,


What are the changes that are provided to meet design power targets?

0 Answers  

