Computer Architecture
<Cache Coherency Protocols>
Definition
Cache coherency becomes a concern when multiple processor cores share the same memory hierarchy but each have their own L1 data and instruction caches. Incorrect execution can occur if two or more copies of a given cache block exist in different processors' caches and one of those copies is modified. (https://www.sciencedirect.com/topics/engineering/cache-coherence)
It basically refers to the problem of keeping the data in caches consistent. (https://www.ques10.com/p/13222/what-is-cache-coherence-problem-and-how-it-can-be-/)
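Solution 1: Snooping-based protocol
Broadcast messages are used, and the caches are connected through a shared bus.
Each cache controller constantly monitors (snoops) the address bus for accesses to memory blocks that it holds. When another cache writes to such a block, the controller invalidates its own copy of that block.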
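The write-invalidate behavior described above can be sketched as a toy simulation. This is only an illustration under simplifying assumptions: the class names `SnoopingCache` and `SharedBus` are hypothetical, each cache line is a single value, and writes go straight to memory so that no dirty state has to be tracked.

```python
class SharedBus:
    """Shared bus: every write is broadcast to all other caches."""

    def __init__(self):
        self.memory = {}
        self.caches = []

    def broadcast_write(self, writer, addr):
        for cache in self.caches:
            if cache is not writer:
                cache.snoop(addr)


class SnoopingCache:
    """One core's private cache; it snoops writes seen on the shared bus."""

    def __init__(self, bus):
        self.lines = {}                 # address -> value (valid copies only)
        self.bus = bus
        bus.caches.append(self)

    def read(self, addr):
        if addr not in self.lines:      # miss: fetch the block from memory
            self.lines[addr] = self.bus.memory.get(addr, 0)
        return self.lines[addr]

    def write(self, addr, value):
        self.bus.broadcast_write(self, addr)   # other caches see this write
        self.lines[addr] = value
        self.bus.memory[addr] = value   # simplification: memory updated at once

    def snoop(self, addr):
        self.lines.pop(addr, None)      # invalidate the local copy, if any


bus = SharedBus()
core0, core1 = SnoopingCache(bus), SnoopingCache(bus)
bus.memory[0x10] = 1
core0.read(0x10)
core1.read(0x10)                # both caches now hold a copy
core0.write(0x10, 2)            # core1's copy is invalidated by snooping
print(core1.read(0x10))         # core1 misses and re-fetches: prints 2
```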
Solution 2: Directory-based protocol
A directory records which nodes hold a copy of each cache block, so requests are sent only to those specific nodes. Broadcasting becomes unnecessary, so relatively little bandwidth is needed.
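As a rough sketch of the directory idea, the code below keeps a set of sharers per block and sends invalidations only to those sharers. The `Directory` class and its method names are hypothetical and chosen for illustration; a real protocol also tracks block states and acknowledgements.

```python
class Directory:
    """Tracks, per block address, which nodes currently hold a copy."""

    def __init__(self):
        self.sharers = {}       # address -> set of node ids
        self.messages = []      # log of (destination node, message)

    def record_read(self, node, addr):
        self.sharers.setdefault(addr, set()).add(node)

    def record_write(self, node, addr):
        # Invalidate only the nodes listed as sharers, not every node.
        for other in sorted(self.sharers.get(addr, set()) - {node}):
            self.messages.append((other, f"invalidate {addr:#x}"))
        self.sharers[addr] = {node}


directory = Directory()
directory.record_read(0, 0x40)
directory.record_read(2, 0x40)
directory.record_write(0, 0x40)     # only node 2 is contacted
print(directory.messages)           # [(2, 'invalidate 0x40')]
```

Nodes that never read the block (for example nodes 1 and 3 in a four-node system) receive no message at all, which is where the bandwidth saving comes from.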
<Copy-On-Write vs Read-write>
Copy-on-write: Creates a new copy of data only when the original data is actually modified. Until then, the process continues to read from the original data.
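Read-write has no such mechanism: data is simply read and written in place.
Difference: copy-on-write makes a copy only if the contents are actually modified, so readers of shared data never pay for a copy they do not need. With plain read-write, every write goes directly to the one existing copy.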
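The deferred copy can be illustrated with a small wrapper. This is a minimal sketch, assuming a hypothetical `CowBuffer` class around a Python list; operating systems apply the same idea at the granularity of memory pages.

```python
class CowBuffer:
    """Shares the underlying list until the first write, then copies it."""

    def __init__(self, data):
        self._data = data
        self._owns_copy = False

    def read(self, index):
        return self._data[index]        # reads never trigger a copy

    def write(self, index, value):
        if not self._owns_copy:         # first write: make a private copy
            self._data = list(self._data)
            self._owns_copy = True
        self._data[index] = value

    def shares_storage_with(self, data):
        return self._data is data


original = [1, 2, 3]
view = CowBuffer(original)
print(view.read(0), view.shares_storage_with(original))    # 1 True
view.write(0, 99)
print(view.read(0), original[0], view.shares_storage_with(original))    # 99 1 False
```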
<Pipelining (5 stage pipelining)>
Each instruction passes through the stages IF, ID, EX, MEM, and WB, in that order.
IF: Instruction Fetch. Fetches the instruction from memory.
ID: Instruction Decode. Decodes the instruction and reads the registers.
EX: Execute. Performs the operation.
MEM: Memory access. Accesses data memory.
WB: Write back. Writes the result to a register.
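To see how the five stages overlap across instructions, the sketch below builds a simple timing diagram. It assumes an ideal pipeline with no stalls, and the function name `pipeline_diagram` is made up for this example.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def pipeline_diagram(num_instructions):
    """Return one row per instruction; column c is the stage active in cycle c + 1."""
    total_cycles = num_instructions + len(STAGES) - 1
    rows = []
    for i in range(num_instructions):
        row = ["--"] * total_cycles
        for s, stage in enumerate(STAGES):
            row[i + s] = stage      # instruction i is in stage s during cycle i + s + 1
        rows.append(row)
    return rows


for row in pipeline_diagram(3):
    print(" ".join(f"{cell:>3}" for cell in row))
```

With n instructions and 5 stages, the ideal pipeline finishes in n + 4 cycles instead of 5n, because a new instruction can enter IF every cycle.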
<Single vs Multi cycle implementation>
Single-cycle implementation: All 5 stages of an instruction are executed within one clock cycle, so the cycle must be long enough for the slowest instruction to complete.
Multi-cycle implementation: Each instruction takes several shorter clock cycles, one per step, and operation can be pipelined. The cycle needs to be long enough that the slowest functional unit (the one used in IF, ID, EX, MEM, or WB) can settle. For example, given the latencies MEM: M ns, Register: R ns, ALU: A ns (ns: nanoseconds), the clock period cannot be shorter than max(M, R, A).
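A quick numeric comparison of the two clocking schemes, as a sketch only. The latency values are hypothetical, and the single-cycle figure assumes the longest instruction is a load that uses memory twice, the register file twice, and the ALU once.

```python
# Hypothetical latencies in nanoseconds (illustrative values only).
M = 200     # memory access
R = 100     # register file read or write
A = 150     # ALU operation

# Single-cycle: one clock period must cover every step of the longest
# instruction: fetch (M), register read (R), ALU (A), data memory (M),
# register write (R).
single_cycle_period = M + R + A + M + R

# Multi-cycle: each step gets its own clock cycle, so the slowest
# functional unit sets the period.
multi_cycle_period = max(M, R, A)

print(single_cycle_period, multi_cycle_period)     # 750 200
```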
<Cycle Per Instruction (CPI)>
The average number of clock cycles taken to execute one instruction.
CPI = sum over all instruction types of (frequency of the instruction type) x (# of cycles of the instruction type)
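As a worked example of the CPI formula, the sketch below computes the weighted average over a made-up instruction mix. The frequencies and cycle counts are assumptions for illustration, not measurements.

```python
# Hypothetical instruction mix: type -> (fraction of executed instructions, cycles).
mix = {
    "ALU":    (0.50, 4),
    "load":   (0.20, 5),
    "store":  (0.10, 4),
    "branch": (0.20, 3),
}


def average_cpi(mix):
    """Weighted average: sum of frequency x cycles over all instruction types."""
    return sum(freq * cycles for freq, cycles in mix.values())


# 0.5*4 + 0.2*5 + 0.1*4 + 0.2*3 = 4.0
print(round(average_cpi(mix), 2))
```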
<What is subroutine>
A sequence of program instructions that performs a specific task.
<Stack-organized computer>
A stack-organized computer is characterized by zero-address instructions: operations take their operands implicitly from the top of the stack.
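Zero-address instructions can be illustrated with a tiny stack machine. The instruction names (`PUSH`, `ADD`, `MUL`) and the function `run_stack_program` are hypothetical; the point is that arithmetic instructions name no operands because they always use the top of the stack.

```python
def run_stack_program(program):
    """Run a list of instructions; only PUSH carries an operand."""
    stack = []
    for instruction in program:
        op, *args = instruction.split()
        if op == "PUSH":
            stack.append(int(args[0]))
        elif op == "ADD":               # zero-address: operands come from the stack
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "MUL":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        else:
            raise ValueError(f"unknown instruction: {instruction}")
    return stack.pop()


# (2 + 3) * 4 written in postfix order
print(run_stack_program(["PUSH 2", "PUSH 3", "ADD", "PUSH 4", "MUL"]))    # 20
```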
<Write-through vs Write-back>
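Write-through: Every write updates both the cache and main memory.
Write-back: A write updates only the cache, not main memory. The data stays in the cache temporarily, and the modified block is written to main memory (or secondary storage) only when it is evicted from the cache.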
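The difference in memory traffic between the two policies can be shown with a toy model. This is a sketch under assumptions: the `Cache` class is hypothetical, eviction is triggered by hand, and each block holds a single value.

```python
class Cache:
    """Tiny cache model that counts how many writes reach main memory."""

    def __init__(self, policy):
        assert policy in ("write-through", "write-back")
        self.policy = policy
        self.lines = {}         # address -> value
        self.dirty = set()      # addresses modified in cache but not yet in memory
        self.memory = {}
        self.memory_writes = 0

    def write(self, addr, value):
        self.lines[addr] = value
        if self.policy == "write-through":
            self._write_memory(addr, value)     # memory updated on every write
        else:
            self.dirty.add(addr)                # memory update is deferred

    def evict(self, addr):
        value = self.lines.pop(addr)
        if addr in self.dirty:                  # write-back happens only here
            self.dirty.discard(addr)
            self._write_memory(addr, value)

    def _write_memory(self, addr, value):
        self.memory[addr] = value
        self.memory_writes += 1


for policy in ("write-through", "write-back"):
    cache = Cache(policy)
    for value in range(5):
        cache.write(0x20, value)    # five writes to the same block
    cache.evict(0x20)
    print(policy, cache.memory_writes, cache.memory[0x20])
```

Both policies leave the same final value in memory, but write-through performs five memory writes here while write-back performs one.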
<Memory mapped I/O>
I/O ports are assigned addresses on the bus and are accessed with the same load and store instructions as ordinary memory locations.
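A minimal sketch of the idea: one store operation either writes RAM or drives a device, depending only on the address. The `SystemBus` class and the port address `0xFF00` are assumptions made up for this example.

```python
class SystemBus:
    """Routes stores by address: RAM everywhere except one device register."""

    IO_BASE = 0xFF00        # hypothetical address of an output port

    def __init__(self):
        self.ram = {}
        self.device_output = []     # stands in for an output device

    def store(self, addr, value):
        if addr == self.IO_BASE:
            self.device_output.append(chr(value))   # the store reaches the device
        else:
            self.ram[addr] = value

    def load(self, addr):
        return self.ram.get(addr, 0)


bus = SystemBus()
bus.store(0x0010, 42)                       # ordinary memory write
for ch in "hi":
    bus.store(SystemBus.IO_BASE, ord(ch))   # same operation, but it drives the device
print(bus.load(0x0010), "".join(bus.device_output))    # 42 hi
```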
<RISC vs CISC>
RISC / CISC
More registers / Fewer registers
Fixed-sized instruction / Variable-sized instruction
Most instructions execute in a single clock cycle / An instruction may take more than one clock cycle
Code size is large / Code size is small.
Variable-length ISAs allow for a smaller code size than fixed-length ISAs. --> TRUE
Fixed-length ISAs simplify instruction fetch and decode over variable-length ISAs. --> TRUE
Variable-length ISAs (CISC) require more registers than fixed-length ISAs (RISC). --> FALSE. RISC typically uses more registers than CISC.
<Forwarding (Operand Forwarding)>
It is an optimization in pipelined CPUs that limits the performance loss caused by pipeline stalls.
A data hazard can lead to a pipeline stall when the current operation has to wait for the results of an earlier operation that has not yet finished.
A stall is a cycle in the pipeline without new input. Put simply, it is time spent waiting.
To minimize data dependency stalls in the pipeline, operand forwarding is used.
In operand forwarding, the interface registers between the stages (the pipeline registers) hold intermediate outputs, so that a dependent instruction can read the value directly from the interface register instead of waiting for it to be written back.
With forwarding, the time the pipeline spends waiting (stalls) can be reduced.
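The effect of forwarding on stall counts can be sketched with a small calculation. This assumes the classic 5-stage pipeline in which the register file is written in the first half of WB and read in the second half of ID; the function name `stall_cycles` and its parameters are hypothetical.

```python
def stall_cycles(producer_is_load, distance, forwarding):
    """Stalls for a consumer issued `distance` instructions after its producer.

    Assumes a 5-stage pipeline where the register file is written in the
    first half of WB and read in the second half of ID.
    """
    if forwarding:
        # An ALU result is available after EX; load data only after MEM.
        needed_gap = 2 if producer_is_load else 1
    else:
        # The consumer's ID must coincide with the producer's WB.
        needed_gap = 3
    return max(0, needed_gap - distance)


print(stall_cycles(False, 1, forwarding=False))    # ALU result, no forwarding: 2
print(stall_cycles(False, 1, forwarding=True))     # ALU result, forwarding: 0
print(stall_cycles(True, 1, forwarding=True))      # load-use, forwarding: 1
```

Under these assumptions forwarding removes the stalls between back-to-back ALU instructions entirely, while a load followed immediately by a use of its result still costs one stall cycle.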