CC-NUMA (Cache-coherent nonuniform memory access)


CC-NUMA

  • Cache-coherent NUMA (CC-NUMA): a NUMA system in which cache coherence is maintained among the caches of the processors
  • All processors can access all parts of main memory using loads and stores
  • The memory access time of a processor differs depending on which region of main memory is accessed

Competing Computer Architectures

CC-NUMA Organization

CC-NUMA Operation

  • Each processor has its own L1 and L2 cache
  • A memory request from a processor is satisfied in this order:
      • L1 cache (local to processor)
      • L2 cache (local to processor)
      • Main memory (local to node)
      • Remote memory
          • Delivered to requesting (local to processor) cache
  • Automatic and transparent
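The request order on this slide can be sketched as a simple lookup. This is only an illustrative model, not how any particular machine implements it: the names `LEVELS` and `load`, and the sample addresses, are assumptions made for the example.

```python
from typing import Dict, Set

# Levels searched in the order given on the slide (names are illustrative).
LEVELS = ["L1 cache", "L2 cache", "local main memory", "remote memory"]


def load(address: int, contents: Dict[str, Set[int]]) -> str:
    """Return the first level that holds `address`, then fill the local caches.

    Filling L1 and L2 models "delivered to requesting (local to processor)
    cache": the next access to the same address hits in L1.
    """
    for level in LEVELS:
        if address in contents[level]:
            contents["L1 cache"].add(address)
            contents["L2 cache"].add(address)
            return level
    raise KeyError(f"address {address} is not mapped anywhere")


# Hypothetical contents of each level for one processor.
contents = {
    "L1 cache": {16},
    "L2 cache": {16, 32},
    "local main memory": {16, 32, 64},
    "remote memory": {798},
}

print(load(798, contents))  # remote memory
print(load(798, contents))  # L1 cache
```

The processor issues the same `load` in both cases, which is the sense in which the remote access is automatic and transparent.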

Memory Access Sequence

  • e.g. node 2 processor 3 (P2-3) requests location 798, which is in memory of node 1

  1. P2-3 issues read request on snoopy bus of node 2
  2. Directory on node 2 recognises location is on node 1
  3. Node 2 directory requests node 1's directory
  4. Node 1 directory requests contents of 798
  5. Node 1 memory puts data on (node 1 local) bus
  6. Node 1 directory gets data from (node 1 local) bus
  7. Data transferred to node 2's directory
  8. Node 2 directory puts data on (node 2 local) bus
  9. Data picked up, put in P2-3's cache and delivered to processor
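The sequence for location 798 can be replayed with a small trace generator. This is a sketch under stated assumptions: the slide does not say how addresses map to nodes, so `NODE_SIZE` (1000 consecutive locations per node, nodes numbered from 1) is invented for the example, as are the function names and the stored value.

```python
from typing import Dict, List, Tuple

NODE_SIZE = 1000  # assumption: each node owns 1000 consecutive locations


def home_node(address: int) -> int:
    """Node whose main memory holds `address` (nodes numbered from 1)."""
    return address // NODE_SIZE + 1


def read(node: int, cpu: int, address: int,
         memory: Dict[int, Dict[int, str]]) -> Tuple[str, List[str]]:
    """Return (value, trace) for a read by processor `cpu` on `node`."""
    home = home_node(address)
    trace = [f"P{node}-{cpu} issues read request on snoopy bus of node {node}"]
    if home == node:
        # Local case: no directory-to-directory traffic is needed.
        trace.append(f"Node {node} memory puts data on (node {node} local) bus")
    else:
        trace += [
            f"Directory on node {node} recognises location is on node {home}",
            f"Node {node} directory requests node {home}'s directory",
            f"Node {home} directory requests contents of {address}",
            f"Node {home} memory puts data on (node {home} local) bus",
            f"Node {home} directory gets data from (node {home} local) bus",
            f"Data transferred to node {node}'s directory",
            f"Node {node} directory puts data on (node {node} local) bus",
        ]
    trace.append(f"Data picked up, put in P{node}-{cpu}'s cache "
                 "and delivered to processor")
    return memory[home][address], trace


# Hypothetical memory contents: node 1 holds location 798.
memory = {1: {798: "payload"}, 2: {}}
value, trace = read(node=2, cpu=3, address=798, memory=memory)
for step, line in enumerate(trace, start=1):
    print(step, line)
```

A remote read produces nine trace lines while a local read produces three, which makes the extra directory traffic of a remote access easy to see.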

NUMA Pros & Cons

  • Effective performance at higher levels of parallelism than SMP
  • No major software changes
  • Performance can break down if there is too much access to remote memory
      • Can be avoided by:
          • L1 & L2 cache design reducing all memory access
              • Need good temporal locality of software
          • Good spatial locality of software
          • Virtual memory management moving pages to nodes that are using them most
  • Not transparent
      • Page allocation, process allocation and load balancing changes needed
  • Availability?
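The cost of remote accesses can be estimated with a weighted average of local and remote latency. This is a back-of-the-envelope sketch rather than a measurement: the function name and the latencies of 100 ns (local) and 400 ns (remote) are hypothetical values chosen only to show the trend.

```python
def average_access_time(remote_fraction: float,
                        t_local_ns: float,
                        t_remote_ns: float) -> float:
    """Mean main-memory access time when `remote_fraction` of accesses are remote."""
    if not 0.0 <= remote_fraction <= 1.0:
        raise ValueError("remote_fraction must be between 0 and 1")
    return (1.0 - remote_fraction) * t_local_ns + remote_fraction * t_remote_ns


# Hypothetical latencies, for illustration only.
T_LOCAL_NS = 100.0
T_REMOTE_NS = 400.0

for fraction in (0.0, 0.25, 0.5, 1.0):
    print(fraction, average_access_time(fraction, T_LOCAL_NS, T_REMOTE_NS))
```

Under these assumed numbers, moving a quarter of all accesses to remote memory raises the average from 100 ns to 175 ns. This is why good locality and page migration toward the nodes that use a page most matter.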

Figure 1 [...] field-programma [...] 512K 36-bit SRAM

Figure 2

References

  • http://www.research.ibm.com/journal/rd/452/brock.html
  • http://en.wikipedia.org/wiki/CcNUMA
  • http://williamstallings.com/COA6e.html
  • www.vwin.co.th/
  • http://www.ece.neu.edu/