Bull’s BCS Architecture – Deep Dive – Part 1

Before going further, here is a list of related posts. Although not required, I encourage you to go through them all before reading this post.

OK, now let’s deep dive into this BCS technology. I ended my previous post by saying that Bull’s BCS solves scale-up issues without compromising performance. Here is a graph showing what that means.

Bullion measured performance vs the maximum theoretical performance – SPECint_rate2006 – Courtesy of Bull

Bull’s BCS eXternal Node-Controller technology scales up almost linearly compared to the ‘glueless’ architecture. What’s the secret sauce behind this awesome technology?

BCS Architecture

The BCS enables two key functionalities: CPU caching and the resilient eXternal Node-Controller fabric. These features serve to reduce communication and coordination overhead and provide availability features consistent with the Intel Xeon E7-4800 series processors.

BCS meets the most demanding requirements of today’s business-critical and mission-critical applications.

Detailed 4 Sockets Xeon E7 Novascale bullion Architecture – Courtesy of Bull

As shown in the above figure, a BCS chip sits on a SIB board that is plugged into the main board. When running in single-node mode, a DSIB (Dummy SIB) board is required instead.

BCS Architecture – 4 Nodes – 16 Sockets

As shown in the above figure, the BCS Architecture scales to 16 processors, supporting up to 160 processor cores and up to 320 logical processors (Intel HT). Memory-wise, the BCS Architecture supports up to 256 DDR3 DIMM slots for a maximum of 4TB of memory using 16GB DIMMs. I/O-wise, there are up to 24 I/O slots available.
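The figures above are easy to sanity-check. Here is a minimal sketch of the arithmetic. Note that the ten cores per socket is my own inference from 160 cores over 16 sockets, and the variable names are purely illustrative.

```python
# Back-of-the-envelope check of the 4-node, 16-socket figures quoted above.
NODES = 4
SOCKETS_PER_NODE = 4       # one "QPI island" of four sockets per node
CORES_PER_SOCKET = 10      # assumption: inferred from 160 cores / 16 sockets
THREADS_PER_CORE = 2       # Intel Hyper-Threading
DIMM_SLOTS = 256
DIMM_SIZE_GB = 16

sockets = NODES * SOCKETS_PER_NODE
cores = sockets * CORES_PER_SOCKET
logical_cpus = cores * THREADS_PER_CORE
memory_tb = DIMM_SLOTS * DIMM_SIZE_GB / 1024

print(f"{sockets} sockets, {cores} cores, {logical_cpus} logical processors")
print(f"{memory_tb:.0f} TB of memory")
```

The numbers line up with the 16 sockets, 160 cores, 320 logical processors and 4TB quoted above.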

BCS key technical characteristics:

ASIC chip of 18x18mm with 9 metal layers
90nm technology
321 million transistors
1837 (~43×43) ball connectors
6 QPI (~fibers) and 3×2 XQPI links
High speed serial interfaces up to 8GT/s
Power-conscious design with selective power-down capabilities
Aggregated data transfer rate of 230GB/s, that is 9 ports x 25.6GB/s
Up to 300Gb/s bandwidth
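The aggregated figure in the list above can be reproduced with a few lines. This is only a sketch under two assumptions of mine: that the nine ports are the six QPI links plus three XQPI links, and that 25.6GB/s per port corresponds to a QPI link at 6.4GT/s moving 2 bytes per transfer in each direction.

```python
import math

PORTS = 9                  # assumption: 6 QPI + 3 XQPI ports
PER_PORT_GB_S = 25.6       # GB/s per port, as quoted in the list above

aggregate_gb_s = PORTS * PER_PORT_GB_S

# Assumption: a QPI link at 6.4 GT/s carries 2 bytes per transfer,
# in each of its 2 directions.
qpi_link_gb_s = 6.4 * 2 * 2

print(f"Aggregate: {aggregate_gb_s:.1f} GB/s")
print(f"One QPI link: {qpi_link_gb_s:.1f} GB/s")
print("Per-port figure matches:", math.isclose(qpi_link_gb_s, PER_PORT_GB_S))
```

The product is 230.4GB/s, which the list rounds down to 230GB/s.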

BCS Chip Design – Courtesy of Bull

Each BCS module groups the processor sockets into a single “QPI island” of four directly connected CPU sockets. This direct connection provides the lowest latencies. Each node controller stores information about all data located in the processors’ caches, so it knows where a given piece of data lives without having to ask every socket. This key functionality is called “CPU caching”. This is just awesome!
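To give a feel for why tracking cache contents helps, here is a toy directory in Python. It is a generic sketch of the idea, not Bull's implementation: the class, its method names and the socket numbering are all hypothetical, and the real BCS works in hardware at cache-line granularity.

```python
from collections import defaultdict


class NodeControllerDirectory:
    """Toy directory: remembers which sockets hold a copy of each cache line."""

    def __init__(self):
        # cache-line address -> set of socket ids holding that line
        self._sharers = defaultdict(set)

    def record_fill(self, line_addr, socket_id):
        """A socket loaded the line into its cache."""
        self._sharers[line_addr].add(socket_id)

    def record_evict(self, line_addr, socket_id):
        """A socket dropped the line from its cache."""
        self._sharers[line_addr].discard(socket_id)
        if not self._sharers[line_addr]:
            del self._sharers[line_addr]

    def sockets_to_snoop(self, line_addr, requester):
        """Only sockets known to hold the line need to be contacted."""
        return sorted(self._sharers.get(line_addr, set()) - {requester})


directory = NodeControllerDirectory()
directory.record_fill(0x1000, socket_id=3)
directory.record_fill(0x1000, socket_id=9)

print(directory.sockets_to_snoop(0x1000, requester=3))  # [9]
print(directory.sockets_to_snoop(0x2000, requester=3))  # []
```

Without such a directory, a request for line 0x1000 would have to be broadcast to all 16 sockets. With it, only socket 9 is contacted, and a line nobody caches triggers no snoop traffic at all.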

More on this key functionality in the second part. Stay tuned!

Source: Bull, Spec.org

About PiroNet

Didier Pironet is an independent blogger and freelancer with +15 years of IT industry experience. Didier is also a former VMware inc. employee where he specialised in Datacenter and Cloud Infrastructure products as well as Infrastructure, Operations and IT Business Management products. Didier is passionate about technologies and he is found to be a creative and a visionary thinker, expressing with passion and excitement, hopefully inspiring and enrolling people to innovation and change.
