Presentation of Intel Sandy Bridge processors: lineup and architectural features. Network topologies Graphics core in Sandy Bridge microarchitecture

Ring splint. It consists of soldered rings that cover the teeth on the vestibular side in the form of a strip located in the occlusal part of the crown, closer to the incisal edge (Fig. 26). On the lingual side, the ring widens and overlaps the dental tubercle. The rings are usually made from stamped crowns, but a one-piece cast construction is also an option for this splint. When the teeth are prepared, the interdental contact points are ground to the thickness of a stamped crown down to the lower edge of the ring. For this purpose, the boundaries of the rings are first marked on the diagnostic model with an indelible pencil; these marks then serve as a guide for tooth preparation. The contact surfaces facing each other are separated to the thickness of two rings. The incisal edge is left open, which calls for special care in determining the indications for this splint. Pronounced vertical mobility of teeth that are not covered at the incisal edge can cause resorption of the cement and impaired fixation of the splint. In addition, when the lower incisors have a pronounced anatomical shape, a fairly thick layer of hard tissue has to be ground from the contact surfaces down to the lower edge of the ring, which complicates restoring the contact surfaces on the stamped ring blank and reduces the accuracy of fit of the ring to the tooth surface. This can also lead to dissolution of the luting cement and the development of caries.

The splint is fabricated as follows. At the first visit, after a thorough examination and the drawing up of a splinting plan, impressions are taken with an alginate material to make diagnostic plaster models. The topography of the survey line is determined in a parallelometer, the models are mounted in an articulator, and the outline of the ring splint is drawn. Phantom preparation of the teeth to be splinted is carried out on the same model. At the next visit, the teeth are prepared under anesthesia, strictly observing the boundaries of the phantom preparation. To make the rings, an impression is again taken with an alginate material. Taking a double impression in patients with periodontal disease can be difficult because of the mobility of individual teeth and the danger of extracting them. Plaster working models are cast from the impressions and used to make stamped blanks for the future rings. The rings are made from the resulting stamped crowns and checked in the patient's mouth; if they meet the requirements, an impression is taken with them in place to transfer the rings to a plaster model. Before this impression is taken, the contact surfaces of the rings facing each other are cleaned of scale so that the rings can later be soldered on the plaster model without being removed from it, which preserves their exact relative position during fabrication of the splint. After soldering, the finished splint is bleached, polished and fixed in the patient's mouth with special cements.

The disadvantages of the ring splint include: 1 - impaired aesthetics of the natural teeth, part of which is covered by a metal ring; 2 - the solder often oxidizes and darkens, which is observed especially often in patients with increased acidity of gastric juice; 3 - no splinting effect under vertical load; 4 - the splint requires cements that are highly resistant to oral fluid (if this condition is not met, there is a risk of tooth decay and impaired fixation of the splint).

Figure: 26. Annular splint: a - view from the labial side; b - view from the lingual side; c - general view of the ring; d - tooth preparation diagram: the dotted line indicates the edge of the ring; the left shows excessive removal of hard tissue from the contact surface; on the right - correct preparation, when hard tissues protruding above the lower border of the ring are removed exactly to the indicated dotted line; e, f - preparation boundaries (front and top view)

Semicircular splint. Structurally, this splint is built on the same principle as the ring splint. However, to improve its aesthetic properties, the middle part of the ring on the labial side is removed, so that the middle of the vestibular surface of the tooth is free of metal (Fig. 27). Short shoulders in the form of band clasps thus remain on the labial surface, covering the teeth completely on the lingual side and partly on the vestibular side. The best splinting effect is achieved when full abutment crowns covering the outermost teeth, the canines, are included in the splint. From a technological point of view, the splint is most practical as a one-piece cast construction, since stamped half rings do not have the rigidity needed for splinting. In addition, cast half rings can now be veneered with a decorative material, ceramics, which gives the splint a considerable aesthetic advantage.

Figure: 27. Semicircular splint: a - view from the vestibular side; b - view from the lingual side

Cap splint. A system of soldered caps that cover the incisal edge and the contact surfaces of the tooth and reach the dental tubercle on the lingual surface is called a cap splint (Fig. 28). The incisal edge and contact surfaces are prepared to the thickness of the cap. On the labial side, the edge of the cap can either lie on top of the hard tissues of the tooth or end on a specially formed shoulder. The second option is preferable, since the edge of the cap then lies flush with the adjacent hard tissues. In the first variant, the edge of the cap is often felt by patients, can injure the surrounding mobile oral mucosa, and requires the creation of a fold where the edge of the cap passes into the hard tissues of the tooth. Caps can be made in two ways: 1) from stamped crowns, 2) as a one-piece casting. The second option is considered superior, since the accuracy of the entire structure, and therefore its splinting effect, increases, and the cast structure can also be veneered with ceramics. For better stability, the splint is combined with full crowns (metal-acrylic or metal-ceramic) covering the most stable outermost teeth, the canines or premolars. The manufacturing sequence is the same as for the ring splint.

Figure: 28. Cap splint: a - view from the labial side; b - view from the lingual side; c - a layer of removed hard tissues under the cap splint; d - stamped cap; e - preparation for a cast cap; f - construction of a cast cap with an incisal edge

Splints used on vital teeth have one main advantage: the vitality of the pulp is preserved, so no conditions are created for altered reactivity of the periodontal tissues. Often, however, because of the proximity of the pulp, especially when part of the incisal and chewing surfaces of the teeth is worn away, a complex splint design that requires deep cavities calls for preliminary depulpation of the teeth. Of course, when depulped teeth are already present, making splints is much easier. Below we consider exactly those constructions that are used on depulped teeth.

When fixed splinting structures are used, the rules for placing the edge of the splint near the gingival margin must be strictly observed. The splint must not injure the gingival margin. To this end, the edge of the crown should be immersed in the gingival sulcus as little as possible, and to prevent pressure on the gum the teeth should be prepared with a shoulder almost level with it. A sparing approach to the patient's periodontium when fixed splints are used has a beneficial effect on the course of periodontal disease and does not hinder conservative or surgical therapy. The method of taking impressions also matters for preventing trauma to the gingival margin. We consider it best in this case to take impressions for splinting structures with the most elastic alginate materials, which help avoid accidental extraction of mobile teeth together with the impression. The recommendations found in the specialist literature to take two-layer impressions with silicone impression materials, even with preliminary splinting, are, as observations show, not acceptable, since removing a two-layer impression can extract mobile teeth.

REMOVABLE SPLINTS

There are different points of view on how teeth should be splinted. Some authors consider the preferential use of fixed splints justified, while others, on the contrary, prefer removable splints and the splinting structures of removable dentures. Moreover, splinting with removable structures can be used both with intact dentition and with partial loss of teeth.

If it is necessary to replace the extracted teeth with artificial ones, the restoration of the removable splint can be carried out without replacing the entire structure.

The removable splints ensure reliable stabilization, especially in the vestibulo-oral and mesio-distal directions. This eliminates the need for radical tooth preparation, creates good conditions for hygienic care and medical and surgical treatment both in the preparatory period and in the process of using a removable splinting structure.

For orthopedic treatment of periodontal diseases using removable splints, it is advisable to distinguish two groups of patients:

those with intact dentition and those with partial loss of teeth.

Removable Elbrecht splint. The splint is used when the dentition is preserved and is built on the principle of multi-link clasps, which immobilize the teeth in the horizontal plane but leave them unprotected against the vertical load that develops during chewing. Elements of cross-over clasps, occlusal rests and vestibular claw-like processes make it possible to achieve a good splinting effect.

Figure: 44. Removable Elbrecht splint: a - Elbrecht splint (explanation in the text); b - varieties of multi-link (continuous) clasp: 1 - high position of the drop-shaped clasp (in the upper part of the lingual surface); 2 - position of the clasp in the middle of the lingual surface; 3 - low position of the clasp (in the gingival half of the lingual surface); 4 - clasp in the form of a wide strip

Removable splint with dento-alveolar clasps according to V.N. Kopeikin. The removable Elbrecht splint was modified by V.N. Kopeikin, who suggested using Roach's T-clasps to enhance retention and achieve a better aesthetic effect. The multi-link clasps in this design are lowered below the gingival margin and run as an arch along the slope of the alveolar processes of the anterior parts of the jaws on the vestibular and lingual sides. T-shaped clasps extend from them to each anterior tooth, with their shoulders located in the undercut zones. The splint can be recommended for anterior teeth that are stable or have mobility of degree 0-1, when the splinting properties of the retaining T-shaped clasps will not harm the diseased periodontium (Fig. 45). To achieve this, the shoulders of the T-clasps must be placed so that they lie outside the undercut zone. The retention of the splint is ensured by introducing into the undercut zone those cast clasp shoulders that sit on stable teeth with the least affected periodontium. This splint, like all other one-piece cast structures, must be cast on refractory models. The removable Elbrecht splint can be reinforced with arches located on the lingual surface of the slope of the alveolar process of the lower jaw or on the vault of the palate (Fig. a, b). If such a splinting design is used only for splinting the posterior teeth, parasagittal stabilization is achieved (Fig. c, d).

Figure: Removable splints reinforced with arches for the lower (a) and upper jaw (b). The design of the splint to create parasagittal stabilization: c - on the model; d - general view of the splint

Figure: M. Removable splint with a cast splint for the anterior teeth: a - on a plaster model; b - removable splint frame

Figure: 48. Removable splints for the anterior teeth: a - removable circular splint; b - a removable splint in the form of a continuous clasp with claw-like processes

In general, in the absence of several teeth and severe periodontal pathology, removable dentures are preferred. The design of the prosthesis is selected strictly individually and requires several visits to the doctor.

A removable construction requires careful planning and a fixed sequence of steps:

Diagnostics and examination of the periodontium.

Preparing the surface of the teeth and taking impressions for the future model.

Studying the model and planning the splint design.

Wax-up modeling of the splint.

Obtaining a casting mold and checking the accuracy of the frame on a plaster model.

Checking the splint (splinting prosthesis) in the oral cavity.

Finishing (polishing) the splint.

Not all working steps are listed here, but even this list speaks of the complexity of the procedure for making a removable splint (prosthetic splint). The complexity of manufacturing explains the need for several sessions of work with the patient and the length of time from the first to the last visit to the doctor. But the result of all efforts is always the same - the restoration of anatomy and physiology, leading to the restoration of health and social rehabilitation.

These days Intel is presenting to the world its long-awaited Sandy Bridge processors, whose architecture had already been dubbed revolutionary. The processors are not the only novelty, however: all the accompanying components of the new desktop and mobile platforms are new as well.

In total, this week saw the announcement of as many as 29 new processors, 10 chipsets and 4 wireless adapters for laptops and for desktop work and gaming computers.

Mobile innovations include:

    Intel Core i7-2920XM, Core i7-2820QM, Core i7-2720QM, Core i7-2630QM, Core i7-2620M, Core i7-2649M, Core i7-2629M, Core i7-2657M, Core i7-2617M, Core i5-2540M, Core i5-2520M, Core i5-2410M, Core i5-2537M, Core i3-2310M processors;

    Intel QS67, QM67, HM67, HM65, UM67 Express chipsets;

    wireless network controllers Intel Centrino Advanced-N + WiMAX 6150, Centrino Advanced-N 6230, Centrino Advanced-N 6205, Centrino Wireless-N 1030.

The desktop segment will include:

    Intel Core i7-2600K, Core i7-2600S, Core i7-2600, Core i5-2500K, Core i5-2500S, Core i5-2500T, Core i5-2500, Core i5-2400, Core i5-2400S, Core i5-2390T, Core i5-2300 processors;

    Intel P67, H67, Q67, Q65, B65 Express chipsets.

It should be noted right away, however, that the new platform is not being launched all at once for every processor and chipset model: since the beginning of January only the "mainstream" solutions have been available, and most of the higher-volume, less expensive ones will go on sale a little later. Along with the Sandy Bridge desktop processors, a new processor socket, LGA 1155, is introduced. The new products therefore do not supplement the Intel Core i3 / i5 / i7 lineup but replace the processors for LGA 1156, most of which are now a thoroughly unpromising purchase, because their production should stop altogether in the near future. Only for enthusiasts does Intel promise to keep producing the senior quad-core models based on the Lynnfield core until the end of the year.

However, judging by the roadmap, the long-lived Socket T (LGA 775) platform will remain relevant at least until the middle of the year as the basis for entry-level systems. For the most powerful gaming systems and true enthusiasts, processors based on the Bloomfield core for the LGA 1366 socket will remain relevant until the end of the year. The previous generation of processors "trod" the path for the Sandy Bridge presented "today", having accustomed the consumer to the idea that not only a memory controller but also a video card can be integrated into the processor. Now the time has come not merely to release faster versions of such processors but to update the architecture seriously, so as to achieve a noticeable increase in their efficiency.

The key features of Sandy Bridge processors are:

    release in compliance with the 32 nm technical process;

    markedly increased energy efficiency;

    optimized Intel Turbo Boost Technology and Intel Hyper-Threading Support;

    a significant increase in the performance of the integrated graphics core;

    implementation of a new instruction set, Intel Advanced Vector Extensions (AVX), to accelerate floating-point processing.

But all of the above innovations would not justify talk of a truly new architecture if everything were not now implemented within a single die, unlike processors based on the Clarkdale core.

Naturally, in order for all the processor nodes to work in concert, it was necessary to organize a quick exchange of information between them - an important architectural innovation was the Ring Interconnect bus.

The Ring Interconnect links, via the L3 cache (now called LLC, Last Level Cache), the processor cores, the graphics core and the System Agent, which includes the memory controller, the PCI Express bus controller, the DMI controller, the power management module and the other controllers and modules previously known collectively as the "uncore".

The Ring Interconnect bus is the next stage in the development of the QPI (QuickPath Interconnect) bus, which, after being tested in server processors with the updated 8-core Nehalem-EX architecture, migrated to processors for desktop and mobile systems. The Ring Interconnect consists of four rings: a 32-byte (256-bit) Data Ring, a Request Ring, a Snoop Ring and an Acknowledge Ring. The ring bus runs at the core frequency, so its bandwidth, latency and power consumption depend entirely on the frequency of the processor's computing units.
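To make the dependence of ring latency on core frequency concrete, here is a minimal Python sketch. It assumes a bidirectional ring on which a message travels in the shorter direction and each hop costs one clock cycle; the stop layout and the names `ring_hops` and `hop_latency_ns` are hypothetical and chosen only for illustration, not taken from Intel documentation.

```python
# Toy model of a ring interconnect (illustrative assumptions, not Intel data):
# stops are numbered around the ring and a message takes the shorter direction.

STOPS = ["core0", "core1", "core2", "core3", "graphics", "system_agent"]  # hypothetical layout


def ring_hops(src: int, dst: int, stops: int) -> int:
    """Number of hops between two ring stops, taking the shorter direction."""
    clockwise = (dst - src) % stops
    return min(clockwise, stops - clockwise)


def hop_latency_ns(hops: int, core_ghz: float, cycles_per_hop: int = 1) -> float:
    """Transit time in nanoseconds when the ring is clocked at the core frequency."""
    return hops * cycles_per_hop / core_ghz


if __name__ == "__main__":
    n = len(STOPS)
    worst = max(ring_hops(a, b, n) for a in range(n) for b in range(n))
    for ghz in (1.6, 3.4):  # example clock values only
        print(f"{ghz} GHz: worst case {worst} hops, {hop_latency_ns(worst, ghz):.2f} ns")
```

In this model the same number of hops simply takes proportionally longer when the cores, and with them the ring, drop to a lower clock.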

The third level cache (LLC - Last Level Cache) is common to all computational cores, graphics core, system agent and other blocks. At the same time, the graphics driver determines which data streams to place in the cache memory, but any other block can access all data in LLC. A special mechanism controls the allocation of cache memory to avoid collisions. In order to speed up work, each of the processor cores has its own cache segment, to which it has direct access. Each such segment includes an independent controller for accessing the Ring Interconnect bus, but at the same time, it constantly interacts with the system agent, which performs general management of the cache memory.

The System Agent, in fact, is a "north bridge" built into the processor and combines controllers for PCI Express buses, DMI, RAM, video processing unit (media processor and interface management), power manager and other auxiliary units. The system agent communicates with the rest of the processor nodes through the ring bus. In addition to streamlining data streams, the system agent monitors the temperature and load of various units, and through the Power Control Unit provides control of the supply voltage and frequencies in order to ensure the best energy efficiency with high performance. It can also be noted here that to power the new processors, you need a three-component power regulator (or two, if the integrated video core remains inactive) - separately for the computing cores, the system agent and the integrated video card.

The PCI Express bus built into the processor complies with the 2.0 specification and has 16 lanes, so that the graphics subsystem can be strengthened with a powerful external 3D accelerator. With the higher-end chipsets, and licensing permitting, these 16 lanes can be divided between two or three slots in x8 + x8 or x8 + x4 + x4 modes, respectively, for NVIDIA SLI and/or AMD CrossFireX.

To exchange data with the rest of the system (drives, I/O ports and peripherals whose controllers reside in the chipset), the DMI 2.0 bus is used, which can carry up to 2 GB/s of payload in each direction.
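The bandwidth available to each slot in these lane splits can be estimated with a few lines of Python. The sketch relies on the PCI Express 2.0 signalling rate of 5 GT/s per lane with 8b/10b encoding; the function names are hypothetical, and the figures are theoretical peaks per direction rather than measured throughput.

```python
# PCI Express 2.0: 5 GT/s per lane, 8b/10b encoding (8 payload bits per 10 bits on the wire).
PCIE2_TRANSFERS_PER_S = 5_000_000_000
PROCESSOR_LANES = 16  # lanes provided by the processor, per the text


def lane_mb_per_s() -> int:
    """Peak payload bandwidth of one PCIe 2.0 lane in one direction, in MB/s."""
    payload_bits = PCIE2_TRANSFERS_PER_S * 8 // 10
    return payload_bits // 8 // 1_000_000


def slot_bandwidth(config: list[int]) -> list[int]:
    """Per-slot peak bandwidth (MB/s, one direction) for a lane split such as [8, 4, 4]."""
    if sum(config) > PROCESSOR_LANES:
        raise ValueError("configuration uses more lanes than the processor provides")
    return [lanes * lane_mb_per_s() for lanes in config]


if __name__ == "__main__":
    for split in ([16], [8, 8], [8, 4, 4]):
        print(split, "->", slot_bandwidth(split), "MB/s per direction")
```

Splitting the lanes does not change the total: every configuration shares the same 8000 MB/s per direction between the cards.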

An important part of the system agent is the dual-channel DDR3 memory controller integrated into the processor. It nominally supports modules rated at 1066-1333 MHz, but on motherboards based on the Intel P67 Express chipset it runs modules at up to 1600 and even 2133 MHz without any problems. Placing the memory controller on a single die with the processor cores (the Clarkdale core consisted of two dies) should reduce memory latency and, accordingly, increase system performance.
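As a rough guide to what these module ratings mean, the theoretical peak bandwidth of the dual-channel controller can be computed as below. The sketch treats the quoted "MHz" figures as effective data rates in mega-transfers per second and uses the standard 64-bit (8-byte) width of a DDR3 channel; `peak_bandwidth_gb_s` is a hypothetical helper name, and real sustained bandwidth is lower than these peaks.

```python
BYTES_PER_TRANSFER = 8  # one DDR3 channel is 64 bits wide


def peak_bandwidth_gb_s(mega_transfers: int, channels: int = 2) -> float:
    """Theoretical peak bandwidth in GB/s for a given data rate (MT/s) and channel count."""
    return mega_transfers * BYTES_PER_TRANSFER * channels / 1000


if __name__ == "__main__":
    for rate in (1066, 1333, 1600, 2133):  # module ratings mentioned in the text
        print(f"DDR3-{rate}, dual channel: {peak_bandwidth_gb_s(rate):.1f} GB/s peak")
```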

Thanks in part to the Power Control Unit's advanced monitoring of all cores, caches and auxiliary units, Sandy Bridge processors feature the enhanced Intel Turbo Boost 2.0 technology. Now, depending on the load and the tasks being performed, the processor cores can, if necessary, accelerate even beyond the thermal design power (TDP), as with ordinary manual overclocking. The system agent monitors the temperature of the processor and its components, and when "overheating" is detected the frequencies of the units are gradually reduced. On desktop processors, however, the time spent in this super-accelerated mode is limited, since here it is far easier to fit cooling several times more efficient than the boxed cooler. This "overboost" delivers extra performance at moments that are critical for the system, which should give the user the impression of working with a more powerful system and shorten the wait for the system to respond. Intel Turbo Boost 2.0 also lets the integrated graphics core adjust its performance dynamically in desktop computers.
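The idea of a time-limited boost can be illustrated with a deliberately simplified model. This is not Intel's algorithm: it assumes a fixed "thermal headroom" budget that is spent while the core runs at the turbo frequency and is restored while it idles, and it drops straight back to the base frequency instead of stepping down gradually. The function name, the budget and the 3.4 / 3.8 GHz values are illustrative assumptions.

```python
def simulate_turbo(load, base_ghz=3.4, turbo_ghz=3.8, budget=3):
    """Return the frequency chosen at each tick of a toy turbo model.

    load   -- sequence of truthy (busy) / falsy (idle) ticks
    budget -- hypothetical number of ticks the core may spend above its rated power
    """
    headroom = budget
    freqs = []
    for busy in load:
        if not busy:
            # Idle tick: the package cools down and regains one unit of headroom.
            headroom = min(budget, headroom + 1)
            freqs.append(0.0)
        elif headroom > 0:
            # Busy tick with headroom left: run at the turbo frequency.
            headroom -= 1
            freqs.append(turbo_ghz)
        else:
            # Headroom exhausted: fall back to the base frequency.
            freqs.append(base_ghz)
    return freqs


if __name__ == "__main__":
    print(simulate_turbo([1, 1, 1, 1, 0, 1]))
```

Short bursts of work run entirely at the higher frequency in this model, which is the effect the text describes as making the system feel more powerful than it is under sustained load.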

The architecture of Sandy Bridge processors implies not only changes in the structure of inter-component interaction and improvement of the capabilities and energy efficiency of these components, but also internal changes in each computing core. Leaving aside the "cosmetic" improvements, the most important are the following:

    the return of an L0 cache for about 1.5 thousand decoded micro-ops (an approach used in Pentium 4), implemented as a separate part of L1, which both loads the pipelines more evenly and reduces power consumption by giving the rather complex decoder circuits longer pauses;

    increasing the efficiency of the branch prediction block due to an increase in the capacity of the buffers of the addresses of the results of branching, command history, branch history, which increased the efficiency of pipelines;

    increasing the capacity of the buffer of reordered instructions (ROB - ReOrder Buffer) and increasing the efficiency of this part of the processor due to the introduction of a physical register file (PRF - Physical Register File, also a characteristic feature of the Pentium 4) for storing data, as well as expanding other buffers;

    doubling the width of the registers for streaming floating-point data, which in some cases can double the speed of the operations that use them;

    increasing the efficiency of executing encryption instructions for the AES, RSA and SHA algorithms;

    introduction of the new Advanced Vector Extensions (AVX) vector instructions;

    optimization of the first-level (L1) and second-level (L2) cache memory.
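The "twice the speed" figure for wider vector registers follows from simple arithmetic, sketched below. The register widths are the standard 128 bits for SSE and 256 bits for AVX; the helper name `lanes` is hypothetical, and the doubling is an upper bound for code that is limited purely by vector arithmetic.

```python
SSE_BITS = 128  # XMM register width
AVX_BITS = 256  # YMM register width introduced with AVX


def lanes(register_bits: int, element_bits: int) -> int:
    """How many elements of a given size fit into one vector register."""
    return register_bits // element_bits


if __name__ == "__main__":
    for name, bits in (("single precision", 32), ("double precision", 64)):
        print(f"{name}: {lanes(SSE_BITS, bits)} per SSE register, "
              f"{lanes(AVX_BITS, bits)} per AVX register")
```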

An important feature of the graphics core of Sandy Bridge processors is that it is now located in the same die with the rest of the blocks, and its characteristics and state monitoring are performed at the hardware level by the system agent. In this case, the block for processing media data and generating signals for video outputs is placed in this very system agent. This integration enables tighter communication, lower latency, greater efficiency, etc.

However, the architecture of the graphics core itself has not changed as much as we would like. Instead of the expected DirectX 11 support, only DirectX 10.1 support was added. Likewise, the not very numerous OpenGL applications are limited to hardware compatibility with version 3 of this free API's specification. And although the computing units are said to have been improved, their number is unchanged at 12, and then only in the higher-end processors. Still, raising the clock speed to 1350 MHz promises a noticeable performance boost anyway.

On the other hand, it is very difficult to create an integrated video core with really high performance and functionality for modern games with low power consumption. Therefore, the lack of support for new APIs will only affect compatibility with new games, and if you really want to play comfortably, performance will need to be increased using a discrete 3D accelerator. But the expansion of functionality when working with multimedia data, primarily when encoding and decoding video in the framework of Intel Clear Video Technology HD, can be counted among the advantages of Intel HD Graphics II (Intel HD Graphics 2000/3000).

The updated media processor allows unloading the processor cores when encoding video in MPEG2 and H.264 formats, and also expands the set of post-processing functions with hardware implementation of algorithms for automatic image contrast adjustment (ACE - Adaptive Contrast Enhancement), color correction (TCC - Total Color Control) and improving the display of the skin (STE - Skin Tone Enhancement). The support of the HDMI version 1.4 interface, compatible with Blu-ray 3D (Intel InTru 3D), increases the prospects for using the integrated video card.

All of the above architectural features provide the new generation of processors with a noticeable performance advantage over the previous generation models, both in computing tasks and when working with video.

As a result, the Intel LGA 1155 platform becomes more productive and functional, replacing the LGA 1156.

To summarize, the Sandy Bridge processors are designed to solve a very wide range of tasks with high energy efficiency, which should make them really mainstream in new productive systems, especially when more affordable models in a wide range become available.

In the near future, 8 processors for desktop systems of different levels will gradually become available to customers: Intel Core i7-2600K, Intel Core i7-2600, Intel Core i5-2500K, Intel Core i5-2500, Intel Core i5-2400, Intel Core i5-2300, Intel Core i3-2120 and Intel Core i3-2100. Models with the K index have an unlocked multiplier and a faster integrated Intel HD Graphics 3000 video adapter.

Also, for energy-critical systems, energy-efficient (S index) and high-energy-efficient (T index) models have been released.

To support the new processors, motherboards based on the Intel P67 Express and Intel H67 Express chipsets are already available today, and boards based on Intel Q67 Express and Intel B65 Express, aimed at corporate users and small businesses, are expected in the near future. All of these chipsets have finally started to support SATA 3.0 drives, although not on all ports. But they do not support the seemingly even more popular USB 3.0 bus. An interesting feature of the new chipsets for ordinary motherboards is that they have dropped PCI bus support. In addition, the clock generator is now built into the chipset, and its settings can be changed without affecting system stability only within a very small range: ± 10 MHz with luck, and even less in practice.

It should also be noted that different chipsets are optimized for different processors in systems designed for different purposes. Intel P67 Express differs from Intel H67 Express not only in lacking support for integrated video but also in offering advanced features for overclocking and performance tuning. Intel H67 Express, in turn, ignores the unlocked multiplier of the K-index models altogether.

Because of these architectural features, Sandy Bridge processors can still be overclocked only through the multiplier, and only on K-series models. However, all models are capable of some optimization and "overboost".

Thus, to create for a while the illusion of working on a very powerful processor, even models with a locked multiplier can accelerate noticeably. As mentioned above, the boost time on desktop systems is limited in hardware, not just by temperature as in mobile PCs.
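Why the multiplier is the only practical overclocking lever can be shown with simple arithmetic: the core frequency is the base clock multiplied by the multiplier. The sketch below assumes a nominal base clock of 100 MHz and a stock multiplier of 34 (illustrative values not stated in the text) together with the ± 10 MHz window mentioned above; all function names are hypothetical.

```python
NOMINAL_BCLK_MHZ = 100.0   # assumed nominal base clock
BCLK_TOLERANCE_MHZ = 10.0  # optimistic adjustment window quoted in the text


def core_mhz(bclk_mhz: float, multiplier: int) -> float:
    """Core frequency as base clock times multiplier."""
    return bclk_mhz * multiplier


def bclk_needed(target_mhz: float, multiplier: int) -> float:
    """Base clock required to reach a target frequency with a fixed multiplier."""
    return target_mhz / multiplier


def bclk_is_safe(bclk_mhz: float) -> bool:
    """True if the base clock stays within the assumed stable window."""
    return abs(bclk_mhz - NOMINAL_BCLK_MHZ) <= BCLK_TOLERANCE_MHZ


if __name__ == "__main__":
    print(core_mhz(100, 34))            # stock example: 3400 MHz
    print(core_mhz(100, 45))            # unlocked multiplier raised to 45: 4500 MHz
    needed = bclk_needed(4500, 34)      # same target with a locked multiplier
    print(round(needed, 1), bclk_is_safe(needed))
```

Under these assumptions a locked-multiplier model would need a base clock above 130 MHz to reach 4.5 GHz, far outside the window in which the system remains stable.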

Having covered all the architectural features and innovations, as well as the updated proprietary technologies, it remains only to summarize once more why Sandy Bridge is so innovative and to recall how the models are positioned.

For high-performance and mainstream systems, it will soon be possible to buy processors of the Intel Core i7 and Intel Core i5 series, which differ from each other in support for Intel Hyper-Threading technology (it is disabled in quad-core Intel Core i5 models) and in the size of the L3 cache. For more economical buyers there are new Intel Core i3 models, which have half as many cores (although they do support Intel Hyper-Threading) and only 3 MB of LLC cache, do not support Intel Turbo Boost 2.0, and are all equipped with Intel HD Graphics 2000.

In the middle of the year, Intel Pentium processors based on a heavily simplified Sandy Bridge architecture will be presented for mass-market systems (Intel finds this brand very hard to abandon, although its demise was predicted a year ago). In effect, these "workhorse" processors will resemble in their capabilities yesterday's Core i3-3xx on the Clarkdale core, since they will lose almost all the functions of the higher-end LGA 1155 models.

It remains to note that the release of the Sandy Bridge processors and the whole LGA 1155 desktop platform is the next "Tock" in Intel's "Tick-Tock" concept, i.e. a major architecture update released on the already mature 32 nm process technology. In about a year we will have Ivy Bridge processors with an optimized architecture and a 22 nm process technology, which will surely again have "revolutionary energy efficiency" but, hopefully, will not do away with the LGA 1155 processor socket. Well, let's wait and see. In the meantime, we have at least a year to study the Sandy Bridge architecture and test it thoroughly, which we are going to start doing in the coming days.

The capabilities of the Sandy Bridge graphics processor are broadly comparable to those of the previous generation of similar Intel solutions, except that DirectX 10.1 support has now been added to the DirectX 10 capabilities, instead of the expected DirectX 11 support. Likewise, the not very numerous OpenGL applications are limited to hardware compatibility with version 3 of this free API's specification.

Nevertheless, there are a lot of innovations in the Sandy Bridge graphics, and they are mainly aimed at increasing productivity when working with 3D graphics.

According to Intel representatives, the main emphasis in developing the new graphics core was on making maximum use of fixed-function hardware for 3D calculations, and likewise for processing media data. This approach is radically different from the fully programmable hardware model adopted by, for example, NVIDIA, or by Intel itself for the Larrabee design (with the exception of texture units).

In the Sandy Bridge implementation, however, the move away from programmable flexibility has undeniable advantages. It brings benefits that matter more for integrated graphics: lower latency of operations, better performance combined with energy savings, a simplified driver programming model and, importantly, a smaller physical size of the graphics module.

The programmable shader units of Sandy Bridge, traditionally called Execution Units (EU) by Intel, have larger register files, which allows complex shaders to execute efficiently. The new execution units also optimize branching to achieve better parallelization of the instructions being executed.

Overall, according to Intel representatives, the new execution units have double the throughput of the previous generation of integrated graphics, and thanks to the reliance on dedicated hardware, calculations of transcendental functions (trigonometry, natural logarithms, etc.) run 4 to 20 times faster.

The internal instruction set, extended in Sandy Bridge with a number of new instructions, allows most DirectX 10 API instructions to be mapped one-to-one, as in a CISC architecture, which yields significantly higher performance at the same clock speed.

Fast access over the ring bus to the distributed L3 cache, with dynamically configurable partitioning, reduces latency, raises performance and at the same time reduces how often the GPU has to access RAM.

Ring bus

The whole recent history of Intel's processor microarchitectures is inseparable from the steady integration onto a single die of more and more modules and functions that used to sit outside the processor: in the chipset, on the motherboard, and so on. As processor performance and the degree of integration grew, the bandwidth demanded of the internal interconnect grew even faster. For a while, even after a graphics chip was added to the Arrandale / Clarkdale designs, inter-component buses with the usual crossbar topology were enough.

However, such a topology is efficient only when a small number of components take part in the data exchange. To improve overall system performance, the Sandy Bridge developers turned to a ring topology for the 256-bit interconnect bus (Fig. 6.1). It is based on a new version of QPI (QuickPath Interconnect) technology, extended and modified, which was first implemented in the Nehalem-EX server chip (Xeon 7500) and was also planned for use with Larrabee chips.

In the desktop and mobile versions of the Sandy Bridge architecture, the ring bus (Ring Interconnect) exchanges data between six key components of the chip: four x86 processor cores, a graphics core, the L3 cache, now called the LLC (Last Level Cache), and a system agent. The bus consists of four 32-byte rings: the Data Ring, Request Ring, Snoop Ring and Acknowledge Ring. In practice, this lets an access to the 64-byte last-level cache interface be split into two separate packets. The rings are controlled by a distributed arbitration protocol, and requests are pipelined at the clock frequency of the processor cores, which gives the architecture additional flexibility when overclocking. Ring bus performance is rated at 96 GB/s per connection at 3 GHz, effectively four times that of previous-generation Intel processors.
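The 96 GB/s figure follows directly from the ring width and the clock frequency quoted above. The short sketch below only reproduces that arithmetic; the function names are illustrative and are not taken from any Intel documentation.

```python
def ring_bandwidth_gb_s(ring_width_bytes: int, clock_ghz: float) -> float:
    """Peak transfer rate of one ring connection in GB/s (10^9 bytes per second)."""
    # One ring-wide transfer per clock: bytes per cycle times billions of cycles per second.
    return ring_width_bytes * clock_ghz


def transfers_per_cache_line(line_bytes: int, ring_width_bytes: int) -> int:
    """Number of ring transfers needed to move one cache line (ceiling division)."""
    return -(-line_bytes // ring_width_bytes)


print(ring_bandwidth_gb_s(32, 3.0))      # 32 bytes x 3 GHz
print(transfers_per_cache_line(64, 32))  # a 64-byte access split into 32-byte packets
```

Because the ring runs at the core clock, the same formula also suggests why overclocking the cores raises interconnect bandwidth in step.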

Figure 6.1. Ring bus (Ring Interconnect)

The ring topology and bus organization provide minimal request latency, maximum performance and excellent scalability to chip versions with different numbers of cores and other components. According to company representatives, up to 20 processor cores per chip could be "connected" to the ring bus in the future, and such a redesign can be done very quickly, as a flexible and prompt response to current market needs. In addition, the ring bus physically sits directly above the L3 cache blocks in the top metal layer, which simplifies the layout and makes the chip more compact.
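To get a feel for how latency might scale as stops are added to the ring, here is a minimal sketch of hop counts. It assumes a bidirectional ring where a packet always takes the shorter direction; the text does not state this, so treat the routing rule and the function names as assumptions.

```python
def ring_hops(src: int, dst: int, stops: int) -> int:
    """Shortest hop count between two stops on a bidirectional ring (assumed routing)."""
    d = (dst - src) % stops
    return min(d, stops - d)


def worst_case_hops(stops: int) -> int:
    """Largest shortest-path distance from stop 0 to any other stop."""
    return max(ring_hops(0, dst, stops) for dst in range(stops))


def average_hops(stops: int) -> float:
    """Mean shortest-path distance from stop 0 to every other stop."""
    return sum(ring_hops(0, dst, stops) for dst in range(1, stops)) / (stops - 1)


for stops in (6, 10, 20):
    print(stops, worst_case_hops(stops), round(average_hops(stops), 2))
```

Under this assumption the worst-case distance grows only linearly, as half the number of stops, which is one way to read the scalability claim.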

Several years ago, in the reign of the Pentium brand, when the Intel Core trademark and the microarchitecture of the same name (Architecture 101) first appeared, slides about future processors first mentioned the next-generation Intel microarchitecture under the working name Gesher ("bridge" in Hebrew), which later became Sandy Bridge.

In that distant era of NetBurst dominance, when the contours of the upcoming Nehalem cores had only begun to emerge, and we were getting acquainted with the internal structure of the first representatives of the Core microarchitecture: Conroe for desktop systems, Merom for mobile and Woodcrest for servers ...

In a word, back when the grass was greener and Sandy Bridge seemed as far away as the Moon, Intel representatives were already saying that it would be a completely new processor microarchitecture. That is roughly how the mysterious Haswell microarchitecture looks today: it will appear after the Ivy Bridge generation, which, in turn, will replace Sandy Bridge next year.

However, the closer the release date of a new microarchitecture and the more we learn about its features, the more noticeable the similarities between neighboring generations become, and the more obvious the evolutionary path of changes in processor design. Indeed, while there is a real gulf between the first incarnations of the original Core architecture, Merom / Conroe, and the firstborn of the second Core generation, Sandy Bridge, the latest version of the current Core generation, the Westmere core, and the first version of Core generation II discussed today, the Sandy Bridge core, may seem similar.

Yet the differences are significant. So significant that now we can finally talk about the end of the 15-year era of the P6 microarchitecture (Pentium Pro) and the emergence of a new generation of Intel microarchitecture.

Sandy Bridge Microarchitecture: Bird's Eye View

The Sandy Bridge chip is a quad-core 64-bit processor with out-of-order instruction execution, support for two threads per core (HT) and four instructions per clock cycle; with integrated graphics and an integrated DDR3 memory controller; with a new ring bus and support for the 3- and 4-operand (128/256-bit) vector instructions of the AVX (Advanced Vector Extensions) set; manufactured on Intel's modern 32 nm process.
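The "3-operand, 256-bit" wording can be made concrete with a toy model. The sketch below uses plain Python lists as stand-ins for vector registers; the function names are hypothetical, and it illustrates only the operand semantics (a separate destination instead of overwriting a source), not real AVX instructions.

```python
VECTOR_BITS = 256   # AVX register width mentioned in the text
FLOAT_BITS = 32     # single-precision element size
LANES = VECTOR_BITS // FLOAT_BITS  # elements processed by one instruction


def two_operand_add(dst: list, src: list) -> None:
    """Older 2-operand style: the first source is overwritten with the result."""
    for i in range(len(dst)):
        dst[i] += src[i]


def three_operand_add(a: list, b: list) -> list:
    """3-operand style: the result goes to a separate destination, sources stay intact."""
    return [x + y for x, y in zip(a, b)]


a = [1.0] * LANES
b = [2.0] * LANES
c = three_operand_add(a, b)   # a and b are still available afterwards
two_operand_add(a, b)         # a now holds the sum; its old value is gone
print(LANES, c == a)
```

Keeping both sources intact is what saves the extra register copy that the 2-operand form often requires.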

That is how, in a single sentence, one can try to characterize the new, second generation of Intel Core processors for mobile and desktop systems, mass deliveries of which will begin in the very near future.

Second-generation Intel Core processors based on the Sandy Bridge microarchitecture will ship in a new 1155-contact LGA1155 package for new motherboards based on Intel 6 Series chipsets.

Approximately the same microarchitecture will be used in Intel Sandy Bridge-EP server solutions, with the differences being a larger number of processor cores (up to eight), a corresponding LGA2011 processor socket, a larger L3 cache, more DDR3 memory controllers and PCI Express 3.0 support.

The previous generation, the Westmere microarchitecture in its Arrandale and Clarkdale versions for mobile and desktop systems, is a two-die design: a 32 nm processor die and an additional 45 nm "coprocessor" with a graphics core and a memory controller on board, placed on a single substrate and exchanging data over the QPI bus. In effect, at that stage Intel engineers, relying mainly on earlier developments, created a kind of integrated hybrid chip.

In creating the Sandy Bridge architecture, the developers completed the integration that had begun with Arrandale / Clarkdale and placed all the elements on a single 32 nm die, abandoning the classic QPI bus in favor of a new ring bus. At the same time, the essence of the Sandy Bridge microarchitecture stays within Intel's previous philosophy, which relies on raising total processor performance by improving the "individual" efficiency of each core.

The structure of the Sandy Bridge chip can be roughly divided into the following main elements: processor cores, graphics core, L3 cache memory and the so-called "System Agent".

In general, the structure of the Sandy Bridge microarchitecture is clear. Our task today is to find out the purpose and implementation features of each of the elements of this structure.

Ring bus (Ring Interconnect)

The whole recent history of Intel's processor microarchitectures is inseparable from the steady integration onto a single die of more and more modules and functions that used to sit outside the processor: in the chipset, on the motherboard, and so on. As processor performance and the degree of integration grew, the bandwidth demanded of the internal interconnect grew even faster. For a while, even after a graphics chip was added to the Arrandale / Clarkdale designs, inter-component buses with the usual crossbar topology were enough.

However, such a topology is efficient only when a small number of components take part in the data exchange. To improve overall system performance, the Sandy Bridge developers turned to a ring topology for the 256-bit interconnect bus. It is based on a new version of QPI (QuickPath Interconnect) technology, extended and modified, which was first implemented in the Nehalem-EX server chip (Xeon 7500) and was also planned for use with Larrabee chips.
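One rough way to see why direct connections stop scaling is to count links. The sketch below assumes that the older topology means a dedicated link between every pair of components, which is a simplification for illustration; the function names are hypothetical.

```python
def crossbar_links(components: int) -> int:
    """Links needed if every pair of components gets its own direct connection."""
    return components * (components - 1) // 2


def ring_links(components: int) -> int:
    """Links in a ring: each stop is wired only to its next neighbor."""
    return components if components >= 3 else max(components - 1, 0)


for n in (3, 6, 10, 20):
    print(n, crossbar_links(n), ring_links(n))
```

The pairwise count grows quadratically while the ring grows linearly, so the gap widens quickly as cores and other blocks are added.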

In the desktop and mobile versions of Sandy Bridge (second-generation Core), the ring bus exchanges data between six key chip components: four x86 processor cores, a graphics core, an L3 cache and a system agent. The bus consists of four 32-byte rings: the Data Ring, Request Ring, Snoop Ring and Acknowledge Ring. In practice, this lets an access to the 64-byte last-level cache interface be split into two separate packets. The rings are controlled by a distributed arbitration protocol, and requests are pipelined at the clock frequency of the processor cores, which gives the architecture additional flexibility when overclocking. Ring bus performance is rated at 96 GB/s per connection at 3 GHz, effectively four times that of previous-generation Intel processors.

The ring topology and bus organization provide minimal request latency, maximum performance and excellent scalability to chip versions with different numbers of cores and other components. According to company representatives, up to 20 processor cores per chip could be "connected" to the ring bus in the future, and such a redesign can be done very quickly, as a flexible and prompt response to current market needs. In addition, the ring bus physically sits directly above the L3 cache blocks in the top metal layer, which simplifies the layout and makes the chip more compact.

L3 - last level cache, LLC

As you may have noticed, Intel's slides refer to the L3 cache as the "last level cache", that is, LLC. In the Sandy Bridge microarchitecture, the L3 cache is shared not only among the four processor cores but, thanks to the ring bus, also with the graphics core and the system agent, which, among other things, includes a hardware graphics acceleration module and a video output unit. A special tracking mechanism prevents access conflicts between the processor cores and the graphics.

Each of the four processor cores has direct access to its "own" L3 cache segment, and each segment provides half of its bus width for access to the ring data bus. The physical addressing of all four cache segments is handled by a single hash function. Each L3 cache segment has its own independent ring bus access controller, which is responsible for handling requests to the physical addresses it holds. In addition, the cache controller constantly interacts with the system agent regarding L3 misses, control of inter-component data exchange and non-cacheable accesses.
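The idea of one hash function spreading physical addresses across four cache segments can be sketched as follows. The hash here is a toy XOR fold invented for illustration; it is not Intel's actual function, which the text does not describe, and the constants and names are assumptions.

```python
LINE_BYTES = 64   # cache-line size, matching the 64-byte interface in the text
NUM_SLICES = 4    # one L3 segment per core


def slice_for_address(phys_addr: int, num_slices: int = NUM_SLICES) -> int:
    """Toy hash: XOR-fold the cache-line number down to a segment index.

    Illustrative only; the real mapping is not given in the article.
    """
    line = phys_addr // LINE_BYTES           # all bytes of one line share a segment
    bits = max((num_slices - 1).bit_length(), 1)
    mask = (1 << bits) - 1
    h = 0
    while line:
        h ^= line & mask                     # fold successive bit groups together
        line >>= bits
    return h % num_slices


counts = [0] * NUM_SLICES
for line_no in range(1024):
    counts[slice_for_address(line_no * LINE_BYTES)] += 1
print(counts)  # how evenly 1024 consecutive lines spread across the segments
```

The point of hashing rather than using a contiguous range per segment is that neighboring lines land in different segments, so no single segment or ring stop becomes a hot spot.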

Additional details about the structure and functioning of the L3 cache of Sandy Bridge processors will appear later in the text, in the process of getting to know the microarchitecture, as the need arises.

System agent: DDR3 memory controller, PCU and others

Previously, instead of System Agent, Intel terminology used the term Uncore ("non-core"), that is, "everything that is not part of the Core": the L3 cache, graphics, memory controller, other controllers such as PCI Express, and so on. Out of habit, we often called most of this the northbridge elements moved from the chipset into the processor.

The system agent of the Sandy Bridge microarchitecture includes a DDR3 memory controller, a Power Control Unit (PCU), PCI Express 2.0 and DMI controllers, a video output unit, etc. Like all other elements of the architecture, the system agent is connected to the rest of the system through the high-performance ring bus.

The standard version of the Sandy Bridge system agent provides 16 PCI-E 2.0 lanes, which can also be split into two PCI-E 2.0 links with 8 lanes each, or into one PCI-E 2.0 link with 8 lanes and two PCI-E 2.0 links with four lanes each. The dual-channel DDR3 memory controller is now back on the die (in Clarkdale chips it was located outside the processor die) and will most likely now provide significantly lower latency.
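The three lane configurations listed above can be written down and checked in a few lines. The per-lane rate of 500 MB/s per direction is the standard PCI Express 2.0 figure (5 GT/s with 8b/10b encoding) and does not come from the article; the names are illustrative.

```python
TOTAL_LANES = 16
MB_PER_LANE = 500  # PCI Express 2.0, per direction (assumed standard value)

# The three ways of dividing the 16 lanes described in the text.
SPLITS = {
    "1 x16": (16,),
    "2 x8": (8, 8),
    "1 x8 + 2 x4": (8, 4, 4),
}


def link_bandwidths_gb_s(split: tuple) -> list:
    """Peak one-direction bandwidth of each link in a split, in GB/s."""
    return [lanes * MB_PER_LANE / 1000 for lanes in split]


for name, split in SPLITS.items():
    assert sum(split) == TOTAL_LANES  # every configuration uses all 16 lanes
    print(name, link_bandwidths_gb_s(split))
```

Whatever the split, the total stays the same; only the share available to each device changes.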

The fact that the Sandy Bridge memory controller is dual-channel is unlikely to please those who have already spent a lot of money on triple-channel DDR3 overclocking kits. Well, it so happens that only kits of one, two or four modules will now be relevant.

We have some thoughts about the return to a dual-channel memory controller. Perhaps Intel has begun preparing its microarchitectures for DDR4 memory, which, because it moves from a "star" topology to a "point-to-point" topology, will by definition be only dual-channel in desktop and mobile versions (servers will use special multiplexer modules)? However, these are just guesses; there is not yet enough information about the DDR4 standard for confident assumptions.

The power management controller located in the system agent is responsible for the timely and dynamic scaling of supply voltages and clock frequencies of processor cores, graphics core, caches, memory controller and interfaces. Most importantly, power and clock management are independent for the processor cores and the graphics core.

A completely new version of Turbo Boost technology owes a good deal to this power control unit. Depending on the current state of the system and the complexity of the task, the Sandy Bridge microarchitecture allows Turbo Boost to "overclock" the processor cores and integrated graphics to a level that significantly exceeds the TDP for a fairly long time. Indeed, why not routinely take advantage of this opportunity while the cooling system is still cold and can dissipate more heat than when it has already warmed up?

Besides the fact that Turbo Boost now allows all four cores to run beyond the TDP, it is also worth noting that in Arrandale / Clarkdale chips, where the graphics core was in effect only embedded in the processor rather than fully integrated into it, performance and thermal management of the graphics was handled by the driver. In the Sandy Bridge architecture this task is also assigned to the PCU. Such tight integration of voltage and frequency control has made much more aggressive Turbo Boost scenarios practical: when necessary, and under certain conditions, both the graphics and all four processor cores can simultaneously run at increased clock frequencies, significantly exceeding the TDP but without any side effects.

The new version of Turbo Boost in Sandy Bridge processors was clearly explained in a multimedia presentation at the Intel Developer Forum in San Francisco in September. The video of that part of the presentation tells you about Turbo Boost faster and better than any retelling.

How effectively this technology will work in production processors remains to be seen, but what Intel showed during the closed demonstration of Sandy Bridge at IDF in San Francisco is simply amazing: the clock frequency, and with it the performance, of both the processor and the graphics can jump to fantastic levels at once.

There is information that for standard cooling systems the mode of such "overclocking" with the help of Turbo Boost and exceeding the TDP will be limited in the BIOS to a period of 25 seconds. But what if motherboard manufacturers can guarantee better heat dissipation with some exotic cooling system? This is where the expanse for overclockers opens up ...
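The interplay between thermal headroom and the 25-second limit can be illustrated with a toy energy-budget model. Everything here is an assumption made for illustration: the wattages, the idea of a fixed joule budget and the function name are hypothetical, and this is not how the PCU is documented to work.

```python
def turbo_seconds(tdp_w: float, turbo_w: float, budget_j: float,
                  max_seconds: float = 25.0) -> float:
    """Toy model: how long the chip can run above TDP.

    budget_j is the extra heat (in joules) a still-cold cooler can absorb;
    max_seconds is the firmware cap mentioned in the text.
    """
    excess_w = turbo_w - tdp_w
    if excess_w <= 0:
        return float("inf")          # at or below TDP there is nothing to limit
    return min(budget_j / excess_w, max_seconds)


print(turbo_seconds(95, 120, 500))   # budget runs out before the cap
print(turbo_seconds(95, 120, 1000))  # cap applies before the budget runs out
```

In this simplified picture, a better cooler corresponds to a larger budget, which only helps if the time cap is raised as well.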

Each of the four Sandy Bridge cores can be independently switched to its lowest power mode when necessary, and the graphics core can also be switched to a very economical mode. The ring bus and the L3 cache, because they are shared with other resources, cannot be switched off. However, the ring bus has a special economical standby mode for when it is not loaded, and the L3 cache uses the traditional technique of powering down unused transistors, already familiar from previous microarchitectures. As a result, Sandy Bridge processors in mobile PCs provide long battery life.

The video output and hardware multimedia decoding modules are also part of the system agent. Unlike its predecessors, where hardware decoding was entrusted to the graphics core (we will talk about its capabilities next time), the new architecture decodes multimedia streams in a separate, much faster and more economical module, and uses the shader units of the graphics core and the L3 cache only when encoding (compressing) multimedia data.

In line with current trends, support for playing 3D content is provided: the Sandy Bridge hardware decoding module can easily process two independent MPEG2, VC1 or AVC streams in Full HD resolution at once.

Today we got acquainted with the structure of the new, second-generation Intel Core microarchitecture with the working name Sandy Bridge, and looked at the structure and operation of several key elements of the system: the ring bus, the L3 cache and the system agent, which includes a DDR3 memory controller, a power control unit and other components.

However, this is only a small part of the new technologies and ideas implemented in the Sandy Bridge microarchitecture. No less impressive and large-scale changes have been made to the architecture of the processor cores and the integrated graphics. So our story about Sandy Bridge does not end here: to be continued.


1. Microarchitecture Sandy Bridge: briefly

The Sandy Bridge chip is a two- to four-core 64-bit processor ● with out-of-order instruction execution, ● with support for two threads per core (HT), ● with the execution of four instructions per cycle; ● with an integrated graphics core and an integrated DDR3 memory controller; ● with a new ring bus, ● with support for the 3- and 4-operand (128/256-bit) vector instructions of the AVX (Advanced Vector Extensions) set; it is manufactured on Intel's 32 nm process.

Thus, one sentence can be used to characterize the second generation of Intel Core processors for mobile and desktop systems, shipping since 2011.

Second-generation Intel Core processors based on the Sandy Bridge microarchitecture ship in a new 1155-contact LGA1155 package for new motherboards based on Intel 6 Series chipsets (Intel B65 Express, H61 Express, H67 Express, P67 Express, Q65 Express, Q67 Express, Z68 Express and Z77).


Much the same microarchitecture is used in the Intel Sandy Bridge-E server solutions, which differ in having more processor cores (up to 8), the LGA2011 processor socket, a larger L3 cache, more DDR3 memory controllers and PCI-Express 3.0 support.

The previous generation, the Westmere microarchitecture, was a two-die design: ● a 32 nm processor die and ● an additional 45 nm "coprocessor" with a graphics core and a memory controller on board, placed on a single substrate and exchanging data via the QPI bus, i.e. an integrated hybrid chip.

In creating the Sandy Bridge microarchitecture, the developers placed all the elements on a single 32 nm die, abandoning the classic bus in favor of a new ring bus.

The essence of the Sandy Bridge architecture remains the same - the stake is on increasing the total processor performance by improving the "individual" efficiency of each core.



The structure of the Sandy Bridge chip can be roughly divided into the following essential elements: ■ processor cores, ■ graphics core, ■ L3 cache and ■ System Agent. Let us describe the purpose and implementation features of each of the elements of this structure.

The whole recent history of Intel's processor microarchitectures is tied to the steady integration onto a single die of more and more modules and functions that used to sit outside the processor: in the chipset, on the motherboard, etc. As processor performance and chip integration increased, the bandwidth demanded of the internal interconnect grew even faster. Previously, interconnects with a crossbar topology were used, and that was enough.

However, such a topology is efficient only when a small number of components take part in the data exchange. Sandy Bridge turned to a ring topology for the 256-bit interconnect bus, based on a new version of QPI (QuickPath Interconnect).

The bus serves for data exchange between chip components:


● 4 x86 processor cores,

● graphics core,

● L3 cache and

● a system agent.


The bus consists of four 32-byte rings:

■ the data bus (Data Ring),

■ the request bus (Request Ring),

■ the status monitoring bus (Snoop Ring) and

■ the acknowledge bus (Acknowledge Ring).


The rings are controlled by a distributed arbitration protocol, and requests are pipelined at the clock frequency of the processor cores, which gives the microarchitecture additional flexibility when overclocking. Ring bus performance is rated at 96 GB/s per connection at a clock frequency of 3 GHz, which is 4 times higher than in previous-generation Intel processors.

The ring topology and bus organization provides ● minimum latency in processing requests, ● maximum performance, and ● excellent scalability of the technology for versions of chips with different numbers of cores and other components.

In the future, up to 20 processor cores per chip could be "connected" to the ring bus, and such a redesign can be done very quickly, as a flexible and prompt response to current market needs.

In addition, the physical ring bus sits directly above the L3 cache blocks at the top layer of metallization, which simplifies the layout of the design and allows the chip to be more compact.