Original author: 
Peter Bright


AMD wants to talk about Heterogeneous Systems Architecture (HSA), its vision for the future of system architectures. To that end, it held a press conference last week to discuss what it's calling "heterogeneous Uniform Memory Access" (hUMA). The company outlined what it is doing and why, reaffirming and expanding on things it has been saying for the last couple of years.

The central HSA concept is that systems will have multiple different kinds of processors, connected together and operating as peers. The two main kinds are the conventional, versatile CPU and the more specialized GPU.

Modern GPUs have enormous parallel arithmetic power, especially floating point arithmetic, but are poorly suited to single-threaded code with lots of branches. Modern CPUs are well-suited to single-threaded code with lots of branches, but less well-suited to massively parallel number crunching. Splitting workloads between a CPU and a GPU, using each for the workloads it's good at, has driven the development of general-purpose GPU (GPGPU) computing.
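As a toy illustration of that split (a Python sketch, not anything from AMD's tooling), the same program often contains both kinds of work: a uniform arithmetic map over a large array, which is the shape of workload a GPU's many ALUs excel at, and data-dependent control flow, which a CPU's branch predictors handle well:

```python
def data_parallel_part(samples):
    # Uniform arithmetic on every element: the kind of work suited to GPU offload.
    return [x * x * 0.5 for x in samples]

def branchy_part(tokens):
    # Data-dependent control flow: better kept on the CPU.
    total = 0
    for t in tokens:
        if t.startswith("#"):
            continue          # comment token, skip
        elif t.isdigit():
            total += int(t)   # numeric token, accumulate
        else:
            total -= 1        # penalty for anything else
    return total

print(data_parallel_part([1.0, 2.0, 3.0]))         # [0.5, 2.0, 4.5]
print(branchy_part(["#skip", "10", "oops", "5"]))  # 14
```

In an HSA-style system the first function is the natural candidate for the GPU, while the second stays on the CPU, with both operating on the same memory.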



Intel has been working on its "Many Integrated Core" architecture for quite some time, but the chipmaker has finally taken the code-name gloves off and announced that Knights Corner will be the first in a new family of Xeon processors — the Xeon Phi. These co-processors will debut later this year (Intel says "by the end of 2012"), and will come in the form of a 50-core PCIe card that includes at least 8GB of GDDR5 RAM. The card runs an independent Linux operating system that manages each x86 core, and Intel is hoping that giving developers a familiar architecture to program for will make the Xeon Phi a much more attractive platform than Nvidia's Tesla.

The Phi is part of Intel's High Performance Computing (HPC) program, where the...



Given how long 64-bit processors have been on the market, it's a bit surprising to see a vulnerability that takes advantage of AMD's x86-64 instruction set on Intel processors surface this late in the game. The vulnerability was originally thought to be Linux-specific, but was only recently found to be exploitable in Windows, BSD, and potentially OS X.

Originally discovered by CERT, the vulnerability takes advantage of the intricate mechanics of how memory is copied from one security level to another. In a nutshell, when AMD was creating its x86-64 instruction set, it opted to restrict the addressable memory space to 48 bits, leaving bits 48 through 63 unused. In order to prevent hackers from putting malicious data in this out-of-bounds...
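The guard in question is canonical addressing: under 48-bit addressing, bits 48-63 of a valid 64-bit virtual address must all be copies of bit 47 (sign extension), and addresses that violate this are rejected by the hardware. A minimal sketch of that check:

```python
def is_canonical(addr: int) -> bool:
    """True if a 64-bit virtual address is canonical under 48-bit
    addressing: bits 48-63 must all equal bit 47 (sign extension)."""
    bit47 = (addr >> 47) & 1
    upper = addr >> 48  # bits 48-63
    return upper == (0xFFFF if bit47 else 0)

print(is_canonical(0x00007FFFFFFFFFFF))  # True: top of the lower canonical half
print(is_canonical(0xFFFF800000000000))  # True: bottom of the upper half
print(is_canonical(0x0000800000000000))  # False: non-canonical "hole"
```

The vulnerability hinges on what happens when a non-canonical address slips through a privilege transition, so the interesting cases are exactly the ones this predicate rejects.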


This series of five manuals describes everything you need to know about optimizing
code for x86 and x86-64 family microprocessors, including optimization advice for C++
and assembly language, details about the microarchitecture and instruction
timings of Intel, AMD and VIA processors, and details about different compilers and
calling conventions.

Intel microprocessors covered: Intel Pentium 1 through Pentium 4,
Pentium D, Pentium M, Core Duo, Core 2, Core i7, Atom, but not Itanium.
AMD microprocessors covered: Athlon 64, Opteron.
VIA microprocessors covered: Nano.
Operating systems covered: DOS, Windows, Linux, BSD, Intel-based Mac OS X.
Includes coverage of 64-bit systems.

Note that these manuals are not for beginners.

1. Optimizing software in C++:
An optimization guide for Windows, Linux and Mac platforms
This is an optimization manual for advanced C++ programmers.
Topics include: The choice of platform and operating system. Choice of
compiler and framework. Finding performance bottlenecks.
The efficiency of different C++ constructs. Multi-core systems.
Parallelization with vector operations. CPU dispatching. Efficient
container class templates. Etc.

File name: optimizing_cpp.pdf, size: 900271, last modified: 2011-Jun-08.
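One of the listed topics, CPU dispatching, means selecting the best implementation of a routine once, at startup, based on which instruction-set extensions the CPU reports, so the hot path pays no repeated feature checks. A minimal Python sketch of the structure (not the manual's own code; the feature flag here stands in for a real CPUID query):

```python
def dot_generic(a, b):
    # Portable fallback implementation.
    return sum(x * y for x, y in zip(a, b))

def dot_simd(a, b):
    # Stand-in for a hand-vectorized (e.g. SSE/AVX) version.
    return sum(x * y for x, y in zip(a, b))

def select_dot(cpu_has_avx: bool):
    # Dispatch decision made once; callers then use the returned function.
    return dot_simd if cpu_has_avx else dot_generic

dot = select_dot(cpu_has_avx=False)  # in C++ the flag would come from CPUID
print(dot([1, 2, 3], [4, 5, 6]))     # 32
```

In the C++ setting the same idea is usually realized with a function pointer initialized at program start, which is the pattern the manual's dispatching chapters discuss.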


2. Optimizing subroutines in assembly language:
An optimization guide for x86 platforms
This is an optimization manual for advanced assembly language programmers
and compiler makers.
Topics include: C++ intrinsic functions, inline assembly and stand-alone assembly.
Linking optimized assembly subroutines into high level language programs.
Making subroutine libraries compatible with multiple compilers and operating systems.
Optimizing for speed or size. Memory access. Loops. Vector programming (XMM, YMM, SIMD).
CPU-specific optimization and CPU dispatching.

File name: optimizing_assembly.pdf, size: 862273, last modified: 2011-Jun-08.

3. The microarchitecture of Intel, AMD and VIA CPUs:
An optimization guide for assembly programmers and compiler makers
This manual contains details about the internal working of various microprocessors
from Intel, AMD and VIA. Topics include: Out-of-order execution, register renaming,
pipeline structure, execution unit organization and branch prediction algorithms
for each type of microprocessor. Describes many details that cannot be found
in manuals from microprocessor vendors or anywhere else. The information is
based on my own research and measurements rather than on official sources.
This information will be useful to programmers who want to make CPU-specific
optimizations as well as to compiler makers and students of microarchitecture.

File name: microarchitecture.pdf, size: 1385579, last modified: 2011-Jun-08.


4. Instruction tables:
Lists of instruction latencies, throughputs and micro-operation
breakdowns for Intel, AMD and VIA CPUs
Contains detailed lists of instruction latencies, execution unit throughputs,
micro-operation breakdown and other details for all application instructions
of most microprocessors from Intel, AMD and VIA. Intended as an appendix to the
preceding manuals.

File name: instruction_tables.pdf, size: 1585771, last modified: 2011-Jun-08.


5. Calling conventions for different C++ compilers and operating systems
This document contains details about data representation,
function calling conventions, register usage conventions, name mangling schemes,
etc. for many different C++ compilers and operating systems. Discusses compatibilities
and incompatibilities between different C++ compilers. Includes information that
is not covered by the official Application Binary Interface standards (ABIs).
The information provided here is based on my own research and therefore
descriptive rather than normative.
Intended as a source of reference for programmers who want to make function
libraries compatible with multiple compilers or operating systems and for
makers of compilers and other development tools who want their tools to be
compatible with existing tools.

File name: calling_conventions.pdf, size: 414709, last modified: 2011-Jun-08.
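Name mangling is one reason such a reference is needed: different compilers encode the same C++ function signature into different linker symbols, so object files from different toolchains may not link. As a rough illustration, here is a Python sketch of a tiny subset of the Itanium C++ ABI scheme (used by GCC and Clang) for free functions; it models only plain names with a few parameter types, nothing like the full rules, and MSVC uses an entirely different scheme for the same functions:

```python
# Itanium-ABI type codes for a few basic parameter types.
TYPE_CODES = {"int": "i", "double": "d", "void": "v"}

def itanium_mangle(name: str, params: list) -> str:
    # _Z, then the length-prefixed name, then one code per parameter
    # (a parameterless function is encoded as taking "void").
    codes = "".join(TYPE_CODES[p] for p in params) or "v"
    return f"_Z{len(name)}{name}{codes}"

print(itanium_mangle("foo", ["int"]))               # _Z3fooi
print(itanium_mangle("dot", ["double", "double"]))  # _Z3dotdd
```

Because the return type is not encoded for ordinary functions, two functions differing only in return type mangle identically under this scheme — exactly the kind of detail the manual catalogs per compiler.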


All five manuals
Download all the above manuals together in one zip file.

File name:, size: 4151138, last modified: 2011-Jun-08.

