tango.core.Cpuid

Identify the characteristics of the host CPU, providing information about cache sizes and assembly optimisation hints.

Some of this information was extremely difficult to track down. Some of the documents below were found only in cached versions stored by search engines!

This code relies on information found in:
- "Intel(R) 64 and IA-32 Architectures Software Developer's Manual, Volume 2A: Instruction Set Reference, A-M" (2007).
- "AMD CPUID Specification", Advanced Micro Devices, Rev 2.28 (2008).
- "AMD Processor Recognition Application Note For Processors Prior to AMD Family 0Fh Processors", Advanced Micro Devices, Rev 3.13 (2005).
- "AMD Geode(TM) GX Processors Data Book", AMD, Publication ID 31505E (2005).
- "AMD K6 Processor Code Optimisation", Advanced Micro Devices, Rev D (2000).
- "Application note 106: Software Customization for the 6x86 Family", Cyrix Corporation, Rev 1.5 (1998).
- http://ftp.intron.ac/pub/document/cpu/cpuid.htm
- "Geode(TM) GX1 Processor Series Low Power Integrated X86 Solution", National Semiconductor (2002).
- "The VIA Isaiah Architecture", G. Glenn Henry, Centaur Technology, Inc (2008).
- http://www.sandpile.org/ia32/cpuid.htm
- http://grafi.ii.pw.edu.pl/gbm/x86/cpuid.html
- "What every programmer should know about memory", Ulrich Drepper, Red Hat, Inc. (2007).

Authors:

Don Clugston, Tomas Lindquist Olsen <tomas@famolsen.dk>, Fawzi Mohamed

Bugs:

Currently only works on x86 CPUs. Many processors have bugs in their microcode for the CPUID instruction, so sometimes the cache information may be incorrect.
struct CacheInfo #
Cache size and behaviour
size_t size #
Size of the cache, in kilobytes, per CPU. For L1 unified (data + code) caches, this size is half the physical size. (We don't halve it for larger caches, since normally data size is much greater than code size for critical loops.)
ubyte associativity #
Number of ways of associativity, e.g.:
1 = direct mapped
2 = 2-way set associative
3 = 3-way set associative
ubyte.max = fully associative
uint lineSize #
Number of bytes read into the cache when a cache miss occurs.
uint nThreadSharing #
How many threads share the cache; 0 = unknown.
bool wildGuess #
True if these numbers cannot really be trusted.
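The size, associativity and line size together determine the geometry of a cache. The following is a minimal sketch only: the struct is a local stand-in that mirrors the documented field names (it is not the Tango type), and `numSets` is a hypothetical helper, not part of this module.

```d
import std.stdio;

// Local stand-in mirroring the documented CacheInfo fields (not the Tango type).
struct CacheInfo
{
    size_t size;          // kilobytes, per CPU
    ubyte associativity;  // ways; ubyte.max means fully associative
    uint lineSize;        // bytes read into the cache on a miss
}

// Hypothetical helper: number of sets, or 0 when the geometry is unknown.
size_t numSets(CacheInfo c)
{
    if (c.lineSize == 0 || c.associativity == 0)
        return 0;
    if (c.associativity == ubyte.max)
        return 1;                                // fully associative: one set
    size_t lines = c.size * 1024 / c.lineSize;   // total number of cache lines
    return lines / c.associativity;
}

void main()
{
    auto l1 = CacheInfo(32, 8, 64);   // illustrative: 32 KB, 8-way, 64-byte lines
    writeln(numSets(l1));
}
```

With these illustrative values the cache holds 512 lines arranged in 64 sets. Real code should also consult `wildGuess` before relying on the result.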
CpuInfo mainCpu [static] #
the main type of cpu
CpuInfo currentCpu() #
the current type of cpu
bool uniqueCpuType() #
if the system has only one kind of cpu (at the moment hardcoded to true)
class CpuInfo #
information on a cpu
If you think the accessors for x86, sparc, arm, ppc,... should always be defined (and throw or return null when they do not apply), post a ticket explaining why.
CacheInfo[5] datacache [protected] #
The data caches. If there are fewer than 5 physical cache levels, the remaining levels are set to size_t.max/1024 (== entire memory space). (Make this a function?)
uint numCacheLevels [protected] #
Number of cache levels.
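Because absent levels carry the sentinel size_t.max/1024, code that walks the cache hierarchy can stop at the first sentinel entry. A minimal sketch, assuming the sizes have been copied into a plain array of kilobyte values (`datacache` itself is protected, so real code would sit in a subclass or behind an accessor); `noCache` and `realLevels` are hypothetical names.

```d
import std.stdio;

// Sentinel the documentation describes for levels that do not exist.
enum size_t noCache = size_t.max / 1024;

// Hypothetical helper: count the leading levels that describe a real cache.
uint realLevels(const size_t[5] sizesKb)
{
    uint n = 0;
    foreach (kb; sizesKb)
    {
        if (kb == noCache)
            break;      // this and all further levels mean "entire memory space"
        ++n;
    }
    return n;
}

void main()
{
    // Illustrative values only: 32 KB L1, 256 KB L2, 8 MB L3, two absent levels.
    size_t[5] sizes = [32, 256, 8192, noCache, noCache];
    writeln(realLevels(sizes));
}
```

One consequence of this sentinel is that a blocking algorithm which simply asks "does my working set fit in level N?" always gets "yes" for absent levels, without special-casing them.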
char [] vendorName [protected] #
vendor name (only for display purposes)
char [] processorName [protected] #
name of the processor (only for display purposes)
bool getCpuData() [protected] #
Tries to get valid data from the current CPU; returns false if it fails.
char[] vendor() [public] #
Returns vendor string, for display purposes only. Do NOT use this to determine features! Note that some CPUs have programmable vendorIDs.
char[] processor() [public] #
Returns processor string, for display purposes only
bool hyperThreading() [public] #
Is hyperthreading supported?
uint threadsPerCPU() [public] #
Returns number of threads per CPU
uint coresPerCPU() [public] #
Returns number of cores in CPU
void clear() [public] #
clears info stored in this object
CpuInfo dup() [public] #
duplicates this object
CpuInfo opSliceAssign(CpuInfo other) [public] #
copies data from one object to the other
void cacheFixup() [protected] #
sets unset values in cache info
CpuInfoX86 x86() [public, final] #
utility method to get information about x86 processors
class CpuInfoX86 : CpuInfo [final] #
If optimizing for a particular processor, it is generally better to identify based on features rather than model. NOTE: Normally it's only worthwhile to optimise for the latest Intel and AMD CPU, with a backup for other CPUs. Rows beginning with '+' list features in addition to the preceding base entry.

Pentium    -- preferPentium1()
PMMX       --   + mmx()
PPro       -- default
PII        --   + mmx()
PIII       --   + mmx() + sse()
PentiumM   --   + mmx() + sse() + sse2()
Pentium4   -- preferPentium4()
PentiumD   --   + isX86_64()
Core2      -- default + isX86_64()
AMD K5     -- preferPentium1()
AMD K6     --   + mmx()
AMD K6-II  --   + mmx() + amd3dnow()
AMD K7     -- preferAthlon()
AMD K8     --   + sse2()
AMD K10    --   + isX86_64()
Cyrix 6x86 -- preferPentium1()
   6x86MX  --   + mmx()
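Identifying by feature rather than by model usually means testing the most capable instruction set first and falling back step by step. A minimal sketch, assuming the results of the documented `mmx()`, `sse()` and `sse2()` queries have been copied into a local struct; `Features`, `choosePath` and the path names are hypothetical.

```d
import std.stdio;

// Hypothetical snapshot of three of the documented feature queries.
struct Features
{
    bool mmx;
    bool sse;
    bool sse2;
}

// Pick the most capable code path the features allow; names are illustrative.
string choosePath(Features f)
{
    if (f.sse2) return "sse2";
    if (f.sse)  return "sse";
    if (f.mmx)  return "mmx";
    return "generic";
}

void main()
{
    auto pentium3 = Features(true, true, false);   // PIII row: mmx() + sse()
    writeln(choosePath(pentium3));
}
```

A CPU that reports none of the features still gets a working "generic" path, which is the backup for other CPUs that the note above recommends.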
uint stepping [public] #
uint model [public] #
uint family [public] #
Processor type (vendor-dependent). This should be visible ONLY for display purposes.
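For orientation, the three values correspond to fields of the EAX register returned by CPUID leaf 1. The sketch below decodes them using the Intel convention; it is an assumption that this matches what the module stores, and AMD applies the extended model bits only when the base family is 0xF. `Signature` and `decode` are hypothetical names.

```d
import std.stdio;

// Hypothetical container for the decoded values.
struct Signature
{
    uint family;
    uint model;
    uint stepping;
}

// Decode EAX from CPUID leaf 1 (Intel convention; an assumption, see lead-in).
Signature decode(uint eax)
{
    uint stepping   = eax & 0xF;           // bits 3..0
    uint baseModel  = (eax >> 4) & 0xF;    // bits 7..4
    uint baseFamily = (eax >> 8) & 0xF;    // bits 11..8
    uint extModel   = (eax >> 16) & 0xF;   // bits 19..16
    uint extFamily  = (eax >> 20) & 0xFF;  // bits 27..20

    uint family = baseFamily;
    if (baseFamily == 0xF)
        family += extFamily;               // extended family only applies here
    uint model = baseModel;
    if (baseFamily == 0x6 || baseFamily == 0xF)
        model |= extModel << 4;            // extended model forms the high nibble
    return Signature(family, model, stepping);
}

void main()
{
    // Example value chosen to decode to family 6, model 0x1C.
    auto s = decode(0x000106C2);
    writefln("family %s, model 0x%X, stepping %s", s.family, s.model, s.stepping);
}
```

As the description says, these numbers are vendor-dependent and suitable for display; feature queries remain the better basis for choosing code paths.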
bool x87onChip() [public] #
Does it have an x87 FPU on-chip?
bool mmx() [public] #
Is MMX supported?
bool sse() [public] #
Is SSE supported?
bool sse2() [public] #
Is SSE2 supported?
bool sse3() [public] #
Is SSE3 supported?
bool ssse3() [public] #
Is SSSE3 supported?
bool sse41() [public] #
Is SSE4.1 supported?
bool sse42() [public] #
Is SSE4.2 supported?
bool sse4a() [public] #
Is SSE4a supported?
bool sse5() [public] #
Is SSE5 supported?
bool amd3dnow() [public] #
Is AMD 3DNOW supported?
bool amd3dnowExt() [public] #
Is AMD 3DNOW Ext supported?
bool amdMmx() [public] #
Are AMD extensions to MMX supported?
bool hasFxsr() [public] #
Is fxsave/fxrstor supported?
bool hasCmov() [public] #
Is cmov supported?
bool hasRdtsc() [public] #
Is rdtsc supported?
bool hasCmpxchg8b() [public] #
Is cmpxchg8b supported?
bool hasCmpxchg16b() [public] #
Is cmpxchg16b supported?
bool has3dnowPrefetch() [public] #
Is 3DNow prefetch supported?
bool hasLahfSahf() [public] #
Are LAHF and SAHF supported in 64-bit mode?
bool hasPopcnt() [public] #
Is POPCNT supported?
bool hasLzcnt() [public] #
Is LZCNT supported?
bool isX86_64() [public] #
Is this an Intel 64 or AMD64 processor?
bool isItanium() [public] #
Is this an IA64 (Itanium) processor?
bool hyperThreading() [public] #
Is hyperthreading supported?
uint threadsPerCPU() [public] #
Returns number of threads per CPU
uint coresPerCPU() [public] #
Returns number of cores in CPU
bool preferAthlon() [public] #
Optimisation hints for assembly code. For forward compatibility, the CPU is compared against different microarchitectures. For 32-bit x86, comparisons are made against the Intel PPro/PII/PIII/PM family.

The major 32-bit x86 microarchitecture 'dynasties' have been:
(1) Intel P6 (PentiumPro, PII, PIII, PM, Core, Core2).
(2) AMD Athlon (K7, K8, K10).
(3) Intel NetBurst (Pentium 4, Pentium D).
(4) In-order Pentium (Pentium1, PMMX).
Other early CPUs (Nx586, AMD K5, K6, Centaur C3, Transmeta, Cyrix, Rise) were mostly in-order.

Some new processors do not fit into the existing categories:
Intel Atom 230/330 (family 6, model 0x1C) is an in-order core.
Centaur Isaiah = VIA Nano (family 6, model F) is an out-of-order core.

Within each dynasty, the optimisation techniques are largely identical (e.g., use instruction pairing for group 4). Major instruction set improvements occur within each group.

Does this CPU perform better on AMD K7 code than PentiumPro..Core2 code?

bool preferPentium4() [public] #
Does this CPU perform better on Pentium4 code than PentiumPro..Core2 code?
bool preferPentium1() [public] #
Does this CPU perform better on Pentium I code than Pentium Pro code?
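The three hints can drive the choice between differently scheduled versions of the same routine, with the PentiumPro..Core2 family as the baseline when no hint is set. A minimal sketch, assuming the hint results have been copied into a local struct; `Hints`, `Variant` and `chooseVariant` are hypothetical, and the order of the checks is a design choice of this example rather than something the module prescribes.

```d
import std.stdio;

// Hypothetical snapshot of the three documented optimisation hints.
struct Hints
{
    bool preferPentium1;
    bool preferPentium4;
    bool preferAthlon;
}

// One variant per microarchitecture dynasty described above.
enum Variant { p6, athlon, netburst, inOrder }

// Map the hints to a code variant; P6 is the default comparison point.
Variant chooseVariant(Hints h)
{
    if (h.preferPentium1) return Variant.inOrder;
    if (h.preferPentium4) return Variant.netburst;
    if (h.preferAthlon)   return Variant.athlon;
    return Variant.p6;
}

void main()
{
    auto k8 = Hints(false, false, true);   // illustrative: an Athlon-dynasty CPU
    writeln(chooseVariant(k8));
}
```

Combining this with feature checks such as `sse2()` gives the two-level selection the class description suggests: dynasty for scheduling, features for instruction set.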
CpuInfoX86 opSliceAssign(CpuInfo o) [public] #
copies data from one object to the other
bool getCpuData() [protected, override] #
Automatically fills in the data for the current CPU.
class CpuInfoSparc : CpuInfo [final] #
This should be expanded by someone using SPARC.