How Many Registers Does A Cpu Hold?
ENOSUCHBLOG
Programming, philosophy, pedaling.
How many registers does an x86-64 CPU have?
Nov 30, 2020 Tags: programming, x86
x86 is back in the general programmer discourse, in part thanks to Apple's M1 and Rosetta 2. As such, I figured I'd do yet another x86-64 post.
Just like the last one, I'm going to cover a facet of the x86-64 ISA that sets it apart as unusually complex among modern ISAs: the number and diversity of registers available.
Like instruction counting, register counting on x86-64 is subject to debates over methodology. In particular, for this blog post, I'm going to lay the following ground rules:
- I will count sub-registers (e.g., `EAX` for `RAX`) as distinct registers. My justification: they have different instruction encodings, and both Intel and AMD optimize/pessimize particular sub-register use patterns in their microcode.
- I will count registers that are present on x86-64 CPUs, but that can't be used in long mode.
- I won't count registers that are only present on older x86 CPUs, like the 80386 and 80486 test registers.
- I won't count microarchitectural implementation details, like shadow registers.
- I will count registers that aren't directly addressable, like MSRs that can only be accessed through `RDMSR`. However, I won't (or will try not to) double-count registers that have multiple access mechanisms (like `RDMSR` and `RDTSC`).
- I won't count model-specific registers that fall into these categories:
  - MSRs that are only present on niche x86 vendors (Cyrix, Via)
  - MSRs that aren't widely available on recent-ish x86-64 CPUs
    - Errata: I accidentally included AVX-512 in some of the original counts below, not realizing that it hadn't been released on any AMD CPUs. The post has been updated.
  - MSRs that are completely undocumented (both officially and unofficially)
In addition to the rules above, I'm going to use the following considerations and methodology for grouping registers together:
- Many sources, both official and unofficial, use "model-specific register" as an umbrella term for any non-core or non-feature-set register supplied by an x86-64 CPU. Whenever possible, I'll try to avoid this in favor of more specific categories.
- Both Intel and AMD provide synonyms for registers (e.g. `CR8` as the "task priority register," or `TPR`). Whenever possible, I'll try to use the more generic/category conforming name (like `CR8` in the example above).
- In general, the individual cores of a multicore processor have independent register states. Whenever this isn't the case, I'll make an attempt to document it.
General-purpose registers
The general-purpose registers (or GPRs) are the primary registers in the x86-64 register model. As their name implies, they are the only registers that are fully general purpose: each has a set of conventional uses[1], but programmers are generally free to ignore those conventions and use them as they please[2].
Because x86-64 evolved from a 32-bit ISA which in turn evolved from a 16-bit ISA, each GPR has a set of subregisters that hold the lower 8, 16 and 32 bits of the full 64-bit register.
As a table:
64-bit | 32-bit | 16-bit | 8-bit (low) |
---|---|---|---|
RAX | EAX | AX | AL |
RBX | EBX | BX | BL |
RCX | ECX | CX | CL |
RDX | EDX | DX | DL |
RSI | ESI | SI | SIL |
RDI | EDI | DI | DIL |
RBP | EBP | BP | BPL |
RSP | ESP | SP | SPL |
R8 | R8D | R8W | R8B |
R9 | R9D | R9W | R9B |
R10 | R10D | R10W | R10B |
R11 | R11D | R11W | R11B |
R12 | R12D | R12W | R12B |
R13 | R13D | R13W | R13B |
R14 | R14D | R14W | R14B |
R15 | R15D | R15W | R15B |
Some of the 16-bit subregisters are also special: the original 8086 allowed the high byte of `AX`, `BX`, `CX`, and `DX` to be accessed independently, so x86-64 preserves this for some encodings:
16-bit | 8-bit (high) |
---|---|
AX | AH |
BX | BH |
CX | CH |
DX | DH |
So that's 16 full-width GPRs, fanning out to another 52 subregisters.
Registers in this group: 68.
Running total: 68.
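To double-check that arithmetic, here is a small Python sketch that rebuilds the names from the two tables above and counts them. The list and variable names (`LEGACY`, `full_width`, and so on) are mine, chosen for illustration only.

```python
# The eight legacy 16-bit names; every other GPR name derives from these.
LEGACY = ["AX", "BX", "CX", "DX", "SI", "DI", "BP", "SP"]

# 64-bit registers: RAX..RSP plus R8..R15.
full_width = ["R" + r for r in LEGACY] + [f"R{n}" for n in range(8, 16)]
# 32-bit sub-registers: EAX..ESP plus R8D..R15D.
sub_32 = ["E" + r for r in LEGACY] + [f"R{n}D" for n in range(8, 16)]
# 16-bit sub-registers: AX..SP plus R8W..R15W.
sub_16 = list(LEGACY) + [f"R{n}W" for n in range(8, 16)]
# Low bytes: AL/BL/CL/DL, SIL/DIL/BPL/SPL, and R8B..R15B.
low_8 = [(r[0] + "L" if r.endswith("X") else r + "L") for r in LEGACY]
low_8 += [f"R{n}B" for n in range(8, 16)]
# High bytes exist only for the four "X" registers: AH, BH, CH, DH.
high_8 = [r[0] + "H" for r in LEGACY if r.endswith("X")]

subregisters = sub_32 + sub_16 + low_8 + high_8
all_gprs = full_width + subregisters

print(len(full_width), len(subregisters), len(all_gprs))
```

Under the counting rules of this post, that should print 16, 52, and 68.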
Special registers
This is sort of an artificial category: like every ISA, x86-64 has a few "special" registers that keep things moving along. In particular:
- The instruction pointer, or `RIP`.
  x86-64 has 32- and 16-bit variants of `RIP` (`EIP` and `IP`), but I'm not going to count them as separate registers: they have identical encodings and can't be used in the same CPU mode[3].
- The status register, or `RFLAGS`.
  Just like `RIP`, `RFLAGS` has 32- and 16-bit counterparts (`EFLAGS` and `FLAGS`). Unlike `RIP`, these counterparts can be partially mixed: `PUSHF` and `PUSHFQ` are both valid in long mode, and `LAHF`/`SAHF` can operate on the bits of `FLAGS` on some x86-64 CPUs outside of compatibility mode[4]. So I'm going to go ahead and count them.
Registers in this group: 4.
Running total: 72.
Segment registers
x86-64 has a total of 6 segment registers: `CS`, `SS`, `DS`, `ES`, `FS`, and `GS`. The operation varies with the CPU's mode:
- In all modes except for long mode, each segment register holds a selector, which indexes into either the GDT or LDT. That yields a segment descriptor which, among other things, supplies the base address and extent of the segment.
- In long mode all but `FS` and `GS` are treated as having a base address of zero and a 64-bit extent, effectively producing a flat address space. `FS` and `GS` are retained as special cases, but no longer use the segment descriptor tables: instead, they access base addresses that are stored in the `FSBASE` and `GSBASE` model-specific registers[5]. More on those later.
Registers in this group: 6.
Running total: 78.
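To make the "selector indexes into either the GDT or LDT" part concrete, here is a minimal Python sketch that splits a 16-bit selector into its fields. The layout (a 2-bit requested privilege level, a 1-bit table indicator, and a 13-bit index) is my reading of the vendor manuals, and `decode_selector` and the sample value are hypothetical names and data of my own.

```python
def decode_selector(selector):
    """Split a 16-bit segment selector into index, table, and RPL."""
    return {
        # Bits 15:3 pick the descriptor within the table.
        "index": (selector >> 3) & 0x1FFF,
        # Bit 2 is the table indicator: 0 selects the GDT, 1 the LDT.
        "table": "LDT" if selector & 0x4 else "GDT",
        # Bits 1:0 are the requested privilege level.
        "rpl": selector & 0x3,
    }


# Hypothetical selector value, used only to exercise the function.
print(decode_selector(0x2B))
```

For `0x2B` this yields index 5 in the GDT with a requested privilege level of 3.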
SIMD and FP registers
The x86 family has gone through several generations of SIMD and floating-point instruction groups, each of which has introduced, extended, or re-contextualized various registers:
- x87
- MMX
- SSE (SSE2, SSE3, SSE4, …)
- AVX (AVX2, AVX512)
Let's do them in rough order.
x87
Originally a discrete coprocessor with its own instruction set and register file, the x87 instructions have been regularly baked into x86 cores themselves since the 80486.
Because of its coprocessor history, x87 defines both normal registers[6] (akin to GPRs) and a variety of special registers needed to control the FPU state:
- `ST0` through `ST7`: eight 80-bit floating-point registers
- `FPSW`, `FPCW`, `FPTW`[7]: Control, status, and tag-word registers
- "Data operand pointer": I don't know what this one does, but the Intel SDM specifies it[8]
- Instruction pointer: the x87 state machine apparently holds its own copy of the current x87 instruction
- Last instruction opcode: this is apparently distinct from the x87 opcode, and has its own register
Registers in this group: 14.
Running total: 92.
MMX
MMX was Intel's first attempt at consumer SIMD in their x86 chips, released back in 1997.
For design reasons that are a complete mystery to me, the MMX registers are actually sub-registers of the x87 `STn` registers: each 64-bit `MMn` occupies the mantissa component of its corresponding `STn`. Consequently, x86 (and x86-64) CPUs cannot execute MMX and x87 instructions at the same time.
Edit: This section incorrectly included `MXCSR`, which was actually introduced with SSE. Thanks to /u/Skorezore for pointing out the error.
Registers in this group: 8.
Running total: 100.
SSE and AVX
For simplicity's sake, I'm going to wrap SSE and AVX into a single section: they use the same sub-register pattern as the GPRs and x87/MMX do, so they fit well into a single table:
AVX-512 (512-bit) | AVX-2 (256-bit) | SSE (128-bit) |
---|---|---|
ZMM0 | YMM0 | XMM0 |
ZMM1 | YMM1 | XMM1 |
ZMM2 | YMM2 | XMM2 |
ZMM3 | YMM3 | XMM3 |
ZMM4 | YMM4 | XMM4 |
ZMM5 | YMM5 | XMM5 |
ZMM6 | YMM6 | XMM6 |
ZMM7 | YMM7 | XMM7 |
ZMM8 | YMM8 | XMM8 |
ZMM9 | YMM9 | XMM9 |
ZMM10 | YMM10 | XMM10 |
ZMM11 | YMM11 | XMM11 |
ZMM12 | YMM12 | XMM12 |
ZMM13 | YMM13 | XMM13 |
ZMM14 | YMM14 | XMM14 |
ZMM15 | YMM15 | XMM15 |
ZMM16 | YMM16 | XMM16 |
ZMM17 | YMM17 | XMM17 |
ZMM18 | YMM18 | XMM18 |
ZMM19 | YMM19 | XMM19 |
ZMM20 | YMM20 | XMM20 |
ZMM21 | YMM21 | XMM21 |
ZMM22 | YMM22 | XMM22 |
ZMM23 | YMM23 | XMM23 |
ZMM24 | YMM24 | XMM24 |
ZMM25 | YMM25 | XMM25 |
ZMM26 | YMM26 | XMM26 |
ZMM27 | YMM27 | XMM27 |
ZMM28 | YMM28 | XMM28 |
ZMM29 | YMM29 | XMM29 |
ZMM30 | YMM30 | XMM30 |
ZMM31 | YMM31 | XMM31 |
In other words: the lower half of each `ZMMn` is `YMMn`, and the lower half of each `YMMn` is `XMMn`. There's no direct register access for just the upper half of `YMMn`, nor does `ZMMn` have direct 256- or 128-bit access for the chunks of its upper half.
SSE also defines a new status register, `MXCSR`, that contains flags roughly parallel to the arithmetic flags in `RFLAGS` (along with floating-point flags in the x87 status word). SSE also introduces a load/store instruction pair for manipulating it (`LDMXCSR` and `STMXCSR`).
AVX-512 also introduces 8 opmask registers, `k0` through `k7`. `k0` is a special case that behaves somewhat like the "zero" register on some RISC ISAs: it can't be selected as an instruction's write mask, and the encoding that would name it instead means "no masking," i.e. an implicit bitmask of all ones.
Errata: The table above includes AVX-512, which isn't available on any AMD CPUs as of 2020. I've updated the counts below to only include SSE and AVX2-introduced registers.
Registers in this group: 33.
Running total: 133.
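The aliasing is easy to model by treating a register as a big integer and masking off its low bits. The sketch below is only an illustration in Python: `low_bits` and the register contents are made up, and the final tally is one reading of the errata (16 `YMMn`, 16 `XMMn`, plus `MXCSR`) that reproduces the 33 above.

```python
XMM_BITS, YMM_BITS, ZMM_BITS = 128, 256, 512


def low_bits(value, width):
    """Return the low `width` bits of an integer register model."""
    return value & ((1 << width) - 1)


# Hypothetical 512-bit contents: byte i of the register holds the value i.
zmm0 = int.from_bytes(bytes(range(ZMM_BITS // 8)), "little")
ymm0 = low_bits(zmm0, YMM_BITS)  # YMM0 is the low half of ZMM0
xmm0 = low_bits(ymm0, XMM_BITS)  # XMM0 is the low half of YMM0

# Going through YMM0 or straight from ZMM0 gives the same XMM0.
assert xmm0 == low_bits(zmm0, XMM_BITS)

# Registers counted in this section once AVX-512 is excluded.
counted = [f"YMM{n}" for n in range(16)] + [f"XMM{n}" for n in range(16)]
counted.append("MXCSR")
print(len(counted))
```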
Bounds registers
Intel added these with MPX, which was intended to offer hardware-accelerated bounds checking. Nobody uses it, since it doesn't work very well. But x86 is eternal and slow to fix mistakes, so we'll probably have these registers taking up space for at least a while longer:
- `BND0` through `BND3`: Individual 128-bit registers, each containing a pair of addresses for a bound.
- `BNDCFGS`: Bound configuration, kernel (supervisor) mode.
- `BNDCFGU`: Bound configuration, user mode.
- `BNDSTATUS`: Bound status, after a `#BR` is raised.
Registers in this group: 7.
Running total: 140.
Debug registers
These are what they sound like: registers that aid and accelerate software debuggers, like GDB.
There are 6 debug registers of two types:
- `DR0` through `DR3` contain linear addresses, each of which is associated with a breakpoint condition.
- `DR6` and `DR7` are the debug status and control registers. `DR6`'s lower bits indicate which debug conditions were encountered (upon entering the debug exception handler), while `DR7` controls which breakpoint addresses are enabled and their breakpoint conditions (e.g., when a particular address is written to).
What about `DR4` and `DR5`? For reasons that are unclear to me, they don't exist (and never have)[9]. They do have encodings but are treated as `DR6` and `DR7`, respectively, or produce a `#UD` exception when `CR4.DE[bit 3] = 1`.
Registers in this group: 6.
Running total: 146.
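To illustrate how `DR7` pairs each address register with an enable flag and a condition, here is a rough Python decoder. The bit positions and encodings are my recollection of the Intel SDM's `DR7` layout (local/global enable bits in the low byte, then a 2-bit condition and a 2-bit length per slot starting at bit 16), so treat them as assumptions; `decode_dr7` and the example value are hypothetical.

```python
# DR7 field encodings as I understand them from the Intel SDM (an assumption
# worth checking against the manual before relying on it).
CONDITIONS = {0b00: "execute", 0b01: "write", 0b10: "io", 0b11: "read/write"}
LENGTHS = {0b00: 1, 0b01: 2, 0b10: 8, 0b11: 4}  # breakpoint length in bytes


def decode_dr7(dr7):
    """Return one dict per breakpoint slot (DR0 through DR3)."""
    slots = []
    for n in range(4):
        slots.append({
            "enabled_local": bool((dr7 >> (2 * n)) & 1),
            "enabled_global": bool((dr7 >> (2 * n + 1)) & 1),
            "condition": CONDITIONS[(dr7 >> (16 + 4 * n)) & 0b11],
            "length": LENGTHS[(dr7 >> (18 + 4 * n)) & 0b11],
        })
    return slots


# Hypothetical value: DR0 enabled locally, trapping on 4-byte writes.
example = 1 | (0b01 << 16) | (0b11 << 18)
print(decode_dr7(example)[0])
```

The example corresponds to the "when a particular address is written to" case mentioned above.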
Control registers
x86-64 defines a set of control registers that can be used to manage and inspect the state of the CPU.
There are 16 "main" control registers, all of which can be accessed with a `MOV` variant:
Name | Purpose |
---|---|
CR0 | Basic CPU operation flags |
CR1 | Reserved |
CR2 | Page-fault linear address |
CR3 | Virtual addressing state |
CR4 | Protected mode operation flags |
CR5 | Reserved |
CR6 | Reserved |
CR7 | Reserved |
CR8 | Task priority register (TPR) |
CR9 | Reserved |
CR10 | Reserved |
CR11 | Reserved |
CR12 | Reserved |
CR13 | Reserved |
CR14 | Reserved |
CR15 | Reserved |
All reserved control registers result in a `#UD` when accessed, which makes me inclined to not count them in this post.
In addition to the "main" `CRn` control registers there are also the "extended" control registers, introduced with the `XSAVE` feature set. As of writing, `XCR0` is the only specified extended control register.
The extended control registers use `XGETBV` and `XSETBV` instead of a `MOV` variant.
Registers in this group: 6.
Running total: 152.
"System table pointer registers"
That's what the Intel SDM calls these[8]: these registers hold sizes and pointers to various protected mode tables.
As best I can tell, there are four of them:
- `GDTR`: Holds the size and base address of the GDT
- `LDTR`: Holds the size and base address of the LDT
- `IDTR`: Holds the size and base address of the IDT
- `TR`: Holds the TSS selector and base address for the TSS
The `GDTR`, `LDTR`, and `IDTR` each seem to be 80 bits in 64-bit modes: 16 lower bits for the size of the register's table, and the upper 64 bits for the table's starting address.
`TR` is likewise 80 bits: 16 bits for the selector (which behaves identically to a segment selector), and another 64 for the base address of the TSS[10].
Registers in this group: 4.
Running total: 156.
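As a sanity check on the 80-bit figure, here is a minimal Python sketch that packs a 16-bit size and a 64-bit base into 10 bytes, low bits first, matching the layout described above. The function names and sample values are hypothetical; the sketch only illustrates the layout and doesn't touch real hardware.

```python
import struct

# "<HQ": little-endian, unpadded 16-bit field followed by a 64-bit field.
TABLE_REGISTER = struct.Struct("<HQ")


def pack_table_register(limit, base):
    """Pack a 16-bit size (limit) and a 64-bit base into 10 bytes."""
    return TABLE_REGISTER.pack(limit, base)


def unpack_table_register(raw):
    """Recover (limit, base) from the 10-byte layout."""
    return TABLE_REGISTER.unpack(raw)


# Hypothetical values chosen only to exercise the round trip.
raw = pack_table_register(0x7F, 0xFFFF800000001000)
print(len(raw) * 8, unpack_table_register(raw))
```

The packed form is 10 bytes, i.e. the 80 bits mentioned above.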
Memory-type-range registers
These are an interesting case: unlike all of the other registers I've covered so far, these are not unique to a particular CPU in a multicore chip; instead, they're shared across all cores[11].
The number of MTRRs seems to vary by CPU model, and they have been largely superseded by entries in the page attribute table, which is programmed with an MSR[12].
Registers in this group: >0 (varies by CPU model).
Running total: >156.
Model specific registers
Model-specific registers are where things get fun.
Like extended control registers, they're accessed indirectly (by identifier) through a pair of instructions: `RDMSR` and `WRMSR`. MSRs themselves are 64 bits but originated during the 32-bit era, so `RDMSR` and `WRMSR` read from and write to two 32-bit registers: `EDX` and `EAX`.
By way of example: here's the setup and `RDMSR` invocation for accessing the `IA32_MTRRCAP` MSR, which includes (among other things) the actual number of MTRRs available on the system:
```
MOV ECX, 0xFE ; 0xFE = IA32_MTRRCAP
RDMSR
; The bits of IA32_MTRRCAP are now in EDX:EAX
```
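Once `EDX:EAX` is in hand, turning it back into a 64-bit value and pulling out the MTRR count is a matter of shifts and masks. Here is a hedged Python sketch: the field positions (the variable-range count in bits 7:0, plus fixed-range and write-combining flags in bits 8 and 10) are my recollection of the SDM's description of `IA32_MTRRCAP`, and the function names and register values are invented for illustration.

```python
def combine_edx_eax(edx, eax):
    """Rebuild the 64-bit MSR value: EDX holds the high half, EAX the low."""
    return ((edx & 0xFFFFFFFF) << 32) | (eax & 0xFFFFFFFF)


def decode_mtrrcap(value):
    """Pull a few fields out of IA32_MTRRCAP (layout assumed from the SDM)."""
    return {
        "vcnt": value & 0xFF,              # bits 7:0: variable-range MTRR count
        "fixed": bool((value >> 8) & 1),   # bit 8: fixed-range MTRRs supported
        "wc": bool((value >> 10) & 1),     # bit 10: write-combining supported
    }


# Hypothetical register contents after RDMSR, not read from real hardware.
edx, eax = 0x00000000, 0x00000D0A
print(decode_mtrrcap(combine_edx_eax(edx, eax)))
```

With these made-up values the sketch reports 10 variable-range MTRRs.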
`RDMSR` and `WRMSR` are privileged instructions, so normal ring-3 code can't access MSRs directly[13]. The one (?) exception that I know of is the timestamp counter (`TSC`), which is stored in the `IA32_TSC` MSR but can be read from non-privileged contexts with `RDTSC` and `RDTSCP`.
Two other interesting (but still privileged[14]) cases are `FSBASE` and `GSBASE`, which are stored as `IA32_FS_BASE` and `IA32_GS_BASE`, respectively. As mentioned in the segment register section, these store the `FS` and `GS` segment bases on x86-64 CPUs. This makes them targets of relatively frequent use (by MSR standards), and so they have their own dedicated R/W opcodes:
- `RDFSBASE` and `RDGSBASE` for reading
- `WRFSBASE` and `WRGSBASE` for writing
But back to the meat of things: how many MSRs are there?
Using the standards laid out at the beginning of this post, we're interested in counting what Intel calls "architectural" MSRs. From the SDM[15]:
Many MSRs have carried over from one generation of IA-32 processors to the next and to Intel 64 processors. A subset of MSRs and associated bit fields, which do not change on future processor generations, are now considered architectural MSRs. For historical reasons (beginning with the Pentium 4 processor), these "architectural MSRs" were given the prefix "IA32_".
According to the subsequent table[16], the highest architectural MSR is `6097`/`17D1H`, or `IA32_HW_FEEDBACK_CONFIG`. And so, the naïve answer is over 6000.
However, there are significant gaps in the documented MSR ranges: Intel's documentation jumps directly from `3506`/`DB2H` (`IA32_THREAD_STALL`) to `6096`/`17D0H` (`IA32_HW_FEEDBACK_PTR`). On top of the empty ranges, there are also ranges that are explicitly marked as reserved, either generally or explicitly for later expansion of a particular MSR family.
To count the actual number of MSRs, I did a bit of pipeline ugliness:
- Extract just table 2-2 from Volume 4 of the SDM (link):

  ```
  $ pdfjam 335592-sdm-vol-4.pdf 19-67 -o 2-2.pdf
  ```
- Use `pdftotext` to convert it to plain text and manually trim the next table from the last page:

  ```
  $ pdftotext 2-2.pdf table.txt
  # edit table.txt by hand
  ```
- Split the plain text table into a sequence of words, filter by `IA32_`, remove cruft, and do a standard sort-unique-count:

  ```
  $ tr -s '[:space:]' '\n' < table.txt \
      | grep 'IA32_' \
      | tr -d '.' \
      | sed 's/\[.*$//' \
      | sort | uniq | wc -l
  404
  ```
(Output preserved for posterity here).
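For anyone who would rather not chain five tools together, the last step of that pipeline can be approximated in a few lines of Python. This is a rough equivalent, not what I originally ran: `count_ia32_names` is a hypothetical helper, and the sample text is invented rather than taken from the SDM.

```python
import re


def count_ia32_names(text):
    """Approximate the tr | grep | tr | sed | sort | uniq | wc -l pipeline."""
    names = set()
    for word in text.split():               # tr -s '[:space:]' '\n'
        if "IA32_" not in word:             # grep 'IA32_'
            continue
        word = word.replace(".", "")        # tr -d '.'
        word = re.sub(r"\[.*$", "", word)   # sed 's/\[.*$//'
        names.add(word)                     # sort | uniq
    return len(names)                       # wc -l


# Invented sample text standing in for table.txt.
sample = "IA32_APIC_BASE 1BH IA32_APIC_BASE. IA32_MTRRCAP[7:0] IA32_MTRRCAP Reserved"
print(count_ia32_names(sample))
```

Like the shell version, this keeps any cruft that isn't a period or a bracketed suffix, so the result still needs a manual look.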
That pipeline left a bit of cruft towards the end thanks to quoted variants, and so I count the actual number at 400 architectural MSRs. That's a lot more reasonable than 6096!
Registers in this group: 400
Running total: >556.
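For the record, the running total is just the sum of the per-section counts. The Python tally below uses the numbers from each section of this post; the dictionary name and labels are mine, and the MTRRs are left out because their number varies by CPU model, which is where the ">" comes from.

```python
# Per-section counts, copied from the "Registers in this group" lines.
GROUPS = {
    "General-purpose": 68,
    "Special": 4,
    "Segment": 6,
    "x87": 14,
    "MMX": 8,
    "SSE and AVX": 33,
    "Bounds": 7,
    "Debug": 6,
    "Control": 6,
    "System table pointer": 4,
    "Architectural MSRs": 400,
}

# MTRRs are excluded: the count depends on the CPU model.
total = sum(GROUPS.values())
print(total)
```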
Other bits and wrapup
The footnotes at the bottom of this post cover most of my notes, but I also wanted to dump some other resources that I found useful while discovering registers:
- sandpile.org has a nice visualization of many of the architectural MSRs, including field breakdowns.
- Vol. 3A § 8.7.1 ("State of the Logical Processors") of the Intel SDM has a useful list of almost all of the registers that are either unique to or shared between x86-64 cores.
- The OSDev Wiki has a collection of helpful pages on various x86-64 registers, including a great page on the behavior of the segment base MSRs.
All told, I think that there are roughly 557 registers on the average (relatively recent) x86-64 CPU core. With that being said, I have some peripheral cases that I'm not sure about:
- Modern Intel CPUs use integrated APICs as part of their SMT implementation. These APICs have their own register banks which can be memory-mapped for reading and potential modification by an x86 core. I didn't count them because (1) they're memory mapped, and thus behave more like mapped registers from an arbitrary piece of hardware than CPU registers, and (2) I'm not sure whether AMD uses the same mechanism/implementation.
- The Intel SDM implies that Last Branch Records are stored in discrete, non-MSR registers. AMD's developer manual, on the other hand, specifies a range of MSRs. As such, I didn't attempt to count these separately.
- Both Intel and AMD have their own (and incompatible) virtualization extensions, as well as their own enclave/hardened execution extensions. My intuition is that each introduces some additional registers (or maybe just MSRs), but their vendor-specificity made me inclined to not look too deeply.
Information on these (and any other) registers would be deeply appreciated.
Discussions: Reddit
Source: https://blog.yossarian.net/2020/11/30/How-many-registers-does-an-x86-64-cpu-have