AMD-K6™ Processor

Multimedia Technology

Introduction

Next generation PC performance requirements are being

driven by emerging multimedia and communications software.

3D graphics, video, audio, and telephony capabilities are

evolving across education, entertainment, and internet

applications. As multimedia applications continue to

proliferate in the marketplace, PC systems suppliers are being

challenged to deliver multimedia-enabled PC solutions

covering all mainstream price/performance points.

In response to the growing need to provide improved PC

multimedia capabilities, the AMD-K6™ MMX™ enhanced

processor is the first member in the AMD family of processors

to incorporate a robust multimedia technology that is fully

software compatible with the MMX™ technology as defined by

Intel. This multimedia technology enables scalable

multimedia capabilities across a broad range of PC system

price/performance points.

The AMD-K6 processor features a decode-decoupled

superscalar microarchitecture and state-of-the-art design

techniques to deliver true sixth-generation performance while

maintaining full x86 binary software compatibility. An x86
binary-compatible processor implements the industry-standard
x86 instruction set by decoding and executing the x86
instruction set as its native mode of operation. Only this native
mode enables delivery of maximum performance when running
PC software.

AMD-K6™ MMX™ Enhanced Processor Multimedia Technology 20726D/0—January 2000

Preliminary Information

The AMD-K6 processor delivers leading-edge performance to

mainstream PC systems running industry-standard x86

software. The AMD-K6 processor implements advanced design

techniques like instruction pre-decoding, dual x86 opcode

decoding, single-cycle internal RISC operations, parallel

execution units, out-of-order execution, data forwarding,

register renaming, and dynamic branch prediction. In other

words, the AMD-K6 is capable of issuing, executing, and

retiring multiple x86 instructions per cycle, resulting in

superior scalable performance.

This document describes the multimedia technology of the

AMD-K6 processor, including data types, instructions, and

programming considerations.

Multimedia Technology Architecture

The multimedia technology in the AMD-K6 MMX enhanced

processor is designed to accelerate media and communication

applications. Specialized applications that use music synthesis,

speech synthesis, speech recognition, audio and video

compression and decompression, full motion video, 2D and 3D

graphics, and video conferencing, can take advantage of the

AMD-K6 processor multimedia technology. The multimedia

technology implements new instructions, new data types, and

powerful parallel processing (Single Instruction Multiple Data,

SIMD) techniques that can significantly increase the

performance of these applications.

Key Functionality

At the lowest levels, multimedia applications (audio, video, 3D

graphics, telephony, and so on) contain many similar functions.

When these functions are performed on a processor that does

not have MMX capability, the processor is heavily burdened by

the computational requirements of this information. Processors

executing the MMX instructions increase the performance of

multimedia applications. This performance increase is a direct

result of the increased multimedia bandwidth of the processor.

Multimedia applications must process large amounts of data.

Parallel data computing is exemplified by applications that

manipulate screen pixel information. Instead of acting on one

pixel at a time, multimedia technology enables the system to

act on multiple pixels simultaneously. This Single Instruction

Multiple Data (SIMD) model is a key feature of MMX

technology.

The AMD-K6 processor multimedia technology architecture

includes four new MMX data types, 57 new MMX instructions,

eight new 64-bit MMX registers, and an SIMD processing

pipeline. The multimedia technology is compatible with

existing x86 applications.

The 57 new MMX instructions include arithmetic functions,

packing and unpacking functions, logical operations, and

moves. These are the basic functions that are most commonly

used in repetitive computational multimedia programs.

Multimedia applications often use smaller operands—8-bit data

is commonly used for pixel information and 16-bit data is used

for audio samples. The new MMX registers allow data to be

packed into 64-bit operands. For example, 8-bit data (1 byte)

can be packed in sets of eight in a single 64-bit register, and all

eight bytes can be operated on simultaneously by a single MMX

instruction.

For 256-color video modes, this translates to computing eight

pixels per instruction. When an entire screen is being re-drawn,

these pixel manipulation routines often use highly repetitive

loops. Parallel processing of eight pieces of data can reduce the

processing time of a code loop by up to a factor of eight.
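
The lane-by-lane behavior described above can be modeled in portable C. This is only a sketch of the semantics, not MMX code: the function name `packed_add_bytes` is hypothetical, and the loop runs serially where the processor operates on all eight bytes at once.

```c
#include <stdint.h>

/* Hypothetical model (not an AMD-supplied routine): treats each 64-bit
 * operand as eight independent 8-bit lanes and adds them lane by lane
 * with wraparound. A carry out of one lane never reaches the next. */
uint64_t packed_add_bytes(uint64_t a, uint64_t b)
{
    uint64_t result = 0;
    for (int lane = 0; lane < 8; lane++) {
        uint8_t x = (uint8_t)(a >> (8 * lane));
        uint8_t y = (uint8_t)(b >> (8 * lane));
        uint8_t sum = (uint8_t)(x + y);   /* wraps modulo 256 */
        result |= (uint64_t)sum << (8 * lane);
    }
    return result;
}
```

Adding 0FFh and 01h in the lowest lane gives 00h there and leaves the neighboring lane untouched, which is what distinguishes a packed add from an ordinary 64-bit integer add.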

Multimedia applications frequently multiply and accumulate

data. The multimedia technology provides instructions that

add, multiply, and even combine these operations. For example,

the PMADDWD instruction can multiply and then add words of

data in a single instruction that uses far fewer processor cycles

than the equivalent x86 operations.
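
As a rough illustration of the multiply-and-add behavior, the following C sketch models PMADDWD on two 64-bit operands: the four signed 16-bit words are multiplied pairwise and adjacent products are summed into two 32-bit results. The names `signed_word` and `pmaddwd_model` are hypothetical, and the code models the result only, not the instruction timing.

```c
#include <stdint.h>

/* Reads 16-bit lane `lane` (0 = least significant) as a signed value. */
static int32_t signed_word(uint64_t v, int lane)
{
    int32_t w = (int32_t)((v >> (16 * lane)) & 0xFFFFu);
    return (w >= 0x8000) ? w - 0x10000 : w;
}

/* Hypothetical model of PMADDWD: result doubleword 0 = a0*b0 + a1*b1,
 * result doubleword 1 = a2*b2 + a3*b3, where aN/bN are signed words. */
uint64_t pmaddwd_model(uint64_t a, uint64_t b)
{
    uint32_t sums[2];
    for (int pair = 0; pair < 2; pair++) {
        int32_t p0 = signed_word(a, 2 * pair)     * signed_word(b, 2 * pair);
        int32_t p1 = signed_word(a, 2 * pair + 1) * signed_word(b, 2 * pair + 1);
        /* Unsigned addition wraps instead of overflowing a signed int. */
        sums[pair] = (uint32_t)p0 + (uint32_t)p1;
    }
    return ((uint64_t)sums[1] << 32) | sums[0];
}
```

With words (1, 2, 3, 4) and (5, 6, 7, 8), lowest lane first, the two results are 1*5 + 2*6 = 17 and 3*7 + 4*8 = 53.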

Executing MMX™ Instructions

A programmer must approach the use of MMX instructions

differently, based on whether the code being developed is at

the system level or at the application level. The details of these

differences are discussed in “Programming Considerations.”

Before using the MMX instructions, the programmer must use

the CPUID instruction to determine if the processor supports

multimedia technology. See the AMD Processor Recognition

Application Note, order# 20734, for more information.

Function 1 (EAX=1) of the AMD-K6 processor CPUID

instruction returns the processor feature bits in the EDX

register. Software can then test bit 23 of the feature bits to

determine if the processor supports the multimedia technology.

If bit 23 is set to 1, MMX instructions are supported. All

AMD-K6 processors have bit 23 set. Once it is determined that

multimedia technology is supported, subsequent code can use

the MMX instructions. Alternatively, the AMD 8000_0001h

extended CPUID function can be used to test whether the

processor supports multimedia technology.
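
The bit test itself is simple enough to sketch in C. The function below assumes the caller has already executed CPUID function 1 and passes in the EDX value; the names `CPUID_FEATURE_MMX_BIT` and `edx_reports_mmx` are hypothetical, and obtaining EDX is left to the assembly sequence or a compiler-specific helper.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit position of the MMX feature flag in EDX, as stated in the text. */
#define CPUID_FEATURE_MMX_BIT 23u

/* Hypothetical helper: returns true when bit 23 of the feature bits
 * (EDX after CPUID function 1) is set to 1. */
bool edx_reports_mmx(uint32_t edx_features)
{
    return ((edx_features >> CPUID_FEATURE_MMX_BIT) & 1u) != 0;
}
```

Bit 23 corresponds to the mask 00800000h, so a feature word of 00800000h reports support and 007FFFFFh does not.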

After a module of MMX code has executed, the programmer
must empty the MMX state by executing the EMMS instruction.
Because the MMX registers are aliased onto the floating-point
registers, EMMS is needed to keep MMX code from interfering
with subsequent floating-point code. The EMMS instruction
clears the multimedia state by setting all the floating-point tag
bits to empty (all ones), which marks the MMX/FP registers as
empty and available.

Register Set

The AMD-K6 processor implements eight new 64-bit MMX

registers. These registers are mapped on the floating-point

registers. As shown in Figure 1, the new MMX

instructions refer to these registers as mmreg0 to mmreg7.

Mapping the new MMX registers on the floating-point stack

enables backwards compatibility for the register saving that

must occur as a result of task switching.

Figure 1. MMX™ Registers

Aliasing the MMX registers onto the floating-point stack

registers provides a safe way to introduce this new technology.

Instead of needing to modify operating systems, new MMX

applications can be supported through device drivers, MMX

libraries, or DLL files. See the Programming Considerations

section of this document for more information.

Current operating systems have support for floating-point

operations. Using the floating-point registers for MMX code is

an ingenious way of implementing automatic support for MMX

instructions. Every time the processor executes an MMX

instruction, all the floating-point register tag bits are set to zero

(00b=valid). Setting the tag bits after every MMX instruction

prevents the processor from having to perform extra tasks.

These extra tasks are normally executed on floating-point

registers when the Tag field is something other than 00b.

If a task switch occurs during an MMX or floating-point

instruction, the Control Register (CR0) Task Switch (TS) bit is

set to 1. The processor then generates an interrupt 7 (int 7

Device Not Available) when it encounters the next

floating-point or MMX instruction, allowing the operating

system to save the state of the MMX/FP registers.

[Figure 1 layout: eight registers, mmreg0 through mmreg7, each 64 bits wide (bits 63 to 0), each with a 2-bit tag field shown as xx.]

If there is a task switch when MMX applications are running

with older applications that do not include MMX instructions,

the MMX/FP register state is still saved automatically through

the int 7 handler.

Data Types

The AMD-K6 processor multimedia technology uses a packed
data format. The data is packed in a single 64-bit MMX register
or memory operand as eight bytes, four words, two doublewords,
or one quadword. Each byte, word, doubleword, or quadword is
an integer data type.

The form of an instruction determines the data type. For

example, the MOV instruction comes in two different forms—

MOVD moves 32 bits of data and MOVQ moves 64 bits of data.

The four new data types are defined as follows:

Packed byte         Eight 8-bit bytes packed into 64 bits
                    Signed integer range (-2^7 to 2^7-1)
                    Unsigned integer range (0 to 2^8-1)

Packed word         Four 16-bit words packed into 64 bits
                    Signed integer range (-2^15 to 2^15-1)
                    Unsigned integer range (0 to 2^16-1)

Packed doubleword   Two 32-bit doublewords packed into 64 bits
                    Signed integer range (-2^31 to 2^31-1)
                    Unsigned integer range (0 to 2^32-1)

Quadword            One 64-bit quadword
                    Signed integer range (-2^63 to 2^63-1)
                    Unsigned integer range (0 to 2^64-1)

Figure 2 shows the four new data types.
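
To make the packed word format concrete, the following C sketch packs four 16-bit words into one 64-bit quadword and reads a lane back as either an unsigned or a signed value. All function names are hypothetical, and lane 0 is assumed to be the least significant word.

```c
#include <assert.h>
#include <stdint.h>

/* Packs four words into a quadword; w0 lands in bits 15 to 0. */
uint64_t pack_words(uint16_t w0, uint16_t w1, uint16_t w2, uint16_t w3)
{
    return (uint64_t)w0 | ((uint64_t)w1 << 16) |
           ((uint64_t)w2 << 32) | ((uint64_t)w3 << 48);
}

/* Reads one lane as an unsigned word (0 to 2^16-1). */
uint16_t word_unsigned(uint64_t q, int lane)
{
    assert(lane >= 0 && lane < 4);
    return (uint16_t)(q >> (16 * lane));
}

/* Reads the same bits as a signed word (-2^15 to 2^15-1). */
int32_t word_signed(uint64_t q, int lane)
{
    int32_t w = word_unsigned(q, lane);
    return (w >= 0x8000) ? w - 0x10000 : w;
}
```

The same 16 bits, 8000h, read as 32768 when treated as unsigned and as -32768 when treated as signed. The instruction form, not the data, selects the interpretation.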

Figure 2. MMX™ Data Types

Instructions

The AMD-K6 processor multimedia technology includes 57 new

MMX instructions. These new instructions are organized into

the following groups:

- Arithmetic
- Empty MMX registers
- Compare
- Convert (pack/unpack)
- Logical
- Move
- Shift

The following mnemonics are used in the instructions:

- P: Packed data
- B: Byte
- W: Word
- D: Doubleword
- Q: Quadword
- S: Signed
- U: Unsigned
- SS: Signed Saturation
- US: Unsigned Saturation

[Figure 2 layout, bits 63 to 0: packed bytes B7 to B0 (8 bits x 8); packed words W3 to W0 (16 bits x 4); packed doublewords D1 and D0 (32 bits x 2); quadword Q0 (64 bits x 1).]

For example, the mnemonic for the PACK instruction that packs
signed words (four from each 64-bit operand) into eight unsigned
bytes is PACKUSWB. In this mnemonic, the US designates an
unsigned result with saturation, and the WB means that the
source is packed words and the result is packed bytes.

The term saturation is commonly used in multimedia

applications. Saturation allows mathematical limits to be

placed on the data elements. If a result exceeds the boundary of

that data type, the result is set to the defined limit for that

instruction. A common use of saturation is to prevent color

wraparound.
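
The difference between wraparound and saturation can be sketched in C on a single lane. The helper names below are hypothetical, and each function models one byte lane only, whereas an MMX instruction applies the same rule to every lane of the operand at once.

```c
#include <stdint.h>

/* Wraparound add: the result is taken modulo 256. */
uint8_t add_wraparound_byte(uint8_t a, uint8_t b)
{
    return (uint8_t)(a + b);
}

/* Unsigned saturating add: results above 255 are clamped to 255. */
uint8_t add_unsigned_saturate_byte(uint8_t a, uint8_t b)
{
    unsigned sum = (unsigned)a + b;
    return (uint8_t)(sum > 255u ? 255u : sum);
}

/* Signed word to unsigned byte with saturation, the per-lane rule
 * implied by the US and WB parts of the PACKUSWB mnemonic. */
uint8_t saturate_word_to_unsigned_byte(int16_t w)
{
    if (w < 0)   return 0;
    if (w > 255) return 255;
    return (uint8_t)w;
}
```

For a pixel intensity of 200 brightened by 100, the wraparound add yields 44 (a dark pixel), while the saturating add yields 255 (full intensity). This is the color wraparound that saturation prevents.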

Instruction Formats

All MMX instructions, except the EMMS instruction that uses

no operands, are formatted as follows:

INSTRUCTION mmreg1, mmreg2/mem64

The source operand (mmreg2/mem64) can be either an MMX

register or a memory location. The destination operand

(mmreg1) can only be an MMX register.

The MOVD and MOVQ instructions also have the following

acceptable formats:

MOVD mmreg1, mreg32/mem32

MOVD mreg32/mem32, mmreg1

MOVQ mem64, mmreg1

In the first example, the source operand (mreg32/mem32) can

be either an integer register or a 32-bit memory address. The

destination operand (mmreg1) can only be an MMX register.

The second example has the source operand as an MMX

register. The destination operand (mreg32/mem32) can be

either an integer register or a 32-bit memory address. The third

example has the source operand as an MMX register and the

destination operand as a 64-bit memory location.

The SHIFT instructions can also utilize an immediate source

operand. It is designated as imm8.

PSRLW mmreg1, imm8
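
As an illustration of what the immediate form does, the following C sketch models PSRLW (packed shift right logical, word): each of the four 16-bit words is shifted right independently and zero-filled. The name `psrlw_model` is hypothetical. The treatment of counts above 15 (all words become zero) follows the general x86 definition of the instruction rather than anything stated in this section.

```c
#include <stdint.h>

/* Hypothetical model of PSRLW mmreg1, imm8: a logical right shift
 * applied to each 16-bit lane separately. Bits shifted out of one
 * word are discarded and never enter the neighboring word. */
uint64_t psrlw_model(uint64_t value, unsigned count)
{
    if (count > 15)
        return 0;   /* every bit of every word is shifted out */

    uint64_t result = 0;
    for (int lane = 0; lane < 4; lane++) {
        uint16_t w = (uint16_t)(value >> (16 * lane));
        result |= (uint64_t)(uint16_t)(w >> count) << (16 * lane);
    }
    return result;
}
```

Shifting 0001000100010001h right by one gives zero in every word. An ordinary 64-bit shift of the same value would instead carry each low bit into the word below it.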

Programming Considerations

This chapter describes considerations for programmers writing
operating systems, compilers, and applications that utilize
MMX instructions as implemented in the AMD-K6 MMX
enhanced processor.

Feature Detection

To use the AMD-K6 processor multimedia technology, the

programmer must first determine whether the processor supports it.

The CPUID instruction gives programmers the ability to

determine the presence of multimedia technology on the

processor. Software must first test to see if the CPUID

instruction is supported. For a detailed description of the

CPUID instruction, see the AMD Processor Recognition

Application Note, order# 20734.

The presence of the CPUID instruction is indicated by the ID

bit (21) in the EFLAGS register. If this bit is writable, the

CPUID instruction is supported. The following code sample

shows how to test for the presence of the CPUID instruction.

    pushfd                  ; save EFLAGS
    pop   eax               ; store EFLAGS in EAX
    mov   ebx, eax          ; save in EBX for later testing
    xor   eax, 00200000h    ; toggle bit 21
    push  eax               ; put to stack
    popfd                   ; save changed EAX to EFLAGS
    pushfd                  ; push EFLAGS to TOS
    pop   eax               ; store EFLAGS in EAX
    cmp   eax, ebx          ; see if bit 21 has changed
    jz    NO_CPUID          ; if no change, no CPUID

If the processor supports the CPUID instruction, the
programmer must execute standard function 0 (EAX=0). This
function returns a 12-character string that identifies the
processor’s vendor. For AMD processors, standard function 0
returns the vendor string “AuthenticAMD”. This string tells the
software to follow the AMD definitions for subsequent CPUID
functions and the values returned for those functions.
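
A minimal C sketch of the vendor check follows. It assumes the register values returned by CPUID function 0 have already been captured, and it relies on the general x86 convention (not spelled out in this section) that the 12 characters are read from EBX, then EDX, then ECX, least significant byte first. The function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Builds the 12-character vendor string from the three registers
 * returned by CPUID function 0. `out` must hold at least 13 bytes. */
void cpuid_vendor_string(uint32_t ebx, uint32_t edx, uint32_t ecx, char out[13])
{
    const uint32_t regs[3] = { ebx, edx, ecx };   /* assumed register order */
    for (int r = 0; r < 3; r++) {
        for (int b = 0; b < 4; b++) {
            out[4 * r + b] = (char)((regs[r] >> (8 * b)) & 0xFFu);
        }
    }
    out[12] = '\0';
}

/* Returns 1 when the registers spell the AMD vendor string. */
int is_amd_vendor(uint32_t ebx, uint32_t edx, uint32_t ecx)
{
    char vendor[13];
    cpuid_vendor_string(ebx, edx, ecx, vendor);
    return strcmp(vendor, "AuthenticAMD") == 0;
}
```

The register values 68747541h, 69746E65h, and 444D4163h decode to "Auth", "enti", and "cAMD" respectively.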

The next step is for the programmer to determine if MMX

instructions are supported. Function 1 of the CPUID

instruction provides this information. Function 1 (EAX=1) of

the AMD CPUID instruction returns the feature bits in the EDX

register. If bit 23 in the EDX register is set to 1, MMX

instructions are supported. The following code sample shows

how to test for MMX instruction support.

    mov   eax, 1            ; set up function 1
    CPUID                   ; call the function
    test  edx, 800000h      ; test bit 23
    jnz   YES_MM            ; multimedia technology supported

Alternatively, the extended function 1 (EAX=8000_0001h) can

be used to determine if MMX instructions are supported.

    mov   eax, 80000001h    ; set up extended function 1
    CPUID                   ; call the function
    test  edx, 800000h      ; test bit 23
    jnz   YES_MM            ; multimedia technology supported

Task Switching

A task switch is an event within an operating system that
allows multiple programs to be executed in parallel. Most
modern operating systems use task switching and are called
multitasking operating systems.

There are two types of multitasking operating systems—

cooperative and preemptive.

Cooperative Multitasking

In cooperative multitasking operating systems, applications do

not care about other tasks that may be running. Each task

assumes that it owns the machine state (processor, registers, I/O,

memory, etc.). In addition, these tasks must take care of saving

their own information (i.e., registers, stacks, states) in their own

memory areas. The cooperative multitasking operating system

does not save operating state information for the applications.

There are different types of cooperative multitasking operating

systems. Some of these operating systems perform some level of

state saves, but this state saving is not always reliable. All

software engineers programming for a cooperative multitasking

environment must save the MMX or floating-point states before

relinquishing control to another task or to the operating

system. The FSAVE and FRSTOR instructions are used to

perform this task. Figure 3 illustrates this task switching

process.

Note: Some cooperative operating systems may have API calls to

perform these tasks for the application.

Figure 3. Cooperative Task Switching

[Figure 3 flow: Task 1 executes MMX™/FP code, then the program must save states (FSAVE) before the task switch to Task 2. In Task 2 the program must restore states (FRSTOR), the code executes, and when the code module is finished the program must save states (FSAVE) and go to Task 1. Back in Task 1 the program must restore states (FRSTOR) and continue executing code.]

Preemptive Multitasking

In preemptive multitasking operating systems like OS/2,

Windows NT™, and UNIX, the operating system handles all

state and register saves. The application programmer does not

need to save states when programming within a preemptive

multitasking environment. The preemptive multitasking

operating system sets aside a save area for each task.

In a preemptive multitasking operating system, if a task switch

occurs, the operating system sets the Control Register 0 (CR0)

Task Switch (TS) bit to 1. If the new task encounters a

floating-point or MMX instruction, an interrupt 7 (int 7, Device

Not Available) is generated. The int 7 handler saves the state of

the first task and restores the state of the second task. The int 7

handler sets CR0.TS to 0 and returns to the original

floating-point or MMX instruction in the second task. Figure 4

illustrates this task switching process.
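
The sequence just described can be sketched as a small C simulation. Everything here is a hypothetical model: the `Machine` structure, the limit of four tasks, and the function names are illustrative assumptions, and a real operating system would use FSAVE and FRSTOR inside its int 7 handler rather than structure copies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MAX_TASKS 4   /* hypothetical limit for this sketch */

typedef struct { uint64_t mm[8]; } MmxFpState;   /* stand-in for the MMX/FP registers */

typedef struct {
    MmxFpState regs;              /* live register contents */
    MmxFpState saved[MAX_TASKS];  /* per-task save areas kept by the OS */
    int  current_task;            /* task that is running now */
    int  owner_task;              /* task whose state is live in regs */
    bool cr0_ts;                  /* models the CR0.TS bit */
    int  saves_performed;         /* counts state swaps done by the handler */
} Machine;

void machine_init(Machine *m)
{
    memset(m, 0, sizeof *m);
}

/* The operating system switches tasks and sets TS. No state is saved yet. */
void task_switch(Machine *m, int next_task)
{
    assert(next_task >= 0 && next_task < MAX_TASKS);
    m->current_task = next_task;
    m->cr0_ts = true;
}

/* Models the int 7 (Device Not Available) handler. */
static void int7_handler(Machine *m)
{
    if (m->owner_task != m->current_task) {
        m->saved[m->owner_task] = m->regs;      /* save the first task's state */
        m->regs = m->saved[m->current_task];    /* restore the second task's state */
        m->owner_task = m->current_task;
        m->saves_performed++;
    }
    m->cr0_ts = false;   /* clear TS, then the faulting instruction is retried */
}

/* Any MMX or floating-point instruction traps first when TS is set. */
void exec_mmx_write(Machine *m, int reg, uint64_t value)
{
    assert(reg >= 0 && reg < 8);
    if (m->cr0_ts) int7_handler(m);
    m->regs.mm[reg] = value;
}

uint64_t exec_mmx_read(Machine *m, int reg)
{
    assert(reg >= 0 && reg < 8);
    if (m->cr0_ts) int7_handler(m);
    return m->regs.mm[reg];
}
```

The point of the design is that the save and restore are deferred. A task that never touches the MMX/FP registers after a switch never triggers the handler, so no state is copied on its behalf.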

Figure 4. Preemptive Task Switching

[Figure 4 flow: Task 1 executes MMX™/FP code. On the task switch to Task 2, CR0.TS is set to 1. Task 2 executes code until it encounters MMX/FP code; because TS=1, control goes to the INT 7 handler. The INT 7 handler saves the Task 1 state, restores the Task 2 state, sets CR0.TS=0, and returns to the Task 2 MMX/FP code.]

Exceptions

Table 1 contains a list of exceptions that MMX instructions can

generate.

The rules for exceptions have not changed in the

implementation of MMX instructions. None of the exception

handlers need to be modified.

Note:

1. An invalid opcode exception interrupt 6 occurs if an MMX

instruction is executed on a processor that does not

support MMX instructions.

2. If a floating-point exception is pending and the processor

encounters an MMX instruction, FERR# is asserted and, if

CR0.NE = 1, an interrupt 16 is generated.

Table 1. MMX™ Instruction Exceptions

Exception                              Real  Virtual 8086  Protected  Description
Invalid opcode (6)                     X     X             X          The emulate MMX instruction bit (EM) of the control register (CR0) is set to 1.
Device not available (7)               X     X             X          Save the floating-point or MMX state if the task switch bit (TS) of the control register (CR0) is set to 1.
Stack exception (12)                   X     X             X          During instruction execution, the stack segment limit was exceeded.
General protection (13)                -     -             X          During instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
Segment overrun (13)                   X     X             -          One of the instruction data operands falls outside the address range 00000h to 0FFFFh.
Page fault (14)                        -     X             X          A page fault resulted from the execution of the instruction.
Floating-point exception pending (16)  X     X             X          An exception is pending due to the floating-point execution unit.
Alignment check (17)                   -     X             X          An unaligned memory reference resulted from the instruction execution, and the alignment mask bit (AM) of the control register (CR0) is set to 1. (In Protected Mode, CPL = 3.)

Mixing MMX™ and Floating-Point Instructions

The programmer must take care when writing code that

contains both MMX and floating-point instructions. The MMX

code modules should be separated from the floating-point code

modules. All code of one type (MMX or floating-point code)

should be grouped together as often as possible. To obtain the

highest performance, routines should not contain any

conditional branches at the end of loops that jump to code of a

different type than the code that is currently being executed.

In certain multimedia environments, floating-point and MMX

instructions may be mixed. For example, if a programmer wants

to change the viewing perspective of a three-dimensional scene,

the perspective can be changed through transformation

matrices using floating-point registers. The picture/pixel

information is integer-based and requires MMX instructions to

manipulate this information. Both MMX and floating-point

instructions are required to perform this task.

The software must clean up after itself at the end of an MMX

code module. The EMMS instruction must be used at the end of

an MMX code module to mark all floating-point registers as

empty (11=empty/invalid). In cooperative multitasking

operating systems, the EMMS instruction must be used when

switching between tasks.

Note: In some situations, experienced programmers can utilize the

MMX registers to pass information between tasks. In these

situations, the EMMS instruction is not required.

The tag bits are affected by every MMX and floating-point

instruction. After every MMX instruction except EMMS, all the

tag bits in the floating-point tag word are set to 0. When the

EMMS instruction is executed, all the tag bits in the tag word

are set to 1.
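
The effect on the tag word can be written down as a small C model. It assumes the standard x87 layout of eight 2-bit tags in a 16-bit tag word, with 00b meaning valid and 11b meaning empty as stated above. The constant and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define TAG_WORD_ALL_VALID 0x0000u   /* every 2-bit tag is 00b */
#define TAG_WORD_ALL_EMPTY 0xFFFFu   /* every 2-bit tag is 11b */

/* Tag word left behind by any MMX instruction other than EMMS. */
uint16_t tag_word_after_mmx_instruction(void)
{
    return TAG_WORD_ALL_VALID;
}

/* Tag word left behind by EMMS. */
uint16_t tag_word_after_emms(void)
{
    return TAG_WORD_ALL_EMPTY;
}

/* Extracts the 2-bit tag for register 0 to 7 from a tag word. */
unsigned tag_for_register(uint16_t tag_word, int reg)
{
    assert(reg >= 0 && reg < 8);
    return (tag_word >> (2 * reg)) & 0x3u;
}
```

Floating-point code that follows an MMX module without an intervening EMMS therefore sees all eight registers tagged as valid, which is why the cleanup step is required.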

Prefixes

All instructions in the x86 architecture translate to a binary

value or opcode. This 1 or 2 byte opcode value is different for

each instruction. If an instruction is two bytes long, the second

byte is called the Mod R/M byte. The Mod R/M byte is used to

further describe the type of instruction that is used.
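
For readers unfamiliar with the Mod R/M byte, the C sketch below splits it into its three fields. The field layout (mod in bits 7 to 6, reg in bits 5 to 3, r/m in bits 2 to 0) comes from the general x86 instruction encoding rather than from this section, and the `ModRM` type and `decode_modrm` function are hypothetical names.

```c
#include <stdint.h>

typedef struct {
    unsigned mod;   /* bits 7 to 6: addressing form (3 = register operand) */
    unsigned reg;   /* bits 5 to 3: register number or opcode extension */
    unsigned rm;    /* bits 2 to 0: register number or memory base */
} ModRM;

/* Splits a Mod R/M byte into its three fields. */
ModRM decode_modrm(uint8_t byte)
{
    ModRM f;
    f.mod = (byte >> 6) & 0x3u;
    f.reg = (byte >> 3) & 0x7u;
    f.rm  = byte & 0x7u;
    return f;
}
```

As an example, the byte C1h decodes to mod = 3, reg = 0, r/m = 1, a register-to-register form that names register 0 and register 1.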
