Talk:Assembly language/Archive 3

This is an archive of past discussions about Assembly language. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.

User asm

Hi, I've suggested renaming seven User asm categories to User ASM. The lower case names would conflict with the Ethnologue/IANA/ISO asm language code as used in the {{#babel:…|asm-?|…}} magic. The existing {{User Assembly Language}} templates -0…-5 and -N won't be affected. –Be..anyone (talk) 05:48, 15 February 2014 (UTC)

ERRATA sheet, what?

I consider myself an assembler expert, and I've never come across an errata sheet, much less one that is treated by a linker. The suggestion is that one-pass assemblers were the norm in primitive systems, while in fact two-pass assemblers were, and passing a long source on paper tape was done twice on e.g. Intel's first development system for the 8080. Later came the ISIS system featuring dual 8-inch floppies, and undoubtedly the assembler was two-pass, but who is to prove it? 80.100.243.19 (talk) 04:21, 19 February 2014 (UTC)

variable instruction length

The authors are muddying the waters by considering early on in the discussion of the number of passes the possibility that the assembler decides whether a jump could be long or short. It is quite common that the assembler forces or allows the programmer to specify the length. Under that assumption the one versus two pass can be discussed clearly. It must also be pointed out that in that case more than two passes are never applicable.

Address fix-ups and exploitation of short jumps are not the only reasons for multiple passes, especially in the early days. Shmuel (Seymour J.) Metz Username:Chatul (talk) 20:10, 26 February 2014 (UTC)
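For readers following the one-pass versus two-pass argument, here is a minimal sketch of why a second pass helps with forward references. Everything in it is an assumption made for illustration: the three-instruction toy machine, the opcode values, and the fixed two-byte instruction size (so the long/short jump question raised above does not arise).

```python
# Toy two-pass assembler. The instruction set (LOAD/JMP/HALT), the opcode
# values and the fixed 2-byte instruction size are invented for illustration.
OPCODES = {"LOAD": 0x01, "JMP": 0x02, "HALT": 0xFF}
INSTRUCTION_SIZE = 2  # one opcode byte plus one operand byte


def parse(source):
    """Yield (label, mnemonic, operand) for each non-empty line."""
    for raw in source.splitlines():
        line = raw.split(";")[0].strip()  # drop comments
        if not line:
            continue
        label = None
        if ":" in line:
            label, line = (part.strip() for part in line.split(":", 1))
        fields = line.split()
        mnemonic = fields[0] if fields else None
        operand = fields[1] if len(fields) > 1 else None
        yield label, mnemonic, operand


def assemble_two_pass(source):
    statements = list(parse(source))

    # Pass 1: assign an address to every label; emit nothing.
    symbols = {}
    address = 0
    for label, mnemonic, _ in statements:
        if label is not None:
            symbols[label] = address
        if mnemonic is not None:
            address += INSTRUCTION_SIZE

    # Pass 2: every symbol is now known, so forward references resolve.
    code = bytearray()
    for _, mnemonic, operand in statements:
        if mnemonic is None:
            continue
        if operand is None:
            value = 0
        elif operand in symbols:
            value = symbols[operand]
        else:
            value = int(operand, 0)
        code += bytes([OPCODES[mnemonic], value])
    return bytes(code)


PROGRAM = """
start:  LOAD 7
        JMP  done     ; forward reference: 'done' is defined later
        LOAD 9
done:   HALT
"""
```

Calling `assemble_two_pass(PROGRAM)` resolves `done` to address 6 even though the label appears after the jump. With fixed-size instructions two passes are always enough, which matches the point made above about the programmer specifying the jump length.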

Early alternatives to assemblers for systems programming

While it is certainly true that most compilers and operating systems in the 1950s and 1960s were written in assembler, the article overstates the degree of dominance. Notable examples of languages used for system implementation include

Algol 60
BLISS
ESPOL
FORTRAN^[a]
JOVIAL
NELIAC
Pascal^[a]
PL/I

Notes

[a]
I happen to believe that it was an awful choice -:(

Shmuel (Seymour J.) Metz Username:Chatul (talk) 13:12, 18 November 2010 (UTC)

I think that such lists should be avoided; there are too many languages for them to be documented in this fashion. For instance, you somehow missed naming what is arguably the most influential language of all, a language that has probably been used to create more operating systems than any other. That language is 'C', which often has the nickname of Portable Assembly or High Level Assembly. So how many other languages got missed from that list? How about something exotic like FORTH, which was an operating system complete unto itself? You have defined a nearly impossible task. If you limit it to generalities, then yes, I agree most operating system development in the later days was done in higher level languages. The earlier it was, the more likely it was to be done in assembly; it was an evolutionary process -- go back far enough and it was done in hand coded binary. OldCodger2 (talk) 07:52, 29 January 2013 (UTC)

You seem to miss the point here... The languages mentioned were used decades before C became dominant as a systems implementation language in the late 1980s (basically because of its tight association with UNIX). The early C (or B) of 1972 was a hack used to port an early version of UNIX to the PDP-11. B as well as early versions of C lacked typing and were basically just a simplified and – sadly enough – syntactically changed variant of Martin Richards's simple but elegant BCPL (which in turn was a partial implementation of the CPL language). A modern C appeared on the public scene around 1978, but it was not until 10 years later that it became really dominant. At that point, languages like Algol68, PL/1, BLISS, JOVIAL, PL/M, Simula, Pascal, Modula, and even Ada, had been used for systems programming for many years, just like C today. — Preceding unsigned comment added by 83.253.229.235 (talk) 01:33, 27 April 2014 (UTC)

Sample code missing language

User:Ankitapasricha (no talk page) added a sample code section, but failed to mention the language. Any sample code in an article covering multiple languages should indicate the specific language, and, in this case, also the hardware platform. If anybody knows the processor, OS and assembler for the sample code, please update Assembly language#Sample Code to reflect that.

Shmuel (Seymour J.) Metz Username:Chatul (talk) 17:06, 2 October 2014 (UTC)

Do we even need this section? It's nice-looking code and all, but what point is being made by this? Even if the language is given, do we expect readers to decode the instructions to follow the algorithm? If the point is just to give a "feel" for assembly, we have an image of some assembly code already included at the start of the article. --A D Monroe III (talk) 17:46, 16 October 2014 (UTC)

I could make a case for having examples, but I could also make a case for not having them, at least in the same article. My major concern is that if there are to be examples that they should be properly identified. I certainly have no objections should you decide to remove the unattributed example. Shmuel (Seymour J.) Metz Username:Chatul (talk) 17:01, 24 October 2014 (UTC)

reverted edit - opinions?

One of my recent edits was reverted:

While the original version may have been more concise I felt that it was incomplete. I'd appreciate it if someone could provide another opinion. Peter Flass (talk) 22:45, 12 February 2016 (UTC)

Typical applications - reverse engineering

Reverse engineering of software/firmware can be used for many reasons, from the most worthy to the most unworthy:

To address a flaw in a product that is no longer supported by the company who designed it, either because that company is defunct or because they moved on to fancier products...
- for example, the Turbo-C 1.0 compiler has been released into the public domain, but if you try to use a timing function on a modern computer (400 MHz+), it won't work. That particular library function would have to be rewritten for fast computers.
For educational purposes. Figuring out how things are done.
- for example, you can disassemble any product to learn how good (or bad) algorithms are written.
For making a competitive product, changing just enough to circumvent a patent.
For finding the security holes in a product in order to create viruses or other malware.

I wonder if the sentence that was added today in the article, belongs here or better under reverse engineering. Dhrm77 (talk) 13:26, 22 June 2016 (UTC)

Bytecode assemblers

Would it be worth making specific mention of bytecode assemblers (e.g., Jasmin)? Or are these subsumed under the notion of a virtual machine architecture? Peter Flass (talk) 13:33, 16 January 2012 (UTC)

Well, if you want to go down that path, I think you would have to start talking about Java, JavaScript, PHP, etc., all of which are Byte Code Assemblers for Virtual Machines. And none of those would really be appropriate for this page, which focuses on Machine Language as executed by actual Hardware. Certainly we get into grey areas with things like QEMU or MIX, but they still are dealing with Virtual Hardware. I feel that Byte Code would really be a separate topic. As a further consideration, nobody is expected to write programs in the Byte Code itself; it is an internal language that is not normally exposed. OldCodger2 (talk) 08:25, 29 January 2013 (UTC)

It seems to me that there isn't anything fundamentally different about Byte Code assemblers. Even the name is somewhat strange, as many hardware architectures are byte based. (See VAX for one example.) At one time, Sun did have hardware that at least partly executed JVM in hardware, and others could still build such hardware. As above, nobody is expected to write programs in JVM code, but pretty much nobody writes assembly code for the RISC processors, either, though someone has to write the templates for compiler code generators. So, it seems to me that JVM code has as much use here as, for example, SPARC assembly code. Gah4 (talk) 03:34, 20 October 2016 (UTC)

No, Java et al are not assemblers; the source languages are not machine oriented. Jasmin, OTOH, is, although the machine in question is virtual. Shmuel (Seymour J.) Metz Username:Chatul (talk) 15:04, 25 October 2016 (UTC)

Rumors are that Sun at least was working on hardware to execute JVM code, and might have even fabbed a chip. I don't like the name byte code, as all that means is that the opcodes and operand specifiers are in units of bytes, which is true for many hardware architectures. Sun calls (called) it JVM, which is fine with me. But maybe I was mixing assembly code and machine code. For most architectures, there is an assembly language designed along with the hardware. As long as there is only one assembler, or all assemblers accept the same input source, there is no confusion. Normally, one can discuss machine code and assembly code without any confusion. Since Jasmin is, as well as I know, not written by Sun, there can be confusion. But Sun (and now Oracle) supply the javap disassembler, so at least some of the assembler syntax had to be defined. (I don't know how close javap output is to what is needed for Jasmin input.) Even though JVM is mostly emulated, without any dedicated hardware, I don't find the fundamental ideas of machine code (JVM bits) and assembly code (Jasmin input source) fundamentally different from other machine code and assembly code. And yes, I was not trying to confuse Java source and assembler source. Gah4 (talk) 18:57, 25 October 2016 (UTC)

Machine Language Assembly Code

Since the assembly represents actual executable content and does not require decoding or compiling to run, assembler code is not "Source" code; it is simply code. More accurately, it is object code. This machine code may be loaded into memory and called with no need for compiling as long as it is in binary format. The term source code is improper but acceptable in a non-academic discussion. Whether it is appropriate in a Wikipedia Article I will leave up to others. Scottprovost (talk) 03:11, 14 November 2016 (UTC) Scottprovost (talk) 03:18, 14 November 2016 (UTC)

Incorrect. Assembly code is not the same as the machine code (what you call "binary format"), does not usually express the same thing as machine code, and is not necessarily invertibly mappable to machine code. Assembly code must be translated to the actual machine code. There is not necessarily a one-to-one correspondence between the two - for just one of the reasons, see the above comment regarding symbolic names for memory locations.

Anyway this opinion of yours is moot as far as Wikipedia is concerned unless you have references for recognized authorities in the field expressing it. (I seriously doubt your idea is common in academia.) Do you have references? Jeh (talk) 19:45, 14 November 2016 (UTC)

Not only is he wrong, but it's possible to write useful assembly code that generates no machine code, e.g., OS/360 SysGen Stage 1. Shmuel (Seymour J.) Metz Username:Chatul (talk) 21:19, 18 November 2016 (UTC)

Just to be sure, some call it Assembly Source. Yes it is source code, much more readable than object code. As for OS/360 sysgen, I would not call that assembly code, but use of the assembler for something else. But one can assemble a table of address constants, which aren't machine code, but are referenced by machine code, or even non-address constants. (The initialization for a Fortran COMMON block is all data, no instructions.) Gah4 (talk) 01:35, 19 November 2016 (UTC)

Assemblers scheduling instructions?

"Modern assemblers, especially for RISC architectures, such as SPARC or Power Architecture, as well as x86 and x86-64, optimize instruction scheduling to exploit the CPU pipeline efficiently." - What? There is no citation but this suggests there exists an assembler which can do this. The closest thing I can think of would be MIPS branch delay slots, where assemblers exist (such as GNU gas) that can fill the slot with an instruction. 24.85.180.193 (talk) 04:27, 11 January 2014 (UTC)

OK, how about for IA-64 (Itanium) where a 128 bit word contains three instructions, controlling different functional units at the same time. Hopefully the assemblers provide some help on organizing instructions such that things happen at the right time. Gah4 (talk) 05:44, 29 November 2016 (UTC)

Machine Language Assemblers

An assembler for assembling machine language uses Mnemonics to represent the binary codes of machine language. So it is not technically a separate language but an easier to remember alphabet for typing that machine language. Very high level macro-assemblers create what may look like an addition to the language giving rise to the idea that a new and separate language has been created but this article is about Machine Language Assemblers and should avoid such confusion IMHO Scottprovost (talk) 02:59, 14 November 2016 (UTC) Scottprovost (talk) 02:55, 14 November 2016 (UTC)

You are forgetting (or perhaps you were never aware of) the fact that the vast majority of assemblers, even non-macro assemblers, support symbolic names for memory locations - both for code (branch points) and for data. The assembler must assign numeric locations for each symbol and include the appropriate address in the generated binary code for each of the instructions that reference such locations. So it is not just a matter of changing mnemonic opcodes, etc., into the machine language. Plus, in most modern environments, the output of the assembler is not yet executable; it must be processed by a "linker"; some of the assembler language constructs are interpreted by the linker, or are instructions to it. Jeh (talk) 19:53, 14 November 2016 (UTC)

Usually there is a link step, but not always. For IBM S/360, there is a three card loader that will load an object program, as generated by the assembler, into core and start executing it. While the usual way to use an assembler is to generate relocatable object code, many have the ability to generate absolute addresses. For the IBM 360/20, the usual way to load programs is with a one card loader. (There is no OS, you just load and run your program.) Other systems have a similar ability to load and run programs. Gah4 (talk) 22:57, 29 November 2016 (UTC)

Errata assemblers, huh?

I count myself as an assembler expert, and have even written a few. I have never come across an errata assembler as described here, let alone as a class in its own right apart from multi-pass assemblers. Is this a crippled way to describe assemblers that give relocatable output? But in that case the term one-pass is misleading, as relocatable output must be processed by a linker before it can be executed.

I'm so nonplussed that I hesitate to try to clear things up. — Preceding unsigned comment added by 80.100.243.19 (talk) 14:09, 2 December 2014 (UTC)

I've read about such one-pass assemblers, although I've never encountered one. But then I only started in 1960, and there were a lot of assemblers out by then. Shmuel (Seymour J.) Metz Username:Chatul (talk) 20:17, 2 December 2014 (UTC)

I remember stories about Soviet processors, where each came with a list indicating which instructions didn't work. (Similar to a bad-block map for older disk drives.) I suspect that having assemblers and compilers that could work with such lists would be useful. I don't know if that is related to errata assemblers, though. Gah4 (talk) 22:59, 29 November 2016 (UTC)
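One possible reading of the disputed passage, offered only as an assumption since the thread does not quote the article's definition, is that "errata" means fix-up records: a one-pass assembler emits a placeholder for a symbol it has not yet seen and patches it once the symbol is defined. The sketch below illustrates that reading with an invented toy instruction set; the opcode values and the shape of the fix-up records are hypothetical. For simplicity it patches its own output buffer instead of handing the records to a linker or loader.

```python
# Toy one-pass assembler with back-patching. The instruction set, opcode
# values and the shape of the fix-up records are invented for illustration.
OPCODES = {"LOAD": 0x01, "JMP": 0x02, "HALT": 0xFF}


def assemble_one_pass(lines):
    """Assemble (label, mnemonic, operand) tuples in a single pass.

    Returns the code plus the list of fix-ups that were applied, where each
    fix-up is (offset_of_operand_byte, symbol_name).
    """
    code = bytearray()
    symbols = {}   # label -> address, filled in as labels are seen
    pending = {}   # undefined symbol -> operand offsets awaiting a value
    applied = []

    for label, mnemonic, operand in lines:
        if label is not None:
            symbols[label] = len(code)
            # The symbol is now defined: patch every earlier placeholder.
            for offset in pending.pop(label, []):
                code[offset] = symbols[label]
                applied.append((offset, label))
        if mnemonic is None:
            continue
        code.append(OPCODES[mnemonic])
        if operand is None:
            code.append(0)
        elif isinstance(operand, int):
            code.append(operand)
        elif operand in symbols:
            code.append(symbols[operand])   # backward reference
        else:
            pending.setdefault(operand, []).append(len(code))
            code.append(0)                  # placeholder for now

    if pending:
        raise ValueError(f"undefined symbols: {sorted(pending)}")
    return bytes(code), applied


PROGRAM = [
    ("start", "LOAD", 7),
    (None, "JMP", "done"),   # forward reference
    (None, "LOAD", 9),
    ("done", "HALT", None),
]
```

Here the source is read only once, and the single forward reference to `done` is recorded as a fix-up at offset 3 and patched when the label is reached. An assembler that cannot seek back in its output (paper tape, say) would presumably have to pass such records on to a later stage, which may be what the article was trying to describe.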

Link to assembler disambiguation page?

The last update changed the bare word assembler to a link. However, assembler is just a disambiguation page rather than an explanation of assembler programs. Is such a link appropriate? Shmuel (Seymour J.) Metz Username:Chatul (talk) 22:22, 7 February 2013 (UTC)

In general, you are not supposed to link to disambiguation pages. In some cases, though, it might be interesting to see other uses for a word. More specifically, what is an assembler assembling? Is that meaning related to other uses for the words assembler and assembly? (And, for that matter, should it be assembly language or assembler language?) Gah4 (talk) 23:11, 29 November 2016 (UTC)

A pseudo-opcode is a directive

A pseudo-opcode is a directive, and some assemblers use only the term pseudo-op for the functions listed in the article as pertaining to directives. The text

* An '''assembler directive''' or ''pseudo-opcode'' is a command given to an assembler. These directives may do anything from telling the assembler to include other source files, to telling it to allocate memory for constant data. Some assemblers use special syntax for directives; others do not.

should be reinstated Shmuel (Seymour J.) Metz Username:Chatul (talk) 14:57, 22 August 2010 (UTC)

A pseudo-opcode is not a directive. A pseudo-opcode is a stand-in opcode for another opcode. For example, many older CPUs do not have a nop instruction. But often there is another instruction that can be used instead with the same effect as a nop. In 8086 CPUs the instruction xchg ax,ax was always used for nop. With nop being a pseudo-opcode to encode the instruction xchg ax,ax.

That's not a pseudo-op, just an alias (sometimes called an extended mnemonic). Shmuel (Seymour J.) Metz Username:Chatul (talk) 17:51, 23 August 2010 (UTC)

A directive is something that generates no output code but instead directs the assembler to do some internal function.

So is a pseudo-op. Of course, some pseudo-ops do generate output, e.g., the PUNCH statement in the System/360 assemblers. Shmuel (Seymour J.) Metz Username:Chatul (talk) 17:51, 23 August 2010 (UTC)

Please don't misrepresent what I wrote. I said "generates no output code" not "generates no output". HumphreyW (talk) 18:00, 23 August 2010 (UTC)

Fine, then try the DC statement, which can be used to generate output code. And lest you quibble about that not being a directive, the article says For example, directives would be used to reserve storage areas and optionally their initial contents. Shmuel (Seymour J.) Metz Username:Chatul (talk) 11:45, 24 August 2010 (UTC)

Even the names say exactly what they are, a directive directs the assembler, and a pseudo-opcode is a fake opcode for something else. HumphreyW (talk) 16:52, 22 August 2010 (UTC)

Agree with HumphreyW. It may be true that some assemblers use the same term for both; nevertheless the concepts are quite different and should be named differently here. Perhaps the article could say something like "some assemblers, including x, y, and z, use the term 'a' for both 'a' and 'b'." Jeh (talk) 07:47, 23 August 2010 (UTC)

I have already made some changes to the main article. But I think it would not be a good idea to start listing specific assemblers that misuse the term. HumphreyW (talk) 07:53, 23 August 2010 (UTC)

I'm just thinking that a statement that "some assemblers use one term for both" would really require support from an example. Jeh (talk) 11:30, 23 August 2010 (UTC)

What is the referent for both? Some assemblers use the term pseudo-op for what the article calls directive, and AFAIK the term is older than directive in that context. See, e.g., IBM (April 1964). IBM 7090/7094 Programming Systems FORTRAN II Assembly Program (FAP). C28-6235-3. IBM (December 30, 1966). IBM 7090/7094 IBSYS Operating System Version 13 Macro Assembly Program (MAP) Language. Fifth Edition. C28-6392-4.

I can provide examples of assemblers that use the term properly. What is your basis for claiming the historical usage to be a misuse? Shmuel (Seymour J.) Metz Username:Chatul (talk) 17:51, 23 August 2010 (UTC)

The nomenclature should agree with what is actually used by the authors of assemblers, not the local CS department. Shmuel (Seymour J.) Metz Username:Chatul (talk) 17:51, 23 August 2010 (UTC)

IBM (1961). FORTRAN Assembly Program for the IBM 709/7090. pp. 24–53. J28-6098-1. In addition to recognizing all the 709 machine operation codes and extended operation codes listed in the 709 Reference Manual, the FAP language also recognizes the following psueod-operations, described in detail in the succeeding chapters.

psueod-operations [sic] is not a pseudo-opcode. HumphreyW (talk) 18:00, 23 August 2010 (UTC)

In matters of language, e.g. word usage, I think ancient references are less good than current ones - language, especially technical language, does evolve. Jeh (talk) 19:07, 23 August 2010 (UTC)

Is April, 2010 too ancient? AIX Version 6.1 Assembler Language Reference, "Pseudo-ops are sometimes called assembler instructions, assembler operators, or assembler directives." Shmuel (Seymour J.) Metz Username:Chatul (talk) 23:39, 23 August 2010 (UTC)

I see three problems with your reference.

It says "pseudo-ops" and not "pseudo-opcode".

Fine, then use pseudo-op in the article. Historically pseudo-op and pseudo-operation are synonymous: IBM (1961). FORTRAN Assembly Program for the IBM 709/7090. pp. 24–53. J28-6098-1. Shmuel (Seymour J.) Metz Username:Chatul (talk) 11:45, 24 August 2010 (UTC)

It uses weasel words. "sometimes"? When and by whom?

That would be relevant if I were arguing for the legitimacy of directive; it doesn't use weasel words about the use of pseudo-op. Shmuel (Seymour J.) Metz Username:Chatul (talk) 11:45, 24 August 2010 (UTC)

It is very vague and seems to cover everything with one catch-all term.

Historically the term pseudo-op has covered everything that is not a machine instruction or macro invocation. The same is true of the term directive; it's as much of a catch-all term as pseudo-op. Shmuel (Seymour J.) Metz Username:Chatul (talk) 11:45, 24 August 2010 (UTC)

But more pointedly, do you really think that using that as a reference would really help the encyclopaedia? It looks to me like it creates more confusion rather than actually clarify anything. HumphreyW (talk) 02:06, 24 August 2010 (UTC)

Have you stopped beating your wife? You wanted a reference as to the legitimacy of the term, and I provided references. The question is whether the reference establishes the usage, not whether it is the best reference to cite in the article. I've established that the usage is decades old and still current. That should be enough to justify restoring the deleted text. Shmuel (Seymour J.) Metz Username:Chatul (talk) 11:45, 24 August 2010 (UTC)

Reading through this discussion again I see that there is confusion about what is being discussed. I have been careful to always say pseudo-opcode, but I think the point did not get across properly. One commenter talks about pseudo-op, and said "Fine, then use pseudo-op ". Well we can't do that because it is not the same thing. Pseudo-op is ambiguous and can mean other things. Pseudo-operation is also ambiguous and is not used consistently in many cases. Pseudo-opcode has a very clearly defined meaning, and currently the article gives that meaning.

No, it does not have a clearly defined meaning. Shmuel (Seymour J.) Metz Username:Chatul (talk) 15:22, 7 September 2010 (UTC)

If one wants to add pseudo-op to that article, then by all means do that. But please do not do it at the expense of removing/replacing the pseudo-opcode description. It will require a different/new description explaining the ambiguity and alternative uses. However I think that adding pseudo-op, and talking at length about various meanings in various places, will not help to make the article any clearer or better. HumphreyW (talk) 14:07, 6 September 2010 (UTC)

All of the terms are ambiguous. My intent is not to replace one parochial view with another parochial view, but rather to convey the fact that the nomenclature is not standardized; in fact, not even the taxonomy is standardized. The initial issue was the removal of text indicating the variability.

What I'd like to do is to use neutral descriptions of various categories and cite the various terms that are used for those categories, with references if that's not TMI. Can I do that without the changes being reverted? Shmuel (Seymour J.) Metz Username:Chatul (talk) 15:22, 7 September 2010 (UTC)

I wonder if it is even worthwhile to have such a lengthy description of taxonomy and/or nomenclature? Since you say it is currently wrong (or misleading, or whatever), then maybe the whole section should actually be axed from the article. But to replace it with something even more cumbersome and lengthy seems unhelpful. If it is really as complicated as you say then it sounds like it is just going to confuse readers more than help them. If every term can mean every other term then the terms themselves become useless.

As for your question "Can I do that without the changes being reverted?" that is unanswerable. There are so many editors here on Wikipedia that no one can speak for all of them. What I would suggest is that you put your proposal here on the talk page; assuming it is well referenced with relevant sources and no one bitterly complains, it can be copied into the main page later. That will likely give the best chance of not having edits reverted. HumphreyW (talk) 04:41, 8 September 2010 (UTC)

I wholly support the mention of alternative usage of technical terms as used in reliable sources. I think we should stay away from "x vendor says y" because there's just too much variation across vendors and authors. A short nomenclature section that briefly discusses the lack of an industry standard or clear norm and makes it clear that definitions in this article are not entirely representative sounds very reasonable. It could also be done with footnotes where appropriate, though this sounds more cumbersome. ButOnMethItIs (talk) 05:00, 8 September 2010 (UTC)

I've added some text to make it more neutral, and have also added a reference to support the usage of pseudo opcode as equivalent to directive. I was going to make it a footnote, but I saw that the article has a long list of references that are simply links. Shmuel (Seymour J.) Metz Username:Chatul (talk) 14:59, 8 September 2010 (UTC)

OK, I am a little late to this one, but it seems to me that different assemblers, or assemblers for different processors, use different names for some terms. A table indicating the meaning, and the different names would be useful. Still remembering from when I was first learning about assemblers, this was what confused me. The descriptions, in at least the IBM manuals, of assemblers mostly describes what the assembler does. You need to find somewhere else the descriptions of machine instructions, and how to use them. Gah4 (talk) 05:25, 29 November 2016 (UTC)

As far as I can tell, IBM OS/360 assemblers call all these assembler instructions, and PDP-10/MACRO-10^[1] call them all pseudo-ops.

And VAX/Macro^[2] seems to call them assembler directives. Gah4 (talk) 06:31, 29 November 2016 (UTC)

For the IBM System/360 line, a given assembler will have a programmer's guide^[3] and a language reference manual^[4], neither of which explains the semantics of machine instructions in detail. The architectural details are in a separate principles of operation manual^[5], possibly supplemented by manuals on specific features. Other vendors do something similar, although some do include hardware details in their assembler documentation. Shmuel (Seymour J.) Metz Username:Chatul (talk) 22:13, 29 November 2016 (UTC)

Yes, IBM also had both manuals for most compiled languages, keeping the language definition separate from how to use the compiler. Some other companies don't make this distinction. Gah4 (talk) 23:43, 29 November 2016 (UTC)

Maybe I am the only one to learn OS/360 Assembler only reading IBM reference manuals. It was some time of looking at the manuals, and then understanding the distinction, and Assembler Instruction doesn't make it easier, if you don't yet know the distinction. In the cases I show above, the same wording is used for allocating and initializing data blocks, defining macros, defining entry points and external names, and formatting the output listing. That is, the distinctions that some used above don't seem to exist in the cases shown. Gah4 (talk) 23:43, 29 November 2016 (UTC)

References

[1]
decsystem10 Macro Assembler Reference Manual (PDF). DEC. AA-C780C-TB. Retrieved 29 November 2016.
[2]
VAX-11_MACRO Language Reference Manual (PDF). DEC. Retrieved 29 November 2016.
[3]
OS/VS-VM/370 Assembler Programmer's Guide (Fifth ed.). IBM. September 1982. GC33-4021-4.
[4]
OS Assembler Language OS Release 21 (Tenth ed.). IBM. January 1974. GC28-6514-9.
OS/VS-DOS/VSE-VM/370 Assembler Language (Sixth ed.). IBM. March 1979. GC33-4010-5.
[5]
IBM System/360 Principles of Operation (EIGHTH ed.). IBM. September 1968. A22-6821-7.
IBM System/370 Principles of operation (Eleventh ed.). IBM. September 1987. GA22-7000-10.

Not one to one

Assemblers are, in general, not one-to-one. They frequently have multiple mnemonics for the same opcode, and may perform optimizations, e.g., selecting near branches versus far branches. Then there are statements like EQU that do not generate code at all. Shmuel (Seymour J.) Metz Username:Chatul (talk) 18:22, 26 June 2017 (UTC)

OK, for the actual quote: Unlike high-level languages, there is usually a one-to-one correspondence between simple assembly statements and machine language instructions. I do agree that it could use fixing, though I am not so sure about your explanations. I read it as one line of assembler source generates one hardware machine instruction. You do have to separate what IBM calls assembler instructions (which complicates the quoted part) and others call pseudo-ops, from machine instructions. I don't read it to disallow that different assembler opcodes can generate the same machine opcode, or vice versa. Just that, as written, it is one of each. But okay, CNOP can generate more than one NOPR. But then again, it says usually. Another complication in counting is prefixes, such as in x86. They might be written on separate lines, or on the same line (an actual prefix) as the instruction they apply to, or even part of an address specifier (segment overrides). Some assemblers might fill a branch delay slot with a NOP. I suspect that there are some other cases, in other assemblers, but rare enough to satisfy usually. And, of course, there are macros which, sort of by definition, often generate more than one instruction.

I agree that most assemblers support a few non-one-to-one operations, but the main distinction of assemblers from compilers is that assemblers focus on one-to-one operations, while compilers have little or no concept of one-to-one support. The "usually" statement emphasizes that distinction, as we should. So, "one to one" must stay. Is there some suggestion for tweaking the wording? --A D Monroe III (talk) 23:53, 26 June 2017 (UTC)

As I suggested above, there is the complication that IBM calls the assembler source for executable instructions machine instructions, which is confusing, as the quoted statement calls the assembler output machine instructions. Statements that don't generate anything, but tell the assembler something are called assembler instructions. The quoted statement could have some way to make the distinction between such statements. Gah4 (talk) 02:38, 27 June 2017 (UTC)

Why? Of course there are details of reality that are more complicated than a single sentence can ever completely encompass. But the quoted statement is not attempting to completely encompass everything; it's only stating the main distinction between assemblers and compilers. I see nothing brought up here that affects the stated distinction, or implies the "usually" qualification insufficient or incorrect. Again, the wording might be improved, but we need to keep its substance and simplicity. --A D Monroe III (talk) 19:49, 27 June 2017 (UTC)

Yes. If someone happens to have better wording, it would be interesting to see. No complaints about the usually, but about the meaning of machine instruction, which can mean different things to different people. Gah4 (talk) 22:40, 27 June 2017 (UTC)
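To make the distinction discussed in this thread concrete, here is a minimal sketch in Python of a toy assembler in which each machine-instruction statement emits exactly one encoded instruction, a label emits nothing, and a repeat directive emits several instructions from one line. The names `ENCODINGS` and `assemble` and the simplified `times N` handling are hypothetical and invented for illustration; only the three one-byte 32-bit x86 encodings (`inc ecx` = 41, `nop` = 90, `ret` = C3) are real, and no actual assembler is claimed to be structured this way.

```python
# Toy illustration only: hypothetical names, not any real assembler's design.
# One-byte x86 encodings (32-bit mode) for three operand-free statements.
ENCODINGS = {
    "inc ecx": bytes([0x41]),
    "nop": bytes([0x90]),
    "ret": bytes([0xC3]),
}


def assemble(lines):
    """Return a list of (statement, emitted_bytes) pairs, one per statement."""
    listing = []
    for raw in lines:
        stmt = raw.split(";", 1)[0].strip()  # drop comments and whitespace
        if not stmt:
            continue  # blank or comment-only line
        if stmt.endswith(":"):
            # A label tells the assembler something but generates no code.
            listing.append((stmt, b""))
        elif stmt.startswith("times "):
            # Repeat directive, loosely modeled on NASM's TIMES prefix:
            # one source line, several machine instructions.
            _, count, rest = stmt.split(None, 2)
            listing.append((stmt, ENCODINGS[rest] * int(count)))
        else:
            # The usual case: one statement, one machine instruction.
            listing.append((stmt, ENCODINGS[stmt]))
    return listing


source = [
    "start:           ; label, emits nothing",
    "inc ecx          ; one-to-one",
    "times 3 nop      ; one-to-many",
    "ret              ; one-to-one",
]

for stmt, code in assemble(source):
    print(f"{code.hex().upper():<8} {stmt}")
```

In this sketch the "usually" in the quoted sentence corresponds to the final `else` branch; the label and the repeat directive are the kinds of exceptions raised above.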

Potential Citation/Reference Issue

Don't like the example

Resolved

marked as resolved by Diamondl (talk) 03:25, 18 January 2018 (UTC)

I really don't like the section called "Example listing of assembly language source code". It provides no useful information and is in no way informative. Maybe a section of code from a real micro(processor/controller) would be useful, but a snippet of code from a virtual device is useless. — Preceding unsigned comment added by Mtpaley (talk • contribs) 22:57, 20 December 2012 (UTC)

I agree; the example fails to convey anything meaningful. It appears to be just a bunch of random additions without any apparent purpose. Would it be okay with people if I were to replace it with this code instead? (taken from an actual program) OldCodger2 (talk) 09:41, 29 January 2013 (UTC)

       Example: x86 32 bit  NASM, Note: this is a subroutine not a complete program.
    
    
   178                                  ;ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
   179                                  ;
   180                                  ; counts a zero terminated ASCII string to determine its size
   181                                  ; in:   eax = zstr.addr
   182                                  ; out:  ecx = zstr.count
   183		    
   184                                  zstr_count:                       ; entry point 
   185                                  
   186 00000030 B9FFFFFFFF                      mov  ecx, -1              ; init the loop counter, pre-decrement to compensate for increment
   187                                  
   188                                  .loop:
   189 00000035 41                              inc  ecx                  ; add 1 to the loop counter
   190                                  
   191 00000036 803C0800                        cmp  BYTE [eax + ecx], 0  ; compare the value at the base memory address + the loop offset to zero
   192 0000003A 75F9                            jne  .loop                ; if the memory value is Not Equal to Zero then jump to the label called '.loop'
   193                                  	
   194                                  .done:
   195                                                                    ; we don't do a final increment because, even though the count is base 1, 
   196                                                                    ; we do not include the zero terminator in the count
   197 0000003C C3                              ret                       ; return to the calling program
   198                                  	
   199                                  	
   200

I agree, and I note that one difference is that the suggested example specifies the architecture and the assembler for the listing. However, it might be better to create a separate article for assembler examples and to include examples for disparate assemblers, including IBM's HLASM and an assembler for at least one RISC. Shmuel (Seymour J.) Metz Username:Chatul (talk) 21:57, 31 January 2013 (UTC)

Okay, thanks for the feedback; glad you agree. I have gone ahead and updated the article. Hopefully this won't lead to any disagreements. I will leave the creation of a separate page with lots of assembly examples for a different day. OldCodger2 (talk) 18:58, 5 February 2013 (UTC)

CPU loading

Someone asked: are there programmable devices that have assemblers but aren't computers?

Overly broad claim.

Assembly language has the statement "Despite the power of macro processing, it fell into disuse in many high level languages (major exceptions being C, C++ and PL/I) while remaining a perennial for assemblers." The preprocessor facilities of C and C++ are not particularly powerful, and can't even do simple computations.

I added the footnote "Of those listed, only the PL/I macro facility is Turing Complete" and user:Wtshymanski reverted it with no explanation. I'm throwing this open to discussion before I reinstate my correction. Shmuel (Seymour J.) Metz Username:Chatul (talk) 17:08, 24 May 2018 (UTC)

This is an article about assembly language, not macro preprocessors. Macro use is hugely common in C, even today, and even though it's not powerful. The point here isn't whether macro languages are powerful in particular languages; it's about whether they're used or not. Andy Dingley (talk) 17:20, 24 May 2018 (UTC)

That's a good reason to remove the references to C and C++ entirely. It's not a good reason to make misleading claims about them, or to remove clarifications of those claims. And what is the it that's hugely common in C? Certainly not the types of macros that have been common in assembler code for the last half century. Shmuel (Seymour J.) Metz Username:Chatul (talk)

Does anyone but comp-sci profs actually care if thus-and-so is Turing complete? And more importantly, is it necessary to send the reader of this article off on a wild goose chase to learn what "Turing complete" means? --Wtshymanski (talk) 19:34, 24 May 2018 (UTC)

Is it necessary to give the reader a misleading reference to C and C++? If you don't want the reference to those languages to be clarified, then remove them entirely, not just the clarification. Shmuel (Seymour J.) Metz Username:Chatul (talk) 15:02, 25 May 2018 (UTC)

I was going to say that it belongs in an article on assemblers, and not on the language that they process, but it seems that this is also, with a redirect, the article on assemblers. There are articles on specific assemblers, but not on the general idea of an assembler program. One solution is to actually split this up, with the more theoretical parts here, and more practical parts, such as the need for multiple passes, in another article. Since macros are an integral part of many assemblers, discussing them doesn't seem so far off. Gah4 (talk) 20:37, 24 May 2018 (UTC)

Assembly language is the article on the general idea of an assembly program; the presence of examples doesn't change that. If you know of something that makes a claim specific to a particular assembler, please correct it or point it out here. BTW, do you know of an inappropriate redirect to Assembly language? Shmuel (Seymour J.) Metz Username:Chatul (talk) 15:02, 25 May 2018 (UTC)

I was thinking along the lines of the difference between IBM's Language Reference and Programmer's Guide, or more generally, between theory and practice. As an example, which may not actually apply here, it is usual to use a hash table in an assembler to keep track of symbols. That is an implementation detail unrelated to the actual language. More obviously, as I noted above, the use of multiple passes. I do believe that it is possible to separate the language from the programs that process it, and still independent of any specific assembler. Gah4 (talk) 18:28, 25 May 2018 (UTC)
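As an illustration of the two implementation details mentioned above (a hash-table symbol table and multiple passes), here is a minimal sketch in Python. The machine is entirely hypothetical: every instruction is assumed to be two bytes (an opcode byte plus an operand byte), and the opcode values, `OPCODES`, `parse`, and `assemble` are invented for the example. A Python `dict` stands in for the hash table. This shows only why a forward reference is easy to resolve with two passes, not how any particular assembler works.

```python
# Hypothetical two-byte-per-instruction machine, invented for illustration.
OPCODES = {"load": 0x01, "jmp": 0x02, "halt": 0xFF}
INSTRUCTION_SIZE = 2  # opcode byte + operand byte


def parse(line):
    """Split a line into (label or None, mnemonic or None, operand or None)."""
    text = line.split(";", 1)[0].strip()  # drop the comment first
    label = None
    if ":" in text:
        label, text = (part.strip() for part in text.split(":", 1))
    fields = text.split()
    mnemonic = fields[0] if fields else None
    operand = fields[1] if len(fields) > 1 else None
    return label, mnemonic, operand


def assemble(lines):
    parsed = [parse(line) for line in lines]

    # Pass 1: record each label's address in the symbol table.
    # A Python dict is a hash table, so pass 2 lookups are O(1) on average.
    symbols = {}
    address = 0
    for label, mnemonic, _ in parsed:
        if label is not None:
            symbols[label] = address
        if mnemonic is not None:
            address += INSTRUCTION_SIZE

    # Pass 2: emit code, resolving symbolic operands through the table.
    code = bytearray()
    for _, mnemonic, operand in parsed:
        if mnemonic is None:
            continue
        if operand is None:
            value = 0  # pad operand-free instructions with a zero byte
        elif operand in symbols:
            value = symbols[operand]
        else:
            value = int(operand, 0)  # numeric literal
        code += bytes([OPCODES[mnemonic], value])
    return symbols, bytes(code)


program = [
    "        jmp  end     ; forward reference, unknown until pass 1 finishes",
    "loop:   load 7",
    "        jmp  loop    ; backward reference",
    "end:    halt",
]

symbols, code = assemble(program)
print(symbols)
print(code.hex())
```

Swapping the `dict` for a sorted list or a tree would change only the lookup cost, not the language being assembled, which is the separation between language and implementation suggested above.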

Does wiki currently have articles on compiler implementations^[a]? If so, one on assembler implementations might be a good idea if someone is willing to do the work. Such an article could go into the tradeoffs among, e.g., a hash table, a B-tree, sorting the symbol table. Shmuel (Seymour J.) Metz Username:Chatul (talk) 17:28, 27 May 2018 (UTC)

[a]
Programming language implementation is not that article

Related Articles