Talk:Type system/Archive 1
This is an archive of past discussions about Type system. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.
Liskov substitution
Taku, I appreciate your intent, but the discussion of Liskov Substitution Principle was out of place the way it was. I have removed it.
If we want to integrate all discussions of type systems into a single article, that's fine with me. If we want to put the Liskov discussion into the Subtype article, that's fine too (though we ought to mention that Liskov's is only one definition of "subtype"). However, just unceremoniously sticking the Liskov stuff into this article is definitely not appropriate.
Furthermore, a redirect is certainly not appropriate. "Liskov Substitution Principle" is not equivalent to "Datatype". Therefore, while the LSP article could be a very small article that just points to Datatype, IMHO it should not be a redirect. -- Doradus 14:25, 20 Sep 2003 (UTC)
- What I wondered is whether LSP is really such a big topic. To me, it doesn't say more than what I already know about subtypes. Since subtype is redirected to datatype, it makes sense to me to merge LSP here. Subtype certainly deserves its own article, but for now it is fine to put a redirect to subtypes. To use an analogy, it is like having no object-oriented programming article while there is an article about what a class is. LSP looks out of place because this article should be expanded more and subtype should have its own article. -- Taku 00:12, 22 Sep 2003 (UTC)
- You make a good point. I had thought "Subtype" had its own article. Perhaps we should take a look at all the type-related pages (including LSP) and look for a way to merge them. -- Doradus 10:51, 22 Sep 2003 (UTC)
What a type specifies
Taku, Taku, Taku. The "what a type specifies" section is totally bogus. "alighments"? Sheesh. Byte size? I'm speechless. It's truly exhausting keeping track of your changes. Sometimes I wish you weren't so exuberant and prolific. :-) -- Doradus 15:20, 3 Oct 2003 (UTC)
- I am not sure what is wrong with that section. For example, you have the int type in C, and it specifies that a value of the type is usually 4 bytes long, signed, and so on. Maybe the wording doesn't make sense to you. Please remember I am always happy to see you editing too. -- Taku
- int = 4 bytes? Sorry to tell you, but int >= 2 bytes. It should be the size of a CPU word, which resulted in lots of broken programs when the transition from 16-bit (desktop) CPUs to 32-bit was made (you are probably too young to remember). For most 64-bit CPUs they did not bother to make the change (as they should have done) because there are too many programmers (like you) who make false assumptions about int, so there are even more buggy programs around these days. --Krischik T 18:58, 26 June 2006 (UTC)
- Be careful generalizing based on what happens in C. If you only look at C, then a "function" is any subroutine whatsoever, a char is one byte in size, all parameters are passed by value, etc.
- Datatypes may occasionally specify object sizes, layouts, and padding, but not usually. Can you tell the size, layout, or padding of types in Haskell? Lisp? Eiffel? Smalltalk? Python? Pascal? BASIC? Java specifies sizes of primitive types, but not of objects. Even C doesn't specify how big an int is (despite your assertion) nor how structs should be padded. In fact, I'm having a hard time thinking of any languages in which a datatype specifies size, layout, and padding.
- Well there is one: Ada allows you to specify any aspect of a primitive datatype - including the range of valid values or how to pack them into an array. --Krischik T 18:58, 26 June 2006 (UTC)
- Anyway, I'm sorry if I was condescending before. -- Doradus 01:07, 4 Oct 2003 (UTC)
- Oh, I see now what confused you. You are thinking of the specification of languages. Sure, the C specification only says int must have the natural size for the target machine. What I was thinking of is what kind of information compilers use to analyze code and what information is still available at run-time. For example, you can say virtual methods are kept even at run-time to enable dynamic binding. I guess it is really confusing to say what a datatype specifies. For now, I have just removed that portion and will put in a revised version later. Let me know your thoughts then! -- Taku
- Refactoring type-related articles gets complicated; I put a list of related articles in the see also section of the article. They surely need to be merged or separated. -- Taku
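For anyone wanting to check the int-size point on their own machine, here is a minimal sketch in Python. It uses the standard ctypes module to ask how wide the platform's C int is. The helper name c_int_report is hypothetical, and the range calculation assumes 8-bit bytes and two's complement, which holds on mainstream platforms but is an assumption rather than a guarantee.

```python
import ctypes

def c_int_report():
    """Return (size_in_bytes, min_value, max_value) for the platform's C int."""
    size = ctypes.sizeof(ctypes.c_int)   # implementation-defined, not fixed at 4
    bits = size * 8                      # assumption: 8-bit bytes
    # assumption: two's complement representation
    return size, -(2 ** (bits - 1)), 2 ** (bits - 1) - 1

size, lo, hi = c_int_report()
print(f"int is {size} bytes here; range {lo}..{hi}")
```

The only portable expectation is that the reported size is at least 2 bytes; anything beyond that is a property of the platform, not of the type.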
Strong/weak vs. static/dynamic
There still seems to be some loose use of language in this article, particularly on the difference between the strong/weak value axis and the static/dynamic type-checking variable axis.
Type safety means that the language will not execute code which uses a value in an operation for which its type is not acceptable. Virtually all languages make some effort towards type safety, in the form of type-checking (be it static or dynamic) and type-conversion. A failure of type-safety occurs in C or assembler when one treats the memory location of a short integer as if it contained a pointer -- and indirecting through the "pointer", one reaches a nonsense value.
Static type-checking means that the type-safety of a program is verified at compile-time, in terms of its variables. Static type-systems may require type declaration, or may use type inference -- either way, a type is assigned to each variable and function argument, and it is shown that no type-unsafe calls are made. Dynamic type-checking means that types are checked at run-time, in terms of argument values. A language which does not check types at run-time (e.g. C) does not have dynamic type-checking.
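As a small illustration of checking "in terms of argument values", here is a sketch in Python, a dynamically checked language. The function name half is made up for the example. Nothing is verified when the function is defined; the check happens only when the division actually runs on a given value.

```python
def half(n):
    # No declared parameter type: the check happens when the operation runs,
    # against the value actually passed in.
    return n / 2

print(half(10))   # an int value passes the check

try:
    half("10")    # a str value fails the check, but only when this call executes
except TypeError as err:
    print("run-time type error:", err)
```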
Static type-checking has been called incompatible with certain kinds of dynamic metaprogramming. This is the topic of a current controversy on comp.lang.lisp which may be enlightening (when it isn't being flamy).
Strong typing means that a value of one type may not legally be used in place of a value of another type. The set of operations which will accept a type is exactly the set of operations defined in terms of that type. Values must be converted or cast.
- Then the Scheme language is strongly typed rather than weakly typed, and the article's table should be corrected because of that. E.g. you have to explicitly convert between exact and inexact integers. But AFAIK, you cannot add your own types to the given, strong type system. --Thomas Hafner
Weak typing is the opposite: conversion is automatic and implicit. Weak typing does not imply that type-unsafe operations must be executed: it does imply that undesirable conversions (such as number truncation) may occur in order to prevent same. A single language may be strong with respect to some type differences, and weak w.r.t. others; weak typing might be more pleasantly called permissive or implicit-conversion-ful. :)
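To make the "strong with respect to some type differences, weak with respect to others" point concrete, here is a short sketch using Python as the example language. The variable names are only for illustration. Python refuses to mix strings and integers, yet converts an integer to a float implicitly.

```python
# Strong with respect to str/int: no implicit conversion, the mix is rejected.
try:
    "69" + 1
except TypeError:
    mixed = "rejected"

# An explicit conversion states the intent and is accepted.
explicit = int("69") + 1

# Permissive with respect to int/float: the int is converted implicitly.
widened = 1 + 2.5

# Truncation does not happen silently here; it has to be requested.
truncated = int(2.9)

print(mixed, explicit, widened, truncated)
```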
http://cliki.tunes.org/Type%20System has a discussion of these and two other axes upon which type-systems can be described.
Static type-checking prevents you from compiling a program with type errors or undeclared ambiguities in it. Dynamic type-checking stops the type error from causing dangerous behavior, usually by stopping the program. Strong typing protects you from unexpected truncations and other conversion problems.
Here are some possible type-related bugs, and the way that different type systems respond to them:
- I tried to use an integer value as a pointer, and wrote junk data on top of something. Any type-safe system (static or dynamic) will prevent this, by raising a type error either at compile-time or run-time. A system which is weakly typed with respect to the integer/pointer distinction is type-unsafe on most computer architectures!
- I used an integer operation on a real number, but I don't really need the decimal part. A static strong type-system will raise an error at compile-time. A dynamic strong type-system will raise an error at run-time. A weak type-system will convert your real to an integer -- which is not a bug in this case, but see the next.
- I used an integer operation on a real number, and returned an incorrect answer because of it. Any strong type-system will prevent this. Type-systems which are weak (or "permissive") with respect to numeric conversion will not; they will cheerfully truncate your real.
- I used the string "69" where I meant the number 69. Only a type-system that is weak wrt the string/number distinction -- such as Perl's -- will accept this code.
- There's a library function which usually returns a Foo object, but sometimes returns integer zero to indicate error. I write a function which can only accept a Foo, and call it on the return of the library function. A static type-system will refuse to compile your function. A dynamic type-system will not raise a fuss until the erroneous case actually happens. Only a type-unsafe system (such as assembler) will let your code try to use the zero as if it were a Foo, and get corrupt data.
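The last case in the list above can be sketched in Python, a dynamically checked language. Foo, lookup, and describe are hypothetical names standing in for the library and the caller's function. The usual case passes, and nothing complains until the error case actually occurs.

```python
class Foo:
    def __init__(self, name):
        self.name = name

def lookup(key):
    """Hypothetical library call: usually a Foo, but integer 0 signals an error."""
    return Foo(key) if key else 0

def describe(foo):
    # Accepts only a Foo; the check runs against the value at call time.
    if not isinstance(foo, Foo):
        raise TypeError(f"expected Foo, got {type(foo).__name__}")
    return f"Foo({foo.name})"

print(describe(lookup("a")))   # the usual case passes

try:
    describe(lookup(""))       # the rare error case is caught only when it occurs
except TypeError as err:
    print("caught:", err)
```

A static checker would instead reject the call to describe up front, because the library function's result is not always a Foo.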
And here are some languages of each category:
- Static, strong: Haskell (type-inference); Java (type declaration)
- Static, weak: C (type declaration -- also, plenty of ways to get around the type-system and do something type-unsafe!)
- Dynamic, strong: Common Lisp (type declarations are optional); Python
- Dynamic, weak: Perl (loves to do implicit conversions ....)
--FOo 23:16, 15 Nov 2003 (UTC)
Structure
As you might notice, I made a major structural edit. I basically eliminated a lot of duplication and reorganized the contents for readability. I think I worked very carefully not to lose important points, but if you notice anything missing, don't hesitate to restore it. Also, my interest was in organization (as usual?), so spelling and grammar might be terrible. Any copyedit is highly welcome. -- Taku 06:44, Apr 8, 2004 (UTC)
Examples
In Subsection 2.1 Static and dynamic typing, in the paragraph that begins with "By contrast, a purely dynamically typed system," a code example is referenced that doesn't appear until later, in the next section, as if it would be close by.
What is the best way to fix this, do you think?
Ehn 19:59, 28 Jul 2004 (UTC)
Latent typing
Latent typing redirects here, but there is no mention of it on this page, let alone a prominent one. - Furrykef 16:12, 4 Sep 2004 (UTC)
Static/Dynamic typing
It's odd that there aren't any separate pages dedicated to static and dynamic typing. There could be pages full of pros, cons, examples, languages, ... but all there is is a small section here? Wouter Lievens 08:59, 22 Mar 2005 (UTC)
- Maybe we should, as that section used to be a separate page. I merged it here because I thought discussing datatype without mentioning typing makes little sense. -- Taku 14:39, Mar 22, 2005 (UTC)
- The mention of type systems and the discussion of type systems are two totally different things. Of course this page should not go without a mention and link to the two other pages, but it shouldn't actually contain them (is my opinion). --seliopou 20:35, 2 Apr 2005 (UTC)
- When I have time, I shall draft new static and dynamic typing pages. Wouter Lievens 4 July 2005 22:59 (UTC)
Reorganization
Discussing static and dynamic typing in one article isn't that bad because they're alternative implementations of the same idea and thus about the same topic. What concerns me more is that a lot of type-related topics redirect here: Real data type, Dynamic typing, Data type, Static typing, Type system, Type checking, Latent typing, Type rule, Type (computer science), Datatypes. I'm afraid the current article is so broad it stops potential contributions. We could sketch how the material would be best divided into introductory articles that would refer to more special topics. It's nice how there's a paragraph here and then a separate article on polymorphism.
Good entry points might be Type theory, Type system, Type checking, Static typing on one hand, Programming, Programming language, Data structure, Dynamic typing on the other hand. Currently, this article fails to mention history of anything. I would expect an article on Datatype to mention unlimited integers, tuples, hashes. Because of types that are not first class, function types, type erasure, not all static types are datatypes at all.
To make a concrete proposal, we could proceed as follows:
- Rename Datatype to Type system, as that would better describe the theoretical nature of this article, and it's a good meeting point for the worlds of static and dynamic typing.
- Create a new article Data type that would discuss concrete types used in programming such as the number types, string types, container types, pointer types and data structures of theirs. Currently we have just links to individual articles.
- See to it that Type theory points to Type system, and Programming and Programming language both point to Type system and Data type.
--TuukkaH 14:41, 19 February 2006 (UTC)
- I agree that most of {static/dynamic typing, strong/weak typing, type safety} should be moved to an actual type system article. The "Type system cross reference list" should be replaced with categories for those particular properties, e.g. Category:Strongly-typed programming languages, Category:Dynamically typed programming languages, etc.
- Maybe the concrete types would fit in well at abstract data type? I don't know, but it's worth considering. "Datatype" as I usually hear it refers to either abstract data type or algebraic data type; maybe we need to just make datatype a disambiguation page to sort between abstract, algebraic, and type system. What do you think? --bmills 16:51, 19 February 2006 (UTC)
- As I understood it, the idea of the cross-reference list was that there's no way around naming everybody's language on every page so that table does it quickly. Would it be ok to you to rename the current Datatype to Type system as a start instead of moving a lot of sections over?
- That's another issue, people don't say datatype when they mean a primitive datatype such as int, they say 'type',
- Exactly which "people don't say"? I certainly describe 'int' as a 'datatype'... mfc
- and a compound datatype is a 'data structure'. Before making any larger decisions, it would be useful to see what sections there already are in the articles, what is missing, and then try to choose article titles so that the resulting articles could become encyclopedically notable and balanced. Anyway, as I imagined it, Data type would give an overview of concrete types, abstract types, algebraic types etc. but not about type systems, type erasure, existential types etc. --TuukkaH 19:46, 19 February 2006 (UTC)
- I'd like to throw in some comments:
- To people who use or talk about ML-like type systems a lot, a "datatype" is a particular sort of type. Type theorists never call their discipline "datatype theory" as far as I know. On the other hand, I would not be surprised to learn that people refer to int, float and so on as "data types" (open compound), since they are types of data.
- Yes, ML people (like me) generally refer to algebraic data types when they say "datatype". It is my understanding that OOP proponents generally mean abstract data type, and procedural programmers generally mean concrete types like int, float, etc. --bmills 03:59, 20 February 2006 (UTC)
- By the way, I'm an ML person too. (And a CMU person, which you seem to be as well.) --Cjoev 19:10, 20 February 2006 (UTC)
- The type system cross-reference list is a terrible idea, since (a) people disagree on the meanings of the various classifications, (b) people who agree on the meanings of the classifications can disagree on how they apply to a particular language (e.g., is Java dynamically typed just because it has casts?), (c) it relies on the appealing but mistaken belief that dichotomies like strong/weak, safe/unsafe and static/dynamic are mutually orthogonal and span the entire design space of type systems.
- I agree. See the rant on my user page regarding lists. Static vs. dynamic typing is easy (there are some languages that have a static type of "things that are dynamically typed", so both are definitely possible in the same language), and (hypothesized) type safety vs. unsafety is easy (though few languages have actually been proven type-safe); strong vs. weak typing is very iffy, as there isn't really a good definition for what constitutes a "strongly-typed" language. I suppose you could sort by whether subtyping is coercive, though. One potentially useful dichotomy, though, is explicit typing vs. type inference. At any rate, none of these attributes belongs in a list, but categories might be acceptable. --bmills 03:59, 20 February 2006 (UTC)
- Explicit typing vs. type inference is an interesting example: for one thing, it applies only to statically typed languages -- illustrating that there needs to be an article, or a sizable section of an article, where static-only issues can be discussed without worrying about how or whether they apply to dynamic languages (see my next numbered comment below). Also interesting is the fact that type systems for real programming languages can use type inference to varying degrees. Haskell, for example, sometimes requires annotations IIUC but infers most of the time. Cjoev 19:23, 20 February 2006 (UTC)
- The whole idea that the article on type systems should be based on a series of supposedly orthogonal dichotomies is extremely unfortunate. There are things one must know in order to understand static typing that do not make sense in dynamically typed languages, and vice versa. Either paradigm has strong points and important features that are much too important to be glossed over by one sentence in an "Xists say P(X), Yists say Q(Y)" paragraph. Worse, it seems to invite people to make meaningless or redundant contributions (cf. "Safely and unsafely typed", "Nominative vs Structural Typing" and the recent "Type 'Tags'", which (assuming their contents made perfect sense, which I do not grant) would have been better named "Type Safety", "Theories of Type Equivalence" and "Implementation Issues in Dynamic Typing" respectively).
- Dichotomies can be illustrative; however, the article as currently formulated does make the erroneous assumption that all such attributes are orthogonal; it's the same phenomenon that has infested strict programming language, lazy evaluation, and eager evaluation. I say, explain the ideas without trying to justify them (readers can come to their own conclusions, or look elsewhere for debates). --bmills 03:59, 20 February 2006 (UTC)
- Sure, they can be illustrative, but they need to be illustrative of something. The current state of all these articles seems to imply that understanding the dichotomies is sufficient to understand the subject, which I think we both agree is not the case. --Cjoev 19:10, 20 February 2006 (UTC)
- A merge with the existing Type theory article is in order. The result of that merge, I think, should be called Type system. It was mentioned a while ago on Talk:Type theory that that article is really about type systems anyway.
- So maybe we should move Datatype to Type system, merge in Type theory, and replace Datatype with a disambig between Abstract data type, Algebraic data type, and Type system? --bmills 03:59, 20 February 2006 (UTC)
- Sounds good to me Cjoev 19:10, 20 February 2006 (UTC)
- I appreciate your comments, Cjoev and bmills! We should really weaken the dichotomies. I read the existing Type theory and it mentions that it's about static typing ("real type systems") whereas I'm hoping the new Type system would be about the practise which seems to be pretty distant, perhaps excluding static inferring ML-Haskell line languages. So dynamic typing and duck typing would perhaps not be the best fit to the TODO I saw on Talk:Type theory. Anyway, we can do the rename first and see about the merge later.
- Hmm. You really don't want to be excluding Hindley-Milner type systems from a discussion of type systems in general. I'm currently at one of the major universities for type systems research, and I can tell you that a very significant fraction of new type systems are extensions of ML-like systems. However, there was a very strong POV issue in the Type theory paragraph you described; I have attempted to fix it. Also, C# now has type inference, too, oddly enough. --bmills 16:13, 20 February 2006 (UTC)
- I wanted to say that Hindley-Milner is one of the few type systems near type theory. More or less dynamic systems are popular in practice, and even if you rewrote the paragraph to yet another static vs. dynamic explanation, I don't think type theory has much to say about them. Thus my suggestion to keep Type system and Type theory separate. --TuukkaH 16:43, 20 February 2006 (UTC)
- I'm not sure exactly what you mean. Many languages these days do use static type systems (though few are strong enough to guarantee safety); C is static (but not safe), C++ is (I think) the same way, Java is dynamic for casts but static the rest of the time. Type systems and type theory are intimately related; it's certainly not the case that dynamic type systems are outside the scope of type theory, though certainly many type systems were created without much regard for underlying principles. I guess what I'm trying to say is that there isn't really a well-defined boundary between type systems and their formal study, so it doesn't make sense to impose one arbitrarily. --bmills 18:47, 20 February 2006 (UTC)
- Unfortunately, it is hard to make statements like "dynamic type systems are well within the scope of type theory, a formal discipline in which the word 'type' unambiguously denotes a static concept" without sounding like one is taking a side in the static-versus-dynamic language war. :-) I fear it is even harder to convey any of the considerable insight type theory provides into things like overloading, coercion, "strong" typing, inheritance, generics and so on without running afoul of NPOV (or even No Original Research). This frustrates me continually, because using (static) type theory to explain language concepts formally is (a) almost perfectly standard in the PLT community, and (b) not even close to the same thing as saying one kind of programming language is better than another. Cjoev 20:11, 20 February 2006 (UTC)
- Let's not drown in dictionary definition arguments :-) It makes sense to pick article names that make sense, but apart from that, disambiguations, wikilinks and See also's should ensure that all content is easy to find. For example, if we create a new article Data type it can mention algebraic data types and abstract data types so Datatype can redirect there, and it can disambiguate to Type system at top if it doesn't start with "Data type is the [[type system|type]] of data in programming. Types of data in programming languages include primitive types, tuples, records, algebraic data types, pointer types, objects ..." Of course many sections would have "Main article" links. Basically, the articles on general topics should summarize the special topics, going backwards from Wikipedia:Guide to writing better articles#Articles covering subtopics. --TuukkaH 08:39, 20 February 2006 (UTC)
- The following discussion is an archived debate of the proposal. Please do not modify it. Subsequent comments should be made in a new section on the talk page. No further edits should be made to this section.
The result of the debate was move. —Nightstallion (?) 08:18, 28 February 2006 (UTC)
Requested move
Datatype → Type system – better match current topic, make space for a new article
- Add *Support or *Oppose followed by an optional one-sentence explanation, then sign your vote with ~~~~
- Oppose. Datatype refers to concrete types with practical programming issues kept more constantly in mind, while Type system refers to more abstract, sometimes almost purely logical or theoretical constructions at some remove from the daily grind. Jon Awbrey 21:20, 20 February 2006 (UTC)
- I hope you consider that we can have the range of articles from type theory, type system to data type, data structure so that data type can concentrate much more on programming issues. --TuukkaH 06:38, 21 February 2006 (UTC)
- Support. The content of the current revision of this article is more about type systems in particular than datatypes in general; moving this content to a more appropriate location frees up this article for more emphasis on the more general usage. bmills 22:43, 20 February 2006 (UTC)
- Support. Cjoev 21:07, 20 February 2006 (UTC) (Moved this vote up here from the Discussion section -- oops. Cjoev 01:40, 21 February 2006 (UTC))
- Support. Per bmills. —Ruud 15:58, 21 February 2006 (UTC)
- Support per above. Fredrik Johansson 16:04, 21 February 2006 (UTC)
- Support Given the different interpretation the current Type theory page should become a disambiguation page pointing to Type Systems, Intuitionistic Type Theory and other possible meanings of the term --Thorsten 21:22, 21 February 2006 (UTC)
Discussion
- Add any additional comments
- Starting a formal move proposal per #Reorganization discussion above as the target page Type system already has previous history before it was made a redirect here. --TuukkaH 20:49, 20 February 2006 (UTC)
- The above discussion is preserved as an archive of the debate. Please do not modify it. Subsequent comments should be made in a new section on this talk page. No further edits should be made to this section.
Type System Terminology
- Type System Terminology may be useful. --Michal Jurosz 14:44, 19 January 2006 (UTC)
Type "Tags"
An anonymous editor added this section that seems mostly bogus or stuff already covered elsewhere in the article. I moved it here in case it motivates someone:
- Some dynamic languages use internal "tags" to track how a declaration happens, and some don't, relying on the value itself as the sole indicator of content. For example:
a = "52"; b = 52;
- In "tag-free" languages, the contents of variables "a" and "b" would be identical. In tagged dynamic languages, the variables would carry an internal flag (or flags) to indicate the type. If we could "X-ray" the variable storage inside a language "engine", we would see something resembling:
Tagged Types:
a: ["52", type: string]
b: ["52", type: number]
Untagged Types:
a: ["52"]
b: ["52"]
- In untagged languages, operator overloading is not very effective and more explicit operators are needed. For example, string concatenation may use an ampersand or period instead of a plus sign to avoid ambiguity. Some prefer this approach, saying "what you see is what you get". Which approach is favored is highly debated and seems to be a personal preference.
- Tagged dynamic languages usually have a standard operation or set of operations that return the "type" of a given variable. Such a function may be called "typeOf", "typeName", "type", "getType", etc. Non-tagged languages cannot have such (although they may have parsing-based validation operations such as "isNumber"). Some languages are partially tagged in that scalar types are not tracked, but the difference between compound types, such as arrays, and scalars is. It is possible to make this distinction blurred and/or automatic also, but in practice very few languages use such an approach.
- Using the tagged approach, types are checked by seeing that a variable or value has the appropriate tag. In some languages a math operation, for example, may refuse to compute with a value having a "string" type since it expects only numbers. Other languages will attempt to convert any string parameters encountered into a number by parsing the string and copying it into a temporary number. Non-tagged languages will always have to parse and validate to do such operations.
--TuukkaH 12:02, 19 February 2006 (UTC)
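For reference, the behaviour the quoted section attributes to "tagged" languages can be observed in Python, taken here only as one example of a language whose values carry their type. In this sketch the built-in type() plays the role of the "typeOf" operation, and is_number is a hypothetical parsing-based validator of the kind the section mentions for untagged languages.

```python
a = "52"
b = 52

# The built-in type() plays the role of the "typeOf" operation described above.
print(type(a).__name__, type(b).__name__)

# Because each value carries its type, "+" can be overloaded on it.
print(a + a)   # string concatenation
print(b + b)   # integer addition

def is_number(text):
    """Parsing-based validation: decide by trying to parse the text as a number."""
    try:
        float(text)
        return True
    except ValueError:
        return False

print(is_number(a), is_number("fifty-two"))
```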
Revisiting?
Pardon me for breaking your nice talk website, but what does "Revisiting the previous example" mean? Or what do you mean by it? It could be confusing for some readers (like me now). --www.doc 04:06, 1 April 2006 (UTC)
- Apparently, that phrase indicates that the section in which it appears is the second of three in which the pseudocode fragment following it is used to illustrate a concept. Thanks for pointing out that it's problematic.
- Let me take this opportunity to say to anyone watching this page that I think this example-driven style of explanation fails to convey the essence of the concepts being described and should be replaced with something more rigorous than "if this were a program in language X, it would have meaning A because P(X), but if it were in language Y, it would have meaning B because not P(Y); on the other hand, since Q(Z), it would mean C in language Z regardless of whether P(Z)". See earlier discussion threads where I rant against dichotomies. Cjoev 18:21, 3 April 2006 (UTC)
Closures easier with dynamic typing??
I have removed closures from the list of "advanced" constructs cited as more difficult to use in statically typed languages than dynamically typed ones. If the word closure here means "function as first-class value", then as an ML programmer I disagree: I use those things all the time and they're not the least bit difficult -- on the contrary, I find that the higher in order my functions get, the more types help to keep me from getting confused. I expect most ML or Haskell programmers would agree. If you want to claim that closures are easier to use in dynamically typed languages, please either provide or cite a more detailed explanation of why. Cjoev 22:27, 14 April 2006 (UTC)
- You're right, that was a mistake. Thanks for fixing that. I still don't know about the correctness of my claim that some "advanced" constructs are easier to use in dynamically typed languages. Perhaps what I should have said is that some "advanced" constructs are easier to use in languages that feature run-time execution of code, type introspection, closures, the ability to dynamically modify existing functions and attributes of existing objects, etc., and by association dynamically typed languages tend to feature these things. Hence in practice dynamically typed languages tend to be better for "avoiding patterns in code" by abstracting away those patterns. However, in theory a statically typed language could do all of these things and simply require more typing. This is why I'm not sure my claim has much validity. But I figured I should "be bold" in editing, since in practice there does seem to be "something" different about dynamically typed languages in terms of flexibility. - Connelly 19:22, 19 April 2006 (UTC)
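As a rough illustration of how type annotations can document higher-order functions and closures, here is a sketch in Python with typing annotations. This is not ML: the interpreter does not enforce the annotations, and an external checker such as mypy would be needed to verify them statically. The names compose and make_adder are made up for the example.

```python
from typing import Callable, TypeVar

A = TypeVar("A")
B = TypeVar("B")
C = TypeVar("C")

def compose(f: Callable[[B], C], g: Callable[[A], B]) -> Callable[[A], C]:
    """Return the closure x -> f(g(x)); the annotations record how the pieces fit."""
    def composed(x: A) -> C:
        return f(g(x))
    return composed

def make_adder(n: int) -> Callable[[int], int]:
    # The returned closure captures n.
    return lambda x: x + n

add_then_show = compose(str, make_adder(3))
print(add_then_show(4))
```

The signature of compose states which function's output must match which function's input, which is the kind of bookkeeping that gets harder to hold in one's head as functions become higher in order.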
Assembly Language NOT untyped!!!
Strictly speaking, assembly language is NOT untyped. Why this is even mentioned in a page comparing high-level languages, I don't know.
But there are several points to be made about assembly, instead of the gross generalizations here and elsewhere:
- In most assembly languages, instructions operate on one of several predefined types: 1-, 2-, 4-, 8-byte integers, 4- and 8-byte reals, bits, etc. In some assembly languages there are even pseudo-instructions for declaring registers and data with these types. This is hardly "typeless".
- Assembly may be considered "static" in its typing, because all primitive types are static -- In my 20 years of assembly programming, I've never seen an instruction which uses the contents of a datum to determine its type. The type is always implied as a part of the instruction or register.
- There may be special instructions which treat collections of bits and/or bytes in a special way, like a data structure, and which operate directly on registers or memory in terms of this data structure. This is seen more often in systems programming than in application programming, because the structures are very low-level, such as the page tables of the machine. Nonetheless, these data structures can be considered "types" in themselves. And don't forget grandma's vector machine.
- Variables such as hardware registers may have predefined or inferred type: On some architectures, registers can hold both integer and floating-point data, and only the instruction operating on the register determines the type of the data inside it. In others, the types of the registers are predefined: always integer or always FP, with restrictions on which instructions may operate on each of these sets of registers. Registers with inferred type tend to be friendlier to programming, because it makes things easier at procedure boundaries: If you call a routine and pass arguments in registers, the calling convention is simple and uniform if the registers' types are inferred -- the caller and callee simply have to agree on the types of the arguments. But if the registers are of predefined types, then you have to create complicated calling conventions on how arguments are passed, since how they are passed now depends on their type -- integers must go to one set of registers, while FP values must go to another set of registers, etc.
- More subtly, some architectures are strong or weak with respect to certain sub-types, such as unsigned vs. signed integers, even though the same might not hold in general: an architecture may be weaker in distinguishing between integer subtypes than between integers and other types. On some architectures, signedness is a strong property, and there are separate instructions to perform signed and unsigned arithmetic. With the advent of two's complement notation, signedness is largely a weak property on today's machines, and only certain things like comparison instructions, branch instructions, or condition codes need to distinguish signed from unsigned values.
So I would say that assembly language is: Statically typed, "varied" in strong/weak (depends on the machine and assembly language syntax), unsafe, structural, and types may either be predefined or inferred (depends on the machine and assembly language syntax).
It is NOT "untyped", "always weak", or "undefined".
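The two's complement point above can be sketched in C++. This is only an illustration under stated assumptions: the helper names add_bits, less_unsigned and less_signed are made up for this example, and 32-bit words are assumed.

```cpp
#include <cstdint>

// Addition: the resulting bit pattern is the same whether the operands are
// read as signed or unsigned, so a single ADD instruction can serve both.
inline std::uint32_t add_bits(std::uint32_t a, std::uint32_t b) {
    return a + b;  // unsigned arithmetic wraps modulo 2^32
}

// Comparison: here the interpretation matters, which is why machines
// provide separate condition codes or branches for each case.
inline bool less_unsigned(std::uint32_t a, std::uint32_t b) {
    return a < b;
}

inline bool less_signed(std::uint32_t a, std::uint32_t b) {
    // Flipping the sign bit maps two's complement order onto unsigned order.
    return (a ^ 0x80000000u) < (b ^ 0x80000000u);
}
```

With the pattern 0xFFFFFFFF (which reads as -1 when signed), add_bits(0xFFFFFFFF, 1) is 0 under either reading, while less_unsigned(0xFFFFFFFF, 1) is false and less_signed(0xFFFFFFFF, 1) is true.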
- I don't want to sound like I disagree with you entirely, but I think one has to be careful. There is a distinction to be made between the formal types of the language -- sets of data recognized as different by the machine and to which different sets of operations apply -- and the informal types in the mind of the programmer, even if the latter were anticipated by the design of the hardware. For example, on the x86 the contents of a 4-byte region of memory can be (1) used as an operand to an ALU instruction such as add, (2) loaded into a general-purpose register and then used as the address in a memory operand, or (3) loaded into an FPU register and used for floating-point arithmetic. Any of these operations is allowed and produces a well-defined behavior (possibly an exception) no matter what the contents of those locations is. This suggests that there is one type of "32-bit data", which supports all three of these kinds of operations -- in particular, integers, pointers and floats of the same size are not different types. "Integer", "pointer" and "float" are refinements of the one 32-bit type that are unofficially used by the programmer to organize data so as to make the most productive use of the capabilities of the machine. If one considers these types distinct, then one must conclude that assembly is weakly typed because it silently coerces between them (and not in particularly meaningful ways). This is even more the case with larger data structures such as page tables. One can put together any nonsensical mess that one wants and call it a "page table", and the hardware will not notice the difference; it will result in some well-defined, though not necessarily useful, behavior. I conclude that "page table" is not a distinct type from any other type of data of the same size.
- On the other hand, since the IA-32 does use separate FPU registers for all floating-point operations, it does make sense to consider the extended-precision floats stored therein as a separate "type", since there are operations applicable to them that do not apply to anything else. Moving data between the FPU and memory can be thought of as a type conversion operation. Cjoev 23:51, 18 April 2006 (UTC)
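The "one type of 32-bit data" reading can be imitated in C++, with the caveat that C++ is not assembly: the sketch below assumes a 32-bit IEEE-754 float, and the names as_float and as_bits are invented for the example.

```cpp
#include <cstdint>
#include <cstring>

// Read the same four bytes as a single-precision float.
// The bytes are copied unchanged; no numeric conversion takes place.
inline float as_float(std::uint32_t bits) {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "sketch assumes a 32-bit float");
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

// Read a float's four bytes back as an unsigned integer.
inline std::uint32_t as_bits(float f) {
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

Under these assumptions the word 0x3F800000 can be fed to integer addition (giving 0x3F800001) or viewed as the float 1.0; nothing in the data itself says which use was intended.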
Deleted "Nominative vs structural"
I have deleted the section on "Nominative vs structural" type systems, as it is redundant with my rewrite of "Compatibility, equivalence and substitutability". Some points here are not covered in my new text and may be worthy of being put back in somewhere, so here is the text I removed:
- Two primary schemes exist for distinguishing two types as equivalent and/or as subtypes; nominative (by name) and structural (by structure). As the names indicate, nominative type systems operate based on explicit annotations in the code; they "recognize" only explicitly declared types, and also require the explicit declaration of subtype relationships. Structural type systems, on the other hand, perform type judgements based on the structure of the two types under consideration.
- Few languages implement nominative or structural schemes strictly; many have features of both. The major statically typed object-oriented languages use nominative typing; the two major families of functional programming languages use structural typing.
- Many popular languages show a fruitful combination: nominative type equivalence and structural subtyping; known as duck typing. It appears in many dynamically-typed OO languages.
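For anyone weighing whether to restore the removed text, here is a rough C++17 sketch of the distinction it draws. The types Meters and Seconds and the function doubled are hypothetical, and templates are only an approximation of structural typing, since the check happens at instantiation.

```cpp
#include <type_traits>

struct Meters  { double value; };
struct Seconds { double value; };  // same structure, different name

// Nominative view: identical structure does not make the types equivalent.
static_assert(!std::is_same_v<Meters, Seconds>);
static_assert(!std::is_convertible_v<Meters, Seconds>);

// Structural view: any type with a `value` member that can be multiplied
// by a double is accepted, regardless of its declared name.
template <typename T>
double doubled(const T& x) {
    return 2.0 * x.value;
}
```

Both doubled(Meters{1.5}) and doubled(Seconds{2.0}) compile, while assigning a Meters to a Seconds does not.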
Discussing a term before it is defined
"Static typing" -- for one -- is a term that is discussed before it is defined. In fact the first two distinct references to this phrase (first in the preface, second in the section "Static and Dynamic Typing") seem to assume that the reader knows what it means, while *seeming* to explain what it means. I do not think the term is *ever* defined in the article. But where else in Wikipedia would it be defined?
This definition-free usage of expert terminology is a serious blunder in an expository article like this. Fixing it should not have to wait for someone's getting around to making separate pages for static and dynamic typing (as mentioned in Static/Dynamic Typing section above). 71.224.204.167 07:14, 26 June 2006 (UTC)
Concurrency Types
Some languages have types that express concurrent/communicating processes (CSP ...) or parallel execution. Where would these types fit? 84.62.137.188 18:11, 7 July 2006 (UTC)
Static and dynamic type checking in practice
Do these actually have something to do with typing?
Dynamic typing allows debuggers greater functionality; in particular, the debugger can modify the code arbitrarily and let the program continue to run.
You can do this with GDB and C++, so dynamic vs. static typing doesn't appear to come into play here. And, no, I'm not talking about just changing values with print, but patching in place.
Dynamic typing typically makes metaprogramming more powerful and easier to use.
Again, this doesn't seem tied much to type systems, but more to what metaprogramming facilities are offered. Sure, in dynamically-typed languages it's trivial to say "call the X method on object Y." And, sure, many statically-typed languages require that you add hints regarding return types, etc., but type inference gets rid of that requirement, so this isn't tied directly to static typing. It's an issue of that language's syntax.
eval is about executing arbitrary code. Can't that code include typing information? Such as
eval {int X = 1; std::cout << 1 << std::endl;}
Yes, implementation might be hairy, but then again, so is the dynamically-typed implementation. In nearly all cases it calls the interpreter (a.k.a. compiler). Besides, evaling strings together reminds me a lot of the C preprocessor. I really have a hard time considering eval related to dynamic vs. static typing.
Metaprogramming means a lot more than "call the X method on object Y." It means a lot more than having a working eval. It means programming at a different level of abstraction. It may mean fiddling with the symbol table, or getting hooks into the runtime system, or moving computation from runtime to compile time, or getting the right std::swap called, or any other number of things. Sure, a language's type system will affect that, but it's only a flyspeck of detail there. --64.132.106.194 21:27, 8 August 2006 (UTC)
- That example is misleadingly simple, because it contains a statement rather than an expression, and statements have trivial result type. eval expressions are where the difference arises. In a dynamically typed language, the static type of the result of eval does not need to be known. By contrast, in a statically typed language, an eval expression must have some static type, which is impossible if the argument to eval can be an arbitrary runtime-constructed AST or string. Type inference does not help you here, and it is not merely a syntactic issue. It is a fundamentally hard semantic problem to bridge the "phase distinction" between a static typechecker and dynamically constructed code.
- You can get around this problem by assigning the static type Object or dynamic (see, e.g. Dynamics in ML by Leroy and Mauny) to the result of eval. However, that essentially concedes the point that dynamic typing is better suited to metaprogramming. The principled integration of metaprogramming into statically typed languages is an open research problem (see, e.g., the work on MetaML by Walid Taha et al., or the ongoing work of the aspect-oriented programming community), whereas eval appeared in the very first implementation of Lisp. This contrast alone should be evidence that metaprogramming is much harder to integrate into a statically typed language than a dynamically typed one.
- This isn't to say that it is impossible to offer metaprogramming facilities in a statically typed programming system --- with sufficiently clever implementation techniques, you can do anything through ad hoc tools and libraries --- but rather that language-level metaprogramming becomes significantly more complex in the face of static typing.
- Also, note that the C preprocessor is dynamically typed, even though C itself is statically typed.
- I agree that the statement about debuggers is bogus though. k.lee 22:03, 9 August 2006 (UTC)
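The workaround described above, giving the result of eval a catch-all static type, might look like this in C++17. It is a toy under loud assumptions: toy_eval is a hypothetical stand-in that only recognises digit strings, not a real evaluator, and Value plays the role of Object or dynamic.

```cpp
#include <string>
#include <variant>

// The single static type every result has to fit into.
using Value = std::variant<int, std::string>;

// Hypothetical stand-in for eval: a string of digits "evaluates" to an int,
// anything else to a string. The caller cannot know which until run time.
inline Value toy_eval(const std::string& src) {
    bool all_digits = !src.empty();
    for (char c : src) {
        if (c < '0' || c > '9') all_digits = false;
    }
    if (all_digits) return std::stoi(src);
    return src;
}
```

A caller has to inspect the result with std::holds_alternative or std::get before using it, so the type check has moved to run time, which is the concession being described.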
- Thanks for the discussion regarding eval expressions. I can think of an easy way to punt around the problem (declare eval to return void), but I can now see the problem. --64.132.106.194 21:01, 14 August 2006 (UTC)
- It is also possible to make eval have the type Term a -> a (in Haskell notation), where Term is a type constructor of kind * -> *. The GADT section of GHC manual has an example of this. esap 21:15, 25 September 2006 (UTC)
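A loose C++ analogue of the Term a -> a signature is sketched below. It is not the GHC manual's example and not a true GADT: it uses virtual dispatch rather than pattern matching, and the class names Lit, Add and IsZero are chosen for illustration. It only works because the term is built in statically typed code, not from a run-time string.

```cpp
#include <memory>
#include <utility>

// A term indexed by the type of value it evaluates to.
template <typename T>
struct Term {
    virtual ~Term() = default;
    virtual T eval() const = 0;
};

// A literal of any type T.
template <typename T>
struct Lit : Term<T> {
    T value;
    explicit Lit(T v) : value(std::move(v)) {}
    T eval() const override { return value; }
};

// Addition is only defined on integer terms and yields an integer term.
struct Add : Term<int> {
    std::unique_ptr<Term<int>> lhs, rhs;
    Add(std::unique_ptr<Term<int>> l, std::unique_ptr<Term<int>> r)
        : lhs(std::move(l)), rhs(std::move(r)) {}
    int eval() const override { return lhs->eval() + rhs->eval(); }
};

// A test on an integer term that yields a boolean term.
struct IsZero : Term<bool> {
    std::unique_ptr<Term<int>> arg;
    explicit IsZero(std::unique_ptr<Term<int>> a) : arg(std::move(a)) {}
    bool eval() const override { return arg->eval() == 0; }
};

// The counterpart of eval :: Term a -> a; the result type follows the index.
template <typename T>
T eval(const Term<T>& t) {
    return t.eval();
}
```

Here eval applied to an Add has static type int and eval applied to an IsZero has static type bool, and an ill-typed term such as IsZero of a boolean literal is rejected at compile time.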