Tops Tag Model Two

Pseudo-Code Kit and Examples

A kit of samples and conventions to build language-specific models of type-related aspects of "typical" dynamic or scripting programming languages that follow an Algol-like influence (which includes languages resembling C, Pascal, VB, and Python syntax, for example). This can aid in predicting and remembering type-related results by allowing one to "x-ray" the parts and their actions. They can be viewed as mental exercises to help build and strengthen mental models of type-related processes in a target language.

This kit may not reflect the actual construction of production interpreters, and the "parts" used in the model are model-specific and not intended to be "canonical" or "official" in a general sense[5], but this does not prevent it from being useful for prediction and from serving as a memory aid. Actual interpreters have many layers and artifacts that, although they may improve machine performance, are generally difficult to remember and mentally apply, and are thus neither necessary nor helpful for our training-related goal.

The modelling kit can create descriptive imperative pseudo-code models that use XML representations of variables (and variable-like entities) to model the "parts" of a variable during the processing of a given operator in a given dynamic "Algol-style" language, in order to better predict the language's behavior with regard to type-related issues. Sample implementations are given, and the model user can select the sample implementation that best matches the behavior of the language. Test snippets are built to isolate the target operators in a way that allows for testing and experimentation. TypeTagDifferenceDiscussion gives examples of typical tests that can be performed. TypeHandlingGrids can be used to track experiments in a systematic way.

If the pseudo-code is not detailed enough to explain the behavior, then in theory StepwiseRefinement can be used to eventually create actual implementations of the target operators. We only turn the refinement dial as far as we need to in order to understand the "reason" for a result in terms of executable code.

We are essentially creating a model or models of an abstract interpreter for a given operator in question. We don't want to model an entire language if we don't have to, so we only create models of subsets of the language. In order to simplify the model description, certain deconstructions are done manually, such as expression reduction. Thus, it's up to the model user to apply the correct order of operations. The correct order can be determined by reading the documentation and/or by experimentation. In an expression such as "x=a+b+c", it can sometimes make a difference (per language) whether the right "+" is evaluated before the left one where mixed types are involved.
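To illustrate why the order matters, here is a small Python sketch. The function js_like_plus is a hypothetical stand-in for a "+" that adds when both operands are numbers and concatenates otherwise; it is not taken from any particular language's specification.

```python
def js_like_plus(left, right):
    """Hypothetical '+': add if both operands are numbers, else concatenate."""
    if isinstance(left, (int, float)) and isinstance(right, (int, float)):
        return left + right
    return str(left) + str(right)

a, b, c = 1, 2, "3"

# Same operands, different grouping of "a + b + c"
left_first = js_like_plus(js_like_plus(a, b), c)   # (a + b) + c
right_first = js_like_plus(a, js_like_plus(b, c))  # a + (b + c)

print(repr(left_first), repr(right_first))  # '33' '123'
```

The two groupings give different results, which is why the model user has to settle the evaluation order before reducing an expression.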

An example of the internal structure generated for a variable [2]:

 foo = 123;

 <var name="foo" type_tag="Number" value="123"/>

Here is the model we'd use for a non-tag language:

 <var name="foo" value="123"/>

Experiments are probably needed to determine whether the language uses type tags or not. Generally, the tag-free model is sufficient if the type tag never makes a difference to any observable result. In other words, the language is considered non-tagged if no experiment can be found that shows there is information other than the "value" as described in the structures above.
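The following is a rough Python sketch of the two structures and of the kind of experiment just described. The helper names (make_var, tag_is_observable) and the two sample "operators" are made up for illustration; a real experiment would run snippets in the target language instead.

```python
import xml.etree.ElementTree as ET

def make_var(name, value, type_tag=None):
    """Build the <var> element for a model variable; the tag is optional."""
    attrs = {"name": name}
    if type_tag is not None:
        attrs["type_tag"] = type_tag
    attrs["value"] = value
    return ET.Element("var", attrs)

def tag_is_observable(operator):
    """True if the operator's result changes when only the type tag changes."""
    as_number = make_var("x", "123", "Number")
    as_string = make_var("x", "123", "String")
    return operator(as_number) != operator(as_string)

# Two hypothetical operators of a modelled language
def print_like(var):
    return var.get("value")      # looks at the value only

def typeof_like(var):
    return var.get("type_tag")   # exposes the tag

tagged = make_var("foo", "123", "Number")
untagged = make_var("foo", "123")
print(ET.tostring(tagged, encoding="unicode"))
print(ET.tostring(untagged, encoding="unicode"))
print(tag_is_observable(print_like), tag_is_observable(typeof_like))
```

If every operator of a language behaves like print_like here, the tag-free structure is enough to model it.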


APIs

(Pseudo-code syntax roughly based on AlternativesToCeeSyntax, roughly a C/Pascal hybrid.)

 func updateVar(vname: string, type_tag: string optional, value: string optional) {
   // Update the XML structure for the given variable; create new variable if not exists
   if varNotExistsInThisScope(vname) {
     if LANG_ALLOWS_DYNAMIC_VAR_CREATION {
        createVariable(vname, LANG_DEFAULT_TYPE, LANG_DEFAULT_VALUE);
     } else {
        raiseError("Variable not defined: " & vname);
     }
   }   
   if parameterGiven("type_tag") {
      updateTypeTag(vname, type_tag);           
   }
   if parameterGiven("value") {
      updateValueAttrib(vname, value);
   }
 }
 .
 func updateTypeTag(vname: string, typeName: string) {
   // code to update the "type_tag=" attribute of the 
   // corresponding XML structure for given variable 
 }
 func updateValueAttrib(vname: string, value: string) {
   // code to update the "value=" attribute of the 
   // corresponding XML structure for given variable 
 }
 func getTypeTag(vname: string):string {
   // Retrieves the value (X) of the "type_tag=X" attribute of a given 
   // variable's XML representation.
 }
 func getValueAttrib(vname: string):string {
   // Retrieves the value (X) of the "value=X" attribute of a given 
   // variable's XML representation.
 }
 .
 func passParam(oldVarName:string, newVarName:string, typeName:string optional) {
   // To emulate parameter passing. For languages that allow
   // explicit parameter types, type coercion is available.
   var useTag: string, useValue: string;
   .
   useTag = getTypeTag(oldVarName);
   useValue = getValueAttrib(oldVarName);
   .
   if LANG_ADJUSTS_PARAM_TYPE And parameterGiven("typeName") {  //[footnote future]
     if isParsableAsType(useValue, typeName) {
       useTag = typeName;
     } else {
       raiseError("Parameter cannot be converted to given type.");
     }
   }
   updateTypeTag(newVarName, useTag);
   updateValueAttrib(newVarName, useValue);
 }
 .
 CONST-LIST-OF-VALID-QUOTES = {ascii(34), ascii(39)} // varies per language
 .
 func assignLiteral(vname: string, value:string, delimiter: string optional) {
   // Assign a literal to a variable. (Use copyVar() for non-literals.)
   var useTypeTag;
   // Is the literal quoted? (Is the literal one of the following...)
   if delimiter in CONST-LIST-OF-VALID-QUOTES {  
     useTypeTag = "String"
   } else {
     useTypeTag = "Number"
   }
   // (Insert any other "type markers" a given language may use above in an ELSE IF)
   updateVar(vname, useTypeTag, value);
 } 
 func copyVar(vname: string, copiedVarName: string) {
    // To copy one variable into another: "a=b;" would be copyVar("a", "b");
    var useValue: string, useTag: string;
    useValue = getValueAttrib(copiedVarName);  
    useTag = getTypeTag(copiedVarName);
    // Save the results
    updateVar(vname, useTag, useValue);
 }

(Dots used as work-around for a wiki spacing bug on some browsers.)
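For readers who want to experiment, here is one possible runnable rendering of part of the above API in Python. It is only a sketch: the dictionary named scope, the LANG_ settings chosen, and the use of None for "parameter not given" are assumptions made for this example, not part of the kit.

```python
import xml.etree.ElementTree as ET

# Assumed settings for a hypothetical modelled language
LANG_ALLOWS_DYNAMIC_VAR_CREATION = True
LANG_DEFAULT_TYPE = "String"
LANG_DEFAULT_VALUE = ""
VALID_QUOTES = {'"', "'"}  # varies per language

scope = {}  # variable name -> <var> element

def update_var(vname, type_tag=None, value=None):
    """Update the <var> element; create it first if the language allows."""
    if vname not in scope:
        if not LANG_ALLOWS_DYNAMIC_VAR_CREATION:
            raise NameError("Variable not defined: " + vname)
        scope[vname] = ET.Element("var", {
            "name": vname,
            "type_tag": LANG_DEFAULT_TYPE,
            "value": LANG_DEFAULT_VALUE,
        })
    if type_tag is not None:
        scope[vname].set("type_tag", type_tag)
    if value is not None:
        scope[vname].set("value", value)

def get_type_tag(vname):
    return scope[vname].get("type_tag")

def get_value_attrib(vname):
    return scope[vname].get("value")

def assign_literal(vname, value, delimiter=None):
    """Quoted literals get the String tag, anything else the Number tag."""
    tag = "String" if delimiter in VALID_QUOTES else "Number"
    update_var(vname, tag, value)

def copy_var(vname, copied_var_name):
    """Model 'a = b;' as copy_var('a', 'b')."""
    update_var(vname, get_type_tag(copied_var_name),
               get_value_attrib(copied_var_name))

assign_literal("a", "123")
assign_literal("b", "123", '"')
copy_var("c", "b")
for name in ("a", "b", "c"):
    print(name, get_type_tag(name), get_value_attrib(name))
```

Note that the value is always stored as text; only the tag distinguishes "a" from "b" in this run.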


Example "Reduction"

The modelling kit is not intended to be an entire virtual interpreter, for that would over-complicate it. Instead, the human does most of the parsing and reducing. But the reduction is only carried to a level fine enough to leave type-related behavior to the pseudo-code models. If possible, every expression will be reduced to functions[3] with no more than 2 parameters.

Original Test Snippet:

  var a = 123;
  var b = "123";
  writeLine(a + b + 7);

Reduction level 1:

  var a = 123;
  var b = "123";
  var temp01 = a + b;
  var temp02 = 7;  // the order of evaluation may vary in some langs
  var temp03 = temp01 + temp02;
  writeLine(temp03);
  // This rewrite allows us to examine each part using multiple techniques and
  // helps us write further reductions. Embedded sub-expressions reduce that
  // ability, so we de-embed them in our adjusted representation of the snippet. [4]

Reduction level 2:

  assignLiteral("a", "123");
  assignLiteral("b", "123", CONST_DOUBLE_QUOTE); // if a literal has quotes, we indicate such
  plus("temp01", "a", "b");
  assignLiteral("temp02", "7");
  plus("temp03", "temp01", "temp02");
  writeLine("temp03");
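The kit expects the reduction to be done by hand, but the level-1 step can also be mechanised as a cross-check. The sketch below borrows Python's own parser (the ast module) to flatten a "+" expression into temporaries. It assumes the target language groups "+" left to right the way Python does, and reduce_expression is a name invented here.

```python
import ast

def reduce_expression(source):
    """Flatten a '+' expression into single-operation steps with temporaries."""
    steps = []
    counter = 0

    def new_temp():
        nonlocal counter
        counter += 1
        return f"temp{counter:02d}"

    def visit(node):
        if isinstance(node, ast.Name):
            return node.id  # plain variables need no temporary
        if isinstance(node, ast.Constant):
            temp = new_temp()
            steps.append(f"var {temp} = {node.value!r};")
            return temp
        if isinstance(node, ast.BinOp) and isinstance(node.op, ast.Add):
            left = visit(node.left)    # left operand is reduced first
            right = visit(node.right)
            temp = new_temp()
            steps.append(f"var {temp} = {left} + {right};")
            return temp
        raise ValueError("unsupported expression")

    result = visit(ast.parse(source, mode="eval").body)
    return steps, result

steps, result = reduce_expression("a + b + 7")
for line in steps:
    print(line)
print(f"writeLine({result});")
# var temp01 = a + b;
# var temp02 = 7;
# var temp03 = temp01 + temp02;
# writeLine(temp03);
```

For a language that groups or evaluates differently, the visiting order in visit() would need to change accordingly.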


Plus-Sample-1 -- Example Implementation of "Plus" (from above)

  func plus(targetVarName, vnameLeft, vnameRight) {
    var valLeft, valRight, tagLeft, tagRight, useTag, useVal;
    // Extract parts of operands
    valLeft  = getValueAttrib(vnameLeft);
    valRight = getValueAttrib(vnameRight);
    tagLeft  = getTypeTag(vnameLeft);
    tagRight = getTypeTag(vnameRight);
    // Process the parts
    if isParsableAsType(valLeft, "Number") And isParsableAsType(valRight, "Number") { // 4837
      useVal = performMathAddition(valLeft, valRight);
      useTag = "Number";
    } else {
      useVal = performStringConcat(valLeft, valRight);
      useTag = "String";
    }
    // Save the results
    updateVar(targetVarName, useTag, useVal);
    // some langs will handle other types, like dates, not shown here.
  }
  // Different languages will use different logic. This is generally
  // the most flexible approach in terms of dynamic conversion.


Plus-Sample-2 -- Example Implementation of "Plus", ALTERNATIVE 2, to mirror JavaScript:

  // Sample partial experiment in JS for reference
  alert(2 + 3);   // 5   alert(typeof(2 + 3)); // number
  alert("2" + 3); // 23  alert(typeof("2" + 3)); // string
  alert(2 + "3"); // 23  alert(typeof(2 + "3")); // string

  func plus(targetVarName, vnameLeft, vnameRight) {
    var valLeft, valRight, tagLeft, tagRight, useVal, useTag;
    // Extract parts of operands
    valLeft  = getValueAttrib(vnameLeft);
    valRight = getValueAttrib(vnameRight);
    tagLeft  = getTypeTag(vnameLeft);
    tagRight = getTypeTag(vnameRight);
    // Process the parts
    if tagLeft == "Number" And tagRight == "Number" { // 94838
      useVal = performMathAddition(valLeft, valRight);
      useTag = "Number";
    } else {
      useVal = performStringConcat(valLeft, valRight);
      useTag = "String";
    }
    // Save the results
    updateVar(targetVarName, useTag, useVal);
  }
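A possible Python rendering of Plus-Sample-2, again with variables simplified to (type_tag, value) pairs. The number formatting (dropping a trailing ".0") is an assumption made so the output resembles the JS experiment; it is not meant as a full model of JavaScript number printing.

```python
def plus_sample_2(left, right):
    """left and right are (type_tag, value) pairs; the tags decide the result."""
    tag_left, val_left = left
    tag_right, val_right = right
    if tag_left == "Number" and tag_right == "Number":
        total = float(val_left) + float(val_right)
        # Show whole numbers without a trailing ".0"
        text = str(int(total)) if total.is_integer() else str(total)
        return ("Number", text)
    return ("String", val_left + val_right)

# The three cases from the JS experiment
print(plus_sample_2(("Number", "2"), ("Number", "3")))  # ('Number', '5')
print(plus_sample_2(("String", "2"), ("Number", "3")))  # ('String', '23')
print(plus_sample_2(("Number", "2"), ("String", "3")))  # ('String', '23')

# The test snippet a + b + 7 from the reduction example
a = ("Number", "123")
b = ("String", "123")
temp01 = plus_sample_2(a, b)
temp03 = plus_sample_2(temp01, ("Number", "7"))
print(temp03)  # ('String', '1231237')
```

Here the same snippet yields a concatenated string, which is the kind of difference that lets an experiment tell the two "plus" models apart.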


Templates

Template for typical one-operand operator:

  func myOp1(targetVarName, paramVname) {
    var paramVal, paramTag, useVal, useTag;
    // Extract parts of operand (parameter)
    paramVal = getValueAttrib(paramVname);  
    paramTag = getTypeTag(paramVname);
    // Do stuff with the parts we just unpacked, including conditionals 
    // if necessary, and update useTag and useVal.
    // ...
    // Save the results
    updateVar(targetVarName, useTag, useVal);
  }

Template for typical two-operand operator:

  func myOp2(targetVarName, vnameLeft, vnameRight) {
    var valLeft, valRight, tagLeft, tagRight, useTag, useVal;
    // Get parts of operands
    valLeft  = getValueAttrib(vnameLeft);
    valRight = getValueAttrib(vnameRight);
    tagLeft  = getTypeTag(vnameLeft);
    tagRight = getTypeTag(vnameRight);
    // Do stuff with the parts we just unpacked, including conditionals 
    // if necessary, and update useTag and useVal.
    // ...
    // Save the results
    updateVar(targetVarName, useTag, useVal);
  }
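As an illustration of filling in the one-operand template, here is a Python sketch of a hypothetical "toString" conversion operator. The dictionary named variables and the operator itself are invented for this example; a given language may convert differently. The result is written to a separate target variable, so the operand is left untouched.

```python
# Minimal variable store: name -> {"type_tag": ..., "value": ...}
variables = {"n": {"type_tag": "Number", "value": "42"}}

def to_string_op(target_var_name, param_vname):
    # Extract parts of operand (parameter)
    param_val = variables[param_vname]["value"]
    param_tag = variables[param_vname]["type_tag"]  # unused by this operator
    # Process: the value text is kept, only the tag of the result differs
    use_val = param_val
    use_tag = "String"
    # Save the results
    variables[target_var_name] = {"type_tag": use_tag, "value": use_val}

to_string_op("s", "n")
print(variables["n"])  # {'type_tag': 'Number', 'value': '42'}
print(variables["s"])  # {'type_tag': 'String', 'value': '42'}
```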


Notes and Footnotes


Why create a new page instead of using TopsTagModel?

I wish to have the reference material at the top for easy re-opening and finding. Perhaps TopsTagModel can be refactored to have the discussion part in TopsTagModelDiscussion?

Sounds good. Feel free to move the TopsTagModel discussion somewhere else.

Okay, but I have to think about the org a bit more, such as whether reference material should be a diff topic than examples. The resulting sizes will likely dictate that.

{I have been thinking for some time that all of this discussion should be in a category. Would anyone care to suggest one?}

In the spirit of CategoryOopDiscomfort, how about CategoryTypeDiscomfort?

{Good.}

We already have CategoryTypingDebate, I now remember.


Discussion continued from ValueExistenceProof:

I don't see what purpose there is in giving variables a "type" property, nor do I see the purpose in dealing with quotes. (I'm happy to continue this threadlet on the appropriate page -- feel free to move it there.)

In the first "plus" sample above, the type tag is indeed not used. But if we modified it to better match, say, JavaScript, then the tag would be used, per Plus-Sample-2.


From TypeDefinitionsSmellBadly:

[Your] func updateTypeTag(vname: string, typeName: string) apparently changes the type of a variable independently of its value representation. Why?

That's just a low-level accessor. In practice, it probably would not be used in isolation, although the details of that depend on the specific language being modeled. In other words, both the type tag and the value are usually changed together in the same mid-level operation/function. An example of where the type tag may be changed in isolation is a "cast" (conversion) operation that converts a number to a string. However, that's not necessarily the only way to implement such.

How do we know what are "low-level accessors" (presumably inaccessible outside the API) vs high-level functions exposed by the API?

Why would you mutate a value in-situ as part of a typecast? I can see occasions where an optimiser might choose to do so, but it seems peculiar to model it that way. A typecast is better modelled as a function that accepts a value of type 'x' and returns a value of type 'y'.

Why is it "peculiar"? Another way is to simply pass a reference to the target variable (internal or external) and let the typecast operation do whatever it wants to with the referenced variable (which may simply be to change the type tag anyhow, depending on the target language to model and/or kit user personal preference).

It's peculiar because you're mutating a value, which means conflating values and variables. For something intended to simplify understanding of values, variables and types, it doesn't help to treat values as variables. I don't know what you mean by the text following "Another way is to ..."

The discussion chain starting with ValueExistenceProof discusses this issue. I don't want to reinvent those debates here and clutter it up.


CategoryTypingDebate, CategoryLanguageTyping


(Last edited October 27, 2014)