Thursday, October 5, 2017

Max Column Sum by Key (part 4)


Continuing on from my previous explorations that followed Learn to Code Grand Rapids “Building a Real World App in VS 2017, Part I” of August 10, I decided to try Pascal, a language that I first (and last) used in 1990 at SCI in Huntsville, Alabama.  I had liked it during the brief time that I was part of a small project there, perhaps because after a couple of weeks I was able to guide others, who had been on the project longer, about how to program in it.

I had seen that GNAT GPS had Pascal as one of its options as a language.  However, in trying to use it GPS would give me an error that it didn't recognize a compiler for Pascal.  So I did an internet search and found Free Pascal and downloaded its Win32 version.  It installs to C:\FPC\3.0.2 and the install did put the path to its bin subfolder into the Windows system path. 

However running it was extremely ugly.  I tried GNAT GPS again but even with the path to it in the system path, GNAT GPS didn't find it to use as the Pascal compiler.

So I went searching for something else and in doing so I came across Lazarus.  I downloaded lazarus-1.6.4-fpc-3.0.2-win32.exe (128.8 MB) into C:\lazarus.  This proved to be a Windows IDE for the previously installed Free Pascal compiler.  A much, much better solution.  So I was off and running.  (Up and running?)  I sure had no memories of what the language looked like.

It assumed that the source code would be in C:\lazarus\fpc\3.0.2\source.  Multiple kinds of projects can be selected.  I forget which kind I used at the beginning, but it created a .lpr (Lazarus Project Main Source) file.  After I had problems with the debugger I tried a Console application project, which uses a .pas Pascal file for the source.

Since Pascal has both a record structure and the ability to declare and implement a class, I have included examples of both: the KeyData record type, and the KeyTable class that includes the number of different keys and an array of the KeyData record type.  Unlike in C# and Java, the implementation of a procedure or function is not included within the class declaration.  Instead a declaration of the procedure/function is included within the class and separate code has to be provided to implement it, as happens for an Ada package.  With Ada, though, the private procedures and functions don't need a declaration in the package specification, or even in the package body if the implementation is done prior to any invocation of it.

It took me a while to get the use of a class worked out.  As I found problems with the code that caused exceptions, I don't know whether they occurred because I wasn't storing values within the bounds of an array or because I hadn't yet derived the KeyTable class from a Pascal class (in my code, TCustomApplication) with a constructor and destructor.

In any case, the example of a record is
// Global types
type

  fieldArray = array[1..3] of integer;
  valueArray = array[1..20] of integer;

  // Data about a particular key
  KeyData = record
    key : integer;        // numeric key
    valueCount : integer; // number of different values associated with key
    values : valueArray;  // different values associated with key
    sums : valueArray;    // number of references to a value
  end;
  keyArray = array[1..30] of KeyData;
where keyArray is an array of the KeyData record, while the record itself has two fields of the valueArray type.
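
As a rough illustration of that layout, here is a minimal sketch in Python rather than Pascal.  The names KeyData, values and sums follow the record above.  The add_reference helper, and the use of growable lists in place of the fixed 20-element arrays (so that valueCount becomes the list length), are assumptions made only for this sketch.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class KeyData:
    """Data about a particular key (mirrors the Pascal record)."""
    key: int
    values: List[int] = field(default_factory=list)  # different values seen for the key
    sums: List[int] = field(default_factory=list)    # number of references to each value

    def add_reference(self, value: int) -> int:
        """Hypothetical helper: count one reference to value, return its new sum."""
        if value in self.values:
            v = self.values.index(value)
            self.sums[v] += 1
        else:
            self.values.append(value)
            self.sums.append(1)
            v = len(self.values) - 1
        return self.sums[v]


# keyArray equivalent: one KeyData entry per distinct key
keys: List[KeyData] = [KeyData(key=7)]
keys[0].add_reference(42)
keys[0].add_reference(42)
keys[0].add_reference(5)
print(keys[0].values, keys[0].sums)  # [42, 5] [2, 1]
```

The two lists are kept in step, so position v in sums always counts the references to position v in values, which is the same pairing the record's two valueArray fields rely on.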

The KeyTable class is
// KeyTable class declaration
  KeyTable = class(TCustomApplication)

    public
    keyCount : integer; // number of different keys
    keys : keyArray;    // value of key - one in each array position

    // Data to be retained as the key and its associated value with the
    // maximum number of references
    maxKey : Integer;
    maxValue : Integer;
    maxSum : Integer;


    constructor Create(TheOwner: TComponent); override;
    destructor Destroy; override;

    // Initialize
    procedure Clear();
    // Update keyTable with data from a parsed line
    procedure Update(data : fieldArray);
    // Add the values or increment their sums
    procedure addValues(keyIndex : Integer; value1 : Integer; value2 : Integer);
    procedure add2ndValue(keyIndex : Integer; value2 : Integer);
    procedure updateTotals(key : Integer; value : Integer; sum : Integer);
    // Report the results
    procedure Report();

  end; // KeyTable class

Each of the procedures is separately implemented, as with
procedure KeyTable.Clear();

var
  k : integer;

begin // Clear
  keyCount := 0;
  for k := 1 to 30 do
  begin
    keys[k].valueCount := 0;
  end;
  maxKey := 0;
  maxValue := 0;
  maxSum := 0;
  savedData[1] := 0;
  savedData[2] := 0;
  savedData[3] := 0;
end; // Clear
where the name of the class precedes the name of the procedure.  Each has to name its local variables first following the var keyword.  This is also similar to Ada where the variables are named prior to the begin statement.  However, unlike Ada where a loop index type can frequently be determined by the context without needing to be declared, it has to be included after the var keyword for Pascal.  In C kinds of languages it is specified in the loop statement.

Instances of the class are named following a var keyword so that they are visible to the non-class procedures that invoke one of the procedures of the class.  Such as
// Static variables
var
  Table : KeyTable;  // instance of class

// Entry point from operating system
begin // LearnToCodeGR program

  // Initialize
  Table := KeyTable.Create(nil);

  . . .

  // Report the results
  Table.Report();
Here, of course, the name of the class is used in setting the instance of the class while the second dotted notation has been used to specify that that instance of the class is to be invoked.

It should be noted here that, like Ada and unlike C, case makes no difference.  Therefore I named the instance of the class as Table rather than keyTable as I would have in a language where case mattered.

However, initially (as with the other languages) I first determined how to read the max-col-sum-by-Key.tsv file.  In doing so I found that the
Readln(FileIn,buffer);
statement, where buffer is a string, inputs up to the end-of-line LF and/or CR but doesn't include these characters in the data string.  Therefore, the Parse procedure (when it was added) had to change in recognizing the end of the final field - the second value and the third field of interest.
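
To make that concrete, here is a minimal sketch in Python (not the Pascal code itself) of parsing one line that has already lost its end-of-line characters.  The function names to_int and parse_line and the sample line are made up for the illustration.  The point is only that, with no CR or LF in the string, the final field has to be closed at the last character rather than at a terminator.

```python
def to_int(text: str, start: int, end: int) -> int:
    """Convert the digits in text[start..end] (inclusive) to an integer."""
    number = 0
    multiplier = 1
    for index in range(end, start - 1, -1):  # walk the field in reverse
        ch = text[index]
        if "0" <= ch <= "9":
            number += multiplier * (ord(ch) - ord("0"))
            multiplier *= 10
    return number


def parse_line(line: str) -> list:
    """Return the integer fields that follow the leading don't-care field."""
    fields = []
    next_field = 0
    num_fields = 0
    for index, ch in enumerate(line):
        if ch == "\t":                       # a tab closes the current field
            if num_fields > 0:               # skip the leading text field
                fields.append(to_int(line, next_field, index - 1))
            num_fields += 1
            next_field = index + 1
    # No CR or LF is present, so the last field ends at the last character.
    fields.append(to_int(line, next_field, len(line) - 1))
    return fields


print(parse_line("who cares\t12\t345\t6"))  # [12, 345, 6]
```

Because the loop in to_int steps backwards on every pass, a stray non-digit character is simply skipped instead of stalling the conversion.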

Note that, like Ada, Pascal has both functions, which return a value, and procedures, which don't.  In many other ways, too, creating this Pascal application brought reminders of the syntax of Ada, which must have been the reason that I liked Pascal on that long ago occasion when I had reason to use it.  I did find the need to surround the code following an if or loop statement with begin … end; blocks a little excessive.  Ada doesn't need the begin but does have the end statement (end if; and end loop;).  C and the like have { } brackets of course.  In Pascal, { } brackets instead surround a comment, like /* */ in C.

There is some required sequence of keywords.  For instance, the internal procedures and functions have to be included after the variables (that follow the 'var' keyword) of the program.

The arrays can start anywhere so I have switched to begin some arrays with an index of 1 while leaving others to start at 0 to correspond to the C code.

The code in Pascal is
program LearnToCodeGR;

{$mode objfpc}{$H+}

uses
  {$IFDEF UNIX}{$IFDEF UseCThreads}
  cthreads,
  {$ENDIF}{$ENDIF}
  Classes, SysUtils, CustApp;

// Global types
type

  fieldArray = array[1..3] of integer;
  valueArray = array[1..20] of integer;

  // Data about a particular key
  KeyData = record
    key : integer;        // numeric key
    valueCount : integer; // number of different values associated with key
    values : valueArray;  // different values associated with key
    sums : valueArray;    // number of references to a value
  end;
  keyArray = array[1..30] of KeyData;

// KeyTable class declaration
  KeyTable = class(TCustomApplication)

    public
    keyCount : integer; // number of different keys
    keys : keyArray;    // value of key - one in each array position

    // Data to be retained as the key and its associated value with the
    // maximum number of references
    maxKey : Integer;
    maxValue : Integer;
    maxSum : Integer;


    constructor Create(TheOwner: TComponent); override;
    destructor Destroy; override;

    // Initialize
    procedure Clear();
    // Update keyTable with data from a parsed line
    procedure Update(data : fieldArray);
    // Add the values or increment their sums
    procedure addValues(keyIndex : Integer; value1 : Integer; value2 : Integer);
    procedure add2ndValue(keyIndex : Integer; value2 : Integer);
    procedure updateTotals(key : Integer; value : Integer; sum : Integer);
    // Report the results
    procedure Report();

  end; // KeyTable class

// Static variables
var
  Table : KeyTable;  // instance of class

  savedData : fieldArray; // data to be visible between multiple procedures

// KeyTable constructor and destructor
constructor KeyTable.Create(TheOwner: TComponent);
begin
  inherited Create(TheOwner);
  StopOnException:=True;
  Clear(); // zero the table (Initialize() would call the inherited TCustomApplication method)
end; // Create

destructor KeyTable.Destroy;
begin
  inherited Destroy;
end;


// KeyTable procedures
procedure KeyTable.Clear();

var
  k : integer;

begin // Clear
  keyCount := 0;
  for k := 1 to 30 do
  begin
    keys[k].valueCount := 0;
  end;
  maxKey := 0;
  maxValue := 0;
  maxSum := 0;
  savedData[1] := 0;
  savedData[2] := 0;
  savedData[3] := 0;
end; // Clear

procedure KeyTable.Update(data : fieldArray);
// This procedure first checks if the key is new and, if so, adds it to
// the keys array.  It then does similar for the values associated with it.
// Note: data[1] is the key while data[2] and data[3] are the two values
//       associated with the key.

var
  keyIndex : integer;
  k : integer;

begin // Update

  // Save data to be updated for use by updateTotals since needed by multiple functions
  savedData[1] := data[1];
  savedData[2] := data[2];
  savedData[3] := data[3];

  // Check whether the key is already in the table
  keyIndex := -1;
  for k := 1 to Table.keyCount do
  begin
    if (keys[k].key = data[1]) then
    begin
      keyIndex := k; // key already in the table
      break; // exit loop
    end;
  end; // for loop

  if keyIndex < 0 then // key not in the table
  begin // add the key
    keyCount := keyCount + 1;
    keyIndex := keyCount;
    keys[keyIndex].key := data[1];
  end; // end if

  // Add the values for the key
  addValues(keyIndex, savedData[2], savedData[3]);

end; // Update

procedure KeyTable.addValues(keyIndex : Integer; value1 : Integer; value2 : Integer);

var
  valueIndex : integer;
  v : integer;

begin // addValues

  // Check whether the first value is already in the table
  valueIndex := -1;
  for v := 1 to keys[keyIndex].valueCount do
  begin
    if (keys[keyIndex].values[v] = value1) then
    begin // value already in the table
      valueIndex := v;
      keys[keyIndex].sums[v] :=  // increment its number of references
      keys[keyIndex].sums[v] + 1;
      if (value1 = value2) then // 2nd value the same
      begin
        keys[keyIndex].sums[v] := // increment again
          keys[keyIndex].sums[v] + 1;
      end;
      // check if new max
      Table.updateTotals(savedData[1],value1,keys[keyIndex].sums[v]);
      break; // exit loop
    end;
  end; // end loop

  if (valueIndex < 0) then // value not yet in table
  begin // add value to the table - index points to last value checked
    keys[keyIndex].valueCount := keys[keyIndex].valueCount + 1;
    v := keys[keyIndex].valueCount;
    keys[keyIndex].values[v] := value1;
    keys[keyIndex].sums[v] := 1;
    if (value1 = value2) then
    begin
      keys[keyIndex].sums[v] := // increment
        keys[keyIndex].sums[v] + 1;
      // check if new max
      Table.updateTotals(savedData[1],value1,keys[keyIndex].sums[v])
    end;
//  else
    if (value1 <> value2) then
    begin
      Table.add2ndValue(keyIndex, value2);
    end;
  end; // end outer if

end; // addValues

// Add the second value or increment its sum
procedure KeyTable.add2ndValue(keyIndex : integer; value2 : integer);

var
  valueIndex : integer;
  v : integer;

begin
  // Check whether the second value is already in the table
  valueIndex := -1;
  for v := 1 to keys[keyIndex].valueCount do
  begin
    if (keys[keyIndex].values[v] = value2) then
    begin // value already in the table
      valueIndex := v;
      keys[keyIndex].sums[valueIndex] := // increment its number of references
      keys[keyIndex].sums[valueIndex] + 1;
      // check if new max
      updateTotals(savedData[1],value2,keys[keyIndex].sums[valueIndex]);
      break; // exit loop
    end;
  end; // end loop

  if (valueIndex < 0) then // value not yet in table
  begin // add value to the table in the next free position
    keys[keyIndex].valueCount := keys[keyIndex].valueCount + 1;
    v := keys[keyIndex].valueCount; // arrays start at 1, so increment first
    keys[keyIndex].values[v] := value2;
    keys[keyIndex].sums[v] := 1;
  end;

end; // add2ndValue

procedure KeyTable.updateTotals(key : Integer; value : Integer; sum : Integer);
begin
  if (sum > maxSum) then
  begin
    maxKey := key;
    maxValue := value;
    maxSum := sum;
  end;
end; // updateTotals

procedure KeyTable.Report();
begin
  WriteLn( 'Key ', maxKey, ' with Value ', maxValue, ' has maximum Sum of ', maxSum );
end; // Report


// General functions and procedures

function toInt(data : string; iStart : integer; iEnd : integer) : integer;
  const
    NINE = '9';
    ZERO = '0';
    numZero = integer('0');

  var
    index : integer;
    digit : integer;
    m : integer = 1;      // multiplier for shift
    number : integer = 0; // Numeric result

begin // toInt
  index := iEnd; // loop in reverse
  while (index >= iStart) do
  begin
    if ((data[index] >= ZERO) and (data[index] <= NINE)) then
    begin
      digit := integer(data[index]) - numZero; // convert ASCII to digit
      number := number + (m * digit);
      m := m * 10;
    end;
    index := index - 1; // always step back so a non-digit cannot stall the loop
  end;
  toInt := number; // return converted value
end; // toInt

// Parse each line of data in the buffer to obtain the three fields of
// interest, converting those fields to integers into an array, and
// then updating a data structure to retain the data for evaluation
// when the complete buffer has been parsed.
procedure Parse(count : Integer; data : string);
  const
    HT : char = #9; // horizontal tab

  var
    convertedValue : integer;
    nextField : integer = 0;  // index of the beginning of next field
    startField : integer;     // range of indexes of
    numFields : integer = 0;  // index into dataFields array
    dataFields : fieldArray;  // Integer values of the three fields of interest

    index : integer = 1;      // Pascal strings are indexed from 1

begin // Parse
  while index < count do
  begin
    // Parse the buffer line
    if (data[index] = HT) then // beginning of a field
    begin
      startField := nextField; // save starting index
      nextField := index + 1;  // the next byte will contain part of next field
      if (numFields > 0) then
      begin
        //convert and store in dataFields
        dataFields[numFields] := toInt(data, startField, index - 1);
      end; // end if numFields > 0
      numFields := numFields + 1;
    end; // end if HT
    index := index + 1;
  end; // while loop
  // Convert and store last field since it ends at the last char of data
  convertedValue := toInt(data, nextField, count);
  dataFields[numFields] := convertedValue; //toInt(data, nextField, count);
  // Save the data to determine the key, value combination with
  // maximum number of references
  Table.Update(dataFields);
end; // Parse

// Variables needed by the main entry point program
var
  FileIn : Text;
  buffer : string;   // line of data from the Text file

// Entry point from operating system
begin // LearnToCodeGR program

  // Initialize
  Table := KeyTable.Create(nil);
  Table.Title:='LearnToCodeGR';

  // Open file
  AssignFile(FileIn, 'C:\Source\LearnToCodeGR\max-col-sum-by-Key.tsv'); // Pascal strings have no backslash escapes
  Reset(FileIn); // Ensure that at the start of the file

  // Read and Parse each line
  while not Eof(FileIn) do
  begin
    Readln(FileIn,buffer); // input the string of the file line
    WriteLn(Length(buffer));
    WriteLn(buffer);

    // Parse the line and update to retain all the necessary data.
    Parse(Length(buffer), buffer);
  end;

  Close(FileIn);

  // Report the results
  Table.Report();

  // Terminate
  Table.Free;

end. // LearnToCodeGR program

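Note that in KeyTable.addValues I wanted to use an else statement but had trouble getting the compiler to accept it.  So the else is commented out and is followed by a second if statement to obtain the opposite result.  That is, the two values being equal prior to the //else and being unequal following it.  The likely cause is the semicolon after the end that precedes the else: in Pascal a semicolon there terminates the if statement, so the form the compiler accepts is end else begin … end; with no semicolon before the else.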


Saturday, September 30, 2017

Max Column Sum by Key (part 3)



Continuing on from my previous explorations that followed Learn to Code Grand Rapids “Building a Real World App in VS 2017, Part I” of August 10 I decided to try Kotlin after having attended a MeetUp presentation of this new language. 

It was supposed to be an extension of Java 8 (I think it was) for Android devices, allowing Java code and Kotlin code to work together so that an application could be switched over gradually.  As a new language it was supposed to be getting frequent releases.  Well, I must not have downloaded a very recent release; at least I couldn't compile with part Kotlin code and part Java code (of course my Java download was back with a 3.6 release or something like that, so nowhere near Java 8).

In my previous two Max Column Sum by Key documents I described my initial C# attempt, which was aborted because I didn't yet have the file containing the data; then my application using Ada after getting the data file; then C# once again using the data file; and finally Python as a new language for me.  Then, in the second document, I described using Java and redoing the C# application once again.

This document will describe the application in Kotlin which ended up somewhat simpler since I had to work through some problems.  While trying to find a way to do a similar structure to that used in "Max Column Sum by Key (part 2)" in Java and then C# once again and failing to accomplish it in Kotlin, I gave up and then a simpler solution occurred to me.

Following that discussion will be one concerning doing the same application in C.

Discussion of the Kotlin Application


I started out following the program structure of the Java version of the application with a Key class to encapsulate the data of the key field of the file.  Remember that the file being used consisted of 20 records for each of 5 different keys, with a pair of values in each record associated with the key, and the objective of finding the key and value that had the most instances in the file.  (Each record has leading "who cares" text followed by the key and the two value fields, each preceded by a horizontal tab character; the original file had a line feed character following the second value.  Since I am using the UltraEdit editor, which likes to ask if you want to convert to a DOS format when opening a file, I happened to reply in the affirmative early on, so that records got a carriage return preceding the trailing LF when I was examining the format of the file.)
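
For reference, the objective itself can be stated compactly.  The following is a minimal Python sketch, not any of the applications described in these documents.  It assumes that each of the two values in a record counts as one reference for that record's key, which is my reading of the objective, and the three sample records are invented.  The function name max_key_value is likewise hypothetical.

```python
from collections import Counter


def max_key_value(lines):
    """Return (key, value, count) for the most referenced key/value pair."""
    counts = Counter()
    for line in lines:
        # Drop CR and/or LF, then split on the horizontal tab separators.
        fields = line.rstrip("\r\n").split("\t")
        key, value1, value2 = (int(f) for f in fields[1:4])
        counts[(key, value1)] += 1
        counts[(key, value2)] += 1
    (key, value), count = counts.most_common(1)[0]
    return key, value, count


sample = [
    "text\t1\t10\t20\r\n",
    "text\t1\t10\t10\r\n",
    "text\t2\t20\t30\r\n",
]
print(max_key_value(sample))  # (1, 10, 3)
```

A dictionary keyed on the (key, value) pair removes the need for the fixed-size tables, which is why the sketch is so much shorter than the array-based versions.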

In the Java application and the later C# application there was the Key class to organize the data for a key including arrays to retain the different values associated with the particular key (taking the place of a C struct structure).  Then a KeyTable class to keep track of all the data with an array of the Key class to retain the data on each different key found in the file.  Plus an Update function to access the two classes to build the tables.

I tried to duplicate this in Kotlin – after all it was supposed to be able to gradually convert from a Java application to Kotlin.  Kotlin had array constructs where a class or object could be assigned as the type of the array.  (object being something similar to a class it seemed.)  But try as I might I couldn't get an array of the Key class to work as I had in Java and C#. 

So I decided to use a double dimension array of the Int type.  Again there was a problem since it seemed from online searches that multiple dimension arrays weren't supported although some said it had been added supplying what looked to me as a strange syntax.  But, most likely not in the version I had downloaded even if I could have figured out the syntax.

After fussing with the array of the Key class once again it dawned on me that I could do a work around for a double dimension array where I made a single dimension array of the Int type contain imaginary arrays of a second dimension.  In that way I could have the multiple arrays of the past Key class value data for a particular key embedded in the single dimension array – one group of the value data for each key as follows.
  // The double dimension arrays are simulated as
  //  +-----------+-----------+-----------+-----------+-----------+-----------+
  //  |           |           |           |           |           |           |
  //  +-----------+-----------+-----------+-----------+-----------+-----------+
  //  |           |
  //  | elements  +--> boundary at end of first array
  //    of first
  //     array
  // Access the double array via v + k * (first array size) since arrays begin at 0
  //  [0,0] computes to 0 for the single dimension array;
  //  [1,0] computes to 1; assuming 30 elements in the k array
  //  [0,1] computes to 30;
  //  [1,1] computes to 31
  var keyCount : Int = 0        // number of different keys
  var keys = IntArray(30)       // value of key - one in each array position
  var valueCount = IntArray(30) // number of different values for a particular key
  var values = IntArray(30*20)  // particular values for each key
  var sums = IntArray(30*20)    // sum of instances of each value for each key
Here "elements of first array" means the values associated with the first key (the k index of the illustration) with v being the value index within a particular set of elements.  Thus the code only needs to compute the actual index from the k and v indexes.
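
The index computation can be sketched on its own.  This is a minimal Python illustration, not the Kotlin code.  The names MAX_KEYS, MAX_VALUES and flat_index are hypothetical, and the stride used here is the per-key capacity of 20, which is an assumption that differs from the stride of 30 shown in the comments above.  With 30*20 elements, a stride of 20 keeps every key's block inside the array.

```python
MAX_KEYS = 30     # capacity of the key dimension (as in the arrays above)
MAX_VALUES = 20   # capacity of the value dimension for each key

values = [0] * (MAX_KEYS * MAX_VALUES)


def flat_index(k: int, v: int) -> int:
    """Map the (key index, value index) pair to a position in the flat array."""
    if not (0 <= k < MAX_KEYS and 0 <= v < MAX_VALUES):
        raise IndexError("key or value index out of range")
    return k * MAX_VALUES + v


values[flat_index(0, 0)] = 11   # first value of first key  -> position 0
values[flat_index(0, 1)] = 12   # second value of first key -> position 1
values[flat_index(1, 0)] = 21   # first value of second key -> position 20
print(flat_index(1, 0), flat_index(1, 1), flat_index(29, 19))  # 20 21 599
```

Computing the position from k and v each time, rather than adding to a running index, avoids any drift as a loop advances.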

After getting this worked out, the Key class no longer had any purpose and so was discarded.

Then while implementing the Kotlin version of the Update function accessing the KeyTable class functions it occurred to me that it would be much simpler to move the update function inside the KeyTable class to directly access the data and that this could be accomplished more simply than I had in the past implementations by passing in all three fields from each record at a time.  This is when it occurred to me that this new KeyTable function had the same purpose as the previous Update function so I changed the name I had given it to update (following the usual C convention of making the first character of a variable or function name lower case).  When the code is presented it can be seen how this version of update is simpler than the previous versions.

At the beginning I had had some problems remembering the syntax for declaring variables and function parameters and function data return declarations since it is non-C like.  It took me a while to get used to it and then I realized it had some characteristics of that of Ada – that is, that the variable is named first followed by a ':' and then its type.  For instance
var index : Int = 0
val HT : Byte = 9
which in Ada would be (where case doesn't matter)
index : Integer := 0;
HT : constant Byte := 9;
where the Byte type in Ada needs to be defined as an 8 bit unsigned integer.
Whereas in C (for instance) it is
int index = 0;
with the type given first.

Also Kotlin can have constants by using the keyword val instead of var.  While, of course, Ada, C, etc don't have a keyword to specify when a variable is being created.

Unlike Java, where it is required, an ending ';' can be given in Kotlin but isn't needed.

For function declarations Kotlin uses
fun name(param1 : Int, param2 : String) : Int
whereas Ada would use
function name(param1 : Integer; param2 : String) return Integer;
or
function name(param1 : in Integer; param2 : in String) return Integer;
where it is specified that a function can only have parameters that input data to the function.  Ada also has
procedure name(param1 : in out Integer; param2 : in String; param3 : out Integer);
that doesn't have a return statement but can output data via the out keyword (with param1 in this case being both input to the procedure and output).  C function declarations, of course, are of the form
int name(int param1);
void name(int param1);
without the keyword to specify a function and with the return type preceding the function name and the type of the parameters preceding the parameter name.

So, in this regard, Kotlin syntax is more like Ada (and the older and deader language Pascal) with the name first and then the type.

As can be seen in the Kotlin code, I also modified the parse function to just build an array of 3 integers to capture the data fields of interest (the key and the two values) and just count the number of fields processed (including the leading don't care field) rather than use a switch type of statement.
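
The field-counting idea can be sketched independently of Kotlin.  The following Python sketch is an illustration under stated assumptions: the function name parse_buffer and the sample buffer are invented, and it uses Python's built-in int() on a byte slice instead of a hand-written digit conversion.  It walks a whole buffer, counts tab-delimited fields, and closes the third field of interest at the first CR or LF.

```python
HT, LF, CR = 9, 10, 13  # ASCII codes for tab, line feed, carriage return


def parse_buffer(buffer: bytes) -> list:
    """Return one [key, value1, value2] list per record in the buffer."""
    records = []
    fields = [0, 0, 0]
    num_fields = 0      # fields completed so far in the current record
    next_field = 0      # index where the current field starts
    for index, byte in enumerate(buffer):
        if byte == HT:
            if num_fields > 0:  # skip the leading don't-care field
                fields[num_fields - 1] = int(buffer[next_field:index])
            num_fields += 1
            next_field = index + 1
        elif byte in (CR, LF):
            if num_fields == 3:  # convert the last field only once per record
                fields[2] = int(buffer[next_field:index])
                records.append(list(fields))
                num_fields += 1
            if byte == LF:
                num_fields = 0   # the next record starts after the line feed
    return records


data = b"text\t1\t10\t20\r\ntext\t2\t30\t30\n"
print(parse_buffer(data))  # [[1, 10, 20], [2, 30, 30]]
```

Counting fields this way handles both the original LF-only records and the CR LF records produced by the DOS conversion, since only the first terminator of a record triggers the conversion.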

My Kotlin code for the application is (where Java IO is imported to read the file)
import java.io.File
import java.io.InputStream

// Update structure to contain the different values for each key with the
// number of instances of each combination
class KeyTable
{
  // Data tracking which value of which key has the greatest number of references
  var maxKey : Int = -1   // key with the greatest number of references for a value
  var maxValue : Int = -1 // value with the greatest number of references
  var maxSum : Int = -1   // number of references for key, value combination
 
  // This class retains the data for all possible keys with their associated
  // values.  As such it simulates double dimension arrays for the values and
  // the sum of the number of instances for a particular value.  The first
  // simulated dimension is the same as that for the keys array; one index for
  // each key.  The second simulated dimension is of the various different values
  // for a particular key.
  //
  // The double dimension arrays are simulated as
  //  +-----------+-----------+-----------+-----------+-----------+-----------+
  //  |           |           |           |           |           |           |
  //  +-----------+-----------+-----------+-----------+-----------+-----------+
  //  |           |
  //  | elements  +--> boundary at end of first array for values of first key
  //    of first
  //      key
  // Access the double array via v + k * (first array size) since arrays begin at 0
  //  [0,0] computes to 0 for the single dimension array;
  //  [1,0] computes to 1; assuming 30 elements in the key array and 20 for the values
  //  [0,1] computes to 30 for first value of 2nd key
  //  [1,1] computes to 31 for second value of 2nd key
  var keyCount : Int = 0        // number of different keys
  var keys = IntArray(30)       // value of key - one in each array position
  var valueCount = IntArray(30) // number of different values for a particular key
  var values = IntArray(30*20)  // particular values for each key
  var sums = IntArray(30*20)    // sum of instances of each value for each key

  var savedData = IntArray(3) // global values from call to update

  fun initialize()
  {
    for (k: Int in 0..29)
    {
      keys[k] = 0
      valueCount[k] = 0
    }
    for (kv: Int in 0..(30*20)-1)
    {
      values[kv] = 0
      sums[kv] = 0
    }
  } // end initialize

  // Add key and its values
  fun update(data : IntArray)
  {
    // This function first checks if the key is new and, if so, adds it to the
    // keys array.  It then does similar for the values associated with it.
    // Note: data[0] is the key while data[1] and data[2] are the two values
    //       associated with the key.
   
    savedData = data // save data to be updated for use by updateTotals
   
    // Check whether the key is already in the table
    var keyIndex : Int = -1
    for (k : Int in 0..keyCount-1)
    {
      if (keys[k] == data[0])
      {
        keyIndex = k // key already in the table
        break; // exit loop
      }
    } // end for loop

    if (keyIndex < 0) // key not in the table
    { // add the key
      keyIndex = keyCount
      keys[keyIndex] = data[0]
      keyCount++
    } // end if

    // add the values for the key
    addValues(keyIndex, data[1], data[2])

  } // end update

  // Add the values or increment their sums
  fun addValues(keyIndex : Int, value1 : Int, value2 : Int)
  {
    // Check whether the first value is already in the table
    var valueIndex : Int = -1
    val base : Int = keyIndex * 30 // location of first value for this key (0, 30, ...)
    var index : Int = base
    for (v : Int in 0..valueCount[keyIndex]-1)
    {
      index = base + v // location of the v-th value for this key
      if (values[index] == value1)
      { // value already in the table
        valueIndex = index
        sums[valueIndex]++ // increment its number of references
        if (value1 == value2) // 2nd value the same
        {
          sums[valueIndex]++ // increment again
        }
        updateTotals(savedData[0],value1,sums[valueIndex]) // check if new max
        break; // exit loop
      }
    } // end loop
    if (valueIndex < 0) // value not yet in table
    { // add value to the table at the next free location for this key
      index = base + valueCount[keyIndex] // next location
      values[index] = value1
      sums[index] = 1
      valueCount[keyIndex]++
      if (value1 == value2)
      {
        sums[index]++ // increment
        updateTotals(savedData[0],value1,sums[index]) // check if new max
      }
      else
      {
        add2ndValue(keyIndex, value2)
      }
    } // end outer if

  } // end addValues

  // Add the second value or increment its sum
  fun add2ndValue(keyIndex : Int, value2 : Int)
  {
    // Check whether the second value is already in the table
    var valueIndex : Int = -1
    val base : Int = keyIndex * 30 // location of first value for this key (0, 30, ...)
    var index : Int = base
    for (v : Int in 0..valueCount[keyIndex]-1)
    {
      index = base + v // location of the v-th value for this key
      if (values[index] == value2)
      { // value already in the table
        valueIndex = index
        sums[valueIndex]++ // increment its number of references
        updateTotals(savedData[0],value2,sums[valueIndex]) // check if new max
        break; // exit loop
      }
    } // end loop
    if (valueIndex < 0) // value not yet in table
    { // add value to the table at the next free location for this key
      index = base + valueCount[keyIndex] // index to next location
      values[index] = value2
      sums[index] = 1
      valueCount[keyIndex]++
    }
       
  } // end add2ndValue

  fun updateTotals(key : Int, value : Int, sum : Int)
  {
    if (sum > maxSum)
    {
      maxKey = key
      maxValue = value
      maxSum = sum
    }
  } // end updateTotals
 
  // Report the key and value with the most references
  fun report()
  {
    println("Key " + maxKey + " with value " + maxValue + " with maximum references of " + maxSum)
  }
 
} // end class KeyTable

// Instantiate KeyTable class
val keyTable = KeyTable()


// Convert Byte array to Int
fun toInt(data: ByteArray, iS : Int, iE: Int): Int
{
  val NINE: Int = 57 // ASCII character for digit 9
  val ZERO: Int = 48 // ASCII character for digit 0

  var index: Int = iE // loop in reverse
  var digit: Int
  var m: Int = 1      // multiplier for shift
  var number: Int = 0 // Numeric result

  while (index >= iS)
  {
    if ((data[index] >= ZERO) && (data[index] <= NINE))
    {
      digit = data[index] - ZERO; // convert ASCII to digit
      number = number + (m * digit);
      m = m * 10;
    }
    index--; // always step back so a non-digit cannot stall the loop
  }

  return number;

} // end toInt

// Parse
fun parse(buffer: ByteArray, length: Int)
{
  val CR: Byte = 13 // carriage return
  val HT: Byte = 9  // horizontal tab
  val LF: Byte = 10 // line feed
 
  // Parse each line of data in the buffer to obtain the three fields of
  // interest, converting those fields to integers into an array, and
  // then updating a data structure to retain the data for evaluation
  // when the complete buffer has been parsed.
  // Also, output the contents of the file buffer when not commented out.
  var nextField: Int = 0  // index of the beginning of next field
  var startField: Int     // index of the beginning of the current field
  var numFields: Int = 0  // index into dataFields array
  var dataFields = IntArray(3) // Integer values of the three fields of interest
 
  var index: Int = 0
  while (true)
  {
    // Parse the buffer line
    if (buffer[index] == HT) // beginning of a field
    {
      startField = nextField // save starting index
      nextField = index + 1  // the next byte will contain part of next field
      if (numFields > 0)
      {
        dataFields[numFields-1] = toInt(buffer, startField, index - 1) //convert and store in dataFields
      } // end if numFields > 0
      numFields++
    } // end if HT
    if ((buffer[index] == CR) || (buffer[index] == LF))
    {
      if (numFields < 4) // only do CR or LF once
      {
        dataFields[numFields-1] = toInt(buffer, nextField, index - 1) //convert and store in dataFields
        keyTable.update(dataFields)

        numFields++
      }
      if (buffer[index] == LF)
      {
        numFields = 0
      }
    } // end if CR or LF

    index++
    if (index == length)
    {
      break // out of while loop
    }
  } // end while

} // end of parse function

// Main entry point
fun main(args: Array<String>)
{
  // Open the file.
  try
  {
    // Open the inputStream.
    var inputStream: InputStream = File( "C:\\Source\\LearnToCodeGR\\max-col-sum-by-Key.tsv").inputStream()

    // Read from this input stream into an array of bytes
    var buffer = ByteArray(8192)
    var length = inputStream.read(buffer)

    keyTable.initialize() // initialize the KeyTable class

    // Parse the buffer and keep track of the needed data.
    parse( buffer, length)
   
    // Report the results
    keyTable.report()
   
  } // end try
  catch (e: java.io.FileNotFoundException) // thrown by inputStream() when the file is missing
  {
    System.out.println("File not found: " + e)
  }

} // end main function

Kotlin, like Java and Python, doesn’t seem to have a compiler with its own Windows user interface, so it has to be run from a Command Prompt window.  As usual, I changed the folder in the window to the one where I was developing the application (c:\Source\Kotlin), so the command was just
> kotlinc learntocodegr.kc -include-runtime -d learntocodegr.jar
since LearnToCodeGR.kc was the file name that I used and kotlinc is the Kotlin compiler (with the path to its binary folder added to the Windows system path).

To run the application I used the command
> java -jar learntocodegr.jar
which resulted in the following output in the Command Prompt window
Key 3000 with value 1 with maximum references of 20
as produced by the report function of the KeyTable class.  This is where the use of Java shows up: the Kotlin compiler produces a jar file that is run with the java command.


Discussion of the C Application


Once again I had the "main" function open the max-col-sum-by-Key.tsv file and read it.  However, as I have been meaning to do for a while now, instead of reading the contents of the entire file into a buffer, I used a C function (fgets) to read in only one line at a time, that is, up to and including the Line Feed character.  The file is read as text since it is now known that it contains lines of characters rather than, for instance, a mix with binary numeric values.  This also has the advantage of being able to process a much larger file, since the buffer only has to be large enough to hold one line.
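
As a minimal sketch of the line-at-a-time idea (not the code used in the application below), the loop here lets fgets itself decide when the file is exhausted.  The function name countLines and the buffer size of 64 are assumptions made only for this illustration.

```c
#include <stdio.h>
#include <string.h>

// Hypothetical helper: read a text stream one line at a time and
// return the number of lines that were read.
int countLines(FILE* file)
{
  char buffer[64];  // assumed size; it only has to hold one line
  int lines = 0;
  int partial = 0;  // set while the current line has not yet ended

  // fgets returns NULL at end-of-file (or on error), so testing its
  // result ends the loop without a separate feof check
  while (fgets(buffer, sizeof buffer, file) != NULL)
  {
    size_t length = strlen(buffer);
    if (length > 0 && buffer[length - 1] == '\n')
    { // the line feed was read, so the line is complete
      lines++;
      partial = 0;
    }
    else
    { // a line longer than the buffer, or a last line without LF
      partial = 1;
    }
  }
  if (partial)
  {
    lines++; // count a final line that had no line feed
  }
  return lines;
}
```

A caller would pass the FILE* returned by fopen and close it afterwards.  A line that does not fit in the buffer comes back from fgets in pieces, which is why the sketch only counts a piece that ends with the line feed.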

And I used C's struct type, which I've found missing in C#, etc., so its use can be observed.  The objects of the struct type are
  struct KeyData keydata[30]; // array of possible keys with their values
in the declaration of the KeyTable structure type and
struct KeyTable keyTable; // static table
for the keyTable variable.
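
For anyone who hasn't used a C struct, here is a small sketch of how the nested members are reached.  It repeats the two declarations in shortened form; the function countValue and its pointer parameter are assumptions for illustration only and are not part of the application.

```c
// Data about a particular key (same layout as in the application)
struct KeyData {
  int key;
  int valueCount;
  int values[20];
  int sums[20];
};

// Data about all the keys
struct KeyTable {
  int keyCount;
  struct KeyData keydata[30];
};

// Hypothetical helper: count one reference to value for the key at
// keyIndex.  Returns the new number of references, or -1 if the
// values array of that key is already full.
int countValue(struct KeyTable* table, int keyIndex, int value)
{
  // -> reaches a member through a pointer, . through the struct itself
  struct KeyData* entry = &table->keydata[keyIndex];

  for (int v = 0; v < entry->valueCount; v++)
  {
    if (entry->values[v] == value)
    { // value already in the table
      entry->sums[v]++;
      return entry->sums[v];
    }
  }
  if (entry->valueCount >= 20)
  {
    return -1; // no room for another value
  }
  entry->values[entry->valueCount] = value;
  entry->sums[entry->valueCount] = 1;
  entry->valueCount++;
  return 1;
}
```

A table declared as struct KeyTable table = {0}; starts with every count at zero, and it would be passed to the function as &table.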

Otherwise, the C application is a rework of the Kotlin version of the application.  So the two languages can be directly compared.

The contents of the max-col-sum-by-Key.tsv file are displayed in the main function.  I've left that in, but normally it would either be removed or commented out.  When the application is run there is a blank line after each line read from the file, since each line (except the last) ends with its own new line character and my code then outputs another one using C's \n new line character.  Of course, the horizontal tabs in the lines are only seen as blank space since an HT (\t), like an LF (\n), isn't a printable character.
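
If the extra blank line is unwanted, one possible approach is to cut the line ending off a copy of the line before displaying it.  This is only a sketch; the function name stripNewline is an assumption, and it should be applied to a display copy rather than to the buffer handed to parse, since parse relies on finding the CR or LF.

```c
#include <string.h>

// Hypothetical helper: end the string at its first CR or LF, if any
void stripNewline(char* line)
{
  // strcspn returns the number of leading characters that are
  // neither CR nor LF, which is the index of the first line ending
  // (or of the terminating '\0' when there is none)
  line[strcspn(line, "\r\n")] = '\0';
}
```

After the call the line can be printed with a single printf("%s\n", line) without producing a blank line.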

#include <stdio.h>

// Data about a particular key
struct KeyData {
  int key;        // numeric key
  int valueCount; // number of different values associated with key
  int values[20]; // different values associated with key
  int sums[20];   // number of references to a value
};

// Data about all the keys of the file
struct KeyTable {
  int keyCount; // number of different keys in the file
  struct KeyData keydata[30]; // array of possible keys with their values
};

struct KeyTable keyTable; // static table

// Initialize keyTable
void initialize()
{
  keyTable.keyCount = 0;
  for (int i = 0; i < 30; i++)
  {
    keyTable.keydata[i].valueCount = 0;
  }
} // end initialize

// Data to be retained as the key and its associated value with the maximum
// number of references
int maxKey = 0;
int maxValue = 0;
int maxSum = 0;

int savedData[3];

// Report the key and value with the most references
void report()
{
  printf("\n");
  printf("Key %d with value %d with maximum references of %d \n", maxKey, maxValue, maxSum );
}

void updateTotals(int key, int value, int sum)
{
  if (sum > maxSum)
  {
    maxKey = key;
    maxValue = value;
    maxSum = sum;
  }
} // end updateTotals

// Add the second value or increment its sum
void add2ndValue(int keyIndex, int value2)
{
  // Check whether the second value is already in the table
  int valueIndex = -1;
  for (int v = 0; v < keyTable.keydata[keyIndex].valueCount; v++)
  {
    if (keyTable.keydata[keyIndex].values[v] == value2)
    { // value already in the table
      valueIndex = v;
      keyTable.keydata[keyIndex].sums[valueIndex]++; // increment its number of references
      // check if new max
      updateTotals(savedData[0],value2,keyTable.keydata[keyIndex].sums[valueIndex]);
      break; // exit loop
    }
  } // end loop
  if (valueIndex < 0) // value not yet in table
  { // add value to the table
    int index = keyTable.keydata[keyIndex].valueCount;
    keyTable.keydata[keyIndex].values[index] = value2;
    keyTable.keydata[keyIndex].sums[index] = 1;
    keyTable.keydata[keyIndex].valueCount++;
  }
       
} // end add2ndValue

// Add the values or increment their sums
void addValues(int keyIndex, int value1, int value2)
{
  // Check whether the first value is already in the table
  int valueIndex = -1;
  for (int v = 0; v < keyTable.keydata[keyIndex].valueCount; v++)
  {
    if (keyTable.keydata[keyIndex].values[v] == value1)
    { // value already in the table
      valueIndex = v;
      keyTable.keydata[keyIndex].sums[v]++; // increment its number of references
      if (value1 == value2) // 2nd value the same
      {
        keyTable.keydata[keyIndex].sums[v]++; // increment again
      }
      // check if new max
      updateTotals(savedData[0],value1,keyTable.keydata[keyIndex].sums[v]);
      break; // exit loop
    }
  } // end loop
  if (valueIndex < 0) // value not yet in table
  { // add value to the table at the next free location
    int index = keyTable.keydata[keyIndex].valueCount;
    keyTable.keydata[keyIndex].values[index] = value1;
    keyTable.keydata[keyIndex].sums[index] = 1;
    keyTable.keydata[keyIndex].valueCount++;
    if (value1 == value2)
    {
      keyTable.keydata[keyIndex].sums[index]++; // increment
      // check if new max
      updateTotals(savedData[0],value1,keyTable.keydata[keyIndex].sums[index]);
    }
    else
    {
      add2ndValue(keyIndex, value2);
    }
  } // end outer if

} // end addValues

// Update keyTable with data from a parsed line
void update(int* data)
{
  // This function first checks if the key is new and, if so, adds it to the
  // keys array.  It then does similar for the values associated with it.
  // Note: data[0] is the key while data[1] and data[2] are the two values
  //       associated with the key.

  // save data to be updated for use by updateTotals since needed by multiple functions
  savedData[0] = data[0];
  savedData[1] = data[1];
  savedData[2] = data[2];

  // Check whether the key is already in the table
  int keyIndex = -1;
  for (int k = 0; k < keyTable.keyCount; k++)
  {
    if (keyTable.keydata[k].key == data[0])
    {
      keyIndex = k; // key already in the table
      break; // exit loop
    }
  } // end for loop

  if (keyIndex < 0) // key not in the table
  { // add the key
    keyIndex = keyTable.keyCount;
    keyTable.keydata[keyIndex].key = data[0];
    keyTable.keyCount++;
  } // end if

  // add the values for the key
  addValues(keyIndex, savedData[1], savedData[2]);

} // end update


// Convert character array to int
int toInt(char* data, int iS, int iE)
{
  #define NINE '9' //57 // ASCII character for digit 9
  #define ZERO '0' //48 // ASCII character for digit 0

  int index = iE; // loop in reverse
  int digit;
  int m = 1;      // multiplier for shift
  int number = 0; // Numeric result

  while (index >= iS)
  {
    if ((data[index] >= ZERO) && (data[index] <= NINE))
    {
      digit = data[index] - ZERO; // convert ASCII to digit
      number = number + (m * digit);
      m = m * 10;
    }
    index--; // always step back so a non-digit character cannot stall the loop
  }

  return number;

} // end toInt

// Extract the fields of interest from the file's line and retain the data
void parse(int count, char* data)
{

  #define CR '\r' // carriage return
  #define HT '\t' // horizontal tab
  #define LF '\n' // line feed
  #define TRUE 1
 
  // Parse each line of data in the buffer to obtain the three fields of
  // interest, converting those fields to integers into an array, and
  // then updating a data structure to retain the data for evaluation
  // when the complete buffer has been parsed.
  int nextField = 0;  // index of the beginning of next field
  int startField;     // index of the beginning of the current field
  int numFields = 0;  // index into dataFields array
  int dataFields[3];  // Integer values of the three fields of interest
 
  int index = 0;
  while (TRUE)
  {
    // Parse the buffer line
    if (data[index] == HT) // beginning of a field
    {
      startField = nextField; // save starting index
      nextField = index + 1;  // the next byte will contain part of next field
      if (numFields > 0)
      { //convert and store in dataFields
        dataFields[numFields-1] = toInt(data, startField, index - 1);
      } // end if numFields > 0
      numFields++;
    } // end if HT
    if ((data[index] == CR) || (data[index] == LF))
    {
      if (numFields < 4) // only do CR or LF once
      { //convert and store in dataFields
        dataFields[numFields-1] = toInt(data, nextField, index - 1);
        // save the data to determine the key, value combination with
        // maximum number of references
        update(dataFields);

        return; // finished with the line in the buffer
      }
    } // end if

    index++;
    if (index == count)
    {
      break; // out of while loop
    }
  } // end while

} // end parse


int main(int argc, char* argv[])
{
  FILE* file;
  char buffer[30];
  file = fopen("C:\\Source\\LearnToCodeGR\\max-col-sum-by-Key.tsv", "r");

  int length;

  initialize();

  if (file == NULL)
  {
    perror("Error opening file");
    return 1;
  }

  // Read file records until end-of-file; fgets returns NULL when
  // there is nothing more to read, so no line is processed twice
  while (fgets(buffer, sizeof buffer, file) != NULL)
  {
    // Count the characters read into buffer (up to the ending '\0')
    length = 0;
    while (buffer[length] != '\0')
    {
      length++;
    }

    // Display the line; remove or comment out when not wanted
    for (int i = 0; i < length; i++)
      printf("%c ", buffer[i]);
    printf("\n");

    // Parse the line and update to retain all the necessary data.
    parse(length, buffer);
  }
  fclose(file);

  // Report the results.
  report();

  return 0;

} // end main