Continuing on from my previous
explorations that followed Learn to Code Grand Rapids “Building a Real World App in VS 2017, Part I”
of August 10 I decided to try Kotlin after having attended a MeetUp
presentation of this new language.
It was supposed to be an
extension of Java 8 (I think it was) for Android devices and to be able to
gradually go from Java code to Kotlin with them working together while
gradually switching over. As a new
language it was supposed to be getting frequent releases. Well I must not have downloaded a real
recent release – at least I couldn't compile with part Kotlin code and part
Java code (of course my Java download was back with a 3.6 release or something
like that so no where near Java 8).
In my previous two Max Column
Sum by Key documents I described my initial C# attempt before I had the file
containing the data that was aborted for that reason, then my application using
Ada as the language after getting the data file, then with C# once again using
the data file, and finally with Python as a new language for me. Then, in the second document, using Java and
redoing the C# application once again.
This document will describe the
application in Kotlin which ended up somewhat simpler since I had to work
through some problems. While trying to
find a way to do a similar structure to that used in "Max Column Sum by
Key (part 2)" in Java and then C# once again and failing to accomplish it
in Kotlin, I gave up and then a simpler solution occurred to me.
Following that discussion will
be one concerning doing the same application in C.
Discussion of the Kotlin Application
I started out following the
program structure of the Java version of the application with a Key class to
encapsulate the data of the key field of the file. Remember that the file being used consisted of 20 records for
each of 5 different keys with a pair of values in each record associated with
the key with the objective of finding the key and value that had the most
instances in the file. (Each key
following "who cares" leading text and the key and value fields
preceded by a horizontal tab character and the original file having a line feed
character following the second value.
Since I am using the UltraEdit editor which likes to ask if you want to
convert to a DOS format when opening a file, I happened to reply in the
affirmative early on so that records got a carriage return preceding the
trailing LF when I was examining the format of the file.)
In the Java application and the later C# application there
was the Key class to organize the data for a key including arrays to retain the
different values associated with the particular key (taking the place of a C
struct structure). Then a KeyTable
class to keep track of all the data with an array of the Key class to retain
the data on each different key found in the file. Plus an Update function to access the two classes to build the
tables.
I tried to duplicate this in Kotlin – after all it was
supposed to be able to gradually convert from a Java application to
Kotlin. Kotlin had array constructs
where a class or object could be assigned as the type of the array. (object being something similar to a class
it seemed.) But try as I might I
couldn't get an array of the Key class to work as I had in Java and C#.
So I decided to use a double dimension array of the Int
type. Again there was a problem since
it seemed from online searches that multiple dimension arrays weren't supported
although some said it had been added supplying what looked to me as a strange
syntax. But, most likely not in the
version I had downloaded even if I could have figured out the syntax.
After fussing with the array of the Key class once again it
dawned on me that I could do a work around for a double dimension array where I
made a single dimension array of the Int type contain imaginary arrays of a
second dimension. In that way I could
have the multiple arrays of the past Key class value data for a particular key
embedded in the single dimension array – one group of the value data for each
key as follows.
// The
double dimension arrays are simulated as
//
+-----------+-----------+-----------+-----------+-----------+-----------+
// |
| | | |
| |
//
+-----------+-----------+-----------+-----------+-----------+-----------+
// |
|
// | elements
+--> boundary at end of first array
// of first
// array
// Access
the double array via v + k * (first array size) since arrays begin at 0
// [0,0] computes to 0 for the single dimension
array;
// [1,0] computes to 1; assuming 30 elements in
the k array
// [0,1] computes to 30;
// [1,1] computes to 31
var
keyCount : Int = 0 // number of
different keys
var keys =
IntArray(30) // value of key - one
in each array position
var
valueCount = IntArray(30) // number of different values for a particular key
var values
= IntArray(30*20) // particular values
for each key
var sums =
IntArray(30*20) // sum of instances
of each value for each key
Here "elements of first array" means the values
associated with the first key (the k index of the illustration) with v being
the value index within a particular set of elements. Thus the code only needs to compute the actual index from the k
and v indexes.
After getting this worked out, the Key class no longer had
any purpose and so was discarded.
Then while implementing the Kotlin version of the Update
function accessing the KeyTable class functions it occurred to me that it would
be much simpler to move the update function inside the KeyTable class to
directly access the data and that this could be accomplished more simply than I
had in the past implementations by passing in all three fields from each record
at a time. This is when it occurred to
me that this new KeyTable function had the same purpose as the previous Update
function so I changed the name I had given it to update (following the usual C
convention of making the first character of a variable or function name lower
case). When the code is presented it
can be seen how this version of update is simpler than the previous versions.
At the beginning I had had some problems remembering the
syntax for declaring variables and function parameters and function data return
declarations since it is non-C like. It
took me a while to get used to it and then I realized it had some
characteristics of that of Ada – that is, that the variable is named first
followed by a ':' and then its type.
For instance
var index : Int = 0
val HT : Byte = 9
which in Ada would be (where case doesn't matter)
index : Integer := 0;
HT : constant Byte := 9;
where the Byte type in Ada needs to be defined as an 8 bit
unsigned integer.
Whereas in C (for instance) it is
int index = 0;
with the type given first.
Also Kotlin can have constants by using the keyword val
instead of var. While, of course, Ada,
C, etc don't have a keyword to specify when a variable is being created.
As with Java an ending ';' can be given but isn't need.
For function declarations Kotlin uses
fun name(param1 : Int, param2 : String) : Int
whereas Ada would use
function name(param1 : Integer; param2 : String)
return Integer;
or
function name(param1 : in Integer; param2 : in
String) return Integer;
where it is specified that a function can only have
parameters that input data to the function.
Ada also has
procedure name(param1 : in out Integer; param2 : in
String, param3 : out Integer);
that doesn't have a return statement but can output data via
the out keyword (with param1 in this case being both input to the procedure and
output). C function declarations, of
course, are of the form
int name(int param1);
void name(int param1);
without the keyword to specify a function and with the
return type preceding the function name and the type of the parameters
preceding the parameter name.
So, in this regard, Katlin syntax is more like Ada (and the
older and deader language Pascal) with the name first and then the type.
As can be seen in the Kotlin code, I also modified the parse
function to just build an array of 3 integers to capture the data fields of
interest (the key and the two values) and just count the number of fields
processed (including the leading don't care field) rather than use a switch
type of statement.
My Kotlin code for the application is (where Java IO is
imported to read the file)
import java.io.File
import java.io.InputStream
// Update structure to contain the different values
for each key with the
// number of instances of each combination
class KeyTable
{
// Data
tracking which value of which key has the greatest number of references
var maxKey
: Int = -1 // key with the greatest
number of references for a value
var
maxValue : Int = -1 // value with the greatest number of references
var maxSum
: Int = -1 // number of references for
key, value combination
// This
class retains the data for all possible keys with their associated
//
values. As such it simulates double
dimension arrays for the values and
// the sum of
the number of instances for a particular value. The first
//
simulated dimension is the same as that for the keys array; one index for
// each
key. The second simulated dimension is
of the various different values
// for a
particular key.
//
// The
double dimension arrays are simulated as
//
+-----------+-----------+-----------+-----------+-----------+-----------+
// |
| | | |
| |
// +-----------+-----------+-----------+-----------+-----------+-----------+
// |
|
// | elements
+--> boundary at end of first array for values of first key
// of first
// key
// Access
the double array via v + k * (first array size) since arrays begin at 0
// [0,0] computes to 0 for the single dimension
array;
// [1,0] computes to 1; assuming 30 elements in
the key array and 20 for the values
// [0,1] computes to 30 for first value of 2nd
key
// [1,1] computes to 31 for second value of 2nd
key
var keyCount
: Int = 0 // number of different
keys
var keys =
IntArray(30) // value of key - one
in each array position
var
valueCount = IntArray(30) // number of different values for a particular key
var values
= IntArray(30*20) // particular values
for each key
var sums =
IntArray(30*20) // sum of instances
of each value for each key
var
savedData = IntArray(3) // global values from call to update
fun
initialize()
{
for (k:
Int in 0..29)
{
keys[k]
= 0
valueCount[k] = 0
}
for (kv:
Int in 0..(30*20)-1)
{
values[kv] = 0
sums[kv] = 0
}
} // end
initialize
// Add key
and its values
fun
update(data : IntArray)
{
// This
function first checks if the key is new and, if so, adds it to the
// keys
array. It then does similar for the
values associated with it.
// Note:
data[0] is the key while data[1] and data[2] are the two values
// associated with the key.
savedData
= data // save data to be updated for use by updateTotals
// Check
whether the key is already in the table
var
keyIndex : Int = -1
for (k :
Int in 0..keyCount-1)
{
if
(keys[k] == data[0])
{
keyIndex = k // key already in the table
break; // exit loop
}
} // end
for loop
if
(keyIndex < 0) // key not in the table
{ // add
the key
keyIndex = keyCount
keys[keyIndex] = data[0]
keyCount++
} // end
if
// add
the values for the key
addValues(keyIndex, data[1], data[2])
} // end
update
// Add the
values or increment their sums
fun
addValues(keyIndex : Int, value1 : Int, value2 : Int)
{
// Check
whether the first value is already in the table
var
valueIndex : Int = -1
var index
: Int = keyIndex * 30 // location of any first value (0, 30, ...)
for (v :
Int in 0..valueCount[keyIndex]-1)
{
index =
index + v
if
(values[index] == value1)
{ //
value already in the table
valueIndex = index
sums[valueIndex]++ // increment its number of references
if
(value1 == value2) // 2nd value the same
{
sums[valueIndex]++ // increment again
}
updateTotals(savedData[0],value1,sums[valueIndex]) // check if new max
break;
// exit loop
}
} // end
loop
if
(valueIndex < 0) // value not yet in table
{ // add
value to the table - index points to last value checked
index =
index + valueCount[keyIndex] // next location
values[index] = value1
sums[index]
= 1
valueCount[keyIndex]++
if
(value1 == value2)
{
sums[index]++ // increment
updateTotals(savedData[0],value1,sums[index]) // check if new max
}
else
{
add2ndValue(keyIndex, value2)
}
} // end
outer if
} // end
addValues
// Add the
second value or increment its sum
fun
add2ndValue(keyIndex : Int, value2 : Int)
{
// Check
whether the second value is already in the table
var
valueIndex : Int = -1
var index
: Int = keyIndex * 30 // location of any second value (0, 30, ...)
for (v :
Int in 0..valueCount[keyIndex]-1)
{
index =
index + v
if
(values[index] == value2)
{ //
value already in the table
valueIndex = index
sums[valueIndex]++
// increment its number of references
updateTotals(savedData[0],value2,sums[valueIndex]) // check if new max
break; // exit loop
}
} // end
loop
if
(valueIndex < 0) // value not yet in table
{ // add
value to the table - index points to last value checked
index++
// index to next location
values[index] = value2
sums[index] = 1
valueCount[keyIndex]++
}
} // end
add2ndValue
fun
updateTotals(key : Int, value : Int, sum : Int)
{
if (sum
> maxSum)
{
maxKey
= key
maxValue = value
maxSum
= sum
}
} // end
updateTotals
// Report
the key and value with the most references
fun
report()
{
println("Key " + maxKey + " with value " + maxValue
+ " with maximum references of " + maxSum)
}
} // end class KeyTable
// Instantiate KeyTable class
val keyTable = KeyTable()
// Convert Byte array to Int
fun toInt(data: ByteArray, iS : Int, iE: Int): Int
{
val NINE:
Int = 57 // ASCII character for digit 9
val ZERO:
Int = 48 // ASCII character for digit 0
var index:
Int = iE // loop in reverse
var digit:
Int
var m: Int
= 1 // multiplier for shift
var number:
Int = 0 // Numeric result
while
(index >= iS)
{
if
((data[index] >= ZERO) && (data[index] <= NINE))
{
digit =
data[index] - ZERO; // convert ASCII to digit
number
= number + (m * digit);
m = m *
10;
index--;
}
}
return
number;
} // end toInt
// Parse
fun parse(buffer: ByteArray, length: Int)
{
val CR:
Byte = 13 // carriage return
val HT:
Byte = 9 // horizontal tab
val LF:
Byte = 10 // line feed
// Parse
each line of data in the buffer to obtain the three fields of
//
interest, converting those fields to integers into an array, and
// then
updating a data structure to retain the data for evaluation
// when the
complete buffer has been parsed.
// Also,
output the contents of the file buffer when not commented out.
var
nextField: Int = 0 // index of the beginning
of next field
var
startField: Int // range of indexes
of
var
numFields: Int = 0 // index into
dataFields array
var
dataFields = IntArray(3) // Integer values of the three fields of interest
var index:
Int = 0
while
(true)
{
// Parse the buffer line
if
(buffer[index] == HT) // beginning of a field
{
startField = nextField // save starting index
nextField = index + 1 // the
next byte will contain part of next field
if
(numFields > 0)
{
dataFields[numFields-1]
= toInt(buffer, startField, index - 1) //convert and store in dataFields
} //
end if numFields > 0
numFields++
} // end
if HT
if
((buffer[index] == CR) || (buffer[index] == LF))
{
if
(numFields < 4) // only do CR or LF once
{
dataFields[numFields-1] = toInt(buffer, nextField, index - 1) //convert
and store in dataFields
keyTable.update(dataFields)
numFields++
}
if
(buffer[index] == LF)
{
numFields = 0
}
} // end
else if
index++
if (index
== length)
{
break
// out of while loop
}
} // end
while
} // end of parse function
// Main entry point
fun main(args: Array<String>)
{
// Open the
file.
try
{
// Open
the inputStream.
var
inputStream: InputStream = File(
"C:\\Source\\LearnToCodeGR\\max-col-sum-by-Key.tsv").inputStream()
// Read
from this input stream into an array of bytes
var
buffer = ByteArray(8192)
var
length = inputStream.read(buffer)
keyTable.initialize() // initialize the KeyTable class
// Parse
the buffer and keep track of the needed data.
parse(
buffer, length)
// Report
the results
keyTable.report()
} // end
try
catch (e:
NoSuchFileException)
{
System.out.println("File not found" + e);
}
} // end main function
Kotlin, like Java and Python,
doesn’t seem to have a compiler that runs in a Windows window. So it has to be run in a Command Prompt
window. I, as usual, changed the folder
in the window to that where I was developing the application (c:\Source\Kotlin)
so the command was just
> kotlinc
learntocodegr.kc -include-runtime -d learntocodegr.jar
since LearnToCodeGR.kc was the
file name that I used and kotlinc.exe is the kotlin compiler. (With the path to the binary folder added to
the Windows system path.)
To run the application I used
the command
> java –jar
learntocodegr.jar
which result in the output in the Command Prompt window of
Key 3000 with value 1 with maximum references of 20
as output by the report function of the KeyTable class. So the use of Java shows up here.
Discussion of the C Application
Once again I had the "main" function open the
max-col-sum-by-Key.tsv file and read it.
However, as I have been meaning to do for a while now, instead of
reading the contents of the entire file into a buffer, I used a C function to
only read in a line at a time. That is,
until the Line Feed character and read the file as containing characters since
it is now known that the file contains lines of characters rather than, for
instance, a mix with binary numeric values.
This also has the advantage of being able to process a much larger file
since the buffer can be sized to a much smaller value.
And I used C's struct type that I've found missing in C#,
etc, etc so its use can be observed.
The object of the struct type are
struct
KeyData keydata[30]; // array of possible keys with their values
in the declaration of the KeyTable structure type and
struct KeyTable keyTable; // static table
for the keyTable variable.
Otherwise, the C application is a rework of the Kotlin
version of the application. So the two
languages can be directly compared.
The contents of the max-col-sum-by-Key.tsv file is displayed
in the main function. I've left it in
but normally it would either be removed or commented out. When the application is run there is a blank
line after each line read from the file since each line (except the last) has
its own ending new line character and then my code outputs another new line
using the \n new line character of C.
Of course, the horizontal tabs in the lines are only seen as blank space
since HTs (\t), like the LF (\n), aren't printable characters.
#include <stdio.h>
// Data about a particular key
struct KeyData {
int
key; // numeric key
int
valueCount; // number of different values associated with key
int
values[20]; // different values associated with key
int
sums[20]; // number of references to a
value
};
// Data about all the keys of the file
struct KeyTable {
int
keyCount; // number of different keys in the file
struct
KeyData keydata[30]; // array of possible keys with their values
};
struct KeyTable keyTable; // static table
// Initialize keyTable
void initialize()
{
keyTable.keyCount = 0;
for (int i
= 0; i < 30; i++)
{
keyTable.keydata[i].valueCount = 0;
}
} // end initialize
// Data to be retained as the key and its associated
value with the maximum
// number of references
int maxKey = 0;
int maxValue = 0;
int maxSum = 0;
int savedData[3];
// Report the key and value with the most references
void report()
{
printf("\n");
printf("Key %d with value %d with maximum references of %d
\n", maxKey, maxValue, maxSum );
}
void updateTotals(int key, int value, int sum)
{
if (sum
> maxSum)
{
maxKey =
key;
maxValue
= value;
maxSum =
sum;
}
} // end updateTotals
// Add the second value or increment its sum
void add2ndValue(int keyIndex, int value2)
{
// Check
whether the second value is already in the table
int
valueIndex = -1;
for (int v
= 0; v < keyTable.keydata[keyIndex].valueCount; v++)
{
if
(keyTable.keydata[keyIndex].values[v] == value2)
{ //
value already in the table
valueIndex = v;
keyTable.keydata[keyIndex].sums[valueIndex]++; // increment its number
of references
//
check if new max
updateTotals(savedData[0],value2,keyTable.keydata[keyIndex].sums[valueIndex]);
break;
// exit loop
}
} // end
loop
if
(valueIndex < 0) // value not yet in table
{ // add
value to the table
int index
= keyTable.keydata[keyIndex].valueCount;
keyTable.keydata[keyIndex].values[index] = value2;
keyTable.keydata[keyIndex].sums[index] = 1;
keyTable.keydata[keyIndex].valueCount++;
}
} // end add2ndValue
// Add the values or increment their sums
void addValues(int keyIndex, int value1, int value2)
{
// Check
whether the first value is already in the table
int
valueIndex = -1;
for (int v
= 0; v < keyTable.keydata[keyIndex].valueCount; v++)
{
if
(keyTable.keydata[keyIndex].values[v] == value1)
{ //
value already in the table
valueIndex = v;
keyTable.keydata[keyIndex].sums[v]++; // increment its number of
references
if
(value1 == value2) // 2nd value the same
{
keyTable.keydata[keyIndex].sums[v]++; // increment again
}
//
check if new max
updateTotals(savedData[0],value1,keyTable.keydata[keyIndex].sums[v]);
break;
// exit loop
}
} // end
loop
if
(valueIndex < 0) // value not yet in table
{ // add
value to the table - index points to last value checked
// index = index + valueCount[keyIndex] //
next location
int index
= keyTable.keydata[keyIndex].valueCount;
keyTable.keydata[keyIndex].values[index] = value1;
keyTable.keydata[keyIndex].sums[index] = 1;
keyTable.keydata[keyIndex].valueCount++;
if
(value1 == value2)
{
keyTable.keydata[keyIndex].sums[index]++; // increment
//
check if new max
updateTotals(savedData[0],value1,keyTable.keydata[keyIndex].sums[index]);
}
else
{
add2ndValue(keyIndex, value2);
}
} // end
outer if
} // end addValues
// Update keyTable with data from a parsed line
void update(int* data)
{
// This
function first checks if the key is new and, if so, adds it to the
// keys
array. It then does similar for the
values associated with it.
// Note:
data[0] is the key while data[1] and data[2] are the two values
// associated with the key.
// save
data to be updated for use by updateTotals since needed by multiple functions
savedData[0] = data[0];
savedData[1] = data[1];
savedData[2] = data[2];
// Check
whether the key is already in the table
int
keyIndex = -1;
for (int k
= 0; k < keyTable.keyCount; k++)
{
if
(keyTable.keydata[k].key == data[0])
{
keyIndex = k; // key already in the table
break;
// exit loop
}
} // end
for loop
if
(keyIndex < 0) // key not in the table
{ // add
the key
keyIndex
= keyTable.keyCount;
keyTable.keydata[keyIndex].key
= data[0];
keyTable.keyCount++;
} // end if
// add the
values for the key
addValues(keyIndex, savedData[1], savedData[2]); //data[1], data[2]);
} // end update
// Convert character array to int
int toInt(char* data, int iS, int iE)
{
#define
NINE '9' //57 // ASCII character for digit 9
#define
ZERO '0' //48 // ASCII character for digit 0
int index =
iE; // loop in reverse
int digit;
int m =
1; // multiplier for shift
int number
= 0; // Numeric result
while
(index >= iS)
{
if
((data[index] >= ZERO) && (data[index] <= NINE))
{
digit =
data[index] - ZERO; // convert ASCII to digit
number
= number + (m * digit);
m = m *
10;
index--;
}
}
return
number;
} // end toInt
// Extract the fields of interest from the file's
line and retain the data
void parse(int count, char* data)
{
#define CR
'\r' // carriage return
#define HT
'\t' // horizontal tab
#define LF
'\n' // line feed
#define
TRUE 1
// Parse
each line of data in the buffer to obtain the three fields of
//
interest, converting those fields to integers into an array, and
// then
updating a data structure to retain the data for evaluation
// when the
complete buffer has been parsed.
int
nextField = 0; // index of the
beginning of next field
int
startField; // range of indexes of
int
numFields = 0; // index into dataFields
array
int
dataFields[3]; // Integer values of the
three fields of interest
int index =
0;
while
(TRUE)
{
// Parse
the buffer line
if
(data[index] == HT) // beginning of a field
{
startField = nextField; // save starting index
nextField = index + 1; // the
next byte will contain part of next field
if
(numFields > 0)
{ //convert and store in dataFields
dataFields[numFields-1] = toInt(data, startField, index - 1);
} //
end if numFields > 0
numFields++;
} // end
if HT
if
((data[index] == CR) || (data[index] == LF))
{
if
(numFields < 4) // only do CR or LF once
{
//convert and store in dataFields
dataFields[numFields-1] = toInt(data, nextField, index - 1);
//
save the data to determine the key, value combination with
//
maximum number of references
update(dataFields);
return; // finished with the line in the
}
} // end
if
index++;
if (index
== count)
{
break;
// out of while loop
}
} // end
while
} // end parse
void main(argc, argv)
int argc;
char* argv[];
{
FILE* file;
char
buffer[30];
file =
fopen("C:\\Source\\LearnToCodeGR\\max-col-sum-by-Key.tsv",
"r");
int
currPos;
int lastPos
= 0;
int length;
initialize();
if(file ==
NULL)
{
perror("Error opening file");
}
else
{ // Read
file records until end-of-file
while
(!feof(file))
{
fgets(buffer, 30, (FILE*)file);
currPos
= ftell(file);
length
= currPos - lastPos; // number of characters read into buffer
lastPos
= currPos;
for
(int i=0; i < length; i++)
printf("%c ", buffer[i]);
printf("\n");
//
Parse the line and update to retain all the necessary data.
parse(length, buffer);
}
fclose(file);
// Report
the results.
report();
}
} // end main
No comments:
Post a Comment