A while back (last summer) a coding group member recommended
Go as a new language to learn. Since I have now
run out of things to do on the Expense It application, I decided to
start looking into Go.
At the end of July I had downloaded version 1.10.3, and
recently I downloaded 1.11.2, which seems to be the latest version, to a different PC.
So these comments will mainly
be about 1.10.3 but could also apply to 1.11.2 since it seems to have the same
problems.
The comments that I make here will be about writing an
application in the Go language, as well as a lesson learned about the Learn to
Code GR application that was the subject of a number of posts from September 2017 through February 2018,
mostly under titles such as Max Column Sum by Key.
Comments concerning Go
First off, upon starting to use the Go compiler the second
of this month (it is now the twelfth) I tried the hello.go project as
instructed in How to Write Go Code as found online (golang.org/doc/code.html). This site provides the code example
package main
import "fmt"
func main() {
	fmt.Printf("Hello, world.\n")
}
Except that it wouldn't build using the 1.10.3 version of
the compiler on a Windows PC.
I guess I should backtrack a bit. Like many of the compilers that I used for the Learn to Code GR
Max Column Sum by Key posts, the Go compiler is not visual. So an editor, such as UltraEdit, has to be used and then the
compiler has to be run from a DOS command window the old-fashioned way. (My preferred way is a visual compiler such
as those of GNAT or Microsoft C#, where code can be entered in a window of
the compiler and then compiled, with errors and warnings identified and located in
the file being compiled, and where a clean
compile can then be run via the debugger of the visual compiler.)
Go didn't provide any of that. First the hello.go file had to be formatted as a MAC file – that
is, with only Line Feed characters at the end of lines. Luckily this conversion is no big deal with
UltraEdit and once the file is converted it can remain in that format.
I then got the error
C:\Go\src>go build hello.go
# command-line-arguments
.\hello.go:4:30: newline in string
.\hello.go:4:30: syntax error: unexpected newline, expecting comma or )
And I noticed that line 4, column 30 was just past the end of the
fmt.Println("Hello, world.)
line. That was when I switched to MAC formatting. Then I got
C:\Go\src>go build hello.go
can't load package: package main:
hello.go:1:14: expected ';', found 'import'
How to Write Go Code had no ';' in the code that it
specified as the sample to try. Checking
further Wikipedia had that semi-colons still terminate lines but are
implicit. I added one anyway to see
what happens. Then I got
C:\Go\src>go build hello.go
# command-line-arguments
.\hello.go:1:28: syntax error: unexpected func, expecting semicolon or newline
.\hello.go:1:53: string not terminated
This result suggested that the separate lines of the
hello.go file were all strung together as if they were all on line 1. Not very helpful had the file been of
greater length. However, I added a
semi-colon at the end of the
import "fmt"
line and got
C:\Go\src>go build hello.go
# command-line-arguments
.\hello.go:1:54: string not terminated
.\hello.go:1:74: syntax error: unexpected EOF, expecting comma or )
so I put all the code on one line and saw that I had a
missing closing double quote after "world.\n" in the line
package main; import "fmt"; func main() { fmt.Printf("Hello, world.\n); }
from entering the example into UltraEdit. So I added it to complete the Printf
statement.
It then compiled. So
I had managed to find my coding error in spite of the inadequate error
output. And I had also found that I needed
semi-colons where Go wasn't supposed to need them. Note that by the time I finished, the code was
package main;
import "fmt";
func main() {
	fmt.Printf("Hello, world.\n");
}
with semi-colons after the first, second, and fourth
lines.
To double check for this post I removed them again and got
the error message
can't load package: package main:
hello.go:1:14: expected ';', found 'import'
The same is true if the file is all on one line
package main import "fmt" func main() { fmt.Printf("Hello, world.\n") }
where position 14 is the 'i' of import. That is, assuming that position 1 is the 'p'
of package.
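A report at position 14 of line 1 is what one would expect if the compiler sees the whole file as a single line. The Go specification defines a newline as the line feed character (U+000A) only, so a file whose lines end in carriage returns alone would be scanned that way. Whether that is what happened with this hello.go is an assumption, not something the build output confirms. A minimal sketch for checking which line endings a file contains follows; the type and function names are hypothetical, and the sample bytes merely stand in for a real file.

```go
package main

import (
	"bytes"
	"fmt"
)

// lineEndingCounts holds how many of each line-ending style were found.
// The type and function names are hypothetical, for illustration only.
type lineEndingCounts struct {
	CRLF int // Windows/DOS style: "\r\n"
	LF   int // Unix style: a lone "\n"
	CR   int // classic Mac style: a lone "\r"
}

// countLineEndings tallies the line-ending styles found in data.
func countLineEndings(data []byte) lineEndingCounts {
	crlf := bytes.Count(data, []byte("\r\n"))
	return lineEndingCounts{
		CRLF: crlf,
		LF:   bytes.Count(data, []byte("\n")) - crlf,
		CR:   bytes.Count(data, []byte("\r")) - crlf,
	}
}

func main() {
	// Sample contents standing in for hello.go; a real check would
	// read the file from disk instead.
	src := []byte("package main\rimport \"fmt\"\rfunc main() {\r}\r")
	fmt.Printf("%+v\n", countLineEndings(src))
}
```

If the CR count is non-zero while the LF count is zero, the compiler has no line feeds to break the source into lines.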
Note: The size of the
hello.go text file is 74 bytes in Windows while hello.exe is 2,058,752
bytes. Quite a difference.
Before continuing I captured the contents of a number of Go
/ Golang internet sites into Word documents and read them. I also tried their Rectangle struct example,
where I discovered the need to also put semi-colons after all the trailing }
brackets (except the last).
I then began writing a somewhat larger package to redo the
Learn to Code GR application in Go.
This resulted in harder-to-locate errors due to the inadequate
identification of their location.
Not only were the errors all reported as being on line 1, but many
didn't identify the reason that the code was in error or provide any position
whatsoever (other than line 1). This
would be a significant handicap in writing an application of any size and was
problematic even with such a small application.
I found that semi-colons were needed not only in the
expected places but also following the trailing } brackets, except for the very
last closing bracket.
Other differences I found writing the Learn to Code GR
package are
- // comments couldn't be used. Instead they had to be /* */ comments.
- The construct
	import (
		"bufio"
		"fmt"
		"io"
		"io/ioutil"
		"os"
	);
  that an example gave didn't work. Separate import lines with trailing semi-colons were needed.
- Variables had to have a leading var keyword and the type had to be supplied even though the Go documentation says that Go will recognize what the type has to be from its use.
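For reference, the forms that the Go documentation describes look like the sketch below: line comments, a grouped import, and short variable declarations with inferred types. It assumes a source file whose lines end in a line feed, and the helper name describe is made up for the illustration. It has not been tried against the 1.10.3 setup described here.

```go
// Line comments, a grouped import, and short variable declarations,
// written the way the Go documentation describes them.
package main

import (
	"fmt"
	"strings"
)

// describe is a hypothetical helper used only for this illustration.
func describe(words []string) string {
	joined := strings.Join(words, ", ") // type string is inferred
	count := len(words)                 // type int is inferred
	return fmt.Sprintf("%d words: %s", count, joined)
}

func main() {
	fmt.Println(describe([]string{"bufio", "fmt", "io"}))
}
```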
Comments concerning Max Column Sum by Key
After reading the web sites and doing the two simple
examples, I began the new version of Max Column Sum by Key on Nov 8 (after
starting to look into Go on the 2nd).
I found an example for reading a text file (this Go package
needs to read the max-col-sum-by-key.tsv file) at https://gobyexample.com. Using this example, the entire file was read
into a data buffer and could be displayed via
/* Open and read the entire file into data */
data, err := ioutil.ReadFile("C:/Source/LearnToCodeGR-Ada/max-col-sum-by-key.tsv");
check(err);
fmt.Print(string(data));
fmt.Print("\n");
so that use of Go went quickly. Note: check is a function to determine if an error occurred.
Like in past implementations I then invoked a parse function
to obtain the key and values from the file so that the key with the associated
value with the most references could be found and reported.
At this point I found the helpful Scanner type of Go's
go/scanner package, where
var s scanner.Scanner;
fset := token.NewFileSet();
file := fset.AddFile("", fset.Base(), len(data));
s.Init(file, data, nil, scanner.ScanComments);
initializes the scanner, via the Init function, for the data passed to my parse
function. Note:
public functions have a leading upper case letter in Go while private functions
have a lower case letter.
Then I added a for loop to scan each line of the data from
the file. That is, each line is
terminated by a new line character. Each token of a line is obtained in the loop via
position, tokenFound, literal := s.Scan();
if tokenFound == token.EOF {
	if position > 0 { /* just to use position to avoid error for non-use */
	};
	break;
};
where the Scan function returns three values, so three variables have to be
declared to receive them.
This is where a rule of Go comes into play. While debugging via Print output I had used
all three variables to check what was happening. But when I was satisfied with my code I no longer needed the
position of the token and its literal. So I stopped referencing position in my
debug output. This resulted in a Go
error because Go doesn't allow a variable to be declared and not used. This is the reason for the dummy use of the
position variable in
if position > 0 { /* just to use position to avoid error for non-use */
above. Although I
would think that, since the Scan function requires that all three variables be
declared (I did try having s.Scan() return only tokenFound and literal,
and it failed to compile), the assignment from Scan could be interpreted as
satisfying the Go requirement.
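Go also provides the blank identifier, written as an underscore, for a returned value that is not needed. It can take the place of a variable on the left side of an assignment and is never reported as unused. The sketch below uses it to discard the position and the literal. The countInts helper and the sample record are hypothetical and only illustrate the idea.

```go
package main

import (
	"fmt"
	"go/scanner"
	"go/token"
)

// countInts is a hypothetical helper that counts the INT tokens in src.
// Values returned by Scan that are not needed are assigned to the
// blank identifier, so no dummy use of a variable is required.
func countInts(src []byte) int {
	var s scanner.Scanner
	fset := token.NewFileSet()
	file := fset.AddFile("", fset.Base(), len(src))
	s.Init(file, src, nil, 0)

	n := 0
	for {
		_, tok, _ := s.Scan() // position and literal are discarded
		if tok == token.EOF {
			break
		}
		if tok == token.INT {
			n++
		}
	}
	return n
}

func main() {
	// A simplified, made-up record: three tab-separated integers.
	fmt.Println(countInts([]byte("3000\t1\t1\n")))
}
```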
While I still had the debugging output a portion of the
sample results were
%!s(token.Pos=400) ; "\n"
%!s(token.Pos=401) INT "0"
%!s(token.Pos=402) , ""
%!s(token.Pos=403) INT "912"
%!s(token.Pos=406) IDENT "_NUM"
%!s(token.Pos=411) INT "3000"
%!s(token.Pos=416) INT "1"
%!s(token.Pos=418) INT "1"
%!s(token.Pos=420) ; "\n"
where the positions in the data are 400, 401, etc; the
tokens are ';', ',', INT, and IDENT; and the literals are strings of \n, 0, an
empty string, 912, _NUM, 3000, and 1.
These results are from the file record
0,912_NUM	3000	1	1
where there is a \t (horizontal tab) character after
"0,912_NUM", "3000", and the first "1" and a \n
(new line) after the second "1".
From this output I could tell that the literals that I
needed were those of the last three INT tokens of each line/file record. It's a mystery to me why \n was returned as
a token but the three instances of \t
were not. But since the INT tokens were
associated with the needed items the Go Scanner is quite usable and limits what
has to be done.
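The difference between \n and \t follows from the go/scanner package being a scanner for Go source text. Tabs and spaces are white space and are skipped, while a newline that follows a token such as an integer literal is reported as an automatically inserted semicolon whose literal is "\n", as the documentation of Scan describes. The sketch below lists the tokens for a simplified, made-up record; the tokens helper is hypothetical.

```go
package main

import (
	"fmt"
	"go/scanner"
	"go/token"
)

// tokens is a hypothetical helper that returns every token and literal
// the Go scanner reports for src, one formatted string per token.
func tokens(src []byte) []string {
	var s scanner.Scanner
	fset := token.NewFileSet()
	file := fset.AddFile("", fset.Base(), len(src))
	s.Init(file, src, nil, 0)

	var out []string
	for {
		_, tok, lit := s.Scan()
		if tok == token.EOF {
			break
		}
		out = append(out, fmt.Sprintf("%s %q", tok, lit))
	}
	return out
}

func main() {
	// Three tab-separated integers followed by a newline. The tabs
	// produce no tokens; the newline produces a ";" token.
	for _, t := range tokens([]byte("3000\t1\t1\n")) {
		fmt.Println(t)
	}
}
```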
Therefore, in the for loop, I counted the number of times
the token was INT. If it was the third INT
of the line, the associated literal was captured as the key; if the fourth, it
was captured as the first value of the key; and if the fifth, as the
second value of the key.
Prior to this I had, following previous Learn to Code GR
examples in other languages, created a struct of
type KeyValues struct {
	key string;
	count1 int;
	value2 string;
	count2 int;
};
and, in preparing to hold all the results,
type keys [30]KeyValues;
type Keys struct {
	count int;
	keys;
};
var items Keys;
While beginning to write the captureKeyAndValue function to
keep track of the results I suddenly had an epiphany. I was again adding the extra code to
determine whether the second value was the same as or different from the first,
along with the extra code necessary to handle each such case and whether another
array entry was going to need to be added.
And it suddenly occurred to me that all I had to do was call the
function twice: once for the key and the first value and again for the key and
the second value. A neat solution that
hadn't occurred to me when I was doing one language example after another back
at the end of 2017 and the beginning of this year.
At that time I was on many occasions illustrating the use of
a class (which Go doesn't have) as well as the language constructs of yet another
language, so I was not really looking for a better way but rather for how the same
concept could be written in that language.
But now, about a year later, the fact that the extra code wasn't
necessary suddenly occurred to me. An
example, I suppose, of the importance of code reviews.
Hence the first struct is reduced to
type KeyValues struct {
	key string;
	value string;
	count int;
};
and, with the help of the scanner functions, the parse
function is reduced to
/* This function scans each line and finds the key at the 3rd INT token.
   There are 5 INT tokens in all per line where the line ends in \n. It
   then finds the two values for the key in the next two INT tokens. */
func parse(data []byte) {
	var s scanner.Scanner;
	fset := token.NewFileSet();
	file := fset.AddFile("", fset.Base(), len(data));
	s.Init(file, data, nil, scanner.ScanComments);
	var intCount int = 0;
	var key string = "";
	var value1 string = "";
	var value2 string = "";
	for {
		position, tokenFound, literal := s.Scan();
		if tokenFound == token.EOF {
			if position > 0 { /* just to use position to avoid error for non-use */
			};
			break;
		};
		if tokenFound == token.INT {
			intCount++; /* count INT tokens found */
		};
		/* capture literals found for 3rd, 4th, and 5th INT tokens */
		if intCount == 3 {
			key = literal;
		};
		if intCount == 4 {
			value1 = literal;
		};
		if intCount == 5 {
			value2 = literal;
			captureKeyAndValue(key, value1); /* capture the two */
			captureKeyAndValue(key, value2); /* key, value pairs */
			intCount = 0;
			key = "";
			value1 = "";
			literal = "";
		};
		if literal == "\n" { /* end of the line */
		}
	};
}; /* end parse */
Of course, clearing key, value1, and literal isn't really
necessary since they will be captured again the next time through the loop.
With this change, captureKeyAndValue is reduced to
/* Capture the key and value pair as data is parsed */
func captureKeyAndValue(key string, value string) {
	/* Capture initial entry */
	if items.count == 0 {
		items.keys[items.count].key = key;
		items.keys[items.count].value = value;
		items.keys[items.count].count = 1;
		items.count++;
		return;
	};
	/* Capture item with duplicate entry */
	for i := 0; i < items.count; i++ {
		if items.keys[i].key == key { /* then key already captured */
			if items.keys[i].value == value {
				items.keys[i].count++;
				return;
			};
		};
	};
	/* Capture new key, value pair */
	if items.count < 30 {
		items.keys[items.count].key = key;
		items.keys[items.count].value = value;
		items.keys[items.count].count = 1;
		items.count++;
	};
}; /* end captureKeyAndValue */
Note: The above illustrates one advantage of Go that is
actually met. That is, unlike C and C#
the function doesn't need to declare void as the return value when no value is
to be returned.
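As a possible alternative to searching a fixed array of 30 entries, Go's built-in map type can count each key, value pair directly. This is only a sketch of that idea, not the approach used in this post: the pair type and the capture function are hypothetical names, and the sample data is made up rather than taken from max-col-sum-by-key.tsv.

```go
package main

import "fmt"

// pair is a hypothetical map key combining a key with one of its values.
type pair struct {
	key   string
	value string
}

// capture records one occurrence of the key, value pair in counts.
// A missing entry starts at zero, so no separate "first entry" or
// "duplicate entry" cases are needed.
func capture(counts map[pair]int, key, value string) {
	counts[pair{key, value}]++
}

func main() {
	counts := map[pair]int{}
	// Made-up sample data for illustration.
	capture(counts, "3000", "1")
	capture(counts, "3000", "1")
	capture(counts, "4000", "2")
	fmt.Println(counts[pair{"3000", "1"}], counts[pair{"4000", "2"}])
}
```

A map also removes the limit of 30 entries, at the cost of no longer keeping the pairs in the order in which they were first seen.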
Thus the entire code is
package main;
import "fmt";
import "go/scanner";
import "go/token";
import "io/ioutil";
func check(e error) {
	if e != nil { panic(e) };
};
/* Struct to save parsed data */
type KeyValues struct {
	key string;
	value string;
	count int;
};
/* Array and struct to save parsed data */
type keys [30]KeyValues;
type Keys struct {
	count int;
	keys;
};
/* Saved data */
var items Keys;
/* Capture the key and value pair as data is parsed */
func captureKeyAndValue(key string, value string) {
	/* Capture initial entry */
	if items.count == 0 {
		items.keys[items.count].key = key;
		items.keys[items.count].value = value;
		items.keys[items.count].count = 1;
		items.count++;
		return;
	};
	/* Capture item with duplicate entry */
	for i := 0; i < items.count; i++ {
		if items.keys[i].key == key { /* then key already captured */
			if items.keys[i].value == value {
				items.keys[i].count++;
				return;
			};
		};
	};
	/* Capture new key, value pair */
	if items.count < 30 {
		items.keys[items.count].key = key;
		items.keys[items.count].value = value;
		items.keys[items.count].count = 1;
		items.count++;
	};
}; /* end captureKeyAndValue */
/* This function scans each line and finds the key at the 3rd INT token.
   There are 5 INT tokens in all per line where the line ends in \n. It
   then finds the two values for the key in the next two INT tokens. */
func parse(data []byte) {
	var s scanner.Scanner;
	fset := token.NewFileSet();
	file := fset.AddFile("", fset.Base(), len(data));
	s.Init(file, data, nil, scanner.ScanComments);
	var intCount int = 0;
	var key string = "";
	var value1 string = "";
	var value2 string = "";
	for {
		position, tokenFound, literal := s.Scan();
		if tokenFound == token.EOF {
			if position > 0 { /* just to use position to avoid error for non-use */
			};
			break;
		};
		if tokenFound == token.INT {
			intCount++; /* count INT tokens found */
		};
		/* capture literals found for 3rd, 4th, and 5th INT tokens */
		if intCount == 3 {
			key = literal;
		};
		if intCount == 4 {
			value1 = literal;
		};
		if intCount == 5 {
			value2 = literal;
			captureKeyAndValue(key, value1); /* capture the two */
			captureKeyAndValue(key, value2); /* key, value pairs */
			intCount = 0;
			key = "";
			value1 = "";
			literal = "";
		};
		if literal == "\n" { /* end of the line */
		}
	};
}; /* end parse */
/* main entry point */
func main() {
	/* Open and read the entire file into data */
	data, err := ioutil.ReadFile("C:/Source/LearnToCodeGR-Ada/max-col-sum-by-key.tsv");
	check(err);
	fmt.Print(string(data));
	fmt.Print("\n");
	/* Parse the data read from the file */
	parse(data);
	/* Display parsed results */
	fmt.Printf("Table values count %d\n", items.count);
	for i := 0; i < items.count; i++ {
		fmt.Printf("%d\t%s\t%s\t%d\n", i,items.keys[i].key,items.keys[i].value,items.keys[i].count);
	};
	/* Determine key value combination with largest count and display */
	var max KeyValues;
	max.key = "";
	max.value = "";
	max.count = 0; /* how to initialize in the declaration? */
	for i := 0; i < items.count; i++ {
		if items.keys[i].count > max.count {
			max = items.keys[i];
		};
	};
	fmt.Printf("Key and Value of maximum references: %s\t%s\t%d\n",max.key,max.value,max.count);
} /* end main */
with the output
Table values count 13
0	1000	1	13
1	1000	2	7
2	2000	1	16
3	2000	2	4
4	3000	1	20
5	4000	1	15
6	4000	3	2
7	4000	2	2
8	4000	4	1
9	5000	2	8
10	5000	1	7
11	5000	3	4
12	5000	4	1
Key and Value of maximum references: 3000	1	20
This code is simpler than before, but the past applications
could have been simpler as well had they used the improvement of separate function
invocations for each of the key, value pairs.
Each, however, would have needed something to take the place of the Go
scanner functions.
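On the question in the listing of how to initialize a struct in its declaration: Go has composite literals, which set the fields where the variable is declared. A variable declared without an initializer also starts at its zero value (empty strings and 0 here), so the three separate assignments are not strictly required. The sketch below assumes the documented syntax in a source file with line feed line endings; the maxByCount helper is hypothetical, and the three sample rows are taken from the output above.

```go
package main

import "fmt"

// KeyValues mirrors the struct used in the listing above.
type KeyValues struct {
	key   string
	value string
	count int
}

// maxByCount is a hypothetical helper that returns the entry with the
// largest count, or the zero value if items is empty.
func maxByCount(items []KeyValues) KeyValues {
	// Composite literal: fields are set in the declaration itself.
	// Here it is equivalent to the zero value from "var best KeyValues".
	best := KeyValues{key: "", value: "", count: 0}
	for _, item := range items {
		if item.count > best.count {
			best = item
		}
	}
	return best
}

func main() {
	items := []KeyValues{
		{key: "1000", value: "1", count: 13},
		{key: "3000", value: "1", count: 20},
		{key: "2000", value: "1", count: 16},
	}
	best := maxByCount(items)
	fmt.Printf("%s\t%s\t%d\n", best.key, best.value, best.count)
}
```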
What next?
I was going to do some larger application in Go to learn it
further. But, what with the compiler
not implementing what the documentation claims and the difficulty in
determining where an error actually is located or even what it is, it seems to
me that it would be too early to do so.
Of course, I could download a Linux version and see if it is closer to
the documentation. Otherwise, what to
do? What to do?