patternpythonModerate
Checking for any character common to two strings — Go is 50× slower than Python
Viewed 0 times
anycheckingcharacterslowerthantwoforpythonstringscommon
Problem
I decided to learn Go, knowing Python. My approach is by solving easy problems like this one:
You are given two strings, A and B. Find if there is a substring that
appears in both A and B. Several test cases will be given to you in a
single file. The first line of the input will contain a single integer
T, the number of test cases.
Then there will be T descriptions of the test cases. Each description
contains two lines. The first line contains the string A and the
second line contains the string B.
For each test case, display YES (in a newline), if there is a common substring. Otherwise, display NO.
The problem is straightforward: check if two sets intersect.
The real problem is that for the biggest testcase (20 strings of length 105 each) my Go code takes 2.6 seconds, which is unacceptable, knowing that my Python code solves the problem in 0.08 seconds. Based on the fact that Go is claimed to be near C speed, I assume that I am doing something super inefficiently.
Python implementation:
Nonetheless, two snippets look really different, they are actually doing similar things. In
You are given two strings, A and B. Find if there is a substring that
appears in both A and B. Several test cases will be given to you in a
single file. The first line of the input will contain a single integer
T, the number of test cases.
Then there will be T descriptions of the test cases. Each description
contains two lines. The first line contains the string A and the
second line contains the string B.
For each test case, display YES (in a newline), if there is a common substring. Otherwise, display NO.
The problem is straightforward: check if two sets intersect.
The real problem is that for the biggest testcase (20 strings of length 105 each) my Go code takes 2.6 seconds, which is unacceptable, knowing that my Python code solves the problem in 0.08 seconds. Based on the fact that Go is claimed to be near C speed, I assume that I am doing something super inefficiently.
package main
import "fmt"
func subst(s1 string, s2 string) bool {
m := map[uint8]bool{}
for i := 0; i < len(s1); i++ {
m[s1[i]] = true
}
for i := 0; i < len(s2); i++ {
if m[s2[i]] {
return true
}
}
return false
}
func main() {
var n int
var s1, s2 string
fmt.Scanf("%d", &n)
for i := 0; i < n; i++ {
fmt.Scanf("%s", &s1)
fmt.Scanf("%s", &s2)
if subst(s1, s2) {
fmt.Println("YES")
} else {
fmt.Println("NO")
}
}
}Python implementation:
def twoStrings(a, b):
if len(set(a) & set(b)):
print 'YES'
else:
print 'NO'
for i in xrange(input()):
twoStrings(raw_input(), raw_input())Nonetheless, two snippets look really different, they are actually doing similar things. In
go I create a hash-map and put boolean value for each character from the first string (this is how sets are implemented in a lot Solution
This entire answer refers to my rewrite of the asker's code on the Go Playground and is mostly a summary of our discussion on the chat.so Go/Golang room. For reference, I'm reproducing the code below:
There are two major changes here to get this to the point that it at least beats Python in my own test runs on HackerRank:
-
Replace fmt.Scanf with just reading and parsing where necessary. The only input that requires actual parsing is the first integer, so there's no need for fmt.Scanf beyond that. Even initially we don't need it, however, because we're really only interested in parsing the first line. As such, I just replaced the entire thing with a buffered reader and read up to a delimiter, assuming that all lines end with
This chops off a reasonable amount of time in reading because it avoids parsing a format string and then modifying the input read to conform to the format string, as well as possible time spent in reflection depending on the data. I don't think it'll actually hit the reflect package here, but I haven't looked at the fmt.Fscanf implementation, so take that with a grain of salt.
-
Use a limited bitset (i.e., just one integer) instead of
The
This results in pretty simple code, an acceptable interface for working with a bitset that makes changes to the implementation easier, and it's much faster than allocating memory for a map, creating new cells in the map, and lookups in the map. (I think it's likely that some improvement would be seen by giving the map enough initial capacity for the cardinality of its key, but I doubt if it'd be significant — haven't tested it, so that's something to look into.) These are all fast enough for day to day things, but if we're trying to beat the time on some program, data structures are low-hanging fruit.
With those changes, I get, at worst, 0.01s on HackerRank for test #4 and #5 (possibly both — I'm guessing it's rounding at this point) and 0s for all other tests. This is also fairly easy to benchmark using the testing package, but I don't have a way to compare those results with Python's, so I consider HackerRank's authoritative for now.
package main
import (
"bufio"
"io"
"os"
"strconv"
)
// charSet is a limited bitset to contain all lowercase latin characters (a-z).
type charSet struct {
bits uint32
}
// Set switches a bit to 1 for any given character i that is in the range 'a'
// to 'z'. Characters outside this range have undefined behavior.
func (c *charSet) Set(i uint8) {
c.bits |= 1 1 {
if n, err = strconv.Atoi(l[:len(l)-1]); err != nil {
panic(err)
}
} else if err != nil {
panic(err)
}
for i := 0; i 0 && len(s2) > 0 && subst(s1[:len(s1)-1], s2[:len(s2)-1]) {
io.WriteString(os.Stdout, "YES\n")
} else {
io.WriteString(os.Stdout, "NO\n")
}
}
}There are two major changes here to get this to the point that it at least beats Python in my own test runs on HackerRank:
-
Replace fmt.Scanf with just reading and parsing where necessary. The only input that requires actual parsing is the first integer, so there's no need for fmt.Scanf beyond that. Even initially we don't need it, however, because we're really only interested in parsing the first line. As such, I just replaced the entire thing with a buffered reader and read up to a delimiter, assuming that all lines end with
'\n' (they should for this case). After that, I just run the first line through strconv.Atoi.This chops off a reasonable amount of time in reading because it avoids parsing a format string and then modifying the input read to conform to the format string, as well as possible time spent in reflection depending on the data. I don't think it'll actually hit the reflect package here, but I haven't looked at the fmt.Fscanf implementation, so take that with a grain of salt.
-
Use a limited bitset (i.e., just one integer) instead of
map[uint8]bool. This is more obvious as to why it's faster: there's no need to allocate more than four bytes on the stack to use an integer, and because the problem constrains the input strings to lowercase characters in the range of 'a' to 'z', we know it'll fit in the bounds of a 32-bit integer.The
charSet type and set function (which is just me being lazy and sort of mimicking the Python code) covers the entire implementation of this. All it does is, given a byte, update the underlying bits integer and test for an intersection using a bitwise and. To get a complete bitset for a string, or byte slice, just iterate over its length and set those bits.This results in pretty simple code, an acceptable interface for working with a bitset that makes changes to the implementation easier, and it's much faster than allocating memory for a map, creating new cells in the map, and lookups in the map. (I think it's likely that some improvement would be seen by giving the map enough initial capacity for the cardinality of its key, but I doubt if it'd be significant — haven't tested it, so that's something to look into.) These are all fast enough for day to day things, but if we're trying to beat the time on some program, data structures are low-hanging fruit.
With those changes, I get, at worst, 0.01s on HackerRank for test #4 and #5 (possibly both — I'm guessing it's rounding at this point) and 0s for all other tests. This is also fairly easy to benchmark using the testing package, but I don't have a way to compare those results with Python's, so I consider HackerRank's authoritative for now.
Code Snippets
package main
import (
"bufio"
"io"
"os"
"strconv"
)
// charSet is a limited bitset to contain all lowercase latin characters (a-z).
type charSet struct {
bits uint32
}
// Set switches a bit to 1 for any given character i that is in the range 'a'
// to 'z'. Characters outside this range have undefined behavior.
func (c *charSet) Set(i uint8) {
c.bits |= 1 << (i - 'a')
}
// Intersects returns whether two charSets intersect (i.e., share one or more
// on bits).
func (c charSet) Intersects(o charSet) bool {
return c.bits&o.bits != 0
}
// set returns a charSet for all bytes in s. Bytes in s must be in the range of
// 'a' to 'z'. Anything outside that range is regarded as undefined behavior.
func set(s []byte) charSet {
var c charSet
for i := 0; i < len(s); i++ {
c.Set(s[i])
}
return c
}
// subst returns whether two strings share any characters. l and r are assumed
// to only contain characters in the range of 'a' to 'z'.
func subst(l, r []byte) bool {
return set(l).Intersects(set(r))
}
func main() {
var n int
r := bufio.NewReader(os.Stdin)
if l, err := r.ReadString('\n'); err == nil && len(l) > 1 {
if n, err = strconv.Atoi(l[:len(l)-1]); err != nil {
panic(err)
}
} else if err != nil {
panic(err)
}
for i := 0; i < n; i++ {
var s1, s2 []byte
var err error
s1, err = r.ReadBytes('\n')
if err == nil {
s2, err = r.ReadBytes('\n')
}
if err != nil && err != io.EOF {
panic(err)
}
if len(s1) > 0 && len(s2) > 0 && subst(s1[:len(s1)-1], s2[:len(s2)-1]) {
io.WriteString(os.Stdout, "YES\n")
} else {
io.WriteString(os.Stdout, "NO\n")
}
}
}Context
StackExchange Code Review Q#88307, answer score: 12
Revisions (0)
No revisions yet.