Why is reading lines from stdin much slower in C++ than Python?
Problem
I wanted to compare reading lines of string input from stdin using Python and C++ and was shocked to see my C++ code run an order of magnitude slower than the equivalent Python code. Since my C++ is rusty and I'm not yet an expert Pythonista, please tell me if I'm doing something wrong or if I'm misunderstanding something.
(TLDR answer: include the statement cin.sync_with_stdio(false), or just use fgets instead.
TLDR results: scroll all the way down to the bottom of my question and look at the table.)
C++ code:
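    #include <iostream>
    #include <string>
    #include <ctime>

    using namespace std;

    int main() {
        string input_line;
        long line_count = 0;
        time_t start = time(NULL);
        int sec;
        int lps;

        // Read until the stream fails; skip the final failed read at EOF.
        while (cin) {
            getline(cin, input_line);
            if (!cin.eof())
                line_count++;
        };

        sec = (int) time(NULL) - start;
        cerr << "Read " << line_count << " lines in " << sec << " seconds.";
        if (sec > 0) {
            lps = line_count / sec;
            cerr << " LPS: " << lps << endl;
        } else
            cerr << endl;
        return 0;
    }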
Python Equivalent:
    #!/usr/bin/env python
    import time
    import sys

    count = 0
    start = time.time()

    for line in sys.stdin:
        count += 1

    # Use the same name the timer was stored under (start, not start_time).
    delta_sec = int(time.time() - start)
    # Guard against division by zero when the run takes under a second.
    if delta_sec > 0:
        lines_per_sec = int(round(count/delta_sec))
        print("Read {0} lines in {1} seconds. LPS: {2}".format(count, delta_sec,
              lines_per_sec))
Here are my results:
$ cat test_lines | ./readline_test_cpp
Read 5570000 lines in 9 seconds. LPS: 618889
$ cat test_lines | ./readline_test.py
Read 5570000 lines in 1 seconds. LPS: 5570000
I should note that I tried this both under Mac OS X v10.6.8 (Snow Leopard) and Linux 2.6.32 (Red Hat Linux 6.2). The former is a MacBook Pro, and the latter is a very beefy server, not that this is too pertinent.
$ for i in {1..5}; do echo "Test run $i at `date`"; echo -n "CPP:"; cat test_lines | ./readline_test_cpp ; echo -n "Python:"; cat test_lines | ./readline_test.py ; done
Test run 1 at Mon Feb 20 21:29:28 EST 2012
CPP: Read 5570001 lines in 9 seconds. LPS: 618889
Python:Read 5570000 lines in 1 seconds. LPS: 5570000
Test run 2 at Mon Feb 20 21:
Solution
tl;dr: Because of different default settings in C++ requiring more system calls.
By default, cin is synchronized with stdio, which causes it to avoid any input buffering. If you add this to the top of your main, you should see much better performance:

    std::ios_base::sync_with_stdio(false);

Normally, when an input stream is buffered, instead of reading one character at a time, the stream will be read in larger chunks. This reduces the number of system calls, which are typically relatively expensive. However, since the FILE* based stdio and iostreams often have separate implementations and therefore separate buffers, this could lead to a problem if both were used together. For example:

    int myvalue1;
    cin >> myvalue1;
    int myvalue2;
    scanf("%d",&myvalue2);

If more input was read by cin than it actually needed, then the second integer value wouldn't be available for the scanf function, which has its own independent buffer. This would lead to unexpected results.

To avoid this, by default, streams are synchronized with stdio. One common way to achieve this is to have cin read each character one at a time as needed using stdio functions. Unfortunately, this introduces a lot of overhead. For small amounts of input, this isn't a big problem, but when you are reading millions of lines, the performance penalty is significant.

Fortunately, the library designers decided that you should also be able to disable this feature to get improved performance if you knew what you were doing, so they provided the sync_with_stdio method. From the linked documentation:

    If the synchronization is turned off, the C++ standard streams are allowed to buffer their I/O independently, which may be considerably faster in some cases.
Code Snippets
    std::ios_base::sync_with_stdio(false);

    int myvalue1;
    cin >> myvalue1;
    int myvalue2;
    scanf("%d",&myvalue2);

Context
Stack Overflow Q#9371238, score: 1963