python3:input_output [2018/08/14 15:52] kericson [Benchmarks] Finished second table, added sentence |
python3:input_output [2018/08/14 19:04] (current) jguerin |
| |
| ===== Input Basics ===== | ===== Input Basics ===== |
| Input in Python3 is handled via the ''%%input()%%'' function, and reads a newline-terminated string from standard input.((Unlike C++ and Java, input is line-based rather than token-based.)) Additional processing is done to the resulting string. | Input in Python3 is handled via the ''%%input()%%'' function, and reads a newline-terminated string from standard input.((Unlike C++ and Java, input is line-based rather than token-based.)) Any additional processing is done to the resulting string. |
| |
| <code python> | <code python> |
| </code> | </code> |
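As a minimal sketch of the "additional processing" mentioned above, the snippet below splits one line into tokens and converts them to integers. The helper name ''parse_ints'' is hypothetical (it is not part of the original page), and a literal string stands in for the value that ''input()'' would return so that the example runs without standard input.

```python
def parse_ints(line):
    """Split one line of whitespace-separated text and convert each token to int."""
    return [int(token) for token in line.split()]


# In a contest solution the line would normally come from input(), e.g.:
#     values = parse_ints(input())
# A literal string is used here (an assumption for illustration only).
values = parse_ints("3 1 4 1 5")
print(values)       # [3, 1, 4, 1, 5]
print(sum(values))  # 14
```

Because ''split()'' with no arguments discards all surrounding whitespace, the same helper should work unchanged on lines that still carry a trailing newline.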
| |
| ===== Advanced Options ===== | ===== Advanced Input ===== |
| Like the interpreted semantics of Python3, ''%%input()%%'' and ''%%print()%%'' are slow compared to their C++ and Java counterparts. The ''sys'' library provides faster counterparts. | ''input()'' will scale well for many easy and mid-level contest problems. For mid- to upper-level problems with significant bounds on reads and writes, ''input()'' carries an increased risk of //time limit exceeded// judgements.((There is no guaranteed cutoff, but for many problems ''input()'' is //increasingly likely// to fail around 10<sup>3</sup>≤//n//≤10<sup>4</sup> lines.)) |
| |
| <code python> | === stdin.readline() === |
| | ''readline()'' provides the same interface as ''input()'', except that it //retains// the ''\n'' that terminates the read.((This may require special handling in certain circumstances and can be safely ignored in others. E.g., ''.split()'' will //not// behave differently vs. ''input()''.)) |
| |
| <code python> | <code python> |
| | >>> from sys import * |
| | >>> x = stdin.readline() |
| | Hello World! # x="Hello World!\n" |
| | </code> |
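The retained newline can be illustrated with a short sketch. Here ''io.StringIO'' stands in for ''sys.stdin'' (an assumption made so the example runs without a terminal); the variable names are illustrative only.

```python
from io import StringIO

# StringIO stands in for sys.stdin so the sketch runs without a terminal.
fake_stdin = StringIO("Hello World!\n3 4\n")

raw = fake_stdin.readline()             # the trailing "\n" is retained
clean = raw.rstrip("\n")                # roughly what input() would have returned
tokens = fake_stdin.readline().split()  # split() discards the trailing "\n"

print(repr(raw))    # 'Hello World!\n'
print(repr(clean))  # 'Hello World!'
print(tokens)       # ['3', '4']
```

In a real solution ''fake_stdin'' would simply be ''sys.stdin''. Stripping is only needed when the exact text of the line matters, for example when comparing strings or measuring their length.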
| |
| ==== Benchmarks ==== | |
| 10 characters per line (//n//= number of lines): | |
| | //n// | input() | sys.stdin.readline() | sys.stdin.readlines() | | |
| | 10<sup>4</sup> | .034 | .016 | .018 | | |
| | 10<sup>5</sup> | .146 | .052 | .030 | | |
| | 10<sup>6</sup> | 1.301 | .301 | .130 | | |
| |
| | === stdin.readlines() === |
| | ''readlines()'' reads all remaining input, up to the //end of file//.((This behavior can be simulated on the terminal with ''<ctrl>+d''.)) |
| |
| 1000 characters per line (//n//= number of lines): | <code python> |
| | //n// | input() | sys.stdin.readline() | sys.stdin.readlines() | | >>> from sys import * |
| | 10<sup>4</sup> | .046 | .037 | .033 | | >>> x = stdin.readlines() |
| | 10<sup>5</sup> | .282 | .183 | .143 | | Hello |
| | 10<sup>6</sup> | 2.728 | 1.430 | 1.723 | | To Everyone! # x=["Hello\n", "To Everyone!\n"] |
| | </code> |
| |
| | ''<ctrl>+d'' would be used to terminate this example after the ''!''. |
| |
| The ''readline()'' version is actually slower on the largest dataset. It is attempting to store about 1GB in memory here, causing a slowdown, but still faster than ''input()''. | ''readlines()'' loads //and// stores an entire file in memory (as a list); its performance will exceed other options in Python3 for all but the most enormous files.((See the table below for an example where ''readlines()'' required storing ∼1GB of data.)) |
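A common pattern is to read a leading count and then take everything else in one call. The sketch below assumes an input whose first line is the number of lines that follow, and again uses ''io.StringIO'' as a stand-in for ''sys.stdin''; both choices are assumptions for illustration.

```python
from io import StringIO

# StringIO stands in for sys.stdin; the first line is assumed to be a line count.
fake_stdin = StringIO("2\nHello\nTo Everyone!\n")

num_lines = int(fake_stdin.readline())  # int() ignores the trailing "\n"
lines = fake_stdin.readlines()          # every remaining line, as a list
stripped = [line.rstrip("\n") for line in lines]

print(num_lines)  # 2
print(lines)      # ['Hello\n', 'To Everyone!\n']
print(stripped)   # ['Hello', 'To Everyone!']
```

The whole input is held in memory at once, so this approach trades memory for speed.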
| |
| <file python input.py> | |
| num_lines = int(input()) | |
| |
| for i in range(num_lines): | ==== Benchmarks ==== |
| line = input() | The following benchmarks demonstrate the increased likelihood of failure of ''input()'' as input sizes increase. All files used for testing can be found [[files:python3:io_tests|here]]. |
| </file> | |
| |
| <file python readline.py> | 10 characters per line (//n//= number of lines): |
| from sys import stdin | | //n// | input() | sys.stdin.readline() | sys.stdin.readlines() | |
| | | 10<sup>4</sup> | .034s | .016s | .018s | |
| | | 10<sup>5</sup> | .146s | .052s | .030s | |
| | | 10<sup>6</sup> | 1.301s | .301s | .130s | |
| |
| num_lines = int(input()) | |
| |
| for i in range(num_lines): | 1000 characters per line (//n//= number of lines): |
| line = stdin.readline() | | //n// | input() | sys.stdin.readline() | sys.stdin.readlines() | |
| </file> | | 10<sup>4</sup> | .046s | .037s | .033s | |
| | | 10<sup>5</sup> | .282s | .183s | .143s | |
| <file python readlines.py> | | 10<sup>6</sup> | 2.728s | 1.430s | 1.723s((The ''readlines()'' version is actually slower than ''readline()'' on the largest dataset. It is attempting to store about 1GB in memory here, causing a slowdown, but still faster than ''input()''. |
| from sys import stdin | )) | |
| | |
| num_lines = int(input()) | |
| | |
| lines = stdin.readlines() | |
| </file> | |
| |
| <file python read_test.py> | The above tests were designed to exercise reading alone, with no work beyond temporary storage.((We deliberately avoided additional processing such as typecasts, ''map()'', and ''split()'', as these are non-IO considerations in Python3.)) |
| from sys import argv | |
| from random import * | |
| from string import * | |
| |
| for i in range(int(argv[1])): | |
| print(''.join(choice(ascii_lowercase + ' ') for _ in range(int(argv[2])))) | |
| </file> | |
| |
| ((''python3 read_test.py 1000 10 > out.txt'')) | |
| ((''python3 read_test.py 1000 1000 > out.txt'')) | |
| |