gh-90533: Implement BytesIO.peek() #30808
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | | @@ -739,6 +739,15 @@ than raw I/O does. |
| | | Return :class:`bytes` containing the entire contents of the buffer. |
| | | + .. method:: peek(size=1, /) |
| | | + Return bytes from the current position onwards without advancing the position. |
| | | + At least one byte of data is returned if not at EOF. |
| | | + Return an empty :class:`bytes` object at EOF. |
| | | + If the size argument is less than one or larger than the number of available bytes, |


Contributor
nitpick: I don't think the comma at the end of this line is right or needed.
Author
I’m quite convinced it is needed because the first part of the sentence is an "introductory clause or phrase". (But if you swap the order and write "A copy of the buffer ... is returned if ...", no comma is needed before the "if".)

| | | + a copy of the buffer from the current position until the end is returned. |

vstinner marked this conversation as resolved.

| | | + .. versionadded:: 3.15 |
| | | .. method:: read1(size=-1, /) |
| | | @@ -772,8 +781,13 @@ than raw I/O does. |
| | | .. method:: peek(size=0, /) |
| | | - Return bytes from the stream without advancing the position. The number of |
| | | - bytes returned may be less or more than requested. If the underlying raw |
| | | + Return bytes from the current position onwards without advancing the position. |
| | | + At least one byte of data is returned if not at EOF. |
| | | + Return an empty :class:`bytes` object at EOF. |
| | | + At most one single read on the underlying raw stream is done to satisfy the call. |
| | | + The *size* argument is ignored. |
| | | + The number of read bytes depends on the buffer size and the current position in the internal buffer. |
| | | + If the underlying raw |
| | | stream is non-blocking and the operation would block, returns empty bytes. |
| | | .. method:: read(size=-1, /) |
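
For comparison, the existing `BufferedReader.peek()` behaviour described in the second hunk can be observed with a small script. This is only an illustrative sketch: the helper name `buffered_peek_demo` and the `buffer_size=4` value are assumptions chosen for the demo, not part of the PR.

```python
import io


def buffered_peek_demo(data=b"1234567890", buffer_size=4):
    """Return (peeked, position_after_peek, eof_peek) for a BufferedReader."""
    reader = io.BufferedReader(io.BytesIO(data), buffer_size=buffer_size)
    peeked = reader.peek(1)    # may hold more than the one byte asked for
    position = reader.tell()   # unchanged: peek does not advance the position
    reader.read()              # consume everything up to EOF
    eof_peek = reader.peek(1)  # empty bytes once the stream is exhausted
    return peeked, position, eof_peek


peeked, position, eof_peek = buffered_peek_demo()
print(peeked[:1], position, eof_peek)
```

The length of `peeked` depends on the buffer size, which is why the sketch only inspects its first byte.
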
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | | @@ -996,6 +996,13 @@ def tell(self): |
| | | raise ValueError("tell on closed file") |
| | | return self._pos |
| | | + def peek(self, size=1): |
| | | + if self.closed: |
| | | + raise ValueError("peek on closed file") |
| | | + if size < 1: |


Contributor
I don't think the behavior here is going to be correct if a negative size is passed in.
Author
I have the same instinct whenever I read this line, but note that

| | | + return self._buffer[self._pos:] |
| | | + return self._buffer[self._pos:self._pos + size] |
| | | def truncate(self, pos=None): |
| | | if self.closed: |
| | | raise ValueError("truncate on closed file") |
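
One way to check the negative-size question raised in the thread above is to run the same slicing logic on a plain buffer. The sketch below is a standalone approximation of the `_pyio` hunk; the function name `peek_slice` and its `buffer`/`pos` parameters are assumptions introduced for illustration.

```python
def peek_slice(buffer, pos, size=1):
    """Approximate the slicing logic of the peek() hunk on a bytes-like buffer."""
    if size < 1:
        # Zero and negative sizes are caught here, before any slice arithmetic,
        # and mean "everything from pos to the end".
        return bytes(buffer[pos:])
    # Slice bounds past the end are clamped, so oversized requests return the tail.
    return bytes(buffer[pos:pos + size])


data = b"1234567890"
print(peek_slice(data, 0, -1), peek_slice(data, 1, 3), peek_slice(data, len(data), 5))
```

Because the `size < 1` branch returns first, a negative `size` never reaches the `pos + size` expression in this sketch.
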
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | | @@ -566,6 +566,55 @@ def test_issue141311(self): |
| | | buf = bytearray(2) |
| | | self.assertEqual(0, memio.readinto(buf)) |
| | | + def test_peek(self): |
| | | + buf = self.buftype("1234567890") |
| | | + with self.ioclass(buf) as memio: |
| | | + self.assertEqual(memio.tell(), 0) |
| | | + self.assertEqual(memio.peek(1), buf[:1]) |
| | | + self.assertEqual(memio.peek(1), buf[:1]) |

vstinner marked this conversation as resolved.

| | | + self.assertEqual(memio.peek(), buf[:1]) |
| | | + self.assertEqual(memio.peek(3), buf[:3]) |
| | | + self.assertEqual(memio.peek(5), buf[:5]) |
| | | + self.assertEqual(memio.peek(0), buf) |
| | | + self.assertEqual(memio.peek(len(buf) + 100), buf) |
| | | + self.assertEqual(memio.peek(-1), buf) |
| | | + self.assertEqual(memio.tell(), 0) |
| | | + memio.read(1) |
| | | + self.assertEqual(memio.tell(), 1) |
| | | + self.assertEqual(memio.peek(1), buf[1:2]) |
| | | + self.assertEqual(memio.peek(), buf[1:2]) |
| | | + self.assertEqual(memio.peek(3), buf[1:4]) |
| | | + self.assertEqual(memio.peek(5), buf[1:6]) |
| | | + self.assertEqual(memio.peek(0), buf[1:]) |
| | | + self.assertEqual(memio.peek(len(buf) + 100), buf[1:]) |
| | | + self.assertEqual(memio.peek(-1), buf[1:]) |
| | | + self.assertEqual(memio.tell(), 1) |
| | | + memio.read() |
| | | + self.assertEqual(memio.tell(), len(buf)) |
| | | + self.assertEqual(memio.peek(1), self.EOF) |
| | | + self.assertEqual(memio.peek(3), self.EOF) |
| | | + self.assertEqual(memio.peek(5), self.EOF) |
| | | + self.assertEqual(memio.peek(0), b"") |
| | | + self.assertEqual(memio.tell(), len(buf)) |
| | | + # Peeking works after writing |
| | | + abc = self.buftype("abc") |
| | | + memio.write(abc) |
| | | + self.assertEqual(memio.peek(), self.EOF) |
| | | + memio.seek(len(buf)) |
| | | + self.assertEqual(memio.peek(), abc[:1]) |
| | | + self.assertEqual(memio.peek(-1), abc) |
| | | + self.assertEqual(memio.peek(len(abc) + 100), abc) |
| | | + self.assertEqual(memio.tell(), len(buf)) |

marcelm marked this conversation as resolved.


Contributor
Could you test that seek to end of file then peek() returns (Think it's right in implementation, just "no data was read because whole buffer skipped" can catch different cases from "no data was read but buffer has been touched").
Author
Added within a new

| | | + with self.ioclass(buf) as memio: |
| | | + memio.seek(len(buf)) |
| | | + self.assertEqual(memio.peek(), self.EOF) |
| | | + self.assertRaises(ValueError, memio.peek) |
| | | def test_unicode(self): |
| | | memio = self.ioclass() |
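
The expectations in `test_peek` can also be tried on interpreters that do not have `BytesIO.peek()` yet. The subclass below is a hypothetical stand-in, not code from this PR: the name `PeekableBytesIO` is invented for illustration, and it emulates peeking with `read()` followed by `seek()` rather than slicing an internal buffer.

```python
import io


class PeekableBytesIO(io.BytesIO):
    """Hypothetical helper: emulate peek() by reading and then seeking back."""

    def peek(self, size=1):
        pos = self.tell()  # raises ValueError if the file is closed
        try:
            # Sizes below one are mapped to "read everything that is left".
            return self.read(size if size >= 1 else -1)
        finally:
            self.seek(pos)  # restore the original position


memio = PeekableBytesIO(b"1234567890")
print(memio.peek(), memio.peek(3), memio.tell())
```

Unlike the slicing approach in the diff, this version moves the position temporarily, so it is only a rough equivalent for single-threaded use.
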
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | | @@ -0,0 +1 @@ |
| | | + Add :meth:`io.BytesIO.peek`. |

Why not default to `size=0` like `BufferedReader`?

Read through the comments. In general, anything returning a `bytes` rather than a bytes-like `memoryview` is going to require allocating and copying potentially many bytes... If the copy is really concerning, I'd lean toward returning a `memoryview` rather than a `bytes`, which is a mandatory copy.

`memoryview` came up in this comment and the three following it: #30808 (comment)

I don’t know what the conclusion is given your comment. Should a `memoryview` be returned instead? Most important to me is compatibility with what `BufferedReader.peek()` returns.

I am not too concerned about the extra memory for a `bytes` object as long as the default is `size=1`.
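
To make the copy-versus-view trade-off in this thread concrete, here is a rough sketch of what a `memoryview`-returning variant could look like, built on the existing `BytesIO.getbuffer()` method. The function name `peek_view` is an assumption for illustration only; nothing in the PR defines it.

```python
import io


def peek_view(stream, size=1):
    """Hypothetical zero-copy peek: return a memoryview slice instead of bytes."""
    pos = stream.tell()
    view = stream.getbuffer()  # memoryview over the stream's contents, no copy
    return view[pos:] if size < 1 else view[pos:pos + size]


stream = io.BytesIO(b"1234567890")
stream.seek(3)
window = peek_view(stream, 4)
copied = bytes(window)  # an explicit copy, made only when the caller wants one
window.release()        # drop the view so the stream is no longer pinned
print(copied, stream.tell())
```

The `getbuffer()` documentation notes that the `BytesIO` object cannot be resized or closed while a view exists, which is one practical cost of returning a `memoryview` compared with the `bytes` copy discussed above.
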