Why Do Tuples In A List Comprehension Need Parentheses?
Solution 1:
Python's grammar is LL(1) (or was, until CPython 3.9 replaced the LL(1) parser with a PEG parser), meaning that the parser only looks ahead one token when parsing. Consider:
[(v1, v2) for v1 in myList1 for v2 in myList2]
Here, the parser sees something like this.
[ # An opening bracket; must be some kind of list
[( # Okay, so a list containing some value in parentheses
[(v1
[(v1,
[(v1, v2
[(v1, v2)
[(v1, v2) for # Alright, list comprehension
However, without the parentheses, it has to make a decision earlier on.
[v1, v2 for v1 in myList1 for v2 in myList2]
[ # List-ish thing
[v1 # List containing a value; alright
[v1, # List containing at least two values
[v1, v2 # Here's the second value
[v1, v2 for # Wait, what?
Backtracking parsers tend to be slow, and an LL(1) parser by definition never backtracks. Thus, the ambiguous syntax is forbidden.
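To see the restriction in practice, here is a minimal sketch that asks Python's own parser whether each form is valid syntax. The helper name parses is made up for this illustration; ast.parse is the standard-library call, and it only parses the text, so myList1 and myList2 never need to exist.

```python
import ast


def parses(source):
    """Return True if `source` is syntactically valid Python (hypothetical helper)."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


with_parens = "[(v1, v2) for v1 in myList1 for v2 in myList2]"
without_parens = "[v1, v2 for v1 in myList1 for v2 in myList2]"

print(parses(with_parens))     # True
print(parses(without_parens))  # False
```

This shows only that the unparenthesized form is rejected, not why the rule exists.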
Solution 2:
As I felt "because the grammar forbids it" to be a little too snarky, I came up with a reason.
The parser begins parsing the expression as a list/set/tuple and is expecting a , token, but instead encounters a for token.
For example, running a file that contains only this expression fails:
$ python3.6 test.py
File "test.py", line 1
[a, b for a, b in c]
^
SyntaxError: invalid syntax
The same file tokenizes as follows:
$ python3.6 -m tokenize test.py
0,0-0,0: ENCODING 'utf-8'
1,0-1,1: OP '['
1,1-1,2: NAME 'a'
1,2-1,3: OP ','
1,4-1,5: NAME 'b'
1,6-1,9: NAME 'for'
1,10-1,11: NAME 'a'
1,11-1,12: OP ','
1,13-1,14: NAME 'b'
1,15-1,17: NAME 'in'
1,18-1,19: NAME 'c'
1,19-1,20: OP ']'
1,20-1,21: NEWLINE '\n'
2,0-2,0: ENDMARKER ''
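The same token stream can be reproduced from inside Python. This is a rough sketch using the standard tokenize module on a string instead of a file; the variable names are mine. Because generate_tokens reads text rather than bytes, it does not emit the ENCODING token seen in the command-line output above.

```python
import io
import tokenize

source = "[a, b for a, b in c]\n"

# Collect (token type name, token text) pairs. The tokenizer does not
# check grammar, so it handles this line even though the parser rejects it.
tokens = [
    (tokenize.tok_name[tok.type], tok.string)
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
]

for name, string in tokens:
    print(name, repr(string))
```

The tokenizer is perfectly happy with the input; for arrives as an ordinary NAME token right after b, and it is only the parser that objects at that point.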
Solution 3:
There was no parser issue that motivated this restriction. Contrary to Silvio Mayolo's answer, an LL(1) parser could have parsed the no-parentheses syntax just fine. The parentheses were optional in early versions of the original list comprehension patch; they were only made mandatory to make the meaning clearer.
Quoting Guido van Rossum back in 2000, in a response to someone worried that [x, y for ...] would cause parser issues:
Don't worry. Greg Ewing had no problem expressing this in Python's own grammar, which is about as restricted as parsers come. (It's LL(1), which is equivalent to pure recursive descent with one lookahead token, i.e. no backtracking.)
Here's Greg's grammar:
atom: ... | '[' [testlist [list_iter]] ']' | ...
list_iter: list_for | list_if
list_for: 'for' exprlist 'in' testlist [list_iter]
list_if: 'if' test [list_iter]
Note that before, the list syntax was '[' [testlist] ']'. Let me explain it in different terms:
The parser parses a series comma-separated expressions. Previously, it was expecting ']' as the sole possible token following this. After the change, 'for' is another possible following token. This is no problem at all for any parser that knows how to parse matching parentheses!
If you'd rather not support [x, y for ...] because it's ambiguous (to the human reader, not to the parser!), we can change the grammar to something like:
'[' test [',' testlist | list_iter] ']'
(Note that | binds less than concatenation, and [...] means an optional part.)
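(Not part of the quoted message.) To make the one-token-lookahead point concrete, here is a toy recursive-descent sketch of Greg's grammar. It is an illustration under heavy assumptions, not CPython's parser: ToyParser and its methods are invented names, a "test" is simplified to a bare identifier, exprlist is treated like testlist, trailing commas are not handled, and the input is a pre-split list of token strings. Every decision is made by peeking at the single next token, and the position never moves backwards.

```python
KEYWORDS = {"for", "in", "if"}


class ToyParser:
    """Toy LL(1)-style parser for: '[' [testlist [list_iter]] ']'."""

    def __init__(self, tokens):
        self.tokens = list(tokens) + ["<end>"]
        self.pos = 0

    def peek(self):
        # The only lookahead used anywhere: the single next token.
        return self.tokens[self.pos]

    def expect(self, tok):
        if self.peek() != tok:
            raise SyntaxError(f"expected {tok!r}, got {self.peek()!r}")
        self.pos += 1

    def name(self):
        tok = self.peek()
        if not tok.isidentifier() or tok in KEYWORDS:
            raise SyntaxError(f"expected a name, got {tok!r}")
        self.pos += 1
        return tok

    def testlist(self):
        # NAME (',' NAME)*
        items = [self.name()]
        while self.peek() == ",":
            self.pos += 1
            items.append(self.name())
        return items

    def list_iter(self):
        # A chain of 'for ... in ...' and 'if ...' clauses.
        clauses = []
        while self.peek() in ("for", "if"):
            if self.peek() == "for":
                self.pos += 1
                targets = self.testlist()
                self.expect("in")
                clauses.append(("for", targets, self.testlist()))
            else:
                self.pos += 1
                clauses.append(("if", self.name()))
        return clauses

    def list_display(self):
        self.expect("[")
        if self.peek() == "]":
            self.pos += 1
            return ("list", [])
        items = self.testlist()
        # After the comma-separated items, ']' or 'for'/'if' may follow.
        if self.peek() in ("for", "if"):
            clauses = self.list_iter()
            self.expect("]")
            return ("comprehension", items, clauses)
        self.expect("]")
        return ("list", items)


plain = ToyParser("[ x , y ]".split()).list_display()
comp = ToyParser("[ x , y for x in xs ]".split()).list_display()
print(plain)  # ('list', ['x', 'y'])
print(comp)   # ('comprehension', ['x', 'y'], [('for', ['x'], ['xs'])])
```

The parser does not need to know up front whether it is reading a plain list or a comprehension; it reads the items first and classifies the construct once it sees what follows them.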
Also see the next response in the thread, where Greg Ewing runs
>>> seq = [1,2,3,4,5]
>>> [x, x*2 for x in seq]
[(1, 2), (2, 4), (3, 6), (4, 8), (5, 10)]
on an early version of the list comprehension patch, and it works just fine.
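The ambiguity "to the human reader" can be spelled out in current Python, where each reading has to be written explicitly. The sketch below is my own illustration rather than something from the thread: the first form matches what the early patch produced, and the second is one alternative a reader might plausibly have expected.

```python
seq = [1, 2, 3, 4, 5]

# Reading 1: one (x, x*2) tuple per element of seq.
pairs = [(x, x * 2) for x in seq]
print(pairs)   # [(1, 2), (2, 4), (3, 6), (4, 8), (5, 10)]

# Reading 2: a two-element list whose second element is a comprehension.
# The outer x must already exist; 0 is an arbitrary placeholder value.
x = 0
nested = [x, [x * 2 for x in seq]]
print(nested)  # [0, [2, 4, 6, 8, 10]]
```

Mandatory parentheses force the writer to pick one of these, which is the clarity argument made above.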