statistics: covariance, correlation, and linear_regression do not accept iterator inputs #149244

@htjworld

Bug description

statistics.covariance(), statistics.correlation(), and
statistics.linear_regression() raise TypeError when given iterator or
generator inputs, unlike the rest of the statistics module.

Reproduction

import statistics

# These functions accept iterators without issue:
statistics.mean(iter([1, 2, 3]))         # 2
statistics.variance(iter([1, 2, 3, 4]))  # 1.666...
statistics.stdev(iter([1, 2, 3, 4]))     # 1.290...

# These three raise TypeError:
statistics.covariance(iter([1, 2, 3, 4, 5]), iter([2, 4, 6, 8, 10]))
# TypeError: object of type 'list_iterator' has no len()

statistics.correlation(iter([1, 2, 3, 4, 5]), iter([2, 4, 6, 8, 10]))
# TypeError: object of type 'list_iterator' has no len()

statistics.linear_regression(iter([1, 2, 3, 4, 5]), iter([2, 4, 6, 8, 10]))
# TypeError: object of type 'list_iterator' has no len()

Expected behavior

All three functions should accept any iterable (lists, tuples, iterators,
generators), consistent with mean(), variance(), and stdev().

The private _ss() helper, used internally by variance() and stdev(),
documents this design intent explicitly:

"Calculations are done in a single pass, allowing the input to be an iterator."

Root cause

All three functions call len(x) on the raw input at the top of the
function body, which fails for iterators that do not support len().

CPython version

Reproducible on main.

Labels

docs, stdlib