Merged
50 changes: 50 additions & 0 deletions src/components/fundable/descriptions/RunEndEncodedInArrowCpp.md
@@ -0,0 +1,50 @@
#### Overview

Apache Arrow is the universal columnar format and multi-language toolbox for fast data interchange and in-memory analytics.

Run-End-Encoding is a composite Arrow datatype that efficiently represents data with long runs of consecutive repeated values, such as some time series.
While basic support is present, Run-End-Encoding is not yet supported by all Arrow C++ components.
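
As a rough illustration of the layout, here is a minimal TypeScript sketch (not Arrow C++ code; the `RunEndEncoded` type and the `encode` and `valueAt` helpers are hypothetical names introduced only for this example). It assumes a run-end-encoded column stores one value per run, together with the cumulative logical index at which each run ends:

```ts
// Hypothetical, simplified model of a run-end-encoded column.
// runEnds[i] is the exclusive logical end index of run i; values[i] is its value.
interface RunEndEncoded<T> {
  runEnds: number[];
  values: T[];
  length: number; // logical (decoded) length
}

// Collapse consecutive equal elements into runs.
function encode<T>(data: T[]): RunEndEncoded<T> {
  const runEnds: number[] = [];
  const values: T[] = [];
  for (let i = 0; i < data.length; i++) {
    if (values.length > 0 && values[values.length - 1] === data[i]) {
      // Same value as the current run: extend that run by one element.
      runEnds[runEnds.length - 1] = i + 1;
    } else {
      // Different value: start a new run ending just after position i.
      values.push(data[i]);
      runEnds.push(i + 1);
    }
  }
  return { runEnds, values, length: data.length };
}

// Look up a logical index with a binary search over the run ends.
function valueAt<T>(ree: RunEndEncoded<T>, index: number): T {
  if (index < 0 || index >= ree.length) {
    throw new RangeError(`index ${index} out of bounds`);
  }
  let lo = 0;
  let hi = ree.runEnds.length - 1;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (ree.runEnds[mid] > index) {
      hi = mid; // the run containing index is at mid or earlier
    } else {
      lo = mid + 1; // the run containing index is after mid
    }
  }
  return ree.values[lo];
}
```

In this sketch, six logical values `[1, 1, 1, 2, 2, 3]` are stored as three runs, and random access costs a binary search over the run ends rather than a direct array lookup.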

We propose to finish implementing support for Run-End-Encoding in all components of Arrow C++:

* cast kernels (including implicit run-end-encoding or decoding where applicable)

* scalar kernels:
  - all unary kernels (math, etc.)
  - all binary kernels (arithmetic, comparisons, etc.)
  - `coalesce`, `if_else`, `case_when`
  - fix incorrect output in `is_null` and `true_unless_null`

* vector kernels:
  - cumulative aggregations (`cumulative_max`, `cumulative_sum`, etc.)
  - `pairwise_diff`
  - `filter`, `take`, `scatter`
  - `fill_null_forward`, `fill_null_backward`
  - `replace_with_mask`
  - `indices_non_zero`
  - `sort_indices`, `rank`, `rank_normal`, `rank_quantile`
  - `partition_nth_indices`
  - `select_k_unstable`
  - `unique`, `value_counts`
  - `winsorize`
  - fix incorrect output in `drop_null`

* aggregate kernels:
  - `all`, `any`
  - `count_distinct`
  - `approximate_median`
  - `sum`, `product`, `mean`, `mode`, `quantile`, `tdigest`
  - `variance`, `stddev`, `skew`, `kurtosis`
  - `first`, `last`, `min`, `max`
  - `index`
  - fix incorrect output in `count`

* Parquet reader and writer

* CSV reader

* ORC reader and writer
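
To illustrate why native kernel support matters, the TypeScript sketch below shows how an aggregation could operate directly on runs instead of decoding the column first. It is not the Arrow C++ implementation: the `NumericRuns` type and the `runLength`, `sumRuns` and `countValid` helpers are hypothetical names. It assumes that nulls are stored as null entries among the run values while the run-end-encoded parent carries no validity bitmap of its own, which is how a kernel that inspects only the parent can miscount nulls.

```ts
// Hypothetical, simplified run-end-encoded numeric column.
// runEnds[i] is the exclusive logical end index of run i.
interface NumericRuns {
  runEnds: number[];
  values: (number | null)[]; // a null entry stands for a whole run of nulls
}

// Logical length of run i: its end minus the previous run's end (or 0).
function runLength(col: NumericRuns, i: number): number {
  return col.runEnds[i] - (i === 0 ? 0 : col.runEnds[i - 1]);
}

// Sum of all non-null logical values, one multiplication per run.
function sumRuns(col: NumericRuns): number {
  let total = 0;
  for (let i = 0; i < col.values.length; i++) {
    const v = col.values[i];
    if (v !== null) {
      total += v * runLength(col, i);
    }
  }
  return total;
}

// Number of non-null logical values: null runs are skipped entirely.
function countValid(col: NumericRuns): number {
  let count = 0;
  for (let i = 0; i < col.values.length; i++) {
    if (col.values[i] !== null) {
      count += runLength(col, i);
    }
  }
  return count;
}
```

Both helpers do work proportional to the number of runs rather than to the logical length, which is the kind of saving a run-aware kernel can offer over decoding first.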

Funders can decide to fund the entire package, or choose the components they are interested in.

##### Are you interested in funding this project, entirely or partially? Contact us for more information on how to help.
13 changes: 13 additions & 0 deletions src/components/fundable/projectsDetails.ts
@@ -8,6 +8,7 @@ import MatrixOperationsInXtensorMD from "@site/src/components/fundable/descripti
import BinaryViewInArrowCppMD from "@site/src/components/fundable/descriptions/BinaryViewInArrowCpp.md"
import Decimal32InArrowCppMD from "@site/src/components/fundable/descriptions/Decimal32InArrowCpp.md"
import Float16InArrowCppMD from "@site/src/components/fundable/descriptions/Float16InArrowCpp.md"
import RunEndEncodedInArrowCppMD from "@site/src/components/fundable/descriptions/RunEndEncodedInArrowCpp.md"
import ParquetNullOptimizationsMD from "@site/src/components/fundable/descriptions/ParquetNullOptimizations.md"

export const fundableProjectsDetails = {
@@ -140,6 +141,18 @@ export const fundableProjectsDetails = {
currentFundingPercentage: 0,
repoLink: "https://github.com/apache/arrow"
},
{
category: "Apache Arrow and Parquet",
title: "Complete Run-End-Encoded support in Arrow C++",
pageName: "RunEndEncodedInApacheArrow",
shortDescription: "Like dictionary encoding, run-end-encoding allows representing some kinds of data more efficiently.",
description: RunEndEncodedInArrowCppMD,
price: "TBD",
maxNbOfFunders: 4,
currentNbOfFunders: 0,
currentFundingPercentage: 0,
repoLink: "https://github.com/apache/arrow"
},
{
category: "Apache Arrow and Parquet",
title: "Parquet reader optimizations",
9 changes: 9 additions & 0 deletions src/pages/fundable/RunEndEncodedInApacheArrow/GetAQuote.tsx
@@ -0,0 +1,9 @@
import useDocusaurusContext from '@docusaurus/useDocusaurusContext';
import GetAQuotePage from '@site/src/components/fundable/GetAQuotePage';

export default function FundablePage() {
  const { siteConfig } = useDocusaurusContext();
  return (
    <GetAQuotePage/>
  );
}
9 changes: 9 additions & 0 deletions src/pages/fundable/RunEndEncodedInApacheArrow/index.tsx
@@ -0,0 +1,9 @@
import useDocusaurusContext from '@docusaurus/useDocusaurusContext';
import LargeProjectCardPage from '@site/src/components/fundable/LargeProjectCardPage';

export default function FundablePage() {
  const { siteConfig } = useDocusaurusContext();
  return (
    <LargeProjectCardPage/>
  );
}