Python

Open Issues (last 90 days): 36
Issues from New Contributors: 15
Open PRs (last 90 days): 18
PRs from New Contributors: 13

Open issues from last 90 days (new contributors highlighted)
Date Title
2026-05-12 [Python] Azure with SAS Keys
2026-05-12 [Python] Improve Extension Types Support in PyArrow (umbrella issue)
2026-05-01 [Python] FixedShapedTensor to_pandas_dtype is returning NotImplementedError
2026-04-28 [C++][Python] Bind unresolved Substrait expressions using a supplied schema
2026-04-27 [Python] Convert a ListScalar of StructScalar to numpy.recarray
2026-04-22 [Python] test_fastparquet_cross_compatibility fails when using latest PyArrow and Pandas
2026-04-21 [Python] PyArrow 24.0+: ‘mypy’ reports ‘compute.all’ and ‘compute.equal’ as missing
2026-04-21 [Python] Scalar arithmetic dunders raise TypeError instead of returning NotImplemented
2026-04-21 [Release][Python] Python wheels fail to upload to PyPI due to quota
2026-04-21 [Python][Parquet] Per-column keys for low-level encryption/decryption properties API
2026-04-16 [Python] Allow thousands_sep as a option in pa.ParseOptions
2026-04-15 [Python][Docs] Add code examples for compute function any/all
2026-04-14 [Python] Add row_splits/offsets methods for VariableShapeTensorArray
2026-04-14 [C++][Python] _export_to_c segmentation fault for binary_view array
2026-04-13 [Python] Array.from_buffers docs should mention safety issues
2026-04-07 fixed_size_list<T>[0] parquet round-trips fail: writes fine, read-back raises ArrowInvalid
2026-04-07 [C++][Python] Implement search_sorted kernel for all primitive types and run-end encoded arrays
2026-04-06 [Python] Implement support for the experimental Async Device Stream Interface
2026-04-02 [Python] Support converting list of multi-dimensional arrays to FixedShapeTensor
2026-04-02 [Python] S3FileSystem ignores AWS_ENDPOINT_URL / AWS_ENDPOINT_URL_S3 environment variables
2026-04-01 [Python][C++] pyarrow._compute Contains AVX Instructions During Import
2026-04-01 [Python] Environment variables for PyArrow not needed anymore
2026-04-01 [Python][Interchange protocol] Deprecate the dataframe interchange protocol on PyArrow
2026-03-27 [Python] Possible follow-up on arithmetic support
2026-03-26 [Python] Pandas is deprecating dataframe interchange protocol
2026-03-20 [Python] pa.concat_batches() to include promote_options like pa.concat_tables()
2026-03-15 [C++][Python] Compute function to generate date from year / month / day
2026-03-13 [C++][Python] Table.join_asof occasionally fails in doctest
2026-03-13 [Python] Implement DictionaryArray converter for type pa.large_string and pa.large_binary
2026-03-04 [C++][Python] Investigate OpenTelemetry works as expected on Windows
2026-03-03 [Python][C++][GPU] Python Cuda jobs fail with ‘cuda.bindings.driver.CUcontext’ object has no attribute ‘value’
2026-02-28 [Python] MacOs stuck on importing (23.0.1) while works on (15.0.1)
2026-02-26 [Python] Investigate whether we can rely on pyproject.toml for build dependencies
2026-02-26 [Python] Validate PYARROW_GENERATE_COVERAGE works as expected
2026-02-24 [Python] Single chunk ChunkedArray doesn’t correctly respect copy in __array__method
2026-02-22 [Python][Dataset] Add filters parameter to orc.read_table() for predicate pushdown
Open PRs from last 90 days (new contributors highlighted)
Date Title
2026-05-12 GH-46856: [C++][Python] Add binary view comparison kernels
2026-05-08 GH-47447: [C++] Fix replace_with_mask for null type arrays
2026-04-27 GH-48344: [Python] Fix Table.from_struct_array for empty ChunkedArray
2026-04-25 GH-39961: [C++][Python] Propagate CSV parse delimiter to write options
2026-04-23 GH-49321: [C++][Python] Add ASAN / UBSAN pixi builds for Arrow and PyArrow
2026-04-22 GH-49826: [Python] Return NotImplemented from Scalar/Array arithmetic dunders for unsupported types
2026-04-21 GH-49826: [Python] Scalar arithmetic dunders return NotImplemented on unknown operand types
2026-04-09 GH-49058: [Python] Disallow non-UTF-8 bytes in custom metadata
2026-04-08 GH-15047: [Python]: switch from pytz to zoneinfo by default for string to tzinfo conversion
2026-04-07 GH-49677 [Python][C++][Compute] Add search sorted compute kernel
2026-04-06 GH-47435: [Python][Parquet] Add direct key encryption/decryption API
2026-04-03 GH-49505: [Python] Implement DictionaryArray converter for type pa.large_string and pa.large_binary
2026-03-24 GH-49232: [Python] deprecate feather python
2026-03-09 GH-49397: [Python] Fix PYARROW_GENERATE_COVERAGE end-to-end flow
2026-02-22 GH-41365: [Python] Fix S3 URI with whitespace silently falling back to LocalFileSystem
2026-02-22 GH-49358: [Python][Doc] Add import statement to filters_to_expression docstring example
2026-02-21 GH-49273: [Python] Move stub docstring script to _build_utils with gr…
2026-02-20 GH-49351: [C++] Check TZDIR environment variable in vendored date library

R

Open Issues (last 90 days): 8
Issues from New Contributors: 1
Open PRs (last 90 days): 4
PRs from New Contributors: 1

Open issues from last 90 days (new contributors highlighted)
Date Title
2026-05-06 [R] Update macOS CRAN job SDK from 11.3 to 14.5 to match R 4.6.0 build environment
2026-04-29 [R] Bump minimum supported R version to 4.2 now that 4.6 is out
2026-04-22 [R] Date Inf/ -Inf silently corrupted when converting to Arrow date32
2026-04-11 [R] array_to_vector() lacks coverage for several core Arrow types
2026-04-08 [R] Parquets do not support list-columns of ordered factors (ordered dictionaries)
2026-04-07 [R] CRAN packaging checklist for version 24.0.0
2026-03-27 [R] Add AGENTS.md file to R package
2026-02-24 [R][CI] r-devdocs crossbow job fails during gap between C++ and R releases
Open PRs from last 90 days (new contributors highlighted)
Date Title
2026-05-06 GH-49689: [R][C++] Parquets do not support list-columns of ordered factors (ordered dictionaries)
2026-05-05 GH-49901: [R] Bump minimum supported R version to 4.2 now that 4.6 is out
2026-03-19 GH-32123: [R] Expose azure blob filesystem
2026-03-02 Fixed usage of libarrow built from the git checkout not the released version from apt

C++

Open Issues (last 90 days): 71
Issues from New Contributors: 22
Open PRs (last 90 days): 47
PRs from New Contributors: 20

Open issues from last 90 days (new contributors highlighted)
Date Title
2026-05-19 [C++] Fix Unity build ordering issue
2026-05-18 [Docs][C++][Parquet] Add API reference
2026-05-14 [C++][Gandiva] Add 2 arg REGEXP_EXTRACT function
2026-05-13 [C++] Fix remaining overflow and negative length handling issues in Gandiva string functions
2026-05-12 [C++] IPC file fuzzer fails when footer schema has differing endianness
2026-05-11 [C++][Parquet] Avoid unbounded temp alloc in BYTE_STREAM_SPLIT decoder
2026-05-09 [C++][Parquet] Uncontrolled Memory Allocation (OOM) in Parquet Delta decoders
2026-05-08 [C++][Gandiva] Use timegm in date_time_test utilities to avoid DST-dependent behavior
2026-05-08 [C++] HeadBucket called in S3FS breaking IAM scoped prefixes
2026-05-05 [C++] Bump xsimd to 14.2.0
2026-04-30 [C++] Deprecate RandomAccessFile::Read{At,Async} without allow_short_read
2026-04-28 [C++][Compute] true_unless_null kernel output incorrect when run-end encoded array contains null
2026-04-28 [C++][Python] Bind unresolved Substrait expressions using a supplied schema
2026-04-28 [C++] BufferBuilder integer overflow in size calculations reachable from JSON parsing
2026-04-28 [C++] Neon not enabled on Windows ARM64
2026-04-24 [C++] Compiler error in xsimd_ep_ep-install _sse2.hpp with msvc on windows
2026-04-22 [CI][C++] Set test include/exclude by environment variable
2026-04-22 [C++] error: no member named ‘log2p1’ in namespace ‘std’
2026-04-22 [C++] 20 Re2 tests failed to run due to the executable files doesn’t generate
2026-04-21 [C++][FlightRPC][ODBC] linux-packages RPM installer
2026-04-21 [C++][FlightRPC][ODBC] linux-packages DEB installer
2026-04-20 [C++][FlightRPC][ODBC] Flakey GRPC_CALL_ERROR_TOO_MANY_OPERATIONS error in ODBC Linux
2026-04-20 [C++] arrow::Decimal128::FromString silently truncates when the input string has more than 38 significant digits
2026-04-20 [C++][FlightRPC][ODBC] Only run ODBC tests in ODBC CI
2026-04-20 [C++][FlightRPC][ODBC] Skip build for non-ODBC test binaries in CI
2026-04-20 [C++][FlightRPC][ODBC] Address Disconnect() in FlightSQL ODBC test suite
2026-04-18 [Format][FlightSQL] Add dialect-related SqlInfo codes (LIMIT/OFFSET syntax, NULLS ordering clause, boolean/datetime literals)
2026-04-17 [C++][FlightRPC][ODBC] Get ODBC tests passing on Linux
2026-04-17 [C++][FlightRPC][ODBC] Reuse ODBC connections across tests
2026-04-17 [C++][FlightRPC][ODBC] Implement support for SQL_DRIVER_AWARE_POOLING_SUPPORTED
2026-04-17 [C++][FlightRPC] crashes during gRPC status conversion with Clang-built Arrow 17.0.0
2026-04-16 [C++][FlightRPC][ODBC] Add Debug Linux Workflow
2026-04-15 [C++] Linux Flight/FlightSQL test randomly segfaults after gRPC version bump
2026-04-15 [C++][Gandiva] Some string functions may be overflowed
2026-04-15 [C++][Gandiva] gdv_hash_using_openssl() has some problems
2026-04-14 [Release][C++] C++ Extra workflow for the apache-arrow-24.0.0-rc0 was canceled
2026-04-14 [C++][Python] _export_to_c segmentation fault for binary_view array
2026-04-12 [C++][Parquet] Use PLAIN as default encoding for float32 and float64 columns in Parquet writer
2026-04-10 [C++][Parquet] Make writing of ColumnMetaData.path_in_schema optional
2026-04-10 [C++][Format] Can’t roundtrip Dictionary of ExtensionType over IPC
2026-04-09 [C++][Flight] ODBC BlockingQueue not working properly on Linux
2026-04-07 [C++] RecordBatch::MakeEmpty() may drop ordered flag and unsignedness of dictionary types when creating dictionary-encoded column
2026-04-02 [C++][FlightRPC][ODBC] Support static test linkage for macOS ODBC
2026-04-02 [C++] Arrow ipc format supports deduplication of dictionaries with dict_id, but there is no way to exercise it
2026-04-02 [C++] Lz4HadoopCodec::Compress writes single oversized block incompatible with Hadoop Lz4Decompressor
2026-03-27 Inspecting cmake presets options not available in cmake 4.x
2026-03-24 [C++][FlightRPC][ODBC] DEB Linux Installer
2026-03-23 [C++][FlightRPC][ODBC] Enable ODBC build on Debian
2026-03-18 [C++][FlightRPC][ODBC] Enable ODBC test build on Linux
2026-03-17 [C++][FlightRPC][ODBC] Change Windows ODBC to Static Linkage
2026-03-17 [C++][FlightRPC][ODBC] Add CI steps to support Windows DLL and MSI signing
2026-03-15 [C++][Python] Compute function to generate date from year / month / day
2026-03-13 [C++][Python] Table.join_asof occasionally fails in doctest
2026-03-12 [C++][Parquet] parquet-writer-test Large memory tests fail on TestColumnWriter.WriteLargeDictEncodedPage and TestColumnWriter.ThrowsOnDictIndicesTooLarge
2026-03-12 [C++][Parquet] I found formating errors in parquet.thrift
2026-03-11 [C++][FlightRPC][ODBC] Read booleans for applicable SqlInfoOptions
2026-03-10 [C++][CI] Investigate structure-aware fuzzing
2026-03-09 [C++][FlightRPC][ODBC] Fix inconsistent values in SQLGetInfo in global connection
2026-03-04 [C++][Python] Investigate OpenTelemetry works as expected on Windows
2026-03-04 Potential dereference of nullptr
2026-03-03 [Python][C++][GPU] Python Cuda jobs fail with ‘cuda.bindings.driver.CUcontext’ object has no attribute ‘value’
2026-02-27 [C++][CI][Packaging][FlightPRC] ODBC macOS PKG installer signing
2026-02-26 [C++] Allow filesystems to be implemented in separate library (+ move remote filesystems out of libarrow into their own shared libraries)
2026-02-26 [C++][Python] GcsFileSystem: can’t access Requester Pays buckets
2026-02-25 [C++] fatal error: ‘span’ file not found
2026-02-24 [C++][FlightRPC][ODBC] Enable DSN default values on macOS
2026-02-24 Parquet StreamReader should clarify its contract for parquet files without a schema.
2026-02-22 [C++][Dataset] ORC predicate pushdown: full operator and type coverage
2026-02-22 [C++][Dataset] Add OrcFileFragment with stripe filtering and predicate pushdown
2026-02-22 [C++][ORC] Add stripe statistics API to ORCFileReader
2026-02-20 [C++] Vendored date library does not respect TZDIR environment variable
Open PRs from last 90 days (new contributors highlighted)
Date Title
2026-05-19 GH-48801: [C++] Set CMAKE_POLICY_VERSION_MINIMUM for RapidJSON
2026-05-19 GH-49991: [C++] fix Unity build ordering issue
2026-05-18 GH-49985: Duplicate function aliases with same parameters
2026-05-18 GH-49973: [C++] Fix Gandiva string length checks
2026-05-14 GH-49977: [C++][Gandiva] Add regexp_extract optional third parameter function version
2026-05-12 GH-49966: [C++] Detect different endianness between IPC file and stream in IPC file fuzzer
2026-05-12 GH-46856: [C++][Python] Add binary view comparison kernels
2026-05-09 GH-49955: [C++] Fix OOM vulnerability in Parquet Delta decoders
2026-05-08 GH-49952: [C++][Gandiva] Use timegm in date_time_test utilities
2026-05-08 GH-39784: [C++][Gandiva] Fix decimal in_expr crash with cached object code
2026-05-08 GH-47447: [C++] Fix replace_with_mask for null type arrays
2026-05-06 GH-49689: [R][C++] Parquets do not support list-columns of ordered factors (ordered dictionaries)
2026-05-05 GH-49921: [C++] Bump xsimd to 14.2.0
2026-05-04 GH-49957 [C++][Parquet] Support reading dictionary encoded boolean pages
2026-05-02 GH-47662: [C++][Parquet] Reject metadata with null_count on required column
2026-04-29 GH-40024: [C++][Gandiva] Selectively register external C functions based on expression usage
2026-04-28 GH-41605: [C++][Gandiva] Stabilize TestFromUtcTimestamp around DST boundary
2026-04-28 GH-49884: [C++] Fix integer overflow in BufferBuilder reachable from JSON parsing
2026-04-28 GH-49881 [C++][Parquet] Support writing encrypted bloom filters
2026-04-26 GH-47642: [C++] Catch exceptions from initial_task in AsyncTaskSchedu…
2026-04-25 GH-39961: [C++][Python] Propagate CSV parse delimiter to write options
2026-04-24 GH-39808: [C++][Parquet] Evict pre-buffered row-group bytes after decode
2026-04-23 GH-49723: [C++][FlightRPC][ODBC] Update ODBC Documentation
2026-04-23 GH-46994: [C++][Parquet] Reuse BinaryView headers for repeated values in dictionary and DELTA_BYTE_ARRAY decoding
2026-04-21 GH-49817: [C++] Reject decimal strings that exceed the target precision
2026-04-20 GH-49753: [C++][Gandiva] Fix overflow in string functions
2026-04-20 GH-20314: [C++] Add GCS connection pool size option
2026-04-19 GH-49674: [C++][Array] Preserve ordered flag for DictionaryType in MakeEmptyArray
2026-04-19 GH-41017: [C++] Preserve ordered flag in DictionaryBuilder
2026-04-18 GH-49792: [Format][FlightSQL][C++] Add dialect-related SqlInfo codes
2026-04-17 GH-49783: [C++][FlightRPC][ODBC] Reuse connections across test suite
2026-04-16 GH-49752: [C++][Gandiva] Fix potential buffer overrun in gandiva ssl function
2026-04-16 GH-33823: [C++][IPC] Improve error messages when opening files that are the wrong format
2026-04-15 GH-47876: [C++][FlightRPC] ODBC: macOS .PKG installer for Intel and ARM
2026-04-08 GH-48068: [C++][FlightRPC] Linux ODBC: Configure Dremio instance to allow remote testing
2026-04-07 GH-49677 [Python][C++][Compute] Add search sorted compute kernel
2026-04-06 GH-49552: [C++][FlightRPC][ODBC] Enable ODBC test build on Linux
2026-04-03 GH-49505: [Python] Implement DictionaryArray converter for type pa.large_string and pa.large_binary
2026-04-02 GH-49641: [C++] Fix Lz4HadoopCodec to split large blocks for Hadoop compatibility
2026-03-26 GH-48590: [C++] Migrate SFINAE enable_if patterns to C++20 concepts
2026-03-24 GH-49538: [C++][FlightRPC][ODBC] Use static linkage in Windows FlightSQL ODBC driver
2026-03-14 GH-49502: [Parquet][C++] Fix missing overflow check for dictionary encoder indices count
2026-03-10 GH-49445: [C++]: Fix potential dereference of nullptr
2026-02-24 GH-49360: [C++][ORC] Add stripe statistics API to ORCFileReader
2026-02-22 GH-41365: [Python] Fix S3 URI with whitespace silently falling back to LocalFileSystem
2026-02-20 GH-49351: [C++] Check TZDIR environment variable in vendored date library
2026-02-20 MINOR: [C++] use std::move to avoid unnecessary copies
Activity Trends (2 years)
Dev Mailing List: LLM-generated summary of this month’s activity
  • Apache Arrow 24.0.0 Release: The community announced the general availability of version 24.0.0, which includes over 250 resolved issues across the ecosystem.

  • Canonical BigDecimal Extension Type: A new proposal suggests a canonical extension type for arbitrary-precision decimals to support variable scales per row, improving compatibility with engines like PostgreSQL and Trino.

  • ADBC Partitioned Bulk Ingest API: Discussion is underway for a new API allowing distributed writers like Spark and Flink to perform parallel bulk ingestion with atomic commit/abort semantics.

  • Erlang Implementation Donation: The community voted to accept the donation of an Erlang implementation (arrow-erlang), currently built as bindings to the Rust implementation.

  • SIMD Optimization for Parquet Bloom Filters: A proposal would port AVX2-accelerated Split Block Bloom Filter probes from Go to C++ using the xsimd library, and it demonstrates significant performance gains for in-memory filters.

  • IBM Db2 Support via ADBC: Engineers from IBM are working to add Arrow support for Db2, starting with a dedicated ADBC driver and exploring Flight SQL for high-performance transport.

  • UTF-8 Metadata Restrictions: Contributors are debating whether to strictly enforce UTF-8 encoding for field and schema metadata in IPC and the C Data Interface to prevent interoperability issues.

  • Flight SQL is_update Indicator: A proposal would add an is_update field to prepared statement results, enabling drivers to automatically choose the correct execution path for DML/DDL queries.

  • ADBC Logo Selection: The project is discussing a new logo for the ADBC subproject to provide a distinct identity while maintaining visual consistency with the broader Arrow family.

  • Rust Security Policy: Contributors are formalizing a security policy and guidelines for the Rust implementation to better manage vulnerability reporting and bug classification.

PRs by contributor type (last 18 months)
Recent PRs from new contributors
Date Title Author State
2026-05-19 GH-48801: [C++] Set CMAKE_POLICY_VERSION_MINIMUM for RapidJSON KyleFromNVIDIA open
2026-05-19 GH-48801: [cmake] Update RapidJSON for CMake 4.0 compatibility KyleFromNVIDIA closed
2026-05-19 GH-49991: [C++] fix Unity build ordering issue spotaws open
2026-05-18 GH-49973: [C++] Fix Gandiva string length checks puneetdixit200 open
2026-05-12 GH-49967: [Python][CI] Raise oldest NumPy wheel-test requirement to a patched release arpitjain099 closed
2026-05-12 GH-46856: [C++][Python] Add binary view comparison kernels Periecle open
2026-05-12 MINOR: [CI] Bump matlab-actions/run-tests from 3.0.0 to 3.1.1 dependabot[bot] open
2026-05-09 GH-49955: [C++] Fix OOM vulnerability in Parquet Delta decoders sivaadityacoder open
2026-05-06 MINOR: [C++][Parquet][Doc] pointer dereference instead of dot in parquet.rst alexeyroytman open
2026-05-05 GH-49927: [Python][Parquet] Expose bloom_filter_offset and bloom_filter_length to Python in column chunk metadata haziqishere closed
2026-05-05 Fix/display bloom filter offset in column chunk meta data haziqishere closed
2026-05-05 GH-49917: [Python] Remove Py_XDECREF to avoid Use-After-Free on PyList_SetItem in SparseCSFTensorToNdarray wr-web closed
2026-04-28 GH-49875: [Python] Fix timezone dropped when converting tz-aware Categorical to Arrow array AnkitAhlawat7742 closed
2026-04-27 MINOR: [CI] Bump conda-incubator/setup-miniconda from 3.2.0 to 4.0.1 dependabot[bot] closed
2026-04-27 MINOR: [CI] Bump matlab-actions/run-tests from 3.0.0 to 3.1.0 dependabot[bot] closed
2026-04-27 GH-48344: [Python] Fix Table.from_struct_array for empty ChunkedArray 1fanwang open
2026-04-24 GH-39808: [C++][Parquet] Evict pre-buffered row-group bytes after decode justinli500 open
2026-04-24 MINOR: [R] cast() documented but not exported in NAMESPACE AnkitAhlawat7742 closed
2026-04-23 GH-43574: [Python][Parquet] do not add partition columns from file path when reading single file bkurtz closed
2026-04-22 GH-49826: [Python] Return NotImplemented from Scalar/Array arithmetic dunders for unsupported types alex-anast open
2026-04-22 GH-31318: [Python] Add fixed-offset timezones to Hypothesis test strategy alex-anast closed
2026-04-22 GH-45644: [Doc][Python] Document timezone loss when converting timestamp arrays to NumPy alex-anast closed
2026-04-21 GH-49826: [Python] Scalar arithmetic dunders return NotImplemented on unknown operand types SAY-5 open
2026-04-21 GH-49817: [C++] Reject decimal strings that exceed the target precision SAY-5 open
2026-04-21 ci: check upload quota before release jpopesculian open
2026-04-20 GH-49753: [C++][Gandiva] Fix overflow in string functions abtom87 open
2026-04-20 MINOR: [CI] Bump matlab-actions/run-tests from 3.0.0 to 3.1 dependabot[bot] closed
2026-04-20 GH-20314: [C++] Add GCS connection pool size option azhu248 open
2026-04-20 H-49753: [C++][Gandiva] Fix overflow in string functions. abtom87 closed
2026-04-20 GH-49789: [C++] Use CMAKE_INSTALL_DOCDIR instead of static share/doc/${PROJECT_NAME} Swaroop883 closed
2026-04-19 docs: add safety warning to Array.from_buffers docstring avasis-ai closed
2026-04-19 GH-41017: [C++] Preserve ordered flag in DictionaryBuilder tinezivic open
2026-04-18 GH-49792: [Format][FlightSQL][C++] Add dialect-related SqlInfo codes tokoko open
2026-04-16 GH-49719: [C++] Rename vendored date header guards hrishikeshh-shinde closed
2026-04-16 GH-33823: [C++][IPC] Improve error messages when opening files that are the wrong format RobertLD open
2026-04-15 GH-49759: [C++][Integration] Harden BinaryView JSON parsing with runtime validation metsw24-max closed
2026-04-14 GH-49751: [Python] Add raw fd support to pa.OSFile alippai closed
2026-04-13 MINOR: [CI] Bump actions/github-script from 8 to 9 dependabot[bot] closed
2026-04-13 MINOR: [CI] Bump matlab-actions/setup-matlab from 2.7.0 to 3.0.1 dependabot[bot] closed
2026-04-09 GH-49058: [Python] Disallow non-UTF-8 bytes in custom metadata nitrajen open
2026-04-07 GH-49677 [Python][C++][Compute] Add search sorted compute kernel Alex-PLACET open
2026-04-06 GH-47435: [Python][Parquet] Add direct key encryption/decryption API smaheshwar-pltr open
2026-04-06 MINOR: [CI] Bump matlab-actions/setup-matlab from 2.7.0 to 3.0.0 dependabot[bot] closed
2026-04-06 MINOR: [CI] Bump matlab-actions/run-tests from 2.3.1 to 3.0.0 dependabot[bot] closed
2026-04-06 MINOR: [CI] Bump docker/login-action from 4.0.0 to 4.1.0 dependabot[bot] closed
2026-04-05 GH-49433: [C++] Buffer ARROW_LOG output to prevent thread interleaving Shockp closed
2026-04-04 GH-49614: [C++] Report an error instead of silent truncation in base64_decode on invalid input Reranko05 closed
2026-04-03 GH-49505: [Python] Implement DictionaryArray converter for type pa.large_string and pa.large_binary fotinosk open
2026-04-01 GH-33390: [R] Field-level metadata max-romagnoli closed
2026-03-30 Feature/orc adapter 2 review cbb330 closed
2026-03-24 GH-49232: [Python] deprecate feather python piyushka-ally open
2026-03-19 GH-47685: [Docs][Python] Add nested grouping to Python docs TOC Desel72 closed
2026-03-19 GH-49555: [Python][Packaging] Add riscv64 manylinux wheel builds to release pipeline gounthar open
2026-03-19 MINOR: [Docs] Clarify struct validity masking with ‘hidden data’ example pucke-dev closed
2026-03-18 GH-47696: [Docs] PyCapsule protocol implementation status Desel72 closed
2026-03-18 GH-49539: [C++][Parquet] Fix argument count check in parquet_scan domibel closed
2026-03-16 GH-48467: [C++][Parquet] Add BufferedStats API to RowGroupWriter wecharyu closed
2026-03-16 GH-49503: [Docs][Python] Document that .pxi doctests are tested via lib.pyx BillionClaw closed
2026-03-16 GH-49503: [Docs][Python] Document that .pxi doctests are tested via lib.pyx BillionClaw closed
2026-03-15 GH-49503: [Docs][Python] Document .pxi doctests BillionClaw closed
2026-03-15 GH-49503: [Docs][Python] Documenting .pxi doctests are tested via lib.pyx bhavanaeh closed
2026-03-14 GH-49502: [Parquet][C++] Fix missing overflow check for dictionary encoder indices count aryansri05 open
2026-03-14 GH-32381: [C++] Improve error handling for hash table merges kris-gaudel closed
2026-03-11 GH-49497: [FlightRPC] Add is_update field to ActionCreatePreparedStatementResult ennuite open
2026-03-10 GH-49445: [C++]: Fix potential dereference of nullptr CT811 open
2026-03-09 MINOR: [CI] Bump docker/login-action from 3.7.0 to 4.0.0 dependabot[bot] closed
2026-03-09 GH-49397: [Python] Fix PYARROW_GENERATE_COVERAGE end-to-end flow omertt27 open
2026-03-05 GH-49272: [C++][CI] Fix intermittent segfault in arrow-json-test with MinGW vanshaj2023 closed
2026-03-05 GH-49454: [C++][Gandiva] Fix castVARCHAR_timestamp for pre-epoch timestamps dmitry-chirkov-dremio closed
2026-03-03 GH-49410: [C++] Fix if_else null-scalar fast paths for sliced BaseBinary arrays Ebraam-Ashraf closed
2026-03-03 GH-49441: [C++][Gandiva] Add rand_integer function dmitry-chirkov-dremio closed
2026-03-03 GH-49438: [C++][Gandiva] Optimize LPAD/RPAD functions dmitry-chirkov-dremio closed
2026-03-02 Fixed usage of libarrow built from the git checkout not the released version from apt GavkareShubham open
2026-03-02 MINOR: [CI] Bump actions/download-artifact from 7 to 8 dependabot[bot] closed
2026-03-02 MINOR: [CI] Bump actions/upload-artifact from 6 to 7 dependabot[bot] closed
2026-03-02 GH-49420: [C++][Gandiva] Fix castVARCHAR memory allocation and len<=0 handling dmitry-chirkov-dremio closed
2026-02-28 GH-37958: [MATLAB] Centralize Command Window hyperlink formatting kriyanshii open
2026-02-24 GH-49380: [R] Remove hidden CI test chunks from setup.Rmd to fix r-de… vanshaj2023 closed
2026-02-24 GH-49360: [C++][ORC] Add stripe statistics API to ORCFileReader cbb330 open
2026-02-23 GH-47279: [C++] Support BinaryView/StringView in ReferencedBufferSize veeceey closed
2026-02-22 GH-41365: [Python] Fix S3 URI with whitespace silently falling back to LocalFileSystem ernestprovo23 open
2026-02-21 GH-49273: [Python] Move stub docstring script to _build_utils with gr… vanshaj2023 open
2026-02-21 GH-49273: [Python] Move stub docstring script to _build_utils with gr… vanshaj2023 closed
2026-02-20 GH-49351: [C++] Check TZDIR environment variable in vendored date library canassa open
2026-02-20 GH-49340: [R] Preserve row order in write_dataset() marberts closed
2026-02-20 MINOR: [C++] use std::move to avoid unnecessary copies SYaoJun open
Good First Issues by Component
Good First Issues
63 issues across 10 components
Component Issues
C++ 21
Python 16
R 10
Documentation 6
MATLAB 4
Parquet 2
Archery 1
Benchmarking 1
Continuous Integration 1
Other 1
Needs Champion
88 items - sorted by age (newest first)
Title Type Component Created Age (days)
[R] creating arrow supported expressions Issue R 2025-02-05 468
[Ruby] Improve JRuby Support Issue Ruby 2025-01-21 483
[R] arrow R package: multiple replacement disclaimers for str_replace_all Issue R 2025-01-20 484
[Parquet][C++] PageIndex is useless with current API Issue Parquet, C++ 2025-01-16 488
[C++][Python] Implement pc.equal for List arguments Issue C++, Python 2025-01-04 500
[Docs] Clean up LICENSE and NOTICE files Issue Documentation 2024-11-28 537
[R] Provide helpful hints for NotImplemented kernel errors Issue R 2024-11-15 550
[C++][Python] Potential improvements around supply chain security Issue C++, Python 2024-11-09 556
[Python] Efficient way to iterate over groups Issue Python 2024-11-07 558
[R] Add Cumsum and duplicated bindings to datasets in R Issue R 2024-11-07 558
[CI][Packaging][Python] Enable BuildKit for building wheel on Windows Issue Python, Continuous Integration, Packaging 2024-11-07 558
[Dev][Archery] Use --arrow-ref instead of --arrow-sha in archery crossbow submit Issue Archery, Developer Tools 2024-10-31 565
[Format][Docs] Describe C device interface on C data interface and C stream interface docs pages Issue Documentation, Format 2024-10-26 570
Misleading error message when casting Issue Python 2024-10-25 571
[CI][C++] Use a separated Docker image for Emscripten Issue C++, Continuous Integration 2024-10-18 578
[CI][C++] Add clang-cl job Issue C++ 2024-10-11 585
[GLib] Add a sub Buffer class for GBytes based buffer Issue GLib 2024-10-09 587
[C++] Binary View Compute Kernels Issue C++ 2024-10-08 588
[R] please write unregister_scalar_function and/or make registration local/temporary Issue R 2024-10-04 592
[Python] Allow PyCapsule Interface in pyarrow.scalar constructor? Issue Python 2024-09-25 601
[R] Support integer date and time classes from data.table Issue R 2024-09-19 607
[C++][Parquet] Add support for arrow::ArrayStatistics Issue Parquet, C++ 2024-08-04 653
[R] Subtracting X days from a given date in ymd format Issue R 2024-08-01 656
[Python] Move tests that are explicitly about conversion to/from numpy on test_array.py to a separate file Issue Python 2024-08-01 656
[R] Implement anonymous functions in calls to dplyr::across Issue R 2024-07-10 678
[CI][Dev] Add shell script formatter Issue Continuous Integration, Developer Tools 2024-06-28 690
[C++] Add support for system mimalloc Issue C++ 2024-06-19 699
[Python] Get size of IPC File ahead of time Issue Python 2024-06-07 711
[Python] Conversion to/from numpy 2.0+ new StringDType Issue Python 2024-06-06 712
[Docs][Format] Move IPC format spec back into a separate page Issue Documentation, Format 2024-05-15 734
[C++][Python] Update DLPack version Issue C++, Python 2024-05-15 734
[C++][Parquet] Predicate pushdown through arrow::dataset::ScanBuilder::Filter() not available on list fields Issue Parquet, C++ 2024-05-14 735
[R] Unable to disable url-encoding Issue R 2024-05-10 739
[Python] Pyarrow fs incorrectly resolves S3 URIs with white space as a local path Issue Python 2024-04-24 755
[CI][Archery] Archery linking should also check for undefined symbols Windows Issue Archery 2024-04-03 776
[CI][Archery] Archery linking should also check for undefined symbols macOS Issue Archery 2024-04-03 776
[CI][Archery] Archery linking should also check for undefined symbols Linux Issue Archery 2024-04-03 776
[Ruby] Improve Ruby’s GC integration Issue Ruby 2024-03-29 781
[Python] Allow pyarrow import to fail without triggering Py_FatalError Issue Python 2024-03-27 783
[Parquet] Make default fallback encoding choice smarter Issue Parquet, C++ 2024-03-18 792
[Python][Docs] Max batch size for Dataset Issue Python, Documentation 2024-03-15 795
[C++] Update vendored FlatBuffers to 24 Issue C++ 2024-03-14 796
[C++] String manipulation on a dictionary column Issue C++ 2024-03-08 802
[Python] Consider splitting _lib module into several parts Issue Python 2024-02-20 819
[Docs] Add a doc section for tensor arrays Issue Documentation 2024-02-08 831
[R] Export functions for low-level pointer operations Issue R 2024-01-25 845
[R] Update the docs to show how to avoid situations like data loss with leading zero in partition column Issue R 2024-01-17 853
[C++][Python] Floordiv compute kernel Issue C++, Python 2023-12-29 872
[Python] Add timezone information when printing TimestampArray Issue R, Python 2023-12-20 881
[Python] Clean up ExtensionType.__reduce__ Issue Python 2023-12-06 895
[R] Write metadata to parquet file as argument to write_parquet() Issue R 2023-11-19 912
[R][Documentation] Document add_filename on open_dataset help page Issue R, Documentation 2023-11-18 913
[R] preserve hive partitions when opening along a path / path vector Issue R, C++, Python 2023-11-15 916
[Python] Support serialization of Arrow files on disk without the identifier “Feather” Issue Python 2023-10-30 932
[R] Inconsistent naming Issue R 2023-10-25 937
[R] open_dataset - format is unclear Issue R 2023-10-10 952
[R] expose decimal_point argument in CSVConvertOptions Issue R 2023-10-03 959
[R] open_dataset() behavior with incorrectly quoted input data Issue R 2023-09-27 965
Update default version in parquet.rst PR Documentation 2023-08-17 1006
GH-36831: [C++] DictionaryArray support for MinMax Function PR C++ 2023-08-10 1013
Missing kernels for ordering with struct types Issue Python 2023-08-09 1014
[C++][Parquet] Process parquet rowgroups without Arrow conversion Issue Parquet, C++, Python 2023-05-17 1098
[C++] Why is arrow mmap marked MAP_PRIVATE (during read)? Issue C++ 2023-04-25 1120
[Python] Bindings for FixedShapeTensorType.FromTensor/ToTensor and FixedShapeTensorArray.strides Issue Python 2023-04-12 1133
[Python][Docs] Update/rearrange Data Types section and add FixedShapeTensorType Issue Python, Documentation 2023-04-12 1133
[R] Add an argument to open_csv_dataset() to repair duplicated column names or ignore them? Issue R 2023-04-07 1138
[Python] unexpected URL encoded path (white spaces) when uploading to S3 Issue Python 2023-04-05 1140
[Format][FlightRPC] Transfer FlightData in pieces Issue FlightRPC, Format 2023-03-07 1169
[C++] Create the first binary aggregate function kernel to serve as an example for other implementations Issue C++ 2023-01-30 1205
[C++][Python] Support parsing a StringArray full of JSON to a Table Issue C++, Python 2023-01-13 1222
[C++] Decide on duplicate column handling in scanner, add more tests Issue C++ 2022-11-22 1274
[C++] Add a “list_contains” kernel Issue C++ 2022-10-19 1308
[C++][Python] Allow an ExtensionType to register or implement custom casts Issue C++, Python 2022-09-29 1328
[R] Update make_date, make_datetime, ISOdate and ISOdatetime to use tz Issue R 2022-07-12 1407
“Edit this page” on docstring generated docs gives 404 Issue Documentation 2022-06-10 1439
[R] printing data in Table/RecordBatch print method Issue R, Python 2022-06-07 1442
[Archery] Add documentation for local development in archery/crossbow Issue Developer Tools 2022-04-12 1498
[R] Arrow/Parquet can’t open encrypted parquet files Issue R 2022-01-26 1574
[Python][Docs] Opening a partitioned dataset with schema and filter Issue Python, Documentation 2022-01-12 1588
[C++] Name the threads in thread pools Issue C++ 2022-01-07 1593
[Python] Support other interval types Issue Python 2021-10-07 1685
[Python] Array.__str__ shows misleading output for timestamp types with time zone set Issue C++, Python 2020-07-19 2130
[C++/Python] Kernel for SetItem(IntegerArray, values) (“replace_with_indices”) Issue C++, Python 2020-07-13 2136
[R] Add bindings to ConcatenateTables Issue R 2020-05-09 2201
[Crossbow] Eliminate libgit2 dependency Issue Developer Tools 2020-03-12 2259
[C++][Python] Support ExtensionType arrays in more kernels Issue C++ 2019-07-09 2506
[Doc] Better document the Tensor classes in the prose documentation Issue C++, Python, Documentation 2019-07-04 2511
[Python] Add documentation section for integrations with PyTorch, TensorFlow Issue Python, Documentation 2018-02-01 3029
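The Age (days) column above appears to be the number of whole days between each item's Created date and the date the page was generated. The sketch below reproduces that arithmetic. The snapshot date of 2026-05-19 is an assumption inferred from the table rows (it is not stated on the page), and `age_in_days` is a hypothetical helper name.

```python
from datetime import date

# Assumed snapshot date, inferred from the table rows rather than stated on the page.
SNAPSHOT = date(2026, 5, 19)


def age_in_days(created: date, as_of: date = SNAPSHOT) -> int:
    """Return the number of whole days between `created` and `as_of`."""
    return (as_of - created).days


# Check the helper against three rows copied from the table above.
rows = [
    ("[R] creating arrow supported expressions", date(2025, 2, 5), 468),
    ("[Python] Efficient way to iterate over groups", date(2024, 11, 7), 558),
    ("[Python] Add documentation section for integrations with PyTorch, TensorFlow",
     date(2018, 2, 1), 3029),
]
for title, created, listed_age in rows:
    print(f"{title}: computed {age_in_days(created)}, listed {listed_age}")
```

If the computed and listed values diverge for a row, the most likely cause is that the page was generated on a different day than assumed here.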
Currently Failing on Main
Workflow Job Failing Since Days Broke Since
Upload R Nightly builds upload 2026-05-18 1 #49945
Crossbow Nightly Report