  1. What are the pros and cons of the Apache Parquet format …

    Apr 24, 2016 · Parquet files are most commonly compressed with the Snappy compression algorithm. Snappy compressed files are splittable and quick to inflate. Big data …

  2. Extension of Apache parquet files, is it '.pqt' or '.parquet'?

    Oct 19, 2021 · I wonder if there is a consensus regarding the extension of parquet files. I have seen a shorter .pqt extension, which has typical 3-letters (like in csv, tsv, txt, etc) and then …

  3. Inspect Parquet from command line - Stack Overflow

    How do I inspect the content of a Parquet file from the command line? The only option I see now is

        $ hadoop fs -get my-path local-file
        $ parquet-tools head local-file | less

    I would like to avoid …

  4. How to read a Parquet file into Pandas DataFrame?

    How to read a modestly sized Parquet data-set into an in-memory Pandas DataFrame without setting up a cluster computing infrastructure such as Hadoop or Spark? This is only a …

  5. Is it possible to read parquet files in chunks? - Stack Overflow

    Nov 29, 2019 · The Parquet format stores the data in chunks, but there isn't a documented way to read it in chunks like read_csv. Is there a way to read parquet files in chunks?

  6. What is actually meant when referring to parquet row-group size?

    Jul 27, 2023 · Parquet readers should be capable of applying skips at page-level resolution. Disclaimer: I work on arrow-c++/pyarrow. The pyarrow datasets reader's performance is very …

  7. What file extension is the correct way to name parquet files?

    Jun 3, 2025 · <file-name>.parquet : 1) This is the standard and most widely accepted naming convention. 2) The compression codec is stored in the Parquet file metadata, not in the …

  8. Updating values in apache parquet file - Stack Overflow

    Mar 3, 2015 · I have a quite hefty parquet file where I need to change values for one of the column. One way to do this would be to update those values in source text files and recreate …

  9. How to view Apache Parquet file in Windows? - Stack Overflow

    Jun 19, 2018 · What is Apache Parquet? Apache Parquet is a binary file format that stores data in a columnar fashion. Data inside a Parquet file is similar to an RDBMS-style table where …

  10. Is it better to have one large parquet file or lots of smaller parquet ...

    Mar 21, 2017 · The only downside of larger parquet files is that it takes more memory to create them, so watch out in case you need to bump up the Spark executors' memory. Row groups are a way …
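
Results 2 and 7 above ask which extension is correct, and result 7 notes that the codec is recorded in the file metadata rather than in the name. A Parquet file can therefore be recognized from its bytes instead of its suffix. The following is a minimal sketch, not a validator: it relies on the 4-byte `PAR1` marker that the Parquet format places at both ends of a file, and on the 4-byte little-endian footer length stored just before the trailing marker. The function name `looks_like_parquet` is hypothetical. Files that end with a different marker (for example, files written with an encrypted footer) are reported as non-matching.

```python
import os
import struct

PARQUET_MAGIC = b"PAR1"  # marker at the start and end of a Parquet file


def looks_like_parquet(path):
    """Return True if the file has the Parquet start/end markers.

    Hypothetical helper for illustration; it does not parse the footer
    metadata, it only checks the framing bytes around it.
    """
    size = os.path.getsize(path)
    # Smallest possible layout: leading marker (4 bytes),
    # footer length (4 bytes), trailing marker (4 bytes).
    if size < 12:
        return False

    with open(path, "rb") as f:
        head = f.read(4)
        f.seek(-8, os.SEEK_END)
        footer_len_bytes = f.read(4)
        tail = f.read(4)

    if head != PARQUET_MAGIC or tail != PARQUET_MAGIC:
        return False

    # The footer must fit between the leading marker and the last 8 bytes.
    (footer_len,) = struct.unpack("<I", footer_len_bytes)
    return footer_len <= size - 12
```

With this check, a file named `data.pqt` and a file named `data.parquet` are treated the same way: only the content decides. To read the actual metadata (schema, row groups, codec), a Parquet library is still needed.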