Add test file for floating types with different column orders and nan_counts#104

Open
wgtmac wants to merge 1 commit into apache:master from wgtmac:ieee754_nan
wgtmac commented Mar 28, 2026

Generated by parquet-java with a tool provided in apache/parquet-java@68ed638.

» parquet-cli meta target/parquet-testing/data/floating_orders_nan_count.parquet

File path:  target/parquet-testing/data/floating_orders_nan_count.parquet
Created by: parquet-mr version 1.18.0-SNAPSHOT (build 9694db6d4f3bd32fe0c7068307b6cc324b1d0b56)
Properties:
  original.created.by: parquet-mr version 1.18.0-SNAPSHOT (build 9694db6d4f3bd32fe0c7068307b6cc324b1d0b56)
    writer.model.name: example
Schema:
message msg {
  required float float_ieee754;
  required float float_typedef;
  required double double_ieee754;
  required double double_typedef;
  required fixed_len_byte_array(2) float16_ieee754 (FLOAT16);
  required fixed_len_byte_array(2) float16_typedef (FLOAT16);
}


Row group 0:  count: 10  42.20 B records  start: 4  total(compressed): 422 B total(uncompressed):422 B
--------------------------------------------------------------------------------
                 type      encodings count     avg size   nulls   min / max
float_ieee754    FLOAT     _   _     10        6.30 B     0       "-2.0" / "5.0"
float_typedef    FLOAT     _   _     10        6.30 B     0       "-2.0" / "5.0"
double_ieee754   DOUBLE    _   _     10        10.50 B    0       "-2.0" / "5.0"
double_typedef   DOUBLE    _   _     10        10.50 B    0       "-2.0" / "5.0"
float16_ieee754  FIXED[2]  _   _     10        4.30 B     0       "-2.0" / "5.0"
float16_typedef  FIXED[2]  _   _     10        4.30 B     0       "-2.0" / "5.0"

Row group 1:  count: 10  42.20 B records  start: 426  total(compressed): 422 B total(uncompressed):422 B
--------------------------------------------------------------------------------
                 type      encodings count     avg size   nulls   min / max
float_ieee754    FLOAT     _   _     10        6.30 B     0       "-2.0" / "5.0"
float_typedef    FLOAT     _   _     10        6.30 B     0
double_ieee754   DOUBLE    _   _     10        10.50 B    0       "-2.0" / "5.0"
double_typedef   DOUBLE    _   _     10        10.50 B    0
float16_ieee754  FIXED[2]  _   _     10        4.30 B     0       "-2.0" / "5.0"
float16_typedef  FIXED[2]  _   _     10        4.30 B     0

Row group 2:  count: 10  42.20 B records  start: 848  total(compressed): 422 B total(uncompressed):422 B
--------------------------------------------------------------------------------
                 type      encodings count     avg size   nulls   min / max
float_ieee754    FLOAT     _   _     10        6.30 B     0       "NaN" / "NaN"
float_typedef    FLOAT     _   _     10        6.30 B     0
double_ieee754   DOUBLE    _   _     10        10.50 B    0       "NaN" / "NaN"
double_typedef   DOUBLE    _   _     10        10.50 B    0
float16_ieee754  FIXED[2]  _   _     10        4.30 B     0       "NaN" / "NaN"
float16_typedef  FIXED[2]  _   _     10        4.30 B     0
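
For readers inspecting the float16 columns by hand: the schema above stores them as `fixed_len_byte_array(2)` annotated with FLOAT16, which the Parquet format defines as 2-byte IEEE 754 half-precision values in little-endian order. A minimal Python sketch of decoding one such value follows. The helper name `decode_float16` is hypothetical and not part of parquet-java or this PR.

```python
import math
import struct


def decode_float16(raw: bytes) -> float:
    """Decode one 2-byte little-endian IEEE 754 half-precision value.

    Hypothetical helper for illustration only.
    """
    if len(raw) != 2:
        raise ValueError(f"expected 2 bytes, got {len(raw)}")
    # "<e" is the struct format code for a little-endian binary16 float.
    (value,) = struct.unpack("<e", raw)
    return value


# Bit patterns 0xC000, 0x4500 and 0x7E00, written in little-endian byte order.
print(decode_float16(b"\x00\xc0"))              # -2.0
print(decode_float16(b"\x00\x45"))              # 5.0
print(math.isnan(decode_float16(b"\x00\x7e")))  # True
```

The first two values match the min/max bounds printed for the float16 columns in the listing above.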

- column orders: IEEE754_TOTAL_ORDER (the columns suffixed _ieee754) vs TYPE_DEFINED_ORDER (the columns suffixed _typedef)
- nan_counts: written in both the statistics and the column index
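
To illustrate the two points above, here is a minimal Python sketch of how an IEEE 754 total order ranks doubles (including -0.0 and NaN) and how min/max bounds can be reported alongside a separate NaN count. It is a sketch under stated assumptions, not the parquet-java implementation: the names `total_order_key` and `summarize` are hypothetical, and reporting NaN as both bounds for an all-NaN chunk simply mirrors what the listing above shows for row group 2.

```python
import math
import struct


def total_order_key(x: float) -> int:
    """Return an integer whose natural order matches IEEE 754 totalOrder.

    Hypothetical helper for illustration only.
    """
    # Reinterpret the double's bits as a signed 64-bit integer.
    (bits,) = struct.unpack("<q", struct.pack("<d", x))
    # For values with the sign bit set, flip the lower 63 bits so that
    # larger magnitudes sort lower; -0.0 then lands just below +0.0.
    return bits ^ 0x7FFFFFFFFFFFFFFF if bits < 0 else bits


def summarize(values):
    """Return (min, max, nan_count) for a non-empty list of floats.

    Assumption: NaNs are counted separately and excluded from the bounds
    unless every value is NaN, in which case both bounds are NaN.
    """
    nan_count = sum(1 for v in values if math.isnan(v))
    candidates = [v for v in values if not math.isnan(v)] or list(values)
    lo = min(candidates, key=total_order_key)
    hi = max(candidates, key=total_order_key)
    return lo, hi, nan_count


print(summarize([1.0, float("nan"), -2.0, 5.0]))  # (-2.0, 5.0, 1)
print(total_order_key(-0.0) < total_order_key(0.0))  # True
```

Under the type-defined order, by contrast, -0.0 and 0.0 compare equal and NaN compares unordered with everything, which is consistent with the typedef columns in the listing carrying no min/max in the row groups where the ieee754 columns still do.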
wgtmac commented Mar 28, 2026

cc @etseidl
