Skip to content

Fix incorrect measurement schema during compaction#17297

Open
shuwenwei wants to merge 14 commits intomasterfrom
fetchSchemaWhenDataTypeNotMatchedInCompaction
Open

Fix incorrect measurement schema during compaction#17297
shuwenwei wants to merge 14 commits intomasterfrom
fetchSchemaWhenDataTypeNotMatchedInCompaction

Conversation

@shuwenwei
Copy link
Member

Description

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR addresses incorrect/unstable measurement schema (notably data type) selection during compaction by making compaction choose the “latest” schema (from schema engine / table cache) when source TsFiles contain conflicting schemas, and by expanding/refactoring compaction tests to cover these scenarios.

Changes:

  • Update compaction executors/iterators to detect data type mismatches across files and reconcile schemas using the latest schema (tree model) or table cache (table model).
  • Add a test-only schema fetcher injection path and new/refactored compaction tests (including parameterized coverage across performer types).
  • Fix table schema copy behavior in TsTable (copy tag column index map rather than a removed/incorrect map).

Reviewed changes

Copilot reviewed 19 out of 19 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
iotdb-core/node-commons/src/main/java/org/apache/iotdb/commons/schema/table/TsTable.java Fix copy-constructor to copy tag column index map; remove unused/incorrect map.
iotdb-core/datanode/src/test/java/org/apache/iotdb/db/storageengine/dataregion/compaction/utils/CompactionFakeSchemaFetcherImpl.java Add schema fetcher test helper exposing schema tree.
iotdb-core/datanode/src/test/java/org/apache/iotdb/db/storageengine/dataregion/compaction/tablemodel/TableModelCompactionWithTTLTest.java Use centralized performer selection helper.
iotdb-core/datanode/src/test/java/org/apache/iotdb/db/storageengine/dataregion/compaction/alterDataType/CompactionDataTypeNotMatchTest.java Move into alterDataType package; parameterize by performer; reuse shared helpers.
iotdb-core/datanode/src/test/java/org/apache/iotdb/db/storageengine/dataregion/compaction/alterDataType/CompactionDataTypeNotMatchAlterableDataTypeTest.java Refactor to shared base; reuse file-generation helpers.
iotdb-core/datanode/src/test/java/org/apache/iotdb/db/storageengine/dataregion/compaction/alterDataType/CompactionDataTypeAlterTest.java New parameterized alter-data-type compaction test for tree model.
iotdb-core/datanode/src/test/java/org/apache/iotdb/db/storageengine/dataregion/compaction/alterDataType/CompactionDataTypeAlterTableTest.java New parameterized alter-data-type compaction test for table model using DataNodeTableCache.
iotdb-core/datanode/src/test/java/org/apache/iotdb/db/storageengine/dataregion/compaction/alterDataType/AbstractCompactionAlterDataTypeTest.java New shared base wiring a test schema fetcher + reusable TsFile generators.
iotdb-core/datanode/src/test/java/org/apache/iotdb/db/storageengine/dataregion/compaction/CompactionDataTypeAlterTest.java Remove old non-parameterized test (replaced by new alterDataType package tests).
iotdb-core/datanode/src/test/java/org/apache/iotdb/db/storageengine/dataregion/compaction/AbstractCompactionTest.java Add shared getPerformer(String) helper for tests.
iotdb-core/datanode/src/test/java/org/apache/iotdb/db/queryengine/plan/analyze/FakeSchemaFetcherImpl.java Make schema tree and generator overridable (protected) for test specialization.
iotdb-core/datanode/src/main/java/org/apache/iotdb/db/storageengine/dataregion/compaction/execute/utils/executor/readchunk/ReadChunkAlignedSeriesCompactionExecutor.java Choose schema collection strategy for table vs tree model; reconcile schema for tree model using latest schema when needed.
iotdb-core/datanode/src/main/java/org/apache/iotdb/db/storageengine/dataregion/compaction/execute/utils/executor/fast/FastNonAlignedSeriesCompactionExecutor.java Add “can alter” validation before rewriting chunk types.
iotdb-core/datanode/src/main/java/org/apache/iotdb/db/storageengine/dataregion/compaction/execute/utils/executor/batch/BatchedReadChunkAlignedSeriesCompactionExecutor.java Thread database name through to aligned executor for table model schema lookup.
iotdb-core/datanode/src/main/java/org/apache/iotdb/db/storageengine/dataregion/compaction/execute/utils/MultiTsFileDeviceIterator.java Add database accessor; add schema reconciliation via latest schema (tree) and table cache (table); propagate corrected final types.
iotdb-core/datanode/src/main/java/org/apache/iotdb/db/storageengine/dataregion/compaction/execute/utils/CompactionUtils.java Add test-only schema fetcher injection and helper to fetch latest measurement schemas for tree model.
iotdb-core/datanode/src/main/java/org/apache/iotdb/db/storageengine/dataregion/compaction/execute/utils/CompactionSeriesContext.java Track whether series needs type update and allow setting final type explicitly.
iotdb-core/datanode/src/main/java/org/apache/iotdb/db/storageengine/dataregion/compaction/execute/performer/impl/ReadChunkCompactionPerformer.java Pass database into aligned executor; reconcile non-aligned chunk types using latest schema (tree).
iotdb-core/datanode/src/main/java/org/apache/iotdb/db/storageengine/dataregion/compaction/execute/performer/impl/FastCompactionPerformer.java Simplify non-aligned compaction setup to use updated iterator context map.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

}

@After
public void tearDown() throws IOException, StorageEngineException {
Comment on lines +173 to +174
TsTableColumnSchema schemaInTsTable =
tsTable.getColumnSchema(chunkMetadata.getMeasurementUid());
if (measurementSchema == null && !timeseriesMetadata.getChunkMetadataList().isEmpty()) {
measurementSchema =
reader.getMeasurementSchema(timeseriesMetadata.getChunkMetadataList());
}
Comment on lines 322 to +372
// find correct data type
TSDataType correctDataType = null;
for (int i = readerAndChunkMetadataList.size() - 1; i >= 0 && correctDataType == null; i--) {
List<ChunkMetadata> chunkMetadataList = readerAndChunkMetadataList.get(i).getRight();
if (chunkMetadataList == null || chunkMetadataList.isEmpty()) {
continue;
}
boolean hasDifferentDataTypes = false;
Iterator<Pair<TsFileSequenceReader, List<ChunkMetadata>>> descIterator =
readerAndChunkMetadataList.descendingIterator();
while (descIterator.hasNext()) {
Pair<TsFileSequenceReader, List<ChunkMetadata>> pair = descIterator.next();
List<ChunkMetadata> chunkMetadataList = pair.right;
TSDataType dataTypeInCurrentFile = null;
for (ChunkMetadata chunkMetadata : chunkMetadataList) {
if (chunkMetadata == null) {
continue;
if (chunkMetadata != null) {
dataTypeInCurrentFile = chunkMetadata.getDataType();
break;
}
correctDataType = chunkMetadata.getDataType();
}
if (correctDataType == null) {
correctDataType = dataTypeInCurrentFile;
} else if (correctDataType != dataTypeInCurrentFile) {
hasDifferentDataTypes = true;
break;
}
}
if (correctDataType == null) {
if (!hasDifferentDataTypes) {
return readerAndChunkMetadataList;
}

IMeasurementSchema schema =
CompactionUtils.getLatestMeasurementSchemasForTreeModel(
deviceID, Collections.singletonList(measurement))
.get(0);
if (schema != null) {
correctDataType = schema.getType();
}

LinkedList<Pair<TsFileSequenceReader, List<ChunkMetadata>>> result = new LinkedList<>();
// check data type consistent and skip compact files with wrong data type
for (Pair<TsFileSequenceReader, List<ChunkMetadata>> tsFileSequenceReaderListPair :
readerAndChunkMetadataList) {
boolean dataTypeConsistent = true;
for (ChunkMetadata chunkMetadata : tsFileSequenceReaderListPair.getRight()) {
if (chunkMetadata != null
&& !MetadataUtils.canAlter(chunkMetadata.getDataType(), correctDataType)) {
dataTypeConsistent = false;
if (chunkMetadata == null) {
continue;
}
if (chunkMetadata.getDataType() == correctDataType) {
break;
}
if (chunkMetadata != null && chunkMetadata.getDataType() != correctDataType) {
chunkMetadata.setNewType(correctDataType);
if (!MetadataUtils.canAlter(chunkMetadata.getDataType(), correctDataType)) {
dataTypeConsistent = false;
break;
}
chunkMetadata.setNewType(correctDataType);
@shuwenwei shuwenwei marked this pull request as ready for review March 17, 2026 01:48
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR addresses incorrect measurement schema selection during compaction (especially when data types change across source files), by consulting the latest schema from the schema fetcher / table cache instead of relying solely on chunk headers from input files.

Changes:

  • Update compaction executors/iterators to resolve the “correct” measurement schema (tree model via schema fetcher; table model via DataNodeTableCache) when data types differ across files.
  • Add test-only schema fetcher injection and new/updated compaction tests (parameterized across performers) to cover data type mismatch/alter scenarios, including table model.
  • Fix a TsTable copy-constructor bug that copied the wrong index map.

Reviewed changes

Copilot reviewed 20 out of 20 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
iotdb-core/node-commons/src/main/java/org/apache/iotdb/commons/schema/table/TsTable.java Fix copy-constructor to copy tagColumnIndexMap (previously copied the wrong map).
iotdb-core/datanode/src/test/java/org/apache/iotdb/db/storageengine/dataregion/compaction/utils/CompactionFakeSchemaFetcherImpl.java New fake schema fetcher for compaction tests; exposes schema tree for mutation in tests.
iotdb-core/datanode/src/test/java/org/apache/iotdb/db/storageengine/dataregion/compaction/tablemodel/TableModelCompactionWithTTLTest.java Use shared getPerformer(String) from AbstractCompactionTest; remove duplicated performer wiring.
iotdb-core/datanode/src/test/java/org/apache/iotdb/db/storageengine/dataregion/compaction/alterDataType/CompactionDataTypeNotMatchTest.java Move into alterDataType package, parameterize performers, and populate fake schema tree for “latest schema” resolution.
iotdb-core/datanode/src/test/java/org/apache/iotdb/db/storageengine/dataregion/compaction/alterDataType/CompactionDataTypeNotMatchAlterableDataTypeTest.java New parameterized tests for “alterable” type mismatch scenarios; populates fake schema tree.
iotdb-core/datanode/src/test/java/org/apache/iotdb/db/storageengine/dataregion/compaction/alterDataType/CompactionDataTypeAlterTest.java New parameterized tests validating value correctness after compaction with type alteration (aligned + non-aligned).
iotdb-core/datanode/src/test/java/org/apache/iotdb/db/storageengine/dataregion/compaction/alterDataType/CompactionDataTypeAlterTableTest.java New parameterized table-model tests validating metadata correctness under type alteration and schema cache usage.
iotdb-core/datanode/src/test/java/org/apache/iotdb/db/storageengine/dataregion/compaction/alterDataType/AbstractCompactionAlterDataTypeTest.java New shared test base: thread naming, schema fetcher cleanup, helper file generators for different types.
iotdb-core/datanode/src/test/java/org/apache/iotdb/db/storageengine/dataregion/compaction/CompactionDataTypeNotMatchAlterableDataTypeTest.java Deleted old non-parameterized version (superseded by alterDataType package tests).
iotdb-core/datanode/src/test/java/org/apache/iotdb/db/storageengine/dataregion/compaction/CompactionDataTypeAlterTest.java Deleted old non-parameterized version (superseded by alterDataType package tests).
iotdb-core/datanode/src/test/java/org/apache/iotdb/db/storageengine/dataregion/compaction/AbstractCompactionTest.java Inject compaction fake schema fetcher in setup; centralize performer selection helper; cleanup schema cache in teardown.
iotdb-core/datanode/src/test/java/org/apache/iotdb/db/queryengine/plan/analyze/FakeSchemaFetcherImpl.java Make schema tree field/method protected to allow compaction test subclassing.
iotdb-core/datanode/src/main/java/org/apache/iotdb/db/storageengine/dataregion/compaction/execute/utils/executor/readchunk/ReadChunkAlignedSeriesCompactionExecutor.java Resolve aligned value schemas via latest tree schema or table cache; add table-model collection path.
iotdb-core/datanode/src/main/java/org/apache/iotdb/db/storageengine/dataregion/compaction/execute/utils/executor/fast/FastNonAlignedSeriesCompactionExecutor.java Reject files whose datatype cannot be altered to the chosen final type; only rewrite when alterable.
iotdb-core/datanode/src/main/java/org/apache/iotdb/db/storageengine/dataregion/compaction/execute/utils/executor/batch/BatchedReadChunkAlignedSeriesCompactionExecutor.java Thread database name into aligned read-chunk executor for table schema lookup.
iotdb-core/datanode/src/main/java/org/apache/iotdb/db/storageengine/dataregion/compaction/execute/utils/MultiTsFileDeviceIterator.java Enhance schema collection for table model and fetch latest schema for tree model when type differs; expose database name.
iotdb-core/datanode/src/main/java/org/apache/iotdb/db/storageengine/dataregion/compaction/execute/utils/CompactionUtils.java Add getLatestMeasurementSchemasForTreeModel(...) and test-only schema fetcher injection; use cache API with explicit flag.
iotdb-core/datanode/src/main/java/org/apache/iotdb/db/storageengine/dataregion/compaction/execute/utils/CompactionSeriesContext.java Track whether a series needs datatype update and allow overwriting final type.
iotdb-core/datanode/src/main/java/org/apache/iotdb/db/storageengine/dataregion/compaction/execute/performer/impl/ReadChunkCompactionPerformer.java Pass database name into aligned executor; choose correct datatype using latest schema when mixed types exist.
iotdb-core/datanode/src/main/java/org/apache/iotdb/db/storageengine/dataregion/compaction/execute/performer/impl/FastCompactionPerformer.java Simplify non-aligned series setup to rely on updated CompactionSeriesContext for final type resolution.
Comments suppressed due to low confidence (2)

iotdb-core/datanode/src/test/java/org/apache/iotdb/db/storageengine/dataregion/compaction/alterDataType/CompactionDataTypeNotMatchTest.java:59

  • This test declares a new device field that shadows the device field inherited from AbstractCompactionAlterDataTypeTest. Even though they currently resolve to the same path, field shadowing is error-prone and makes it easy to accidentally use different device IDs between helpers (which use the superclass field) and assertions (which use this field). Prefer removing this field and using the inherited device everywhere (or rename if a different device is truly intended).
    iotdb-core/datanode/src/test/java/org/apache/iotdb/db/storageengine/dataregion/compaction/alterDataType/CompactionDataTypeNotMatchTest.java:141
  • measurementSchemas1 is constructed but not used (the aligned INT32 file is generated via generateInt32AlignedSeriesFile(...)). Consider removing this unused list to keep the setup clear.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

}

@Test
public void testCompactAlignedSeriesWithReadChunkCompactionPerformer()
Comment on lines +127 to +130
List<IMeasurementSchema> measurementSchemas1 = new ArrayList<>();
measurementSchemas1.add(new MeasurementSchema("s1", TSDataType.INT32));
measurementSchemas1.add(new MeasurementSchema("s2", TSDataType.INT32));

@codecov
Copy link

codecov bot commented Mar 17, 2026

Codecov Report

❌ Patch coverage is 95.87629% with 12 lines in your changes missing coverage. Please review.
✅ Project coverage is 39.78%. Comparing base (e7f8097) to head (bd712ac).
⚠️ Report is 10 commits behind head on master.

Files with missing lines Patch % Lines
...ction/execute/utils/MultiTsFileDeviceIterator.java 97.12% 4 Missing ⚠️
...gion/compaction/execute/utils/CompactionUtils.java 85.00% 3 Missing ⚠️
...e/performer/impl/ReadChunkCompactionPerformer.java 94.28% 2 Missing ⚠️
...hunk/ReadChunkAlignedSeriesCompactionExecutor.java 97.53% 2 Missing ⚠️
...org/apache/iotdb/commons/schema/table/TsTable.java 0.00% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #17297      +/-   ##
============================================
+ Coverage     39.72%   39.78%   +0.06%     
  Complexity      282      282              
============================================
  Files          5101     5099       -2     
  Lines        342143   342513     +370     
  Branches      43583    43660      +77     
============================================
+ Hits         135914   136280     +366     
- Misses       206229   206233       +4     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.


CompactionTsFileReader reader = (CompactionTsFileReader) readerMap.get(resource);

for (Map.Entry<String, Pair<TimeseriesMetadata, Pair<Long, Long>>> entrySet :
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

entrySet -> entry

Comment on lines +165 to +168
schemaFetcher
.getSchemaTree()
.appendSingleMeasurementPath(
new MeasurementPath(device, "s1", new MeasurementSchema("s1", TSDataType.DOUBLE)));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a case where s1 is still int32?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added in org/apache/iotdb/db/storageengine/dataregion/compaction/alterDataType/CompactionDataTypeNotMatchTest.java

InnerSpaceCompactionTask task =
new InnerSpaceCompactionTask(
0, tsFileManager, seqResources, true, new ReadChunkCompactionPerformer(), 0);
Assert.assertTrue(task.start());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check the target file?

Comment on lines +93 to +94
Assert.assertEquals(
1, ((long) tsFileManager.getTsFileList(true).get(0).getStartTime(device).get()));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check end time?

@sonarqubecloud
Copy link

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants