This file provides context for AI coding agents working with the StarRocks codebase. Compatible with: Claude Code, OpenCode, Cursor, Gemini CLI, Windsurf, Aider, Continue, Cline, and other MCP-compatible tools.
StarRocks is a high-performance, real-time analytical database. It delivers sub-second query latency for complex analytics workloads through its MPP (Massively Parallel Processing) architecture.
Key capabilities:
- Real-time analytics with sub-second latency
- High-concurrency, low-latency queries
- Support for both shared-nothing and shared-data deployments
- Materialized views for query acceleration
- External data source federation (Hive, Iceberg, Hudi, Delta Lake, JDBC, etc.)
StarRocks uses a decoupled frontend-backend architecture:
┌─────────────────────────────────────────────────────────────┐
│                        Frontend (FE)                        │
│  Java │ SQL Parsing │ Query Planning │ Metadata Management  │
└────────────────────────────┬────────────────────────────────┘
                             │ Thrift RPC
┌────────────────────────────▼────────────────────────────────┐
│                        Backend (BE)                         │
│  C++ │ Query Execution │ Storage Engine │ Data Processing   │
└─────────────────────────────────────────────────────────────┘
- FE (Frontend): Java-based. Handles SQL parsing, query optimization, metadata management, and cluster coordination.
- BE (Backend): C++-based. Handles query execution, storage management, and data processing.
starrocks/
├── be/ # Backend (C++) - Query execution & storage
├── fe/ # Frontend (Java) - SQL & metadata
│ ├── fe-core/ # Core FE logic
│ ├── fe-parser/ # SQL parser
│ ├── fe-grammar/ # ANTLR grammar files
│ ├── fe-type/ # Type system
│ ├── fe-spi/ # Service Provider Interfaces
│ ├── connector/ # External data source connectors
│ └── plugin/ # FE plugins
├── java-extensions/ # JNI connectors for external sources
├── gensrc/ # Generated code (Thrift, Protobuf)
├── test/ # SQL integration tests
├── docs/ # Documentation (Docusaurus)
├── thirdparty/ # Third-party dependencies
└── docker/ # Docker build files
# Build everything
./build.sh --fe --be --clean
# Build only Backend
./build.sh --be
# Build only Frontend
./build.sh --fe
# Build with specific type (Release/Debug/ASAN)
BUILD_TYPE=Debug ./build.sh --be
BUILD_TYPE=ASAN ./build.sh --be
# Run all FE unit tests
./run-fe-ut.sh
# Run specific FE test
./run-fe-ut.sh --test com.starrocks.sql.plan.TPCHPlanTest
# Run all BE unit tests
./run-be-ut.sh
# Run specific BE test
./run-be-ut.sh --test CompactionUtilsTest
# Run with gtest filter
./run-be-ut.sh --gtest_filter "TabletUpdatesTest*"
# Run SQL integration tests (requires running cluster)
cd test && python3 run.py -v
# Format C++ code (BE)
clang-format -i <file.cpp>
# Check Java code style (FE)
cd fe && mvn checkstyle:check
C++ (BE):
- Style: Google C++ Style (with modifications)
- Config: `.clang-format` at project root
- Indent: 4 spaces
- Line limit: 120 characters
- Pointer alignment: Left (`int* ptr`)
Java (FE):
- Style: Google Java Style (with modifications)
- Config: `fe/checkstyle.xml`
- Indent: 4 spaces
- Line limit: 130 characters
- Import order: Third-party, then Java standard, then static
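The line limits listed above (120 characters for C++, 130 for Java) can be spot-checked before running the real formatters. The following is a minimal Python sketch, not part of the StarRocks tooling: the function name `find_long_lines` and the extension-to-limit mapping are assumptions made for illustration, and `clang-format` and checkstyle remain the authoritative checks.

```python
from pathlib import PurePath

# Limits come from the style lists above; the set of extensions is an assumption.
LINE_LIMITS = {".cpp": 120, ".cc": 120, ".h": 120, ".hpp": 120, ".java": 130}


def find_long_lines(filename, text):
    """Return (line_number, length) pairs for lines over the limit for this file type."""
    limit = LINE_LIMITS.get(PurePath(filename).suffix)
    if limit is None:
        return []  # unknown file type: nothing to check
    return [
        (number, len(line))
        for number, line in enumerate(text.splitlines(), start=1)
        if len(line) > limit
    ]
```

A caller would read each changed file and report any pairs returned. An empty list means the file is within its limit.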
Protobuf:
- Message names: `PascalCasePB` (e.g., `MyMessagePB`)
- Field names: `snake_case`
- Never use `required` fields
- Never change field ordinals
Thrift:
- Struct names: `TPascalCase` (e.g., `TMyStruct`)
- Field names: `snake_case`
- Never use `required` fields
- Never change field ordinals
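Two of the naming rules above lend themselves to a quick textual check: message and struct names, and the ban on `required`. The sketch below is a hypothetical helper, not an existing StarRocks tool. It scans line by line with regular expressions rather than parsing the IDL, so it only recognizes declarations and `required` keywords that start a line. It does not check field names or field ordinals.

```python
import re

MESSAGE_RE = re.compile(r"^\s*message\s+(\w+)")
STRUCT_RE = re.compile(r"^\s*struct\s+(\w+)")
PROTO_NAME_RE = re.compile(r"^[A-Z][A-Za-z0-9]*PB$")    # e.g. MyMessagePB
THRIFT_NAME_RE = re.compile(r"^T[A-Z][A-Za-z0-9]*$")    # e.g. TMyStruct
# Matches "required int32 x = 1;" (Protobuf) and "1: required i32 x" (Thrift).
REQUIRED_RE = re.compile(r"^\s*(?:\d+\s*:\s*)?required\b")


def check_idl_names(text, kind):
    """Return a list of problems found in a "proto" or "thrift" snippet."""
    if kind == "proto":
        declaration_re, name_re, label = MESSAGE_RE, PROTO_NAME_RE, "message"
    else:
        declaration_re, name_re, label = STRUCT_RE, THRIFT_NAME_RE, "struct"
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        if REQUIRED_RE.match(line):
            problems.append(f"line {number}: 'required' field")
        match = declaration_re.match(line)
        if match and not name_re.match(match.group(1)):
            problems.append(f"line {number}: bad {label} name '{match.group(1)}'")
    return problems
```

An empty result only means that no problem of these two kinds was found. Reviewing ordinals still requires comparing against the previous version of the file.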
When you add, modify, or remove configuration parameters, you MUST update the corresponding documentation:
| Component | Documentation File |
|---|---|
| FE Config | docs/en/administration/management/FE_configuration.md |
| BE Config | docs/en/administration/management/BE_configuration.md |
Required documentation for each config parameter:
- Parameter name
- Default value
- Value range (if applicable)
- Description of what it controls
- When to use/modify it
- Whether it requires restart
When you add, modify, or remove metrics, you MUST update the corresponding documentation:
| Metrics Type | Documentation File |
|---|---|
| General Metrics | docs/en/administration/management/monitoring/metrics.md |
| Shared-Data Metrics | docs/en/administration/management/monitoring/metrics-shared-data.md |
| MV Metrics | docs/en/administration/management/monitoring/metrics-materialized_view.md |
| BE Metrics (SQL) | docs/en/sql-reference/information_schema/be_metrics.md |
| FE Metrics (SQL) | docs/en/sql-reference/information_schema/fe_metrics.md |
Required documentation for each metric:
- Metric name
- Type (Counter, Gauge, Histogram)
- Labels (if any)
- Description of what it measures
- Unit (if applicable)
When your PR includes configuration or metrics changes:
- Updated corresponding documentation in `docs/en/`
- Updated Chinese documentation in `docs/zh/` (if exists)
- Documented default values and valid ranges
- Added deprecation notice if replacing old config/metric
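The documentation tables and the checklist above amount to a lookup from the kind of change to the files that need editing. The sketch below is illustrative only. The dictionary keys (`fe_config`, `mv_metrics`, and so on) are hypothetical names, and it assumes the Chinese counterpart lives at the same relative path under `docs/zh/`, which the checklist only implies with "if exists".

```python
import os

# Paths copied from the tables above; the keys are hypothetical labels.
DOCS_BY_CHANGE = {
    "fe_config": "docs/en/administration/management/FE_configuration.md",
    "be_config": "docs/en/administration/management/BE_configuration.md",
    "general_metrics": "docs/en/administration/management/monitoring/metrics.md",
    "shared_data_metrics": "docs/en/administration/management/monitoring/metrics-shared-data.md",
    "mv_metrics": "docs/en/administration/management/monitoring/metrics-materialized_view.md",
}


def docs_to_update(change, exists=os.path.exists):
    """Return the English doc path plus the docs/zh/ counterpart when it exists."""
    en_path = DOCS_BY_CHANGE[change]
    zh_path = en_path.replace("docs/en/", "docs/zh/", 1)  # assumed mirror layout
    paths = [en_path]
    if exists(zh_path):
        paths.append(zh_path)
    return paths
```

Passing `exists` as a parameter keeps the function testable without a checkout. When run from the repository root, the default `os.path.exists` is enough.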
Commit messages:
- Write in English
- Start with a verb in imperative mood (e.g., "Fix", "Add", "Update")
- Be concise but descriptive
- Example: `Fix null pointer exception in tablet compaction`
PR title format: `[Type] Brief description`
Types:
| Type | Usage |
|---|---|
| `[BugFix]` | Bug fixes |
| `[Feature]` | New features |
| `[Enhancement]` | Improvements to existing features |
| `[Refactor]` | Code refactoring (no behavior change) |
| `[UT]` | Unit test additions/fixes |
| `[Doc]` | Documentation changes |
| `[Tool]` | Tooling/build changes |
Examples:
- `[BugFix] Fix memory leak in hash join operator`
- `[Feature] Add support for ARRAY_AGG function`
- `[Enhancement] Improve partition pruning performance`
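A title in the `[Type] Brief description` form can be validated mechanically against the type table above. This is a minimal sketch, not the project's actual CI check: the function name `parse_title` is hypothetical, and it accepts only a single leading tag followed by one space and a non-empty description.

```python
import re

# Types copied from the table above.
PR_TYPES = ("BugFix", "Feature", "Enhancement", "Refactor", "UT", "Doc", "Tool")
TITLE_RE = re.compile(r"^\[(?P<type>[A-Za-z]+)\] (?P<description>\S.*)$")


def parse_title(title):
    """Return (type, description) for a well-formed title, or None otherwise."""
    match = TITLE_RE.match(title)
    if match is None or match.group("type") not in PR_TYPES:
        return None
    return match.group("type"), match.group("description")
```

Returning `None` covers both a malformed title and an unknown type, so a caller only needs to test the result once.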
Every PR must follow this template:
## Why I'm doing:
<!-- Explain the motivation and context -->
Describe why this change is needed.
## What I'm doing:
<!-- Describe what changes you made -->
Explain what this PR does.
Fixes #issue_number
## What type of PR is this:
- [ ] BugFix
- [ ] Feature
- [ ] Enhancement
- [ ] Refactor
- [ ] UT
- [ ] Doc
- [ ] Tool
## Does this PR entail a change in behavior?
- [ ] Yes, this PR will result in a change in behavior.
- [ ] No, this PR will not result in a change in behavior.
## If yes, please specify the type of change:
- [ ] Interface/UI changes: syntax, type conversion, expression evaluation, display information
- [ ] Parameter changes: default values, similar parameters but with different default values
- [ ] Policy changes: use new policy to replace old one, functionality automatically enabled
- [ ] Feature removed
- [ ] Miscellaneous: upgrade & downgrade compatibility, etc.
## Checklist:
- [ ] I have added test cases for my bug fix or my new feature
- [ ] This PR needs user documentation (for new or modified features or behaviors)
- [ ] I have added documentation for my new feature or new function
- [ ] This is a backport PR
## Bugfix cherry-pick branch check:
- [ ] I have checked the version labels which the PR will be auto-backported to the target branch
- [ ] 4.1
- [ ] 4.0
- [ ] 3.5
- [ ] 3.4
- One Commit per PR: Squash multiple commits before merging
- Link Issue: Reference related issue with `Fixes #issue_number`
- Tests Required:
  - Bug fixes must include regression tests
  - New features must include unit tests
- Documentation: Update docs for user-facing changes
- Fill Template: Complete all sections of PR template
- CI Must Pass: All automated checks must be green
If your PR changes behavior, classify the change type:
| Change Type | Examples |
|---|---|
| Interface/UI | New SQL syntax, changed output format, type conversion changes |
| Parameter | Changed default values, new config parameters |
| Policy | Auto-enabled features, changed default policies |
| Feature removed | Deprecated functionality removed |
| Compatibility | Upgrade/downgrade impacts |
For bug fixes that need to be backported to release branches:
- Add version labels (e.g., `4.1`, `4.0`, `3.5`) to your PR
- The CI will auto-create backport PRs after merge
- Verify backport PRs are created and merged
- Minimum 2 Approvals: At least 2 committers must approve
- CI Checks: All automated checks must pass:
  - Code style (checkstyle, clang-format)
  - Unit tests (FE and BE)
  - Build verification
- Address Feedback: Respond to all review comments
- CLA Required: Sign CLA once: https://cla-assistant.io/StarRocks/starrocks
Your PR will trigger these checks:
| Check | Description |
|---|---|
| `PR CHECKER` | Basic PR validation |
| `FE UT` | Frontend unit tests |
| `BE UT` | Backend unit tests |
| `Build` | Full build verification |
| `Checkstyle` | Java code style |
| `Clang-format` | C++ code style |
| Issue | Solution |
|---|---|
| CI timeout | Re-run failed jobs; check for flaky tests |
| Checkstyle failure | Run `mvn checkstyle:check` locally |
| Build failure | Run `./build.sh --fe --be` locally |
| Merge conflicts | Rebase on latest main branch |
- BE: Use Google Test framework. Tests in `be/test/` mirror source structure.
- FE: Use JUnit 5. Tests in `fe/fe-core/src/test/java/`.
- Use the SQL-tester framework in `test/`
- See `test/README.md` for detailed documentation
- Basic execution: `python3 run.py -v`
- All new features must have corresponding tests
- Bug fixes should include regression tests
- Maintain or improve test coverage
- Use SIMD instructions where applicable (AVX2/AVX512)
- Prefer vectorized processing over row-by-row
- Use `Column` and `Chunk` abstractions for data processing
- Be mindful of memory allocations in hot paths
- Profile with `perf` or async-profiler before optimization
- Avoid creating unnecessary objects in hot paths
- Use appropriate data structures for the use case
- Be cautious with synchronized blocks
- Validate all user inputs at system boundaries
- Never log sensitive information (passwords, tokens, PII)
- Follow existing authentication/authorization patterns
- Use parameterized queries to prevent SQL injection
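The last point above, parameterized queries, is easiest to see in a small example. The sketch below uses Python's built-in `sqlite3` module purely as a self-contained stand-in. The `users` table and the `find_user_ids` helper are hypothetical and unrelated to StarRocks code. The point is only that the user-supplied value is bound as a parameter instead of being concatenated into the SQL text.

```python
import sqlite3


def find_user_ids(connection, name):
    """Look up ids by name; the value is bound via a placeholder, never spliced into the SQL."""
    rows = connection.execute("SELECT id FROM users WHERE name = ?", (name,))
    return [row[0] for row in rows]


# Hypothetical in-memory table used only for this illustration.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
connection.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])

print(find_user_ids(connection, "alice"))
# An injection attempt is compared as a literal string and matches no row.
print(find_user_ids(connection, "x' OR '1'='1"))
```

Had the query been built with string formatting, the second call would have turned the `WHERE` clause into a condition that is always true.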
For module-specific guidelines, refer to:
- `be/AGENTS.md` - Backend C++ development
- `fe/AGENTS.md` - Frontend Java development
- `java-extensions/AGENTS.md` - JNI connector development
- `gensrc/AGENTS.md` - Generated code handling
- `test/AGENTS.md` - SQL integration testing
- `docs/AGENTS.md` - Documentation contribution
StarRocks is licensed under the Apache License 2.0.
All source files must include the appropriate license header:
// Copyright 2021-present StarRocks, Inc. All rights reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// ...
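A check for the header shown above could look like the following sketch. It is a hypothetical helper, not an existing project script. It matches only the two lines quoted above, since the rest of the header is elided here, and it assumes the header appears within the first 20 lines of a file.

```python
# The two header lines quoted above, without the comment prefix so that
# other comment styles would also match.
LICENSE_MARKERS = (
    "Copyright 2021-present StarRocks, Inc. All rights reserved.",
    'Licensed under the Apache License, Version 2.0 (the "License");',
)


def has_license_header(text, max_lines=20):
    """Return True when both marker lines appear near the top of the file."""
    head = "\n".join(text.splitlines()[:max_lines])
    return all(marker in head for marker in LICENSE_MARKERS)
```

The `max_lines` cutoff is a design choice: it avoids accepting a file that only mentions the license somewhere deep in its body.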